I am getting warnings in hadoop.log that segments.gen and segments_2 are not
directories. As the listing below shows, they are in fact files, not
directories. I'm not sure which stage of the process triggers this, since I
only just stumbled on the warnings, but it concerns me that the log says it
is skipping something. Any ideas before I start digging further?

2009-11-30 08:28:56,344 WARN mapred.FileInputFormat - Can't open index at hdfs://nn1:9000/user/nutch/crawl/index1/segments.gen:0+2147483647, skipping. (hdfs://nn1:9000/user/nutch/crawl/index1/segments.gen not a directory)
2009-11-30 08:29:00,509 WARN mapred.FileInputFormat - Can't open index at hdfs://nn1:9000/user/nutch/crawl/index2/segments.gen:0+2147483647, skipping. (hdfs://nn1:9000/user/nutch/crawl/index2/segments.gen not a directory)
2009-11-30 08:29:04,314 WARN mapred.FileInputFormat - Can't open index at hdfs://nn1:9000/user/nutch/crawl/index2/segments_2:0+2147483647, skipping. (hdfs://nn1:9000/user/nutch/crawl/index2/segments_2 not a directory)

[nu...@nn1 logs]$ cd ~/crawl/search/
[nu...@nn1 search]$ bin/hadoop dfs -ls crawl/index1
Found 10 items
-rw-r--r-- 1 nutch supergroup 454257 2009-11-30 08:28 /user/nutch/crawl/index1/_0.fdt
-rw-r--r-- 1 nutch supergroup 20300 2009-11-30 08:28 /user/nutch/crawl/index1/_0.fdx
-rw-r--r-- 1 nutch supergroup 81 2009-11-30 08:28 /user/nutch/crawl/index1/_0.fnm
-rw-r--r-- 1 nutch supergroup 2641385 2009-11-30 08:28 /user/nutch/crawl/index1/_0.frq
-rw-r--r-- 1 nutch supergroup 15226 2009-11-30 08:28 /user/nutch/crawl/index1/_0.nrm
-rw-r--r-- 1 nutch supergroup 5122161 2009-11-30 08:28 /user/nutch/crawl/index1/_0.prx
-rw-r--r-- 1 nutch supergroup 30777 2009-11-30 08:28 /user/nutch/crawl/index1/_0.tii
-rw-r--r-- 1 nutch supergroup 2199031 2009-11-30 08:28 /user/nutch/crawl/index1/_0.tis
-rw-r--r-- 1 nutch supergroup 20 2009-11-30 08:28 /user/nutch/crawl/index1/segments.gen
-rw-r--r-- 1 nutch supergroup 58 2009-11-30 08:28 /user/nutch/crawl/index1/segments_2
Jesse
int GetRandomNumber()
{
return 4; // Chosen by fair roll of dice
// Guaranteed to be random
} // xkcd.com