Hello,

Using 'cygwin', when I run the 'crawl' command as in

    nutch-1.2/bin/nutch crawl seed/urls -dir c1 -depth 3 -threads 1 >& c1.log

everything works as expected. In particular, the 'linkdb' is created and populated correctly. The 'hadoop' logs read:

    2011-01-07 11:51:55,129 INFO crawl.LinkDb - LinkDb: starting at 2011-01-07 11:51:55
    2011-01-07 11:51:55,129 INFO crawl.LinkDb - LinkDb: linkdb: c4/linkdb
    2011-01-07 11:51:55,129 INFO crawl.LinkDb - LinkDb: URL normalize: true
    2011-01-07 11:51:55,129 INFO crawl.LinkDb - LinkDb: URL filter: true
    2011-01-07 11:51:55,129 INFO crawl.LinkDb - LinkDb: adding segment: * file:/D:/mynutch/c4/segments/20110107114838*
    2011-01-07 11:51:55,129 INFO crawl.LinkDb - LinkDb: adding segment: * file:/D:/mynutch/c4/segments/20110107114949*
    2011-01-07 11:51:55,129 INFO crawl.LinkDb - LinkDb: adding segment: * file:/D:/mynutch/c4/segments/20110107115101*
    2011-01-07 11:51:55,144 WARN mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    2011-01-07 11:52:12,270 INFO crawl.LinkDb - LinkDb: finished at 2011-01-07 11:52:12, elapsed: 00:00:17

By contrast, still using 'cygwin', when I run 'invertlinks' as in

    nutch-1.2/bin/nutch invertlinks c1/linkdb -dir c1/segments

over the same or any other input segments, the resulting 'linkdb' is created correctly but remains empty.

Then the 'hadoop' logs read:

    2011-01-07 11:45:37,126 INFO crawl.LinkDb - LinkDb: starting at 2011-01-07 11:45:37
    2011-01-07 11:45:37,126 INFO crawl.LinkDb - LinkDb: linkdb: c1/linkdb6
    2011-01-07 11:45:37,126 INFO crawl.LinkDb - LinkDb: URL normalize: true
    2011-01-07 11:45:37,126 INFO crawl.LinkDb - LinkDb: URL filter: true
    2011-01-07 11:45:37,142 INFO crawl.LinkDb - LinkDb: adding segment: * file:/D:/mynutch/c1/segments/20110106153349*
    2011-01-07 11:45:37,142 INFO crawl.LinkDb - LinkDb: adding segment: * file:/D:/mynutch/c1/segments/20110106153544*
    2011-01-07 11:45:37,142 INFO crawl.LinkDb - LinkDb: adding segment: * file:/D:/mynutch/c1/segments/20110106154120*
    2011-01-07 11:45:53,314 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2011-01-07 11:45:54,236 INFO crawl.LinkDb - LinkDb: finished at 2011-01-07 11:45:54, elapsed: 00:00:17

Notice the difference in the 'WARN' message: the 'crawl' run warns from mapred.JobClient, while the 'invertlinks' run warns from util.NativeCodeLoader. I suspect some path issue. Any ideas?

Thanks

