Hi,

The fetch command returns immediately without downloading any URLs, at least in my experience. Could somebody else try fetching a few URLs to confirm whether the problem is on my side or not?
I use the following process to run the command:

  $ export NUTCH_ROOT=./nutch
  $ svn co http://svn.apache.org/repos/asf/nutch/trunk/ $NUTCH_ROOT
  $ ant
  $ export NUTCH_HOME=$NUTCH_ROOT/runtime/local

Then a little bit of configuration: the http.agent.name and http.robots.agents properties in $NUTCH_HOME/conf/nutch-default.xml, as well as Gora in $NUTCH_HOME/conf/gora.properties.

Finally:

  $ $NUTCH_HOME/bin/nutch inject seeds
  InjectorJob: starting
  InjectorJob: urlDir: seeds
  InjectorJob: finished
  $ $NUTCH_HOME/bin/nutch generate
  GeneratorJob: Selecting best-scoring urls due for fetch.
  GeneratorJob: starting
  GeneratorJob: filtering: true
  GeneratorJob: done
  GeneratorJob: generated batch id: 1291539079-2006862361
  $ $NUTCH_HOME/bin/nutch fetch 1291539079-2006862361
  FetcherJob: starting
  FetcherJob : timelimit set for : -1
  FetcherJob: threads: 10
  FetcherJob: parsing: false
  FetcherJob: resuming: false
  FetcherJob: batchId: 1291539079-2006862361
  FetcherJob: done
  $

Nothing gets fetched.

This is the relatively immediate fix:

Index: src/java/org/apache/nutch/fetcher/FetcherJob.java
===================================================================
--- src/java/org/apache/nutch/fetcher/FetcherJob.java (revision 1042291)
+++ src/java/org/apache/nutch/fetcher/FetcherJob.java (working copy)
@@ -174,6 +174,7 @@
     } else {
       currentJob.setNumReduceTasks(numTasks);
     }
+    currentJob.waitForCompletion(true);
     ToolUtil.recordJobStatus(null, currentJob, results);
     return results;
   }

Alexis
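P.S. To illustrate why the one-line patch matters, here is a minimal stand-alone sketch. It does not use Hadoop or Nutch at all: ToyJob, its waitForCompletion() method and the example.com URLs are hypothetical stand-ins I made up for the illustration. The assumption behind it is that, as far as I understand the Hadoop API, Job.waitForCompletion() both submits the job and blocks until it finishes, so a job that is only configured never does any work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ToyJobDemo {

    /** Hypothetical stand-in for a Hadoop job; this is NOT the real API. */
    static class ToyJob {
        private final List<String> urls;
        private final List<String> fetched = new ArrayList<>();

        ToyJob(List<String> urls) {
            this.urls = urls;
        }

        /** Submits the work to a worker thread and blocks until it is done. */
        boolean waitForCompletion() throws Exception {
            ExecutorService pool = Executors.newSingleThreadExecutor();
            try {
                Future<?> result = pool.submit(() -> {
                    for (String url : urls) {
                        fetched.add(url); // pretend to download the page
                    }
                });
                result.get(); // block until the worker has finished
                return true;
            } finally {
                pool.shutdown();
            }
        }

        int fetchedCount() {
            return fetched.size();
        }
    }

    /** Configures a toy job and optionally runs it; returns pages "fetched". */
    static int run(boolean wait) throws Exception {
        ToyJob job = new ToyJob(
                Arrays.asList("http://example.com/a", "http://example.com/b"));
        if (wait) {
            job.waitForCompletion();
        }
        return job.fetchedCount();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("configured only: " + run(false));
        System.out.println("after waitForCompletion: " + run(true));
    }
}
```

In this toy version the "configured only" case reports 0 pages and the other case reports 2, which mirrors the symptom above: the job is set up and its status is recorded, but nothing is ever run.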

