The correct order is:

inject
loop
generate
fetch
parse
updatedb
end loop
solr
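For illustration, here is a minimal Python sketch of that ordering. The helper name crawl_steps and its rounds parameter are my own and not part of Nutch; the function only lists step names and does not run any Nutch command.

```python
def crawl_steps(rounds):
    """Return the crawl step names in the order given above.

    Hypothetical helper for illustration only; it does not invoke Nutch.
    """
    steps = ["inject"]  # seed the crawl once, before the loop
    for _ in range(rounds):
        # Within each round, parse runs before updatedb.
        steps += ["generate", "fetch", "parse", "updatedb"]
    steps.append("solr")  # index into Solr after the loop ends
    return steps


if __name__ == "__main__":
    # Print the sequence for two crawl rounds.
    print(" -> ".join(crawl_steps(2)))
```

Each name would map to the corresponding Nutch command in a real run; the exact arguments depend on the Nutch version, so they are left out here.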
The Nutch tutorial [0] and the crawl script [1] use the same order.

[0] : http://wiki.apache.org/nutch/NutchTutorial
[1] : http://svn.apache.org/viewvc/nutch/trunk/src/bin/crawl?view=markup

On Wed, Jul 3, 2013 at 8:46 AM, h b <[email protected]> wrote:

> On most documents and email list, I have seen that the order of crawl for
> nutch-solr is
>
> inject
> loop
> generate
> fetch
> updatedb
> parse
> end loop
> solr
>
> When I follow this path I always see solr has 0 docs, even if i run solr
> inside the loop, i still get 0 docs in solr.
>
> However, if I switch the order of updatedb and parse, then it works as I
> expect it to.
>
> Would be nice to know what could be going on here.
>

