This is possible. Moreover, you can run more than one crawler. You can also look into Apache Oozie for scheduling such jobs.
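Something along these lines should work as a rough sketch, using the Nutch 1.x step-by-step commands instead of the all-in-one crawl command. It assumes a seed list in urls/, a crawl directory under crawl/, and a Solr core at http://localhost:8983/solr; those paths, the -topN value, and the number of cycles are placeholders you would adjust for your setup, and the exact commands can differ between Nutch versions:

    #!/bin/bash
    # Run short crawl cycles and index each segment into Solr as soon as it
    # is fetched, so documents appear in Solr while later cycles still run.
    SOLR_URL=http://localhost:8983/solr   # placeholder Solr core URL

    bin/nutch inject crawl/crawldb urls

    for i in $(seq 1 100); do             # one cycle per "depth" level
      bin/nutch generate crawl/crawldb crawl/segments -topN 100
      SEGMENT=$(ls -d crawl/segments/* | tail -1)
      bin/nutch fetch $SEGMENT
      bin/nutch parse $SEGMENT
      bin/nutch updatedb crawl/crawldb $SEGMENT
      bin/nutch invertlinks crawl/linkdb -dir crawl/segments
      # index only this segment; earlier segments are already searchable
      bin/nutch solrindex $SOLR_URL crawl/crawldb -linkdb crawl/linkdb $SEGMENT
    done

Because each cycle indexes only its own segment, the Solr admin screen starts showing documents after the first cycle instead of after the whole 4-5 hour run.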
On 21 Mar 2014 13:44, "reddibabu" <[email protected]> wrote:

> My requirement is to crawl and index urls based on -depth 100 and -topN
> 100.
> The Nutch crawl command crawls all the urls first and then indexes them and
> sends the data all at once to Solr. As the depth and topN are 100 each, the
> whole process (crawling and indexing) takes around 4-5 hours.
>
> I would like to know if there is a way where crawling and indexing can be
> done in parallel so that some data can be seen in the Solr admin screen
> while the Nutch crawl job is still in progress.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-to-crawl-and-index-parallel-way-from-Nutch-into-Solr-tp4125990.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

