My requirement is to crawl and index URLs with -depth 100 and -topN 100. The Nutch crawl command crawls all the URLs first and only then indexes them, sending the data to Solr all at once at the end. With depth and topN both at 100, the whole process (crawling plus indexing) takes around 4-5 hours.
I would like to know whether crawling and indexing can be done in parallel, so that some data already appears in the Solr admin screen while the Nutch crawl job is still in progress.

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-crawl-and-index-parallel-way-from-Nutch-into-Solr-tp4125990.html
Sent from the Nutch - User mailing list archive at Nabble.com.
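[A sketch of one common workaround, not an official answer: instead of the single all-in-one crawl command, run the individual Nutch 1.x steps (generate, fetch, parse, updatedb, invertlinks, solrindex) in a loop and index each segment to Solr as soon as its round finishes. The paths, seed directory, and Solr URL below are assumptions; adjust them for your installation.]

```shell
#!/usr/bin/env sh
# Sketch: run Nutch round by round and push each round's segment to Solr
# immediately, so documents show up in Solr while later rounds still run.
# SOLR_URL, CRAWL_DIR, and SEEDS are assumed values for illustration.
SOLR_URL=http://localhost:8983/solr
CRAWL_DIR=crawl
SEEDS=urls
DEPTH=100
TOPN=100

# Seed the crawldb once.
bin/nutch inject "$CRAWL_DIR/crawldb" "$SEEDS"

i=1
while [ "$i" -le "$DEPTH" ]; do
  # Generate stops producing segments when no URLs are due for fetching.
  bin/nutch generate "$CRAWL_DIR/crawldb" "$CRAWL_DIR/segments" -topN "$TOPN" || break
  SEGMENT=$(ls -d "$CRAWL_DIR/segments/"* | tail -1)

  bin/nutch fetch "$SEGMENT"
  bin/nutch parse "$SEGMENT"
  bin/nutch updatedb "$CRAWL_DIR/crawldb" "$SEGMENT"
  bin/nutch invertlinks "$CRAWL_DIR/linkdb" "$SEGMENT"

  # Index this round's segment now instead of waiting for all rounds to end.
  bin/nutch solrindex "$SOLR_URL" "$CRAWL_DIR/crawldb" -linkdb "$CRAWL_DIR/linkdb" "$SEGMENT"

  i=$((i + 1))
done
```

[With this loop, Solr receives new documents after every round rather than once at the very end; the total crawl time stays roughly the same, but results become visible incrementally.]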

