On Fri, Jul 29, 2011 at 8:56 PM, abhayd <ajdabhol...@hotmail.com> wrote:
> hi
>
> I have a xml file which has url, category,subcategory, title kind of
> details.
>
> and we crawl the urls in xml using Nutch. Anyway for use to merge both?
[...]

Not sure that I follow your requirements, and it has been some time
since I used Nutch. But, if I understand correctly, you should be able
to do the following:

* Populate the Solr index from the XML file, leaving the
  crawl_data_summary and crawl_data_body_content fields blank.
* Crawl the URLs with Nutch, and use its solrindex command to fill
  these two fields.

Regards,
Gora
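
For the first step, a rough sketch of turning the XML file into a Solr XML update message might look like the following. The layout of the source file is not shown in the thread, so the <items>/<item> structure, the sample record, and the helper name to_solr_add are all assumptions made for illustration. The two crawl fields are simply left out of each document so that they start empty.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout of the source file; the thread does not show the real one.
SAMPLE = """<items>
  <item>
    <url>http://example.com/a</url>
    <category>books</category>
    <subcategory>fiction</subcategory>
    <title>Example A</title>
  </item>
</items>"""

# Fields named in the original question.
FIELDS = ("url", "category", "subcategory", "title")


def to_solr_add(source_xml):
    """Convert the source records into a Solr <add> update message."""
    root = ET.fromstring(source_xml)
    add = ET.Element("add")
    for item in root.iter("item"):
        doc = ET.SubElement(add, "doc")
        for name in FIELDS:
            value = item.findtext(name)
            if value is not None:
                field = ET.SubElement(doc, "field", name=name)
                field.text = value.strip()
        # The two crawl fields are deliberately omitted here;
        # they are meant to be filled in by the Nutch step.
    return ET.tostring(add, encoding="unicode")


solr_xml = to_solr_add(SAMPLE)
print(solr_xml)
```

The resulting message would then be posted to the update handler of your Solr core. The field names have to match whatever your schema.xml defines.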