On Fri, Jul 29, 2011 at 8:56 PM, abhayd <ajdabhol...@hotmail.com> wrote:
> hi
>
> I have an XML file which has url, category, subcategory, title kind of
> details.
>
> And we crawl the URLs in the XML using Nutch. Is there any way for us
> to merge both?
[...]

Not sure that I follow your requirements, and it has
been some time since I used Nutch. But, if I understand
correctly, you should be able to do the following:
* Populate the Solr index from the XML file, leaving
  the crawl_data_summary and crawl_data_body_content
  fields blank.
* Crawl the URLs with Nutch, and use its solrindex command
  to fill these two fields.
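
As a rough sketch of the first step only: the Python below builds a Solr
XML update message from the metadata file and leaves out the two crawl
fields. The input layout (an <items> root with <item> children) and the
sample values are assumptions, since the original post does not show the
real file. Only the field names url, category, subcategory and title come
from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical input layout; the real file's structure is not shown in the thread.
SAMPLE = """<items>
  <item>
    <url>http://example.com/a</url>
    <category>phones</category>
    <subcategory>android</subcategory>
    <title>Page A</title>
  </item>
</items>"""

# Field names taken from the original post.
METADATA_FIELDS = ("url", "category", "subcategory", "title")


def build_solr_add(source_xml):
    """Turn the metadata file into a Solr <add> update message."""
    add = ET.Element("add")
    for item in ET.fromstring(source_xml).iter("item"):
        doc = ET.SubElement(add, "doc")
        for name in METADATA_FIELDS:
            value = item.findtext(name)
            if value is None:
                continue  # skip metadata that this item does not carry
            field = ET.SubElement(doc, "field", name=name)
            field.text = value.strip()
        # crawl_data_summary and crawl_data_body_content are deliberately
        # omitted here; they are meant to be filled by the Nutch crawl.
    return ET.tostring(add, encoding="unicode")


update_message = build_solr_add(SAMPLE)
```

The resulting string could then be POSTed to the Solr update handler and
committed. Which field serves as the unique key depends on your schema, so
check that it matches the key Nutch writes, otherwise the two passes will
not land on the same document.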

Regards,
Gora
