Hi all

I'm new to Nutch.

I have a running system (Solr 4, Nutch 1.6) that currently indexes about
360,000 documents. To support a kind of source-specific search, I'd like
to store the original seed URL of each document in Solr as well.

My crawl is limited to the seed domains: db.ignore.external.links=true

Currently, I work around the problem by limiting the search to the same
domain as the seed URL. That works fine in most cases.

But I have several seed URLs that start in the same domain, and these
cannot be separated this way.

Any suggestions?
Thanks
Hofer

