Dear Julien

Thanks for the hint.

So, an entry in the seed file could look like this:

http://www.nutch.org \t metatag=http://www.nutch.org

And the property

<name>urlmeta.tags</name>

should have the value

<value>metatag</value>

And the field should be added to the Solr schema and the Solr mapping configuration.

Right?

This would indeed be exactly what I need.

Best
Urs

On 02.05.2013 at 14:30, Julien Nioche <[email protected]> wrote:

> Hi Urs,
>
> The plugin urlMeta can be used for that. You can add a custom feature to
> entries in your seed list and configure the parameters used by urlMeta so
> that the metadata value gets transferred to the outlinks. See discussion
> on http://markmail.org/message/lyk7pnbovabvcezv
>
> J.
>
>
> On 2 May 2013 12:45, Urs Hofer <[email protected]> wrote:
>
>> Hi all
>>
>> I'm new to Nutch.
>>
>> I have a running system (Solr 4, Nutch 1.6), currently indexing about
>> 360000 documents. In order to run a kind of source-specific search,
>> I'd like to store the original seed URL in Solr as well.
>>
>> My crawl is limited to the domain: db.ignore.external.links=true
>>
>> Currently, I'm solving the problem by limiting the search to the same
>> domain as the seed URL. That works (mostly) quite well.
>>
>> But I have several seed URLs starting in the same domain, which cannot
>> be separated this way.
>>
>> Any suggestions?
>> Thanks
>> Hofer
>>
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
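
For reference, the seed-file entry proposed at the top of this thread could be generated with a small script. This is only a sketch. It assumes that the "\t" in the example stands for a single tab character between the URL and the key=value pair, and that each seed URL is tagged with itself under the key "metatag" (the name used in the message). The function names seed_line and seed_lines are made up for illustration and are not part of Nutch.

```python
def seed_line(url, key="metatag", value=None):
    """Build one seed-file line: the URL, a tab, then key=value metadata.

    Assumption: the metadata value defaults to the seed URL itself,
    as in the example from the thread.
    """
    if value is None:
        value = url
    return f"{url}\t{key}={value}"


def seed_lines(urls, key="metatag"):
    """Build seed-file lines for several URLs, one line per URL."""
    return [seed_line(url, key) for url in urls]


if __name__ == "__main__":
    # Hypothetical seed URLs in the same domain, each tagged with itself.
    for line in seed_lines(["http://www.nutch.org", "http://www.nutch.org/docs"]):
        print(line)
```

Each printed line can be written to the seed file as is. The key has to match the value configured in urlmeta.tags for the metadata to be picked up.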

