Hi,

Do you mean you need to store the domain of a page in Solr? For example, if http://www.xyz.com/intro.html is indexed, do you wish to store http://www.xyz.com in the Solr document as well?

If so, you can write a Nutch plugin with a custom indexing filter that adds the field to the document. Refer to http://wiki.apache.org/nutch/WritingPluginExample
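
To illustrate the value such a filter would add, here is a minimal sketch in plain Java that derives http://www.xyz.com from http://www.xyz.com/intro.html. The class name SiteRoot and its method are hypothetical and are not part of Nutch; the sketch only covers the URL handling, and wiring it into an indexing filter is described on the wiki page above.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Hypothetical helper (not part of Nutch): reduces a page URL to scheme://host[:port]. */
final class SiteRoot {
    private SiteRoot() {}

    /** Returns scheme://host[:port] for an absolute URL, or null if it cannot be derived. */
    static String of(String url) {
        if (url == null) {
            return null;
        }
        try {
            URI uri = new URI(url.trim());
            String scheme = uri.getScheme();
            String host = uri.getHost();
            if (scheme == null || host == null) {
                return null; // relative or otherwise unusable URL
            }
            StringBuilder root = new StringBuilder();
            root.append(scheme.toLowerCase(Locale.ROOT))
                .append("://")
                .append(host.toLowerCase(Locale.ROOT));
            if (uri.getPort() != -1) {
                root.append(':').append(uri.getPort()); // keep an explicit port
            }
            return root.toString();
        } catch (URISyntaxException e) {
            return null; // malformed input: skip the field rather than fail
        }
    }

    public static void main(String[] args) {
        // prints "http://www.xyz.com"
        System.out.println(SiteRoot.of("http://www.xyz.com/intro.html"));
    }
}
```

In the indexing filter you would compute this value from the page URL and add it to the document under a field name of your choosing.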
You also need to change the Solr schema.xml to define this new field.

Thanks
Chethan

On Thu, May 2, 2013 at 5:15 PM, Urs Hofer <[email protected]> wrote:
> Hi all
>
> I'm new with nutch.
>
> I have a running System (Solr 4, Nutch 1.6), currently indexing about
> 360000 Documents. In order to execute kind of a source specific search,
> I'd like to store the original seed-url in Solr as well.
>
> My crawl is limited to the domain: db.ignore.external.links=true
>
> Currently, I'm solving the problem by limiting the search to the same
> domain as the seed-url. That works (mostly) quite fine.
>
> But I have several seed urls starting in the same domain, which cannot
> be separated using this way.
>
> Any suggestions?
> Thanks
> Hofer

