Katrina,

If I understand you correctly, you could do this with the index-static plugin, which is configured with the following property:
<property>
  <name>index.static</name>
  <value>fieldname:fieldcontent</value>
  <description>
    A simple plugin called at indexing that adds fields with static data.
    You can specify a list of fieldname:fieldcontent per nutch job.
    It can be useful when collections can't be created by urlpatterns,
    like in subcollection, but on a job-basis.
  </description>
</property>

Use crawlname as your fieldname and use a different config directory for each of your crawls with an appropriate value for fieldcontent set in each.

Iain

-----Original Message-----
From: Katrina Riehl [mailto:[email protected]]
Sent: Wednesday, April 8, 2015 9:41 AM
To: [email protected]
Subject: Re: Adding field to Nutch / Solr

Right, I can create multiple collections no problem... but, what I'd really love is to put them into the same collection, just adding a field like "crawl_name" to the index. Any way I can do that?

Thanks!

On Wed, Apr 8, 2015 at 9:15 AM, Iain Lopata <[email protected]> wrote:

> Katrina,
>
> When you specify the solr instance as the third parameter to bin/crawl
> try specifying the collection name in the path e.g.
> http://localhost:8080/solr/collection1
>
> Iain
>
> -----Original Message-----
> From: Katrina Riehl [mailto:[email protected]]
> Sent: Wednesday, April 8, 2015 8:51 AM
> To: [email protected]
> Subject: Adding field to Nutch / Solr
>
> Hello,
>
> I am new to using Nutch. I'm developing an application that crawls
> websites, and then indexes information about those websites into a
> Solr instance. The problem is, it's putting all the crawled documents
> into the same Solr collection.
>
> Is there a way for me to add a field specifying which crawl the index
> came from? Is there a command line option I can add when I start the crawl?
>
> Thank you so much for your help.
>
> --
> Katrina Riehl
> Continuum Analytics
> [email protected]

