Hi Chris,

You should be able to do this quick-and-dirty with a relatively simple modification to Nutch’s integrated Elasticsearch indexer plugin (called indexer-elastic). Within the org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.write() method, try changing the index name (specifically the line IndexRequestBuilder request = client.prepareIndex(defaultIndex, type, id);) from defaultIndex to the domain name of the document that you’re indexing.
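
As a rough sketch of the "domain name of the document" part, something like the helper below could derive an index name from the document's URL. The class name IndexNames, the method indexNameFor, and the fallback value "nutch" are all made up for illustration and are not part of Nutch. It lowercases the host because Elasticsearch requires lowercase index names, and it strips a leading "www." so that both www.abc.com and abc.com land in the same index. Note that it yields "abc.com" rather than "abc"; trimming the TLD would be a further choice on your end.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, not part of Nutch: maps a document URL to an index name.
final class IndexNames {

    // Returns the lowercased host without a leading "www.", or the fallback
    // when the URL is missing, unparsable, or has no host.
    static String indexNameFor(String url, String fallback) {
        if (url == null) {
            return fallback;
        }
        try {
            String host = new URI(url.trim()).getHost();
            if (host == null || host.isEmpty()) {
                return fallback;
            }
            host = host.toLowerCase(Locale.ROOT);
            if (host.startsWith("www.")) {
                host = host.substring("www.".length());
            }
            return host.isEmpty() ? fallback : host;
        } catch (URISyntaxException e) {
            // Malformed URL: keep indexing into the default index.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(indexNameFor("http://www.abc.com/page.html", "nutch")); // abc.com
        System.out.println(indexNameFor("http://XYZ.com/", "nutch"));              // xyz.com
        System.out.println(indexNameFor("not a url", "nutch"));                    // nutch
    }
}
```

Inside write() you would then pass the result in place of defaultIndex, with defaultIndex as the fallback. Where the URL comes from is an assumption on my part: I believe the document's "id" field holds it, but check that against your Nutch version and indexing filters.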

And to answer Markus’s question, I think that the ElasticIndexWriter opens a single ES client connection, so you shouldn’t have to worry about a separate connection for each host. But maybe somebody with more know-how can give you a more affirmative answer.

Cheers
Jake

On Jun 18, 2014, at 2:54 PM, Chris Mielke <[email protected]> wrote:

> Hey all,
>
> Pretty new to Nutch and getting it integrated with Elasticsearch. I've
> managed to finally get it working. Ideally, I'd like to have a separate
> Elasticsearch index for each site that is crawled, or a separate
> Elasticsearch index type for each site.
>
> For example:
> Site abc.com ends up in the index "abc" in Elasticsearch
> Site xyz.com ends up in the index "xyz" in Elasticsearch
>
> Is there a way to do this?
>
> Thanks!
>
> ..Chris
>
> Chris Mielke
> Web Developer

