Hi Lewis,

Looking through the logs, I did find the error below. I was reading that if Nutch can't find Elasticsearch it will default to Solr (which would explain the last portion of the error), but I don't understand why Nutch can't find the ES node. I have verified that:
1) The ES port is 9300 (in nutch-site.xml).
2) The ES cluster name is the same (in nutch-site.xml and at http://localhost:9200).
3) I have static IPs in my docker-compose.yml, and from the Nutch container I can ping 172.20.128.4 (the ES container IP).

Thanks

2017-12-28 14:01:32,986 INFO elastic2.ElasticIndexWriter - Processing remaining requests [docs = 116, length = 1133796, total docs = 116]
2017-12-28 14:01:32,987 INFO elastic2.ElasticIndexWriter - Processing remaining requests [docs = 116, length = 1133796, total docs = 116]
2017-12-28 14:01:32,988 WARN mapred.LocalJobRunner - job_local387272193_0001
java.lang.Exception: NoNodeAvailableException[None of the configured nodes are available: [{#transport#-1}{172.20.128.4}{172.20.128.4:9300}]]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: NoNodeAvailableException[None of the configured nodes are available: [{#transport#-1}{172.20.128.4}{172.20.128.4:9300}]]
    at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:290)
    at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:207)
    at org.elasticsearch.client.transport.support.TransportProxyClient.execute(TransportProxyClient.java:55)
    at org.elasticsearch.client.transport.TransportClient.doExecute(TransportClient.java:286)
    at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:351)
    at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
    at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
    at org.apache.nutch.indexwriter.elastic2.ElasticIndexWriter.commit(ElasticIndexWriter.java:208)
    at org.apache.nutch.indexwriter.elastic2.ElasticIndexWriter.close(ElasticIndexWriter.java:226)
    at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:116)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:54)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2017-12-28 14:01:33,847 ERROR indexer.IndexingJob - SolrIndexerJob: java.lang.RuntimeException: job failed: name=apache-nutch-2.4-SNAPSHOT.jar, jobid=job_local387272193_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
    at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:158)
    at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:197)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:206)

Sent: Tuesday, December 26, 2017 at 9:34 AM
From: "lewis john mcgibbney" <[email protected]>
To: [email protected]
Subject: Re: Nutch 2.x does not send index to ElasticSearch 2.3.3

Hi Devil,

Do your logs indicate any issues?

Lewis

On Mon, Dec 25, 2017 at 5:41 PM, <[email protected]> wrote:

>
> ---------- Forwarded message ----------
> From: devil devil <[email protected]>
> To: [email protected]
> Cc:
> Bcc:
> Date: Fri, 22 Dec 2017 21:24:51 +0100
> Subject: Nutch 2.x does not send index to ElasticSearch 2.3.3
> Hello,
> I am running nutch 2.x and elasticsearch 2.3.3 in two containers. I
> can log into nutch container and curl E.S. so connectivity is there.
> Inject/Fetch/etc all work fine.
> However when i get to nutch index
> elasticsearch, all i get is:
>
> root@b211135e1be5:~/nutch/bin# ./nutch index elasticsearch -all
> IndexingJob: starting
> Active IndexWriters :
> ElasticIndexWriter
>     elastic.cluster : elastic prefix cluster
>     elastic.host : hostname
>     elastic.port : port (default 9300)
>     elastic.index : elastic index command
>     elastic.max.bulk.docs : elastic bulk index doc counts. (default 250)
>     elastic.max.bulk.size : elastic bulk index length. (default 2500500 ~2.5MB)
>
> I tried various E.S. versions and various combinations of settings, but
> still getting nowhere.
> My elasticsearch.conf is empty (should I have something here?)
> Below is my nutch-site.xml (I was using indexer-elastic before but was
> getting the "No indexwriters found" errors. Then I saw there is
> indexer-elastic2 plugin)
>
>

