Hi Lewis,

   Looking through the logs, I did find the error below. I read that if 
Nutch can't find Elasticsearch it defaults to Solr (which would explain the 
last portion of the error).
   I don't understand why Nutch can't find the ES node.
   I have verified that:

      1) The ES port is 9300 (in nutch-site.xml)
      2) The ES cluster name is the same (in nutch-site.xml and at 
http://localhost:9200)
      3) I have static IPs in my docker-compose.yml, and from the Nutch 
container I can ping 172.20.128.4 (the ES container IP).
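
One caveat on point 3: ping only exercises ICMP, so it does not show that the 
TCP transport port 9300 is reachable from the Nutch container. A minimal sketch 
of such a check, assuming Python 3 is available in that container (the helper 
name can_connect is made up for illustration and is not part of Nutch or ES):

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration only. It tests plain TCP
    reachability; it does not speak the Elasticsearch transport protocol
    and does not check the cluster name.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running can_connect("172.20.128.4", 9300) from inside the Nutch container would 
tell whether the port from the exception is open at all. A True result only 
rules out a network-level block; it says nothing about version or cluster-name 
mismatches.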

Thanks

2017-12-28 14:01:32,986 INFO  elastic2.ElasticIndexWriter - Processing remaining requests [docs = 116, length = 1133796, total docs = 116]
2017-12-28 14:01:32,987 INFO  elastic2.ElasticIndexWriter - Processing remaining requests [docs = 116, length = 1133796, total docs = 116]
2017-12-28 14:01:32,988 WARN  mapred.LocalJobRunner - job_local387272193_0001
java.lang.Exception: NoNodeAvailableException[None of the configured nodes are available: [{#transport#-1}{172.20.128.4}{172.20.128.4:9300}]]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: NoNodeAvailableException[None of the configured nodes are available: [{#transport#-1}{172.20.128.4}{172.20.128.4:9300}]]
    at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:290)
    at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:207)
    at org.elasticsearch.client.transport.support.TransportProxyClient.execute(TransportProxyClient.java:55)
    at org.elasticsearch.client.transport.TransportClient.doExecute(TransportClient.java:286)
    at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:351)
    at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
    at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
    at org.apache.nutch.indexwriter.elastic2.ElasticIndexWriter.commit(ElasticIndexWriter.java:208)
    at org.apache.nutch.indexwriter.elastic2.ElasticIndexWriter.close(ElasticIndexWriter.java:226)
    at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:116)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:54)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2017-12-28 14:01:33,847 ERROR indexer.IndexingJob - SolrIndexerJob: java.lang.RuntimeException: job failed: name=apache-nutch-2.4-SNAPSHOT.jar, jobid=job_local387272193_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
    at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:158)
    at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:197)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:206)
 

Sent: Tuesday, December 26, 2017 at 9:34 AM
From: "lewis john mcgibbney" <[email protected]>
To: [email protected]
Subject: Re: Nutch 2.x does not send index to ElasticSearch 2.3.3
Hi Devil,
Do your logs indicate any issues?
Lewis

On Mon, Dec 25, 2017 at 5:41 PM, <[email protected]> wrote:

>
> ---------- Forwarded message ----------
> From: devil devil <[email protected]>
> To: [email protected]
> Cc:
> Bcc:
> Date: Fri, 22 Dec 2017 21:24:51 +0100
> Subject: Nutch 2.x does not send index to ElasticSearch 2.3.3
> Hello,
> I am running Nutch 2.x and Elasticsearch 2.3.3 in two containers. I
> can log into the Nutch container and curl ES, so connectivity is there.
> Inject/Fetch/etc. all work fine. However, when I get to nutch index
> elasticsearch, all I get is:
>
> root@b211135e1be5:~/nutch/bin# ./nutch index elasticsearch -all
> IndexingJob: starting
> Active IndexWriters :
> ElasticIndexWriter
> elastic.cluster : elastic prefix cluster
> elastic.host : hostname
> elastic.port : port (default 9300)
> elastic.index : elastic index command
> elastic.max.bulk.docs : elastic bulk index doc counts. (default 250)
> elastic.max.bulk.size : elastic bulk index length. (default 2500500 ~2.5MB)
>
> I tried various E.S. versions and various combinations of settings, but
> still getting nowhere.
> My elasticsearch.conf is empty (should I have something here?)
> Below is my nutch-site.xml (I was using indexer-elastic before but was
> getting the "No indexwriters found" errors. Then I saw there is
> indexer-elastic2 plugin)
>
>
