Hi All,

I am trying to send data from Hive to ES. My job keeps getting 
connection timeouts. On the ES side, I do not see any errors, though. 

2014-04-29 21:11:08,783 INFO org.apache.commons.httpclient.HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
2014-04-29 21:11:08,783 INFO org.apache.commons.httpclient.HttpMethodDirector: Retrying request
2014-04-29 21:11:29,807 INFO org.apache.commons.httpclient.HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out: connect
2014-04-29 21:11:29,807 INFO org.apache.commons.httpclient.HttpMethodDirector: Retrying request
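For reference, the time each connection attempt takes before failing can be read off the two failure timestamps in the log above. The small Python sketch below only does that subtraction; the function name is made up for illustration and the timestamp format is assumed to be the log4j-style "date time,millis" layout shown in the log.

```python
from datetime import datetime

# Assumed layout of the timestamps in the task log: "2014-04-29 21:11:08,783".
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(first: str, second: str) -> float:
    """Return the number of seconds elapsed between two log timestamps."""
    start = datetime.strptime(first, LOG_TS_FORMAT)
    end = datetime.strptime(second, LOG_TS_FORMAT)
    return (end - start).total_seconds()


# The retry is logged at 21:11:08,783 and the next failure at 21:11:29,807.
gap = seconds_between("2014-04-29 21:11:08,783", "2014-04-29 21:11:29,807")
print(f"{gap:.3f} s per failed connection attempt")
```

Each attempt therefore waits about 21 seconds before it is reported as timed out, rather than being rejected immediately.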


...... 


org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"col1":7225,"col2":27041,"col3":0.93}
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:673)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1233)
        at org.apache.hadoop.mapred.Child.main(Child.java:260)

Caused by: org.elasticsearch.hadoop.rest.EsHadoopProtocolException: Connection error (check network and/or proxy settings) - out of nodes and retries
        at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:96)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:275)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:267)
        at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:326)
        at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:268)
        at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.initSingleIndex(EsOutputFormat.java:210)
        at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:199)
        at org.elasticsearch.hadoop.hive.EsHiveOutputFormat$EsHiveRecordWriter.write(EsHiveOutputFormat.java:58)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:637)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:654)
        ... 9 more

Caused by: java.net.ConnectException: Connection timed out: connect
        at java.net.TwoStacksPlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:157)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
        at java.net.Socket.connect(Socket.java:579)
        at java.net.Socket.connect(Socket.java:528)
        at java.net.Socket.<init>(Socket.java:425)
        at java.net.Socket.<init>(Socket.java:280)
        at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:79)
        at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:121)
        at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:706)
        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:386)
        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
        at org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:298)
        at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:80)
        ... 26 more
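
The innermost cause in the trace is a java.net.ConnectException thrown from Socket.connect, so the failure happens while the TCP connection is being opened, before any HTTP request reaches ES. Below is a minimal sketch of a reachability check that could be run from one of the Hadoop task nodes to see whether the ES HTTP port can be opened at all. It is plain Python and not part of es-hadoop; the function name is made up for illustration, the host in the usage comment is a placeholder, and 9200 is assumed to be the ES HTTP port (the default).

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False


# Hypothetical usage from a task node (replace the placeholder host with one
# of the hosts the job is configured to write to):
#   print(can_connect("es-node-1.example.com", 9200))
```

If this returns False from the task nodes but True from other machines, the problem would more likely be in the network path (firewall, proxy, or wrong host/port) than in ES indexing throughput.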



I have tried different batch sizes and the result has been the same. The ES 
machines become CPU bound at times, but memory never comes under strain. I have 
tried tweaking the settings below:


'es.batch.size.entries'='1000',

'es.http.timeout'='10m',

'es.batch.write.refresh'='false',

'es.action.heart.beat.lead'='60s'


Could this be an ES issue where it cannot keep up with the incoming data load? 


Thanks!

Ravi
