Hi,

I've run into a ZooKeeper connection error while running a Nutch Hadoop job. The tasks stall because they cannot connect to the ZooKeeper server. Here's what I know:

1. The ZK connection error is the only known problem; the other logs report no issues.

2. The error message on the YARN NodeManager on one of the slaves is:

2017-08-16 19:03:42,280 INFO [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-08-16 19:03:42,281 WARN [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

The connection keeps failing until the task hits the 10-minute limit and fails.

3. The ZooKeeper server is deployed only on the master.

4. The cluster is managed by Cloudera Manager 5.12.

Could a configuration setting be missing on the Nutch side or the Cloudera Manager side? There are no ZK servers on the slaves, so the NodeManager should be connecting to the ZK server on the master instead of localhost:2181.
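
For reference, the difference between the two endpoints can be confirmed from a slave with a plain TCP reachability check. This is only a minimal sketch: the function name zk_port_open and the hostname master.example.com are placeholders of mine, not anything taken from the cluster, and a successful connect only shows that something is listening on the port, not that the job is configured to use it.

```python
import socket


def zk_port_open(host, port=2181, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; 2181 is the ZooKeeper client port
    seen in the log above.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # "master.example.com" is a placeholder; substitute the real master host.
    for host in ("localhost", "master.example.com"):
        state = "open" if zk_port_open(host) else "unreachable"
        print(f"{host}:2181 is {state}")
```

If localhost is refused while the master host is reachable, that would be consistent with the client falling back to a localhost default rather than a network problem.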

Any suggestion or help is greatly appreciated!

Thank you,

Michael
