[
https://issues.apache.org/jira/browse/HADOOP-16763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999663#comment-16999663
]
Masatake Iwasaki edited comment on HADOOP-16763 at 12/19/19 2:33 AM:
---------------------------------------------------------------------
[~elgoiri], I should have set
{{yarn.resourcemanager.ha.curator-leader-elector.enabled}} to {{true}} to
reproduce the issue. I got the error below with zookeeper-3.5.6.jar on the
classpath of RM:
{noformat}
2019-12-19 02:23:25,754 ERROR org.apache.curator.framework.recipes.leader.LeaderLatch: getChildren() failed. rc = -6
2019-12-19 02:23:25,853 INFO org.apache.curator.framework.state.ConnectionStateManager: State change: SUSPENDED
2019-12-19 02:23:25,901 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocolPB to the server
2019-12-19 02:23:25,915 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2019-12-19 02:23:25,916 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8033: starting
2019-12-19 02:23:26,269 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 98e7b66e95e3/172.18.0.11:2181. Will not attempt to authenticate using SASL (unknown error)
2019-12-19 02:23:26,270 INFO org.apache.zookeeper.ClientCnxn: Socket connection established, initiating session, client: /172.18.0.11:57042, server: 98e7b66e95e3/172.18.0.11:2181
2019-12-19 02:23:26,282 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server 98e7b66e95e3/172.18.0.11:2181, sessionid = 0x2000dc764480006, negotiated timeout = 10000
2019-12-19 02:23:26,282 INFO org.apache.curator.framework.state.ConnectionStateManager: State change: RECONNECTED
2019-12-19 02:23:26,287 WARN org.apache.zookeeper.ClientCnxn: Session 0x2000dc764480006 for server 98e7b66e95e3/172.18.0.11:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Xid out of order. Got Xid 5 with err -6 expected Xid 4 for a packet with details: clientPath:/zookeeper/config serverPath:/zookeeper/config finished:false header:: 4,4 replyHeader:: 0,0,-4 request:: '/zookeeper/config,T response::
    at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:907)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:363)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
2019-12-19 02:23:26,391 INFO org.apache.curator.framework.state.ConnectionStateM
{noformat}
After I replaced the ZooKeeper (client) jar on the classpath with zookeeper-3.4.14.jar, it worked.
{noformat}
$ docker exec hadoop01 /hadoop/bin/hadoop classpath --glob | sed -z -e 's/:/\n/g' | grep zookeeper
/zookeeper/zookeeper-3.4.14.jar
$ docker exec hadoop01 /hadoop/bin/yarn rmadmin -getServiceState rm1
2019-12-19 01:52:27,597 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
{noformat}
If both zookeeper-3.4.14.jar and zookeeper-3.5.6.jar are on the classpath, I
get a different error, shown below.
{noformat}
$ docker exec hadoop01 cat /hadoop/etc/hadoop/hadoop-env.sh | grep '^export.*CLASSPATH'
export HADOOP_CLASSPATH="/zookeeper/zookeeper-3.4.14.jar"
export HADOOP_USER_CLASSPATH_FIRST="yes"
$ docker exec hadoop01 /hadoop/bin/hadoop classpath --glob | sed -z -e 's/:/\n/g' | grep zookeeper
/zookeeper/zookeeper-3.4.14.jar
/hadoop/share/hadoop/common/lib/zookeeper-3.5.6.jar
/hadoop/share/hadoop/common/lib/zookeeper-jute-3.5.6.jar
/hadoop/share/hadoop/hdfs/lib/zookeeper-3.5.6.jar
/hadoop/share/hadoop/hdfs/lib/zookeeper-jute-3.5.6.jar
{noformat}
{noformat}
2019-12-19 02:26:40,040 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.NoSuchMethodError: org.apache.zookeeper.server.quorum.flexible.QuorumMaj.<init>(Ljava/util/Map;)V
    at org.apache.curator.framework.imps.EnsembleTracker.<init>(EnsembleTracker.java:57)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.<init>(CuratorFrameworkImpl.java:159)
    at org.apache.curator.framework.CuratorFrameworkFactory$Builder.build(CuratorFrameworkFactory.java:165)
    at org.apache.hadoop.util.curator.ZKCuratorManager.start(ZKCuratorManager.java:154)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndStartZKManager(ResourceManager.java:419)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createEmbeddedElector(ResourceManager.java:385)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:333)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
{noformat}
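For anyone checking their own setup for this kind of mixed-version classpath, here is a minimal sketch. The function name {{find_version_conflicts}} is hypothetical (it is not part of Hadoop), and the jar-name parsing is a heuristic that assumes {{<artifact>-<version>.jar}} file names with no hyphen inside the version, so jars such as SNAPSHOT builds are skipped.

```shell
# Hypothetical helper, not part of Hadoop: read a colon-separated classpath
# on stdin and print every artifact that appears with more than one version.
find_version_conflicts() {
  tr ':' '\n' |
    # Keep only entries shaped like .../<artifact>-<version>.jar and
    # rewrite them as "<artifact> <version>".
    sed -n 's|.*/\([A-Za-z0-9_.-]*\)-\([0-9][0-9A-Za-z.]*\)\.jar$|\1 \2|p' |
    # The same jar may be listed from several lib directories; count it once.
    sort -u |
    awk '{ versions[$1] = versions[$1] " " $2; count[$1]++ }
         END { for (a in count) if (count[a] > 1) print a ":" versions[a] }' |
    sort
}

# Usage (assumes the same container layout as in the commands above):
#   docker exec hadoop01 /hadoop/bin/hadoop classpath --glob | find_version_conflicts
```

Fed the five zookeeper entries listed above, it prints {{zookeeper: 3.4.14 3.5.6}}; {{zookeeper-jute}} is not reported because it is present only as 3.5.6.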
> Make Curator 4 run in soft-compatibility mode with ZooKeeper 3.4
> ----------------------------------------------------------------
>
> Key: HADOOP-16763
> URL: https://issues.apache.org/jira/browse/HADOOP-16763
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Íñigo Goiri
> Priority: Major
>
> HADOOP-16579 changed Curator to 4.2 and ZooKeeper to 3.5.
> This change relates to the client libraries used by the components.
> However, the ensemble in most deployments is 3.4 (default in Ubuntu for
> example).
> To allow this mode, there is a soft-compatibility mode described in
> http://curator.apache.org/zk-compatibility.html
> We should enable this soft-compatibility mode.