karthick-rn commented on issue #1578: Accumulo master hangs after TLS on ZK
URL: https://github.com/apache/accumulo/issues/1578#issuecomment-609731131

> @karthick-rn Have you looked at the SetGoalState to verify that Accumulo isn't leaving any ZK objects unclosed?

@ctubbsii Sorry for the delay. A 'goal_state' node exists in ZK before the Accumulo master is started, and its value is 'NORMAL', as shown below. The same 'goal_state' node also exists on a non-TLS ZK cluster, where the Accumulo master doesn't hang.

FYI, before starting the Accumulo master:
```
[zk: kn-fix-0:2281(CONNECTED) 2] get -s -w /accumulo/7a7a53a5-077b-4506-a674-a010a273ba5b/masters/goal_state
NORMAL
cZxid = 0x10000006a
ctime = Tue Mar 24 16:38:32 UTC 2020
mZxid = 0x500000015
mtime = Fri Apr 03 15:50:49 UTC 2020
pZxid = 0x10000006a
cversion = 0
dataVersion = 39
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
[zk: kn-fix-0:2281(CONNECTED) 3]
WATCHER::
```

Trying to start the Accumulo master rewrites 'goal_state'. The value stays 'NORMAL', but the write is visible in the changed "mtime" (and in "dataVersion" going from 39 to 40), as shown below:
```
WatchedEvent state:SyncConnected type:NodeDataChanged path:/accumulo/7a7a53a5-077b-4506-a674-a010a273ba5b/masters/goal_state
[zk: kn-fix-0:2281(CONNECTED) 3] get -s -w /accumulo/7a7a53a5-077b-4506-a674-a010a273ba5b/masters/goal_state
NORMAL
cZxid = 0x10000006a
ctime = Tue Mar 24 16:38:32 UTC 2020
mZxid = 0x50000b7f6
mtime = Mon Apr 06 09:31:43 UTC 2020
pZxid = 0x10000006a
cversion = 0
dataVersion = 40
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
```

FYI, below is the console output from starting the Accumulo master. The hang happens right after "Connected to HDFS"; the last line is the result of killing that intermediate process (SetGoalState).
```
[knarendran@kn-fix-0 ~]$ accumulo-service master start
Starting master on kn-fix-0
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/muchos/install/accumulo-2.0.0/lib/slf4j-log4j12-1.7.26.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/muchos/install/apache-zookeeper-3.5.7-bin/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-04-06 09:31:41,210 [conf.SiteConfiguration] INFO : Found Accumulo configuration on classpath at /opt/muchos/install/accumulo-2.0.0/conf/accumulo.properties
2020-04-06 09:31:42,095 [conf.ConfigurationTypeHelper] DEBUG: Loaded class : org.apache.accumulo.server.fs.RandomVolumeChooser
2020-04-06 09:31:43,417 [server.ServerUtil] INFO : Attempting to talk to zookeeper
2020-04-06 09:31:43,939 [server.ServerUtil] INFO : ZooKeeper connected and initialized, attempting to talk to HDFS
2020-04-06 09:31:43,958 [server.ServerUtil] INFO : Connected to HDFS
/opt/muchos/install/accumulo-2.0.0/bin/accumulo-service: line 57: 116969 Killed "${bin}/accumulo" org.apache.accumulo.master.state.SetGoalState NORMAL
```

Let me know if there are any further checks you want me to perform. Thanks
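
As a side note, the two `get -s` dumps in this comment can be compared mechanically instead of by eye. The following is only a minimal Python sketch under stated assumptions: it is not part of Accumulo or ZooKeeper, `parse_stat` and `changed_fields` are hypothetical helper names, and the input is simply the stat text pasted from the `zkCli` session in this comment.

```python
def parse_stat(text):
    """Collect the 'key = value' lines of pasted `get -s` output into a dict.

    Lines without ' = ' (the prompt, the node value, watcher notices) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def changed_fields(before, after):
    """Return {field: (old, new)} for every stat field that differs."""
    b, a = parse_stat(before), parse_stat(after)
    return {
        name: (b.get(name), a.get(name))
        for name in sorted(b.keys() | a.keys())
        if b.get(name) != a.get(name)
    }


# Stat output copied from this comment (before and after starting the master).
BEFORE = """NORMAL
cZxid = 0x10000006a
ctime = Tue Mar 24 16:38:32 UTC 2020
mZxid = 0x500000015
mtime = Fri Apr 03 15:50:49 UTC 2020
pZxid = 0x10000006a
cversion = 0
dataVersion = 39
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
"""

AFTER = """NORMAL
cZxid = 0x10000006a
ctime = Tue Mar 24 16:38:32 UTC 2020
mZxid = 0x50000b7f6
mtime = Mon Apr 06 09:31:43 UTC 2020
pZxid = 0x10000006a
cversion = 0
dataVersion = 40
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
"""

diff = changed_fields(BEFORE, AFTER)
for name, (old, new) in diff.items():
    print(f"{name}: {old} -> {new}")
```

For these two dumps, only `mZxid`, `mtime` and `dataVersion` differ, which is consistent with a single write to the node that left the value `NORMAL` unchanged.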
