[ 
https://issues.apache.org/jira/browse/GOBBLIN-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-1382:
-----------------------------
    Description: 
This log is from a localhost setup. The failure is intermittent: when I stop 
Gobblin on YARN and then start it again, the start sometimes fails because a 
znode already exists (here /GobblinYarn-2/CONTROLLER).

 This shows the failed start:
{code:java}
==> logs/yarn.out <==
2021-02-07 16:58:44 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Using ZooKeeper connection string: localhost:2181
2021-02-07 16:58:45 PST WARN  [main] org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-02-07 16:58:46 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Creating Helix cluster GobblinYarn-2 with overwrite: true
2021-02-07 16:58:46 PST ERROR [main] org.apache.helix.manager.zk.ZKHelixAdmin  - Error creating cluster:GobblinYarn-2
org.I0Itec.zkclient.exception.ZkNodeExistsException: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /GobblinYarn-2/CONTROLLER
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:55)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1161)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.create(ZkClient.java:535)
    at org.apache.helix.manager.zk.client.SharedZkClient.create(SharedZkClient.java:85)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.createPersistent(ZkClient.java:362)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.createPersistent(ZkClient.java:338)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.createPersistent(ZkClient.java:317)
    at org.apache.helix.manager.zk.ZKHelixAdmin.createZKPaths(ZKHelixAdmin.java:750)
    at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:715)
    at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:162)
    at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:91)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:346)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1120)
Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /GobblinYarn-2/CONTROLLER
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:122)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:792)
    at org.apache.helix.manager.zk.zookeeper.ZkConnection.create(ZkConnection.java:114)
    at org.apache.helix.manager.zk.zookeeper.ZkClient$1.call(ZkClient.java:538)
    at org.apache.helix.manager.zk.zookeeper.ZkClient$1.call(ZkClient.java:535)
    at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1151)
    ... 11 more
==> logs/yarn.err <==
Exception in thread "main" org.apache.helix.HelixException: cluster GobblinYarn-2 is not setup yet
    at org.apache.helix.manager.zk.ZKHelixAdmin.addStateModelDef(ZKHelixAdmin.java:989)
    at org.apache.helix.tools.ClusterSetup.addStateModelDef(ClusterSetup.java:361)
    at org.apache.helix.tools.ClusterSetup.addCluster(ClusterSetup.java:165)
    at org.apache.gobblin.cluster.HelixUtils.createGobblinHelixCluster(HelixUtils.java:91)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.launch(GobblinYarnAppLauncher.java:346)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.main(GobblinYarnAppLauncher.java:1120)
==> logs/yarn.out <==
2021-02-07 16:58:46 PST INFO  [Thread-5] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Stopping the GobblinYarnAppLauncher
2021-02-07 16:58:46 PST INFO  [Thread-5] org.apache.gobblin.util.ExecutorsUtils  - Attempting to shutdown ExecutorService: java.util.concurrent.Executors$DelegatedScheduledExecutorService@1a5b4edc
2021-02-07 16:58:46 PST INFO  [Thread-5] org.apache.gobblin.util.ExecutorsUtils  - Successfully shutdown ExecutorService: java.util.concurrent.Executors$DelegatedScheduledExecutorService@1a5b4edc
2021-02-07 16:58:46 PST INFO  [Thread-5] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Disabling all live Helix instances..
==> logs/yarn.err <==
Exception in thread "Thread-5" org.apache.helix.HelixException: HelixManager (ZkClient) is not connected. Call HelixManager#connect()
    at org.apache.helix.manager.zk.ZKHelixManager.checkConnected(ZKHelixManager.java:363)
    at org.apache.helix.manager.zk.ZKHelixManager.getClusterManagmentTool(ZKHelixManager.java:908)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.disableLiveHelixInstances(GobblinYarnAppLauncher.java:544)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher.stop(GobblinYarnAppLauncher.java:447)
    at org.apache.gobblin.yarn.GobblinYarnAppLauncher$2.run(GobblinYarnAppLauncher.java:1107)
{code}
 

This shows the next run, which succeeded:


{code:java}
==> logs/yarn.out <==
2021-02-07 17:02:30 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Using ZooKeeper connection string: localhost:2181
2021-02-07 17:02:31 PST WARN  [main] org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-02-07 17:02:32 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Creating Helix cluster GobblinYarn-2 with overwrite: true
2021-02-07 17:02:32 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Created Helix cluster GobblinYarn-2
2021-02-07 17:02:32 PST INFO  [main] org.apache.hadoop.yarn.client.RMProxy  - Connecting to ResourceManager at /0.0.0.0:8032
2021-02-07 17:02:33 PST WARN  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Found 0 live instances in the cluster.
2021-02-07 17:02:33 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - No reconnectable application found so submitting a new application
2021-02-07 17:02:33 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - creating new yarn application
2021-02-07 17:02:33 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - created new yarn application: 3
2021-02-07 17:02:33 PST INFO  [main] org.apache.gobblin.yarn.GobblinYarnAppLauncher  - Configured GobblinApplicationMaster work directory to: hdfs://localhost:8020/tmp/gobblin-yarn/GobblinYarn-2/application_1612697249959_0003/appmaster
{code}
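
To check launcher output for this failure, the conflicting znode path can be pulled out of the log text. The following is only a minimal sketch, assuming the ZooKeeper message format seen in the failed start ("KeeperErrorCode = NodeExists for <path>"). The function name find_conflicting_znodes is hypothetical and not part of Gobblin or Helix.

```python
import re

# Matches the ZooKeeper error text from the failed-start log, e.g.
# "KeeperErrorCode = NodeExists for /GobblinYarn-2/CONTROLLER".
# Whitespace is matched loosely so hard-wrapped log lines still match.
NODE_EXISTS = re.compile(r"KeeperErrorCode\s*=\s*NodeExists\s+for\s+(/\S+)")


def find_conflicting_znodes(log_text):
    """Return the distinct znode paths reported as already existing, in order."""
    seen = []
    for path in NODE_EXISTS.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen


# Sample text copied from the failed-start log in this report.
sample = (
    "org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = "
    "NodeExists for /GobblinYarn-2/CONTROLLER at "
    "org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:55)"
)
print(find_conflicting_znodes(sample))
```

For the failed start in this report, the only path reported is /GobblinYarn-2/CONTROLLER, even though the message appears twice (once in the exception and once in its cause).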
 



> gobblin yarn fails to clean up old ZK node sometime
> ---------------------------------------------------
>
>                 Key: GOBBLIN-1382
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1382
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-yarn
>    Affects Versions: 0.15.0
>            Reporter: Jay Sen
>            Assignee: Abhishek Tiwari
>            Priority: Major
>             Fix For: 0.16.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
