[ https://issues.apache.org/jira/browse/TWILL-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399164#comment-16399164 ]

ASF GitHub Bot commented on TWILL-61:
-------------------------------------

Github user anew commented on a diff in the pull request:

    https://github.com/apache/twill/pull/67#discussion_r174583711

    --- Diff: twill-yarn/src/main/java/org/apache/twill/internal/appmaster/ApplicationMasterMain.java ---
    @@ -165,9 +168,47 @@ private ApplicationKafkaService(ZKClient zkClient, String kafkaZKConnect) {
         @Override
         protected void startUp() throws Exception {
    -      ZKOperations.ignoreError(
    -        zkClient.create(kafkaZKPath, null, CreateMode.PERSISTENT),
    -        KeeperException.NodeExistsException.class, kafkaZKPath).get();
    +      // Create the ZK node for Kafka to use. If the node already exists, delete it to make sure there is
    +      // no left over content from previous AM attempt.
    +      final SettableOperationFuture<String> completion = SettableOperationFuture.create(kafkaZKPath,
    +                                                                                        Threads.SAME_THREAD_EXECUTOR);
    +      LOG.info("Preparing Kafka ZK path {}{}", zkClient.getConnectString(), kafkaZKPath);
    +      Futures.addCallback(zkClient.create(kafkaZKPath, null, CreateMode.PERSISTENT), new FutureCallback<String>() {
    +
    +        final FutureCallback<String> thisCallback = this;
    +
    +        @Override
    +        public void onSuccess(String result) {
    +          completion.set(result);
    +        }
    +
    +        @Override
    +        public void onFailure(final Throwable createFailure) {
    +          if (!(createFailure instanceof KeeperException.NodeExistsException)) {
    +            completion.setException(createFailure);
    +          }
    --- End diff --

    return here?


> Second launch attempt of AM always failed
> -----------------------------------------
>
>                 Key: TWILL-61
>                 URL: https://issues.apache.org/jira/browse/TWILL-61
>             Project: Apache Twill
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Terence Yim
>            Assignee: Terence Yim
>            Priority: Major
>             Fix For: 0.5.0-incubating
>
>
> YARN makes multiple attempts to launch an application. Currently, the second
> and any later attempt always fails because creating the /runId/state node in
> ZK fails (node exists): runId is generated on the client side and doesn't
> change between attempts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
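
The "return here?" question on the onFailure branch can be illustrated with a small standalone sketch. It is not the Twill code: the rest of onFailure is cut off in the diff, so the sketch assumes that the method continues into a delete-and-recreate step (as the comment in the diff suggests). CompletableFuture stands in for SettableOperationFuture, NodeExistsException below is a hypothetical stand-in for KeeperException.NodeExistsException, and handleCreateFailure and its returnEarly flag are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class OnFailureReturnSketch {

  // Hypothetical stand-in for KeeperException.NodeExistsException.
  static class NodeExistsException extends Exception {
    private static final long serialVersionUID = 1L;
  }

  // Mirrors the shape of onFailure in the diff and returns the steps that ran.
  // returnEarly toggles the "return" the reviewer is asking about.
  static List<String> handleCreateFailure(Throwable createFailure,
                                          CompletableFuture<String> completion,
                                          boolean returnEarly) {
    List<String> steps = new ArrayList<>();
    if (!(createFailure instanceof NodeExistsException)) {
      // Any failure other than "node exists" fails the whole operation.
      completion.completeExceptionally(createFailure);
      steps.add("fail");
      if (returnEarly) {
        return steps;
      }
    }
    // Assumed continuation (not visible in the diff): remove the stale node
    // and create it again. Without the early return this also runs after
    // the future has already been failed.
    steps.add("delete-and-recreate");
    return steps;
  }

  public static void main(String[] args) {
    Throwable other = new IllegalStateException("connection lost");
    System.out.println(handleCreateFailure(other, new CompletableFuture<>(), false));
    System.out.println(handleCreateFailure(other, new CompletableFuture<>(), true));
    System.out.println(handleCreateFailure(new NodeExistsException(), new CompletableFuture<>(), true));
  }
}
```

Under that assumption, omitting the return means an unrelated failure both fails the completion future and falls through into the recovery path, which is presumably what the reviewer is pointing at.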