[ 
https://issues.apache.org/jira/browse/SAMZA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390263#comment-16390263
 ] 

ASF GitHub Bot commented on SAMZA-1607:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/samza/pull/437


> SAMZA-1607: Handle ZkNoNodeExistsException in zkUtils.readProcessorData
> -----------------------------------------------------------------------
>
>                 Key: SAMZA-1607
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1607
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Shanthoosh Venkataraman
>            Assignee: Shanthoosh Venkataraman
>            Priority: Major
>
> The existing implementation reads the data of ephemeral processor nodes in 
> ZooKeeper in two steps:
>    A. Fetch the list of ephemeral processor nodes.
>    B. Read the data of each processor node from the list.
> An ephemeral ZooKeeper node present in step A might no longer exist by step 
> B. The resulting ZkNoNodeException is currently unhandled and can kill the 
> leader processor unnecessarily. Here's the related exception observed in a 
> dev setup.
> {code:java}
> org.apache.samza.SamzaException: Cannot read ZK node: /app-test-app-name-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-test-app-id-9fba7675-36e3-4a6e-8934-4cad6a8ebab0/test-app-name-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-test-app-id-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-coordinationData/processors/0000000001
> at org.apache.samza.zk.ZkUtils.readProcessorData(ZkUtils.java:232)
> at org.apache.samza.zk.ZkUtils.getActiveProcessorsIDs(ZkUtils.java:255)
> at org.apache.samza.zk.ZkJobCoordinator.getActualProcessorIds(ZkJobCoordinator.java:292)
> at org.apache.samza.zk.ZkJobCoordinator.doOnProcessorChange(ZkJobCoordinator.java:194)
> at org.apache.samza.zk.ZkJobCoordinator.lambda$onProcessorChange$1(ZkJobCoordinator.java:188)
> at org.apache.samza.zk.ScheduleAfterDebounceTime.lambda$getScheduleableAction$0(ScheduleAfterDebounceTime.java:134)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /app-test-app-name-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-test-app-id-9fba7675-36e3-4a6e-8934-4cad6a8ebab0/test-app-name-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-test-app-id-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-coordinationData/processors/0000000001
> at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1001)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1100)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1095)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1084)
> at org.apache.samza.zk.ZkUtils.readProcessorData(ZkUtils.java:226)
> {code}
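
The list-then-read race described in the issue can be illustrated with a
small self-contained sketch. This is not the change made in the pull request;
it is one plausible way to tolerate a node that disappears between step A and
step B. NodeStore, InMemoryStore, NoNodeException and readLiveProcessorData
are hypothetical names invented for this example. NoNodeException stands in
for org.I0Itec.zkclient.exception.ZkNoNodeException so that the sketch runs
without a ZooKeeper client on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ProcessorReadSketch {

    /** Hypothetical stand-in for zkclient's ZkNoNodeException. */
    static final class NoNodeException extends RuntimeException {
        NoNodeException(String path) {
            super("NoNode for " + path);
        }
    }

    /** Hypothetical minimal view of a ZooKeeper client: list children, read one node. */
    interface NodeStore {
        List<String> getChildren(String parentPath);

        /** Throws NoNodeException when the node does not exist. */
        String readData(String path);
    }

    /**
     * In-memory store for the demo. The listed children and the readable data
     * are kept separately so a node can be "listed but already gone".
     */
    static final class InMemoryStore implements NodeStore {
        private final List<String> listedChildren;
        private final Map<String, String> dataByPath;

        InMemoryStore(List<String> listedChildren, Map<String, String> dataByPath) {
            this.listedChildren = new ArrayList<>(listedChildren);
            this.dataByPath = new HashMap<>(dataByPath);
        }

        @Override
        public List<String> getChildren(String parentPath) {
            return new ArrayList<>(listedChildren);
        }

        @Override
        public String readData(String path) {
            String data = dataByPath.get(path);
            if (data == null) {
                throw new NoNodeException(path);
            }
            return data;
        }
    }

    /**
     * Step A: list the processor nodes. Step B: read each one.
     * A node that vanished between the two steps is skipped instead of
     * letting the exception propagate to the caller.
     */
    static List<String> readLiveProcessorData(NodeStore store, String processorsPath) {
        List<String> live = new ArrayList<>();
        for (String child : store.getChildren(processorsPath)) {
            String path = processorsPath + "/" + child;
            try {
                live.add(store.readData(path));
            } catch (NoNodeException e) {
                // The ephemeral node was deleted after it was listed; treat it as inactive.
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("/processors/0000000002", "host-b");
        // 0000000001 is listed but has no data, simulating a processor that just left.
        NodeStore store = new InMemoryStore(Arrays.asList("0000000001", "0000000002"), data);
        System.out.println(readLiveProcessorData(store, "/processors"));
    }
}
```

Skipping the missing node seems reasonable here because an ephemeral node
that no longer exists corresponds to a processor that has already left the
group, so the leader can continue with the remaining processors.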



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
