[ 
https://issues.apache.org/jira/browse/SAMZA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389275#comment-16389275
 ] 

ASF GitHub Bot commented on SAMZA-1607:
---------------------------------------

GitHub user shanthoosh opened a pull request:

    https://github.com/apache/samza/pull/437

    SAMZA-1607: Handle ZkNodeNotExistsException in zkUtils.readProcessorData

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shanthoosh/samza fix_zkutils_get_processor_data

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/437.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #437
    
----
commit 2c6d5f9cee4d833d8f63823ee4078f41a726203f
Author: Shanthoosh Venkataraman <svenkataraman@...>
Date:   2018-02-12T23:25:36Z

    SAMZA-1607: Handle ZkNodeNotExists exception in zkUtils.readProcessorData().

----


> Handle ZkNodeNotExists exception in zkUtils.readProcessorData
> -------------------------------------------------------------
>
>                 Key: SAMZA-1607
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1607
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Shanthoosh Venkataraman
>            Assignee: Shanthoosh Venkataraman
>            Priority: Major
>
> The existing implementation reads the data of ephemeral processor nodes in 
> ZooKeeper in two steps:
>    A. Fetch the list of ephemeral processor nodes.
>    B. Read the data of each processor node in the list.
> An ephemeral ZooKeeper node that is present in step A might no longer exist 
> by step B (for example, if the owning processor's session ends in between). 
> The resulting exception is currently unhandled and can kill the leader 
> processor unnecessarily. Here's the related exception observed in a dev setup.
> {code:java}
> org.apache.samza.SamzaException: Cannot read ZK node: /app-test-app-name-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-test-app-id-9fba7675-36e3-4a6e-8934-4cad6a8ebab0/test-app-name-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-test-app-id-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-coordinationData/processors/0000000001
> at org.apache.samza.zk.ZkUtils.readProcessorData(ZkUtils.java:232)
> at org.apache.samza.zk.ZkUtils.getActiveProcessorsIDs(ZkUtils.java:255)
> at org.apache.samza.zk.ZkJobCoordinator.getActualProcessorIds(ZkJobCoordinator.java:292)
> at org.apache.samza.zk.ZkJobCoordinator.doOnProcessorChange(ZkJobCoordinator.java:194)
> at org.apache.samza.zk.ZkJobCoordinator.lambda$onProcessorChange$1(ZkJobCoordinator.java:188)
> at org.apache.samza.zk.ScheduleAfterDebounceTime.lambda$getScheduleableAction$0(ScheduleAfterDebounceTime.java:134)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /app-test-app-name-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-test-app-id-9fba7675-36e3-4a6e-8934-4cad6a8ebab0/test-app-name-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-test-app-id-9fba7675-36e3-4a6e-8934-4cad6a8ebab0-coordinationData/processors/0000000001
> at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1001)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1100)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1095)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1084)
> at org.apache.samza.zk.ZkUtils.readProcessorData(ZkUtils.java:226)
> {code}
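
The list-then-read race described in the issue can be illustrated with a small, self-contained sketch. This is not the code from pull request #437, whose contents are not shown in this message. It assumes one plausible remedy: treat a node that vanished between step A and step B as an inactive processor and skip it. The names ProcessorDataReadSketch, NodeStore, NoNodeException, racyStore and readActiveProcessorData are all hypothetical stand-ins, and the in-memory map only simulates ZooKeeper.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a race-tolerant read; not the actual Samza ZkUtils code. */
final class ProcessorDataReadSketch {

    /** Stand-in for a "node does not exist" error such as ZkNoNodeException. */
    static final class NoNodeException extends RuntimeException {
        NoNodeException(String path) {
            super("NoNode for " + path);
        }
    }

    /** Minimal stand-in for the two client calls used by steps A and B. */
    interface NodeStore {
        List<String> getChildren(String parentPath);

        /** Throws NoNodeException if the node no longer exists. */
        String readData(String path);
    }

    /**
     * Step A lists the processor nodes; step B reads each one. A node that
     * disappears between the two steps is skipped instead of failing the call.
     */
    static List<String> readActiveProcessorData(NodeStore store, String processorsPath) {
        List<String> data = new ArrayList<>();
        for (String child : store.getChildren(processorsPath)) { // step A
            String path = processorsPath + "/" + child;
            try {
                data.add(store.readData(path)); // step B
            } catch (NoNodeException e) {
                // The ephemeral node is gone, so its processor is no longer active.
            }
        }
        return data;
    }

    /** In-memory store that deletes one node right after listing, to simulate the race. */
    static NodeStore racyStore(Map<String, String> nodes, String vanishingPath) {
        return new NodeStore() {
            @Override
            public List<String> getChildren(String parentPath) {
                List<String> children = new ArrayList<>();
                String prefix = parentPath + "/";
                // Copy into a TreeMap so children come back in a deterministic order.
                for (String path : new TreeMap<>(nodes).keySet()) {
                    if (path.startsWith(prefix)) {
                        children.add(path.substring(prefix.length()));
                    }
                }
                nodes.remove(vanishingPath); // node disappears after the listing
                return children;
            }

            @Override
            public String readData(String path) {
                String value = nodes.get(path);
                if (value == null) {
                    throw new NoNodeException(path);
                }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, String> nodes = new HashMap<>();
        nodes.put("/processors/0000000000", "host-a");
        nodes.put("/processors/0000000001", "host-b");
        NodeStore store = racyStore(nodes, "/processors/0000000001");
        System.out.println(readActiveProcessorData(store, "/processors")); // prints [host-a]
    }
}
```

Skipping the missing node matches the semantics of ephemeral nodes: once the node is gone, the processor it represented should not be counted as active. Whether skipping, returning null, or retrying the listing is the right choice for Samza is a design decision that this sketch does not settle.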



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
