[ https://issues.apache.org/jira/browse/GIRAPH-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886989#comment-13886989 ]
Eli Reisman commented on GIRAPH-747:
------------------------------------
Hey, reviewing this. I recall this issue; I thought I was shimming this number
somewhere else? The reason is that BspServiceMaster is also used by non-YARN
builds, and I didn't want to break or alter the shared code.
Could another non-YARN Giraph committer take a look and see if this change is
safe? If so, we should def commit this. If not, maybe another (ugh) munge flag
here will suffice?
> BspServiceMaster finishes ZooKeeper cleanup without waiting for all workers
> to complete
> ---------------------------------------------------------------------------------------
>
> Key: GIRAPH-747
> URL: https://issues.apache.org/jira/browse/GIRAPH-747
> Project: Giraph
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Chuan Lei
> Assignee: Chuan Lei
> Fix For: 1.0.0
>
> Attachments: GIRAPH-747.v1.patch
>
>
> In BspServiceMaster, the function cleanUpZooKeeper should wait for all workers
> and masters to complete. However, it appears that maxTasks only takes workers
> into consideration. Consequently, a straggling worker may fail to report to
> ZooKeeper because the path is removed too early. This causes a "No lease on
> path File does not exist" exception at runtime.
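To make the counting problem in the description concrete, here is a minimal sketch of the check being discussed. It is not the actual BspServiceMaster code: the class name, method names, and the idea of counting "done" markers are assumptions for illustration only. It just shows why a threshold that counts workers alone lets cleanup start one marker too early.

```java
// Hypothetical sketch, not Giraph's real implementation.
final class CleanupBarrierSketch {

    private CleanupBarrierSketch() { }

    /**
     * Number of "done" markers to wait for before removing the job's
     * ZooKeeper path: every worker plus every master task (assumption
     * taken from the issue description).
     */
    static int expectedCleanupMarkers(int workerCount, int masterCount) {
        return workerCount + masterCount;
    }

    /** True only once every worker and master has reported completion. */
    static boolean safeToCleanUp(int doneMarkers, int workerCount, int masterCount) {
        return doneMarkers >= expectedCleanupMarkers(workerCount, masterCount);
    }
}
```

With 3 workers and 1 master, a threshold of 3 would be satisfied while one task is still running; with the sum, safeToCleanUp(3, 3, 1) is false and safeToCleanUp(4, 3, 1) is true. Whether the sum is also correct for the non-YARN code path is exactly the open question in the comment above.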