[ https://issues.apache.org/jira/browse/MAPREDUCE-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745421#comment-13745421 ]
Siddharth Seth commented on MAPREDUCE-5384:
-------------------------------------------

Karthik, apologies, I haven't been able to get to this earlier.

With the latest patch, I'm not sure which race is being fixed. It looks like a cancel while a RenewalTimer is running will still lead to an additional renewal being scheduled for the same token, which is the same behaviour as without the patch. I have some concerns about synchronization / thread safety in the patch as well, but I'm leaving those out for now.

Instead of the relatively large changes in the patch, I think it'll be a lot simpler to just associate a 'cancel'/'intentToCancel' flag with the token to prevent renewal attempts after a cancel is called. The in-process renew can check this flag until the last moment before invoking the actual renew, and subsequent renewals will not attempt a renew (maybe not even schedule one). The changes to the unit test will likely still be required, to allow the one extra in-flight renew.

Do you know if this problem exists in the 2.x renewer as well?

> Races in DelegationTokenRenewal
> -------------------------------
>
>                 Key: MAPREDUCE-5384
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5384
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>   Affects Versions: 1.2.0, 1.1.2, 1.2.1
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: mr-5384-0.patch, mr-5384-1.patch, mr-5384-2.patch
>
>
> There are a couple of races in DelegationTokenRenewal.
> One of them was addressed by MAPREDUCE-4860, which introduced a deadlock
> while fixing this race. Opening a new JIRA per discussion in MAPREDUCE-5364,
> since MAPREDUCE-4860 is already shipped in a release.
> Races to fix:
> # TimerTask#cancel() disallows future invocations of run(), but doesn't abort
> an already scheduled/started run().
> # In the context of DelegationTokenRenewal, RenewalTimerTask#cancel() only
> cancels that TimerTask instance. However, it has no effect on any other
> TimerTasks created for that token.
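
For illustration, the 'intentToCancel' flag suggested in the comment above might look roughly like the sketch below. This is not the code from any of the attached patches, nor the actual DelegationTokenRenewal implementation: TokenToRenew, RenewalTask, requestCancel() and the renew() placeholder are hypothetical names, and only java.util.Timer / java.util.TimerTask are real JDK classes.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a token tracked by the renewer (not Hadoop's class). */
class TokenToRenew {
    // The 'intentToCancel' flag: set once by cancel, read by every renewal task.
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);
    // Counts renew calls so the behaviour can be observed in this sketch.
    final AtomicInteger renewCount = new AtomicInteger();

    void requestCancel() { cancelRequested.set(true); }

    boolean isCancelRequested() { return cancelRequested.get(); }

    // Placeholder for the real renew call against the token's service.
    void renew() { renewCount.incrementAndGet(); }
}

/** Hypothetical renewal task that consults the flag instead of relying on TimerTask#cancel(). */
class RenewalTask extends TimerTask {
    private final TokenToRenew token;
    private final Timer timer;
    private final long intervalMs;

    RenewalTask(TokenToRenew token, Timer timer, long intervalMs) {
        this.token = token;
        this.timer = timer;
        this.intervalMs = intervalMs;
    }

    @Override
    public void run() {
        // Check as late as possible before the actual renew.
        if (token.isCancelRequested()) {
            return;
        }
        token.renew();
        // Do not schedule a follow-up renewal once a cancel has been requested.
        if (!token.isCancelRequested()) {
            timer.schedule(new RenewalTask(token, timer, intervalMs), intervalMs);
        }
    }
}
```

Because the flag lives on the token rather than on a single TimerTask, it applies to every task created for that token, which is the gap described in the second race. A cancel that arrives between the flag check and renew() can still let one renew through; that is the "one extra in-flight renew" the unit test would need to tolerate.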
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira