[
https://issues.apache.org/jira/browse/MAPREDUCE-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745421#comment-13745421
]
Siddharth Seth commented on MAPREDUCE-5384:
-------------------------------------------
Karthik, apologies for not getting to this earlier.
With the latest patch, I'm not sure which race is being fixed. It looks like a
cancel issued while a RenewalTimer is running will still lead to an additional
renewal being scheduled for the same token, which is the same behaviour as
without the patch. I also have some concerns about synchronization / thread
safety in the patch, but I'm leaving those out for now.
Instead of the relatively large changes in the patch, I think it'll be a lot
simpler to just associate a 'cancel'/'intentToCancel' flag with the token to
prevent renewal attempts after a cancel is called. The in-flight renew can
check the flag right up to the last moment before invoking the actual renew,
and subsequent renewals will not attempt a renew (and maybe will not even be
scheduled).
The changes to the unit test will likely still be required, to allow for the
one extra in-flight renew.
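
A minimal sketch of the flag idea, not taken from the patch or from the
Hadoop source: the class TrackedToken, its fields, and the method
renewIfNotCancelled() are all hypothetical names, and the counters stand in
for the real renew and reschedule calls.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class CancelFlagSketch {

    /** Hypothetical per-token bookkeeping; NOT the actual Hadoop class. */
    static class TrackedToken {
        final AtomicBoolean cancelled = new AtomicBoolean(false);
        final AtomicInteger renewCount = new AtomicInteger();     // stands in for real renew calls
        final AtomicInteger scheduledCount = new AtomicInteger(); // stands in for rescheduling

        /** Records the intent to cancel; later renewal attempts become no-ops. */
        void cancel() {
            cancelled.set(true);
        }

        /** Body of a renewal task. Returns true if a renew was actually issued. */
        boolean renewIfNotCancelled() {
            // Check the flag at the last moment before the actual renew.
            if (cancelled.get()) {
                return false;
            }
            renewCount.incrementAndGet(); // placeholder for the real renew
            // Do not schedule a follow-up renewal if a cancel arrived meanwhile.
            if (!cancelled.get()) {
                scheduledCount.incrementAndGet();
            }
            return true;
        }
    }

    public static void main(String[] args) {
        TrackedToken token = new TrackedToken();
        System.out.println("renewed before cancel: " + token.renewIfNotCancelled());
        token.cancel();
        System.out.println("renewed after cancel: " + token.renewIfNotCancelled());
    }
}
```

A cancel that lands between the flag check and the renew call still lets one
renew through, which is the extra in-flight renew the unit test would have to
tolerate.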
Do you know if this problem exists in the 2.x renewer as well?
> Races in DelegationTokenRenewal
> -------------------------------
>
> Key: MAPREDUCE-5384
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5384
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 1.2.0, 1.1.2, 1.2.1
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Attachments: mr-5384-0.patch, mr-5384-1.patch, mr-5384-2.patch
>
>
> There are a couple of races in DelegationTokenRenewal.
> One of them was addressed by MAPREDUCE-4860, which introduced a deadlock
> while fixing this race. Opening a new JIRA per discussion in MAPREDUCE-5364,
> since MAPREDUCE-4860 is already shipped in a release.
> Races to fix:
> # TimerTask#cancel() disallows future invocations of run(), but doesn't abort
> an already scheduled/started run().
> # In the context of DelegationTokenRenewal, RenewalTimerTask#cancel() only
> cancels that TimerTask instance. However, it has no effect on any other
> TimerTasks created for that token.
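
Both races listed in the description can be seen with plain java.util.Timer,
independent of Hadoop. The sketch below is only an illustration: the class
name TimerCancelSketch and the helper demo() are made up here, and the latches
exist solely to make the interleaving deterministic.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TimerCancelSketch {

    /** Returns {runs completed by the cancelled task, cancel() result (1 = true), runs of the sibling task}. */
    static int[] demo() {
        Timer timer = new Timer(true);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(2);
        AtomicInteger firstRuns = new AtomicInteger();
        AtomicInteger siblingRuns = new AtomicInteger();

        TimerTask first = new TimerTask() {
            @Override
            public void run() {
                started.countDown(); // run() has begun
                try {
                    release.await(5, TimeUnit.SECONDS); // hold here while cancel() is called
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                firstRuns.incrementAndGet(); // still executes after cancel()
                done.countDown();
            }
        };
        TimerTask sibling = new TimerTask() {
            @Override
            public void run() {
                siblingRuns.incrementAndGet();
                done.countDown();
            }
        };

        try {
            timer.schedule(first, 0);
            started.await(5, TimeUnit.SECONDS);
            // Race 1: cancel() does not abort a run() that is already in progress.
            boolean cancelled = first.cancel();
            // Race 2: cancelling one TimerTask has no effect on another instance.
            timer.schedule(sibling, 0);
            release.countDown();
            done.await(5, TimeUnit.SECONDS);
            return new int[] { firstRuns.get(), cancelled ? 1 : 0, siblingRuns.get() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            timer.cancel();
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("first task runs=" + r[0]
            + ", cancel() returned " + (r[1] == 1)
            + ", sibling runs=" + r[2]);
    }
}
```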