[
https://issues.apache.org/jira/browse/MAPREDUCE-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777472#action_12777472
]
Hemanth Yamijala commented on MAPREDUCE-1140:
---------------------------------------------
bq. This is done, because getLocalCache increments referenceCount first and
then localizes. Reference count should be decremented for the one just failed
also. So, it should be added to the list before the getLocalCache call.
Umm. But (at least theoretically) it is still possible for a call to
getLocalCache to fail before referenceCount is incremented. For example,
makeRelative throws IOException, and so does getLocalCacheForWrite. Hence, we
still have a situation where we record a file as being localized (by storing it
in localizedCacheFiles) even though its reference count was never incremented.
releaseCache would then still have the bug this JIRA is about.
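To make the failure mode concrete, here is a minimal toy model of it. This is not the TaskTracker code: the class, the string keys, and the failEarly switch are all hypothetical stand-ins. Only the ordering is taken from the discussion above (the file is added to localizedCacheFiles first, and getLocalCache can throw before it increments the count).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NegativeRefCountSketch {
    // Hypothetical per-file reference counts, keyed by cache-file name.
    private final Map<String, Integer> refCounts = new HashMap<>();
    // Mirrors the list discussed above: files recorded as localized.
    private final List<String> localizedCacheFiles = new ArrayList<>();

    // Toy getLocalCache: may fail BEFORE the reference count is incremented,
    // the way a call such as makeRelative could throw IOException.
    void getLocalCache(String file, boolean failEarly) throws IOException {
        if (failEarly) {
            throw new IOException("failed before refcount increment");
        }
        refCounts.merge(file, 1, Integer::sum);
    }

    // The file is recorded before getLocalCache is called.
    void setup(String file, boolean failEarly) throws IOException {
        localizedCacheFiles.add(file);
        getLocalCache(file, failEarly);
    }

    // Release decrements the count of every recorded file.
    void releaseAll() {
        for (String file : localizedCacheFiles) {
            refCounts.merge(file, -1, Integer::sum);
        }
        localizedCacheFiles.clear();
    }

    int refCount(String file) {
        return refCounts.getOrDefault(file, 0);
    }

    public static void main(String[] args) {
        NegativeRefCountSketch s = new NegativeRefCountSketch();
        try {
            s.setup("a.jar", true);
        } catch (IOException e) {
            // Localization failed, but "a.jar" is already in the list.
        }
        s.releaseAll();
        System.out.println("refcount after release: " + s.refCount("a.jar")); // prints -1
    }
}
```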
One more point I am slightly uncomfortable about is the duplication of state
because of the new list localizedCacheFiles.
Here's an alternate proposal:
- Add a boolean field isLocalized to CacheFile. It is false by default and is
set to true only if distributedCacheManager.getLocalCache returns
successfully.
- To handle the case you mentioned above, where a failure can happen after
referenceCount is incremented in getLocalCache, I would suggest we catch
exceptions inside getLocalCache and, on an exception, decrement the
referenceCount and re-throw the exception. This seems right to me: if
getLocalCache doesn't complete, shouldn't we stay consistent by decrementing
the reference count?
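A rough sketch of the two points above, just to show the shape of the change. The names CacheStatus, Localizer, setup and release are hypothetical simplifications and not the actual signatures in the distributed cache code; only isLocalized, getLocalCache and the reference count come from the proposal.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class LocalizeSketch {
    // Hypothetical stand-in for the per-file status that holds the refcount.
    static class CacheStatus {
        final AtomicInteger refcount = new AtomicInteger(0);
    }

    // Hypothetical stand-in for CacheFile, with the proposed flag.
    static class CacheFile {
        final CacheStatus status = new CacheStatus();
        boolean isLocalized = false; // false by default
    }

    // Hypothetical localization step that may fail.
    interface Localizer {
        void localize() throws IOException;
    }

    // Increment first, then localize. On any failure, undo the increment
    // and re-throw, so a failed call leaves the refcount unchanged.
    static void getLocalCache(CacheStatus status, Localizer localizer) throws IOException {
        status.refcount.incrementAndGet();
        try {
            localizer.localize();
        } catch (IOException | RuntimeException e) {
            status.refcount.decrementAndGet();
            throw e;
        }
    }

    // The flag is set only when getLocalCache returns successfully.
    static void setup(CacheFile file, Localizer localizer) throws IOException {
        getLocalCache(file.status, localizer);
        file.isLocalized = true;
    }

    // Release decrements only for files that were actually localized.
    static void release(CacheFile file) {
        if (file.isLocalized) {
            file.status.refcount.decrementAndGet();
            file.isLocalized = false;
        }
    }

    public static void main(String[] args) throws IOException {
        CacheFile good = new CacheFile();
        setup(good, () -> { });
        release(good);
        System.out.println("good refcount: " + good.status.refcount.get()); // prints 0

        CacheFile bad = new CacheFile();
        try {
            setup(bad, () -> { throw new IOException("localization failed"); });
        } catch (IOException e) {
            // Expected: the increment was rolled back inside getLocalCache.
        }
        release(bad); // no-op, since isLocalized is still false
        System.out.println("bad refcount: " + bad.status.refcount.get()); // prints 0
    }
}
```

With this shape there is no separate list to keep in sync: the flag lives on the CacheFile itself, and the count cannot go negative on release because a failed localization never leaves it incremented.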
Would this work?
> Per cache-file refcount can become negative when tasks release
> distributed-cache files
> --------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1140
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1140
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker
> Affects Versions: 0.20.2, 0.21.0, 0.22.0
> Reporter: Vinod K V
> Assignee: Amareshwari Sriramadasu
> Attachments: patch-1140-1.txt, patch-1140-ydist.txt, patch-1140.txt
>
>