[
https://issues.apache.org/jira/browse/MAPREDUCE-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768221#action_12768221
]
Iyappan Srinivasan commented on MAPREDUCE-1098:
-----------------------------------------------
+1 from QA for patch-1098-0.20.txt
1) Brought up a cluster and made sure that the file uploaded is around 2 GB
(using the -files option). Submitted two jobs which access these files.
Before the patch, the second job's file upload started only after the first
job had finished uploading its file, as clearly seen from the logs. After the
patch, both uploads start independently, as seen from the logs.
2) Ran sleep jobs and also streaming jobs to test this behaviour.
3) Ran on a one-slave cluster and made sure that two jobs can access the same
file or different files using -files and -cacheFile. In all cases it went fine.
After the patch, when different files are given with the -files option,
uploading happens independently. When the same files are provided with the
-files option, it still happens independently, because the JT places them in
different directories for each job, as seen from the conf file of the job.
With -cacheFile and only one TT, the file is localized by the first job, and
the second job just accesses this localized file as soon as the lock on that
file is released.
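
The per-file locking behaviour observed above can be sketched roughly as
follows. This is only an illustration of the idea, not the code from the
patch: the class PerEntryCacheLock, the nested CacheStatus type, and the
methods localize and copyCount are hypothetical names, and the slow copy is
replaced by a counter increment. The point is that each cache entry has its
own lock, so two jobs asking for different files do not wait on each other,
while two jobs asking for the same file result in a single copy.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one lock per cache entry instead of one global lock.
class PerEntryCacheLock {
    // Hypothetical per-file status; its monitor serves as the per-file lock.
    static final class CacheStatus {
        boolean localized = false;
    }

    private final ConcurrentHashMap<String, CacheStatus> entries =
            new ConcurrentHashMap<>();
    private final AtomicInteger copies = new AtomicInteger();

    // Returns true if this call did the copy, false if it reused an earlier one.
    boolean localize(String cacheKey) {
        // Atomically get or create the status object for this key.
        CacheStatus status =
                entries.computeIfAbsent(cacheKey, k -> new CacheStatus());
        // Only callers asking for the same key contend on this lock.
        synchronized (status) {
            if (status.localized) {
                return false; // already localized by an earlier caller
            }
            copies.incrementAndGet(); // stand-in for the slow file copy
            status.localized = true;
            return true;
        }
    }

    int copyCount() {
        return copies.get();
    }

    public static void main(String[] args) throws InterruptedException {
        PerEntryCacheLock cache = new PerEntryCacheLock();
        // Two "jobs" want the same file, a third wants a different one.
        Thread job1 = new Thread(() -> cache.localize("shared.dat"));
        Thread job2 = new Thread(() -> cache.localize("shared.dat"));
        Thread job3 = new Thread(() -> cache.localize("other.dat"));
        job1.start();
        job2.start();
        job3.start();
        job1.join();
        job2.join();
        job3.join();
        System.out.println("copies performed: " + cache.copyCount());
    }
}
```

With a single lock shared by all entries, job3 in this sketch would have to
wait for the copy of shared.dat even though it needs a different file.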
> Incorrect synchronization in DistributedCache causes TaskTrackers to freeze
> up during localization of Cache for tasks.
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1098
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1098
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker
> Reporter: Sreekanth Ramakrishnan
> Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-1098-0.20.txt, patch-1098-1.txt, patch-1098-2.txt,
> patch-1098.txt
>
>
> Currently {{org.apache.hadoop.filecache.DistributedCache.getLocalCache(URI,
> Configuration, Path, FileStatus, boolean, long, Path, boolean)}} allows only
> one {{TaskRunner}} thread in TT to localize {{DistributedCache}} across jobs.
> The current synchronization is across baseDir; this has to be changed to
> lock on the same baseDir.