[
https://issues.apache.org/jira/browse/MAPREDUCE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karthik Kambatla updated MAPREDUCE-4595:
----------------------------------------
Description:
The occasional failures of TestLostTracker appear to be caused by the following race:
On job completion, JobHistoryFilesManager#run() spawns another thread to move
the history files to the done folder. TestLostTracker waits for job completion
before checking the file format of the history file. However, at that point the
move might still be in progress, or might not have started at all.
The attachment (force-TestLostTracker-failure.patch) helps reproduce the
error locally by increasing the chance of hitting this race.
was:
The occasional failures of TestLostTracker appear to be caused by the following race:
On job completion, JobHistoryFilesManager#run() spawns another thread to move
the history files to the done folder. TestLostTracker waits for job completion
before checking the file format of the history file. However, at that point the
move might still be in progress, or might not have started at all.
I am uploading a patch that significantly increases the chance of hitting this
race.
> TestLostTracker failing - possibly due to a race in
> JobHistory.JobHistoryFilesManager#run()
> -------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4595
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4595
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 1.0.3
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Critical
> Labels: test
> Attachments: force-TestLostTracker-failure.patch, MR-4595.patch
>
>
> The occasional failures of TestLostTracker appear to be caused by the following race:
> On job completion, JobHistoryFilesManager#run() spawns another thread to move
> the history files to the done folder. TestLostTracker waits for job completion
> before checking the file format of the history file. However, at that point
> the move might still be in progress, or might not have started at all.
> The attachment (force-TestLostTracker-failure.patch) helps reproduce the
> error locally by increasing the chance of hitting this race.