RaidNode should submit one job per Raid policy
----------------------------------------------
Key: MAPREDUCE-1819
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1819
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: contrib/raid
Affects Versions: 0.20.1
Reporter: Ramkumar Vadali
The RaidNode currently computes parity files as follows:
1. Use RaidNode.selectFiles() to determine which files to raid for a policy.
2. Repeat #1 for each configured policy, accumulating a single list of
files.
3. Submit one MapReduce job for the list of files from #2 using
DistRaid.doDistRaid().
This task addresses the fact that #2 and #3 happen sequentially: no job is
submitted until files have been selected for every policy. The proposal is to
submit a separate MapReduce job for each policy's list of files and to use
another thread to track the progress of the submitted jobs. This should
reduce the time taken for files to be raided.
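
A minimal sketch of the proposed flow, to make the threading concrete. It
is not the RaidNode implementation: SubmittedJob, FakeJob, JobMonitor and
raidAllPolicies() are hypothetical names, the map of policy name to file
list stands in for the output of RaidNode.selectFiles(), and FakeJob stands
in for a job submitted through DistRaid.doDistRaid().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

class PerPolicyRaidSketch {

    /** Hypothetical handle for a submitted job (not a Hadoop type). */
    interface SubmittedJob {
        String policyName();
        boolean checkComplete();
    }

    /** Simulated job: reports completion after a fixed number of polls. */
    static final class FakeJob implements SubmittedJob {
        private final String policy;
        private int pollsLeft;
        FakeJob(String policy, int pollsLeft) {
            this.policy = policy;
            this.pollsLeft = pollsLeft;
        }
        public String policyName() { return policy; }
        public synchronized boolean checkComplete() { return --pollsLeft <= 0; }
    }

    /** Polls submitted jobs on its own thread so submission never blocks. */
    static final class JobMonitor implements Runnable {
        private final List<SubmittedJob> running = new CopyOnWriteArrayList<>();
        private final List<String> finished = new CopyOnWriteArrayList<>();
        private volatile boolean stop = false;

        void track(SubmittedJob job) { running.add(job); }
        void shutdown() { stop = true; }
        List<String> finishedPolicies() { return new ArrayList<>(finished); }

        @Override
        public void run() {
            // Keep polling until shutdown is requested and no job is pending.
            while (!stop || !running.isEmpty()) {
                for (SubmittedJob job : running) { // iterates over a snapshot
                    if (job.checkComplete()) {
                        running.remove(job);
                        finished.add(job.policyName());
                    }
                }
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    /** Submits one job per policy; returns the policies whose jobs finished. */
    static List<String> raidAllPolicies(Map<String, List<String>> filesByPolicy) {
        JobMonitor monitor = new JobMonitor();
        Thread monitorThread = new Thread(monitor, "raid-job-monitor");
        monitorThread.start();
        for (Map.Entry<String, List<String>> e : filesByPolicy.entrySet()) {
            if (e.getValue().isEmpty()) {
                continue; // nothing to raid for this policy, so no job
            }
            // Assumption: a real version would submit a MapReduce job here.
            monitor.track(new FakeJob(e.getKey(), e.getValue().size()));
        }
        monitor.shutdown();
        try {
            monitorThread.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
        return monitor.finishedPolicies();
    }

    public static void main(String[] args) {
        Map<String, List<String>> filesByPolicy = new LinkedHashMap<>();
        filesByPolicy.put("policyA", Arrays.asList("/a/1", "/a/2"));
        filesByPolicy.put("policyB", Arrays.asList("/b/1"));
        System.out.println("finished: " + raidAllPolicies(filesByPolicy));
    }
}
```

In this sketch the job for a policy is handed to the monitor as soon as
that policy's file list is available, so a policy with few files does not
wait on file selection for the others. The order in which policies finish
depends on timing and is not guaranteed.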