Task cwds should be distributed across partitions
-------------------------------------------------
Key: HADOOP-2115
URL: https://issues.apache.org/jira/browse/HADOOP-2115
Project: Hadoop
Issue Type: Improvement
Components: mapred
Affects Versions: 0.14.3
Environment: All
Reporter: Milind Bhandarkar
Fix For: 0.16.0
Even when mapred.local.dir specifies a comma-separated list of partitions
(typically one per physical disk), all tasks of the same job have current
working directories that belong to only one partition. For side-effect tasks
that use the local cwd as scratch space, this overloads a single disk while the
other disks may be idle. Ideally, each task should get a cwd on a different
partition. This is related to HADOOP-1991, but emphasizes the performance impact.
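
One way to picture the requested behavior is a round-robin choice over the
entries of mapred.local.dir. The sketch below is only an illustration under
stated assumptions, not the actual TaskTracker code or the eventual fix: the
class name CwdPartitionSketch, the method names, and the
<partition>/<jobId>/task_<n>/work layout are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: spread task working directories over the
// partitions listed in a mapred.local.dir-style value.
public class CwdPartitionSketch {

    /** Splits a comma-separated directory list into trimmed, non-empty entries. */
    static List<String> parseLocalDirs(String localDirs) {
        List<String> dirs = new ArrayList<>();
        for (String entry : localDirs.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(trimmed);
            }
        }
        return dirs;
    }

    /**
     * Picks a partition round-robin by task index and builds a cwd under it.
     * The path layout here is an assumption made for illustration only.
     */
    static String cwdFor(String localDirs, String jobId, int taskIndex) {
        List<String> dirs = parseLocalDirs(localDirs);
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("no local directories configured");
        }
        // floorMod keeps the index in range even for a negative task index.
        String partition = dirs.get(Math.floorMod(taskIndex, dirs.size()));
        return partition + "/" + jobId + "/task_" + taskIndex + "/work";
    }

    public static void main(String[] args) {
        String localDirs = "/disk1/mapred/local,/disk2/mapred/local,/disk3/mapred/local";
        for (int i = 0; i < 4; i++) {
            System.out.println(cwdFor(localDirs, "job_0001", i));
        }
    }
}
```

With three partitions configured, tasks 0, 1 and 2 land on three different
disks and task 3 wraps around to the first. A real implementation might
instead weigh free space or current load per partition; round-robin is just
the simplest policy that avoids piling every cwd onto one disk.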