[ 
https://issues.apache.org/jira/browse/MAPREDUCE-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738196#action_12738196
 ] 

Devaraj Das commented on MAPREDUCE-801:
---------------------------------------

I like the idea of truncating the list of locations to some fixed number, like 
5, and ignoring the others. It's a simple fix in the framework to limit the 
number of locations to read per split. If the split generation code is buggy 
w.r.t. generating the locations for the splits, then we can't do much anyway. 
The location information is only used for creating the cache in the JobTracker 
for doing optimal task assignments.
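
A minimal sketch of the truncation, assuming the locations arrive as a plain 
String[] of host names. The class name, method name, and the constant are 
hypothetical and not part of the existing framework; the value 5 is just the 
number floated above.

```java
import java.util.Arrays;

// Hypothetical helper: keeps only the first few locations reported for a split.
final class SplitLocationLimiter {
    // Illustrative cap; "some fixed number like 5" per the discussion.
    static final int MAX_LOCATIONS = 5;

    // Returns at most 'max' locations, preserving the order given by the split.
    static String[] truncate(String[] locations, int max) {
        if (max < 0) {
            throw new IllegalArgumentException("max must be non-negative");
        }
        if (locations == null) {
            return new String[0];
        }
        if (locations.length <= max) {
            return locations;
        }
        // Extra locations are simply ignored.
        return Arrays.copyOf(locations, max);
    }
}
```

The first entries are kept on the assumption that a well-behaved input format 
lists its best locations first; a buggy one gets no worse treatment than today.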

The other thing is the split bytes (the raw bytes corresponding to the 
serialized split object). If the split data is too large and there are many 
splits, then the JT again becomes vulnerable. The JT reads the split bytes and 
stores them in memory, so that they can be sent as part of the task object to 
the tasktracker chosen to run the task. There are multiple approaches to solve 
the problem:

1) Limit the size of the split file 

2) Back the splits on disk. The idea here is to create an index while the 
JobTracker is reading the split file. The splits are read one by one and their 
offsets in the file are stored in the index. The split data is discarded; the 
location information is retained (possibly after truncation) and is used to 
create the cache as is already done. The index is kept in memory. When a map 
task is to be handed out to a TT, the JobTracker reads the split data by 
looking up the index and seeking into the split file (similar to the way we 
handle map outputs during shuffle).
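
A rough sketch of approach 2, under stated assumptions: the split file is 
taken to be a local file holding a sequence of records, each a 4-byte length 
followed by the serialized split bytes. That layout, the SplitIndex class, and 
its methods are hypothetical and only illustrate the index-and-seek idea; the 
real split file format differs, and location handling is left out.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory index over a split file kept on disk.
final class SplitIndex {
    private final String splitFile;
    // One entry per split: {offset of the split bytes, length of the split bytes}.
    private final List<long[]> entries = new ArrayList<>();

    private SplitIndex(String splitFile) {
        this.splitFile = splitFile;
    }

    // Scans the file once, recording offsets and skipping the split data itself.
    static SplitIndex build(String splitFile) throws IOException {
        SplitIndex index = new SplitIndex(splitFile);
        try (RandomAccessFile in = new RandomAccessFile(splitFile, "r")) {
            long fileLength = in.length();
            while (in.getFilePointer() < fileLength) {
                int length = in.readInt();
                long offset = in.getFilePointer();
                if (length < 0 || offset + length > fileLength) {
                    throw new IOException("Corrupt split record at offset " + (offset - 4));
                }
                index.entries.add(new long[] {offset, length});
                in.seek(offset + length); // discard the payload, keep only the index
            }
        }
        return index;
    }

    int size() {
        return entries.size();
    }

    // Reads one split's bytes on demand, e.g. when its task is handed out.
    byte[] readSplit(int i) throws IOException {
        long[] entry = entries.get(i);
        byte[] data = new byte[(int) entry[1]];
        try (RandomAccessFile in = new RandomAccessFile(splitFile, "r")) {
            in.seek(entry[0]);
            in.readFully(data);
        }
        return data;
    }
}
```

With this shape the memory held per split is two longs rather than the whole 
serialized split, at the cost of one seek and read per task assignment.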

We could have a cap on the size of each individual split (instead of a cap on 
the total split size) so that we don't use up too much RPC bandwidth while 
transferring the split data to the tasktracker. The alternative here would be 
to have the JT just pass the index information to the TT, and have the TT read 
the split data from HDFS directly while localizing the task before the 
launch.
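
To make the per-split cap concrete, here is a small sketch that flags the 
splits whose serialized size exceeds a limit. The class, the method, and the 
idea of reporting indices are all assumptions for illustration; what the 
framework should do with an oversized split (fail the job, warn) is left open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical check for a per-split size cap.
final class SplitSizeCap {
    // Returns the indices of splits whose serialized length exceeds the cap.
    static List<Integer> oversized(long[] splitLengths, long maxBytesPerSplit) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < splitLengths.length; i++) {
            if (splitLengths[i] > maxBytesPerSplit) {
                result.add(i);
            }
        }
        return result;
    }
}
```

Note that this check says nothing about the total; a job with very many small 
splits would pass it, which is fine if the splits are backed on disk.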

Thoughts?

> MAPREDUCE framework should issue warning with too many locations for a split
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-801
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-801
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Hong Tang
>
> Customized input-format may be buggy and report misleading locations through 
> input-split, an example of which is PIG-878. When an input split returns too 
> many locations, it would not only artificially inflate the percentage of data 
> local or rack local maps, but also force scheduler to use more memory and 
> work harder to conduct task assignment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
