[jira] Commented: (HADOOP-3293) When an input split spans cross block boundary, the split location should be the host having most of bytes on it.

Runping Qi (JIRA) Thu, 30 Oct 2008 09:41:06 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644025#action_12644025
 ]


Runping Qi commented on HADOOP-3293:
------------------------------------

In the above case, I'd say that the preferred hosts for the split should be in 
the order A, D, B, E, C, F.
We should also aggregate the bytes over the racks of those hosts.
For example, suppose C, E and F share the same rack while the other nodes are 
each on a different rack.
Then host E (or F, or even C) will offer better rack locality than the other 
hosts.
In practice, rack locality is almost as good as node locality.
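
A minimal sketch of the aggregation described above, written in Python for
brevity rather than as Hadoop code. The byte counts, block placement and rack
names are hypothetical and are not the case discussed earlier in this issue;
only the rack layout (C, E and F on one rack, every other host on its own rack)
follows the example. The helper names bytes_per_host, bytes_per_rack and rank
are made up for illustration. Each block's bytes are counted once per rack that
holds at least one replica, which is one possible reading of "aggregate the
bytes over the racks".

```python
from collections import defaultdict

def bytes_per_host(blocks):
    """Sum, for each host, the split bytes it stores locally."""
    totals = defaultdict(int)
    for nbytes, hosts in blocks:
        for host in set(hosts):
            totals[host] += nbytes
    return dict(totals)

def bytes_per_rack(blocks, host_to_rack):
    """Sum, for each rack, the split bytes stored on any host in that rack.

    A block is counted once per rack even if the rack holds several replicas.
    """
    totals = defaultdict(int)
    for nbytes, hosts in blocks:
        for rack in {host_to_rack[h] for h in hosts}:
            totals[rack] += nbytes
    return dict(totals)

def rank(totals):
    """Order keys by descending byte count, breaking ties by name."""
    return sorted(totals, key=lambda k: (-totals[k], k))

# Hypothetical split covering parts of two blocks:
# (bytes of the split inside the block, hosts holding a replica of the block)
blocks = [
    (100, ["A", "B", "C"]),
    (60, ["D", "E", "F"]),
]

# Assumed topology: C, E and F share a rack; every other host is alone.
host_to_rack = {
    "A": "/r1", "B": "/r2", "C": "/r3",
    "D": "/r4", "E": "/r3", "F": "/r3",
}

host_totals = bytes_per_host(blocks)
rack_totals = bytes_per_rack(blocks, host_to_rack)

print(rank(host_totals))
print(rank(rack_totals))
```

With these made-up numbers no single host holds more than 100 of the 160 bytes,
but the rack shared by C, E and F covers all 160, which is why a host on that
rack can be a good choice even when it is not the top host by local bytes.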
 

> When an input split spans cross block boundary, the split location should be 
> the host having most of bytes on it. 
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3293
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3293
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>            Assignee: Jothi Padmanabhan
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
