[ https://issues.apache.org/jira/browse/TEZ-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated TEZ-3207:
-------------------------
    Attachment: TEZ-3207.patch

Thanks [~sseth] for the good point. Here is a draft patch. As part of the
fix, it keeps the partition id separate from the host and port, as suggested.
The patch also includes some minor cleanup along the way. There appears to be
more refactoring we could do to combine the shuffle and unordered components,
such as MapHost and InputHost. I will leave that to a separate JIRA.
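
To illustrate the idea of keeping the partition id out of the host and port
key, here is a minimal, self-contained sketch. It is not the code in the
attached patch: InputHostSketch, register, inputsFor and partitionCount are
hypothetical names, and source attempts are plain strings rather than
InputAttemptIdentifier. It assumes the host entry stays keyed by host and
port, with pending inputs tracked per partition inside that entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-host record for illustration; not the actual Tez InputHost. */
final class InputHostSketch {
    private final String identifier;
    // Partition id -> source attempts still to be fetched for that partition.
    private final Map<Integer, List<String>> inputsByPartition = new HashMap<>();

    InputHostSketch(String identifier) {
        this.identifier = identifier;
    }

    /** The host key covers host and port only; the partition id is not part of it. */
    static String createIdentifier(String hostName, int port) {
        return hostName + ":" + port;
    }

    String getIdentifier() {
        return identifier;
    }

    /** Records a source attempt under the given partition of this host. */
    void addKnownInput(int partitionId, String srcAttempt) {
        inputsByPartition
            .computeIfAbsent(partitionId, k -> new ArrayList<>())
            .add(srcAttempt);
    }

    /** Returns a read-only view of the inputs recorded for one partition. */
    List<String> inputsFor(int partitionId) {
        List<String> inputs = inputsByPartition.get(partitionId);
        return inputs == null
            ? Collections.<String>emptyList()
            : Collections.unmodifiableList(inputs);
    }

    int partitionCount() {
        return inputsByPartition.size();
    }

    /** Registers one (source attempt, partition) pair under its host:port entry. */
    static InputHostSketch register(Map<String, InputHostSketch> knownSrcHosts,
            String hostName, int port, int partitionId, String srcAttempt) {
        String identifier = createIdentifier(hostName, port);
        InputHostSketch host =
            knownSrcHosts.computeIfAbsent(identifier, InputHostSketch::new);
        host.addKnownInput(partitionId, srcAttempt);
        return host;
    }

    public static void main(String[] args) {
        Map<String, InputHostSketch> knownSrcHosts = new HashMap<>();
        // The same source task on the same host contributes two partitions.
        register(knownSrcHosts, "node1", 13562, 0, "attempt_0");
        register(knownSrcHosts, "node1", 13562, 1, "attempt_0");
        System.out.println(knownSrcHosts.size());                               // prints 1
        System.out.println(knownSrcHosts.get("node1:13562").partitionCount()); // prints 2
    }
}
```

In this sketch both partitions from the same source task land under a single
host entry, so one host lookup can still see every partition pending for that
host.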

> Add support for fetching multiple partitions from the same source task to 
> UnorderedKVInput
> ------------------------------------------------------------------------------------------
>
>                 Key: TEZ-3207
>                 URL: https://issues.apache.org/jira/browse/TEZ-3207
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Ming Ma
>         Attachments: TEZ-3207.patch
>
>
> The ordered grouped {{ShuffleScheduler}} supports fetching multiple
> partitions from the same source task. The unordered {{ShuffleManager}},
> however, supports only one partition per source task, because the
> {{identifier}} in the following code doesn't take the partition id into
> account.
> {noformat}
>   public void addKnownInput(String hostName, int port,
>       InputAttemptIdentifier srcAttemptIdentifier, int srcPhysicalIndex) {
>     String identifier = InputHost.createIdentifier(hostName, port);
>     InputHost host = knownSrcHosts.get(identifier);
>     ....
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
