[
https://issues.apache.org/jira/browse/TEZ-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250846#comment-15250846
]
Ming Ma commented on TEZ-3207:
------------------------------
[~sseth], ShuffleManager should be able to fetch all partitions. I will update
the patch to address your other comments.
* The AM gets the expected # of PhysicalInputs via
{{EdgeManagerPluginOnDemand#getNumDestinationTaskPhysicalInputs}} and uses that
spec to schedule the task.
* ShuffleManager is constructed with the expected # of PhysicalInputs. It then
compares the # of completed inputs with the expected # to decide whether it has
finished fetching all PhysicalInputs.
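
To illustrate the completion check described above, here is a minimal sketch. It is not the actual ShuffleManager code: the class {{InputCompletionTracker}} and its methods are hypothetical names, and it assumes that each PhysicalInput is identified by an index in the range 0 to expected-1.

```java
import java.util.BitSet;

/** Hypothetical helper for illustration only; not part of Tez. */
final class InputCompletionTracker {
    private final int expectedInputs;   // expected # of PhysicalInputs
    private final BitSet completed;     // one bit per PhysicalInput index

    InputCompletionTracker(int expectedInputs) {
        if (expectedInputs < 0) {
            throw new IllegalArgumentException("expectedInputs must be >= 0");
        }
        this.expectedInputs = expectedInputs;
        this.completed = new BitSet(expectedInputs);
    }

    /** Records a completed input; returns false if it was already recorded. */
    synchronized boolean markCompleted(int inputIndex) {
        if (inputIndex < 0 || inputIndex >= expectedInputs) {
            throw new IndexOutOfBoundsException("inputIndex: " + inputIndex);
        }
        boolean isNew = !completed.get(inputIndex);
        completed.set(inputIndex);
        return isNew;
    }

    /** True once the # of completed inputs equals the expected #. */
    synchronized boolean isDone() {
        return completed.cardinality() == expectedInputs;
    }
}
```

Tracking completed indexes in a bit set rather than a plain counter means that a duplicate completion event for the same input would not be counted twice.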
> Add support for fetching multiple partitions from the same source task to
> UnorderedKVInput
> ------------------------------------------------------------------------------------------
>
> Key: TEZ-3207
> URL: https://issues.apache.org/jira/browse/TEZ-3207
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Ming Ma
> Assignee: Ming Ma
> Attachments: TEZ-3207.patch
>
>
> The ordered grouped {{ShuffleScheduler}} can fetch multiple
> partitions from the same source task. The unordered {{ShuffleManager}},
> however, supports only one partition per source task, because the
> {{identifier}} in the following code doesn't take the partition id into account.
> {noformat}
> public void addKnownInput(String hostName, int port,
>     InputAttemptIdentifier srcAttemptIdentifier, int srcPhysicalIndex) {
>   String identifier = InputHost.createIdentifier(hostName, port);
>   InputHost host = knownSrcHosts.get(identifier);
>   ....
> }
> {noformat}
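
The effect of leaving the partition id out of the key can be shown with a small standalone sketch. This is not Tez code: {{HostKeySketch}}, both key methods, the key string format, and the host name and port are assumptions made for illustration, and the actual patch may solve the problem differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of Tez. */
final class HostKeySketch {
    // Key built from host and port only, like the identifier in the snippet above.
    static String hostOnlyKey(String host, int port) {
        return host + ":" + port;
    }

    // Assumed alternative: the key also includes the partition id.
    static String partitionAwareKey(String host, int port, int partitionId) {
        return host + ":" + port + ":" + partitionId;
    }

    /** Registers three partitions from one source host and counts distinct keys. */
    static int distinctKeys(boolean includePartition) {
        Map<String, Integer> known = new HashMap<>();
        for (int partition = 0; partition < 3; partition++) {
            String key = includePartition
                ? partitionAwareKey("node1", 13562, partition)
                : hostOnlyKey("node1", 13562);
            known.merge(key, 1, Integer::sum);
        }
        return known.size();
    }
}
```

With the host-only key, all three partitions map to a single entry, so the lookup cannot tell them apart. With the partition-aware key, each partition gets its own entry.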
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)