[ https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546010#comment-15546010 ]
Jonathan Eagles commented on TEZ-3440:
--------------------------------------
Thanks for the patch, [~nroberts]. The patch addresses the issue completely.
One minor point I would like to address before this gets checked in: can we
explicitly use the GzipCodec in the new test case? Relying on the DefaultCodec
may be fragile in case the default changes in the future.
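
To make the request concrete, here is a rough, standalone sketch of the kind
of round trip such a test could exercise: several gzip-compressed segments
written back to back on one stream, then read back one at a time. It is not
the patch and uses no Tez or Hadoop classes. It uses java.util.zip in place of
GzipCodec, and the class name, method names, and the [id][compressedLength]
framing are all hypothetical, chosen only for illustration. The point it shows
is that the reader consumes exactly compressedLength bytes per segment before
parsing the next header, so segments 2..n start at the right offset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not Tez's IFile or fetcher code.
public class MultiSegmentFetchSketch {

    // Compress one payload with gzip, chosen explicitly rather than via a default.
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        }
        return buf.toByteArray();
    }

    // Decompress one complete gzip member held in memory.
    public static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    // Assumed framing per segment: [int id][int compressedLength][compressed bytes].
    public static byte[] writeSegments(String[] payloads) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        for (int id = 0; id < payloads.length; id++) {
            byte[] compressed = gzip(payloads[id].getBytes(StandardCharsets.UTF_8));
            out.writeInt(id);
            out.writeInt(compressed.length);
            out.write(compressed);
        }
        out.flush();
        return wire.toByteArray();
    }

    // Read the segments back, keeping the shared stream aligned on segment boundaries.
    public static String[] readSegments(byte[] wire, int count) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        String[] result = new String[count];
        for (int i = 0; i < count; i++) {
            int id = in.readInt();
            if (id != i) {
                // A misaligned stream would surface here as a nonsense header.
                throw new IOException("Invalid segment id: " + id);
            }
            int compressedLength = in.readInt();
            byte[] compressed = new byte[compressedLength];
            in.readFully(compressed); // consume exactly compressedLength bytes
            result[i] = new String(gunzip(compressed), StandardCharsets.UTF_8);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String[] payloads = {"map output 0", "map output 1", "map output 2"};
        String[] roundTripped = readSegments(writeSegments(payloads), payloads.length);
        for (String s : roundTripped) {
            System.out.println(s);
        }
    }
}
```

A test along these lines needs at least two segments, since a problem of this
kind would only be visible from the second segment onward.
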
> Shuffling to memory can get out-of-sync when fetching multiple compressed map
> outputs
> -------------------------------------------------------------------------------------
>
> Key: TEZ-3440
> URL: https://issues.apache.org/jira/browse/TEZ-3440
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Nathan Roberts
> Assignee: Nathan Roberts
> Attachments: TEZ-3440.patch
>
>
> Not verified yet, but it certainly looks like Tez needs the same fix as
> MAPREDUCE-5308 in IFile.
> I noticed this because downstream tasks were reporting enough fetch
> failures that long-running upstream tasks had to be re-run, which makes the
> job run much longer than it needs to.
> It usually shows up as an "Invalid map id" error on a multi-map fetch, on
> parts 2 through n (i.e. never the first one).
>