[
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546457#comment-15546457
]
Nathan Roberts commented on TEZ-3440:
-------------------------------------
Thanks, Jon, for the comments. I changed the test to get the DefaultCodec
(i.e. zlib) explicitly instead of relying on setup(), just in case someone
decides to change it down the road. The test name also mentioned Gzip, which
was misleading: there is a separate Gzip codec, but this specific input is
intended to be used with DefaultCodec. I fixed the name as well.
> Shuffling to memory can get out-of-sync when fetching multiple compressed map
> outputs
> -------------------------------------------------------------------------------------
>
> Key: TEZ-3440
> URL: https://issues.apache.org/jira/browse/TEZ-3440
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Nathan Roberts
> Assignee: Nathan Roberts
> Attachments: TEZ-3440.patch
>
>
> I haven't verified this yet, but it certainly looks like Tez needs the same
> fix as MAPREDUCE-5308 in IFile.
> Specifically, I saw this because downstream tasks were reporting enough fetch
> failures that long-running upstream tasks had to be re-run, which makes the
> job run much longer than it needs to.
> It usually shows up as an "Invalid map id" error on a multi-map fetch, on
> parts 2 through n (i.e. never the first one).
>
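
For readers unfamiliar with this kind of failure, here is a minimal sketch of
how a multi-segment read can lose sync on every segment except the first. It
is not the Tez or MAPREDUCE-5308 patch. The framing ([int id][int length]
[payload]) and the names SegmentDrainSketch, twoSegments and readIds are
hypothetical. It assumes the reader stops before consuming the full declared
length of a segment, so the leftover bytes are misread as the next header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

final class SegmentDrainSketch {
    // Hypothetical framing, not the IFile format:
    // [int id][int length][length bytes of payload], repeated per segment.
    static byte[] twoSegments() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        for (int id = 1; id <= 2; id++) {
            out.writeInt(id);
            out.writeInt(8);
            out.write(new byte[] {-1, -1, -1, -1, -1, -1, -1, -1});
        }
        return buf.toByteArray();
    }

    // Reads the ids of two back-to-back segments. The consumer only needs
    // the first `used` bytes of each payload (assumption: it can finish
    // before the declared length is consumed). With drain == true the
    // remaining bytes are read and discarded before the next header.
    static int[] readIds(byte[] data, int used, boolean drain) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int[] ids = new int[2];
        for (int i = 0; i < 2; i++) {
            ids[i] = in.readInt();
            int length = in.readInt();
            in.readFully(new byte[used]);
            if (drain) {
                // Consume the rest of this segment so the stream is
                // positioned exactly at the next segment's header.
                in.readFully(new byte[length - used]);
            }
        }
        return ids;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = twoSegments();
        System.out.println(Arrays.toString(readIds(data, 4, true)));  // [1, 2]
        System.out.println(Arrays.toString(readIds(data, 4, false))); // [1, -1]
    }
}
```

In the undrained case the first id is still read correctly and only the
second one is garbage, which matches the "never the first one" symptom
described above.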