[jira] [Commented] (TEZ-3440) Shuffling to memory can get out-of-sync when fetching multiple compressed map outputs

2024-01-22 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809302#comment-17809302
 ] 

Ayush Saxena commented on TEZ-3440:
---

Hi [~nroberts], [~jeagles],

In TEZ-4533 we are discussing the file that was added in this ticket for 
testing; it looks to be a violation of the Apache bylaws. Could you share how 
this file was generated, or any other pointers?

> Shuffling to memory can get out-of-sync when fetching multiple compressed map 
> outputs
> -
>
> Key: TEZ-3440
> URL: https://issues.apache.org/jira/browse/TEZ-3440
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Nathan Roberts
> Assignee: Nathan Roberts
> Priority: Major
> Fix For: 0.7.2, 0.9.0, 0.8.5
>
> Attachments: TEZ-3440-v1.patch, TEZ-3440.patch
>
>
> Haven't verified yet, but it certainly looks like Tez needs the same fix as 
> MAPREDUCE-5308 in IFile.
> Specifically, I saw this because downstream tasks were reporting enough fetch 
> failures that long-running upstream tasks had to be re-run, which makes the job 
> run for much longer than it needs to.
> Usually shows itself as an "Invalid map id" error on a multi-map fetch on 
> part 2-n (i.e. never the first one).
>  
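The out-of-sync condition described above can be illustrated with a small, self-contained sketch. It is not the TEZ-3440 patch and does not use Tez or Hadoop classes: it uses java.util.zip as a stand-in, and the class and method names (ShuffleSyncSketch, CountingStream, consumedBeforeAndAfterDrain) are hypothetical. The 1-byte inflater buffer is an assumption chosen to mimic a reader that never reads ahead of what it needs. Under those assumptions, reading exactly the expected number of decompressed bytes can leave trailing compressed bytes (such as the zlib checksum) unread, so the next segment header on a shared stream would be read from the wrong offset. One extra read that expects end-of-stream forces the decompressor to consume that trailer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

final class ShuffleSyncSketch {

    /** Hypothetical helper: counts bytes pulled from the wrapped stream. */
    static final class CountingStream extends InputStream {
        private final InputStream in;
        int consumed = 0;

        CountingStream(InputStream in) { this.in = in; }

        @Override public int read() throws IOException {
            int b = in.read();
            if (b >= 0) consumed++;
            return b;
        }

        @Override public int read(byte[] buf, int off, int len) throws IOException {
            int n = in.read(buf, off, len);
            if (n > 0) consumed += n;
            return n;
        }
    }

    /** Compresses data in zlib format (header, deflate body, Adler-32 trailer). */
    static byte[] zlibCompress(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bos)) {
            out.write(data);
        }
        return bos.toByteArray();
    }

    /**
     * Returns {bytes consumed after reading the payload,
     *          bytes consumed after the extra drain read,
     *          compressed length of the first segment}.
     */
    static int[] consumedBeforeAndAfterDrain() throws IOException {
        byte[] payload = "first map output".getBytes(StandardCharsets.UTF_8);
        byte[] first = zlibCompress(payload);
        byte[] second = zlibCompress("second map output".getBytes(StandardCharsets.UTF_8));

        // Two compressed segments back to back, as on one multi-map fetch.
        byte[] wire = new byte[first.length + second.length];
        System.arraycopy(first, 0, wire, 0, first.length);
        System.arraycopy(second, 0, wire, first.length, second.length);

        CountingStream wireIn = new CountingStream(new ByteArrayInputStream(wire));
        Inflater inflater = new Inflater();
        // Assumption: a 1-byte buffer so the inflater never reads ahead.
        InflaterInputStream inflated = new InflaterInputStream(wireIn, inflater, 1);

        // Read exactly the number of decompressed bytes we expect.
        byte[] memory = new byte[payload.length];
        int off = 0;
        while (off < memory.length) {
            int n = inflated.read(memory, off, memory.length - off);
            if (n < 0) throw new EOFException("segment ended early");
            off += n;
        }
        int before = wireIn.consumed;

        // Drain step: one more read should report end-of-stream, which makes
        // the inflater consume any trailing bytes of this segment.
        if (inflated.read() >= 0) {
            throw new IOException("Unexpected extra bytes in segment");
        }
        int after = wireIn.consumed;
        inflater.end();
        return new int[] {before, after, first.length};
    }

    public static void main(String[] args) throws IOException {
        int[] r = consumedBeforeAndAfterDrain();
        System.out.println("consumed before drain: " + r[0] + " of " + r[2]);
        System.out.println("consumed after drain:  " + r[1] + " of " + r[2]);
    }
}
```

If the drain step is skipped, the underlying stream is still positioned inside the first segment, which is consistent with the symptom only appearing on parts 2-n of a multi-map fetch.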



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (TEZ-3440) Shuffling to memory can get out-of-sync when fetching multiple compressed map outputs

2016-10-04 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546709#comment-15546709
 ] 

TezQA commented on TEZ-3440:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12831604/TEZ-3440-v1.patch
  against master revision ad1fb62.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2012//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2012//console

This message is automatically generated.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3440) Shuffling to memory can get out-of-sync when fetching multiple compressed map outputs

2016-10-04 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546457#comment-15546457
 ] 

Nathan Roberts commented on TEZ-3440:
-

Thanks, Jon, for the comments. I changed the test to specifically get the 
DefaultCodec (i.e. zlib) instead of relying on setup(), just in case someone 
decides to change it down the road. The test was poorly named with Gzip, so I 
fixed that as well: there is a Gzip codec, but the specific input is intended 
to be used with DefaultCodec.
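
The zlib versus gzip distinction mentioned above matters because the two container formats are not interchangeable, even though both wrap deflate data. The following is a minimal sketch using java.util.zip as a stand-in for the Hadoop codecs (an assumption; the actual test uses Hadoop codec classes, which are not shown here). The class name CodecFormatSketch and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class CodecFormatSketch {

    /** Compresses data in zlib format (RFC 1950). */
    static byte[] zlib(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bos)) {
            out.write(data);
        }
        return bos.toByteArray();
    }

    /** Compresses data in gzip format (RFC 1952). */
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        }
        return bos.toByteArray();
    }

    /** Returns true if the bytes can be fully read as a gzip stream. */
    static boolean readableAsGzip(byte[] compressed) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            while (in.read() >= 0) {
                // discard decompressed bytes; we only care whether reading succeeds
            }
            return true;
        } catch (IOException e) {
            // Includes the ZipException thrown for a non-gzip header.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "test input".getBytes(StandardCharsets.UTF_8);
        System.out.println("gzip readable as gzip: " + readableAsGzip(gzip(data)));
        System.out.println("zlib readable as gzip: " + readableAsGzip(zlib(data)));
    }
}
```

A test fixture recorded in one format therefore has to be read with the matching codec, which is why naming the codec explicitly in the test is less fragile than relying on a default.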




[jira] [Commented] (TEZ-3440) Shuffling to memory can get out-of-sync when fetching multiple compressed map outputs

2016-10-04 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546465#comment-15546465
 ] 

Jonathan Eagles commented on TEZ-3440:
--

+1 pending Hadoop QA. I plan to put this in the 0.8 and 0.7 lines as well, 
since this is a critical bug and the fix is needed there too.



[jira] [Commented] (TEZ-3440) Shuffling to memory can get out-of-sync when fetching multiple compressed map outputs

2016-10-04 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546010#comment-15546010
 ] 

Jonathan Eagles commented on TEZ-3440:
--

Thanks for the patch, [~nroberts]. The patch addresses the issue completely. 
There is one minor point I would like to address before this gets checked in: 
can we explicitly use the GzipCodec in the new test case? Relying on the 
DefaultCodec may be fragile in case the default changes in the future.



[jira] [Commented] (TEZ-3440) Shuffling to memory can get out-of-sync when fetching multiple compressed map outputs

2016-09-30 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536211#comment-15536211
 ] 

Nathan Roberts commented on TEZ-3440:
-

Verified that the fix resolved the problem for a large job that was using gzip 
for intermediate data. Originally the job was seeing many thousands of fetch 
failures due to the compression stream getting out-of-sync; after the fix there 
were no fetch failures due to this particular issue.

I think it's good to go if someone has cycles to review.






[jira] [Commented] (TEZ-3440) Shuffling to memory can get out-of-sync when fetching multiple compressed map outputs

2016-09-23 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15518006#comment-15518006
 ] 

TezQA commented on TEZ-3440:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12830120/TEZ-3440.patch
  against master revision 5c2f893.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1988//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1988//console

This message is automatically generated.



[jira] [Commented] (TEZ-3440) Shuffling to memory can get out-of-sync when fetching multiple compressed map outputs

2016-09-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510699#comment-15510699
 ] 

Hitesh Shah commented on TEZ-3440:
--

cc [~rajesh.balamohan] [~sseth]
