[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2020-01-22 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021570#comment-17021570
 ] 

Jonathan Turner Eagles commented on TEZ-3391:
-

Actually, [~ahussein], can you update the summary to better reflect this? That 
way the commit message will have the correct summary as well, and it puts you in 
control of what the summary says. I'll commit once the summary is updated.

> MR split file validation should be done in the AM
> -
>
> Key: TEZ-3391
> URL: https://issues.apache.org/jira/browse/TEZ-3391
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: TEZ-3391.001.patch, TEZ-3391.002.patch
>
>
>   We had a case  where Split metadata size exceeded 1000. Instead of job 
> failing from validation during initialization in AM like mapreduce, each of 
> the tasks failed doing that validation during initialization.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2020-01-22 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021379#comment-17021379
 ] 

Jonathan Turner Eagles commented on TEZ-3391:
-

+1. Let's put this in. I spent some time verifying this doesn't break Pig or 
Hive. I'm going to change the summary to better reflect the new purpose of this 
jira.



[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2020-01-22 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021148#comment-17021148
 ] 

TezQA commented on TEZ-3391:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 31s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 10m 22s | master passed |
| +1 | compile | 0m 20s | master passed |
| +1 | checkstyle | 1m 20s | master passed |
| +1 | javadoc | 0m 34s | master passed |
| 0 | spotbugs | 1m 18s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 15s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 14s | the patch passed |
| +1 | compile | 0m 12s | the patch passed |
| +1 | javac | 0m 12s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 15s | the patch passed |
| +1 | findbugs | 0m 42s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 12s | tez-mapreduce in the patch passed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 18m 12s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/PreCommit-TEZ-Build/251/artifact/out/Dockerfile |
| JIRA Issue | TEZ-3391 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991529/TEZ-3391.002.patch |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 5681a9c7b74a 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 5b81017 |
| Default Java | 1.8.0_232 |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/251/testReport/ |
| Max. process+thread count | 208 (vs. ulimit of 5500) |
| modules | C: tez-mapreduce U: tez-mapreduce |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/251/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2020-01-22 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021119#comment-17021119
 ] 

Ahmed Hussein commented on TEZ-3391:


I agree with [~rohini] that the implementation is not efficient.
The ideal fix is to read the object array {{TaskSplitMetaInfo[]}} only once and 
do all the validation in the AM, then pass the {{TaskSplitMetaInfo[index]}} to 
the task initializer. This may imply significant code changes.
The existing code also has significant space overhead, because each task 
creates an array of split meta info for all splits. This means the code has 
{{n^2}} space complexity overall. The patch reduces the space complexity, but 
each task still needs to go through the entire meta file.

[~jeagles], Can you please take a look at the patch and merge it at your 
convenience?
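
The space argument above can be illustrated with a minimal sketch. It is not taken from the patch: {{SplitMeta}} and {{ReadOnceSketch}} are hypothetical names, and the entry counts only model how many metadata entries are held in memory across the AM and all tasks under each scheme.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one split's metadata; not the Tez/MapReduce class. */
final class SplitMeta {
  final int index;
  final long length;

  SplitMeta(int index, long length) {
    this.index = index;
    this.length = length;
  }
}

final class ReadOnceSketch {
  /** Every task materializes all n entries, so n tasks hold n * n entries in total. */
  static long entriesHeldWhenEachTaskReadsAll(int numTasks) {
    return (long) numTasks * numTasks;
  }

  /** The AM holds n entries once and each task holds only its own: n + n in total. */
  static long entriesHeldWhenReadOnce(int numTasks) {
    return 2L * numTasks;
  }

  /** Bounds-check once, then return the single entry that a given task needs. */
  static SplitMeta entryForTask(List<SplitMeta> all, int taskIndex) {
    if (taskIndex < 0 || taskIndex >= all.size()) {
      throw new IllegalArgumentException("No split for task " + taskIndex);
    }
    return all.get(taskIndex);
  }

  public static void main(String[] args) {
    List<SplitMeta> all = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      all.add(new SplitMeta(i, 128L * i));
    }
    System.out.println(entriesHeldWhenEachTaskReadsAll(all.size()));
    System.out.println(entriesHeldWhenReadOnce(all.size()));
    System.out.println(entryForTask(all, 7).length);
  }
}
```

With 1000 splits the first scheme holds a million entries across tasks, while the read-once scheme stays linear in the number of splits.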



[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2020-01-22 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021100#comment-17021100
 ] 

TezQA commented on TEZ-3391:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 15m 0s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 10m 29s | master passed |
| +1 | compile | 0m 21s | master passed |
| +1 | checkstyle | 1m 22s | master passed |
| +1 | javadoc | 0m 34s | master passed |
| 0 | spotbugs | 1m 11s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 9s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 13s | the patch passed |
| +1 | compile | 0m 13s | the patch passed |
| +1 | javac | 0m 13s | the patch passed |
| +1 | checkstyle | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 13s | the patch passed |
| +1 | findbugs | 0m 38s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 8s | tez-mapreduce in the patch passed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 32m 36s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/PreCommit-TEZ-Build/250/artifact/out/Dockerfile |
| JIRA Issue | TEZ-3391 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991477/TEZ-3391.001.patch |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 4a78af525a8f 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 5b81017 |
| Default Java | 1.8.0_232 |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/250/testReport/ |
| Max. process+thread count | 209 (vs. ulimit of 5500) |
| modules | C: tez-mapreduce U: tez-mapreduce |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/250/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2018-05-31 Thread Rohini Palaniswamy (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496968#comment-16496968
 ] 

Rohini Palaniswamy commented on TEZ-3391:
-

bq. 1) Moving the validation checks to AM
 You can skip this. Looking at the code, it is not an easy thing, as the AM will 
have to deconstruct it to MRInput and then perform the check.  It would be 
wasteful to do that just for this purpose. The validation check is better done 
by clients like Pig, which create the file.  They can do it even before 
submitting the DAG, which is even better.
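
A client-side check of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: {{SplitMetaSizeCheckSketch}} is a hypothetical class, it uses {{java.nio.file}} on a local file instead of the Hadoop {{FileSystem}} API, and treating a non-positive limit as "no limit" is an assumption of the sketch. The error text mirrors the message in the stack trace reported on this issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SplitMetaSizeCheckSketch {
  /** Fails fast if the split meta file is larger than the configured limit. */
  static void validateSize(long fileLength, long maxSize) {
    // Assumption of this sketch: a limit of zero or less disables the check.
    if (maxSize > 0 && fileLength > maxSize) {
      throw new IllegalStateException(
          "Split metadata size exceeded " + maxSize + ". Aborting job");
    }
  }

  /** Convenience wrapper: look up the file size, then apply the same check. */
  static void validateFile(Path metaFile, long maxSize) {
    try {
      validateSize(Files.size(metaFile), maxSize);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    // Write a 2048-byte stand-in for the split meta file and check it once,
    // before anything is submitted.
    Path metaFile = Files.createTempFile("job.splitmetainfo", ".tmp");
    try {
      Files.write(metaFile, new byte[2048]);
      validateFile(metaFile, 1000L);
      System.out.println("Split meta file is within the limit");
    } catch (IllegalStateException e) {
      System.out.println("Rejected before submission: " + e.getMessage());
    } finally {
      Files.deleteIfExists(metaFile);
    }
  }
}
```

Running the check once in the client means a too-large file produces a single failure up front rather than one failure per task attempt.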

> MR split file validation should be done in the AM
> -
>
> Key: TEZ-3391
> URL: https://issues.apache.org/jira/browse/TEZ-3391
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Nishant Dash
>Priority: Major
>
>   We had a case  where Split metadata size exceeded 1000. Instead of job 
> failing from validation during initialization in AM like mapreduce, each of 
> the tasks failed doing that validation during initialization.
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2018-05-29 Thread Nishant Dash (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493991#comment-16493991
 ] 

Nishant Dash commented on TEZ-3391:
---

[~rohini] With regard to point 1, moving the validation checks to the AM, I was 
hoping you could shed some more light on where, or at which point/step, the 
checks could be included?



[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2017-05-16 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013138#comment-16013138
 ] 

Rohini Palaniswamy commented on TEZ-3391:
-

Doing these two things will save a couple of milliseconds in each map vertex.

1) Move the validation checks to the AM.
2) In the vertex, construct TaskSplitMetaInfo only for the split of that task 
instead of constructing it for all splits, i.e. change
public static TaskSplitMetaInfo[] readSplitMetaInfo(Configuration conf, 
FileSystem fs)
to
public static TaskSplitMetaInfo getSplitMetaInfo(Configuration conf, 
FileSystem fs, int index)
and skip reading the splits that come after the index. If there are 1000 
splits, the first task will read 1 split, the second task will read 2 splits, 
and so on, instead of each task reading all 1000 splits as is happening now.

SplitMetaInfoReaderTez.java
{code}
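// Read split entries sequentially and stop at the requested index.
// in, numSplits, index and splitIndex are assumed to be set up by the
// surrounding method.
try {
  JobSplit.SplitMetaInfo splitMetaInfo = new JobSplit.SplitMetaInfo();
  for (int i = 0; i < numSplits; i++) {
    // Each readFields call consumes one entry and overwrites the previous one.
    splitMetaInfo.readFields(in);
    if (i == index) {
      return new JobSplit.TaskSplitMetaInfo(splitIndex,
          splitMetaInfo.getLocations(), splitMetaInfo.getInputDataLength());
    }
  }
} finally {
  in.close();
}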
{code}
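
For readers who want to try the read-up-to-index idea in isolation, here is a self-contained sketch. It does not use the Hadoop classes: the file layout (an int count followed by one long per split) and the names {{IndexedSplitReadSketch}}, {{writeLengths}} and {{readLengthAt}} are made up for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class IndexedSplitReadSketch {
  /** Serializes a count followed by one long per split (a made-up format). */
  static byte[] writeLengths(long[] lengths) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(lengths.length);
      for (long length : lengths) {
        out.writeLong(length);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Reads entries 0..index sequentially and returns only the last one read. */
  static long readLengthAt(byte[] metaFile, int index) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(metaFile))) {
      int numSplits = in.readInt();
      if (index < 0 || index >= numSplits) {
        throw new IllegalArgumentException(
            "Index " + index + " is out of range for " + numSplits + " splits");
      }
      long length = 0L;
      // Entries after the requested index are never read.
      for (int i = 0; i <= index; i++) {
        length = in.readLong();
      }
      return length;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] metaFile = writeLengths(new long[] {10L, 20L, 30L});
    System.out.println(readLengthAt(metaFile, 1));
  }
}
```

The reader keeps a single value in memory regardless of how many splits the file holds, which is the point of returning one entry rather than an array.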




[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2016-08-01 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402982#comment-15402982
 ] 

Jason Lowe commented on TEZ-3391:
-

bq. The Tez AM also reported that too many containers were running, while in 
practice they were not.

This was technically "correct" in the sense that the DAG status is reporting 
how many *tasks*, not attempts, are running.  One cannot assume that the 
counter being shown for "Running: " means how many task attempts or containers 
are currently executing.

A task goes into the running state as soon as the first attempt for it is 
launched.  In this particular case a large number of tasks all had one attempt 
start and then promptly fail.  That left the tasks in the running state.  Most 
were waiting for another attempt to launch with no attempt for them currently 
running.  The key distinction is task vs. attempt.  A task can be in the 
running state with no attempt currently running for it.  A separate 
RunningTaskAttempts counter being reported in the DAG status would have made 
this more explicit.
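
The task vs. attempt distinction can be modelled with a small sketch. This is a simplified illustration, not the Tez state machine: {{TaskStateSketch}} and its members are hypothetical, and retry limits and terminal failure states are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

final class TaskStateSketch {
  enum TaskState { NEW, RUNNING }

  /** Simplified task: tracks its own state separately from its attempts. */
  static final class Task {
    TaskState state = TaskState.NEW;
    int runningAttempts = 0;
    int failedAttempts = 0;

    /** Launching the first attempt moves the task to RUNNING. */
    void launchAttempt() {
      runningAttempts++;
      state = TaskState.RUNNING;
    }

    /** A failed attempt leaves the task RUNNING while it waits for a retry. */
    void attemptFailed() {
      runningAttempts--;
      failedAttempts++;
    }
  }

  /** Counts tasks in the RUNNING state, whether or not an attempt is active. */
  static int countRunningTasks(List<Task> tasks) {
    int count = 0;
    for (Task task : tasks) {
      if (task.state == TaskState.RUNNING) {
        count++;
      }
    }
    return count;
  }

  /** Counts attempts that are actually executing right now. */
  static int countRunningAttempts(List<Task> tasks) {
    int count = 0;
    for (Task task : tasks) {
      count += task.runningAttempts;
    }
    return count;
  }

  public static void main(String[] args) {
    List<Task> tasks = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      Task task = new Task();
      task.launchAttempt();
      task.attemptFailed();
      tasks.add(task);
    }
    System.out.println("Running tasks: " + countRunningTasks(tasks));
    System.out.println("Running attempts: " + countRunningAttempts(tasks));
  }
}
```

In this model every task whose first attempt failed still counts as running, while the number of running attempts is zero, which matches the gap between the reported "Running: " figure and the containers actually executing.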



[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM

2016-08-01 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402926#comment-15402926
 ] 

Rohini Palaniswamy commented on TEZ-3391:
-

The Tez AM also reported that too many containers were running, while in 
practice they were not.

{code}
2016-07-28 23:33:00,162 [Timer-1] INFO  
org.apache.pig.backend.hadoop.executionengine.tez.TezJob  - DAG Status: 
status=RUNNING, progress=TotalTasks: 311387 Succeeded: 69939 Running: 164954 
Failed: 0 Killed: 0 FailedTaskAttempts: 156417 KilledTaskAttempts: 4807, 
diagnostics=, counters=null 
{code}

{code}
2016-07-28 23:02:55,170 [INFO] [Dispatcher thread {Central}] 
|history.HistoryEventHandler|: 
[HISTORY][DAG:Dag_1468638337805_452514_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=scope-6, taskAttemptId=attempt_1468638337805_452514_1_03_13_0, 
creationTime=1469746848974, allocationTime=1469746936804, 
startTime=1469746969091, finishTime=1469746975099, timeTaken=6008, 
status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while 
running task:java.io.IOException: Split metadata size exceeded 1000. 
Aborting job 
    at org.apache.hadoop.mapreduce.split.SplitMetaInfoReaderTez.readSplitMetaInfo(SplitMetaInfoReaderTez.java:79)
    at org.apache.tez.mapreduce.lib.MRInputUtils.readSplits(MRInputUtils.java:53)
    at org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:470)
    at org.apache.tez.mapreduce.input.MRInput.initialize(MRInput.java:443)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeInputCallable._callInternal(LogicalIOProcessorRuntimeTask.java:446)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeInputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:429)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeInputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:414)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

It would be good to move this validation to MRInputSplitDistributor.
