[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021570#comment-17021570 ]

Jonathan Turner Eagles commented on TEZ-3391:
---------------------------------------------

Actually, [~ahussein], can you update the summary to better reflect this? That way the commit message will carry the correct summary as well, and you stay in control of what the summary says. I'll commit once the summary is updated.

> MR split file validation should be done in the AM
> -------------------------------------------------
>
>                 Key: TEZ-3391
>                 URL: https://issues.apache.org/jira/browse/TEZ-3391
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Ahmed Hussein
>            Priority: Major
>         Attachments: TEZ-3391.001.patch, TEZ-3391.002.patch
>
>
> We had a case where Split metadata size exceeded 1000. Instead of job
> failing from validation during initialization in AM like mapreduce, each of
> the tasks failed doing that validation during initialization.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021379#comment-17021379 ]

Jonathan Turner Eagles commented on TEZ-3391:
---------------------------------------------

+1. Let's put this in. I spent some time verifying that this doesn't break Pig or Hive. I'm going to change the summary to better reflect the new purpose of this jira.
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021148#comment-17021148 ]

TezQA commented on TEZ-3391:
----------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 31s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 10m 22s | master passed |
| +1 | compile | 0m 20s | master passed |
| +1 | checkstyle | 1m 20s | master passed |
| +1 | javadoc | 0m 34s | master passed |
| 0 | spotbugs | 1m 18s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 15s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 14s | the patch passed |
| +1 | compile | 0m 12s | the patch passed |
| +1 | javac | 0m 12s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 15s | the patch passed |
| +1 | findbugs | 0m 42s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 12s | tez-mapreduce in the patch passed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 18m 12s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/PreCommit-TEZ-Build/251/artifact/out/Dockerfile |
| JIRA Issue | TEZ-3391 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991529/TEZ-3391.002.patch |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 5681a9c7b74a 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 5b81017 |
| Default Java | 1.8.0_232 |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/251/testReport/ |
| Max. process+thread count | 208 (vs. ulimit of 5500) |
| modules | C: tez-mapreduce U: tez-mapreduce |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/251/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021119#comment-17021119 ]

Ahmed Hussein commented on TEZ-3391:
------------------------------------

I agree with [~rohini] that the implementation is not efficient. The ideal fix is to read the object array {{TaskSplitMetaInfo[]}} only once, do all the validation in the AM, and then pass {{TaskSplitMetaInfo[index]}} to the task initializer. This may imply significant code changes.

The existing code also has significant space overhead, because each task creates an array of split meta info covering all splits. Across all tasks this gives the code {{n^2}} space complexity. The patch reduces the space complexity, but each task still needs to go through the entire meta file.

[~jeagles], can you please take a look at the patch and merge it at your convenience?
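To make the space argument in the comment above concrete, here is a minimal sketch of the arithmetic, assuming n tasks and one split meta entry per task. The class and method names are hypothetical and are not part of Tez; the sketch only counts entries and does not read any split file.

```java
// Hypothetical illustration, not Tez code: counts how many split meta
// entries are held in memory across all tasks under each approach.
final class SplitMetaFootprint {

  // Behaviour described for the existing code: every task materializes
  // the full array of entries, so n tasks hold n * n entries in total.
  static long entriesWhenEachTaskReadsAll(long numSplits) {
    return numSplits * numSplits;
  }

  // Behaviour described for the patch: every task keeps only its own
  // entry, so n tasks hold n entries in total.
  static long entriesWhenEachTaskKeepsOne(long numSplits) {
    return numSplits;
  }

  public static void main(String[] args) {
    long n = 1000;
    System.out.println("each task reads all: " + entriesWhenEachTaskReadsAll(n));
    System.out.println("each task keeps one: " + entriesWhenEachTaskKeepsOne(n));
  }
}
```

The sketch covers memory only. As the comment notes, the read cost per task is a separate matter.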
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021100#comment-17021100 ]

TezQA commented on TEZ-3391:
----------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 15m 0s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 10m 29s | master passed |
| +1 | compile | 0m 21s | master passed |
| +1 | checkstyle | 1m 22s | master passed |
| +1 | javadoc | 0m 34s | master passed |
| 0 | spotbugs | 1m 11s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 9s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 13s | the patch passed |
| +1 | compile | 0m 13s | the patch passed |
| +1 | javac | 0m 13s | the patch passed |
| +1 | checkstyle | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 13s | the patch passed |
| +1 | findbugs | 0m 38s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 8s | tez-mapreduce in the patch passed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 32m 36s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/PreCommit-TEZ-Build/250/artifact/out/Dockerfile |
| JIRA Issue | TEZ-3391 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991477/TEZ-3391.001.patch |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 4a78af525a8f 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 5b81017 |
| Default Java | 1.8.0_232 |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/250/testReport/ |
| Max. process+thread count | 209 (vs. ulimit of 5500) |
| modules | C: tez-mapreduce U: tez-mapreduce |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/250/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496968#comment-16496968 ]

Rohini Palaniswamy commented on TEZ-3391:
-----------------------------------------

bq. 1) Moving the validation checks to AM

You can skip this. Looking at the code, it is not an easy thing, as the AM would have to deconstruct it to MRInput and then perform the check. It would be wasteful to do that just for this purpose. The validation check is better done by clients like Pig, which create the file. They can do it even before submitting the DAG, which is even better.

> MR split file validation should be done in the AM
> -------------------------------------------------
>
>                 Key: TEZ-3391
>                 URL: https://issues.apache.org/jira/browse/TEZ-3391
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Nishant Dash
>            Priority: Major
>
> We had a case where Split metadata size exceeded 1000. Instead of job
> failing from validation during initialization in AM like mapreduce, each of
> the tasks failed doing that validation during initialization.
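The suggestion in the comment above, that the client validates the split meta file before submitting the DAG, could look roughly like the sketch below. It is only an illustration under stated assumptions: it checks a local file with java.nio rather than a Hadoop FileSystem, the class and method names are hypothetical, and treating a non-positive limit as "no limit" is an assumption rather than something stated in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical client-side check, not Pig or Tez code.
final class SplitMetaSizeCheck {

  // Fails fast if the split meta file is larger than the configured limit.
  // Assumption: a limit of zero or less disables the check.
  static void validate(Path splitMetaFile, long maxSizeBytes) throws IOException {
    long size = Files.size(splitMetaFile);
    if (maxSizeBytes > 0 && size > maxSizeBytes) {
      throw new IOException(
          "Split metadata size " + size + " exceeded " + maxSizeBytes);
    }
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for the file a client would write before submitting the DAG.
    Path file = Files.createTempFile("job", ".splitmetainfo");
    try {
      Files.write(file, new byte[2048]);
      validate(file, 1000);
    } catch (IOException e) {
      System.out.println("Rejected before submission: " + e.getMessage());
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```

Running the check once on the client means a single failure before submission, instead of one failure per task attempt as in the report that opened this issue.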
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493991#comment-16493991 ]

Nishant Dash commented on TEZ-3391:
-----------------------------------

[~rohini] Regarding point 1, moving the validation checks to the AM: could you shed some more light on where, or at which point/step, the checks could be included?
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013138#comment-16013138 ]

Rohini Palaniswamy commented on TEZ-3391:
-----------------------------------------

Doing these two things will save a couple of millis in each map vertex.

1) Moving the validation checks to AM
2) In the vertex, construct TaskSplitMetaInfo only for the split of that task instead of constructing it for all splits, i.e. change
public static TaskSplitMetaInfo[] readSplitMetaInfo(Configuration conf, FileSystem fs)
to
public static TaskSplitMetaInfo getSplitMetaInfo(Configuration conf, FileSystem fs, int index)
and skip reading the splits that come after the index. If there are 1000 splits, the first task will read 1 split, the second task will read 2 splits and so on, instead of each task reading all 1000 splits as happens now.

SplitMetaInfoReaderTez.java
{code}
try {
  // One SplitMetaInfo instance is reused; each readFields call overwrites it.
  JobSplit.SplitMetaInfo splitMetaInfo = new JobSplit.SplitMetaInfo();
  for (int i = 0; i < numSplits; i++) {
    splitMetaInfo.readFields(in);
    if (i == index) {
      // splitIndex is not defined in this snippet; it is assumed to be
      // the TaskSplitIndex built for this entry.
      return new JobSplit.TaskSplitMetaInfo(splitIndex,
          splitMetaInfo.getLocations(), splitMetaInfo.getInputDataLength());
    }
  }
} finally {
  in.close();
}
{code}
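The second point in the comment above, reading records sequentially and stopping at the task's own index, can be tried out without Hadoop on the classpath. The sketch below is a stand-in under stated assumptions: the record layout (a count, then a location string and a length per record) and all names are hypothetical and are not the real split meta file format. Because the records are variable length, the reader cannot seek directly to the requested index and has to read every record before it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for an index-bounded meta file reader.
final class IndexedMetaReader {

  static final class Meta {
    final String location;
    final long length;

    Meta(String location, long length) {
      this.location = location;
      this.length = length;
    }
  }

  // Writes a record count followed by (location, length) records.
  static byte[] write(String[] locations, long[] lengths) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(locations.length);
      for (int i = 0; i < locations.length; i++) {
        out.writeUTF(locations[i]);
        out.writeLong(lengths[i]);
      }
    }
    return bytes.toByteArray();
  }

  // Reads records 0..index and returns only the record at index;
  // records after index are never read.
  static Meta readAt(byte[] file, int index) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
      int numSplits = in.readInt();
      if (index < 0 || index >= numSplits) {
        throw new IOException(
            "Index " + index + " is out of range for " + numSplits + " splits");
      }
      Meta current = null;
      for (int i = 0; i <= index; i++) {
        current = new Meta(in.readUTF(), in.readLong());
      }
      return current;
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] file = write(new String[] {"hostA", "hostB", "hostC"},
        new long[] {10L, 20L, 30L});
    Meta meta = readAt(file, 1);
    System.out.println(meta.location + " " + meta.length);
  }
}
```

The out-of-range check on the index is an addition in this sketch; the snippet in the comment does not show one.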
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402982#comment-15402982 ]

Jason Lowe commented on TEZ-3391:
---------------------------------

bq. The Tez AM also reported that too many containers were running, while in practice they were not.

This was technically "correct" in the sense that the DAG status is reporting how many *tasks*, not attempts, are running. One cannot assume that the counter shown for "Running: " means how many task attempts or containers are currently executing. A task goes into the running state as soon as the first attempt for it is launched. In this particular case a large number of tasks all had one attempt start and then promptly fail. That left the tasks in the running state. Most were waiting for another attempt to launch, with no attempt currently running for them.

The key distinction is task vs. attempt. A task can be in the running state with no attempt currently running for it. A separate RunningTaskAttempts counter reported in the DAG status would have made this more explicit.
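The task vs. attempt distinction described in the comment above can be shown with a deliberately simplified model. This is not Tez's actual state machine: the classes, states and method names are hypothetical, and retry limits and task failure are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the task vs. attempt distinction.
final class TaskStateSketch {

  enum AttemptState { RUNNING, FAILED }

  enum TaskState { SCHEDULED, RUNNING }

  static final class Task {
    private final List<AttemptState> attempts = new ArrayList<>();
    private TaskState state = TaskState.SCHEDULED;

    // Launching an attempt puts the task in the RUNNING state.
    int launchAttempt() {
      attempts.add(AttemptState.RUNNING);
      state = TaskState.RUNNING;
      return attempts.size() - 1;
    }

    // A failed attempt does not change the task state in this model: the
    // task stays RUNNING while it waits for another attempt to launch.
    void failAttempt(int attemptId) {
      attempts.set(attemptId, AttemptState.FAILED);
    }

    TaskState state() {
      return state;
    }

    // What a separate running-attempts counter would report.
    long runningAttempts() {
      return attempts.stream().filter(a -> a == AttemptState.RUNNING).count();
    }
  }

  public static void main(String[] args) {
    Task task = new Task();
    int first = task.launchAttempt();
    task.failAttempt(first);
    System.out.println("task state: " + task.state());
    System.out.println("running attempts: " + task.runningAttempts());
  }
}
```

In this model a "Running" count based on task state includes the task, while a count based on attempts does not, which is the gap the comment points out.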
[jira] [Commented] (TEZ-3391) MR split file validation should be done in the AM
[ https://issues.apache.org/jira/browse/TEZ-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402926#comment-15402926 ]

Rohini Palaniswamy commented on TEZ-3391:
-----------------------------------------

The Tez AM also reported that too many containers were running, while in practice they were not.

{code}
2016-07-28 23:33:00,162 [Timer-1] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 311387 Succeeded: 69939 Running: 164954 Failed: 0 Killed: 0 FailedTaskAttempts: 156417 KilledTaskAttempts: 4807, diagnostics=, counters=null
{code}

{code}
2016-07-28 23:02:55,170 [INFO] [Dispatcher thread {Central}] |history.HistoryEventHandler|: [HISTORY][DAG:Dag_1468638337805_452514_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=scope-6, taskAttemptId=attempt_1468638337805_452514_1_03_13_0, creationTime=1469746848974, allocationTime=1469746936804, startTime=1469746969091, finishTime=1469746975099, timeTaken=6008, status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while running task:java.io.IOException: Split metadata size exceeded 1000. Aborting job
    at org.apache.hadoop.mapreduce.split.SplitMetaInfoReaderTez.readSplitMetaInfo(SplitMetaInfoReaderTez.java:79)
    at org.apache.tez.mapreduce.lib.MRInputUtils.readSplits(MRInputUtils.java:53)
    at org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:470)
    at org.apache.tez.mapreduce.input.MRInput.initialize(MRInput.java:443)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeInputCallable._callInternal(LogicalIOProcessorRuntimeTask.java:446)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeInputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:429)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeInputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:414)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

It would be good to move this validation to MRInputSplitDistributor.