Failed: TEZ-3767 PreCommit Build #2549
Jira: https://issues.apache.org/jira/browse/TEZ-3767 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2549/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 339.39 KB...] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 55:13 min [INFO] Finished at: 2017-06-27T13:30:30Z [INFO] Final Memory: 89M/1409M [INFO] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874676/TEZ-3767.3.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in : org.apache.tez.test.TestTezJobs Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2549//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2549//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. a830ea6a26557b17e315819930c869d557771a7f logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3767) Shuffle should not report error to AM during inputContext.killSelf()
[ https://issues.apache.org/jira/browse/TEZ-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064830#comment-16064830 ] TezQA commented on TEZ-3767: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874676/TEZ-3767.3.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in : org.apache.tez.test.TestTezJobs Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2549//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2549//console This message is automatically generated. > Shuffle should not report error to AM during inputContext.killSelf() > > > Key: TEZ-3767 > URL: https://issues.apache.org/jira/browse/TEZ-3767 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3767.1.patch, TEZ-3767.2.patch, TEZ-3767.2.patch, > TEZ-3767.3.patch > > > {{ShuffleScheduler::killSelf}} kills the current attempt when it encounters > certain errors. As a part of cleanup, it invokes {{close}} which internally > releases the resources. > If merge is happening in the middle, it could throw the following exception. > This is caught in {{RunShuffleCallable}} and reported to AM immediately. This > causes tasks to fail. 
> {noformat} > » Error: Error while running task ( failure ) : > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: > Error while doing final merge > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:320) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1211) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1265) > at java.util.AbstractCollection.toArray(AbstractCollection.java:141) > at java.util.ArrayList.addAll(ArrayList.java:577) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:636) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:316) > ... 6 more > {noformat} > When {{isShutDown}} is set to true, it would be good to avoid sending error > messages to AM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
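The suggestion at the end of the TEZ-3767 description (do not report errors to the AM once `isShutDown` is set) can be illustrated with a minimal sketch. This is not the code from the attached patch: `ShutdownAwareRunner`, its `errorReporter` callback, and the `run` method are hypothetical stand-ins for `RunShuffleCallable` and the AM-reporting path, assumed here only to show the shape of the guard.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical stand-in for the shuffle runner; not the actual Tez class. */
final class ShutdownAwareRunner {
    // Set when the attempt is being killed and resources are being released.
    private final AtomicBoolean isShutDown = new AtomicBoolean(false);
    // Stands in for whatever reports a failure to the AM (an assumption).
    private final Consumer<Throwable> errorReporter;

    ShutdownAwareRunner(Consumer<Throwable> errorReporter) {
        this.errorReporter = errorReporter;
    }

    void shutDown() {
        isShutDown.set(true);
    }

    /** Runs the work; returns true on success, false if it threw. */
    boolean run(Runnable work) {
        try {
            work.run();
            return true;
        } catch (RuntimeException e) {
            // Errors raised while shutting down (for example a
            // ConcurrentModificationException during cleanup) are not reported.
            if (!isShutDown.get()) {
                errorReporter.accept(e);
            }
            return false;
        }
    }
}
```

With this shape, an exception thrown before `shutDown()` is still reported, while the same exception thrown afterwards is swallowed, so a cleanup race would not by itself fail the task.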
Failed: TEZ-3777 PreCommit Build #2550
Jira: https://issues.apache.org/jira/browse/TEZ-3777 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2550/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 111.84 KB...] Running tests /home/jenkins/tools/maven/latest/bin/mvn clean install -fn -DTezPatchProcess cat: /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build@2/../patchprocess/testrun.txt: No such file or directory awk: cannot open /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build@2/../patchprocess/testrun.txt (No such file or directory) {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874679/TEZ-3777.2.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2550//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2550//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 6a069e3c81c828edb6c6c8714cff0d69986804b8 logged out == == Finished build. == == Archiving artifacts ERROR: No artifacts found that match the file pattern "patchprocess/*.*". Configuration error? ERROR: 'patchprocess/*.*' doesn't match anything, but '*.*' does. Perhaps that's what you mean?
Build step 'Archive the artifacts' changed build result to FAILURE [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065275#comment-16065275 ] Siddharth Seth commented on TEZ-3605: - Thanks for the updated patch. Fixes large records as well. +1, with one minor fix before committing. In PipelinedSorter - if (combiner != null) will run into an NPE. A simple hasNext check there as well fixes this. > Detect and prune empty partitions for the Ordered case > -- > > Key: TEZ-3605 > URL: https://issues.apache.org/jira/browse/TEZ-3605 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3605.001.patch, TEZ-3605.002.patch, > TEZ-3605.003.patch, TEZ-3605.004.patch, TEZ-3605.005.patch, > TEZ-3605.006.patch, TEZ-3605.007.patch, TEZ-3605.008.patch, > TEZ-3605.009.patch, TEZ-3605.010.patch, TEZ-3605.011.patch, TEZ-3605.012.patch > > > Analogous to the Unordered case we should not have empty partition > entries/segments in the Ordered/DefaultSorter case. This will save writing > unnecessary data. > Additionally, with tez_shuffle feature (TEZ-3334), in a heavily auto reduced > job, this change would allow not fetching empty partitions and then throwing > them away. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
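The review comment above asks for a `hasNext` check before the `if (combiner != null)` branch in PipelinedSorter. A minimal sketch of that kind of guard follows; it does not reproduce the real PipelinedSorter code. `EmptyPartitionGuard`, `spillPartition`, and the `Function`-based combiner type are hypothetical, and treating an empty partition as a null or exhausted iterator is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration of skipping empty partitions before combining. */
final class EmptyPartitionGuard {
    /**
     * Returns the records to write for one partition.
     * The combiner type is an assumption: it consumes the partition's
     * records and returns the combined records.
     */
    static List<String> spillPartition(
            Iterator<String> records,
            Function<Iterator<String>, List<String>> combiner) {
        // Guard first: an empty partition produces no output, and neither the
        // combiner nor the plain copy path ever touches the iterator.
        if (records == null || !records.hasNext()) {
            return new ArrayList<>();
        }
        if (combiner != null) {
            return combiner.apply(records);
        }
        List<String> out = new ArrayList<>();
        records.forEachRemaining(out::add);
        return out;
    }
}
```

Placing the emptiness check ahead of both branches keeps the combiner and non-combiner paths consistent: neither writes a segment for a partition that has no records.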
[jira] [Commented] (TEZ-3767) Shuffle should not report error to AM during inputContext.killSelf()
[ https://issues.apache.org/jira/browse/TEZ-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065257#comment-16065257 ] Siddharth Seth commented on TEZ-3767: - +1. Not sure if the test failure is related. > Shuffle should not report error to AM during inputContext.killSelf() > > > Key: TEZ-3767 > URL: https://issues.apache.org/jira/browse/TEZ-3767 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3767.1.patch, TEZ-3767.2.patch, TEZ-3767.2.patch, > TEZ-3767.3.patch > > > {{ShuffleScheduler::killSelf}} kills the current attempt when it encounters > certain errors. As a part of cleanup, it invokes {{close}} which internally > releases the resources. > If merge is happening in the middle, it could throw the following exception. > This is caught in {{RunShuffleCallable}} and reported to AM immediately. This > causes tasks to fail. > {noformat} > » Error: Error while running task ( failure ) : > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: > Error while doing final merge > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:320) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1211) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1265) > at java.util.AbstractCollection.toArray(AbstractCollection.java:141) > at 
java.util.ArrayList.addAll(ArrayList.java:577) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:636) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:316) > ... 6 more > {noformat} > When {{isShutDown}} is set to true, it would be good to avoid sending error > messages to AM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065331#comment-16065331 ] Siddharth Seth commented on TEZ-3274: - Line 173: log should include the value for the source vertex instead of 0 (can be negative as well). I don't think the dependency on hadoop-mapreduce-client-core is required in tez-dag. (If it is required, the VM would need to move to the tez-runtime-library module.) Otherwise looks good to me, after addressing [~jeagles] comments. > Vertex with MRInput and broadcast input does not respect slow start > --- > > Key: TEZ-3274 > URL: https://issues.apache.org/jira/browse/TEZ-3274 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Eric Badger > Attachments: TEZ-3274.001.patch, TEZ-3274.002.patch, > TEZ-3274.003.patch, TEZ-3274.004.patch, TEZ-3274.005.patch > > > Vertices with shuffle input and MRInput choose RootInputVertexManager (and > not ShuffleVertexManager) and start containers and tasks immediately. In this > scenario, resources can be wasted since they do not respect > tez.shuffle-vertex-manager.min-src-fraction > tez.shuffle-vertex-manager.max-src-fraction. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
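To make the slow-start settings named in the TEZ-3274 description concrete, here is a rough sketch of how a minimum and maximum source fraction could gate task scheduling. It is an illustration under assumptions, not the RootInputVertexManager or ShuffleVertexManager implementation: the class `SlowStartSketch`, the method `tasksToSchedule`, and the linear ramp between the two fractions are all hypothetical.

```java
/** Hypothetical illustration of slow start driven by min/max source fractions. */
final class SlowStartSketch {
    /**
     * Returns how many of this vertex's tasks may be scheduled, given how many
     * source tasks have completed. minFraction and maxFraction play the role
     * of tez.shuffle-vertex-manager.min-src-fraction and
     * tez.shuffle-vertex-manager.max-src-fraction.
     */
    static int tasksToSchedule(int totalTasks,
                               int completedSourceTasks,
                               int totalSourceTasks,
                               double minFraction,
                               double maxFraction) {
        if (totalSourceTasks <= 0) {
            return totalTasks; // nothing to wait for
        }
        double completed = (double) completedSourceTasks / totalSourceTasks;
        if (completed < minFraction) {
            return 0; // below the minimum: hold all tasks back
        }
        if (maxFraction <= minFraction || completed >= maxFraction) {
            return totalTasks; // at or past the maximum: release everything
        }
        // Assumed behavior: ramp linearly between the two thresholds.
        double share = (completed - minFraction) / (maxFraction - minFraction);
        return (int) Math.floor(share * totalTasks);
    }
}
```

A vertex that starts all containers immediately behaves as if both fractions were 0, which is the resource waste the issue describes.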
[jira] [Resolved] (TEZ-3779) Tez query failed with OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/TEZ-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved TEZ-3779. - Resolution: Not A Bug Please send a mail to the hive-user list for issues like this.
> Tez query failed with OutOfMemoryError: Java heap space
> ---
>
> Key: TEZ-3779
> URL: https://issues.apache.org/jira/browse/TEZ-3779
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.5
> Reporter: Xin Yang
>
> Tez query failed with OutOfMemoryError
> Query:
> {code:java}
> select a11.ISSR_CTRY_CD CTRY_CD,
>   a14.DMSTC_INTL_IND DMSTC_INTL_IND,
>   a11.ISSR_USR_BUS_ID bus_id,
>   ' ' CustCol_73,
>   a11.CPD_MNTH_ID CPD_MONTH_ID,
>   a11.prod_afs_cd_vcis prod_acct_fund_srce_cd_vcis,
>   sum((Case when a13.card_prsnt_cd in (1) then a11.auth_tran_us_amt else NULL end)) AUTHTRANAMTUSD,
>   sum((Case when a13.card_prsnt_cd in (1) then a11.CS_TRAN_CNT else NULL end)) AUTHTRANCNT,
>   (Case when max((Case when a13.card_prsnt_cd in (1) then 1 else 0 end)) = 1 then count(distinct (Case when a13.card_prsnt_cd in (1) then a11.pymt_crd_acct_num_norm else NULL end)) else NULL end) WJXBFS1,
>   max((Case when a13.card_prsnt_cd in (1) then 1 else 0 end)) GODWFLAG1_1,
>   sum((Case when a13.card_prsnt_cd in (0) then a11.auth_tran_us_amt else NULL end)) AUTHTRANAMTUSD1,
>   sum((Case when a13.card_prsnt_cd in (0) then a11.CS_TRAN_CNT else NULL end)) AUTHTRANCNT1,
>   (Case when max((Case when a13.card_prsnt_cd in (0) then 1 else 0 end)) = 1 then count(distinct (Case when a13.card_prsnt_cd in (0) then a11.pymt_crd_acct_num_norm else NULL end)) else NULL end) WJXBFS2,
>   max((Case when a13.card_prsnt_cd in (0) then 1 else 0 end)) GODWFLAG4_1
> from opebi_bi.tcaef_auth_dtl_h a11
>   join OPCODE.TEDC_ECI_MOTO a12
>     on (a11.ECI_MOTO_CD = a12.ECI_MOTO_CD)
>   join OPCODE.TEDC_CARD_PRSNT_EBI a13
>     on (a11.POS_ENTRY_MODE_CD = a13.POS_ENTRY_MODE_CD and
>       a11.POS_ENV_CD = a13.POS_ENV_CD and
>       a12.eci_moto_grp_cd = a13.eci_moto_grp_cd)
>   join OPCODE.TEDC_ACCT_MRCH_JRSDCTN_CD a14
>     on (a11.VCIS_ACCT_MRCH_JRSDCTN_CD = a14.ACCT_MRCH_JRSDCTN_CD)
>   join OPCODE.TEDC_GLBL_PROD_ID a15
>     on (a11.ALP_ACCT_PROD_ID = a15.PROD_ID_CD)
>   join OPCODE.TEDC_AUTH_RESP_CD a16
>     on (a11.resp_cd = a16.AUTH_RESP_CD)
> where (a11.MRCH_CATG_CD not in (6010, 6011)
>   and a11.CPD_MNTH_ID BETWEEN 201602 and 201602
>   and a11.PROC_TRAN_CD in ('00')
>   and a11.ISSR_CTRY_CD in (76)
>   and a11.reqst_msg_typ_cd in ('0100', '0200')
>   and a16.AUTH_RESP_RLUP_CD in (0, 1, 4, 5)
>   and a11.resp_cd not in ('13', '--')
>   and a11.reqst_msg_typ_cd in ('0100', '0200', '')
>   and a11.stip_advc_cd in ('1', '2', '3', '4', '5', '6')
>   and a11.ACQR_BIN_NUM not in (746922)
>   and a15.PROD_BRND_CD in ('VISA')
>   and a15.PROD_ID_PLTFRM_CD in ('BZ', ' ', 'CN', 'GV', 'CO')
>   and a11.acqr_pcr_num not in ('8088', '9088')
>   and (a13.card_prsnt_cd in (1)
>     or a13.card_prsnt_cd in (0)))
> group by a11.ISSR_CTRY_CD,
>   a14.DMSTC_INTL_IND,
>   a11.ISSR_USR_BUS_ID,
>   a11.CPD_MNTH_ID,
>   a11.prod_afs_cd_vcis;
> {code}
> Stacktrace:
> {code:java}
> Status: Failed
> Vertex failed, vertexName=Map 3, vertexId=vertex_1495595408051_21107_2_03, diagnostics=[Task failed, taskId=task_1495595408051_21107_2_03_00, diagnostics=[TaskAttempt 0 failed, info=[Error: exceptionThrown=java.lang.OutOfMemoryError: Java heap space
>   at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
>   at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
>   at org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.<init>(MemoryFetchedInput.java:38)
>   at org.apache.tez.runtime.library.common.shuffle.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:141)
>   at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:717)
>   at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:489)
>   at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:398)
>   at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:195)
>   at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:70)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at
Failed: TEZ-3775 PreCommit Build #2557
Jira: https://issues.apache.org/jira/browse/TEZ-3775 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2557/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 345.82 KB...] [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-ui [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874768/TEZ-3775.3.patch against master revision de72fbe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2557//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2557//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 856faccd051556fd2bc5052126a13f7ea23d03da logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts Compressed 3.51 MB of artifacts by 29.4% relative to #2555 [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Comment Edited] (TEZ-3775) Tez UI: Show DAG context in document title
[ https://issues.apache.org/jira/browse/TEZ-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065231#comment-16065231 ] Jonathan Eagles edited comment on TEZ-3775 at 6/27/17 8:09 PM: --- [~rohini], changed to 1) Application Configuration and 2) Vertex Task Attempts. Left the ":" \-> "-" change, but could be changed if needed. was (Author: jeagles): [~rohini], changed to 1) Application Configuration and 2) Vertex Task Attempts. Left the ":" -> "-" change but could be changed if needed. > Tez UI: Show DAG context in document title > --- > > Key: TEZ-3775 > URL: https://issues.apache.org/jira/browse/TEZ-3775 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-3775.1.patch, TEZ-3775.2.patch > > > In Tez UI 0.7, DAG (vertex, app, task, attempt) context was shown in the > document title. This was lost in the 0.9 UI migration. This jira attempts to > bring that feature back. This feature is essential when supporting large > clusters where a dev or support person may have dozens of dags open at the > same time. Having context in the document title (the tab title), will allow > us to quickly navigate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated TEZ-3274: - Attachment: TEZ-3274.006.patch Thanks for the review, [~sseth]! Addressed your comments in this new patch. > Vertex with MRInput and broadcast input does not respect slow start > --- > > Key: TEZ-3274 > URL: https://issues.apache.org/jira/browse/TEZ-3274 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Eric Badger > Attachments: TEZ-3274.001.patch, TEZ-3274.002.patch, > TEZ-3274.003.patch, TEZ-3274.004.patch, TEZ-3274.005.patch, TEZ-3274.006.patch > > > Vertices with shuffle input and MRInput choose RootInputVertexManager (and > not ShuffleVertexManager) and start containers and tasks immediately. In this > scenario, resources can be wasted since they do not respect > tez.shuffle-vertex-manager.min-src-fraction > tez.shuffle-vertex-manager.max-src-fraction. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3605: - Attachment: TEZ-3605.013.patch Thanks a lot [~sseth] for the review comments. Uploading new patch with just the minor change to PipelinedSorter. > Detect and prune empty partitions for the Ordered case > -- > > Key: TEZ-3605 > URL: https://issues.apache.org/jira/browse/TEZ-3605 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3605.001.patch, TEZ-3605.002.patch, > TEZ-3605.003.patch, TEZ-3605.004.patch, TEZ-3605.005.patch, > TEZ-3605.006.patch, TEZ-3605.007.patch, TEZ-3605.008.patch, > TEZ-3605.009.patch, TEZ-3605.010.patch, TEZ-3605.011.patch, > TEZ-3605.012.patch, TEZ-3605.013.patch > > > Analogous to the Unordered case we should not have empty partition > entries/segments in the Ordered/DefaultSorter case. This will save writing > unnecessary data. > Additionally, with tez_shuffle feature (TEZ-3334), in a heavily auto reduced > job, this change would allow not fetching empty partitions and then throwing > them away. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Failed: TEZ-3775 PreCommit Build #2554
Jira: https://issues.apache.org/jira/browse/TEZ-3775 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2554/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 345.73 KB...] [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-ui [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874719/TEZ-3775.2.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2554//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2554//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. a88dc6df709f599325888e20e94d5db6ee7a14bf logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3775) Tez UI: Show DAG context in document title
[ https://issues.apache.org/jira/browse/TEZ-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065436#comment-16065436 ] TezQA commented on TEZ-3775: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874719/TEZ-3775.2.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2554//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2554//console This message is automatically generated. > Tez UI: Show DAG context in document title > --- > > Key: TEZ-3775 > URL: https://issues.apache.org/jira/browse/TEZ-3775 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-3775.1.patch, TEZ-3775.2.patch > > > In Tez UI 0.7, DAG (vertex, app, task, attempt) context was shown in the > document title. This was lost in the 0.9 UI migration. This jira attempts to > bring that feature back. This feature is essential when supporting large > clusters where a dev or support person may have dozens of dags open at the > same time. 
Having context in the document title (the tab title), will allow > us to quickly navigate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3778) Remove SecurityInfo from tez-auxservices shaded jar
[ https://issues.apache.org/jira/browse/TEZ-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065151#comment-16065151 ] TezQA commented on TEZ-3778: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874705/TEZ-3778.001.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.dag.app.rm.TestTaskScheduler Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2552//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2552//console This message is automatically generated. > Remove SecurityInfo from tez-auxservices shaded jar > --- > > Key: TEZ-3778 > URL: https://issues.apache.org/jira/browse/TEZ-3778 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Blocker > Attachments: TEZ-3778.001.patch, TEZ-3778.002.patch > > > After removing the yarn-client dependencies, DAGClientSecurityInfo in the > SecurityInfo META-INF services can cause RM and NMs to not come up as the > service is not part of the jar. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Failed: TEZ-3778 PreCommit Build #2552
Jira: https://issues.apache.org/jira/browse/TEZ-3778 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2552/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 331.35 KB...] [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-dag [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874705/TEZ-3778.001.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.dag.app.rm.TestTaskScheduler Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2552//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2552//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 09a9e7a5771edb9e0fe989de164b52111f2faa2a logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 1 tests failed. 
FAILED: org.apache.tez.dag.app.rm.TestTaskScheduler.testTaskSchedulerDetermineMinHeldContainers Error Message: AppFinalStatus cannot be returned by isSession() isSession() should return boolean *** If you're unsure why you're getting above error read on. Due to the nature of the syntax above problem might occur because: 1. This exception *might* occur in wrongly written multi-threaded tests. Please refer to Mockito FAQ on limitations of concurrency testing. 2. A spy is stubbed using when(spy.foo()).then() syntax. It is safer to stub spies - - with doReturn|Throw() family of methods. More in javadocs for Mockito.spy() method. Stack Trace: org.mockito.exceptions.misusing.WrongTypeOfReturnValue: AppFinalStatus cannot be returned by isSession() isSession() should return boolean *** If you're unsure why you're getting above error read on. Due to the nature of the syntax above problem might occur because: 1. This exception *might* occur in wrongly written multi-threaded tests. Please refer to Mockito FAQ on limitations of concurrency testing. 2. A spy is stubbed using when(spy.foo()).then() syntax. It is safer to stub spies - - with doReturn|Throw() family of methods. More in javadocs for Mockito.spy() method. at org.apache.tez.dag.app.rm.TestTaskScheduler.testTaskSchedulerDetermineMinHeldContainers(TestTaskScheduler.java:965)
Failed: TEZ-3274 PreCommit Build #2551
Jira: https://issues.apache.org/jira/browse/TEZ-3274 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2551/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 334.04 KB...] [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-mapreduce [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874704/TEZ-3274.005.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 24 javac compiler warnings (more than the master's current 21 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.mapreduce.hadoop.TestDeprecatedKeys Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2551//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2551//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2551//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2551//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. d1f0aff34b9120d0b4e5edb4224e043ed9ea logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 1 tests failed. FAILED: org.apache.tez.mapreduce.hadoop.TestDeprecatedKeys.verifyTezOverridenKeys Error Message: expected:<0.95> but was:<0.0> Stack Trace: java.lang.AssertionError: expected:<0.95> but was:<0.0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:519) at org.junit.Assert.assertEquals(Assert.java:609) at org.apache.tez.mapreduce.hadoop.TestDeprecatedKeys.verifyTezOverridenKeys(TestDeprecatedKeys.java:137)
[jira] [Commented] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065159#comment-16065159 ] TezQA commented on TEZ-3274: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874704/TEZ-3274.005.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 24 javac compiler warnings (more than the master's current 21 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.mapreduce.hadoop.TestDeprecatedKeys Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2551//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2551//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2551//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2551//console This message is automatically generated. 
> Vertex with MRInput and broadcast input does not respect slow start > --- > > Key: TEZ-3274 > URL: https://issues.apache.org/jira/browse/TEZ-3274 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Eric Badger > Attachments: TEZ-3274.001.patch, TEZ-3274.002.patch, > TEZ-3274.003.patch, TEZ-3274.004.patch, TEZ-3274.005.patch > > > Vertices with shuffle input and MRInput choose RootInputVertexManager (and > not ShuffleVertexManager) and start containers and tasks immediately. In this > scenario, resources can be wasted since they do not respect > tez.shuffle-vertex-manager.min-src-fraction > tez.shuffle-vertex-manager.max-src-fraction. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
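As a rough illustration of what respecting the two fractions quoted above would mean, the sketch below ramps the number of schedulable tasks linearly between a minimum and a maximum source-completion fraction. The class, method, and parameter names are hypothetical and the rounding is an assumption; this is not the ShuffleVertexManager or RootInputVertexManager code.

```java
// Hypothetical helper, not part of Tez: a linear slow-start ramp.
class SlowStartSketch {
    // Returns how many of totalTasks may be scheduled once completedSources
    // out of totalSources upstream tasks have finished.
    static int tasksToSchedule(int totalTasks, int completedSources, int totalSources,
                               double minFraction, double maxFraction) {
        if (totalSources == 0) {
            return totalTasks; // nothing to wait for
        }
        double done = (double) completedSources / totalSources;
        if (done < minFraction) {
            return 0; // below the minimum fraction: hold every task back
        }
        if (done >= maxFraction || maxFraction <= minFraction) {
            return totalTasks; // at or past the maximum fraction: release everything
        }
        // Between the two fractions: scale linearly (assumed rounding: floor).
        double ramp = (done - minFraction) / (maxFraction - minFraction);
        return (int) Math.floor(ramp * totalTasks);
    }
}
```

A vertex that starts all containers and tasks immediately behaves as if both fractions were 0, which is the resource waste the issue describes.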
[jira] [Commented] (TEZ-3775) Tez UI: Show DAG context in document title
[ https://issues.apache.org/jira/browse/TEZ-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065165#comment-16065165 ] Rohini Palaniswamy commented on TEZ-3775: - Noticed two things different from 0.7: - We are replacing the : separator used in 0.7 with - in 0.9. Not a big deal. - 0.7 did not have Details/Counters/Tasks/Attempts for vertex, task, attempt. This patch adds them, which is good. Just a couple of minor comments. 1) Application Configurations -> Application Configuration 2) Vertex Attempts -> Task Attempts or Vertex Task Attempts > Tez UI: Show DAG context in document title > --- > > Key: TEZ-3775 > URL: https://issues.apache.org/jira/browse/TEZ-3775 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-3775.1.patch > > > In Tez UI 0.7, DAG (vertex, app, task, attempt) context was shown in the > document title. This was lost in the 0.9 UI migration. This jira attempts to > bring that feature back. This feature is essential when supporting large > clusters where a dev or support person may have dozens of dags open at the > same time. Having context in the document title (the tab title), will allow > us to quickly navigate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3778) Remove SecurityInfo from tez-auxservices shaded jar
[ https://issues.apache.org/jira/browse/TEZ-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065167#comment-16065167 ] Kuhu Shukla commented on TEZ-3778: -- The test failure seems unrelated and passes locally. > Remove SecurityInfo from tez-auxservices shaded jar > --- > > Key: TEZ-3778 > URL: https://issues.apache.org/jira/browse/TEZ-3778 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Blocker > Attachments: TEZ-3778.001.patch, TEZ-3778.002.patch > > > After removing the yarn-client dependencies, DAGClientSecurityInfo in the > SecurityInfo META-INF services can cause RM and NMs to not come up as the > service is not part of the jar. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065516#comment-16065516 ] TezQA commented on TEZ-3605: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874729/TEZ-3605.013.patch against master revision de72fbe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2555//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2555//console This message is automatically generated. > Detect and prune empty partitions for the Ordered case > -- > > Key: TEZ-3605 > URL: https://issues.apache.org/jira/browse/TEZ-3605 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3605.001.patch, TEZ-3605.002.patch, > TEZ-3605.003.patch, TEZ-3605.004.patch, TEZ-3605.005.patch, > TEZ-3605.006.patch, TEZ-3605.007.patch, TEZ-3605.008.patch, > TEZ-3605.009.patch, TEZ-3605.010.patch, TEZ-3605.011.patch, > TEZ-3605.012.patch, TEZ-3605.013.patch > > > Analogous to the Unordered case we should not have empty partition > entries/segments in the Ordered/DefaultSorter case. This will save writing > unnecessary data. 
> Additionally, with tez_shuffle feature (TEZ-3334), in a heavily auto reduced > job, this change would allow not fetching empty partitions and then throwing > them away. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Success: TEZ-3605 PreCommit Build #2555
Jira: https://issues.apache.org/jira/browse/TEZ-3605 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2555/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 339.55 KB...] [INFO] Tez SUCCESS [ 0.020 s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 53:39 min [INFO] Finished at: 2017-06-27T21:36:20Z [INFO] Final Memory: 93M/1370M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874729/TEZ-3605.013.patch against master revision de72fbe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2555//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2555//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. a3a2ee220f567f975bdf36f564c9855222b87648 logged out == == Finished build. == == Archiving artifacts [description-setter] Description set: TEZ-3605 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065525#comment-16065525 ] Kuhu Shukla commented on TEZ-3605: -- [~sseth], with the latest patch running a clean pre-commit, request for one last review if needed, else I will commit this tomorrow if there are no objections from the community till then. Thanks! > Detect and prune empty partitions for the Ordered case > -- > > Key: TEZ-3605 > URL: https://issues.apache.org/jira/browse/TEZ-3605 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3605.001.patch, TEZ-3605.002.patch, > TEZ-3605.003.patch, TEZ-3605.004.patch, TEZ-3605.005.patch, > TEZ-3605.006.patch, TEZ-3605.007.patch, TEZ-3605.008.patch, > TEZ-3605.009.patch, TEZ-3605.010.patch, TEZ-3605.011.patch, > TEZ-3605.012.patch, TEZ-3605.013.patch > > > Analogous to the Unordered case we should not have empty partition > entries/segments in the Ordered/DefaultSorter case. This will save writing > unnecessary data. > Additionally, with tez_shuffle feature (TEZ-3334), in a heavily auto reduced > job, this change would allow not fetching empty partitions and then throwing > them away. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065583#comment-16065583 ] TezQA commented on TEZ-3274: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874745/TEZ-3274.006.patch against master revision de72fbe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 24 javac compiler warnings (more than the master's current 21 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.history.TestHistoryParser org.apache.tez.mapreduce.hadoop.TestDeprecatedKeys Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2556//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2556//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2556//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2556//console This message is automatically generated. 
> Vertex with MRInput and broadcast input does not respect slow start > --- > > Key: TEZ-3274 > URL: https://issues.apache.org/jira/browse/TEZ-3274 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Eric Badger > Attachments: TEZ-3274.001.patch, TEZ-3274.002.patch, > TEZ-3274.003.patch, TEZ-3274.004.patch, TEZ-3274.005.patch, TEZ-3274.006.patch > > > Vertices with shuffle input and MRInput choose RootInputVertexManager (and > not ShuffleVertexManager) and start containers and tasks immediately. In this > scenario, resources can be wasted since they do not respect > tez.shuffle-vertex-manager.min-src-fraction > tez.shuffle-vertex-manager.max-src-fraction. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Failed: TEZ-3274 PreCommit Build #2556
Jira: https://issues.apache.org/jira/browse/TEZ-3274 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2556/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 333.69 KB...] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-mapreduce [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874745/TEZ-3274.006.patch against master revision de72fbe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 24 javac compiler warnings (more than the master's current 21 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.history.TestHistoryParser org.apache.tez.mapreduce.hadoop.TestDeprecatedKeys Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2556//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2556//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2556//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2556//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 6109cbf0f98fbbafcf366c119894432a273604fd logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 3 tests failed. FAILED: org.apache.tez.mapreduce.hadoop.TestDeprecatedKeys.verifyTezOverridenKeys Error Message: expected:<0.95> but was:<0.0> Stack Trace: java.lang.AssertionError: expected:<0.95> but was:<0.0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:519) at org.junit.Assert.assertEquals(Assert.java:609) at org.apache.tez.mapreduce.hadoop.TestDeprecatedKeys.verifyTezOverridenKeys(TestDeprecatedKeys.java:137) FAILED: org.apache.tez.history.TestHistoryParser.testParserWithFailedJob Error Message: null Stack Trace: java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.history.TestHistoryParser.testParserWithFailedJob(TestHistoryParser.java:383) FAILED: org.apache.tez.history.TestHistoryParser.testParserWithSuccessfulJob Error Message: null Stack Trace: java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.history.TestHistoryParser.testParserWithSuccessfulJob(TestHistoryParser.java:207)
[jira] [Commented] (TEZ-3775) Tez UI: Show DAG context in document title
[ https://issues.apache.org/jira/browse/TEZ-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065589#comment-16065589 ] Rohini Palaniswamy commented on TEZ-3775: - A couple more comments: 1) "Tez UI" in each title seems redundant. I have to increase the tab width more than before (using https://addons.mozilla.org/en-US/firefox/addon/tree-style-tab/) to view the full title. If we are removing "Tez UI:", can we change - to : in 0.9 as well? That is consistent when dealing with multiple versions and easier on eyes like mine with OCD. 2) "Details" can also be removed for the same reason. DAG/Vertex/Task/Task Attempt is good enough for the details page. 3) 0.9 shows the vertex title as "Tez UI: Vertex Details - vertex_1492628984747_2502482_7_00". 0.7 shows the vertex title as "Vertex: File Merge - vertex_1492628984747_2502482_7_00" where "File Merge" is the name of the vertex in this hive DAG. In Pig DAGs you will have scope- in there. The name of the vertex is the most useful info. That needs to be added back. Thanks for fixing this. I have been going crazy without this when opening a lot of tabs and struggling to switch between them without a clue. > Tez UI: Show DAG context in document title > --- > > Key: TEZ-3775 > URL: https://issues.apache.org/jira/browse/TEZ-3775 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-3775.1.patch, TEZ-3775.2.patch > > > In Tez UI 0.7, DAG (vertex, app, task, attempt) context was shown in the > document title. This was lost in the 0.9 UI migration. This jira attempts to > bring that feature back. This feature is essential when supporting large > clusters where a dev or support person may have dozens of dags open at the > same time. Having context in the document title (the tab title), will allow > us to quickly navigate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3274) Vertex with MRInput and broadcast input does not respect slow start
[ https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065624#comment-16065624 ] Eric Badger commented on TEZ-3274: -- The test failure is related. Mapping the deprecated MR key to both the RootInput and Shuffle VertexManagers doesn't work. This means that the double mapping for {{MRJobConfig.GROUP_COMPARATOR_CLASS}} probably doesn't work either. {code} registerMRToRuntimeKeyTranslation(MRJobConfig.COMPLETED_MAPS_FOR_REDUCE_SLOWSTART, ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_MIN_SRC_FRACTION); registerMRToRuntimeKeyTranslation(MRJobConfig.COMPLETED_MAPS_FOR_REDUCE_SLOWSTART, RootInputVertexManager.TEZ_ROOT_INPUT_VERTEX_MANAGER_MIN_SRC_FRACTION); {code} [~jeagles], how should we address this? > Vertex with MRInput and broadcast input does not respect slow start > --- > > Key: TEZ-3274 > URL: https://issues.apache.org/jira/browse/TEZ-3274 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Eric Badger > Attachments: TEZ-3274.001.patch, TEZ-3274.002.patch, > TEZ-3274.003.patch, TEZ-3274.004.patch, TEZ-3274.005.patch, TEZ-3274.006.patch > > > Vertices with shuffle input and MRInput choose RootInputVertexManager (and > not ShuffleVertexManager) and start containers and tasks immediately. In this > scenario, resources can be wasted since they do not respect > tez.shuffle-vertex-manager.min-src-fraction > tez.shuffle-vertex-manager.max-src-fraction. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
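One plausible reason the double registration above does not work can be sketched as follows, under the assumption that translations are kept in a one-to-one map from MR key to a single runtime key. The class below is a hypothetical stand-in with placeholder key strings, not the Tez registry itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a key-translation registry; not the Tez class.
class KeyTranslationSketch {
    // Assumption: each MR key maps to exactly one runtime key.
    private final Map<String, String> mrToRuntime = new HashMap<>();

    // Registers a translation and returns the target it replaced, or null if none.
    String register(String mrKey, String runtimeKey) {
        return mrToRuntime.put(mrKey, runtimeKey);
    }

    // Returns the runtime key currently registered for mrKey, or null.
    String lookup(String mrKey) {
        return mrToRuntime.get(mrKey);
    }

    public static void main(String[] args) {
        KeyTranslationSketch registry = new KeyTranslationSketch();
        // Placeholder key strings, chosen only for illustration.
        registry.register("mr.slowstart", "shuffle.min-src-fraction");
        String replaced = registry.register("mr.slowstart", "root-input.min-src-fraction");
        System.out.println("replaced: " + replaced);
        System.out.println("current:  " + registry.lookup("mr.slowstart"));
    }
}
```

With a store like this, the second registration silently replaces the first, so only one of the two vertex managers would ever see the translated value. Supporting both would need a one-to-many structure, for example a map from the MR key to a list of runtime keys.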
[jira] [Updated] (TEZ-3775) Tez UI: Show DAG context in document title
[ https://issues.apache.org/jira/browse/TEZ-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-3775: - Attachment: TEZ-3775.3.patch [~rohini], I have addressed all your comments regarding formatting. Thank you a lot for the feedback. [~Sreenath], please have a review when you have a chance. I would like this to make the Tez 0.9 release if possible. > Tez UI: Show DAG context in document title > --- > > Key: TEZ-3775 > URL: https://issues.apache.org/jira/browse/TEZ-3775 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-3775.1.patch, TEZ-3775.2.patch, TEZ-3775.3.patch > > > In Tez UI 0.7, DAG (vertex, app, task, attempt) context was shown in the > document title. This was lost in the 0.9 UI migration. This jira attempts to > bring that feature back. This feature is essential when supporting large > clusters where a dev or support person may have dozens of dags open at the > same time. Having context in the document title (the tab title), will allow > us to quickly navigate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled
[ https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065652#comment-16065652 ] Zhiyuan Yang commented on TEZ-3769: --- +1 after fixing the javac warning. One nit: it seems unnecessary to create a new ArrayList here: {code} SpillCallable spillCallable = new SpillCallable(new ArrayList(filledBuffers), {code} > Unordered: Fix wrong stats being sent out in the last event, when final merge > is disabled > - > > Key: TEZ-3769 > URL: https://issues.apache.org/jira/browse/TEZ-3769 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan > Attachments: TEZ-3769.1.patch, TEZ-3769.2.patch > > > When final merge is disabled (without pipelining), wrong stats were sent out > in the last event. > They were based on {{numRecordsPerPartition}} which contains the overall > partition data. They should ideally be based on the spill result and its > buffers. > Also, {{finalSpill}} was unnecessarily sending events when no data was present > (i.e., when currentBuffer didn't have any data). This can be optimized to > reduce the number of events being sent across. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
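For context on the nit above, the sketch below shows what wrapping a list in a new ArrayList changes: the callee gets a snapshot that later mutations of the original list do not affect. Whether that snapshot is needed depends on whether the original list is modified after the hand-off, which this sketch does not establish for the Tez code. The class and method names are hypothetical, and strings stand in for the real buffer type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of sharing a list versus copying it.
class CopyVsShareSketch {
    // Returns {size seen through the shared reference, size seen through the copy}
    // after the original list has been cleared.
    static int[] sizesAfterClear() {
        List<String> filledBuffers = new ArrayList<>();
        filledBuffers.add("buffer-0");

        List<String> shared = filledBuffers;                     // same underlying list
        List<String> snapshot = new ArrayList<>(filledBuffers);  // independent shallow copy

        filledBuffers.clear(); // e.g. the producer reuses the list after handing it off

        return new int[] {shared.size(), snapshot.size()};
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterClear();
        System.out.println("shared sees " + sizes[0] + " element(s), snapshot sees " + sizes[1]);
    }
}
```

If the list is never touched again after being handed over, the copy only costs an extra allocation, which is the case the review comment appears to have in mind.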
[jira] [Commented] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled
[ https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065653#comment-16065653 ] Zhiyuan Yang commented on TEZ-3769: --- General discussion beyond this patch: 1. About the counter ADDITIONAL_SPILLS_BYTES_WRITTEN: there is a difference between the usage (final spill stats) and the documentation (bytes written due to unnecessary spills). 2. I think we should refactor this unordered writer sometime later. Right now it's stuffed with too many things and too many code paths are multiplexed. It'll get harder and harder to modify or review. > Unordered: Fix wrong stats being sent out in the last event, when final merge > is disabled > - > > Key: TEZ-3769 > URL: https://issues.apache.org/jira/browse/TEZ-3769 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan > Attachments: TEZ-3769.1.patch, TEZ-3769.2.patch > > > When final merge is disabled (without pipelining), wrong stats were sent out > in the last event. > They were based on {{numRecordsPerPartition}} which contains the overall > partition data. They should ideally be based on the spill result and its > buffers. > Also, {{finalSpill}} was unnecessarily sending events when no data was present > (i.e., when currentBuffer didn't have any data). This can be optimized to > reduce the number of events being sent across. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled
[ https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065653#comment-16065653 ] Zhiyuan Yang edited comment on TEZ-3769 at 6/27/17 11:19 PM: - General discussion beyond this patch: 1. About the counter ADDITIONAL_SPILLS_BYTES_WRITTEN: there is a difference between the usage (final spill stats) and the documentation (bytes written due to unnecessary spills). If the final spill size is not useful, we can merge it into the normal counter. Otherwise we just fix the documentation/comments. 2. I think we should refactor this unordered writer sometime later. Right now it's stuffed with too many things and too many code paths are multiplexed. It'll get harder and harder to modify or review. was (Author: aplusplus): General discussion beyond this patch: 1. About the counter ADDITIONAL_SPILLS_BYTES_WRITTEN: there is a difference between the usage (final spill stats) and the documentation (bytes written due to unnecessary spills). 2. I think we should refactor this unordered writer sometime later. Right now it's stuffed with too many things and too many code paths are multiplexed. It'll get harder and harder to modify or review. > Unordered: Fix wrong stats being sent out in the last event, when final merge > is disabled > - > > Key: TEZ-3769 > URL: https://issues.apache.org/jira/browse/TEZ-3769 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan > Attachments: TEZ-3769.1.patch, TEZ-3769.2.patch > > > When final merge is disabled (without pipelining), wrong stats were sent out > in the last event. > They were based on {{numRecordsPerPartition}} which contains the overall > partition data. They should ideally be based on the spill result and its > buffers. > Also, {{finalSpill}} was unnecessarily sending events when no data was present > (i.e., when currentBuffer didn't have any data). This can be optimized to > reduce the number of events being sent across. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3778) Remove SecurityInfo from tez-auxservices shaded jar
[ https://issues.apache.org/jira/browse/TEZ-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3778: - Attachment: TEZ-3778.002.patch Added a comment to the pom.xml. > Remove SecurityInfo from tez-auxservices shaded jar > --- > > Key: TEZ-3778 > URL: https://issues.apache.org/jira/browse/TEZ-3778 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Blocker > Attachments: TEZ-3778.001.patch, TEZ-3778.002.patch > > > After removing the yarn-client dependencies, DAGClientSecurityInfo in the > SecurityInfo META-INF services can cause RM and NMs to not come up as the > service is not part of the jar. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3778) Remove SecurityInfo from tez-auxservices shaded jar
[ https://issues.apache.org/jira/browse/TEZ-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3778: - Attachment: TEZ-3778.001.patch v1 patch that simply takes out the SecurityInfo service from META-INF > Remove SecurityInfo from tez-auxservices shaded jar > --- > > Key: TEZ-3778 > URL: https://issues.apache.org/jira/browse/TEZ-3778 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Blocker > Attachments: TEZ-3778.001.patch > > > After removing the yarn-client dependencies, DAGClientSecurityInfo in the > SecurityInfo META-INF services can cause RM and NMs to not come up as the > service is not part of the jar. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (TEZ-3779) Tez OutOfMemoryError: Java heap space
Xin Yang created TEZ-3779: - Summary: Tez OutOfMemoryError: Java heap space Key: TEZ-3779 URL: https://issues.apache.org/jira/browse/TEZ-3779 Project: Apache Tez Issue Type: Bug Affects Versions: 0.8.5 Reporter: Xin Yang
Query:
{code:java}
select a11.ISSR_CTRY_CD CTRY_CD,
    a14.DMSTC_INTL_IND DMSTC_INTL_IND,
    a11.ISSR_USR_BUS_ID bus_id,
    ' ' CustCol_73,
    a11.CPD_MNTH_ID CPD_MONTH_ID,
    a11.prod_afs_cd_vcis prod_acct_fund_srce_cd_vcis,
    sum((Case when a13.card_prsnt_cd in (1) then a11.auth_tran_us_amt else NULL end)) AUTHTRANAMTUSD,
    sum((Case when a13.card_prsnt_cd in (1) then a11.CS_TRAN_CNT else NULL end)) AUTHTRANCNT,
    (Case when max((Case when a13.card_prsnt_cd in (1) then 1 else 0 end)) = 1 then count(distinct (Case when a13.card_prsnt_cd in (1) then a11.pymt_crd_acct_num_norm else NULL end)) else NULL end) WJXBFS1,
    max((Case when a13.card_prsnt_cd in (1) then 1 else 0 end)) GODWFLAG1_1,
    sum((Case when a13.card_prsnt_cd in (0) then a11.auth_tran_us_amt else NULL end)) AUTHTRANAMTUSD1,
    sum((Case when a13.card_prsnt_cd in (0) then a11.CS_TRAN_CNT else NULL end)) AUTHTRANCNT1,
    (Case when max((Case when a13.card_prsnt_cd in (0) then 1 else 0 end)) = 1 then count(distinct (Case when a13.card_prsnt_cd in (0) then a11.pymt_crd_acct_num_norm else NULL end)) else NULL end) WJXBFS2,
    max((Case when a13.card_prsnt_cd in (0) then 1 else 0 end)) GODWFLAG4_1
from opebi_bi.tcaef_auth_dtl_h a11
    join OPCODE.TEDC_ECI_MOTO a12
        on (a11.ECI_MOTO_CD = a12.ECI_MOTO_CD)
    join OPCODE.TEDC_CARD_PRSNT_EBI a13
        on (a11.POS_ENTRY_MODE_CD = a13.POS_ENTRY_MODE_CD and
            a11.POS_ENV_CD = a13.POS_ENV_CD and
            a12.eci_moto_grp_cd = a13.eci_moto_grp_cd)
    join OPCODE.TEDC_ACCT_MRCH_JRSDCTN_CD a14
        on (a11.VCIS_ACCT_MRCH_JRSDCTN_CD = a14.ACCT_MRCH_JRSDCTN_CD)
    join OPCODE.TEDC_GLBL_PROD_ID a15
        on (a11.ALP_ACCT_PROD_ID = a15.PROD_ID_CD)
    join OPCODE.TEDC_AUTH_RESP_CD a16
        on (a11.resp_cd = a16.AUTH_RESP_CD)
where (a11.MRCH_CATG_CD not in (6010, 6011)
    and a11.CPD_MNTH_ID BETWEEN 201602 and 201602
    and a11.PROC_TRAN_CD in ('00')
    and a11.ISSR_CTRY_CD in (76)
    and a11.reqst_msg_typ_cd in ('0100', '0200')
    and a16.AUTH_RESP_RLUP_CD in (0, 1, 4, 5)
    and a11.resp_cd not in ('13', '--')
    and a11.reqst_msg_typ_cd in ('0100', '0200', '')
    and a11.stip_advc_cd in ('1', '2', '3', '4', '5', '6')
    and a11.ACQR_BIN_NUM not in (746922)
    and a15.PROD_BRND_CD in ('VISA')
    and a15.PROD_ID_PLTFRM_CD in ('BZ', ' ', 'CN', 'GV', 'CO')
    and a11.acqr_pcr_num not in ('8088', '9088')
    and (a13.card_prsnt_cd in (1)
        or a13.card_prsnt_cd in (0)))
group by a11.ISSR_CTRY_CD,
    a14.DMSTC_INTL_IND,
    a11.ISSR_USR_BUS_ID,
    a11.CPD_MNTH_ID,
    a11.prod_afs_cd_vcis;
{code}
Stacktrace:
{code:java}
Status: Failed
Vertex failed, vertexName=Map 3, vertexId=vertex_1495595408051_21107_2_03, diagnostics=[Task failed, taskId=task_1495595408051_21107_2_03_00, diagnostics=[TaskAttempt 0 failed, info=[Error: exceptionThrown=java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
    at org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.<init>(MemoryFetchedInput.java:38)
    at org.apache.tez.runtime.library.common.shuffle.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:141)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:717)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:489)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:398)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:195)
    at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:70)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
, errorMessage=Fetch failed:java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
    at
[jira] [Updated] (TEZ-3779) Tez query failed with OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/TEZ-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xin Yang updated TEZ-3779: -- Summary: Tez query failed with OutOfMemoryError: Java heap space (was: Tez OutOfMemoryError: Java heap space) > Tez query failed with OutOfMemoryError: Java heap space > --- > > Key: TEZ-3779 > URL: https://issues.apache.org/jira/browse/TEZ-3779 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.5 >Reporter: Xin Yang > > Query: > {code:java} > select a11.ISSR_CTRY_CD CTRY_CD, > a14.DMSTC_INTL_IND DMSTC_INTL_IND, > a11.ISSR_USR_BUS_ID bus_id, > ' ' CustCol_73, > a11.CPD_MNTH_ID CPD_MONTH_ID, > a11.prod_afs_cd_vcis prod_acct_fund_srce_cd_vcis, > sum((Case when a13.card_prsnt_cd in (1) then a11.auth_tran_us_amt > else NULL end)) AUTHTRANAMTUSD, > sum((Case when a13.card_prsnt_cd in (1) then a11.CS_TRAN_CNT else > NULL end)) AUTHTRANCNT, > (Case when max((Case when a13.card_prsnt_cd in (1) then 1 else 0 > end)) = 1 then count(distinct (Case when a13.card_prsnt_cd in (1) then > a11.pymt_crd_acct_num_norm else NULL end)) else NULL end) WJXBFS1, > max((Case when a13.card_prsnt_cd in (1) then 1 else 0 end)) > GODWFLAG1_1, > sum((Case when a13.card_prsnt_cd in (0) then a11.auth_tran_us_amt > else NULL end)) AUTHTRANAMTUSD1, > sum((Case when a13.card_prsnt_cd in (0) then a11.CS_TRAN_CNT else > NULL end)) AUTHTRANCNT1, > (Case when max((Case when a13.card_prsnt_cd in (0) then 1 else 0 > end)) = 1 then count(distinct (Case when a13.card_prsnt_cd in (0) then > a11.pymt_crd_acct_num_norm else NULL end)) else NULL end) WJXBFS2, > max((Case when a13.card_prsnt_cd in (0) then 1 else 0 end)) > GODWFLAG4_1 > fromopebi_bi.tcaef_auth_dtl_h a11 > joinOPCODE.TEDC_ECI_MOTOa12 > on(a11.ECI_MOTO_CD = a12.ECI_MOTO_CD) > joinOPCODE.TEDC_CARD_PRSNT_EBI a13 > on(a11.POS_ENTRY_MODE_CD = a13.POS_ENTRY_MODE_CD and > a11.POS_ENV_CD = a13.POS_ENV_CD and > a12.eci_moto_grp_cd = a13.eci_moto_grp_cd) > joinOPCODE.TEDC_ACCT_MRCH_JRSDCTN_CDa14 > 
on(a11.VCIS_ACCT_MRCH_JRSDCTN_CD = a14.ACCT_MRCH_JRSDCTN_CD) > joinOPCODE.TEDC_GLBL_PROD_IDa15 > on(a11.ALP_ACCT_PROD_ID = a15.PROD_ID_CD) > joinOPCODE.TEDC_AUTH_RESP_CDa16 > on(a11.resp_cd = a16.AUTH_RESP_CD) > where (a11.MRCH_CATG_CD not in (6010, 6011) > and a11.CPD_MNTH_ID BETWEEN 201602 and 201602 > and a11.PROC_TRAN_CD in ('00') > and a11.ISSR_CTRY_CD in (76) > and a11.reqst_msg_typ_cd in ('0100', '0200') > and a16.AUTH_RESP_RLUP_CD in (0, 1, 4, 5) > and a11.resp_cd not in ('13', '--') > and a11.reqst_msg_typ_cd in ('0100', '0200', '') > and a11.stip_advc_cd in ('1', '2', '3', '4', '5', '6') > and a11.ACQR_BIN_NUM not in (746922) > and a15.PROD_BRND_CD in ('VISA') > and a15.PROD_ID_PLTFRM_CD in ('BZ', ' ', 'CN', 'GV', 'CO') > and a11.acqr_pcr_num not in ('8088', '9088') > and (a13.card_prsnt_cd in (1) > or a13.card_prsnt_cd in (0))) > group bya11.ISSR_CTRY_CD, > a14.DMSTC_INTL_IND, > a11.ISSR_USR_BUS_ID, > a11.CPD_MNTH_ID, > a11.prod_afs_cd_vcis; > {code} > Stacktrace: > {code:java} > Status: Failed > Vertex failed, vertexName=Map 3, vertexId=vertex_1495595408051_21107_2_03, > diagnostics=[Task failed, taskId=task_1495595408051_21107_2_03_00, > diagnostics=[TaskAttempt 0 failed, info=[Error: exceptio > nThrown=java.lang.OutOfMemoryError: Java heap space > at > org.apache.hadoop.io.BoundedByteArrayOutputStream.(BoundedByteArrayOutputStream.java:56) > at > org.apache.hadoop.io.BoundedByteArrayOutputStream.(BoundedByteArrayOutputStream.java:46) > at > org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.(MemoryFetchedInput.java:38) > at > org.apache.tez.runtime.library.common.shuffle.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:141) > at > org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:717) > at > org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:489) > at > org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:398) > at > 
org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:195) > at > org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:70) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at >
[jira] [Updated] (TEZ-3779) Tez query failed with OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/TEZ-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xin Yang updated TEZ-3779: -- Description: Tez query failed with OutOfMemoryError Query: {code:java} select a11.ISSR_CTRY_CD CTRY_CD, a14.DMSTC_INTL_IND DMSTC_INTL_IND, a11.ISSR_USR_BUS_ID bus_id, ' ' CustCol_73, a11.CPD_MNTH_ID CPD_MONTH_ID, a11.prod_afs_cd_vcis prod_acct_fund_srce_cd_vcis, sum((Case when a13.card_prsnt_cd in (1) then a11.auth_tran_us_amt else NULL end)) AUTHTRANAMTUSD, sum((Case when a13.card_prsnt_cd in (1) then a11.CS_TRAN_CNT else NULL end)) AUTHTRANCNT, (Case when max((Case when a13.card_prsnt_cd in (1) then 1 else 0 end)) = 1 then count(distinct (Case when a13.card_prsnt_cd in (1) then a11.pymt_crd_acct_num_norm else NULL end)) else NULL end) WJXBFS1, max((Case when a13.card_prsnt_cd in (1) then 1 else 0 end)) GODWFLAG1_1, sum((Case when a13.card_prsnt_cd in (0) then a11.auth_tran_us_amt else NULL end)) AUTHTRANAMTUSD1, sum((Case when a13.card_prsnt_cd in (0) then a11.CS_TRAN_CNT else NULL end)) AUTHTRANCNT1, (Case when max((Case when a13.card_prsnt_cd in (0) then 1 else 0 end)) = 1 then count(distinct (Case when a13.card_prsnt_cd in (0) then a11.pymt_crd_acct_num_norm else NULL end)) else NULL end) WJXBFS2, max((Case when a13.card_prsnt_cd in (0) then 1 else 0 end)) GODWFLAG4_1 fromopebi_bi.tcaef_auth_dtl_h a11 joinOPCODE.TEDC_ECI_MOTOa12 on(a11.ECI_MOTO_CD = a12.ECI_MOTO_CD) joinOPCODE.TEDC_CARD_PRSNT_EBI a13 on(a11.POS_ENTRY_MODE_CD = a13.POS_ENTRY_MODE_CD and a11.POS_ENV_CD = a13.POS_ENV_CD and a12.eci_moto_grp_cd = a13.eci_moto_grp_cd) joinOPCODE.TEDC_ACCT_MRCH_JRSDCTN_CDa14 on(a11.VCIS_ACCT_MRCH_JRSDCTN_CD = a14.ACCT_MRCH_JRSDCTN_CD) joinOPCODE.TEDC_GLBL_PROD_IDa15 on(a11.ALP_ACCT_PROD_ID = a15.PROD_ID_CD) joinOPCODE.TEDC_AUTH_RESP_CDa16 on(a11.resp_cd = a16.AUTH_RESP_CD) where (a11.MRCH_CATG_CD not in (6010, 6011) and a11.CPD_MNTH_ID BETWEEN 201602 and 201602 and a11.PROC_TRAN_CD in ('00') and a11.ISSR_CTRY_CD in (76) 
and a11.reqst_msg_typ_cd in ('0100', '0200') and a16.AUTH_RESP_RLUP_CD in (0, 1, 4, 5) and a11.resp_cd not in ('13', '--') and a11.reqst_msg_typ_cd in ('0100', '0200', '') and a11.stip_advc_cd in ('1', '2', '3', '4', '5', '6') and a11.ACQR_BIN_NUM not in (746922) and a15.PROD_BRND_CD in ('VISA') and a15.PROD_ID_PLTFRM_CD in ('BZ', ' ', 'CN', 'GV', 'CO') and a11.acqr_pcr_num not in ('8088', '9088') and (a13.card_prsnt_cd in (1) or a13.card_prsnt_cd in (0))) group bya11.ISSR_CTRY_CD, a14.DMSTC_INTL_IND, a11.ISSR_USR_BUS_ID, a11.CPD_MNTH_ID, a11.prod_afs_cd_vcis; {code} Stacktrace: {code:java} Status: Failed Vertex failed, vertexName=Map 3, vertexId=vertex_1495595408051_21107_2_03, diagnostics=[Task failed, taskId=task_1495595408051_21107_2_03_00, diagnostics=[TaskAttempt 0 failed, info=[Error: exceptio nThrown=java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.io.BoundedByteArrayOutputStream.(BoundedByteArrayOutputStream.java:56) at org.apache.hadoop.io.BoundedByteArrayOutputStream.(BoundedByteArrayOutputStream.java:46) at org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.(MemoryFetchedInput.java:38) at org.apache.tez.runtime.library.common.shuffle.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:141) at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:717) at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:489) at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:398) at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:195) at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:70) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) , errorMessage=Fetch failed:java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.io.BoundedByteArrayOutputStream.(BoundedByteArrayOutputStream.java:56) at org.apache.hadoop.io.BoundedByteArrayOutputStream.(BoundedByteArrayOutputStream.java:46) at org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.(MemoryFetchedInput.java:38) at
[jira] [Comment Edited] (TEZ-3778) Remove SecurityInfo from tez-auxservices shaded jar
[ https://issues.apache.org/jira/browse/TEZ-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065167#comment-16065167 ] Kuhu Shukla edited comment on TEZ-3778 at 6/27/17 6:01 PM: --- The test failure seems unrelated and passes locally. Investigating... was (Author: kshukla): The test failure seems unrelated and passes locally. > Remove SecurityInfo from tez-auxservices shaded jar > --- > > Key: TEZ-3778 > URL: https://issues.apache.org/jira/browse/TEZ-3778 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Blocker > Attachments: TEZ-3778.001.patch, TEZ-3778.002.patch > > > After removing the yarn-client dependencies, DAGClientSecurityInfo in the > SecurityInfo META-INF services can cause RM and NMs to not come up as the > service is not part of the jar. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Failed: TEZ-3778 PreCommit Build #2553
Jira: https://issues.apache.org/jira/browse/TEZ-3778 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2553/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 339.26 KB...] [INFO] Total time: 01:01 h [INFO] Finished at: 2017-06-27T18:10:25Z [INFO] Final Memory: 80M/1461M [INFO] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874712/TEZ-3778.002.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2553//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2553//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 76597fce883b6b6bc18f536ccbdecb0d0efcdb1e logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts Compressed 3.50 MB of artifacts by 25.0% relative to #2548 [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3778) Remove SecurityInfo from tez-auxservices shaded jar
[ https://issues.apache.org/jira/browse/TEZ-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065229#comment-16065229 ] TezQA commented on TEZ-3778: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874712/TEZ-3778.002.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2553//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2553//console This message is automatically generated. > Remove SecurityInfo from tez-auxservices shaded jar > --- > > Key: TEZ-3778 > URL: https://issues.apache.org/jira/browse/TEZ-3778 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Blocker > Attachments: TEZ-3778.001.patch, TEZ-3778.002.patch > > > After removing the yarn-client dependencies, DAGClientSecurityInfo in the > SecurityInfo META-INF services can cause RM and NMs to not come up as the > service is not part of the jar. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3777) Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter
[ https://issues.apache.org/jira/browse/TEZ-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-3777: -- Attachment: TEZ-3777.1.patch > Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter > - > > Key: TEZ-3777 > URL: https://issues.apache.org/jira/browse/TEZ-3777 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3777.1.patch > > > RLE information computed in PipelinedSorter can be passed on to > {{TezMerger.merge}} in {{PipelinedSorter::flush}}. Depending on the dataset, > this can save a lot of data copies. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (TEZ-3777) Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter
Rajesh Balamohan created TEZ-3777: - Summary: Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter Key: TEZ-3777 URL: https://issues.apache.org/jira/browse/TEZ-3777 Project: Apache Tez Issue Type: Bug Reporter: Rajesh Balamohan RLE information computed in PipelinedSorter can be passed on to {{TezMerger.merge}} in {{PipelinedSorter::flush}}. Depending on the dataset, this can save a lot of data copies. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (TEZ-3777) Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter
[ https://issues.apache.org/jira/browse/TEZ-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan reassigned TEZ-3777: - Assignee: Rajesh Balamohan > Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter > - > > Key: TEZ-3777 > URL: https://issues.apache.org/jira/browse/TEZ-3777 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > > RLE information computed in PipelinedSorter can be passed on to > {{TezMerger.merge}} in {{PipelinedSorter::flush}}. Depending on the dataset, > this can save a lot of data copies. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3718) Better handling of 'bad' nodes
[ https://issues.apache.org/jira/browse/TEZ-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064683#comment-16064683 ] TezQA commented on TEZ-3718: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874660/TEZ-3718.2.patch against master revision 5b0f5a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2548//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2548//console This message is automatically generated. > Better handling of 'bad' nodes > -- > > Key: TEZ-3718 > URL: https://issues.apache.org/jira/browse/TEZ-3718 > Project: Apache Tez > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Zhiyuan Yang > Attachments: TEZ-3718.1.patch, TEZ-3718.2.patch > > > At the moment, the default behaviour in case of a node being marked bad is to > do nothing other than not schedule new tasks on this node. > The alternative, via config, is to retroactively kill every task which ran on > the node, which causes far too many unnecessary re-runs. > Proposing the following changes. > 1. KILL fragments which are currently in the RUNNING state (instead of > relying on a timeout which leads to the attempt being marked as FAILED after > the timeout interval). > 2. 
Keep track of these failed nodes, and use this as input to the failure > heuristics. Normally source tasks require multiple consumers to report > failure for them to be marked as bad. If a single consumer reports failure > against a source which ran on a bad node, consider it bad and re-schedule > immediately. (Otherwise failures can take a while to propagate, and jobs get > a lot slower). > [~jlowe] - think you've looked at this in the past. Any thoughts/suggestions? > What I'm seeing is retroactive failures taking a long time to apply, and > restart sources which ran on a bad node. Also running tasks being counted as > FAILURES instead of KILLS. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
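The second proposal in TEZ-3718 (treat a single consumer report as sufficient when the source ran on a known-bad node) can be sketched roughly as follows. This is only an illustration of the heuristic as described in the issue, not the code in the attached patches; the class name, method names, and the `reportsRequired` threshold are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the failure heuristic; not a Tez class. */
final class SourceFailureHeuristic {
    private final int reportsRequired; // assumed threshold for sources on healthy nodes
    private final Set<String> badNodes = new HashSet<>();
    private final Map<String, Set<String>> reportersBySource = new HashMap<>();

    SourceFailureHeuristic(int reportsRequired) {
        this.reportsRequired = reportsRequired;
    }

    /** Remembers that a node was marked bad so later reports can use it as input. */
    void markNodeBad(String nodeId) {
        badNodes.add(nodeId);
    }

    /**
     * Records one consumer's failure report against a source attempt.
     * Returns true when the source should be considered bad and re-scheduled.
     */
    boolean onConsumerReport(String sourceAttempt, String sourceNode, String consumer) {
        Set<String> reporters =
            reportersBySource.computeIfAbsent(sourceAttempt, k -> new HashSet<>());
        reporters.add(consumer); // duplicate reports from one consumer count once
        if (badNodes.contains(sourceNode)) {
            return true; // a single report is enough for a source that ran on a bad node
        }
        return reporters.size() >= reportsRequired;
    }
}
```

With a threshold of 2, one report against a source on a healthy node returns false, while the same report against a source on a node passed to `markNodeBad` returns true immediately.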
[jira] [Updated] (TEZ-3777) Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter
[ https://issues.apache.org/jira/browse/TEZ-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-3777: -- Attachment: TEZ-3777.2.patch > Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter > - > > Key: TEZ-3777 > URL: https://issues.apache.org/jira/browse/TEZ-3777 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3777.1.patch, TEZ-3777.2.patch > > > RLE information computed in PipelinedSorter can be passed on to > {{TezMerger.merge}} in {{PipelinedSorter::flush}}. Depending on the dataset, > this can save a lot of data copies. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3767) Shuffle should not report error to AM during inputContext.killSelf()
[ https://issues.apache.org/jira/browse/TEZ-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-3767: -- Attachment: TEZ-3767.3.patch Uploading .3 patch; addressing review comments. > Shuffle should not report error to AM during inputContext.killSelf() > > > Key: TEZ-3767 > URL: https://issues.apache.org/jira/browse/TEZ-3767 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3767.1.patch, TEZ-3767.2.patch, TEZ-3767.2.patch, > TEZ-3767.3.patch > > > {{ShuffleScheduler::killSelf}} kills the current attempt when it encounters > certain errors. As a part of cleanup, it invokes {{close}} which internally > releases the resources. > If merge is happening in the middle, it could throw the following exception. > This is caught in {{RunShuffleCallable}} and reported to AM immediately. This > causes tasks to fail. > {noformat} > » Error: Error while running task ( failure ) : > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: > Error while doing final merge > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:320) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1211) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1265) > at java.util.AbstractCollection.toArray(AbstractCollection.java:141) > at 
java.util.ArrayList.addAll(ArrayList.java:577) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:636) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:316) > ... 6 more > {noformat} > When {{isShutDown}} is set to true, it would be good to avoid sending error > messages to AM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
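The behaviour requested in TEZ-3767 (do not report the cleanup failure to the AM once {{isShutDown}} is set) can be illustrated with a minimal sketch. It assumes a simplified runner with a shutdown flag; `ShuffleRunnerSketch` and its methods are hypothetical stand-ins, not the classes changed by the patch. The helper `closeWhileMergeMutates` reproduces, in a single thread, the kind of `ConcurrentModificationException` shown in the trace above by modifying a `TreeSet` while iterating it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the shuffle runner; this is not the Tez class. */
final class ShuffleRunnerSketch {
    private final AtomicBoolean isShutDown = new AtomicBoolean(false);
    private final List<String> reportedToAm = new ArrayList<>();

    /** Marks the attempt as shutting down, as killSelf is described to do. */
    void killSelf() {
        isShutDown.set(true);
    }

    /** Runs the cleanup step; reports a failure only if no shutdown is in progress. */
    void runFinalMerge(Runnable closeMergeManager) {
        try {
            closeMergeManager.run();
        } catch (RuntimeException e) {
            if (isShutDown.get()) {
                return; // attempt is already being killed: do not report to the AM
            }
            reportedToAm.add("Error while doing final merge: " + e);
        }
    }

    List<String> reportedErrors() {
        return reportedToAm;
    }

    /** Single-threaded reproduction of a ConcurrentModificationException on a TreeSet. */
    static void closeWhileMergeMutates() {
        TreeSet<String> onDiskOutputs = new TreeSet<>(Arrays.asList("spill-0", "spill-1"));
        for (String ignored : onDiskOutputs) {
            onDiskOutputs.add("spill-2"); // structural change while iterating
        }
    }
}
```

Calling `runFinalMerge(ShuffleRunnerSketch::closeWhileMergeMutates)` on a fresh instance records one error, whereas calling it after `killSelf()` records none.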
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064752#comment-16064752 ] Kuhu Shukla commented on TEZ-3605: -- Request for comments/review on the latest patch [~sseth], [~jeagles]. Thanks a lot! > Detect and prune empty partitions for the Ordered case > -- > > Key: TEZ-3605 > URL: https://issues.apache.org/jira/browse/TEZ-3605 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3605.001.patch, TEZ-3605.002.patch, > TEZ-3605.003.patch, TEZ-3605.004.patch, TEZ-3605.005.patch, > TEZ-3605.006.patch, TEZ-3605.007.patch, TEZ-3605.008.patch, > TEZ-3605.009.patch, TEZ-3605.010.patch, TEZ-3605.011.patch, TEZ-3605.012.patch > > > Analogous to the Unordered case we should not have empty partition > entries/segments in the Ordered/DefaultSorter case. This will save writing > unnecessary data. > Additionally, with tez_shuffle feature (TEZ-3334), in a heavily auto reduced > job, this change would allow not fetching empty partitions and then throwing > them away. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
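The idea behind TEZ-3605 can be sketched as a producer recording which partitions are empty and a consumer skipping them. The sketch below assumes the empty-partition information is carried as a bitmap; the class and method names are hypothetical and do not reflect the attached patches.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical illustration of empty-partition pruning; not Tez code. */
final class EmptyPartitionPruning {
    /** Producer side: flags every partition that holds no data. */
    static BitSet emptyPartitions(long[] partitionSizes) {
        BitSet empty = new BitSet(partitionSizes.length);
        for (int p = 0; p < partitionSizes.length; p++) {
            if (partitionSizes[p] == 0) {
                empty.set(p);
            }
        }
        return empty;
    }

    /** Consumer side: lists only the partitions that are worth fetching. */
    static List<Integer> partitionsToFetch(int numPartitions, BitSet empty) {
        List<Integer> result = new ArrayList<>();
        for (int p = empty.nextClearBit(0); p < numPartitions; p = empty.nextClearBit(p + 1)) {
            result.add(p);
        }
        return result;
    }
}
```

For partition sizes `{10, 0, 0, 7}`, the bitmap marks partitions 1 and 2, and only partitions 0 and 3 are listed for fetching.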
[jira] [Updated] (TEZ-3718) Better handling of 'bad' nodes
[ https://issues.apache.org/jira/browse/TEZ-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiyuan Yang updated TEZ-3718: -- Attachment: TEZ-3718.2.patch Upload new patch to address comments. > Better handling of 'bad' nodes > -- > > Key: TEZ-3718 > URL: https://issues.apache.org/jira/browse/TEZ-3718 > Project: Apache Tez > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Zhiyuan Yang > Attachments: TEZ-3718.1.patch, TEZ-3718.2.patch > > > At the moment, the default behaviour in case of a node being marked bad is to > do nothing other than not schedule new tasks on this node. > The alternative, via config, is to retroactively kill every task which ran on > the node, which causes far too many unnecessary re-runs. > Proposing the following changes. > 1. KILL fragments which are currently in the RUNNING state (instead of > relying on a timeout which leads to the attempt being marked as FAILED after > the timeout interval). > 2. Keep track of these failed nodes, and use this as input to the failure > heuristics. Normally source tasks require multiple consumers to report > failure for them to be marked as bad. If a single consumer reports failure > against a source which ran on a bad node, consider it bad and re-schedule > immediately. (Otherwise failures can take a while to propagate, and jobs get > a lot slower). > [~jlowe] - think you've looked at this in the past. Any thoughts/suggestions? > What I'm seeing is retroactive failures taking a long time to apply, and > restart sources which ran on a bad node. Also running tasks being counted as > FAILURES instead of KILLS. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3767) Shuffle should not report error to AM during inputContext.killSelf()
[ https://issues.apache.org/jira/browse/TEZ-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064602#comment-16064602 ] Siddharth Seth commented on TEZ-3767: - Should this also have the checks for isShutdown / previously reported errors, like reportException? so that there isn't a race between a previous exception and a killSelf invocation. > Shuffle should not report error to AM during inputContext.killSelf() > > > Key: TEZ-3767 > URL: https://issues.apache.org/jira/browse/TEZ-3767 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3767.1.patch, TEZ-3767.2.patch, TEZ-3767.2.patch > > > {{ShuffleScheduler::killSelf}} kills the current attempt when it encounters > certain errors. As a part of cleanup, it invokes {{close}} which internally > releases the resources. > If merge is happening in the middle, it could throw the following exception. > This is caught in {{RunShuffleCallable}} and reported to AM immediately. This > causes tasks to fail. 
> {noformat} > » Error: Error while running task ( failure ) : > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: > Error while doing final merge > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:320) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1211) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1265) > at java.util.AbstractCollection.toArray(AbstractCollection.java:141) > at java.util.ArrayList.addAll(ArrayList.java:577) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:636) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:316) > ... 6 more > {noformat} > When {{isShutDown}} is set to true, it would be good to avoid sending error > messages to AM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
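The review comment above asks about a race between a previously reported exception and a killSelf invocation. One common way to make such terminal actions mutually exclusive is a single compare-and-set on the shutdown flag, sketched below. This is an assumption about how the check could look, not the logic of the actual patch; `TerminalActionGuard` and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical guard ensuring only the first terminal action takes effect. */
final class TerminalActionGuard {
    private final AtomicBoolean isShutDown = new AtomicBoolean(false);
    private final AtomicInteger failuresReported = new AtomicInteger();
    private final AtomicInteger killsRequested = new AtomicInteger();

    /** Reports a failure unless a shutdown (error or kill) already won the race. */
    boolean reportException(Throwable t) {
        if (!isShutDown.compareAndSet(false, true)) {
            return false; // something else already terminated this attempt
        }
        failuresReported.incrementAndGet();
        return true;
    }

    /** Requests a kill unless an error or an earlier kill was already handled. */
    boolean killSelf(String reason) {
        if (!isShutDown.compareAndSet(false, true)) {
            return false;
        }
        killsRequested.incrementAndGet();
        return true;
    }

    int failuresReported() {
        return failuresReported.get();
    }

    int killsRequested() {
        return killsRequested.get();
    }
}
```

Because both paths go through the same `compareAndSet`, whichever of `reportException` or `killSelf` runs first wins, and the other becomes a no-op regardless of thread interleaving.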
[jira] [Commented] (TEZ-3777) Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter
[ https://issues.apache.org/jira/browse/TEZ-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065662#comment-16065662 ] Rajesh Balamohan commented on TEZ-3777: --- \cc [~gopalv] > Avoid buffer copies by passing RLE flag to TezMerger from PipelinedSorter > - > > Key: TEZ-3777 > URL: https://issues.apache.org/jira/browse/TEZ-3777 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3777.1.patch, TEZ-3777.2.patch > > > RLE information computed in PipelinedSorter can be passed on to > {{TezMerger.merge}} in {{PipelinedSorter::flush}}. Depending on the dataset, > this can save a lot of data copies. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3770) DAG-aware YARN task scheduler
[ https://issues.apache.org/jira/browse/TEZ-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065756#comment-16065756 ] Bikas Saha commented on TEZ-3770: - Just clarifying that the original scheduler was not made DAG-aware by design. It was an attempt to prevent leaky features where code changed across the scheduler and the DAG state machine, as happened in the MR code, where logic was spread all over. The DAG core logic and VertexManager user logic could determine the dependencies and priorities of tasks and the scheduler would allocate resources based on priority. So other schedulers could be easily written since they don't need to understand complex relationships. However not all of those design assumptions have been validated since we don't have many schedulers written :P > DAG-aware YARN task scheduler > - > > Key: TEZ-3770 > URL: https://issues.apache.org/jira/browse/TEZ-3770 > Project: Apache Tez > Issue Type: New Feature >Reporter: Jason Lowe >Assignee: Jason Lowe > Attachments: TEZ-3770.001.patch > > > There are cases where priority alone does not convey the relationship between > tasks, and this can cause problems when scheduling or preempting tasks. If > the YARN task scheduler was aware of the relationship between tasks then it > could make smarter decisions when trying to assign tasks to containers or > preempt running tasks to schedule pending tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled
[ https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-3769: -- Attachment: TEZ-3769.3.patch Uploading .3 with review comments addressed. Agreed that the unordered writer needs refactoring to reduce its complexity. > Unordered: Fix wrong stats being sent out in the last event, when final merge > is disabled > - > > Key: TEZ-3769 > URL: https://issues.apache.org/jira/browse/TEZ-3769 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan > Attachments: TEZ-3769.1.patch, TEZ-3769.2.patch, TEZ-3769.3.patch > > > When final merge is disabled (without pipelining), wrong stats were sent out > in the last event. > They were based on {{numRecordsPerPartition}}, which contains the overall > partition data. Ideally, they should be based on the spill result and its > buffers. > Also, {{finalSpill}} was unnecessarily sending events when no data was present > (i.e., when currentBuffer didn't have any data). This can be optimized to > reduce the number of events being sent across.
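The two points in the TEZ-3769 description (per-spill stats rather than overall stats, and no event for an empty buffer) can be illustrated with a small sketch. It is not the unordered writer from the patch; SpillSketch, SpillEvent, and the counters below are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical writer that tracks record counts for the current buffer only. */
class SpillSketch {
  /** Event carrying the stats of exactly one spill. */
  static final class SpillEvent {
    final int spillId;
    final long[] recordsPerPartition;

    SpillEvent(int spillId, long[] recordsPerPartition) {
      this.spillId = spillId;
      this.recordsPerPartition = recordsPerPartition;
    }
  }

  private final int numPartitions;
  private long[] currentBufferRecords; // counts for the buffer being filled, not the overall totals
  private final List<SpillEvent> sentEvents = new ArrayList<>();
  private int nextSpillId;

  SpillSketch(int numPartitions) {
    this.numPartitions = numPartitions;
    this.currentBufferRecords = new long[numPartitions];
  }

  void write(int partition) {
    currentBufferRecords[partition]++;
  }

  /** Sends an event only if the current buffer holds data, using that buffer's own counts. */
  void finalSpill() {
    long total = 0;
    for (long count : currentBufferRecords) {
      total += count;
    }
    if (total == 0) {
      return; // nothing buffered: skip the event entirely
    }
    sentEvents.add(new SpillEvent(nextSpillId++, currentBufferRecords.clone()));
    currentBufferRecords = new long[numPartitions];
  }

  List<SpillEvent> sentEvents() {
    return sentEvents;
  }
}
```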
[jira] [Assigned] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled
[ https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan reassigned TEZ-3769: - Assignee: Rajesh Balamohan > Unordered: Fix wrong stats being sent out in the last event, when final merge > is disabled > - > > Key: TEZ-3769 > URL: https://issues.apache.org/jira/browse/TEZ-3769 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3769.1.patch, TEZ-3769.2.patch, TEZ-3769.3.patch > > > When final merge is disabled (without pipelining), wrong stats were sent out > in the last event. > They were based on {{numRecordsPerPartition}}, which contains the overall > partition data. Ideally, they should be based on the spill result and its > buffers. > Also, {{finalSpill}} was unnecessarily sending events when no data was present > (i.e., when currentBuffer didn't have any data). This can be optimized to > reduce the number of events being sent across.
Success: TEZ-3767 PreCommit Build #2558
Jira: https://issues.apache.org/jira/browse/TEZ-3767 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2558/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 339.60 KB...] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 50:28 min [INFO] Finished at: 2017-06-28T00:41:16Z [INFO] Final Memory: 105M/1393M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874676/TEZ-3767.3.patch against master revision de72fbe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2558//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2558//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. c716719e597ea1632fb2bb33a695ace226951285 logged out == == Finished build. == == Archiving artifacts Compressed 3.50 MB of artifacts by 10.7% relative to #2555 [description-setter] Description set: TEZ-3767 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3767) Shuffle should not report error to AM during inputContext.killSelf()
[ https://issues.apache.org/jira/browse/TEZ-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065736#comment-16065736 ] TezQA commented on TEZ-3767: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874676/TEZ-3767.3.patch against master revision de72fbe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2558//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2558//console This message is automatically generated. > Shuffle should not report error to AM during inputContext.killSelf() > > > Key: TEZ-3767 > URL: https://issues.apache.org/jira/browse/TEZ-3767 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3767.1.patch, TEZ-3767.2.patch, TEZ-3767.2.patch, > TEZ-3767.3.patch > > > {{ShuffleScheduler::killSelf}} kills the current attempt when it encounters > certain errors. As a part of cleanup, it invokes {{close}} which internally > releases the resources. > If merge is happening in the middle, it could throw the following exception. > This is caught in {{RunShuffleCallable}} and reported to AM immediately. This > causes tasks to fail. 
> {noformat} > » Error: Error while running task ( failure ) : > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: > Error while doing final merge > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:320) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1211) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1265) > at java.util.AbstractCollection.toArray(AbstractCollection.java:141) > at java.util.ArrayList.addAll(ArrayList.java:577) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:636) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:316) > ... 6 more > {noformat} > When {{isShutDown}} is set to true, it would be good to avoid sending error > messages to AM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
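The behaviour requested in the TEZ-3767 description (do not report errors to the AM once {{isShutDown}} is set) can be sketched as below. This is only an illustration under stated assumptions: ShuffleRunSketch and its fields are hypothetical stand-ins, not the actual {{Shuffle}} or {{RunShuffleCallable}} code from the patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the shuffle runner; names are illustrative only. */
class ShuffleRunSketch {
  private final AtomicBoolean isShutDown = new AtomicBoolean(false);
  // Stands in for "error reported to the AM"; holds the first reported failure.
  private final AtomicReference<Throwable> reportedToAm = new AtomicReference<>();

  /** Called from the cleanup path, for example after the attempt has been killed. */
  void shutDown() {
    isShutDown.set(true);
  }

  Throwable reportedError() {
    return reportedToAm.get();
  }

  /** Runs the final merge step and decides whether a failure is worth reporting. */
  void run(Runnable finalMerge) {
    try {
      finalMerge.run();
    } catch (Throwable t) {
      if (isShutDown.get()) {
        // Failures during shutdown are a side effect of cleanup racing with the merge,
        // so they are dropped here instead of being reported.
        return;
      }
      reportedToAm.compareAndSet(null, t);
    }
  }
}
```

With this shape, a ConcurrentModificationException thrown while close is racing with the merge is swallowed after shutdown, while the same exception before shutdown is still surfaced.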
[jira] [Commented] (TEZ-3617) TestHistoryParser#testParserWithSuccessfulJob fails intermittently
[ https://issues.apache.org/jira/browse/TEZ-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065738#comment-16065738 ] Zhiyuan Yang commented on TEZ-3617: --- [~jeagles] It doesn't seem to be a port conflict issue. If port 8188 is already occupied, ATS fails to start and throws an exception. I ran 'nc -l 8188' before running this test and got the following error, which is different from the one attached to this jira. {noformat} org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: ApplicationHistoryServer failed to start. Final state is STOPPED at org.apache.hadoop.yarn.server.MiniYARNCluster$ApplicationHistoryServerWrapper.serviceStart(MiniYARNCluster.java:717) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120) at org.apache.tez.tests.MiniTezClusterWithTimeline.serviceStart(MiniTezClusterWithTimeline.java:202) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.tez.history.TestHistoryParser.setupTezCluster(TestHistoryParser.java:174) at org.apache.tez.history.TestHistoryParser.setupCluster(TestHistoryParser.java:133) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at
org.junit.runner.JUnitCore.run(JUnitCore.java:160) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234) at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144) Caused by: java.io.IOException: ApplicationHistoryServer failed to start. Final state is STOPPED at org.apache.hadoop.yarn.server.MiniYARNCluster$ApplicationHistoryServerWrapper.serviceStart(MiniYARNCluster.java:713) ... {noformat} > TestHistoryParser#testParserWithSuccessfulJob fails intermittently > -- > > Key: TEZ-3617 > URL: https://issues.apache.org/jira/browse/TEZ-3617 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 > Environment: Ubuntu 14.04 >Reporter: Sonia Garudi >Assignee: Jonathan Eagles > Labels: ppc64le, x86 > Attachments: org.apache.tez.history.TestHistoryParser-output.txt, > TEZ-3617.1.patch > > > The TestHistoryParser#testParserWithSuccessfulJob test fails intermittently > in tez-history-parser project. > Error message : > testParserWithSuccessfulJob(org.apache.tez.history.TestHistoryParser) Time > elapsed: 29.952 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.tez.history.TestHistoryParser.verifyJobSpecificInfo(TestHistoryParser.java:266) > at > org.apache.tez.history.TestHistoryParser.testParserWithSuccessfulJob(TestHistoryParser.java:212) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
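The manual check in the TEZ-3617 comment above (occupying port 8188 with 'nc -l 8188') has a rough Java counterpart: try to bind a listener and see whether it succeeds. The helper below is a sketch with a hypothetical class name, PortProbe; it is not part of Tez or the attached patch, and the result is only a snapshot, since another process can take the port right after the check.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical helper for checking whether a TCP port can currently be bound. */
class PortProbe {
  /** Returns true if a listener could be bound to the port at the time of the call. */
  static boolean isFree(int port) {
    try (ServerSocket socket = new ServerSocket(port)) {
      return true; // bind succeeded; the socket is released when the block exits
    } catch (IOException e) {
      return false; // typically a BindException when the port is already in use
    }
  }
}
```

A test setup could call PortProbe.isFree(8188) before starting the mini cluster to rule a port conflict in or out.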
[jira] [Commented] (TEZ-3767) Shuffle should not report error to AM during inputContext.killSelf()
[ https://issues.apache.org/jira/browse/TEZ-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065743#comment-16065743 ] Rajesh Balamohan commented on TEZ-3767: --- Thanks for the review [~sseth]. Test failure earlier was not related to this patch. Will commit it shortly. > Shuffle should not report error to AM during inputContext.killSelf() > > > Key: TEZ-3767 > URL: https://issues.apache.org/jira/browse/TEZ-3767 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3767.1.patch, TEZ-3767.2.patch, TEZ-3767.2.patch, > TEZ-3767.3.patch > > > {{ShuffleScheduler::killSelf}} kills the current attempt when it encounters > certain errors. As a part of cleanup, it invokes {{close}} which internally > releases the resources. > If merge is happening in the middle, it could throw the following exception. > This is caught in {{RunShuffleCallable}} and reported to AM immediately. This > causes tasks to fail. > {noformat} > » Error: Error while running task ( failure ) : > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: > Error while doing final merge > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:320) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1211) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1265) > at 
java.util.AbstractCollection.toArray(AbstractCollection.java:141) > at java.util.ArrayList.addAll(ArrayList.java:577) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:636) > at > org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:316) > ... 6 more > {noformat} > When {{isShutDown}} is set to true, it would be good to avoid sending error > messages to AM. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Success: TEZ-3769 PreCommit Build #2559
Jira: https://issues.apache.org/jira/browse/TEZ-3769 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2559/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 339.43 KB...] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 54:44 min [INFO] Finished at: 2017-06-28T03:01:53Z [INFO] Final Memory: 92M/1380M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874789/TEZ-3769.3.patch against master revision 020a7c8. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2559//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2559//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. aa7e80736c851b34307f9c3bfc6334bbe81f3bb1 logged out == == Finished build. == == Archiving artifacts Compressed 3.50 MB of artifacts by 10.7% relative to #2558 [description-setter] Description set: TEZ-3769 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled
[ https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065858#comment-16065858 ] TezQA commented on TEZ-3769: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12874789/TEZ-3769.3.patch against master revision 020a7c8. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2559//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2559//console This message is automatically generated. > Unordered: Fix wrong stats being sent out in the last event, when final merge > is disabled > - > > Key: TEZ-3769 > URL: https://issues.apache.org/jira/browse/TEZ-3769 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: TEZ-3769.1.patch, TEZ-3769.2.patch, TEZ-3769.3.patch > > > When final merge is disabled (without pipelining), wrong stats were sent out > in the last event. > They were based on {{numRecordsPerPartition}}, which contains the overall > partition data. Ideally, they should be based on the spill result and its > buffers. > Also, {{finalSpill}} was unnecessarily sending events when no data was present > (i.e., when currentBuffer didn't have any data). 
This can be optimized to > reduce the number of events being sent across.
[jira] [Commented] (TEZ-3718) Better handling of 'bad' nodes
[ https://issues.apache.org/jira/browse/TEZ-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065786#comment-16065786 ] Siddharth Seth commented on TEZ-3718: - I'm not sure why AMNodeImpl treats NodeUnhealthy and NodeBlacklisted differently from each other w.r.t. the config which determines whether tasks need to be restarted or not. I think both can be treated the same. [~jlowe] - you may have more context on this. The changes to Node related classes mostly look good to me. Instead of isUnhealthy in the event - this could be an enum (UNHEALTHY, BLACKLISTED). For AMContainer, not sure why the fail task config needs to be read. Will the following work? - When the event is received, annotate the container to say "On A Failed Node" (Already done) - Inform prior and current attempts of the node failure. - Don't change container state - allow a task action to change the state via a STOP_REQUEST depending on task level configs. OR If there are no running fragments on the container, change state to a COMPLETED state, so that new task allocations are not accepted. Do not accept new tasks since nodeFailure has been set. TaskAttempt - From a brief glance, the functionality looks good. Fail_Fast / decide whether to keep a task / cause it to be killed on a node failure. 
General - Don't read from a Configuration instance within each AMContainer / TaskAttemptImpl - there's example code on how to avoid this in TaskImpl/TaskAttemptImpl - Thought the configs would be the following: TEZ_AM_NODE_UNHEALTHY_RESCHEDULE_TASKS=false - Current, default=false TEZ_AM_NODE_UNHEALTHY_KILL_RUNNING=true - New, default=true (overrides TEZ_AM_NODE_UNHEALTHY_RESCHEDULE_TASKS) Third config in the patch looks good > Better handling of 'bad' nodes > -- > > Key: TEZ-3718 > URL: https://issues.apache.org/jira/browse/TEZ-3718 > Project: Apache Tez > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Zhiyuan Yang > Attachments: TEZ-3718.1.patch, TEZ-3718.2.patch > > > At the moment, the default behaviour in case of a node being marked bad is to > do nothing other than not schedule new tasks on this node. > The alternate, via config, is to retroactively kill every task which ran on > the node, which causes far too many unnecessary re-runs. > Proposing the following changes. > 1. KILL fragments which are currently in the RUNNING state (instead of > relying on a timeout which leads to the attempt being marked as FAILED after > the timeout interval). > 2. Keep track of these failed nodes, and use this as input to the failure > heuristics. Normally source tasks require multiple consumers to report > failure for them to be marked as bad. If a single consumer reports failure > against a source which ran on a bad node, consider it bad and re-schedule > immediately. (Otherwise failures can take a while to propagate, and jobs get > a lot slower). > [~jlowe] - think you've looked at this in the past. Any thoughts/suggestions? > What I'm seeing is retroactive failures taking a long time to apply, and > restart sources which ran on a bad node. Also running tasks being counted as > FAILURES instead of KILLS.
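Two of the review suggestions for TEZ-3718 (an enum in place of the isUnhealthy boolean, and reading the configs once instead of inside every AMContainer or TaskAttemptImpl) can be sketched as follows. Everything here is an assumption made for illustration: NodeFailureType, TaskAction, and NodeFailurePolicy are hypothetical names, and the precedence of the kill-running flag is one reading of "overrides" in the comment, not the behaviour of the patch.

```java
/** Why a node was marked bad; replaces a boolean such as isUnhealthy. */
enum NodeFailureType { UNHEALTHY, BLACKLISTED }

/** What to do with tasks that ran or are running on the bad node. */
enum TaskAction { KEEP, KILL_RUNNING, RESCHEDULE_ALL }

/** Hypothetical policy object built once from config values and then shared. */
final class NodeFailurePolicy {
  private final boolean rescheduleTasks; // stands in for TEZ_AM_NODE_UNHEALTHY_RESCHEDULE_TASKS
  private final boolean killRunning;     // stands in for the proposed TEZ_AM_NODE_UNHEALTHY_KILL_RUNNING

  NodeFailurePolicy(boolean rescheduleTasks, boolean killRunning) {
    this.rescheduleTasks = rescheduleTasks;
    this.killRunning = killRunning;
  }

  /** Both failure types are treated the same here, as suggested in the review. */
  TaskAction onNodeFailure(NodeFailureType type) {
    if (killRunning) {
      return TaskAction.KILL_RUNNING; // assumed to take precedence over rescheduling
    }
    if (rescheduleTasks) {
      return TaskAction.RESCHEDULE_ALL;
    }
    return TaskAction.KEEP;
  }
}
```

The enum keeps the event extensible if a third failure reason appears, and the policy object means containers and task attempts receive already-resolved booleans rather than a Configuration instance.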
[jira] [Comment Edited] (TEZ-3617) TestHistoryParser#testParserWithSuccessfulJob fails intermittently
[ https://issues.apache.org/jira/browse/TEZ-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065738#comment-16065738 ] Zhiyuan Yang edited comment on TEZ-3617 at 6/28/17 12:49 AM: - [~jeagles] It doesn't seem to be a port conflict issue. If port 8188 is already occupied, ATS fails to start and throws an exception. I ran 'nc -l 8188' before running this test and got the following error, which is different from the one attached to this jira. {noformat} org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: ApplicationHistoryServer failed to start. Final state is STOPPED at org.apache.hadoop.yarn.server.MiniYARNCluster$ApplicationHistoryServerWrapper.serviceStart(MiniYARNCluster.java:717) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120) at org.apache.tez.tests.MiniTezClusterWithTimeline.serviceStart(MiniTezClusterWithTimeline.java:202) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.tez.history.TestHistoryParser.setupTezCluster(TestHistoryParser.java:174) at org.apache.tez.history.TestHistoryParser.setupCluster(TestHistoryParser.java:133) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at
org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.junit.runner.JUnitCore.run(JUnitCore.java:160) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234) at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144) Caused by: java.io.IOException: ApplicationHistoryServer failed to start. Final state is STOPPED at org.apache.hadoop.yarn.server.MiniYARNCluster$ApplicationHistoryServerWrapper.serviceStart(MiniYARNCluster.java:713) ... {noformat} I suspect there are other fixes in 2.7.2 related to this. [~skanekar] Could you please try Jon's patch partially, i.e., bumping the Hadoop version only? This will help us identify the root cause. was (Author: aplusplus): [~jeagles] It doesn't seem to be a port conflict issue. If port 8188 is already occupied, ATS fails to start and throws an exception. I ran 'nc -l 8188' before running this test and got the following error, which is different from the one attached to this jira. {noformat} org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: ApplicationHistoryServer failed to start.
Final state is STOPPED at org.apache.hadoop.yarn.server.MiniYARNCluster$ApplicationHistoryServerWrapper.serviceStart(MiniYARNCluster.java:717) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120) at org.apache.tez.tests.MiniTezClusterWithTimeline.serviceStart(MiniTezClusterWithTimeline.java:202) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.tez.history.TestHistoryParser.setupTezCluster(TestHistoryParser.java:174) at org.apache.tez.history.TestHistoryParser.setupCluster(TestHistoryParser.java:133) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at