[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158501#comment-15158501 ]

TezQA commented on TEZ-3124:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision fd75e64.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1502//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1502//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1502//console

This message is automatically generated.
> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.2
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, TEZ-3124-4.patch, a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: attempt_1454993155302_0001_1_00_61_3 /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy], [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, AdditionalReservationFractionForIOs=0.03, finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:
Failed: TEZ-3124 PreCommit Build #1502
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1502/

###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 3769 lines...]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 56:55 min
[INFO] Finished at: 2016-02-23T07:59:04+00:00
[INFO] Final Memory: 63M/844M
[INFO]

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision fd75e64.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1502//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1502//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1502//console

This message is automatically generated.

==
==
    Adding comment to Jira.
==
==

Comment added.
b96c70826ed26a596dcd097105adec9dc36f13a1 logged out

==
==
    Finished build.
==
==

Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

###
## FAILED TESTS (if any)
##
All tests passed
[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158393#comment-15158393 ]

TezQA commented on TEZ-3124:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision fd75e64.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1501//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1501//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1501//console

This message is automatically generated.
Failed: TEZ-3124 PreCommit Build #1501
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1501/

###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 9284 lines...]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :atlas-docs

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision fd75e64.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1501//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1501//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1501//console

This message is automatically generated.

==
==
    Adding comment to Jira.
==
==

Comment added.
206be74a18c45eb3c4364c30336efe4e951e03a3 logged out

==
==
    Finished build.
==
==

Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

###
## FAILED TESTS (if any)
##
All tests passed
[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158300#comment-15158300 ]

TezQA commented on TEZ-3124:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision 44ca229.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1500//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1500//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1500//console

This message is automatically generated.
Failed: TEZ-3124 PreCommit Build #1500
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1500/

###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 3773 lines...]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 57:19 min
[INFO] Finished at: 2016-02-23T05:20:36+00:00
[INFO] Final Memory: 71M/1160M
[INFO]

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision 44ca229.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1500//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1500//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1500//console

This message is automatically generated.

==
==
    Adding comment to Jira.
==
==

Comment added.
086061b26e7fdbc3b9e30be33ee26c2944e450fe logged out

==
==
    Finished build.
==
==

Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

###
## FAILED TESTS (if any)
##
All tests passed
[jira] [Updated] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated TEZ-3124:

    Attachment: TEZ-3124-4.patch

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.2
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, TEZ-3124-4.patch, a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: attempt_1454993155302_0001_1_00_61_3 /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy], [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, AdditionalReservationFractionForIOs=0.03, finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:48:43,574 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: attempt_1454993155302_0001_1_00_61_3_10001
> 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 using: memoryMb=1646, keySerializerClass=class org.apache.hadoop.io.IntWritable, valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6, comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, serialization=org.apache.hadoop.io.serializer.WritableSerialization
> 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up PipelinedSorter for ireduce1: , UsingHashComparator=false
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly allocated block size=1725956096, index=0, Number of buffers=1, currentAllocatableMemory=0, currentBufferSize=1725956096, total=1725956096
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSort
[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158204#comment-15158204 ]

TezQA commented on TEZ-3124:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12788977/TEZ-3124-3.patch
  against master revision f38e23c.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:red}-1 core tests{color}. The patch failed these unit tests in :
                   org.apache.tez.dag.app.dag.impl.TestDAGImpl

                   The following test timeouts occurred in :
                   org.apache.tez.test.TestRecovery

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1499//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1499//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1499//console

This message is automatically generated.
Failed: TEZ-3124 PreCommit Build #1499
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1499/

###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 3555 lines...]
[ERROR] mvn -rf :tez-dag
[INFO] Build failures were ignored.

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12788977/TEZ-3124-3.patch
  against master revision f38e23c.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
    {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings).
    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:red}-1 core tests{color}. The patch failed these unit tests in :
                   org.apache.tez.dag.app.dag.impl.TestDAGImpl

                   The following test timeouts occurred in :
                   org.apache.tez.test.TestRecovery

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1499//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1499//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1499//console

This message is automatically generated.

==
==
    Adding comment to Jira.
==
==

Comment added.
fbf51670fcdd9af809a490d33063b1c030862880 logged out

==
==
    Finished build.
==
==

Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any

###
## FAILED TESTS (if any)
##
1 tests failed.
FAILED: org.apache.tez.dag.app.dag.impl.TestDAGImpl.testCounterLimits

Error Message:
expected: but was:

Stack Trace:
java.lang.AssertionError: expected: but was:
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:743)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at org.junit.Assert.assertEquals(Assert.java:144)
	at org.apache.tez.dag.app.dag.impl.TestDAGImpl.testCounterLimits(TestDAGImpl.java:2290)
[jira] [Commented] (TEZ-3126) Log reason for not reducing parallelism
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158125#comment-15158125 ]

Bikas Saha commented on TEZ-3126:

lgtm

> Log reason for not reducing parallelism
> ---
>
> Key: TEZ-3126
> URL: https://issues.apache.org/jira/browse/TEZ-3126
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Critical
> Attachments: TEZ-3126.1.patch, TEZ-3126.2.patch
>
>
> For example, when reducing parallelism from 36 to 22. The basePartitionRange
> will be 1 and will not re-configure the vertex.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
> int desiredTaskParallelism =
>     (int)(
>         (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
>         desiredTaskInputDataSize);
> if(desiredTaskParallelism < minTaskParallelism) {
>   desiredTaskParallelism = minTaskParallelism;
> }
>
> if(desiredTaskParallelism >= currentParallelism) {
>   return true;
> }
>
> // most shufflers will be assigned this range
> basePartitionRange = currentParallelism/desiredTaskParallelism;
>
> if (basePartitionRange <= 1) {
>   // nothing to do if range is equal 1 partition. shuffler does it by default
>   return true;
> }
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
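The 36-to-22 case in the description comes down to integer division: 36/22 truncates to 1, so the `basePartitionRange <= 1` branch returns early and the vertex keeps its parallelism. The following is a minimal sketch of that arithmetic only, not the ShuffleVertexManager code or the attached patch. The class `PartitionRangeSketch`, the helper `explain`, and the message wording are hypothetical and only illustrate the kind of reason the issue asks to have logged.

```java
public class PartitionRangeSketch {

    // Same integer division as the snippet quoted in the issue description.
    public static int basePartitionRange(int currentParallelism, int desiredTaskParallelism) {
        return currentParallelism / desiredTaskParallelism;
    }

    // Hypothetical helper: returns a human-readable reason for the decision.
    public static String explain(int currentParallelism, int desiredTaskParallelism) {
        if (desiredTaskParallelism < 1) {
            throw new IllegalArgumentException("desiredTaskParallelism must be >= 1");
        }
        if (desiredTaskParallelism >= currentParallelism) {
            return "not reducing: desired " + desiredTaskParallelism
                + " >= current " + currentParallelism;
        }
        int range = basePartitionRange(currentParallelism, desiredTaskParallelism);
        if (range <= 1) {
            return "not reducing: " + currentParallelism + "/" + desiredTaskParallelism
                + " truncates to basePartitionRange " + range;
        }
        return "reducing: basePartitionRange " + range;
    }

    public static void main(String[] args) {
        System.out.println(explain(36, 22)); // range is 1, so no re-configuration
        System.out.println(explain(36, 12)); // range is 3, so reduction can proceed
    }
}
```

Under this arithmetic, any desired parallelism above half the current value gives a range of 1, which is why a request such as 36 to 22 is skipped without an obvious reason unless one is logged.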
[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread
[ https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158037#comment-15158037 ] Hitesh Shah commented on TEZ-3128: -- bq. Could you help me to clarify where to fix? As I understand it, the ContainerLauncher seems to be the one. Let's wait for [~sseth] or [~rajesh.balamohan] to supply additional logs to pinpoint the problem. > Avoid stopping containers on the AM shutdown thread > --- > > Key: TEZ-3128 > URL: https://issues.apache.org/jira/browse/TEZ-3128 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.0-alpha >Reporter: Siddharth Seth >Assignee: Tsuyoshi Ozawa > Labels: newbie > Attachments: TEZ-3128.001.patch > > > During an AM shutdown, the TaskCommunicator is also shut down and it tries to > stop containers in the shutdown thread itself. This can cause the AM shutdown > to block if NMs are not available. > This likely affects 0.7 as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
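The concern in the issue description, a shutdown thread that blocks on stop calls when NMs are unreachable, can be illustrated with a generic pattern: hand the potentially blocking calls to a separate daemon thread and wait for them only up to a bound. This is a sketch under stated assumptions, not Tez code and not the attached patch. `ShutdownOffloadSketch`, `ContainerStopper`, and `stopContainers` are hypothetical names, and the thread above has not yet settled on which component actually needs the change.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownOffloadSketch {

    /** Hypothetical stand-in for a stop call that may block when an NM is unreachable. */
    public interface ContainerStopper {
        void stop(String containerId) throws Exception;
    }

    /**
     * Runs the stop calls on a daemon thread and waits at most timeoutMillis.
     * Returns true if every stop call finished within the bound.
     */
    public static boolean stopContainers(List<String> containerIds,
                                         ContainerStopper stopper,
                                         long timeoutMillis) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor(runnable -> {
            Thread t = new Thread(runnable, "container-stopper");
            t.setDaemon(true); // a stuck stop call must not keep the JVM alive
            return t;
        });
        for (String id : containerIds) {
            pool.execute(() -> {
                try {
                    stopper.stop(id);
                } catch (Exception e) {
                    // Best effort: a failed or interrupted stop is ignored in this sketch.
                }
            });
        }
        pool.shutdown(); // accept no new work, let queued stops run
        boolean finished = pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        if (!finished) {
            pool.shutdownNow(); // interrupt the stuck call and drop the remaining ones
        }
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> ids = Arrays.asList("container_1", "container_2");
        boolean quick = stopContainers(ids, id -> { }, 1000);
        boolean stuck = stopContainers(ids, id -> Thread.sleep(60_000), 100);
        System.out.println("responsive stopper finished in time: " + quick);
        System.out.println("blocked stopper finished in time: " + stuck);
    }
}
```

The trade-off in this pattern is that containers whose stop call did not complete within the bound are left for the cluster to clean up, in exchange for a shutdown path that cannot hang indefinitely.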
[jira] [Commented] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner
[ https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158031#comment-15158031 ] TezQA commented on TEZ-3131: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12789087/TEZ-3131.3.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1498//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1498//console This message is automatically generated. > Support a way to override test_root_dir for FaultToleranceTestRunner > > > Key: TEZ-3131 > URL: https://issues.apache.org/jira/browse/TEZ-3131 > Project: Apache Tez > Issue Type: Bug >Reporter: Hitesh Shah >Assignee: Hitesh Shah >Priority: Minor > Attachments: TEZ-3131.1.patch, TEZ-3131.2.patch, TEZ-3131.3.patch > > > The path is hardcoded. For regression testing, it will be useful if it can be > overridden via command-line if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
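The request in the description, a hardcoded path that can be overridden from the command line when needed, can be sketched as a lookup that falls back to a default. This is only an illustration of the idea and does not reflect the attached patches. The class `TestRootDirSketch`, the option name `--test-root-dir`, and the default path are all hypothetical, since the thread shows neither the real hardcoded value nor how FaultToleranceTestRunner parses its arguments.

```java
public class TestRootDirSketch {

    // Hypothetical default; the real hardcoded path is not shown in this thread.
    static final String DEFAULT_TEST_ROOT_DIR = "/tmp/fault-tolerance-tests";

    // Hypothetical option name, chosen for illustration only.
    static final String OPTION = "--test-root-dir";

    /** Returns the value following the option if present, else the default. */
    public static String resolveTestRootDir(String[] args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if (OPTION.equals(args[i])) {
                return args[i + 1];
            }
        }
        return DEFAULT_TEST_ROOT_DIR;
    }

    public static void main(String[] args) {
        System.out.println("test root dir: " + resolveTestRootDir(args));
    }
}
```

Keeping the default in place means existing invocations behave as before, while a regression run can point the runner at a different directory.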
Failed: TEZ-3131 PreCommit Build #1498
Jira: https://issues.apache.org/jira/browse/TEZ-3131 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1498/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 3610 lines...] [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-tests [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12789087/TEZ-3131.3.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1498//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1498//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 6cf4febb8576997a961b23136d7815620518bd06 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 6 tests failed. FAILED: org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120) at org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit(TestFaultTolerance.java:261) FAILED: org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120) at org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices(TestFaultTolerance.java:703) FAILED: org.apache.tez.test.TestFaultTolerance.testMultipleInputFailureWithoutExit Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. 
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolerance.r
[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158026#comment-15158026 ] Hitesh Shah commented on TEZ-3124: -- bq. The failed tests are TestFaultTolerance and TestDAGImpl.testCounterLimits which are not related. The following test timeouts occurred in : org.apache.tez.test.TestRecovery > Running task hangs due to missing event to initialize input in recovery > --- > > Key: TEZ-3124 > URL: https://issues.apache.org/jira/browse/TEZ-3124 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.2 >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Labels: Recovery > Fix For: 0.8.3 > > Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, > a.log > > > {noformat} > 2016-02-09 04:48:42 Starting to run new task attempt: > attempt_1454993155302_0001_1_00_61_3 > /attempt_1454993155302_0001_1_00_61 > 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] > |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, > numPhysicalInputs=1 > 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] > |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished > 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: > InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy], > > [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput] > 2016-02-09 04:48:43,559 [INFO] [TezChild] > 
|resources.WeightedScalingMemoryDistributor|: > ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1] > 2016-02-09 04:48:43,563 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, > AdditionalReservationFractionForIOs=0.03, > finalReserveFractionUsed=0.32996 > 2016-02-09 04:48:43,564 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: > 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: > 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, > TotalRequested/TotalJVMHeap:0.70 > 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: > Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], > [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871] > 2016-02-09 04:48:43,564 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs > 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] > |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput > 2016-02-09 04:48:43,572 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by > the framework. 
Subsequent instances will not be auto-started > 2016-02-09 04:48:43,573 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1 > 2016-02-09 04:48:43,574 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start > 2016-02-09 04:48:43,574 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete > 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running > task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3 > 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: > attempt_1454993155302_0001_1_00_61_3_10001 > 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 > using: memoryMb=1646, keySerializerClass=class > org.apache.hadoop.io.IntWritable, > valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6, > comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, > partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, > serialization=org.apache.hadoop.io.serializer.WritableSerialization > 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up > PipelinedSorter for ireduce1: , UsingHashComparator=false > 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly > allocated block s
[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158021#comment-15158021 ] Jeff Zhang commented on TEZ-3124: - The failed tests are TestFaultTolerance and TestDAGImpl.testCounterLimits which are not related. > Running task hangs due to missing event to initialize input in recovery > --- > > Key: TEZ-3124 > URL: https://issues.apache.org/jira/browse/TEZ-3124 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.2 >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Labels: Recovery > Fix For: 0.8.3 > > Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, > a.log > > > {noformat} > 2016-02-09 04:48:42 Starting to run new task attempt: > attempt_1454993155302_0001_1_00_61_3 > /attempt_1454993155302_0001_1_00_61 > 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] > |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, > numPhysicalInputs=1 > 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] > |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished > 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: > InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy], > > [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput] > 2016-02-09 04:48:43,559 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: > 
ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1] > 2016-02-09 04:48:43,563 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, > AdditionalReservationFractionForIOs=0.03, > finalReserveFractionUsed=0.32996 > 2016-02-09 04:48:43,564 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: > 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: > 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, > TotalRequested/TotalJVMHeap:0.70 > 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: > Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], > [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871] > 2016-02-09 04:48:43,564 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs > 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] > |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput > 2016-02-09 04:48:43,572 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by > the framework. 
Subsequent instances will not be auto-started > 2016-02-09 04:48:43,573 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1 > 2016-02-09 04:48:43,574 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start > 2016-02-09 04:48:43,574 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete > 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running > task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3 > 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: > attempt_1454993155302_0001_1_00_61_3_10001 > 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 > using: memoryMb=1646, keySerializerClass=class > org.apache.hadoop.io.IntWritable, > valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6, > comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, > partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, > serialization=org.apache.hadoop.io.serializer.WritableSerialization > 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up > PipelinedSorter for ireduce1: , UsingHashComparator=false > 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly > allocated block size=1725956096, index=0, Number of buffers=1, > currentAllocatableMemory=0, curr
[jira] [Commented] (TEZ-3067) Links to tez configs documentation should be bubbled up to top-level release page
[ https://issues.apache.org/jira/browse/TEZ-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158007#comment-15158007 ] Tsuyoshi Ozawa commented on TEZ-3067: - Thanks [~hitesh] for committing and reviewing. > Links to tez configs documentation should be bubbled up to top-level release > page > -- > > Key: TEZ-3067 > URL: https://issues.apache.org/jira/browse/TEZ-3067 > Project: Apache Tez > Issue Type: Bug >Reporter: Hitesh Shah >Assignee: Tsuyoshi Ozawa > Labels: newbie > Fix For: 0.8.3 > > Attachments: TEZ-3067.001.patch, TEZ-3067.002.patch > > > http://tez.apache.org/releases/0.8.2/tez-api-javadocs/configs/TezConfiguration.html > is hidden away in the api docs. It would be useful to update the release > template to add direct links to the config docs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread
[ https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158004#comment-15158004 ] Tsuyoshi Ozawa commented on TEZ-3128: - [~hitesh] [~sseth] Thank you for pointing that out. {quote} dagappmaster shuts down yarn scheduler service but it does not kill containers on shutdown - just releases them via amrmclient. TezTaskCommunicatorImpl on stop() does nothing to kill containers. {quote} Right, that's why I thought the place I fixed was what you mentioned. Could you help me to clarify where to fix? > Avoid stopping containers on the AM shutdown thread > --- > > Key: TEZ-3128 > URL: https://issues.apache.org/jira/browse/TEZ-3128 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.0-alpha >Reporter: Siddharth Seth >Assignee: Tsuyoshi Ozawa > Labels: newbie > Attachments: TEZ-3128.001.patch > > > During an AM shutdown, the TaskCommunicator is also shut down and it tries to > stop containers in the shutdown thread itself. This can cause the AM shutdown > to block if NMs are not available. > This likely affects 0.7 as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-3126) Log reason for not reducing parallelism
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157878#comment-15157878 ] TezQA commented on TEZ-3126: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12789063/TEZ-3126.2.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1497//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1497//console This message is automatically generated. > Log reason for not reducing parallelism > --- > > Key: TEZ-3126 > URL: https://issues.apache.org/jira/browse/TEZ-3126 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles >Priority: Critical > Attachments: TEZ-3126.1.patch, TEZ-3126.2.patch > > > For example, when reducing parallelism from 36 to 22. The basePartitionRange > will be 1 and will not re-configure the vertex. 
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
> int desiredTaskParallelism =
>     (int)(
>         (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
>         desiredTaskInputDataSize);
> if(desiredTaskParallelism < minTaskParallelism) {
>   desiredTaskParallelism = minTaskParallelism;
> }
>
> if(desiredTaskParallelism >= currentParallelism) {
>   return true;
> }
>
> // most shufflers will be assigned this range
> basePartitionRange = currentParallelism/desiredTaskParallelism;
>
> if (basePartitionRange <= 1) {
>   // nothing to do if range is equal 1 partition. shuffler does it by default
>   return true;
> }
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner
[ https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-3131: - Attachment: TEZ-3131.3.patch > Support a way to override test_root_dir for FaultToleranceTestRunner > > > Key: TEZ-3131 > URL: https://issues.apache.org/jira/browse/TEZ-3131 > Project: Apache Tez > Issue Type: Bug >Reporter: Hitesh Shah >Assignee: Hitesh Shah >Priority: Minor > Attachments: TEZ-3131.1.patch, TEZ-3131.2.patch, TEZ-3131.3.patch > > > The path is hardcoded. For regression testing, it will be useful if it can be > overridden via command-line if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-3126 PreCommit Build #1497
Jira: https://issues.apache.org/jira/browse/TEZ-3126 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1497/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 3607 lines...] [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-tests [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12789063/TEZ-3126.2.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1497//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1497//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 6f1eba9ae9e095834e1202ea1e7cdbbdf309e0b3 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 6 tests failed. 
FAILED: org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120) at org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit(TestFaultTolerance.java:261) FAILED: org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120) at org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices(TestFaultTolerance.java:703) FAILED: org.apache.tez.test.TestFaultTolerance.testMultipleInputFailureWithoutExit Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolera
[jira] [Updated] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner
[ https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-3131: - Attachment: TEZ-3131.2.patch > Support a way to override test_root_dir for FaultToleranceTestRunner > > > Key: TEZ-3131 > URL: https://issues.apache.org/jira/browse/TEZ-3131 > Project: Apache Tez > Issue Type: Bug >Reporter: Hitesh Shah >Assignee: Hitesh Shah >Priority: Minor > Attachments: TEZ-3131.1.patch, TEZ-3131.2.patch > > > The path is hardcoded. For regression testing, it will be useful if it can be > overridden via command-line if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles reassigned TEZ-3115: Assignee: Jonathan Eagles > Shuffle string handling adds significant memory overhead > > > Key: TEZ-3115 > URL: https://issues.apache.org/jira/browse/TEZ-3115 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.7.0 >Reporter: Jason Lowe >Assignee: Jonathan Eagles > Attachments: TEZ-3115.1.patch > > > While investigating the OOM heap dump from TEZ-3114 I noticed that the > ShuffleManager and other shuffle-related objects were holding onto many > strings that added up to over a hundred megabytes of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3126) Log reason for not reducing parallelism
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3126:

Attachment: TEZ-3126.2.patch

[~bikassaha], [~rajesh.balamohan] let me know if the updated log messages are clear enough.

> Log reason for not reducing parallelism
> ---
>
> Key: TEZ-3126
> URL: https://issues.apache.org/jira/browse/TEZ-3126
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Critical
> Attachments: TEZ-3126.1.patch, TEZ-3126.2.patch
>
>
> For example, when reducing parallelism from 36 to 22. The basePartitionRange
> will be 1 and will not re-configure the vertex.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
> int desiredTaskParallelism =
>     (int)(
>         (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
>         desiredTaskInputDataSize);
> if(desiredTaskParallelism < minTaskParallelism) {
>   desiredTaskParallelism = minTaskParallelism;
> }
>
> if(desiredTaskParallelism >= currentParallelism) {
>   return true;
> }
>
> // most shufflers will be assigned this range
> basePartitionRange = currentParallelism/desiredTaskParallelism;
>
> if (basePartitionRange <= 1) {
>   // nothing to do if range is equal 1 partition. shuffler does it by default
>   return true;
> }
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner
[ https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157572#comment-15157572 ] Bikas Saha commented on TEZ-3131: - Sure. Please go ahead. +1. > Support a way to override test_root_dir for FaultToleranceTestRunner > > > Key: TEZ-3131 > URL: https://issues.apache.org/jira/browse/TEZ-3131 > Project: Apache Tez > Issue Type: Bug >Reporter: Hitesh Shah >Assignee: Hitesh Shah >Priority: Minor > Attachments: TEZ-3131.1.patch > > > The path is hardcoded. For regression testing, it will be useful if it can be > overridden via command-line if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157558#comment-15157558 ] Hitesh Shah commented on TEZ-3124: -- TestRecovery seems to have failed with the new patch > Running task hangs due to missing event to initialize input in recovery > --- > > Key: TEZ-3124 > URL: https://issues.apache.org/jira/browse/TEZ-3124 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.2 >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Labels: Recovery > Fix For: 0.8.3 > > Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, > a.log > > > {noformat} > 2016-02-09 04:48:42 Starting to run new task attempt: > attempt_1454993155302_0001_1_00_61_3 > /attempt_1454993155302_0001_1_00_61 > 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] > |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, > numPhysicalInputs=1 > 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] > |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished > 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: > InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy], > > [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput] > 2016-02-09 04:48:43,559 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: > 
ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1] > 2016-02-09 04:48:43,563 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, > AdditionalReservationFractionForIOs=0.03, > finalReserveFractionUsed=0.32996 > 2016-02-09 04:48:43,564 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: > 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: > 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, > TotalRequested/TotalJVMHeap:0.70 > 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: > Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], > [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871] > 2016-02-09 04:48:43,564 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs > 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] > |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput > 2016-02-09 04:48:43,572 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by > the framework. 
Subsequent instances will not be auto-started > 2016-02-09 04:48:43,573 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1 > 2016-02-09 04:48:43,574 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start > 2016-02-09 04:48:43,574 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete > 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running > task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3 > 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: > attempt_1454993155302_0001_1_00_61_3_10001 > 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 > using: memoryMb=1646, keySerializerClass=class > org.apache.hadoop.io.IntWritable, > valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6, > comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, > partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, > serialization=org.apache.hadoop.io.serializer.WritableSerialization > 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up > PipelinedSorter for ireduce1: , UsingHashComparator=false > 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly > allocated block size=1725956096, index=0, Number of buffers=1, > currentAllocatableMemory=0, currentBufferSize=1725956096, total=1725956096
[jira] [Updated] (TEZ-3119) Add missing AM translations in DeprecatedKeys#populateMRToDagParamMap
[ https://issues.apache.org/jira/browse/TEZ-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3119: - Attachment: TEZ-3119.001.patch Attaching initial patch. > Add missing AM translations in DeprecatedKeys#populateMRToDagParamMap > - > > Key: TEZ-3119 > URL: https://issues.apache.org/jira/browse/TEZ-3119 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.7.2 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3119.001.patch > > > MRToDagParamMap is missing some of the relevant configs. Some of them include: > {code} > TEZ_CREDENTIALS_PATH > TEZ_AM_LOG_LEVEL > TEZ_AM_MAX_APP_ATTEMPTS > TEZ_AM_RESOURCE_MEMORY_MB > TEZ_AM_RESOURCE_CPU_VCORES > TEZ_AM_CLIENT_THREAD_COUNT > TEZ_AM_CLIENT_AM_PORT_RANGE > TEZ_AM_RM_HEARTBEAT_INTERVAL_MS_MAX > TASK_HEARTBEAT_TIMEOUT_MS > TEZ_TASK_AM_HEARTBEAT_INTERVAL_MS > TEZ_AM_APPLICATION_PRIORITY > TEZ_AM_VIEW_ACLS > TEZ_AM_MODIFY_ACLS > TEZ_CANCEL_DELEGATION_TOKENS_ON_COMPLETION > TEZ_AM_CONTAINERLAUNCHER_THREAD_COUNT_LIMIT > TEZ_AM_CONTAINERLAUNCHER_THREAD_COUNT_LIMIT > TEZ_AM_LEGACY_SPECULATIVE_SLOWTASK_THRESHOLD > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
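For readers unfamiliar with what such a translation table does, here is a minimal Java sketch of the idea. It does not reproduce DeprecatedKeys#populateMRToDagParamMap: the class name `MrToTezKeys` is hypothetical, a plain `Map` stands in for the Hadoop configuration object, and the MR/Tez key strings are assumptions that should be checked against MRJobConfig and TezConfiguration before being relied on.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an MR-to-Tez key translation table; it does not
// reproduce DeprecatedKeys, and the key strings are assumptions to verify.
final class MrToTezKeys {
    private static final Map<String, String> MR_TO_TEZ = new HashMap<>();

    static {
        MR_TO_TEZ.put("yarn.app.mapreduce.am.log.level", "tez.am.log.level");
        MR_TO_TEZ.put("mapreduce.am.max-attempts", "tez.am.max.app.attempts");
        MR_TO_TEZ.put("yarn.app.mapreduce.am.resource.mb", "tez.am.resource.memory.mb");
        MR_TO_TEZ.put("yarn.app.mapreduce.am.resource.cpu-vcores", "tez.am.resource.cpu.vcores");
    }

    // Copies each translatable MR setting to its Tez key unless the Tez key
    // is already set explicitly; untranslated keys pass through unchanged.
    static Map<String, String> translate(Map<String, String> mrConf) {
        Map<String, String> out = new HashMap<>(mrConf);
        for (Map.Entry<String, String> e : MR_TO_TEZ.entrySet()) {
            String value = mrConf.get(e.getKey());
            if (value != null) {
                out.putIfAbsent(e.getValue(), value);
            }
        }
        return out;
    }
}
```

In this sketch an explicitly set Tez key wins over a translated MR value; that precedence is a design choice of the example, not something stated in the issue.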
[jira] [Commented] (TEZ-3129) Tez task and task attempt UI needs application fails with NotFoundException
[ https://issues.apache.org/jira/browse/TEZ-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157509#comment-15157509 ] TezQA commented on TEZ-3129: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12789030/TEZ-3129.1.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1496//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1496//console This message is automatically generated. > Tez task and task attempt UI needs application fails with NotFoundException > --- > > Key: TEZ-3129 > URL: https://issues.apache.org/jira/browse/TEZ-3129 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-3129.1.patch > >
Failed: TEZ-3129 PreCommit Build #1496
Jira: https://issues.apache.org/jira/browse/TEZ-3129 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1496/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 3769 lines...] [INFO] [INFO] Total time: 01:01 h [INFO] Finished at: 2016-02-22T19:12:36+00:00 [INFO] Final Memory: 64M/1032M [INFO] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12789030/TEZ-3129.1.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1496//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1496//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 37917eb3ef711b3d4bad3a0e3b1ed00413e64f90 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-2863) Container, node, and logs not available in UI for tasks that fail to launch
[ https://issues.apache.org/jira/browse/TEZ-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157505#comment-15157505 ] Hitesh Shah commented on TEZ-2863: -- +1 for the most part. \cc [~zjffdu] in case he has any comments on the recovery aspect where the container info is not being written to recovery and whether it needs to be. For the UI, I think it might be better to leave the UI unchanged. The UI can probably remain dumb about trying to figure out whether to redirect to syslog or stderr if the syslog_attempt* file does not exist (the main reasons are that an additional HTTP call to verify existence will be needed, and I am not sure if YARN supports that cleanly; secondly, the syslog/stderr choice may not be trivial to solve). > Container, node, and logs not available in UI for tasks that fail to launch > --- > > Key: TEZ-2863 > URL: https://issues.apache.org/jira/browse/TEZ-2863 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-2863.1.patch, TEZ-2863.2-branch-0.7.patch, > TEZ-2863.2.patch, TEZ-2863.3-branch-0.7.patch, TEZ-2863.3.patch > > > While running a sample tez job > {noformat} > tez-examples-*.jar orderedwordcount -Dtez.task.resource.memory.mb=1 > -Dtez.task.launch.cmd-opts="-Xmx1m" input output > {noformat} > It was noticed that the Tez UI task attempt > http://timelineserverhost:port/ws/v1/timeline/TEZ_TASK_ATTEMPT_ID/attempt_id > was missing the TEZ_ATTEMPT_STARTED event > {noformat} > 2015-10-01 10:03:55,344 [INFO] [Dispatcher thread {Central}] > |history.HistoryEventHandler|: > [HISTORY][DAG:dag_1443711816411_0001_1][Event:TASK_STARTED]: > vertexName=Tokenizer, taskId=task_1443711816411_0001_1_00_00, > scheduledTime=1443711835342, launchTime=1443711835342 > 2015-10-01 10:03:55,346 [INFO] [Dispatcher thread {Central}] > |util.RackResolver|: Resolved localhost to /default-rack > 2015-10-01 10:03:55,356 [INFO] 
[TaskSchedulerEventHandlerThread] > |util.RackResolver|: Resolved localhost to /default-rack > 2015-10-01 10:03:55,364 [INFO] [TaskSchedulerEventHandlerThread] > |rm.YarnTaskSchedulerService|: Allocation request for task: > attempt_1443711816411_0001_1_00_00_0 with request: Capability[ vCores:1>]Priority[2] host: localhost rack: null > 2015-10-01 10:03:56,639 [INFO] [AMRM Heartbeater thread] > |impl.AMRMClientImpl|: Received new token for : localhost:57381 > 2015-10-01 10:03:56,646 [INFO] [AMRM Callback Handler Thread] > |util.RackResolver|: Resolved localhost to /default-rack > 2015-10-01 10:03:56,648 [INFO] [DelayedContainerManager] > |rm.YarnTaskSchedulerService|: Assigning container to task: > containerId=container_1443711816411_0001_01_02, > task=attempt_1443711816411_0001_1_00_00_0, containerHost=localhost:57381, > containerPriority= 2, containerResources=, > localityMatchType=NodeLocal, matchedLocation=localhost, > honorLocalityFlags=true, reusedContainer=false, delayedContainers=0 > 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: > Resolved localhost to /default-rack > 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: > Resolved localhost to /default-rack > 2015-10-01 10:03:56,686 [INFO] [TaskSchedulerAppCaller #0] > |node.AMNodeTracker|: Adding new node: localhost:57381 > 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] > |launcher.ContainerLauncherImpl|: Launching > container_1443711816411_0001_01_02 > 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] > |impl.ContainerManagementProtocolProxy|: Opening proxy : localhost:57381 > 2015-10-01 10:03:56,741 [INFO] [ContainerLauncher #0] > |history.HistoryEventHandler|: [HISTORY][DAG:N/A][Event:CONTAINER_LAUNCHED]: > containerId=container_1443711816411_0001_01_02, launchTime=1443711836741 > 2015-10-01 10:03:57,647 [INFO] [AMRM Callback Handler Thread] > |rm.YarnTaskSchedulerService|: Allocated container > 
completed:container_1443711816411_0001_01_02 last allocated to task: > attempt_1443711816411_0001_1_00_00_0 > 2015-10-01 10:03:57,648 [INFO] [Dispatcher thread {Central}] > |container.AMContainerImpl|: Container container_1443711816411_0001_01_02 > exited with diagnostics set to Container failed, exitCode=1. Exception from > container-launch. > Container id: container_1443711816411_0001_01_02 > Exit code: 1 > Stack trace: ExitCodeException exitCode=1: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) > at org.apache.hadoop.util.Shell.run(Shell.java:455) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerE
[jira] [Commented] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner
[ https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157490#comment-15157490 ] Hitesh Shah commented on TEZ-3131: -- bq. The string value of the config name is atypical of config names - e.g. tez.test.root-dir. Perhaps something similar could be made available in a common place for all tests with similar logic. Most other tests actually just use the staging dir as is instead of overriding it (unless it is being overridden for an instance-specific sub-dir of the base staging dir). Also, TEST_ROOT_DIR is commonly used to denote ./target/ for unit tests and is never really used for a path on DFS when running an end-to-end job test. I went with TEST_ROOT_DIR as that is similar to the approach used in TestOrderedWordCount for override params, mainly for testing purposes. I can change the property name to something such as "tez.test-fault-tolerance.staging-dir" (default value being ./tmp). Would that work? > Support a way to override test_root_dir for FaultToleranceTestRunner > > > Key: TEZ-3131 > URL: https://issues.apache.org/jira/browse/TEZ-3131 > Project: Apache Tez > Issue Type: Bug >Reporter: Hitesh Shah >Assignee: Hitesh Shah >Priority: Minor > Attachments: TEZ-3131.1.patch > > > The path is hardcoded. For regression testing, it will be useful if it can be > overridden via command-line if needed.
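The override proposed in this comment could look roughly like the following. This is only a sketch under assumptions: `java.util.Properties` stands in for the Hadoop/Tez configuration object, `StagingDirResolver` is a hypothetical helper name, and the property name and `./tmp` default are the ones floated in the comment rather than anything confirmed as committed.

```java
import java.util.Properties;

// Hypothetical helper; Properties stands in for the real configuration class.
final class StagingDirResolver {
    // Property name and default as floated in the comment, not confirmed.
    static final String STAGING_DIR_KEY = "tez.test-fault-tolerance.staging-dir";
    static final String STAGING_DIR_DEFAULT = "./tmp";

    // Returns the override when one is set and non-blank, else the default.
    static String resolve(Properties conf) {
        String value = conf.getProperty(STAGING_DIR_KEY);
        if (value == null || value.trim().isEmpty()) {
            return STAGING_DIR_DEFAULT;
        }
        return value.trim();
    }
}
```

Treating a blank value as "not set" keeps an empty command-line override from silently pointing the test at an empty path.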
[jira] [Updated] (TEZ-3126) Log reason for not reducing parallelism
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-3126: - Summary: Log reason for not reducing parallelism (was: Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.) > Log reason for not reducing parallelism > --- > > Key: TEZ-3126 > URL: https://issues.apache.org/jira/browse/TEZ-3126 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles >Priority: Critical > Attachments: TEZ-3126.1.patch > > > For example, when reducing parallelism from 36 to 22, the basePartitionRange > will be 1 (integer division of 36 by 22) and the vertex will not be re-configured. > {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey} > int desiredTaskParallelism = > (int)( > (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/ > desiredTaskInputDataSize); > if(desiredTaskParallelism < minTaskParallelism) { > desiredTaskParallelism = minTaskParallelism; > } > > if(desiredTaskParallelism >= currentParallelism) { > return true; > } > > // most shufflers will be assigned this range > basePartitionRange = currentParallelism/desiredTaskParallelism; > > if (basePartitionRange <= 1) { > // nothing to do if range is equal 1 partition. shuffler does it by > default > return true; > } > {code}
[jira] [Commented] (TEZ-3126) Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157440#comment-15157440 ] Jonathan Eagles commented on TEZ-3126: -- I'll use this ticket to log the reason parallelism was not reduced. As to grouping, a better distribution may help. Empty partitions could be an interesting case since they have 0 output size. > Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than > half. > - > > Key: TEZ-3126 > URL: https://issues.apache.org/jira/browse/TEZ-3126 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles >Priority: Critical > Attachments: TEZ-3126.1.patch > > > For example, when reducing parallelism from 36 to 22, the basePartitionRange > will be 1 (integer division of 36 by 22) and the vertex will not be re-configured. > {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey} > int desiredTaskParallelism = > (int)( > (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/ > desiredTaskInputDataSize); > if(desiredTaskParallelism < minTaskParallelism) { > desiredTaskParallelism = minTaskParallelism; > } > > if(desiredTaskParallelism >= currentParallelism) { > return true; > } > > // most shufflers will be assigned this range > basePartitionRange = currentParallelism/desiredTaskParallelism; > > if (basePartitionRange <= 1) { > // nothing to do if range is equal 1 partition. shuffler does it by > default > return true; > } > {code}
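To make the 36-to-22 case in the quoted description concrete, here is a minimal Java sketch of the same arithmetic together with the kind of reason string the ticket now asks to log. The names `ParallelismCheck`, `basePartitionRange`, and `reasonNotReduced`, as well as the message wording, are hypothetical and not taken from ShuffleVertexManager; only the integer division mirrors the quoted snippet.

```java
// Hypothetical sketch; not the actual ShuffleVertexManager code.
final class ParallelismCheck {

    // Mirrors the quoted snippet: integer division truncates toward zero.
    static int basePartitionRange(int currentParallelism, int desiredTaskParallelism) {
        return currentParallelism / desiredTaskParallelism;
    }

    // Returns a human-readable reason when the vertex would be left as is,
    // or null when a reduction would go ahead. The wording is made up.
    static String reasonNotReduced(int currentParallelism, int desiredTaskParallelism) {
        if (desiredTaskParallelism >= currentParallelism) {
            return "desired parallelism " + desiredTaskParallelism
                + " is not below current parallelism " + currentParallelism;
        }
        int range = basePartitionRange(currentParallelism, desiredTaskParallelism);
        if (range <= 1) {
            return "basePartitionRange is " + range + " (" + currentParallelism
                + "/" + desiredTaskParallelism + "), so there is nothing to merge";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(reasonNotReduced(36, 22)); // range 1: vertex left as is
        System.out.println(reasonNotReduced(36, 18)); // range 2: null, reduction proceeds
    }
}
```

With a current parallelism of 36, any desired parallelism above 18 yields a range of 1 in this sketch, which matches the "reduced by less than half" wording of the original summary.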
[jira] [Commented] (TEZ-2863) Container, node, and logs not available in UI for tasks that fail to launch
[ https://issues.apache.org/jira/browse/TEZ-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157416#comment-15157416 ] TezQA commented on TEZ-2863: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12789026/TEZ-2863.3.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 21 javac compiler warnings (more than the master's current 19 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1495//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1495//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1495//console This message is automatically generated.
Failed: TEZ-2863 PreCommit Build #1495
Jira: https://issues.apache.org/jira/browse/TEZ-2863 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1495/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 3635 lines...] [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-tests [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12789026/TEZ-2863.3.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 21 javac compiler warnings (more than the master's current 19 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1495//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1495//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1495//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 4c64686e3b66ad248aea8add6cd9066f541ae44a logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 7 tests failed. FAILED: org.apache.tez.test.TestFaultTolerance.testRandomFailingInputs Error Message: expected: but was: Stack Trace: java.lang.AssertionError: expected: but was: at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:141) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120) at org.apache.tez.test.TestFaultTolerance.testRandomFailingInputs(TestFaultTolerance.java:763) FAILED: org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120) at org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit(TestFaultTolerance.java:261) FAILED: org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. 
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
[jira] [Updated] (TEZ-3129) Tez task and task attempt UI needs application fails with NotFoundException
[ https://issues.apache.org/jira/browse/TEZ-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-3129: - Attachment: TEZ-3129.1.patch Thanks for pointing me in the right direction, [~Sreenath]. Posting a patch that catches the error and prevents the exception when loading the task/ and taskAttempt/ pages. > Tez task and task attempt UI needs application fails with NotFoundException > --- > > Key: TEZ-3129 > URL: https://issues.apache.org/jira/browse/TEZ-3129 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-3129.1.patch > >
[jira] [Assigned] (TEZ-3129) Tez task and task attempt UI needs application fails with NotFoundException
[ https://issues.apache.org/jira/browse/TEZ-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles reassigned TEZ-3129: Assignee: Jonathan Eagles > Tez task and task attempt UI needs application fails with NotFoundException > --- > > Key: TEZ-3129 > URL: https://issues.apache.org/jira/browse/TEZ-3129 > Project: Apache Tez > Issue Type: Bug > Components: UI >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles >
[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread
[ https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157279#comment-15157279 ] Hitesh Shah commented on TEZ-3128: -- [~ozawa] I don't think the delayed container manager thread is the issue here. [~sseth], can you add more details/logs on this? As per the code, I see the following: - DAGAppMaster shuts down the YARN scheduler service, but it does not kill containers on shutdown; it just releases them via AMRMClient - TezTaskCommunicatorImpl does nothing to kill containers on stop(). It seems like the container launcher is the one trying to shut down containers for some reason. Maybe we should just release containers via the scheduler service instead of trying to stop them? > Avoid stopping containers on the AM shutdown thread > --- > > Key: TEZ-3128 > URL: https://issues.apache.org/jira/browse/TEZ-3128 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.0-alpha >Reporter: Siddharth Seth >Assignee: Tsuyoshi Ozawa > Labels: newbie > Attachments: TEZ-3128.001.patch > > > During an AM shutdown, the TaskCommunicator is also shut down and it tries to > stop containers in the shutdown thread itself. This can cause the AM shutdown > to block if NMs are not available. > This likely affects 0.7 as well.
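One generic way to keep a shutdown thread from blocking on unreachable NodeManagers, sketched under assumptions: hand the stop calls to a separate executor and wait only a bounded time. `ContainerStopper`, `stopAll`, the pool size, and the timeout are hypothetical; this is not the approach in TEZ-3128.001.patch (whose contents are not shown here), and it is a different option from the release-via-scheduler idea raised in the comment.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: runs stop actions off the calling (shutdown) thread.
final class ContainerStopper {

    // Submits every stop action to a daemon-thread pool and waits at most
    // timeoutMs. Returns true only if all actions finished within the timeout.
    static boolean stopAll(List<Runnable> stopActions, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
            Thread t = new Thread(r, "container-stopper");
            t.setDaemon(true); // never keep the JVM alive for a hung stop call
            return t;
        });
        for (Runnable action : stopActions) {
            pool.execute(action);
        }
        pool.shutdown(); // accept no new work; queued actions still run
        try {
            if (pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        pool.shutdownNow(); // interrupt stragglers and let shutdown continue
        return false;
    }
}
```

The caller still learns whether everything stopped cleanly, but an unavailable node can delay shutdown by at most the timeout.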
[jira] [Updated] (TEZ-2863) Container, node, and logs not available in UI for tasks that fail to launch
[ https://issues.apache.org/jira/browse/TEZ-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-2863: - Attachment: TEZ-2863.3.patch TEZ-2863.3-branch-0.7.patch > Container, node, and logs not available in UI for tasks that fail to launch > --- > > Key: TEZ-2863 > URL: https://issues.apache.org/jira/browse/TEZ-2863 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: TEZ-2863.1.patch, TEZ-2863.2-branch-0.7.patch, > TEZ-2863.2.patch, TEZ-2863.3-branch-0.7.patch, TEZ-2863.3.patch > > > While running a sample tez job > {noformat} > tez-examples-*.jar orderedwordcount -Dtez.task.resource.memory.mb=1 > -Dtez.task.launch.cmd-opts="-Xmx1m" input output > {noformat} > It was noticed that the Tez UI task attempt > http://timelineserverhost:port/ws/v1/timeline/TEZ_TASK_ATTEMPT_ID/attempt_id > was missing the TEZ_ATTEMPT_STARTED event > {noformat} > 2015-10-01 10:03:55,344 [INFO] [Dispatcher thread {Central}] > |history.HistoryEventHandler|: > [HISTORY][DAG:dag_1443711816411_0001_1][Event:TASK_STARTED]: > vertexName=Tokenizer, taskId=task_1443711816411_0001_1_00_00, > scheduledTime=1443711835342, launchTime=1443711835342 > 2015-10-01 10:03:55,346 [INFO] [Dispatcher thread {Central}] > |util.RackResolver|: Resolved localhost to /default-rack > 2015-10-01 10:03:55,356 [INFO] [TaskSchedulerEventHandlerThread] > |util.RackResolver|: Resolved localhost to /default-rack > 2015-10-01 10:03:55,364 [INFO] [TaskSchedulerEventHandlerThread] > |rm.YarnTaskSchedulerService|: Allocation request for task: > attempt_1443711816411_0001_1_00_00_0 with request: Capability[ vCores:1>]Priority[2] host: localhost rack: null > 2015-10-01 10:03:56,639 [INFO] [AMRM Heartbeater thread] > |impl.AMRMClientImpl|: Received new token for : localhost:57381 > 2015-10-01 10:03:56,646 [INFO] [AMRM Callback Handler Thread] > |util.RackResolver|: Resolved localhost to /default-rack > 2015-10-01 10:03:56,648 
[INFO] [DelayedContainerManager] > |rm.YarnTaskSchedulerService|: Assigning container to task: > containerId=container_1443711816411_0001_01_02, > task=attempt_1443711816411_0001_1_00_00_0, containerHost=localhost:57381, > containerPriority= 2, containerResources=, > localityMatchType=NodeLocal, matchedLocation=localhost, > honorLocalityFlags=true, reusedContainer=false, delayedContainers=0 > 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: > Resolved localhost to /default-rack > 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: > Resolved localhost to /default-rack > 2015-10-01 10:03:56,686 [INFO] [TaskSchedulerAppCaller #0] > |node.AMNodeTracker|: Adding new node: localhost:57381 > 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] > |launcher.ContainerLauncherImpl|: Launching > container_1443711816411_0001_01_02 > 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] > |impl.ContainerManagementProtocolProxy|: Opening proxy : localhost:57381 > 2015-10-01 10:03:56,741 [INFO] [ContainerLauncher #0] > |history.HistoryEventHandler|: [HISTORY][DAG:N/A][Event:CONTAINER_LAUNCHED]: > containerId=container_1443711816411_0001_01_02, launchTime=1443711836741 > 2015-10-01 10:03:57,647 [INFO] [AMRM Callback Handler Thread] > |rm.YarnTaskSchedulerService|: Allocated container > completed:container_1443711816411_0001_01_02 last allocated to task: > attempt_1443711816411_0001_1_00_00_0 > 2015-10-01 10:03:57,648 [INFO] [Dispatcher thread {Central}] > |container.AMContainerImpl|: Container container_1443711816411_0001_01_02 > exited with diagnostics set to Container failed, exitCode=1. Exception from > container-launch. 
> Container id: container_1443711816411_0001_01_02 > Exit code: 1 > Stack trace: ExitCodeException exitCode=1: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) > at org.apache.hadoop.util.Shell.run(Shell.java:455) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:
[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156775#comment-15156775 ] TezQA commented on TEZ-3124: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12788977/TEZ-3124-3.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance org.apache.tez.dag.app.dag.impl.TestDAGImpl The following test timeouts occurred in : org.apache.tez.test.TestRecovery Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1494//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1494//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1494//console This message is automatically generated. 
> Running task hangs due to missing event to initialize input in recovery > --- > > Key: TEZ-3124 > URL: https://issues.apache.org/jira/browse/TEZ-3124 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.2 >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Labels: Recovery > Fix For: 0.8.3 > > Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, > a.log > > > {noformat} > 2016-02-09 04:48:42 Starting to run new task attempt: > attempt_1454993155302_0001_1_00_61_3 > /attempt_1454993155302_0001_1_00_61 > 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] > |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, > numPhysicalInputs=1 > 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] > |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish > 2016-02-09 04:48:43,333 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished > 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: > InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy], > > [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput] > 2016-02-09 04:48:43,559 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: > ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1] > 2016-02-09 04:48:43,563 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, > AdditionalReservationFractionForIOs=0.03, > 
finalReserveFractionUsed=0.32996 > 2016-02-09 04:48:43,564 [INFO] [TezChild] > |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: > 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: > 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, > TotalRequested/TotalJVMHeap:0.70 > 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: > Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], > [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871] > 2016-02-09 04:48:43,564 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs > 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] > |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput > 2016-02-09 04:48:43,572 [INFO] [TezChild] > |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by > the framework. Subsequent instances will not be auto-started > 2016-02-09 04:48:43,573 [INFO] [
Failed: TEZ-3124 PreCommit Build #1494
Jira: https://issues.apache.org/jira/browse/TEZ-3124 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1494/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 3615 lines...] [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12788977/TEZ-3124-3.patch against master revision f38e23c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 19 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.test.TestFaultTolerance org.apache.tez.dag.app.dag.impl.TestDAGImpl The following test timeouts occurred in : org.apache.tez.test.TestRecovery Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1494//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1494//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1494//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 8a381516ce563c7dc864f558d8ae00cea0b7f634 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 7 tests failed. 
FAILED: org.apache.tez.dag.app.dag.impl.TestDAGImpl.testCounterLimits Error Message: expected: but was: Stack Trace: java.lang.AssertionError: expected: but was: at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.tez.dag.app.dag.impl.TestDAGImpl.testCounterLimits(TestDAGImpl.java:2290) FAILED: org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120) at org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit(TestFaultTolerance.java:261) FAILED: org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices Error Message: TezSession has already shutdown. No cluster diagnostics found. Stack Trace: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found. at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124) at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120) at org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices(TestFaultTolerance.java:703) FAILE
[jira] [Comment Edited] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156668#comment-15156668 ] Jeff Zhang edited comment on TEZ-3124 at 2/22/16 9:22 AM:

Found the root cause: multiple VertexInitializedEvents are logged when there are multiple rounds of recovery, and initGeneratedEvents is not restored during recovery. As a result, the second round of recovery gets 0 initGeneratedEvents:
{noformat} 2016-02-09 04:48:37,175 [INFO] [main] |app.RecoveryParser|: Recovering from event, eventType=VERTEX_INITIALIZED, event=vertexName=map, vertexId=vertex_1454993155302_0001_1_00, initRequestedTime=1454993277903, initedTime=1454993194025, numTasks=90, processorName=null, additionalInputsCount=1, initGeneratedEventsCount=0 {noformat}
Attached a patch. [~bikassaha], please help review.
* log VertexInitializedEvent only once
* add a test case for multiple rounds of recovery

was (Author: zjffdu): Found the root cause: multiple VertexInitializedEvents are logged when there are multiple rounds of recovery, and initGeneratedEvents is not restored in VertexInitializedEvent during recovery. As a result, the second round of recovery gets 0 initGeneratedEvents:
{noformat} 2016-02-09 04:48:37,175 [INFO] [main] |app.RecoveryParser|: Recovering from event, eventType=VERTEX_INITIALIZED, event=vertexName=map, vertexId=vertex_1454993155302_0001_1_00, initRequestedTime=1454993277903, initedTime=1454993194025, numTasks=90, processorName=null, additionalInputsCount=1, initGeneratedEventsCount=0 {noformat}
Attached a patch. [~bikassaha], please help review.
* log VertexInitializedEvent only once
* add a test case for multiple rounds of recovery

> Running task hangs due to missing event to initialize input in recovery > --- > > Key: TEZ-3124 > URL: https://issues.apache.org/jira/browse/TEZ-3124 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.2 >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Labels: Recovery > Fix For: 0.8.3 > > Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, > a.log
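The root-cause analysis in the comment above can be illustrated with a small, self-contained sketch. This is a toy model, not Tez code: `InitRecord`, `recover`, and `eventsSeenAfterRounds` are hypothetical names, and the "last matching record wins" replay rule is an assumption made for illustration. It only shows why re-logging an initialized record without its generated events could leave a second recovery round with zero events, and why logging the record once would avoid that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RecoveryReplaySketch {

    /** Hypothetical stand-in for one VERTEX_INITIALIZED entry in the recovery log. */
    static final class InitRecord {
        final String vertexId;
        final List<String> initGeneratedEvents;

        InitRecord(String vertexId, List<String> initGeneratedEvents) {
            this.vertexId = vertexId;
            this.initGeneratedEvents = initGeneratedEvents;
        }
    }

    /** Replays the log; ASSUMPTION: the last record for a vertex overrides earlier ones. */
    static List<String> recover(List<InitRecord> log, String vertexId) {
        List<String> events = Collections.emptyList();
        for (InitRecord record : log) {
            if (record.vertexId.equals(vertexId)) {
                events = record.initGeneratedEvents;
            }
        }
        return events;
    }

    /**
     * Runs one normal attempt followed by the given number of recovery rounds and
     * returns how many init-generated events the final round recovered.
     * With logOnlyOnce == false, every recovery round appends a new record whose
     * event list was not restored (the behaviour described in the comment).
     */
    static int eventsSeenAfterRounds(int recoveryRounds, boolean logOnlyOnce) {
        String vertexId = "vertex_0";
        List<InitRecord> log = new ArrayList<>();
        // First attempt: the vertex initializes and logs its generated events.
        log.add(new InitRecord(vertexId, Arrays.asList("split-0", "split-1")));

        List<String> recovered = Collections.emptyList();
        for (int round = 0; round < recoveryRounds; round++) {
            recovered = recover(log, vertexId);
            if (!logOnlyOnce) {
                // Re-logged during recovery without the original events.
                log.add(new InitRecord(vertexId, Collections.<String>emptyList()));
            }
        }
        return recovered.size();
    }

    public static void main(String[] args) {
        System.out.println("re-logging, 2 rounds: " + eventsSeenAfterRounds(2, false));
        System.out.println("log once,   2 rounds: " + eventsSeenAfterRounds(2, true));
    }
}
```

In this model the first recovery round still sees both events in either mode; only the second round differs, which matches the `initGeneratedEventsCount=0` line being observed after multiple rounds of recovery.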
[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156668#comment-15156668 ] Jeff Zhang commented on TEZ-3124:

Found the root cause: multiple VertexInitializedEvents are logged when there are multiple rounds of recovery, and initGeneratedEvents is not restored in VertexInitializedEvent during recovery. As a result, the second round of recovery gets 0 initGeneratedEvents:
{noformat} 2016-02-09 04:48:37,175 [INFO] [main] |app.RecoveryParser|: Recovering from event, eventType=VERTEX_INITIALIZED, event=vertexName=map, vertexId=vertex_1454993155302_0001_1_00, initRequestedTime=1454993277903, initedTime=1454993194025, numTasks=90, processorName=null, additionalInputsCount=1, initGeneratedEventsCount=0 {noformat}
Attached a patch. [~bikassaha], please help review.
* log VertexInitializedEvent only once
* add a test case for multiple rounds of recovery

> Running task hangs due to missing event to initialize input in recovery > --- > > Key: TEZ-3124 > URL: https://issues.apache.org/jira/browse/TEZ-3124 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.2 >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Labels: Recovery > Fix For: 0.8.3 > > Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, > a.log
[jira] [Updated] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated TEZ-3124: Attachment: TEZ-3124-3.patch > Running task hangs due to missing event to initialize input in recovery > --- > > Key: TEZ-3124 > URL: https://issues.apache.org/jira/browse/TEZ-3124 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.8.2 >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Labels: Recovery > Fix For: 0.8.3 > > Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, > a.log
[jira] [Updated] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery
[ https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated TEZ-3124: Description: {noformat} 2016-02-09 04:48:42 Starting to run new task attempt: attempt_1454993155302_0001_1_00_61_3 /attempt_1454993155302_0001_1_00_61 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, numPhysicalInputs=1 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish 2016-02-09 04:48:43,333 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy], [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput] 2016-02-09 04:48:43,559 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1] 2016-02-09 04:48:43,563 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, AdditionalReservationFractionForIOs=0.03, finalReserveFractionUsed=0.32996 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.WeightedScalingMemoryDistributor|: Scaling Requests. 
NumRequests: 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, TotalRequested/TotalJVMHeap:0.70 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871] 2016-02-09 04:48:43,564 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput 2016-02-09 04:48:43,572 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by the framework. Subsequent instances will not be auto-started 2016-02-09 04:48:43,573 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1 2016-02-09 04:48:43,574 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start 2016-02-09 04:48:43,574 [INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: attempt_1454993155302_0001_1_00_61_3_10001 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 using: memoryMb=1646, keySerializerClass=class org.apache.hadoop.io.IntWritable, valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6, comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, serialization=org.apache.hadoop.io.serializer.WritableSerialization 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up 
PipelinedSorter for ireduce1: , UsingHashComparator=false 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly allocated block size=1725956096, index=0, Number of buffers=1, currentAllocatableMemory=0, currentBufferSize=1725956096, total=1725956096 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Pre allocating rest of memory buffers upfront 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up PipelinedSorter for ireduce1: , UsingHashComparator=false#blocks=1, maxMemUsage=1725956096, lazyAllocateMem=false, minBlockSize=2097152000, initial BLOCK_SIZE=1725956096, finalMergeEnabled=true, pipelinedShuffle=false, sendEmptyPartitions=true, tez.runtime.io.sort.mb=1646 2016-02-09 04:48:45,099 [INFO] [TezChild] |impl.PipelinedSorter|: ireduce1: reserved.remaining()=1725956096, reserved.metasize=16777216 2016-02-09 04:48:45,175 [INFO] [TezChild] |input.MRInput|: Initialized MRInput: MRInput 2016-02-09 08:55:40,790 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Received should die res