[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158501#comment-15158501
 ] 

TezQA commented on TEZ-3124:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision fd75e64.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 20 javac 
compiler warnings (more than the master's current 19 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1502//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1502//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1502//console

This message is automatically generated.

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.2
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> TEZ-3124-4.patch, a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:

Failed: TEZ-3124 PreCommit Build #1502

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1502/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3769 lines...]
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 56:55 min
[INFO] Finished at: 2016-02-23T07:59:04+00:00
[INFO] Final Memory: 63M/844M
[INFO] 






==
==
Adding comment to Jira.
==
==


Comment added.
b96c70826ed26a596dcd097105adec9dc36f13a1 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158393#comment-15158393
 ] 

TezQA commented on TEZ-3124:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision fd75e64.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 20 javac 
compiler warnings (more than the master's current 19 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1501//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1501//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1501//console

This message is automatically generated.


Failed: TEZ-3124 PreCommit Build #1501

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1501/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 9284 lines...]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] [Help 2] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :atlas-docs






==
==
Adding comment to Jira.
==
==


Comment added.
206be74a18c45eb3c4364c30336efe4e951e03a3 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158300#comment-15158300
 ] 

TezQA commented on TEZ-3124:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789126/TEZ-3124-4.patch
  against master revision 44ca229.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 20 javac 
compiler warnings (more than the master's current 19 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1500//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1500//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1500//console

This message is automatically generated.


Failed: TEZ-3124 PreCommit Build #1500

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1500/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3773 lines...]
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 57:19 min
[INFO] Finished at: 2016-02-23T05:20:36+00:00
[INFO] Final Memory: 71M/1160M
[INFO] 






==
==
Adding comment to Jira.
==
==


Comment added.
086061b26e7fdbc3b9e30be33ee26c2944e450fe logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-3124:

Attachment: TEZ-3124-4.patch

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.2
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> TEZ-3124-4.patch, a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running 
> task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: 
> attempt_1454993155302_0001_1_00_61_3_10001
> 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 
> using: memoryMb=1646, keySerializerClass=class 
> org.apache.hadoop.io.IntWritable, 
> valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6,
>  comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, 
> partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, 
> serialization=org.apache.hadoop.io.serializer.WritableSerialization
> 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up 
> PipelinedSorter for ireduce1: , UsingHashComparator=false
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly 
> allocated block size=1725956096, index=0, Number of buffers=1, 
> currentAllocatableMemory=0, currentBufferSize=1725956096, total=1725956096
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSort

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158204#comment-15158204
 ] 

TezQA commented on TEZ-3124:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12788977/TEZ-3124-3.patch
  against master revision f38e23c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 20 javac 
compiler warnings (more than the master's current 19 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.dag.impl.TestDAGImpl

  The following test timeouts occurred in :
 org.apache.tez.test.TestRecovery

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1499//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1499//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1499//console

This message is automatically generated.

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.2
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Num

Failed: TEZ-3124 PreCommit Build #1499

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1499/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3555 lines...]
[ERROR]   mvn  -rf :tez-dag
[INFO] Build failures were ignored.






==
==
Adding comment to Jira.
==
==


Comment added.
fbf51670fcdd9af809a490d33063b1c030862880 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  org.apache.tez.dag.app.dag.impl.TestDAGImpl.testCounterLimits

Error Message:
expected: but was:

Stack Trace:
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at org.apache.tez.dag.app.dag.impl.TestDAGImpl.testCounterLimits(TestDAGImpl.java:2290)




[jira] [Commented] (TEZ-3126) Log reason for not reducing parallelism

2016-02-22 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158125#comment-15158125
 ] 

Bikas Saha commented on TEZ-3126:
-

lgtm

> Log reason for not reducing parallelism
> ---
>
> Key: TEZ-3126
> URL: https://issues.apache.org/jira/browse/TEZ-3126
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Critical
> Attachments: TEZ-3126.1.patch, TEZ-3126.2.patch
>
>
> For example, when reducing parallelism from 36 to 22, the basePartitionRange 
> will be 1 and the vertex will not be re-configured.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
> int desiredTaskParallelism = 
> (int)(
> (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
> desiredTaskInputDataSize);
> if(desiredTaskParallelism < minTaskParallelism) {
>   desiredTaskParallelism = minTaskParallelism;
> }
> 
> if(desiredTaskParallelism >= currentParallelism) {
>   return true;
> }
> 
> // most shufflers will be assigned this range
> basePartitionRange = currentParallelism/desiredTaskParallelism;
> 
> if (basePartitionRange <= 1) {
>   // nothing to do if range is equal 1 partition. shuffler does it by default
>   return true;
> }
> {code}
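
The 36-to-22 case in the description comes down to integer division. A minimal sketch of that arithmetic follows; the class and helper names are hypothetical and only mirror the quoted snippet, they are not the actual ShuffleVertexManager code.

```java
public class ParallelismSketch {
    // Hypothetical helper mirroring the quoted line:
    //   basePartitionRange = currentParallelism/desiredTaskParallelism;
    // Java integer division truncates toward zero.
    static int basePartitionRange(int currentParallelism, int desiredTaskParallelism) {
        return currentParallelism / desiredTaskParallelism;
    }

    // True when the quoted code would go on to re-configure the vertex,
    // i.e. when neither early "return true" branch is taken.
    static boolean wouldReconfigure(int currentParallelism, int desiredTaskParallelism) {
        if (desiredTaskParallelism >= currentParallelism) {
            return false;
        }
        return basePartitionRange(currentParallelism, desiredTaskParallelism) > 1;
    }

    public static void main(String[] args) {
        // 36 / 22 truncates to 1, so the early return is taken.
        System.out.println("36 -> 22: range=" + basePartitionRange(36, 22)
                + ", reconfigure=" + wouldReconfigure(36, 22));
        // The range only reaches 2 once the desired parallelism is 18 or lower.
        System.out.println("36 -> 18: range=" + basePartitionRange(36, 18)
                + ", reconfigure=" + wouldReconfigure(36, 18));
    }
}
```

Under this reading, any desired parallelism above half of the current value leaves the vertex untouched, which is the situation the issue asks to have logged.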



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158037#comment-15158037
 ] 

Hitesh Shah commented on TEZ-3128:
--

bq.  Could you help me to clarify where to fix? 

I think the ContainerLauncher is the one, as far as I understand. Let's 
wait for [~sseth] or [~rajesh.balamohan] to supply additional logs to pinpoint 
the problem. 

> Avoid stopping containers on the AM shutdown thread
> ---
>
> Key: TEZ-3128
> URL: https://issues.apache.org/jira/browse/TEZ-3128
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.0-alpha
>Reporter: Siddharth Seth
>Assignee: Tsuyoshi Ozawa
>  Labels: newbie
> Attachments: TEZ-3128.001.patch
>
>
> During an AM shutdown, the TaskCommunicator is also shut down and it tries to 
> stop containers in the shutdown thread itself. This can cause the AM shutdown 
> to block if NMs are not available.
> This likely affects 0.7 as well.
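
One generic way to keep a shutdown thread from blocking on unavailable NMs is to run the stop call on a separate daemon thread and wait for it with a bound. The sketch below uses only java.util.concurrent; the class name, method name and timeout are assumptions for illustration, and this is not the TEZ-3128 patch or any Tez API.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStopSketch {
    /**
     * Runs stopAction on a daemon worker thread and waits at most timeoutMs.
     * Returns true if the action finished in time, false otherwise.
     */
    static boolean stopWithTimeout(Runnable stopAction, long timeoutMs) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "container-stop-sketch");
            t.setDaemon(true); // a stuck stop call must not keep the JVM alive
            return t;
        });
        Future<?> result = worker.submit(stopAction);
        worker.shutdown(); // accept no further work; the submitted task still runs
        try {
            result.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the worker and move on
            return false;
        } catch (ExecutionException e) {
            return false; // the stop action itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a stop call that returns promptly.
        boolean fast = stopWithTimeout(() -> { }, 1000);
        // Stand-in for a stop call blocked on an unreachable node.
        boolean slow = stopWithTimeout(() -> {
            try {
                Thread.sleep(5000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 50);
        System.out.println("fast finished=" + fast + ", slow finished=" + slow);
    }
}
```

Which component should own such a bound (the comments mention the ContainerLauncher as a candidate) is still open in the discussion.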



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158031#comment-15158031
 ] 

TezQA commented on TEZ-3131:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789087/TEZ-3131.3.patch
  against master revision f38e23c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestFaultTolerance

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1498//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1498//console

This message is automatically generated.

> Support a way to override test_root_dir for FaultToleranceTestRunner
> 
>
> Key: TEZ-3131
> URL: https://issues.apache.org/jira/browse/TEZ-3131
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
>Priority: Minor
> Attachments: TEZ-3131.1.patch, TEZ-3131.2.patch, TEZ-3131.3.patch
>
>
> The path is hardcoded. For regression testing, it will be useful if it can be 
> overridden via command-line if needed. 
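
A command-line override of a hardcoded path could look roughly like the sketch below. The class name, the default value and the choice of the first argument are all placeholders chosen for illustration; they are not taken from FaultToleranceTestRunner or from the attached patches.

```java
public class TestRootDirSketch {
    // Placeholder default, standing in for the currently hardcoded path.
    static final String DEFAULT_TEST_ROOT_DIR = "/tmp/test-root-placeholder";

    // Uses the first command-line argument as the root dir when it is
    // present and non-blank, otherwise falls back to the default.
    static String resolveTestRootDir(String[] args) {
        if (args != null && args.length > 0 && args[0] != null && !args[0].trim().isEmpty()) {
            return args[0].trim();
        }
        return DEFAULT_TEST_ROOT_DIR;
    }

    public static void main(String[] args) {
        System.out.println("test root dir: " + resolveTestRootDir(args));
    }
}
```

Keeping the default means existing invocations behave as before, and only regression runs that pass an argument see a different directory.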



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Failed: TEZ-3131 PreCommit Build #1498

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3131
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1498/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3610 lines...]
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] [Help 2] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :tez-tests
[INFO] Build failures were ignored.




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789087/TEZ-3131.3.patch
  against master revision f38e23c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestFaultTolerance

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1498//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1498//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
6cf4febb8576997a961b23136d7815620518bd06 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
6 tests failed.
FAILED:  org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit(TestFaultTolerance.java:261)


FAILED:  org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices(TestFaultTolerance.java:703)


FAILED:  org.apache.tez.test.TestFaultTolerance.testMultipleInputFailureWithoutExit

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at org.apache.tez.test.TestFaultTolerance.r

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158026#comment-15158026
 ] 

Hitesh Shah commented on TEZ-3124:
--

bq. The failed tests are TestFaultTolerance and TestDAGImpl.testCounterLimits 
which are not related.

The following test timeouts occurred in :
org.apache.tez.test.TestRecovery

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>  Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running 
> task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: 
> attempt_1454993155302_0001_1_00_61_3_10001
> 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 
> using: memoryMb=1646, keySerializerClass=class 
> org.apache.hadoop.io.IntWritable, 
> valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6,
>  comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, 
> partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, 
> serialization=org.apache.hadoop.io.serializer.WritableSerialization
> 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up 
> PipelinedSorter for ireduce1: , UsingHashComparator=false
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly 
> allocated block s

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158021#comment-15158021
 ] 

Jeff Zhang commented on TEZ-3124:
-

The failed tests are TestFaultTolerance and TestDAGImpl.testCounterLimits which 
are not related. 

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>  Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running 
> task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: 
> attempt_1454993155302_0001_1_00_61_3_10001
> 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 
> using: memoryMb=1646, keySerializerClass=class 
> org.apache.hadoop.io.IntWritable, 
> valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6,
>  comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, 
> partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, 
> serialization=org.apache.hadoop.io.serializer.WritableSerialization
> 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up 
> PipelinedSorter for ireduce1: , UsingHashComparator=false
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly 
> allocated block size=1725956096, index=0, Number of buffers=1, 
> currentAllocatableMemory=0, curr

[jira] [Commented] (TEZ-3067) Links to tez configs documentation should be bubbled up to top-level release page

2016-02-22 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158007#comment-15158007
 ] 

Tsuyoshi Ozawa commented on TEZ-3067:
-

Thanks [~hitesh] for committing and reviewing.

> Links to tez configs documentation should be bubbled up to top-level release 
> page 
> --
>
> Key: TEZ-3067
> URL: https://issues.apache.org/jira/browse/TEZ-3067
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Tsuyoshi Ozawa
>  Labels: newbie
> Fix For: 0.8.3
>
> Attachments: TEZ-3067.001.patch, TEZ-3067.002.patch
>
>
> http://tez.apache.org/releases/0.8.2/tez-api-javadocs/configs/TezConfiguration.html
>  is hidden away in the api docs. It would be useful to update the release 
> template to add direct links to the config docs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-22 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158004#comment-15158004
 ] 

Tsuyoshi Ozawa commented on TEZ-3128:
-

[~hitesh] [~sseth] Thank you for pointing that out.

{quote}
dagappmaster shuts down yarn scheduler service but it does not kill containers 
on shutdown - just releases them via amrmclient
TezTaskCommunicatorImpl on stop() does nothing to kill containers.
{quote}

Right, that's why I thought the place I fixed was what you mentioned. Could you 
help me to clarify where to fix?

> Avoid stopping containers on the AM shutdown thread
> ---
>
> Key: TEZ-3128
> URL: https://issues.apache.org/jira/browse/TEZ-3128
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.0-alpha
>Reporter: Siddharth Seth
>Assignee: Tsuyoshi Ozawa
>  Labels: newbie
> Attachments: TEZ-3128.001.patch
>
>
> During an AM shutdown, the TaskCommunicator is also shut down and it tries to 
> stop containers in the shutdown thread itself. This can cause the AM shutdown 
> to block if NMs are not available.
> This likely affects 0.7 as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3126) Log reason for not reducing parallelism

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157878#comment-15157878
 ] 

TezQA commented on TEZ-3126:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789063/TEZ-3126.2.patch
  against master revision f38e23c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestFaultTolerance

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1497//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1497//console

This message is automatically generated.

> Log reason for not reducing parallelism
> ---
>
> Key: TEZ-3126
> URL: https://issues.apache.org/jira/browse/TEZ-3126
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Critical
> Attachments: TEZ-3126.1.patch, TEZ-3126.2.patch
>
>
> For example, when reducing parallelism from 36 to 22, the basePartitionRange 
> will be 1 and the vertex will not be re-configured.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
> int desiredTaskParallelism = 
> (int)(
> (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
> desiredTaskInputDataSize);
> if(desiredTaskParallelism < minTaskParallelism) {
>   desiredTaskParallelism = minTaskParallelism;
> }
> 
> if(desiredTaskParallelism >= currentParallelism) {
>   return true;
> }
> 
> // most shufflers will be assigned this range
> basePartitionRange = currentParallelism/desiredTaskParallelism;
> 
> if (basePartitionRange <= 1) {
>   // nothing to do if range is equal 1 partition. shuffler does it by default
>   return true;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner

2016-02-22 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-3131:
-
Attachment: TEZ-3131.3.patch

> Support a way to override test_root_dir for FaultToleranceTestRunner
> 
>
> Key: TEZ-3131
> URL: https://issues.apache.org/jira/browse/TEZ-3131
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
>Priority: Minor
> Attachments: TEZ-3131.1.patch, TEZ-3131.2.patch, TEZ-3131.3.patch
>
>
> The path is hardcoded. For regression testing, it will be useful if it can be 
> overridden via command-line if needed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Failed: TEZ-3126 PreCommit Build #1497

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3126
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1497/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3607 lines...]
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :tez-tests
[INFO] Build failures were ignored.




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789063/TEZ-3126.2.patch
  against master revision f38e23c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestFaultTolerance

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1497//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1497//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
6f1eba9ae9e095834e1202ea1e7cdbbdf309e0b3 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
6 tests failed.
FAILED:  org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit(TestFaultTolerance.java:261)


FAILED:  org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices(TestFaultTolerance.java:703)


FAILED:  org.apache.tez.test.TestFaultTolerance.testMultipleInputFailureWithoutExit

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
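org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at org.apache.tez.test.TestFaultTolera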

[jira] [Updated] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner

2016-02-22 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-3131:
-
Attachment: TEZ-3131.2.patch

> Support a way to override test_root_dir for FaultToleranceTestRunner
> 
>
> Key: TEZ-3131
> URL: https://issues.apache.org/jira/browse/TEZ-3131
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
>Priority: Minor
> Attachments: TEZ-3131.1.patch, TEZ-3131.2.patch
>
>
> The path is hardcoded. For regression testing, it will be useful if it can be 
> overridden via command-line if needed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (TEZ-3115) Shuffle string handling adds significant memory overhead

2016-02-22 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles reassigned TEZ-3115:


Assignee: Jonathan Eagles

> Shuffle string handling adds significant memory overhead
> 
>
> Key: TEZ-3115
> URL: https://issues.apache.org/jira/browse/TEZ-3115
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Jason Lowe
>Assignee: Jonathan Eagles
> Attachments: TEZ-3115.1.patch
>
>
> While investigating the OOM heap dump from TEZ-3114 I noticed that the 
> ShuffleManager and other shuffle-related objects were holding onto many 
> strings that added up to over a hundred megabytes of memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-3126) Log reason for not reducing parallelism

2016-02-22 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3126:
-
Attachment: TEZ-3126.2.patch

[~bikassaha], [~rajesh.balamohan] let me know if the updated log messages are 
clear enough.

> Log reason for not reducing parallelism
> ---
>
> Key: TEZ-3126
> URL: https://issues.apache.org/jira/browse/TEZ-3126
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Critical
> Attachments: TEZ-3126.1.patch, TEZ-3126.2.patch
>
>
> For example, when reducing parallelism from 36 to 22, the basePartitionRange 
> will be 1 and the vertex will not be re-configured.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
> int desiredTaskParallelism = 
> (int)(
> (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
> desiredTaskInputDataSize);
> if(desiredTaskParallelism < minTaskParallelism) {
>   desiredTaskParallelism = minTaskParallelism;
> }
> 
> if(desiredTaskParallelism >= currentParallelism) {
>   return true;
> }
> 
> // most shufflers will be assigned this range
> basePartitionRange = currentParallelism/desiredTaskParallelism;
> 
> if (basePartitionRange <= 1) {
>   // nothing to do if range is equal to 1 partition. shuffler does it by default
>   return true;
> }
> {code}
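
To make the 36-to-22 case concrete, here is a minimal sketch of the arithmetic in the quoted snippet. The class and method names are hypothetical and are not taken from ShuffleVertexManager; only the two formulas come from the description.

```java
// Illustrative sketch only: mirrors the arithmetic quoted in the issue
// description. Class and method names are hypothetical, not Tez code.
final class ParallelismRangeSketch {

  // Ceiling division of total output size by desired per-task input size,
  // clamped to the minimum parallelism (as in the quoted snippet).
  static int desiredParallelism(long expectedTotalOutputSize,
                                long desiredTaskInputSize,
                                int minTaskParallelism) {
    int desired = (int) ((expectedTotalOutputSize + desiredTaskInputSize - 1)
        / desiredTaskInputSize);
    return Math.max(desired, minTaskParallelism);
  }

  // Integer division truncates, so any target above half of the current
  // parallelism yields a range of 1.
  static int basePartitionRange(int currentParallelism, int desiredTaskParallelism) {
    return currentParallelism / desiredTaskParallelism;
  }

  public static void main(String[] args) {
    System.out.println(basePartitionRange(36, 22)); // range 1: vertex left as is
    System.out.println(basePartitionRange(36, 18)); // range 2: reduction can proceed
  }
}
```

With integer division, a target of 19 through 35 tasks from a current 36 gives a range of 1, which matches the "reduced by less than half" wording of the original summary.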





[jira] [Commented] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner

2016-02-22 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157572#comment-15157572
 ] 

Bikas Saha commented on TEZ-3131:
-

Sure. Please go ahead. +1.

> Support a way to override test_root_dir for FaultToleranceTestRunner
> 
>
> Key: TEZ-3131
> URL: https://issues.apache.org/jira/browse/TEZ-3131
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
>Priority: Minor
> Attachments: TEZ-3131.1.patch
>
>
> The path is hardcoded. For regression testing, it will be useful if it can be 
> overridden via command-line if needed. 





[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157558#comment-15157558
 ] 

Hitesh Shah commented on TEZ-3124:
--

TestRecovery seems to have failed with the new patch 

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>  Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running 
> task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: 
> attempt_1454993155302_0001_1_00_61_3_10001
> 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 
> using: memoryMb=1646, keySerializerClass=class 
> org.apache.hadoop.io.IntWritable, 
> valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6,
>  comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, 
> partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, 
> serialization=org.apache.hadoop.io.serializer.WritableSerialization
> 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up 
> PipelinedSorter for ireduce1: , UsingHashComparator=false
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly 
> allocated block size=1725956096, index=0, Number of buffers=1, 
> currentAllocatableMemory=0, currentBufferSize=1725956096, total=1725956096

[jira] [Updated] (TEZ-3119) Add missing AM translations in DeprecatedKeys#populateMRToDagParamMap

2016-02-22 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated TEZ-3119:
-
Attachment: TEZ-3119.001.patch

Attaching initial patch.

> Add missing AM translations in DeprecatedKeys#populateMRToDagParamMap
> -
>
> Key: TEZ-3119
> URL: https://issues.apache.org/jira/browse/TEZ-3119
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.2
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: TEZ-3119.001.patch
>
>
> MRToDagParamMap is missing some of the relevant configs. Some of them include:
> {code}
> TEZ_CREDENTIALS_PATH
> TEZ_AM_LOG_LEVEL
> TEZ_AM_MAX_APP_ATTEMPTS
> TEZ_AM_RESOURCE_MEMORY_MB
> TEZ_AM_RESOURCE_CPU_VCORES
> TEZ_AM_CLIENT_THREAD_COUNT
> TEZ_AM_CLIENT_AM_PORT_RANGE
> TEZ_AM_RM_HEARTBEAT_INTERVAL_MS_MAX
> TASK_HEARTBEAT_TIMEOUT_MS
> TEZ_TASK_AM_HEARTBEAT_INTERVAL_MS
> TEZ_AM_APPLICATION_PRIORITY
> TEZ_AM_VIEW_ACLS
> TEZ_AM_MODIFY_ACLS
> TEZ_CANCEL_DELEGATION_TOKENS_ON_COMPLETION
> TEZ_AM_CONTAINERLAUNCHER_THREAD_COUNT_LIMIT
> TEZ_AM_LEGACY_SPECULATIVE_SLOWTASK_THRESHOLD
> {code}
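
As a rough illustration of why a missing entry matters, the sketch below translates a set of MR-style settings through a key map. It assumes, for the sake of the example, that only mapped keys are carried over; the class name and the "example.*" key strings are hypothetical placeholders, not the real MR or Tez property names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: not the DeprecatedKeys implementation. The key strings
// used in main() are placeholders, not real MR or Tez property names.
final class MrToTezKeySketch {

  // Copies each MR setting to its Tez key when a translation exists.
  // In this sketch, settings without a translation are silently skipped.
  static Map<String, String> translate(Map<String, String> mrConf,
                                       Map<String, String> mrToTezKeys) {
    Map<String, String> translated = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : mrConf.entrySet()) {
      String tezKey = mrToTezKeys.get(entry.getKey());
      if (tezKey != null) {
        translated.put(tezKey, entry.getValue());
      }
    }
    return translated;
  }

  public static void main(String[] args) {
    Map<String, String> keys = new LinkedHashMap<>();
    keys.put("example.mr.am.log.level", "example.tez.am.log.level");

    Map<String, String> mrConf = new LinkedHashMap<>();
    mrConf.put("example.mr.am.log.level", "DEBUG");
    mrConf.put("example.mr.am.priority", "5"); // no translation registered

    // Only the log level survives; the priority setting is dropped.
    System.out.println(translate(mrConf, keys));
  }
}
```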





[jira] [Commented] (TEZ-3129) Tez task and task attempt UI needs application fails with NotFoundException

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157509#comment-15157509
 ] 

TezQA commented on TEZ-3129:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789030/TEZ-3129.1.patch
  against master revision f38e23c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1496//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1496//console

This message is automatically generated.

> Tez task and task attempt UI needs application fails with NotFoundException
> ---
>
> Key: TEZ-3129
> URL: https://issues.apache.org/jira/browse/TEZ-3129
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: TEZ-3129.1.patch
>
>






Failed: TEZ-3129 PreCommit Build #1496

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3129
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1496/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3769 lines...]
[INFO] 
[INFO] Total time: 01:01 h
[INFO] Finished at: 2016-02-22T19:12:36+00:00
[INFO] Final Memory: 64M/1032M
[INFO] 





==
==
Adding comment to Jira.
==
==


Comment added.
37917eb3ef711b3d4bad3a0e3b1ed00413e64f90 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2863) Container, node, and logs not available in UI for tasks that fail to launch

2016-02-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157505#comment-15157505
 ] 

Hitesh Shah commented on TEZ-2863:
--

+1 for the most part.

\cc [~zjffdu] in case he has any comments on the recovery aspect where the 
container info is not being written to recovery and whether it needs to be. 

For the UI, I think it might be better to leave it unchanged. The UI can 
probably remain dumb about whether to redirect to syslog or stderr when the 
syslog_attempt* file does not exist. The main reasons are that an additional 
HTTP call to verify existence would be needed (and I am not sure YARN supports 
that cleanly), and that the syslog/stderr choice may not be trivial to solve.

> Container, node, and logs not available in UI for tasks that fail to launch
> ---
>
> Key: TEZ-2863
> URL: https://issues.apache.org/jira/browse/TEZ-2863
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: TEZ-2863.1.patch, TEZ-2863.2-branch-0.7.patch, 
> TEZ-2863.2.patch, TEZ-2863.3-branch-0.7.patch, TEZ-2863.3.patch
>
>
> While running a sample tez job
> {noformat}
> tez-examples-*.jar orderedwordcount -Dtez.task.resource.memory.mb=1 
> -Dtez.task.launch.cmd-opts="-Xmx1m" input output
> {noformat}
> It was noticed that the Tez UI task attempt 
> http://timelineserverhost:port/ws/v1/timeline/TEZ_TASK_ATTEMPT_ID/attempt_id 
> was missing the TEZ_ATTEMPT_STARTED event
> {noformat}
> 2015-10-01 10:03:55,344 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1443711816411_0001_1][Event:TASK_STARTED]: 
> vertexName=Tokenizer, taskId=task_1443711816411_0001_1_00_00, 
> scheduledTime=1443711835342, launchTime=1443711835342
> 2015-10-01 10:03:55,346 [INFO] [Dispatcher thread {Central}] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:55,356 [INFO] [TaskSchedulerEventHandlerThread] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:55,364 [INFO] [TaskSchedulerEventHandlerThread] 
> |rm.YarnTaskSchedulerService|: Allocation request for task: 
> attempt_1443711816411_0001_1_00_00_0 with request: Capability[ vCores:1>]Priority[2] host: localhost rack: null
> 2015-10-01 10:03:56,639 [INFO] [AMRM Heartbeater thread] 
> |impl.AMRMClientImpl|: Received new token for : localhost:57381
> 2015-10-01 10:03:56,646 [INFO] [AMRM Callback Handler Thread] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:56,648 [INFO] [DelayedContainerManager] 
> |rm.YarnTaskSchedulerService|: Assigning container to task: 
> containerId=container_1443711816411_0001_01_02, 
> task=attempt_1443711816411_0001_1_00_00_0, containerHost=localhost:57381, 
> containerPriority= 2, containerResources=, 
> localityMatchType=NodeLocal, matchedLocation=localhost, 
> honorLocalityFlags=true, reusedContainer=false, delayedContainers=0
> 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: 
> Resolved localhost to /default-rack
> 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: 
> Resolved localhost to /default-rack
> 2015-10-01 10:03:56,686 [INFO] [TaskSchedulerAppCaller #0] 
> |node.AMNodeTracker|: Adding new node: localhost:57381
> 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] 
> |launcher.ContainerLauncherImpl|: Launching 
> container_1443711816411_0001_01_02
> 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] 
> |impl.ContainerManagementProtocolProxy|: Opening proxy : localhost:57381
> 2015-10-01 10:03:56,741 [INFO] [ContainerLauncher #0] 
> |history.HistoryEventHandler|: [HISTORY][DAG:N/A][Event:CONTAINER_LAUNCHED]: 
> containerId=container_1443711816411_0001_01_02, launchTime=1443711836741
> 2015-10-01 10:03:57,647 [INFO] [AMRM Callback Handler Thread] 
> |rm.YarnTaskSchedulerService|: Allocated container 
> completed:container_1443711816411_0001_01_02 last allocated to task: 
> attempt_1443711816411_0001_1_00_00_0
> 2015-10-01 10:03:57,648 [INFO] [Dispatcher thread {Central}] 
> |container.AMContainerImpl|: Container container_1443711816411_0001_01_02 
> exited with diagnostics set to Container failed, exitCode=1. Exception from 
> container-launch.
> Container id: container_1443711816411_0001_01_02
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1: 
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>   at org.apache.hadoop.util.Shell.run(Shell.java:455)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerE

[jira] [Commented] (TEZ-3131) Support a way to override test_root_dir for FaultToleranceTestRunner

2016-02-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157490#comment-15157490
 ] 

Hitesh Shah commented on TEZ-3131:
--

bq. The string value of the config name is atypical of config names - e.g. 
tez.test.root-dir. Perhaps something similar could be made available in a 
common place for all tests with similar logic.

Most other tests actually just use the staging dir as is instead of overriding 
it (unless it is being overridden for an instance-specific sub-dir of the base 
staging dir). Also, TEST_ROOT_DIR is commonly used to denote ./target/ for unit 
tests and is never really used for a path on DFS when running an end-to-end job 
test. 

I went with TEST_ROOT_DIR as that is a similar approach to the one used in 
TestOrderedWordCount for override params, mainly for testing purposes. I can 
change the property name to something such as 
"tez.test-fault-tolerance.staging-dir" (default value being ./tmp). Would 
that work? 
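
A minimal sketch of the override-with-default lookup being proposed, assuming the property name and the ./tmp default from the comment above. It uses java.util.Properties as a stand-in so the example is self-contained; the actual test runner would presumably read from a Hadoop/Tez configuration object instead, and the class name here is hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch: the class name is made up, and java.util.Properties
// stands in for whatever configuration object the test runner really uses.
final class StagingDirOverrideSketch {

  // Property name and default value as proposed in the comment above.
  static final String STAGING_DIR_KEY = "tez.test-fault-tolerance.staging-dir";
  static final String DEFAULT_STAGING_DIR = "./tmp";

  // Returns the overridden staging dir, or the default when the property
  // is unset or blank.
  static String resolveStagingDir(Properties props) {
    String value = props.getProperty(STAGING_DIR_KEY);
    if (value == null || value.trim().isEmpty()) {
      return DEFAULT_STAGING_DIR;
    }
    return value.trim();
  }

  public static void main(String[] args) {
    // System properties pick up -Dtez.test-fault-tolerance.staging-dir=... flags.
    System.out.println(resolveStagingDir(System.getProperties()));
  }
}
```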


> Support a way to override test_root_dir for FaultToleranceTestRunner
> 
>
> Key: TEZ-3131
> URL: https://issues.apache.org/jira/browse/TEZ-3131
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
>Priority: Minor
> Attachments: TEZ-3131.1.patch
>
>
> The path is hardcoded. For regression testing, it will be useful if it can be 
> overridden via command-line if needed. 





[jira] [Updated] (TEZ-3126) Log reason for not reducing parallelism

2016-02-22 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3126:
-
Summary: Log reason for not reducing parallelism  (was: Auto-Reduce 
Parallelism: Vertex not re-configured when reduced by less than half.)

> Log reason for not reducing parallelism
> ---
>
> Key: TEZ-3126
> URL: https://issues.apache.org/jira/browse/TEZ-3126
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Critical
> Attachments: TEZ-3126.1.patch
>
>
> For example, when reducing parallelism from 36 to 22, the basePartitionRange 
> will be 1 and the vertex will not be re-configured.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
> int desiredTaskParallelism = 
> (int)(
> (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
> desiredTaskInputDataSize);
> if(desiredTaskParallelism < minTaskParallelism) {
>   desiredTaskParallelism = minTaskParallelism;
> }
> 
> if(desiredTaskParallelism >= currentParallelism) {
>   return true;
> }
> 
> // most shufflers will be assigned this range
> basePartitionRange = currentParallelism/desiredTaskParallelism;
> 
> if (basePartitionRange <= 1) {
>   // nothing to do if range is equal to 1 partition. shuffler does it by default
>   return true;
> }
> {code}





[jira] [Commented] (TEZ-3126) Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.

2016-02-22 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157440#comment-15157440
 ] 

Jonathan Eagles commented on TEZ-3126:
--

I'll use this ticket to log the reason parallelism was not reduced. As to 
grouping, a better distribution may help. Empty partitions could be an 
interesting case since they have 0 output size.
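
One possible shape for such logging, as a sketch only: compute a human-readable reason at each early-return point so the caller can log it. The class, method, and message wording are hypothetical and do not reflect the attached patch; the two conditions are the ones in the snippet quoted from the issue description.

```java
import java.util.Optional;

// Hypothetical sketch: names and message text are assumptions, not the patch.
final class ParallelismLogSketch {

  // Returns the reason the vertex would be left unchanged, or empty when the
  // reduction would go ahead. Assumes desiredTaskParallelism >= 1.
  static Optional<String> reasonNotReduced(int currentParallelism,
                                           int desiredTaskParallelism) {
    if (desiredTaskParallelism >= currentParallelism) {
      return Optional.of("desired parallelism " + desiredTaskParallelism
          + " is not lower than current parallelism " + currentParallelism);
    }
    int basePartitionRange = currentParallelism / desiredTaskParallelism;
    if (basePartitionRange <= 1) {
      return Optional.of("base partition range is " + basePartitionRange
          + " (current " + currentParallelism + ", desired "
          + desiredTaskParallelism + "), so there is nothing to merge");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    reasonNotReduced(36, 22)
        .ifPresent(reason -> System.out.println("Not reducing parallelism: " + reason));
  }
}
```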

> Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than 
> half.
> -
>
> Key: TEZ-3126
> URL: https://issues.apache.org/jira/browse/TEZ-3126
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Critical
> Attachments: TEZ-3126.1.patch
>
>
> For example, when reducing parallelism from 36 to 22, the basePartitionRange 
> will be 1 and the vertex will not be re-configured.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
> int desiredTaskParallelism = 
> (int)(
> (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
> desiredTaskInputDataSize);
> if(desiredTaskParallelism < minTaskParallelism) {
>   desiredTaskParallelism = minTaskParallelism;
> }
> 
> if(desiredTaskParallelism >= currentParallelism) {
>   return true;
> }
> 
> // most shufflers will be assigned this range
> basePartitionRange = currentParallelism/desiredTaskParallelism;
> 
> if (basePartitionRange <= 1) {
>   // nothing to do if range is equal to 1 partition. shuffler does it by default
>   return true;
> }
> {code}





[jira] [Commented] (TEZ-2863) Container, node, and logs not available in UI for tasks that fail to launch

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157416#comment-15157416
 ] 

TezQA commented on TEZ-2863:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789026/TEZ-2863.3.patch
  against master revision f38e23c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 21 javac 
compiler warnings (more than the master's current 19 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestFaultTolerance

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1495//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1495//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1495//console

This message is automatically generated.

> Container, node, and logs not available in UI for tasks that fail to launch
> ---
>
> Key: TEZ-2863
> URL: https://issues.apache.org/jira/browse/TEZ-2863
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: TEZ-2863.1.patch, TEZ-2863.2-branch-0.7.patch, 
> TEZ-2863.2.patch, TEZ-2863.3-branch-0.7.patch, TEZ-2863.3.patch
>
>
> While running a sample tez job
> {noformat}
> tez-examples-*.jar orderedwordcount -Dtez.task.resource.memory.mb=1 
> -Dtez.task.launch.cmd-opts="-Xmx1m" input output
> {noformat}
> It was noticed that the Tez UI task attempt 
> http://timelineserverhost:port/ws/v1/timeline/TEZ_TASK_ATTEMPT_ID/attempt_id 
> was missing the TEZ_ATTEMPT_STARTED event
> {noformat}
> 2015-10-01 10:03:55,344 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1443711816411_0001_1][Event:TASK_STARTED]: 
> vertexName=Tokenizer, taskId=task_1443711816411_0001_1_00_00, 
> scheduledTime=1443711835342, launchTime=1443711835342
> 2015-10-01 10:03:55,346 [INFO] [Dispatcher thread {Central}] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:55,356 [INFO] [TaskSchedulerEventHandlerThread] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:55,364 [INFO] [TaskSchedulerEventHandlerThread] 
> |rm.YarnTaskSchedulerService|: Allocation request for task: 
> attempt_1443711816411_0001_1_00_00_0 with request: Capability[ vCores:1>]Priority[2] host: localhost rack: null
> 2015-10-01 10:03:56,639 [INFO] [AMRM Heartbeater thread] 
> |impl.AMRMClientImpl|: Received new token for : localhost:57381
> 2015-10-01 10:03:56,646 [INFO] [AMRM Callback Handler Thread] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:56,648 [INFO] [DelayedContainerManager] 
> |rm.YarnTaskSchedulerService|: Assigning container to task: 
> containerId=container_1443711816411_0001_01_02, 
> task=attempt_1443711816411_0001_1_00_00_0, containerHost=localhost:57381, 
> containerPriority= 2, containerResources=, 
> localityMatchType=NodeLocal, matchedLocation=localhost, 
> honorLocalityFlags=true, reusedContainer=false, delayedContainers=0
> 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: 
> Resolved localhost to /default-rack
> 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: 
> Resolved localhost to /default-rack
> 2015-10-01 10:03:56,686 [INFO] [TaskSchedulerAppCaller #0] 
> |node.AMNodeTracker|: Adding new node: localhost:57381
> 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] 
> |launcher.ContainerLauncherImpl|: Launching 
> container_1443711816411_0001_01_02
> 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] 
> |impl.ContainerManagementProtocolProxy|: Opening proxy : localhost:57381
> 2015-10-01 10:03:56,741 [INFO] [ContainerLauncher #0] 
> |history.HistoryEventHandler|: [HISTORY][DAG:N/A][Event:CONTAINER_LAUNCHED]: 
> containerId=container_1443711816411_0001_01_02, launchTime=1443711836741
> 2015-10-01 10:03:57,647 [INFO] [AMRM Callback Handler Thread] 
> |rm.YarnTaskSchedulerService|: Allocated container 
> completed:container_1443711816411_0001_01_02 last allocated to task: 
> attempt_1443711816411_0001_1_00_00_0
>

Failed: TEZ-2863 PreCommit Build #1495

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2863
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1495/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3635 lines...]
[ERROR] [Help 2] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :tez-tests
[INFO] Build failures were ignored.





==
==
Adding comment to Jira.
==
==


Comment added.
4c64686e3b66ad248aea8add6cd9066f541ae44a logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
7 tests failed.
FAILED:  org.apache.tez.test.TestFaultTolerance.testRandomFailingInputs

Error Message:
expected: but was:

Stack Trace:
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:141)
at 
org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at 
org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at 
org.apache.tez.test.TestFaultTolerance.testRandomFailingInputs(TestFaultTolerance.java:763)


FAILED:  org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No 
cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at 
org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at 
org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at 
org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at 
org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit(TestFaultTolerance.java:261)


FAILED:  
org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No 
cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at 
org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)

[jira] [Updated] (TEZ-3129) Tez task and task attempt UI needs application fails with NotFoundException

2016-02-22 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3129:
-
Attachment: TEZ-3129.1.patch

Thanks for pointing me in the right direction, [~Sreenath]. Posting a patch 
that catches the error and prevents the exception when loading the 
task/ and taskAttempt/ pages.

> Tez task and task attempt UI needs application fails with NotFoundException
> ---
>
> Key: TEZ-3129
> URL: https://issues.apache.org/jira/browse/TEZ-3129
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: TEZ-3129.1.patch
>
>






[jira] [Assigned] (TEZ-3129) Tez task and task attempt UI needs application fails with NotFoundException

2016-02-22 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles reassigned TEZ-3129:


Assignee: Jonathan Eagles

> Tez task and task attempt UI needs application fails with NotFoundException
> ---
>
> Key: TEZ-3129
> URL: https://issues.apache.org/jira/browse/TEZ-3129
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>






[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157279#comment-15157279
 ] 

Hitesh Shah commented on TEZ-3128:
--

[~ozawa] I don't think the delayed container manager thread is the issue here. 

[~sseth] Can you add more details/logs on this?


As per the code, I see the following: 
   - DAGAppMaster shuts down the YARN scheduler service, but it does not kill 
containers on shutdown; it just releases them via AMRMClient.
   - TezTaskCommunicatorImpl does nothing on stop() to kill containers. 

It seems like the container launcher is the one trying to shut down containers 
for some reason. Maybe we should just release containers via the scheduler 
service instead of trying to stop them?
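
To make the blocking concern concrete, here is a minimal sketch of one possible mitigation: issue the stop requests on worker threads and wait for them only up to a deadline, so the shutdown thread cannot hang on an unreachable NM. This is not the approach of the attached patch and not Tez code; `BoundedStopSketch`, `ContainerStopper`, and `stopAll` are hypothetical names introduced only for illustration.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these types exist in Tez.
final class BoundedStopSketch {

    /** Stand-in for a call that asks a node manager to stop one container. */
    interface ContainerStopper {
        void stop(String containerId) throws InterruptedException;
    }

    /**
     * Issues all stop requests on daemon worker threads and waits at most
     * timeoutMs for them. Returns true only if every request finished in time.
     */
    static boolean stopAll(List<String> containerIds, ContainerStopper stopper, long timeoutMs) {
        int threads = Math.max(1, Math.min(containerIds.size(), 4));
        ExecutorService pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, "container-stop");
            t.setDaemon(true); // never keeps the JVM alive during shutdown
            return t;
        });
        for (String id : containerIds) {
            pool.execute(() -> {
                try {
                    stopper.stop(id);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown(); // accept no new work; queued stops still run
        try {
            if (pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow(); // give up on stragglers instead of blocking shutdown
        return false;
    }
}
```

The caller would treat a `false` result as "some containers may still be running" and proceed with shutdown anyway. Releasing the containers through the scheduler service, as suggested above, avoids the per-container stop calls altogether.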


> Avoid stopping containers on the AM shutdown thread
> ---
>
> Key: TEZ-3128
> URL: https://issues.apache.org/jira/browse/TEZ-3128
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.0-alpha
>Reporter: Siddharth Seth
>Assignee: Tsuyoshi Ozawa
>  Labels: newbie
> Attachments: TEZ-3128.001.patch
>
>
> During an AM shutdown, the TaskCommunicator is also shutdown and it tries to 
> stop containers in the shutdown thread itself. This can cause the AM shutdown 
> to block if NMs are not available.
> This likely affects 0.7 as well.





[jira] [Updated] (TEZ-2863) Container, node, and logs not available in UI for tasks that fail to launch

2016-02-22 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-2863:
-
Attachment: TEZ-2863.3.patch
TEZ-2863.3-branch-0.7.patch

> Container, node, and logs not available in UI for tasks that fail to launch
> ---
>
> Key: TEZ-2863
> URL: https://issues.apache.org/jira/browse/TEZ-2863
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: TEZ-2863.1.patch, TEZ-2863.2-branch-0.7.patch, 
> TEZ-2863.2.patch, TEZ-2863.3-branch-0.7.patch, TEZ-2863.3.patch
>
>
> While running a sample tez job
> {noformat}
> tez-examples-*.jar orderedwordcount -Dtez.task.resource.memory.mb=1 
> -Dtez.task.launch.cmd-opts="-Xmx1m" input output
> {noformat}
> It was noticed that the Tez UI task attempt 
> http://timelineserverhost:port/ws/v1/timeline/TEZ_TASK_ATTEMPT_ID/attempt_id 
> was missing the TEZ_ATTEMPT_STARTED event
> {noformat}
> 2015-10-01 10:03:55,344 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1443711816411_0001_1][Event:TASK_STARTED]: 
> vertexName=Tokenizer, taskId=task_1443711816411_0001_1_00_00, 
> scheduledTime=1443711835342, launchTime=1443711835342
> 2015-10-01 10:03:55,346 [INFO] [Dispatcher thread {Central}] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:55,356 [INFO] [TaskSchedulerEventHandlerThread] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:55,364 [INFO] [TaskSchedulerEventHandlerThread] 
> |rm.YarnTaskSchedulerService|: Allocation request for task: 
> attempt_1443711816411_0001_1_00_00_0 with request: Capability[ vCores:1>]Priority[2] host: localhost rack: null
> 2015-10-01 10:03:56,639 [INFO] [AMRM Heartbeater thread] 
> |impl.AMRMClientImpl|: Received new token for : localhost:57381
> 2015-10-01 10:03:56,646 [INFO] [AMRM Callback Handler Thread] 
> |util.RackResolver|: Resolved localhost to /default-rack
> 2015-10-01 10:03:56,648 [INFO] [DelayedContainerManager] 
> |rm.YarnTaskSchedulerService|: Assigning container to task: 
> containerId=container_1443711816411_0001_01_02, 
> task=attempt_1443711816411_0001_1_00_00_0, containerHost=localhost:57381, 
> containerPriority= 2, containerResources=, 
> localityMatchType=NodeLocal, matchedLocation=localhost, 
> honorLocalityFlags=true, reusedContainer=false, delayedContainers=0
> 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: 
> Resolved localhost to /default-rack
> 2015-10-01 10:03:56,649 [INFO] [DelayedContainerManager] |util.RackResolver|: 
> Resolved localhost to /default-rack
> 2015-10-01 10:03:56,686 [INFO] [TaskSchedulerAppCaller #0] 
> |node.AMNodeTracker|: Adding new node: localhost:57381
> 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] 
> |launcher.ContainerLauncherImpl|: Launching 
> container_1443711816411_0001_01_02
> 2015-10-01 10:03:56,700 [INFO] [ContainerLauncher #0] 
> |impl.ContainerManagementProtocolProxy|: Opening proxy : localhost:57381
> 2015-10-01 10:03:56,741 [INFO] [ContainerLauncher #0] 
> |history.HistoryEventHandler|: [HISTORY][DAG:N/A][Event:CONTAINER_LAUNCHED]: 
> containerId=container_1443711816411_0001_01_02, launchTime=1443711836741
> 2015-10-01 10:03:57,647 [INFO] [AMRM Callback Handler Thread] 
> |rm.YarnTaskSchedulerService|: Allocated container 
> completed:container_1443711816411_0001_01_02 last allocated to task: 
> attempt_1443711816411_0001_1_00_00_0
> 2015-10-01 10:03:57,648 [INFO] [Dispatcher thread {Central}] 
> |container.AMContainerImpl|: Container container_1443711816411_0001_01_02 
> exited with diagnostics set to Container failed, exitCode=1. Exception from 
> container-launch.
> Container id: container_1443711816411_0001_01_02
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1: 
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>   at org.apache.hadoop.util.Shell.run(Shell.java:455)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156775#comment-15156775
 ] 

TezQA commented on TEZ-3124:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12788977/TEZ-3124-3.patch
  against master revision f38e23c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 20 javac 
compiler warnings (more than the master's current 19 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestFaultTolerance
  org.apache.tez.dag.app.dag.impl.TestDAGImpl

  The following test timeouts occurred in :
 org.apache.tez.test.TestRecovery

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1494//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1494//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1494//console

This message is automatically generated.

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>  Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [

Failed: TEZ-3124 PreCommit Build #1494

2016-02-22 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3124
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1494/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3615 lines...]
[INFO] Build failures were ignored.






==
==
Adding comment to Jira.
==
==


Comment added.
8a381516ce563c7dc864f558d8ae00cea0b7f634 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
7 tests failed.
FAILED:  org.apache.tez.dag.app.dag.impl.TestDAGImpl.testCounterLimits

Error Message:
expected: but was:

Stack Trace:
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at org.apache.tez.dag.app.dag.impl.TestDAGImpl.testCounterLimits(TestDAGImpl.java:2290)


FAILED:  org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at org.apache.tez.test.TestFaultTolerance.testBasicInputFailureWithExit(TestFaultTolerance.java:261)


FAILED:  org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices

Error Message:
TezSession has already shutdown. No cluster diagnostics found.

Stack Trace:
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. No cluster diagnostics found.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:784)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:129)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:124)
at org.apache.tez.test.TestFaultTolerance.runDAGAndVerify(TestFaultTolerance.java:120)
at org.apache.tez.test.TestFaultTolerance.testInputFailureRerunCanSendOutputToTwoDownstreamVertices(TestFaultTolerance.java:703)


FAILE

[jira] [Comment Edited] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156668#comment-15156668
 ] 

Jeff Zhang edited comment on TEZ-3124 at 2/22/16 9:22 AM:
--

Found the root cause: there are multiple VertexInitializedEvents in the case 
of multiple rounds of recovery, and initGeneratedEvents is not restored while 
recovering. This causes the second round of recovery to get 0 
initGeneratedEvents.
{noformat}
2016-02-09 04:48:37,175 [INFO] [main] |app.RecoveryParser|: Recovering from 
event, eventType=VERTEX_INITIALIZED, event=vertexName=map, 
vertexId=vertex_1454993155302_0001_1_00, initRequestedTime=1454993277903, 
initedTime=1454993194025, numTasks=90, processorName=null, 
additionalInputsCount=1, initGeneratedEventsCount=0
{noformat}

Attached a patch. [~bikassaha], please help review.

* log VertexInitializedEvent only once
* add a test case for multiple rounds of recovery


was (Author: zjffdu):
Find the root cause, this is due to there's multiple VertexInitializedEvent in 
the case of multiple rounds of recovering.  And initGeneratedEvents is not 
restored in VertexInitializedEvent if it is in recovering. The cause the second 
round of recovering get 0 initGeneratedEvents  
{noformat}
2016-02-09 04:48:37,175 [INFO] [main] |app.RecoveryParser|: Recovering from 
event, eventType=VERTEX_INITIALIZED, event=vertexName=map, 
vertexId=vertex_1454993155302_0001_1_00, initRequestedTime=1454993277903, 
initedTime=1454993194025, numTasks=90, processorName=null, 
additionalInputsCount=1, initGeneratedEventsCount=0
{noformat}

Attach one patch, [~bikassaha] Please help review.

* log VertexInitializedEvent only once
* add multiple rounds recoverying test case

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>  Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput bein

[jira] [Commented] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156668#comment-15156668
 ] 

Jeff Zhang commented on TEZ-3124:
-

Found the root cause: there are multiple VertexInitializedEvents in the case 
of multiple rounds of recovery, and initGeneratedEvents is not restored in 
VertexInitializedEvent while recovering. This causes the second round of 
recovery to get 0 initGeneratedEvents.
{noformat}
2016-02-09 04:48:37,175 [INFO] [main] |app.RecoveryParser|: Recovering from 
event, eventType=VERTEX_INITIALIZED, event=vertexName=map, 
vertexId=vertex_1454993155302_0001_1_00, initRequestedTime=1454993277903, 
initedTime=1454993194025, numTasks=90, processorName=null, 
additionalInputsCount=1, initGeneratedEventsCount=0
{noformat}

Attached a patch. [~bikassaha], please help review.

* log VertexInitializedEvent only once
* add a test case for multiple rounds of recovery
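
The failure mode described above can be illustrated with a toy model. This is a simplified sketch, not the Tez recovery code: `RecoveryLogSketch`, `VertexInitRecord`, `recover`, and `runAttempt` are hypothetical names, and the rule that the last logged init record wins is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a recovery log; not the Tez classes.
final class RecoveryLogSketch {

    /** Stand-in for one logged VERTEX_INITIALIZED record. */
    static final class VertexInitRecord {
        final int initGeneratedEventsCount;

        VertexInitRecord(int initGeneratedEventsCount) {
            this.initGeneratedEventsCount = initGeneratedEventsCount;
        }
    }

    /** Assumed parser rule: the last init record in the log wins. */
    static int recover(List<VertexInitRecord> log) {
        int count = 0;
        for (VertexInitRecord record : log) {
            count = record.initGeneratedEventsCount;
        }
        return count;
    }

    /**
     * Simulates one AM attempt and returns how many init-generated events the
     * vertex sees. With logOnlyOnce=false, a recovering attempt logs the init
     * record again, but without the events it recovered.
     */
    static int runAttempt(List<VertexInitRecord> log, boolean logOnlyOnce) {
        if (log.isEmpty()) {
            // First attempt: the initializer generates one event and logs it.
            log.add(new VertexInitRecord(1));
            return 1;
        }
        int restored = recover(log);
        if (!logOnlyOnce) {
            // Modeled bug: the re-logged record carries 0 generated events.
            log.add(new VertexInitRecord(0));
        }
        return restored;
    }
}
```

In this model, three attempts with `logOnlyOnce=false` see 1, 1, and then 0 events, which mirrors the `initGeneratedEventsCount=0` in the log above. With `logOnlyOnce=true` every attempt sees 1.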

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>  Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running 
> task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: 
> attempt_1454993155302_0001_1_00_61_3_10001
> 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 
> using: memoryMb=1646, keySerializerClass=

[jira] [Updated] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-3124:

Attachment: TEZ-3124-3.patch

> Running task hangs due to missing event to initialize input in recovery
> ---
>
> Key: TEZ-3124
> URL: https://issues.apache.org/jira/browse/TEZ-3124
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.2
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>  Labels: Recovery
> Fix For: 0.8.3
>
> Attachments: TEZ-3124-1.patch, TEZ-3124-2.patch, TEZ-3124-3.patch, 
> a.log
>
>
> {noformat}
> 2016-02-09 04:48:42 Starting to run new task attempt: 
> attempt_1454993155302_0001_1_00_61_3
> /attempt_1454993155302_0001_1_00_61
> 2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
> numPhysicalInputs=1
> 2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
> |input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
> 2016-02-09 04:48:43,333 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
> 2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
> InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy],
>  
> [ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
> 2016-02-09 04:48:43,559 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: 
> ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
> 2016-02-09 04:48:43,563 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
> AdditionalReservationFractionForIOs=0.03, 
> finalReserveFractionUsed=0.32996
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 
> 2, numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
> 1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
> TotalRequested/TotalJVMHeap:0.70
> 2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
> Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
> [ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
> 2016-02-09 04:48:43,564 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
> 2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
> |runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
> 2016-02-09 04:48:43,572 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
> the framework. Subsequent instances will not be auto-started
> 2016-02-09 04:48:43,573 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
> 2016-02-09 04:48:43,574 [INFO] [TezChild] 
> |runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running 
> task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3
> 2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: 
> attempt_1454993155302_0001_1_00_61_3_10001
> 2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 
> using: memoryMb=1646, keySerializerClass=class 
> org.apache.hadoop.io.IntWritable, 
> valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6,
>  comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, 
> partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, 
> serialization=org.apache.hadoop.io.serializer.WritableSerialization
> 2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up 
> PipelinedSorter for ireduce1: , UsingHashComparator=false
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly 
> allocated block size=1725956096, index=0, Number of buffers=1, 
> currentAllocatableMemory=0, currentBufferSize=1725956096, total=1725956096
> 2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Pre 
> alloca

[jira] [Updated] (TEZ-3124) Running task hangs due to missing event to initialize input in recovery

2016-02-22 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-3124:

Description: 
{noformat}
2016-02-09 04:48:42 Starting to run new task attempt: 
attempt_1454993155302_0001_1_00_61_3
/attempt_1454993155302_0001_1_00_61
2016-02-09 04:48:43,196 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
|input.MRInput|: MRInput using newmapreduce API=true, split via event=true, 
numPhysicalInputs=1
2016-02-09 04:48:43,200 [INFO] [I/O Setup 0 Initialize: {MRInput}] 
|input.MRInputLegacy|: MRInput MRInputLegacy deferring initialization
2016-02-09 04:48:43,333 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: Initialized processor
2016-02-09 04:48:43,333 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: Waiting for 2 initializers to finish
2016-02-09 04:48:43,333 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 initializers to finish
2016-02-09 04:48:43,333 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: All initializers finished
2016-02-09 04:48:43,345 [INFO] [TezChild] |resources.MemoryDistributor|: 
InitialRequests=[MRInput:INPUT:0:org.apache.tez.mapreduce.input.MRInputLegacy], 
[ireduce1:OUTPUT:1802502144:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput]
2016-02-09 04:48:43,559 [INFO] [TezChild] 
|resources.WeightedScalingMemoryDistributor|: 
ScaleRatiosUsed=[PARTITIONED_UNSORTED_OUTPUT:1][UNSORTED_OUTPUT:1][UNSORTED_INPUT:1][SORTED_OUTPUT:12][SORTED_MERGED_INPUT:12][PROCESSOR:1][OTHER:1]
2016-02-09 04:48:43,563 [INFO] [TezChild] 
|resources.WeightedScalingMemoryDistributor|: InitialReservationFraction=0.3, 
AdditionalReservationFractionForIOs=0.03, 
finalReserveFractionUsed=0.32996
2016-02-09 04:48:43,564 [INFO] [TezChild] 
|resources.WeightedScalingMemoryDistributor|: Scaling Requests. NumRequests: 2, 
numScaledRequests: 13, TotalRequested: 1802502144, TotalRequestedScaled: 
1.663848132923077E9, TotalJVMHeap: 2577399808, TotalAvailable: 1726857871, 
TotalRequested/TotalJVMHeap:0.70
2016-02-09 04:48:43,564 [INFO] [TezChild] |resources.MemoryDistributor|: 
Allocations=[MRInput:org.apache.tez.mapreduce.input.MRInputLegacy:INPUT:0:0], 
[ireduce1:org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:OUTPUT:1802502144:1726857871]
2016-02-09 04:48:43,564 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: Starting Inputs/Outputs
2016-02-09 04:48:43,572 [INFO] [I/O Setup 1 Start: {MRInput}] 
|runtime.LogicalIOProcessorRuntimeTask|: Started Input with src edge: MRInput
2016-02-09 04:48:43,572 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: Input: MRInput being auto started by 
the framework. Subsequent instances will not be auto-started
2016-02-09 04:48:43,573 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: Num IOs determined for AutoStart: 1
2016-02-09 04:48:43,574 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: Waiting for 1 IOs to start
2016-02-09 04:48:43,574 [INFO] [TezChild] 
|runtime.LogicalIOProcessorRuntimeTask|: AutoStartComplete
2016-02-09 04:48:43,583 [INFO] [TezChild] |task.TaskRunner2Callable|: Running 
task, taskAttemptId=attempt_1454993155302_0001_1_00_61_3
2016-02-09 04:48:43,583 [INFO] [TezChild] |map.MapProcessor|: Running map: 
attempt_1454993155302_0001_1_00_61_3_10001
2016-02-09 04:48:43,675 [INFO] [TezChild] |impl.ExternalSorter|: ireduce1 
using: memoryMb=1646, keySerializerClass=class 
org.apache.hadoop.io.IntWritable, 
valueSerializerClass=org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer@5f143de6,
 comparator=org.apache.hadoop.io.IntWritable$Comparator@ec52d1f, 
partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, 
serialization=org.apache.hadoop.io.serializer.WritableSerialization
2016-02-09 04:48:43,686 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up 
PipelinedSorter for ireduce1: , UsingHashComparator=false
2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Newly 
allocated block size=1725956096, index=0, Number of buffers=1, 
currentAllocatableMemory=0, currentBufferSize=1725956096, total=1725956096
2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Pre 
allocating rest of memory buffers upfront
2016-02-09 04:48:45,093 [INFO] [TezChild] |impl.PipelinedSorter|: Setting up 
PipelinedSorter for ireduce1: , UsingHashComparator=false#blocks=1, 
maxMemUsage=1725956096, lazyAllocateMem=false, minBlockSize=2097152000, initial 
BLOCK_SIZE=1725956096, finalMergeEnabled=true, pipelinedShuffle=false, 
sendEmptyPartitions=true, tez.runtime.io.sort.mb=1646
2016-02-09 04:48:45,099 [INFO] [TezChild] |impl.PipelinedSorter|: ireduce1: 
reserved.remaining()=1725956096, reserved.metasize=16777216
2016-02-09 04:48:45,175 [INFO] [TezChild] |input.MRInput|: Initialized MRInput: 
MRInput
2016-02-09 08:55:40,790 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: 
Received should die res