date:20150323


[ 
https://issues.apache.org/jira/browse/TEZ-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375762#comment-14375762
 ] 

Hadoop QA commented on TEZ-2196:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706508/TEZ-2196.3.patch
  against master revision aa784be.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/330//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/330//console

This message is automatically generated.

 Consider reusing UnorderedPartitionedKVWriter with single output in 
 UnorderedKVOutput
 -

 Key: TEZ-2196
 URL: https://issues.apache.org/jira/browse/TEZ-2196
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2196.1.patch, TEZ-2196.2.patch, TEZ-2196.3.patch


 Can possibly get rid of FileBasedKVWriter and reuse 
 UnorderedPartitionedKVWriter with single partition in UnorderedKVOutput.  
 This can also benefit from pipelined shuffle changes done in 
 UnorderedPartitionedKVWriter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Success: TEZ-2196 PreCommit Build #330

Jira: https://issues.apache.org/jira/browse/TEZ-2196
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/330/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2762 lines...]
[INFO] Final Memory: 79M/900M
[INFO] 






==
==
Adding comment to Jira.
==
==


Comment added.
1e29b7fd5ef921eb35a85f522e7ef6d3951966d8 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #329
Archived 44 artifacts
Archive block size is 32768
Received 6 blocks and 2526255 bytes
Compression is 7.2%
Took 1 sec
Description set: TEZ-2196
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue


[ 
https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377301#comment-14377301
 ] 

Hitesh Shah commented on TEZ-2221:
--

What happens if someone does the following?

{code}
dag.createVertexGroup("group_1", v1, v2);
dag.createVertexGroup("group_2", v1, v2);
{code}

This should also be disallowed. Correct?

 VertexGroup name should be unqiue
 -

 Key: TEZ-2221
 URL: https://issues.apache.org/jira/browse/TEZ-2221
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-2221-1.patch


 VertexGroupCommitStartedEvent & VertexGroupCommitFinishedEvent use the vertex 
 group name to identify the vertex group commit, so vertex groups that share a 
 name will conflict. The current equals & hashCode of VertexGroup, however, use 
 both the vertex group name and the member names.




[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue


[ 
https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377327#comment-14377327
 ] 

Hitesh Shah commented on TEZ-2221:
--

Sorry - should have clarified. The test is being changed to not test that 
condition.

 VertexGroup name should be unqiue
 -

 Key: TEZ-2221
 URL: https://issues.apache.org/jira/browse/TEZ-2221
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-2221-1.patch


 VertexGroupCommitStartedEvent & VertexGroupCommitFinishedEvent use the vertex 
 group name to identify the vertex group commit, so vertex groups that share a 
 name will conflict. The current equals & hashCode of VertexGroup, however, use 
 both the vertex group name and the member names.




[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue


[ 
https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377330#comment-14377330
 ] 

Jeff Zhang commented on TEZ-2221:
-

In the previous test case we compared vertex groups by both group_name and 
members. I changed the test case to reflect that we now compare by group name 
only.
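
As a rough illustration of comparing groups by name only, the sketch below uses two hypothetical classes, NamedGroup and GroupRegistry. They are stand-ins invented for this example and are not the Tez VertexGroup or DAG classes. The sketch only shows name-based equals/hashCode rejecting a reused name; it does not address the separate question of two differently named groups that have the same members.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a vertex group; not the Tez VertexGroup class.
final class NamedGroup {
    private final String name;
    private final Set<String> members;

    NamedGroup(String name, String... members) {
        this.name = Objects.requireNonNull(name);
        this.members = new HashSet<>(Arrays.asList(members));
    }

    Set<String> members() {
        return members;
    }

    // Identity depends on the group name only; members are ignored.
    @Override
    public boolean equals(Object o) {
        return o instanceof NamedGroup && name.equals(((NamedGroup) o).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

// Hypothetical registry that refuses a second group with an existing name.
final class GroupRegistry {
    private final Set<NamedGroup> groups = new HashSet<>();

    NamedGroup create(String name, String... members) {
        NamedGroup group = new NamedGroup(name, members);
        if (!groups.add(group)) {
            throw new IllegalStateException("Duplicate group name: " + name);
        }
        return group;
    }
}
```

With this shape, a second create("group_1", ...) call fails regardless of which members are passed.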


 VertexGroup name should be unqiue
 -

 Key: TEZ-2221
 URL: https://issues.apache.org/jira/browse/TEZ-2221
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-2221-1.patch


 VertexGroupCommitStartedEvent & VertexGroupCommitFinishedEvent use the vertex 
 group name to identify the vertex group commit, so vertex groups that share a 
 name will conflict. The current equals & hashCode of VertexGroup, however, use 
 both the vertex group name and the member names.




[jira] [Commented] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch


[ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377280#comment-14377280
 ] 

Hitesh Shah commented on TEZ-1421:
--

[~ozawa] In that case (given that the solution seems to be non-trivial), I think 
we can move the target version to 0.7.0 given that not many other folks have 
reported this issue. Agree?

 MRCombiner throws NPE in MapredWordCount on master branch
 -

 Key: TEZ-1421
 URL: https://issues.apache.org/jira/browse/TEZ-1421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Tsuyoshi Ozawa
Assignee: Tsuyoshi Ozawa
Priority: Blocker

 I tested MapredWordCount against 70GB generated by RandomTextWriter. When a 
 Combiner runs, it throws NPE. It looks like setCombinerClass doesn't work 
 correctly.
 {quote}
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:122)
 at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:112)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.runCombineProcessor(MergeManager.java:472)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager$InMemoryMerger.merge(MergeManager.java:605)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeThread.run(MergeThread.java:89)
 {quote}




[jira] [Issue Comment Deleted] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch

2015-03-23 Thread Tsuyoshi Ozawa (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated TEZ-1421:

Comment: was deleted

(was: [~hitesh] Yes, I agree with you.)

 MRCombiner throws NPE in MapredWordCount on master branch
 -

 Key: TEZ-1421
 URL: https://issues.apache.org/jira/browse/TEZ-1421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Tsuyoshi Ozawa
Assignee: Tsuyoshi Ozawa
Priority: Blocker

 I tested MapredWordCount against 70GB generated by RandomTextWriter. When a 
 Combiner runs, it throws NPE. It looks like setCombinerClass doesn't work 
 correctly.
 {quote}
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:122)
 at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:112)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.runCombineProcessor(MergeManager.java:472)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager$InMemoryMerger.merge(MergeManager.java:605)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeThread.run(MergeThread.java:89)
 {quote}




[jira] [Commented] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch

2015-03-23 Thread Tsuyoshi Ozawa (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377293#comment-14377293
 ] 

Tsuyoshi Ozawa commented on TEZ-1421:
-

[~hitesh] Yes, I agree with you.

 MRCombiner throws NPE in MapredWordCount on master branch
 -

 Key: TEZ-1421
 URL: https://issues.apache.org/jira/browse/TEZ-1421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Tsuyoshi Ozawa
Assignee: Tsuyoshi Ozawa
Priority: Blocker

 I tested MapredWordCount against 70GB generated by RandomTextWriter. When a 
 Combiner runs, it throws NPE. It looks like setCombinerClass doesn't work 
 correctly.
 {quote}
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:122)
 at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:112)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.runCombineProcessor(MergeManager.java:472)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager$InMemoryMerger.merge(MergeManager.java:605)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeThread.run(MergeThread.java:89)
 {quote}




[jira] [Commented] (TEZ-1909) Remove need to copy over all events from attempt 1 to attempt 2 dir


[ 
https://issues.apache.org/jira/browse/TEZ-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377297#comment-14377297
 ] 

Hitesh Shah commented on TEZ-1909:
--

Comments:

{code}
LOG.warn("Other recovery files will be skipped due to error in the previous recovery file");
{code}
  - please add the file name to this line as well as its length 

For TEZ_AM_RECOVERY_HANDLE_REMAINING_EVENT_WHEN_STOPPED, maybe change to 
TEZ_TEST_... and likewise change property value. No scope defined? 

It seems like the patch for this jira has been merged with fixes for a 
different jira? Can these be separated out? 




 Remove need to copy over all events from attempt 1 to attempt 2 dir
 ---

 Key: TEZ-1909
 URL: https://issues.apache.org/jira/browse/TEZ-1909
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Jeff Zhang
 Attachments: TEZ-1909-1.patch, TEZ-1909-2.patch, TEZ-1909-3.patch


 Use of file versions should prevent the need for copying over data into a 
 second attempt dir. Care needs to be taken to handle last corrupt record 
 handling. 




[jira] [Comment Edited] (TEZ-1909) Remove need to copy over all events from attempt 1 to attempt 2 dir

[
https://issues.apache.org/jira/browse/TEZ-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377297#comment-14377297
]

Hitesh Shah edited comment on TEZ-1909 at 3/24/15 5:07 AM:
---

Comments:

{code}
LOG.warn("Other recovery files will be skipped due to error in the previous recovery file");
{code}
- please add the file name to this log line

For TEZ_AM_RECOVERY_HANDLE_REMAINING_EVENT_WHEN_STOPPED, maybe change to
TEZ_TEST_... and likewise change property value. No scope defined?

It seems like the patch for this jira has been merged with fixes for a
different jira? Can these be separated out?

was (Author: hitesh):
Comments:

{code}
LOG.warn("Other recovery files will be skipped due to error in the previous recovery file");
{code}
- please add the file name to this line as well as its length

For TEZ_AM_RECOVERY_HANDLE_REMAINING_EVENT_WHEN_STOPPED, maybe change to
TEZ_TEST_... and likewise change property value. No scope defined?

It seems like the patch for this jira has been merged with fixes for a
different jira? Can these be separated out?

Remove need to copy over all events from attempt 1 to attempt 2 dir
---

Key: TEZ-1909
URL: https://issues.apache.org/jira/browse/TEZ-1909
Project: Apache Tez
Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Jeff Zhang
Attachments: TEZ-1909-1.patch, TEZ-1909-2.patch, TEZ-1909-3.patch

Use of file versions should prevent the need for copying over data into a
second attempt dir. Care needs to be taken to handle last corrupt record
handling.


[jira] [Updated] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch


 [ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-1421:
-
Target Version/s: 0.7.0  (was: 0.6.1)

 MRCombiner throws NPE in MapredWordCount on master branch
 -

 Key: TEZ-1421
 URL: https://issues.apache.org/jira/browse/TEZ-1421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Tsuyoshi Ozawa
Assignee: Tsuyoshi Ozawa
Priority: Blocker

 I tested MapredWordCount against 70GB generated by RandomTextWriter. When a 
 Combiner runs, it throws NPE. It looks like setCombinerClass doesn't work 
 correctly.
 {quote}
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:122)
 at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:112)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.runCombineProcessor(MergeManager.java:472)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager$InMemoryMerger.merge(MergeManager.java:605)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeThread.run(MergeThread.java:89)
 {quote}




[jira] [Commented] (TEZ-2204) TestAMRecovery increasingly flaky on jenkins builds.


[ 
https://issues.apache.org/jira/browse/TEZ-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377310#comment-14377310
 ] 

Hitesh Shah commented on TEZ-2204:
--

Comments:

{code}
// Don't handle events if DAGAppMaster is in the STOPPED state,
// otherwise a deadlock may occur.  TEZ-2204
if (DAGAppMaster.this.getServiceState() == STATE.STOPPED) {
  return;
}
{code}

Can you add a log message to identify what events are being received after the 
AM is stopped? 

+1 after the above comment is addressed. 
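
One way the requested log message could look, as a minimal sketch. StoppedAwareHandler, its State enum, the string event types, and the use of System.err are all assumptions made for illustration; the real check lives in DAGAppMaster and would go through the project's logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an event handler; names are not from the Tez code base.
final class StoppedAwareHandler {
    enum State { STARTED, STOPPED }

    private volatile State state = State.STARTED;
    final List<String> handled = new ArrayList<>();
    final List<String> dropped = new ArrayList<>();

    void stop() {
        state = State.STOPPED;
    }

    void handle(String eventType) {
        if (state == State.STOPPED) {
            // Record which event arrived after the stop before ignoring it.
            dropped.add(eventType);
            System.err.println("Ignoring event received after stop: " + eventType);
            return;
        }
        handled.add(eventType);
    }
}
```

Logging the event type before returning keeps the early exit but leaves a trace of what was skipped.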

 TestAMRecovery increasingly flaky on jenkins builds. 
 -

 Key: TEZ-2204
 URL: https://issues.apache.org/jira/browse/TEZ-2204
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Jeff Zhang
 Attachments: TEZ-2204-1.patch, TEZ-2204-2.patch, TEZ-2204-3.patch, 
 TEZ-2204-4.patch


 In recent pre-commit builds and daily builds, there seem to have been some 
 occurrences of TestAMRecovery failing or timing out. 




[jira] [Updated] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch


 [ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-1421:
-
Priority: Critical  (was: Blocker)

 MRCombiner throws NPE in MapredWordCount on master branch
 -

 Key: TEZ-1421
 URL: https://issues.apache.org/jira/browse/TEZ-1421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Tsuyoshi Ozawa
Assignee: Tsuyoshi Ozawa
Priority: Critical

 I tested MapredWordCount against 70GB generated by RandomTextWriter. When a 
 Combiner runs, it throws NPE. It looks like setCombinerClass doesn't work 
 correctly.
 {quote}
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:122)
 at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:112)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.runCombineProcessor(MergeManager.java:472)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager$InMemoryMerger.merge(MergeManager.java:605)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeThread.run(MergeThread.java:89)
 {quote}




[jira] [Updated] (TEZ-2204) TestAMRecovery increasingly flaky on jenkins builds.


 [ 
https://issues.apache.org/jira/browse/TEZ-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2204:
-
Target Version/s: 0.7.0  (was: 0.5.4)

 TestAMRecovery increasingly flaky on jenkins builds. 
 -

 Key: TEZ-2204
 URL: https://issues.apache.org/jira/browse/TEZ-2204
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Jeff Zhang
 Attachments: TEZ-2204-1.patch, TEZ-2204-2.patch, TEZ-2204-3.patch, 
 TEZ-2204-4.patch


 In recent pre-commit builds and daily builds, there seem to have been some 
 occurrences of TestAMRecovery failing or timing out. 




[jira] [Commented] (TEZ-2221) VertexGroup name should be unqiue


[ 
https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377323#comment-14377323
 ] 

Jeff Zhang commented on TEZ-2221:
-

bq. This should also be disallowed. Correct?
Yes, it is not allowed. 

 VertexGroup name should be unqiue
 -

 Key: TEZ-2221
 URL: https://issues.apache.org/jira/browse/TEZ-2221
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-2221-1.patch


 VertexGroupCommitStartedEvent & VertexGroupCommitFinishedEvent use the vertex 
 group name to identify the vertex group commit, so vertex groups that share a 
 name will conflict. The current equals & hashCode of VertexGroup, however, use 
 both the vertex group name and the member names.




[jira] [Commented] (TEZ-2176) Move all logging to slf4j


[ 
https://issues.apache.org/jira/browse/TEZ-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376493#comment-14376493
 ] 

Bikas Saha commented on TEZ-2176:
-

There should probably be a follow up jira to remove instances of 
LOG.isDebugEnabled() from the code based on 
http://www.slf4j.org/faq.html#logging_performance
[~vasanthkumar] Do you think you can take a crack at it?
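
The linked FAQ entry argues that parameterized messages make most explicit level guards unnecessary, because the message is only assembled when the level is enabled. The sketch below imitates that behaviour with a hypothetical MiniLogger so that it runs without any dependency. It is not the SLF4J API, and CountingValue exists only to make the skipped toString() call visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mimics the "{}" placeholder style; it is NOT the SLF4J API.
final class MiniLogger {
    private final boolean debugEnabled;
    final List<String> lines = new ArrayList<>();

    MiniLogger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    // Formats only when debug is on, so callers need no explicit guard.
    void debug(String pattern, Object... args) {
        if (!debugEnabled) {
            return; // args are never converted to strings
        }
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = pattern.indexOf("{}", from);
            if (at < 0) {
                break;
            }
            sb.append(pattern, from, at).append(arg);
            from = at + 2;
        }
        lines.add(sb.append(pattern.substring(from)).toString());
    }
}

// Counts toString() calls so the saved work is observable.
final class CountingValue {
    int toStringCalls = 0;

    @Override
    public String toString() {
        toStringCalls++;
        return "value";
    }
}
```

A guard can still be worthwhile when computing an argument is itself expensive, since arguments are evaluated before the call is made.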

 Move all logging to slf4j
 -

 Key: TEZ-2176
 URL: https://issues.apache.org/jira/browse/TEZ-2176
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Vasanth kumar RJ
 Fix For: 0.7.0

 Attachments: TEZ-2176.1.patch, TEZ-2176.2.1.txt, TEZ-2176.2.patch, 
 TEZ-2176.patch


 SLF4J supports a more comprehensive set of APIs - MDC, Formatted strings.
 Also drop commons-logging from the dependency set.




[jira] [Commented] (TEZ-2219) Should verify the input_name/output_name to be unique per vertex


[ 
https://issues.apache.org/jira/browse/TEZ-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375458#comment-14375458
 ] 

Hadoop QA commented on TEZ-2219:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706455/TEZ-2219-3.patch
  against master revision be982af.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/329//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/329//console

This message is automatically generated.

 Should verify the input_name/output_name to be unique per vertex
 

 Key: TEZ-2219
 URL: https://issues.apache.org/jira/browse/TEZ-2219
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-2219-1.txt, TEZ-2219-2.patch, TEZ-2219-3.patch


 RuntimeTask tries to get the Input/Output using the input_name/output_name, so 
 input_name/output_name must be unique per vertex.
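
 A minimal sketch of such a check, assuming the names of one vertex are available as a list. NameValidator and checkUnique are hypothetical names chosen for this example and are not Tez APIs; the actual patch may validate at a different point.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; the class and method names are illustrative, not Tez APIs.
final class NameValidator {
    // Throws if any name appears more than once within the given vertex.
    static void checkUnique(String vertexName, String kind, List<String> names) {
        Set<String> seen = new HashSet<>();
        for (String name : names) {
            if (!seen.add(name)) {
                throw new IllegalArgumentException(
                    "Duplicate " + kind + " name '" + name + "' in vertex " + vertexName);
            }
        }
    }
}
```

 Failing early keeps a name-based lookup from silently returning the wrong Input or Output later.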




[jira] [Updated] (TEZ-2186) OOM with a simple scatter gather job with re-use


 [ 
https://issues.apache.org/jira/browse/TEZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2186:
--
Attachment: TEZ-2186-branch-0.6.patch

Looks like I didn't upload the branch-0.6 patch here earlier. 

 OOM with a simple scatter gather job with re-use
 

 Key: TEZ-2186
 URL: https://issues.apache.org/jira/browse/TEZ-2186
 Project: Apache Tez
  Issue Type: Bug
Reporter: Siddharth Seth
Assignee: Rajesh Balamohan
 Fix For: 0.7.0

 Attachments: TEZ-2186-branch-0.6.patch, TEZ-2186.1.patch, 
 TEZ-2186.2.patch, noopexample.txt


 With a no-op scatter gather job, 20K x 2K, on a 20 node cluster with 20 2GB 
 containers per node - reducers end up failing with OOM errors. Haven't been 
 able to generate a heap dump yet. Will add details as they're found. 




[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

[
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375512#comment-14375512
]

Jeff Zhang commented on TEZ-714:

Uploaded a new patch. [~bikassaha] Please help review it.

* Wrap the commit in a CallableEvent in both DAG & Vertex, but still call
abort inline. Making the abort async would complicate the patch, so it
remains a sync call as before.
* Introduce a new state COMMITTING for Vertex & DAG
** Vertex's COMMITTING means the vertex is in the middle of committing. If the
vertex has no committers or TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is
true, the vertex would not go to the COMMITTING state.
** DAG's COMMITTING has 2 cases: one is when
TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is true and all the vertices are
completed; the other is when TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is
false and all the vertices are completed, but some vertex group
committers are still running.
* Regarding the question of why group-commit and non-group commit need
to be differentiated in different transitions: I renamed them to
NonFinalCommitCompletedTransition and FinalCommitCompletetionTransition (maybe
there are better names). The former covers commits when
TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is false and the latter covers commits
when TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is true. The reason I differentiate
them is that for the NonFinalCommitCompletedEvent we need to write a
VertexGroupCommitCompletedEvent to the recovery log, while that is not
necessary for FinalCommitCompletedEvent.
* The unit tests are still not perfect. Currently in DAGImpl/VertexImpl
we run the shared thread pool in the AsyncDispatcher thread (that means the
committer still runs in the AsyncDispatcher thread), so this may hide some
potential issues, and in this thread mode it is not possible to test some
cases such as killing the dag while it is committing. I am trying to think of
ways to simulate the shared thread pool in the unit test.
* For some existing transitions, like (RUNNING to ERROR due to INTERNAL
ERROR), I am not sure why they go to ERROR directly rather than TERMINATING.
Maybe it is to allow the client to get the final status as early as possible.
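
The general shape of moving a commit off the dispatcher thread can be sketched as follows. AsyncCommitSketch, its State enum, and the string event names are hypothetical and do not come from the patch; the sketch only shows a commit handed to a pool, a COMMITTING state held in the meantime, and a completion event routed back through a queue that stands in for the dispatcher.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names come from the Tez patch.
final class AsyncCommitSketch {
    enum State { RUNNING, COMMITTING, SUCCEEDED, FAILED }

    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    // Stands in for the dispatcher's event queue.
    final BlockingQueue<String> events = new LinkedBlockingQueue<>();
    volatile State state = State.RUNNING;

    // Called on the dispatcher thread: hand the commit to the pool and return at once.
    void startCommit(Runnable committer) {
        state = State.COMMITTING;
        pool.execute(() -> {
            try {
                committer.run();
                events.add("COMMIT_COMPLETED");
            } catch (RuntimeException e) {
                events.add("COMMIT_FAILED");
            }
        });
    }

    // Blocks up to five seconds for the next event; returns null on timeout.
    String awaitEvent() {
        try {
            return events.poll(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // Called on the dispatcher thread when the completion event is dequeued.
    void onEvent(String event) {
        state = "COMMIT_COMPLETED".equals(event) ? State.SUCCEEDED : State.FAILED;
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Because the state only changes in onEvent, every transition still happens on the thread that drains the queue, while the committer itself runs elsewhere.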

OutputCommitters should not run in the main AM dispatcher thread

Key: TEZ-714
URL: https://issues.apache.org/jira/browse/TEZ-714
Project: Apache Tez
Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Jeff Zhang
Priority: Critical
Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, Vertex_2.pdf

Follow up jira from TEZ-41.
1) If there are multiple OutputCommitters on a Vertex, they can be run in
parallel.
2) Running an OutputCommitter in the main thread blocks all other event
handling, w.r.t the DAG, and causes the event queue to back up.
3) This should also cover shared commits that happen in the DAG.


[jira] [Commented] (TEZ-2219) Should verify the input_name/output_name to be unique per vertex


[ 
https://issues.apache.org/jira/browse/TEZ-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375485#comment-14375485
 ] 

Jeff Zhang commented on TEZ-2219:
-

Thanks [~hitesh] Committed to master, branch-0.5, branch-0.6

 Should verify the input_name/output_name to be unique per vertex
 

 Key: TEZ-2219
 URL: https://issues.apache.org/jira/browse/TEZ-2219
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Fix For: 0.5.4

 Attachments: TEZ-2219-1.txt, TEZ-2219-2.patch, TEZ-2219-3.patch


 RuntimeTask tries to get the Input/Output using the input_name/output_name, so 
 input_name/output_name must be unique per vertex.




Success: TEZ-2219 PreCommit Build #329

Jira: https://issues.apache.org/jira/browse/TEZ-2219
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/329/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2750 lines...]
[INFO] Final Memory: 70M/979M
[INFO] 






==
==
Adding comment to Jira.
==
==


Comment added.
8158b886c4295bfc0bcb65a420475e2c8d99222b logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #328
Archived 44 artifacts
Archive block size is 32768
Received 24 blocks and 1936384 bytes
Compression is 28.9%
Took 1.5 sec
Description set: TEZ-2219
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread


 [ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-714:
---
Attachment: TEZ-714-2.patch

 OutputCommitters should not run in the main AM dispatcher thread
 

 Key: TEZ-714
 URL: https://issues.apache.org/jira/browse/TEZ-714
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Jeff Zhang
Priority: Critical
 Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, Vertex_2.pdf


 Follow up jira from TEZ-41.
 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
 parallel.
 2) Running an OutputCommitter in the main thread blocks all other event 
 handling, w.r.t the DAG, and causes the event queue to back up.
 3) This should also cover shared commits that happen in the DAG.




[jira] [Comment Edited] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging


[ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376763#comment-14376763
 ] 

Rajesh Balamohan edited comment on TEZ-2214 at 3/23/15 10:21 PM:
-

[~hitesh] - In such cases, the next line inMemoryMerger.waitForMerge() acts 
as the barrier.  It would wait until the existing merge completes (which 
internally releases memory for usedMemory & commitMemory). 


was (Author: rajesh.balamohan):
[~hitesh] - In such cases, the next line inMemoryMerger.waitForMerge() acts 
as the barrier.  It would wait until the existing merging completes (which 
internally releases memory for usedMemory & commitMemory). 

 FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses 
 memToDiskMerging
 --

 Key: TEZ-2214
 URL: https://issues.apache.org/jira/browse/TEZ-2214
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2214.1.patch


 Scenario:
 - commitMemory & usedMemory are beyond their allowed threshold.
 - InMemoryMerge kicks off and is in the process of flushing memory contents 
 to disk
 - As it progresses, it releases memory segments as well (but not yet over).
 - Fetchers who need memory  maxSingleShuffleLimit, get scheduled.
 - If fetchers are fast, this quickly adds up to commitMemory & usedMemory. 
 Since InMemoryMerge is already in progress, this wouldn't trigger another 
 merge().
 - Pretty soon all fetchers would be stalled and get into the following state.
 {noformat}
 Thread 9351: (state = BLOCKED)
  - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be 
 imprecise)
  - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
  - 
 org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.waitForShuffleToMergeMemory()
  @bci=17, line=337 (Interpreted frame)
  - 
 org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run()
  @bci=34, line=157 (Interpreted frame)
 {noformat}
 - Even if InMemoryMerger completes, commitedMem  usedMem are beyond their 
 threshold and no other fetcher threads (all are in stalled state) are there 
 to release memory. This causes fetchers to wait indefinitely.
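
 The accounting in this scenario can be replayed with a toy, single-threaded 
 model. Everything below is hypothetical (class, method and field names, and 
 the simplification that a merge takes ownership of all committed bytes when 
 it starts); it is not the MergeManager code. The recheckAfterMerge flag shows 
 one possible remedy, which is not necessarily what TEZ-2214.1.patch does.

```java
// Toy model of the memory accounting described in the scenario.
// All names are hypothetical; this is not the Tez MergeManager.
final class MergeStallSketch {
  private final long memoryLimit;     // fetchers stall while usedMemory exceeds this
  private final long mergeThreshold;  // a merge starts once commitMemory reaches this
  private final boolean recheckAfterMerge;

  private long usedMemory;
  private long commitMemory;
  private long mergingBytes;          // bytes still held by the in-flight merge
  private boolean mergeInProgress;

  MergeStallSketch(long memoryLimit, long mergeThreshold, boolean recheckAfterMerge) {
    this.memoryLimit = memoryLimit;
    this.mergeThreshold = mergeThreshold;
    this.recheckAfterMerge = recheckAfterMerge;
  }

  /** A fetcher asks for memory; false means it must stall and wait. */
  boolean tryFetch(long size) {
    if (usedMemory > memoryLimit) {
      return false;
    }
    usedMemory += size;
    commitMemory += size;
    maybeStartMerge();
    return true;
  }

  /** The running merge frees part of what it holds before it finishes. */
  void releaseDuringMerge(long bytes) {
    usedMemory -= bytes;
    mergingBytes -= bytes;
  }

  /** The running merge finishes and frees whatever it still holds. */
  void mergeDone() {
    usedMemory -= mergingBytes;
    mergingBytes = 0;
    mergeInProgress = false;
    if (recheckAfterMerge) {
      maybeStartMerge(); // possible remedy: re-evaluate the threshold here
    }
  }

  /** Over the limit with no merge running: nothing is left to free memory. */
  boolean isStuck() {
    return !mergeInProgress && usedMemory > memoryLimit;
  }

  // A new merge is only started when none is running, as in the scenario.
  private void maybeStartMerge() {
    if (!mergeInProgress && commitMemory >= mergeThreshold) {
      mergeInProgress = true;
      mergingBytes = commitMemory;
      commitMemory = 0;
    }
  }
}
```

 With a limit of 100 and a threshold of 60, the sequence tryFetch(70), 
 releaseDuringMerge(60), tryFetch(55), tryFetch(55), mergeDone() leaves the 
 model stuck unless the threshold is re-evaluated when the merge completes.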




[jira] [Commented] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging


[ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376763#comment-14376763
 ] 

Rajesh Balamohan commented on TEZ-2214:
---

[~hitesh] - In such cases, the next line inMemoryMerger.waitForMerge() acts 
as the barrier.  It would wait until the existing merging completes (which 
internally releases memory for usedMemory & commitMemory). 

 FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses 
 memToDiskMerging
 --

 Key: TEZ-2214
 URL: https://issues.apache.org/jira/browse/TEZ-2214
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2214.1.patch


 Scenario:
 - commitMemory & usedMemory are beyond their allowed threshold.
 - InMemoryMerge kicks off and is in the process of flushing memory contents 
 to disk
 - As it progresses, it releases memory segments as well (but not yet over).
 - Fetchers who need memory < maxSingleShuffleLimit get scheduled.
 - If fetchers are fast, this quickly adds up to commitMemory & usedMemory. 
 Since InMemoryMerge is already in progress, this wouldn't trigger another 
 merge().
 - Pretty soon all fetchers would be stalled and get into the following state.
 {noformat}
 Thread 9351: (state = BLOCKED)
  - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
  - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
  - org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.waitForShuffleToMergeMemory() @bci=17, line=337 (Interpreted frame)
  - org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run() @bci=34, line=157 (Interpreted frame)
 {noformat}
 - Even if InMemoryMerger completes, commitedMem & usedMem are beyond their 
 threshold and no other fetcher threads (all are in stalled state) are there 
 to release memory. This causes fetchers to wait indefinitely.




[jira] [Updated] (TEZ-2149) Optimizations for the timed version of DAGClient.getStatus


 [ 
https://issues.apache.org/jira/browse/TEZ-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2149:

Attachment: TEZ-2149.1.txt

Patch adds a notify in the AM to return early instead of the sleep. Also 
changes the waitUntilCompletion methods to use this API instead of an explicit 
sleep.
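
The idea of replacing the sleep with a notify can be sketched with a plain Java 
monitor. This is an illustration under assumptions, not the patch itself: 
TimedStatusSketch, its State enum and the method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a timed status call that returns early on a state change.
final class TimedStatusSketch {
  enum State { RUNNING, SUCCEEDED, FAILED }

  private final Object lock = new Object();
  private State state = State.RUNNING;

  /** Called by whatever drives the DAG; wakes up any waiting status call. */
  void setState(State newState) {
    synchronized (lock) {
      state = newState;
      lock.notifyAll();
    }
  }

  /**
   * Returns as soon as the state is no longer RUNNING, or once timeoutMillis
   * has elapsed, whichever comes first.
   */
  State getStatus(long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    synchronized (lock) {
      // Loop to guard against spurious wakeups.
      while (state == State.RUNNING) {
        long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
        if (remainingMillis <= 0) {
          break; // timed out; report the current state
        }
        try {
          lock.wait(remainingMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          break;
        }
      }
      return state;
    }
  }
}
```

Compared with a fixed sleep, a caller that passes a long timeout still gets the 
final state almost immediately once setState is invoked.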

[~bikassaha], [~hitesh], [~pramachandran] - please review.

 Optimizations for the timed version of DAGClient.getStatus
 --

 Key: TEZ-2149
 URL: https://issues.apache.org/jira/browse/TEZ-2149
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-2149.1.txt


 From 
 https://issues.apache.org/jira/browse/TEZ-1967?focusedCommentId=14325037&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14325037
 - The sleep within the AM can be improved via monitors.
 - INITED state is returned when communicating with the AM, SUBMITTED state is 
 returned when communicating with the RM. That could be used to optimize the 
 flow.




[jira] [Created] (TEZ-2222) Investigate moving to log4j2 for logging

Siddharth Seth created TEZ-2222:
---

 Summary: Investigate moving to log4j2 for logging
 Key: TEZ-2222
 URL: https://issues.apache.org/jira/browse/TEZ-2222
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth


Via slf4j.

Some bits to keep in mind:
- We have explicit code which rotates logs using direct log4j12 APIs. This 
should keep working. I believe the log4j2 APIs are different here.
- API compatibility between log4j12 and log4j2 can be problematic if both end 
up on the classpath (I believe the APIs are different).
- Hadoop dist includes a slf4j-log4j12 binding. Changing the default can result 
in slf4j-log4j12 and slf4j-log4j2 co-existing by default, which could be 
problematic. Needs investigation.

At the end of the day, we will likely need an option to use either of the two.




[jira] [Commented] (TEZ-2149) Optimizations for the timed version of DAGClient.getStatus


[ 
https://issues.apache.org/jira/browse/TEZ-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376778#comment-14376778
 ] 

Hadoop QA commented on TEZ-2149:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706725/TEZ-2149.1.txt
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 186 javac 
compiler warnings (more than the master's current 180 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/331//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/331//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/331//console

This message is automatically generated.

 Optimizations for the timed version of DAGClient.getStatus
 --

 Key: TEZ-2149
 URL: https://issues.apache.org/jira/browse/TEZ-2149
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-2149.1.txt


 From 
 https://issues.apache.org/jira/browse/TEZ-1967?focusedCommentId=14325037&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14325037
 - The sleep within the AM can be improved via monitors.
 - INITED state is returned when communicating with the AM, SUBMITTED state is 
 returned when communicating with the RM. That could be used to optimize the 
 flow.




Failed: TEZ-2149 PreCommit Build #331

Jira: https://issues.apache.org/jira/browse/TEZ-2149
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/331/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2762 lines...]




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706725/TEZ-2149.1.txt
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 186 javac 
compiler warnings (more than the master's current 180 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/331//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/331//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/331//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
51847fbdaa19c11add5148b625cd3be38588f1c8 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #330
Archived 45 artifacts
Archive block size is 32768
Received 19 blocks and 2106290 bytes
Compression is 22.8%
Took 1.6 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2214) FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging


[ 
https://issues.apache.org/jira/browse/TEZ-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376374#comment-14376374
 ] 

Hitesh Shah commented on TEZ-2214:
--

[~rajesh.balamohan] question on the newly added invocation to 
startMemToDiskMerge. What happens when startMemToDiskMerge() is called while 
a merge is in progress? It seems like startMemToDiskMerge() is a no-op when 
that happens.





 FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses 
 memToDiskMerging
 --

 Key: TEZ-2214
 URL: https://issues.apache.org/jira/browse/TEZ-2214
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2214.1.patch


 Scenario:
 - commitMemory & usedMemory are beyond their allowed threshold.
 - InMemoryMerge kicks off and is in the process of flushing memory contents 
 to disk
 - As it progresses, it releases memory segments as well (but not yet over).
 - Fetchers who need memory < maxSingleShuffleLimit get scheduled.
 - If fetchers are fast, this quickly adds up to commitMemory & usedMemory. 
 Since InMemoryMerge is already in progress, this wouldn't trigger another 
 merge().
 - Pretty soon all fetchers would be stalled and get into the following state.
 {noformat}
 Thread 9351: (state = BLOCKED)
  - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
  - java.lang.Object.wait() @bci=2, line=502 (Compiled frame)
  - org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.waitForShuffleToMergeMemory() @bci=17, line=337 (Interpreted frame)
  - org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run() @bci=34, line=157 (Interpreted frame)
 {noformat}
 - Even if InMemoryMerger completes, commitedMem & usedMem are beyond their 
 threshold and no other fetcher threads (all are in stalled state) are there 
 to release memory. This causes fetchers to wait indefinitely.




[jira] [Commented] (TEZ-2176) Move all logging to slf4j


[ 
https://issues.apache.org/jira/browse/TEZ-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376404#comment-14376404
 ] 

Siddharth Seth commented on TEZ-2176:
-

+1. Looks good. Attaching a rebased patch after the last few commits and 
committing, before this goes stale. Thanks [~vasanthkumar].

 Move all logging to slf4j
 -

 Key: TEZ-2176
 URL: https://issues.apache.org/jira/browse/TEZ-2176
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Vasanth kumar RJ
 Attachments: TEZ-2176.1.patch, TEZ-2176.2.1.txt, TEZ-2176.2.patch, 
 TEZ-2176.patch


 SLF4J supports a more comprehensive set of APIs - MDC, Formatted strings.
 Also drop commons-logging from the dependency set.




[jira] [Updated] (TEZ-2176) Move all logging to slf4j


 [ 
https://issues.apache.org/jira/browse/TEZ-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2176:

Attachment: TEZ-2176.2.1.txt

Rebased version of TEZ-2176.2.

 Move all logging to slf4j
 -

 Key: TEZ-2176
 URL: https://issues.apache.org/jira/browse/TEZ-2176
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Vasanth kumar RJ
 Attachments: TEZ-2176.1.patch, TEZ-2176.2.1.txt, TEZ-2176.2.patch, 
 TEZ-2176.patch


 SLF4J supports a more comprehensive set of APIs - MDC, Formatted strings.
 Also drop commons-logging from the dependency set.




[jira] [Resolved] (TEZ-1937) Reduce cost of merging ifiles in UnorderedPartitionedWriter


 [ 
https://issues.apache.org/jira/browse/TEZ-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan resolved TEZ-1937.
---
Resolution: Duplicate

This is already taken care of as part of fixing TEZ-1094.  Marking this as a 
duplicate.  

 Reduce cost of merging ifiles in UnorderedPartitionedWriter
 ---

 Key: TEZ-1937
 URL: https://issues.apache.org/jira/browse/TEZ-1937
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-1937.1.patch, TEZ-1937.2.patch, TEZ-1937.WIP.patch


 Currently we iterate through all spilled files for merging.  This incurs 
 additional deserialization cost.




[jira] [Updated] (TEZ-2076) Tez framework to extract/analyze data stored in ATS for specific dag


 [ 
https://issues.apache.org/jira/browse/TEZ-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2076:
--
Attachment: TEZ-2076.6.patch

Fixed minor pom.xml issue.

 Tez framework to extract/analyze data stored in ATS for specific dag
 

 Key: TEZ-2076
 URL: https://issues.apache.org/jira/browse/TEZ-2076
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2076.1.patch, TEZ-2076.2.patch, TEZ-2076.3.patch, 
 TEZ-2076.4.patch, TEZ-2076.5.patch, TEZ-2076.6.patch, TEZ-2076.WIP.2.patch, 
 TEZ-2076.WIP.3.patch, TEZ-2076.WIP.patch


 - Users should be able to download ATS data pertaining to a DAG from Tez-UI 
 (more like a zip file containing DAG/Vertex/Task/TaskAttempt info).
 - This can be plugged into an analyzer which parses the data, adds semantics 
 and provides an in-memory representation for further analysis.
 - This will make it possible to write different analyzer rules, which can be 
 run on top of this in-memory representation to come up with analysis on the DAG.
 - Results of these analyzer rules can be rendered in a UI (standalone webapp) 
 at a later point in time.
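
 A rough sketch of what "analyzer rules over an in-memory representation" 
 could look like is given below. The types are invented for illustration 
 (DagInfoSketch, AnalyzerRule, SlowAttemptRule) and do not reflect the classes 
 in the attached patches; the in-memory view is reduced to a map of task 
 attempt id to runtime.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory view of one DAG: task attempt id -> runtime in ms.
final class DagInfoSketch {
  final Map<String, Long> attemptRuntimeMillis = new LinkedHashMap<>();
}

// Hypothetical rule contract: inspect the DAG and return findings.
interface AnalyzerRule {
  List<String> analyze(DagInfoSketch dag);
}

// Example rule: flag attempts that ran longer than `factor` times the mean runtime.
final class SlowAttemptRule implements AnalyzerRule {
  private final double factor;

  SlowAttemptRule(double factor) {
    this.factor = factor;
  }

  @Override
  public List<String> analyze(DagInfoSketch dag) {
    List<String> findings = new ArrayList<>();
    if (dag.attemptRuntimeMillis.isEmpty()) {
      return findings;
    }
    double total = 0;
    for (long ms : dag.attemptRuntimeMillis.values()) {
      total += ms;
    }
    double mean = total / dag.attemptRuntimeMillis.size();
    for (Map.Entry<String, Long> e : dag.attemptRuntimeMillis.entrySet()) {
      if (e.getValue() > factor * mean) {
        findings.add(e.getKey());
      }
    }
    return findings;
  }
}
```

 Further rules would implement the same interface, so a UI could run a list of 
 them over one parsed DAG and render the combined findings.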




Success: TEZ-2076 PreCommit Build #332

Jira: https://issues.apache.org/jira/browse/TEZ-2076
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/332/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2767 lines...]
[INFO] Final Memory: 82M/846M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706758/TEZ-2076.6.patch
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/332//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/332//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
03d06ac089aa0174e7fd748b49510ec9b96dd930 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #330
Archived 53 artifacts
Archive block size is 32768
Received 6 blocks and 7384023 bytes
Compression is 2.6%
Took 2.3 sec
Description set: TEZ-2076
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2076) Tez framework to extract/analyze data stored in ATS for specific dag


[ 
https://issues.apache.org/jira/browse/TEZ-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376958#comment-14376958
 ] 

Hadoop QA commented on TEZ-2076:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706758/TEZ-2076.6.patch
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/332//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/332//console

This message is automatically generated.

 Tez framework to extract/analyze data stored in ATS for specific dag
 

 Key: TEZ-2076
 URL: https://issues.apache.org/jira/browse/TEZ-2076
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2076.1.patch, TEZ-2076.2.patch, TEZ-2076.3.patch, 
 TEZ-2076.4.patch, TEZ-2076.5.patch, TEZ-2076.6.patch, TEZ-2076.WIP.2.patch, 
 TEZ-2076.WIP.3.patch, TEZ-2076.WIP.patch


 - Users should be able to download ATS data pertaining to a DAG from Tez-UI 
 (more like a zip file containing DAG/Vertex/Task/TaskAttempt info).
 - This can be plugged into an analyzer which parses the data, adds semantics 
 and provides an in-memory representation for further analysis.
 - This will make it possible to write different analyzer rules, which can be 
 run on top of this in-memory representation to come up with analysis on the DAG.
 - Results of these analyzer rules can be rendered in a UI (standalone webapp) 
 at a later point in time.




[jira] [Updated] (TEZ-2204) TestAMRecovery increasingly flaky on jenkins builds.


 [ 
https://issues.apache.org/jira/browse/TEZ-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2204:

Attachment: TEZ-2204-3.patch

 TestAMRecovery increasingly flaky on jenkins builds. 
 -

 Key: TEZ-2204
 URL: https://issues.apache.org/jira/browse/TEZ-2204
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Jeff Zhang
 Attachments: TEZ-2204-1.patch, TEZ-2204-2.patch, TEZ-2204-3.patch


 In recent pre-commit builds and daily builds, there seem to have been some 
 occurrences of TestAMRecovery failing or timing out. 




[jira] [Commented] (TEZ-2204) TestAMRecovery increasingly flaky on jenkins builds.


[ 
https://issues.apache.org/jira/browse/TEZ-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377018#comment-14377018
 ] 

Jeff Zhang commented on TEZ-2204:
-

Uploaded a new patch (excludes the findbugs warning).

[~hitesh] [~bikassaha] Please help review it.

 TestAMRecovery increasingly flaky on jenkins builds. 
 -

 Key: TEZ-2204
 URL: https://issues.apache.org/jira/browse/TEZ-2204
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Jeff Zhang
 Attachments: TEZ-2204-1.patch, TEZ-2204-2.patch, TEZ-2204-3.patch


 In recent pre-commit builds and daily builds, there seem to have been some 
 occurrences of TestAMRecovery failing or timing out. 




[jira] [Updated] (TEZ-2204) TestAMRecovery increasingly flaky on jenkins builds.


 [ 
https://issues.apache.org/jira/browse/TEZ-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2204:

Attachment: TEZ-2204-4.patch

Minor update on the patch

 TestAMRecovery increasingly flaky on jenkins builds. 
 -

 Key: TEZ-2204
 URL: https://issues.apache.org/jira/browse/TEZ-2204
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Jeff Zhang
 Attachments: TEZ-2204-1.patch, TEZ-2204-2.patch, TEZ-2204-3.patch, 
 TEZ-2204-4.patch


 In recent pre-commit builds and daily builds, there seem to have been some 
 occurrences of TestAMRecovery failing or timing out. 




[jira] [Updated] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime

[
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Bikas Saha updated TEZ-2217:

Attachment: TEZ-2217.1.patch

Attaching a fix that ensures that when there are no further pending container
requests then new containers are not released if they have been added to the
min held list. This should be safe because there are no pending requests.
[~gopalv] Can you please try this out and see if this fixes your case? If so,
then a review would be great :) The code change is minimal and explained above.
The test was a pain to write :P
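
The release decision described above can be sketched as follows. This is a 
simplified illustration, not the patch: HeldContainerSketch and 
selectForRelease are hypothetical names, containers are plain id strings, and 
the behaviour with pending requests is assumed to stay as it was.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the idle-release decision; not YarnTaskSchedulerService.
final class HeldContainerSketch {
  /**
   * Picks which idle-expired containers to release.
   * With no pending requests, containers in the min-held set are kept.
   * With pending requests, this sketch assumes the pre-existing behaviour
   * (every expired container is a release candidate) is unchanged.
   */
  static List<String> selectForRelease(
      List<String> idleExpired, Set<String> minHeld, int pendingRequests) {
    List<String> toRelease = new ArrayList<>();
    for (String containerId : idleExpired) {
      boolean keep = pendingRequests == 0 && minHeld.contains(containerId);
      if (!keep) {
        toRelease.add(containerId);
      }
    }
    return toRelease;
  }
}
```

For example, with idle-expired containers c1, c2, c3, a min-held set of {c2} 
and no pending requests, only c1 and c3 would be released.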

The min-held-containers constraint is not enforced during query runtime

Key: TEZ-2217
URL: https://issues.apache.org/jira/browse/TEZ-2217
Project: Apache Tez
Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Gopal V
Assignee: Bikas Saha
Attachments: TEZ-2217.1.patch, TEZ-2217.txt.bz2

The min-held containers constraint is respected during query idle times, but
is not respected when a query is actually in motion.
The AM releases unused containers during dag execution without checking for
min-held containers.
{code}
2015-03-20 15:41:53,475 INFO [DelayedContainerManager]
rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing
container, containerId=container_1424502260528_1348_01_13,
containerExpiryTime=1426891313264, idleTimeoutMin=5000
2015-03-20 15:41:53,475 INFO [DelayedContainerManager]
rm.YarnTaskSchedulerService: Releasing unused container:
container_1424502260528_1348_01_13
{code}
This is actually useful only after the AM has received a soft pre-emption
message; doing it on an idle cluster slows down one of the most common query
patterns in BI systems.
{code}
create temporary table smalltable as ...;
select ... bigtable JOIN smalltable ON ...;
{code}
The smaller query in the beginning throws away the pre-warmed capacity.


[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime


[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377072#comment-14377072
 ] 

Gopal V commented on TEZ-2217:
--

[~bikassaha]: any suggestions on more logging in the code to narrow down this?

 The min-held-containers constraint is not enforced during query runtime 
 

 Key: TEZ-2217
 URL: https://issues.apache.org/jira/browse/TEZ-2217
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Gopal V
Assignee: Bikas Saha
 Attachments: TEZ-2217.1.patch, TEZ-2217.txt.bz2


 The min-held containers constraint is respected during query idle times, but 
 is not respected when a query is actually in motion.
 The AM releases unused containers during dag execution without checking for 
 min-held containers.
 {code}
 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
 rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
 container, containerId=container_1424502260528_1348_01_13, 
 containerExpiryTime=1426891313264, idleTimeoutMin=5000
 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
 rm.YarnTaskSchedulerService: Releasing unused container: 
 container_1424502260528_1348_01_13
 {code}
 This is actually useful only after the AM has received a soft pre-emption 
 message; doing it on an idle cluster slows down one of the most common query 
 patterns in BI systems.
 {code}
 create temporary table smalltable as ...; 
 select ... bigtable JOIN smalltable ON ...;
 {code}
 The smaller query in the beginning throws away the pre-warmed capacity.




[jira] [Updated] (TEZ-2223) TestMockDAGAppMaster fails due to TEZ-2210 on mac


 [ 
https://issues.apache.org/jira/browse/TEZ-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2223:

Summary: TestMockDAGAppMaster fails due to TEZ-2210 on mac  (was: 
TestMockDAGAppMaster fails due to TEZ-2210)

 TestMockDAGAppMaster fails due to TEZ-2210 on mac
 -

 Key: TEZ-2223
 URL: https://issues.apache.org/jira/browse/TEZ-2223
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang

 [~bikassaha] Looks like TestMockDAGAppMaster fails due to TEZ-2210. 
 It fails on Mac because cpuPlugin is null.




[jira] [Updated] (TEZ-2223) TestMockDAGAppMaster fails due to TEZ-2210


 [ 
https://issues.apache.org/jira/browse/TEZ-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2223:

Description: 
[~bikassaha] Looks like TestMockDAGAppMaster fails due to TEZ-2210. 
It fails on Mac because cpuPlugin is null.

  was:[~bikassaha] looks like TestMockDAGAppMaster fails due to TEZ-2210 


 TestMockDAGAppMaster fails due to TEZ-2210
 --

 Key: TEZ-2223
 URL: https://issues.apache.org/jira/browse/TEZ-2223
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang

 [~bikassaha] Looks like TestMockDAGAppMaster fails due to TEZ-2210. 
 It fails on Mac because cpuPlugin is null.




[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime


[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377077#comment-14377077
 ] 

Bikas Saha commented on TEZ-2217:
-

Sorry. To be clear. This is with the patch attached?

 The min-held-containers constraint is not enforced during query runtime 
 

 Key: TEZ-2217
 URL: https://issues.apache.org/jira/browse/TEZ-2217
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Gopal V
Assignee: Bikas Saha
 Attachments: TEZ-2217.1.patch, TEZ-2217.txt.bz2


 The min-held containers constraint is respected during query idle times, but 
 is not respected when a query is actually in motion.
 The AM releases unused containers during dag execution without checking for 
 min-held containers.
 {code}
 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
 rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
 container, containerId=container_1424502260528_1348_01_13, 
 containerExpiryTime=1426891313264, idleTimeoutMin=5000
 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
 rm.YarnTaskSchedulerService: Releasing unused container: 
 container_1424502260528_1348_01_13
 {code}
 This is actually useful only after the AM has received a soft pre-emption 
 message; doing it on an idle cluster slows down one of the most common query 
 patterns in BI systems.
 {code}
 create temporary table smalltable as ...; 
 select ... bigtable JOIN smalltable ON ...;
 {code}
 The smaller query in the beginning throws away the pre-warmed capacity.




[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime


[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377128#comment-14377128
 ] 

Hadoop QA commented on TEZ-2217:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706789/TEZ-2217.1.patch
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/334//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/334//console

This message is automatically generated.

 The min-held-containers constraint is not enforced during query runtime 
 

 Key: TEZ-2217
 URL: https://issues.apache.org/jira/browse/TEZ-2217
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Gopal V
Assignee: Bikas Saha
 Attachments: TEZ-2217.1.patch, TEZ-2217.txt.bz2


 The min-held containers constraint is respected during query idle times, but 
 is not respected when a query is actually in motion.
 The AM releases unused containers during dag execution without checking for 
 min-held containers.
 {code}
 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
 rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
 container, containerId=container_1424502260528_1348_01_13, 
 containerExpiryTime=1426891313264, idleTimeoutMin=5000
 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
 rm.YarnTaskSchedulerService: Releasing unused container: 
 container_1424502260528_1348_01_13
 {code}
 This is actually useful only after the AM has received a soft pre-emption 
 message; doing it on an idle cluster slows down one of the most common query 
 patterns in BI systems.
 {code}
 create temporary table smalltable as ...; 
 select ... bigtable JOIN smalltable ON ...;
 {code}
 The smaller query in the beginning throws away the pre-warmed capacity.




[jira] [Commented] (TEZ-2221) VertexGroup name should be unique


[ 
https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377180#comment-14377180
 ] 

Hadoop QA commented on TEZ-2221:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706800/TEZ-2221-1.patch
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/335//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/335//console

This message is automatically generated.

 VertexGroup name should be unique
 -

 Key: TEZ-2221
 URL: https://issues.apache.org/jira/browse/TEZ-2221
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-2221-1.patch


 VertexGroupCommitStartedEvent & VertexGroupCommitFinishedEvent use the vertex 
 group name to identify the vertex group commit, so vertex groups that share a 
 name will conflict. The current equals & hashCode of VertexGroup, however, use 
 both the vertex group name and the member names.
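
The mismatch can be reproduced with a small stand-in. This is a sketch under assumptions: the Group class below is a simplified, hypothetical model of a vertex group (not the Tez VertexGroup class), and the name-keyed map merely imitates bookkeeping that identifies a commit by group name alone.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public final class VertexGroupNameClash {

    /** Simplified stand-in: equality uses the name AND the member names. */
    public static final class Group {
        public final String name;
        public final Set<String> members;

        public Group(String name, Set<String> members) {
            this.name = name;
            this.members = members;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Group)) {
                return false;
            }
            Group other = (Group) o;
            return name.equals(other.name) && members.equals(other.members);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, members);
        }
    }

    /** True when commit bookkeeping keyed by name alone would mix up a and b. */
    public static boolean commitKeysCollide(Group a, Group b) {
        Map<String, Boolean> commitFinishedByName = new HashMap<>();
        commitFinishedByName.put(a.name, true);        // a's commit finished
        return commitFinishedByName.containsKey(b.name); // b looks finished too
    }

    public static void main(String[] args) {
        Group a = new Group("g1", new HashSet<>(Arrays.asList("v1", "v2")));
        Group b = new Group("g1", new HashSet<>(Arrays.asList("v3", "v4")));
        System.out.println(a.equals(b));             // false: members differ
        System.out.println(commitKeysCollide(a, b)); // true: same name
    }
}
```

Two groups that are distinct by equals can still be indistinguishable to anything that looks them up by name, which is why the name itself needs to be unique.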




Success: TEZ-2221 PreCommit Build #335

Jira: https://issues.apache.org/jira/browse/TEZ-2221
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/335/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2749 lines...]
[INFO] Final Memory: 70M/973M
[INFO] 






==
==
Adding comment to Jira.
==
==


Comment added.
d5affd0ff69e0697d9b68ca07d5e206cb522faa6 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #334
Archived 44 artifacts
Archive block size is 32768
Received 21 blocks and 2035862 bytes
Compression is 25.3%
Took 0.75 sec
Description set: TEZ-2221
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-986) Make conf set on DAG and vertex available in jobhistory


[ 
https://issues.apache.org/jira/browse/TEZ-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377271#comment-14377271
 ] 

Hitesh Shah commented on TEZ-986:
-

Moving this out to 0.6.2. 

Not sure if [~Sreenath] has had a chance to look at this jira.

 Make conf set on DAG and vertex available in jobhistory
 ---

 Key: TEZ-986
 URL: https://issues.apache.org/jira/browse/TEZ-986
 Project: Apache Tez
  Issue Type: Sub-task
  Components: UI
Reporter: Rohini Palaniswamy
Priority: Blocker

 Would like to have the conf set on DAG and Vertex
   1) viewable in Tez UI after the job completes. This is essential for 
 debugging jobs.
   2) We have processes that parse jobconf.xml from job history (hdfs) and 
 load them into hive tables for analysis. Would like to have Tez also make all 
 the configuration (byte array) available in job history so that we can 
 similarly parse them. 1) mandates that you store it in hdfs. 2) is just to 
 say make the stored format a contract others can rely on for parsing.




[jira] [Updated] (TEZ-2221) VertexGroup name should be unique


 [ 
https://issues.apache.org/jira/browse/TEZ-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2221:

Attachment: TEZ-2221-1.patch


[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread


[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377150#comment-14377150
 ] 

Hadoop QA commented on TEZ-714:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706466/TEZ-714-2.patch
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to cause Findbugs 
(version 2.0.3) to fail.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in  

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/336//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/336//console

This message is automatically generated.

 OutputCommitters should not run in the main AM dispatcher thread
 

 Key: TEZ-714
 URL: https://issues.apache.org/jira/browse/TEZ-714
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Jeff Zhang
Priority: Critical
 Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, Vertex_2.pdf


 Follow up jira from TEZ-41.
 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
 parallel.
 2) Running an OutputCommitter in the main thread blocks all other event 
 handling, w.r.t the DAG, and causes the event queue to back up.
 3) This should also cover shared commits that happen in the DAG.
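
A minimal sketch of points 1) and 2), assuming a plain java.util.concurrent thread pool. The Committer interface and the method names are hypothetical stand-ins, not Tez APIs. Submitting returns immediately, so the submitting thread (the dispatcher, in the AM's case) is not blocked; only main waits here, to show the outcome.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class ParallelCommitSketch {

    /** Hypothetical stand-in for an output committer. */
    public interface Committer {
        void commit() throws Exception;
    }

    /** Hands every commit to the pool and returns without waiting. */
    public static List<Future<Boolean>> submitAll(List<Committer> committers,
                                                  ExecutorService pool) {
        List<Future<Boolean>> pending = new ArrayList<>();
        for (Committer c : committers) {
            // The value-returning lambda is treated as a Callable<Boolean>.
            pending.add(pool.submit(() -> {
                c.commit();
                return true;
            }));
        }
        return pending;
    }

    /** Waits for the submitted commits and counts those that did not throw. */
    public static int countSucceeded(List<Future<Boolean>> pending) {
        int ok = 0;
        for (Future<Boolean> f : pending) {
            try {
                if (f.get()) {
                    ok++;
                }
            } catch (ExecutionException e) {
                // This commit threw; it is counted as failed.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Committer> committers = new ArrayList<>();
        committers.add(() -> { });
        committers.add(() -> { throw new IllegalStateException("commit failed"); });
        committers.add(() -> { });
        int ok = countSucceeded(submitAll(committers, pool));
        pool.shutdown();
        System.out.println(ok + " of " + committers.size() + " commits succeeded");
        // prints "2 of 3 commits succeeded"
    }
}
```

In an event-driven AM the completion would more likely be reported back as an event rather than by waiting on the futures; the blocking count above is only for demonstration.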




Failed: TEZ-714 PreCommit Build #336

Jira: https://issues.apache.org/jira/browse/TEZ-714
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/336/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 1229 lines...]


  Running tests 
  /home/jenkins/tools/maven/latest/bin/mvn clean install -fn -DTezPatchProcess
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/build-tools/test-patch.sh:
 line 609: 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/testrun.txt:
 No such file or directory
cat: 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/testrun.txt:
 No such file or directory
awk: cannot open 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/testrun.txt
 (No such file or directory)






==
==
Adding comment to Jira.
==
==


Comment added.
92eeff2a0bc0fe4afb6396a0f6663a6b640cf699 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime


[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377148#comment-14377148
 ] 

Hadoop QA commented on TEZ-2217:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  
http://issues.apache.org/jira/secure/attachment/12706809/TEZ-2217-debug.txt.bz2
  against master revision 6d0b10a.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/337//console

This message is automatically generated.

 The min-held-containers constraint is not enforced during query runtime 
 

 Key: TEZ-2217
 URL: https://issues.apache.org/jira/browse/TEZ-2217
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Gopal V
Assignee: Bikas Saha
 Attachments: TEZ-2217-debug.txt.bz2, TEZ-2217.1.patch, 
 TEZ-2217.txt.bz2



[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

[
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377209#comment-14377209
]

Jeff Zhang commented on TEZ-714:

bq. Can this be fixed by having the events for both be different? But still
handled in the same transition.
It could, but this may make the transition complicated. Currently we need to
differentiate these 2 kinds of commits. Besides, there are 2 possible states
(RUNNING, COMMITTING) when the commit happens, and we also need to handle 2
different outcomes (commit succeeded or failed), so there would be 8 different
cases in total in one transition, which may be difficult to read.

bq. Is this recovery log written relevant only in the non-commit-at-end case
where group commits can happen before the DAG finishes?
Yes

bq. Maybe you can create a new TestCommit that starts from scratch without the
hacks in TestVertexImpl.
Yes, this is what I plan to do.

bq. Is this for VertexImpl or DAGImpl? That sounds like a bug. Is that relevant
to the commit operation though?
It is relevant to the abort. Currently, in the DAG's InternalErrorTransition (no
matter what state it is in), the DAG aborts directly and goes to the ERROR state
without waiting for the vertices to finish.


[jira] [Updated] (TEZ-2097) TEZ-UI Add dag logs


 [ 
https://issues.apache.org/jira/browse/TEZ-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2097:
-
Target Version/s: 0.6.2  (was: 0.6.1)

 TEZ-UI Add dag logs
 ---

 Key: TEZ-2097
 URL: https://issues.apache.org/jira/browse/TEZ-2097
 Project: Apache Tez
  Issue Type: Bug
  Components: UI
Reporter: Jeff Zhang
Priority: Critical

 If dag fails due to AM error, there's no way to check the dag logs on tez-ui. 
 Users have to grab the app logs. 




[jira] [Commented] (TEZ-2097) TEZ-UI Add dag logs


[ 
https://issues.apache.org/jira/browse/TEZ-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377263#comment-14377263
 ] 

Hitesh Shah commented on TEZ-2097:
--

Downgrading to critical. 


[jira] [Commented] (TEZ-2204) TestAMRecovery increasingly flaky on jenkins builds.


[ 
https://issues.apache.org/jira/browse/TEZ-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377074#comment-14377074
 ] 

Hadoop QA commented on TEZ-2204:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706785/TEZ-2204-4.patch
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/333//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/333//console

This message is automatically generated.

 TestAMRecovery increasingly flaky on jenkins builds. 
 -

 Key: TEZ-2204
 URL: https://issues.apache.org/jira/browse/TEZ-2204
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Jeff Zhang
 Attachments: TEZ-2204-1.patch, TEZ-2204-2.patch, TEZ-2204-3.patch, 
 TEZ-2204-4.patch


 In recent pre-commit builds and daily builds, there seem to have been some 
 occurrences of TestAMRecovery failing or timing out. 




Failed: TEZ-2204 PreCommit Build #333

Jira: https://issues.apache.org/jira/browse/TEZ-2204
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/333/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2753 lines...]





==
==
Adding comment to Jira.
==
==


Comment added.
e96c6d82358f4d860778235c1794bfe40a782908 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #332
Archived 44 artifacts
Archive block size is 32768
Received 8 blocks and 2461464 bytes
Compression is 9.6%
Took 0.89 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Created] (TEZ-2223) TestMockDAGAppMaster fails due to TEZ-2210

Jeff Zhang created TEZ-2223:
---

 Summary: TestMockDAGAppMaster fails due to TEZ-2210
 Key: TEZ-2223
 URL: https://issues.apache.org/jira/browse/TEZ-2223
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang


[~bikassaha] looks like TestMockDAGAppMaster fails due to TEZ-2210 




[jira] [Comment Edited] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

[
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377105#comment-14377105
]

Bikas Saha edited comment on TEZ-714 at 3/24/15 2:05 AM:
-

I have not looked at the patch yet because it may change if you agree with these comments.
bq. Regarding the issue of not sure why group-commit and non-group commit need
to be differentiated in different transitions.
Can this be fixed by having the events for both be different? But still handled
in the same transition. The transition can check if it's a group commit event vs
normal commit event (based on event type) - and then log for group commit.
Maybe group commit event can derive from normal commit event. IMO, having fewer
transitions makes the code much simpler.
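
The suggestion above could look roughly like this. It is a sketch under assumptions: the event classes and the handle method are hypothetical names invented for illustration, and the returned strings only stand in for whatever the real transition would log or record.

```java
import java.util.ArrayList;
import java.util.List;

public final class CommitTransitionSketch {

    /** Hypothetical event for a normal (non-group) commit completing. */
    public static class CommitCompletedEvent {
        public final boolean succeeded;

        public CommitCompletedEvent(boolean succeeded) {
            this.succeeded = succeeded;
        }
    }

    /** Hypothetical group commit event, derived from the normal commit event. */
    public static final class GroupCommitCompletedEvent extends CommitCompletedEvent {
        public final String groupName;

        public GroupCommitCompletedEvent(String groupName, boolean succeeded) {
            super(succeeded);
            this.groupName = groupName;
        }
    }

    /** One transition for both events; returns the lines it would log. */
    public static List<String> handle(CommitCompletedEvent event) {
        List<String> log = new ArrayList<>();
        if (event instanceof GroupCommitCompletedEvent) {
            // Only group commits get the extra recovery log entry.
            GroupCommitCompletedEvent group = (GroupCommitCompletedEvent) event;
            log.add("recovery: group commit " + group.groupName
                    + " succeeded=" + group.succeeded);
        }
        // Handling shared by both kinds of commit.
        log.add(event.succeeded ? "commit succeeded" : "commit failed");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(handle(new CommitCompletedEvent(true)));
        System.out.println(handle(new GroupCommitCompletedEvent("g1", false)));
    }
}
```

The shared handling lives in one place, and the event type alone decides whether the group-specific logging runs.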

Is this recovery log written relevant only in the non-commit-at-end case where
group commits can happen before the DAG finishes?

bq. Unit test is still not perfect. Because currently in the DAGImpl/VertexImpl
we run the shared thread pool in the AsyncDispatcher
For these tests we could choose to use the normal thread pool by overriding the
setup. Since this is a new test, it can try to not depend on ordering like the
existing tests do. If so, then it should be fine to use the real threadpool
instead of the fake thread pool that delegates to the dispatcher. Maybe you can
create a new TestCommit that starts from scratch without the hacks in
TestVertexImpl.

bq. For the some existing transition, like (RUNNING to ERROR due to INTERNAL
ERROR)
Is this for VertexImpl or DAGImpl? That sounds like a bug. Is that relevant to
the commit operation though?


[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

[
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377105#comment-14377105
]

Bikas Saha commented on TEZ-714:

Is this recovery log written relevant only in the non-commit-at-end case where
group commits can happen before the DAG finishes?

bq. For the some existing transition, like (RUNNING to ERROR due to INTERNAL
ERROR)
Is this for VertexImpl or DAGImpl? That sounds like a bug. Is that relevant to
the commit operation though?


[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime


[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377118#comment-14377118
 ] 

Bikas Saha commented on TEZ-2217:
-

This may help in setting debug logs for only 1 class
{noformat}  /**
   * Root Logging level passed to the Tez app master.
   *
   * Simple configuration: Set the log level for all loggers.
   *   e.g. INFO
   *   This sets the log level to INFO for all loggers.
   *
   * Advanced configuration: Set the log level for all classes, along with a
   * different level for some.
   *   e.g. DEBUG;org.apache.hadoop.ipc=INFO;org.apache.hadoop.security=INFO
   *   This sets the log level for all loggers to DEBUG, except for the
   *   org.apache.hadoop.ipc and org.apache.hadoop.security, which are set to
   *   INFO
   *
   * Note: The global log level must always be the first parameter.
   *   DEBUG;org.apache.hadoop.ipc=INFO;org.apache.hadoop.security=INFO is valid
   *   org.apache.hadoop.ipc=INFO;org.apache.hadoop.security=INFO is not valid
   * */
  @ConfigurationScope(Scope.AM)
  public static final String TEZ_AM_LOG_LEVEL = TEZ_AM_PREFIX + "log.level";
  public static final String TEZ_AM_LOG_LEVEL_DEFAULT = "INFO";
{noformat}
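
To make the value format concrete, here is a small stand-alone parser that mirrors the rules in the comment above (global level first, then logger=level pairs separated by semicolons). It is a sketch, not Tez's own parsing code; the class name LogLevelSpec is hypothetical, and the fully qualified name used for YarnTaskSchedulerService in the example is an assumption (the logs in this thread only show rm.YarnTaskSchedulerService).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public final class LogLevelSpec {

    /** Map key under which the global (root) level is stored. */
    public static final String ROOT = "";

    /**
     * Parses a value such as "INFO;some.logger=DEBUG".
     * The first entry must be the global level; the rest are logger=level pairs.
     */
    public static Map<String, String> parse(String spec) {
        String[] parts = spec.trim().split(";");
        String root = parts[0].trim();
        if (root.isEmpty() || root.contains("=")) {
            throw new IllegalArgumentException(
                "global log level must be the first parameter: " + spec);
        }
        Map<String, String> levels = new LinkedHashMap<>();
        levels.put(ROOT, root);
        for (int i = 1; i < parts.length; i++) {
            String[] pair = parts[i].split("=", 2);
            if (pair.length != 2 || pair[0].trim().isEmpty() || pair[1].trim().isEmpty()) {
                throw new IllegalArgumentException("expected logger=LEVEL: " + parts[i]);
            }
            levels.put(pair[0].trim(), pair[1].trim());
        }
        return levels;
    }

    public static void main(String[] args) {
        // Assumed package name; everything else stays at INFO.
        Map<String, String> levels =
            parse("INFO;org.apache.tez.dag.app.rm.YarnTaskSchedulerService=DEBUG");
        System.out.println(levels.get(ROOT)); // INFO
        System.out.println(
            levels.get("org.apache.tez.dag.app.rm.YarnTaskSchedulerService")); // DEBUG
    }
}
```

A value of that shape keeps the root logger at INFO and raises only the scheduler class to DEBUG, which avoids the RPC noise that a global DEBUG setting would produce.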


[jira] [Updated] (TEZ-2097) TEZ-UI Add dag logs


 [ 
https://issues.apache.org/jira/browse/TEZ-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2097:
-
Priority: Critical  (was: Blocker)


[jira] [Commented] (TEZ-2047) Build fails against hadoop-2.2 post TEZ-2018


[ 
https://issues.apache.org/jira/browse/TEZ-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377270#comment-14377270
 ] 

Hitesh Shah commented on TEZ-2047:
--

[~pramachandran] Sorry for the delay in the review. 

Comments: 

The basic change looks fine, but I am not sure how we are enforcing http-only 
(no SSL) mode with the current implementation. The WebApps code seems to 
eventually look into the config for the yarn policy. Should the WebUIService be 
setting that up correctly to enforce http only?

 Build fails against hadoop-2.2 post TEZ-2018
 

 Key: TEZ-2047
 URL: https://issues.apache.org/jira/browse/TEZ-2047
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Prakash Ramachandran
Priority: Blocker
 Attachments: TEZ-2047.1.patch


 Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project tez-dag: Compilation failure: Compilation failure:
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/Tez-Build-Hadoop-2.2/tez-dag/src/main/java/org/apache/tez/dag/app/web/WebUIService.java:[85,13]
  cannot find symbol
 [ERROR] symbol  : method 
 withHttpPolicy(org.apache.hadoop.conf.Configuration,org.apache.hadoop.http.HttpConfig.Policy)
 [ERROR] location: class 
 org.apache.hadoop.yarn.webapp.WebApps.Builder<org.apache.tez.dag.app.web.WebUIService.TezAMWebApp>
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/Tez-Build-Hadoop-2.2/tez-dag/src/main/java/org/apache/tez/dag/app/web/WebUIService.java:[87,45]
  cannot find symbol
 [ERROR] symbol  : method getConnectorAddress(int)
 [ERROR] location: class org.apache.hadoop.http.HttpServer




[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime


[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377068#comment-14377068
 ] 

Gopal V commented on TEZ-2217:
--

I quickly cross-checked this - it still seems to be letting go of containers 
despite min-held being  queue size.

The containers were observed as being released during the getSplits() operation.

{code}
2015-03-23 18:35:14,865 INFO [InputInitializer [Map 1] #0] io.HiveInputFormat: 
Generating splits
2015-03-23 18:35:14,870 INFO [InputInitializer [Map 1] #0] log.PerfLogger: 
PERFLOG method=OrcGetSplits from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl
2015-03-23 18:35:14,889 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
container, containerId=container_1424502260528_1391_01_000310, 
containerExpiryTime=1427160914665, idleTimeoutMin=5000
2015-03-23 18:35:14,889 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Releasing unused container: 
container_1424502260528_1391_01_000310
2015-03-23 18:35:14,889 INFO [Dispatcher thread: Central] 
history.HistoryEventHandler: 
[HISTORY][DAG:dag_1424502260528_1391_11][Event:CONTAINER_STOPPED]: 
containerId=container_1424502260528_1391_01_000310, stoppedTime=1427160914889, 
exitStatus=0
2015-03-23 18:35:14,889 INFO [Dispatcher thread: Central] 
container.AMContainerImpl: AMContainer container_1424502260528_1391_01_000310 
transitioned from IDLE to STOP_REQUESTED via event C_STOP_REQUEST
2015-03-23 18:35:14,890 INFO [ContainerLauncher #25] 
launcher.ContainerLauncherImpl: Processing the event EventType: 
CONTAINER_STOP_REQUEST
{code}


[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime

[
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377086#comment-14377086
]

Gopal V commented on TEZ-2217:
--

Yes, the LOG does not say "delay expired or is new", which seems to be in the
code path that this patch changed.

Which is why I asked about new logging.


[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime


[ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377083#comment-14377083
 ] 

Bikas Saha commented on TEZ-2217:
-

If it's with the patch, then it would mean that the scheduler has non-empty task 
requests at that time. With the fix, can you please attach the AM logs with 
debug logging enabled for the YarnTaskSchedulerService only? Otherwise they will 
have RPC junk in them. Thanks

 The min-held-containers constraint is not enforced during query runtime 
 

 Key: TEZ-2217
 URL: https://issues.apache.org/jira/browse/TEZ-2217
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Gopal V
Assignee: Bikas Saha
 Attachments: TEZ-2217.1.patch, TEZ-2217.txt.bz2


 The min-held containers constraint is respected during query idle times, but 
 is not respected when a query is actually in motion.
 The AM releases unused containers during dag execution without checking for 
 min-held containers.
 {code}
 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
 rm.YarnTaskSchedulerService: Container's idle timeout expired. Releasing 
 container, containerId=container_1424502260528_1348_01_13, 
 containerExpiryTime=1426891313264, idleTimeoutMin=5000
 2015-03-20 15:41:53,475 INFO [DelayedContainerManager] 
 rm.YarnTaskSchedulerService: Releasing unused container: 
 container_1424502260528_1348_01_13
 {code}
 This is actually useful only after the AM has received a soft pre-emption 
 message; doing it on an idle cluster slows down one of the most common query 
 patterns in BI systems.
 {code}
 create temporary table smalltable as ...; 
 select ... bigtable JOIN smalltable ON ...;
 {code}
 The smaller query in the beginning throws away the pre-warmed capacity.




[jira] [Commented] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime

[
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377115#comment-14377115
]

Bikas Saha commented on TEZ-2217:
-

The existing debug logs should be enough if enabled. What is intriguing is that
at this point in time there are pending task requests that have not already
been matched to the containers, because I am guessing that the job already has
all the containers it will ever get. If that were not the case, it would hit
the changed code path (AM is idle or there are no pending requests).
What is the min expiry time compared to the delays between node-rack-star
matching? I am hoping that matching has been attempted all the way up to star
before the min expiry elapses. In that case all tasks should have been matched
to some containers, leading to empty task requests.

The min-held-containers constraint is not enforced during query runtime


[jira] [Updated] (TEZ-2217) The min-held-containers constraint is not enforced during query runtime


 [ 
https://issues.apache.org/jira/browse/TEZ-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated TEZ-2217:
-
Attachment: TEZ-2217-debug.txt.bz2

Debug logs attached.

{code}
$ grep "Releasing unused" app-log.txt | wc -l
111
{code}

I always use {{--hiveconf tez.am.log.level=INFO;class-name=DEBUG}}, which 
seems to have worked.
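
For a per-container breakdown of the same count, a minimal Python sketch follows. It assumes each release message sits on a single line in the AM log, in the form quoted in the issue description ("Releasing unused container: container_..."); the function name count_releases is hypothetical and not part of Tez.

```python
import re
from collections import Counter

# Assumed message format, taken from the log excerpt quoted in this issue.
# The exact wording may differ between Tez versions.
RELEASE_RE = re.compile(r"Releasing unused container: (container_\S+)")


def count_releases(lines):
    """Map each container id to the number of 'Releasing unused container' lines."""
    counts = Counter()
    for line in lines:
        match = RELEASE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open file handle for app-log.txt to count_releases and summing the values should agree with the grep count above, while the keys show which containers were released.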

 The min-held-containers constraint is not enforced during query runtime 
 

 Key: TEZ-2217
 URL: https://issues.apache.org/jira/browse/TEZ-2217
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Gopal V
Assignee: Bikas Saha
 Attachments: TEZ-2217-debug.txt.bz2, TEZ-2217.1.patch, 
 TEZ-2217.txt.bz2






Success: TEZ-2217 PreCommit Build #334

Jira: https://issues.apache.org/jira/browse/TEZ-2217
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/334/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2752 lines...]
[INFO] Final Memory: 67M/805M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12706789/TEZ-2217.1.patch
  against master revision 6d0b10a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/334//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/334//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
c119360121f6701025a88826b76dec1f3083c568 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #332
Archived 44 artifacts
Archive block size is 32768
Received 6 blocks and 2527096 bytes
Compression is 7.2%
Took 0.73 sec
Description set: TEZ-2217
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

Failed: TEZ-2217 PreCommit Build #337

Jira: https://issues.apache.org/jira/browse/TEZ-2217
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/337/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 31 lines...]
HEAD is now at 6d0b10a TEZ-2176. Move all logging to slf4j. Contributed by 
Vasanth kumar RJ.
Previous HEAD position was 6d0b10a... TEZ-2176. Move all logging to slf4j. 
Contributed by Vasanth kumar RJ.
Switched to branch 'master'
Your branch is behind 'origin/master' by 30 commits, and can be fast-forwarded.
  (use git pull to update your local branch)
First, rewinding head to replay your work on top of it...
Fast-forwarded master to 6d0b10a8445d3c26b0958ce816c64b577a1608d9.
TEZ-2217 patch is being downloaded at Tue Mar 24 02:28:13 UTC 2015 from
http://issues.apache.org/jira/secure/attachment/12706809/TEZ-2217-debug.txt.bz2
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
The patch does not appear to apply with p0 to p2
PATCH APPLICATION FAILED




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  
http://issues.apache.org/jira/secure/attachment/12706809/TEZ-2217-debug.txt.bz2
  against master revision 6d0b10a.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/337//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
468b2e1ce34852fa777431321e7aaa5322b885d9 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #334
Archived 7 artifacts
Archive block size is 32768
Received 0 blocks and 1408632 bytes
Compression is 0.0%
Took 0.36 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (TEZ-2205) Tez still tries to post to ATS when yarn.timeline-service.enabled=false