[jira] [Updated] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Attachment: (was: TEZ-2234.1.patch)

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Attachment: TEZ-2234.1.patch

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





[jira] [Updated] (TEZ-1562) DAGImpl commitOrAbortOutputs takes long time (300+ seconds) for reducer vertex with 4000+ tasks

2015-04-05 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-1562:
--
Attachment: TEZ-1562-fileoutputcommitter-time.png
TEZ-1562-job-runtime.png

- Tried a synthetic job that reads and stores 32 GB of the lineitem table 
(Mapper -- Reducer).
- Used Hadoop 2.8 (MAPREDUCE-4815 was already committed in 2.7).
- Tried both mapreduce.fileoutputcommitter.algorithm.version=1 and 
mapreduce.fileoutputcommitter.algorithm.version=2.
- Attaching graphs for both configurations.
- In short, mapreduce.fileoutputcommitter.algorithm.version=2 fixes the 
issue.
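
As a rough illustration of why the algorithm version matters here, the toy model below counts how many per-task output moves are left for the single job-level commit. It is a simplification based on the general description of MAPREDUCE-4815 (version 1 merges every task's output during job commit, version 2 moves output into place during task commit), not Hadoop code. `CommitterModel` and its method names are hypothetical; the 4000 reducers and 304784 ms figures are taken from the issue description and its log.

```java
/** Toy model: how much move/rename work is deferred to the job-level commit. */
final class CommitterModel {

    /** Number of per-task output moves performed serially by the job commit. */
    static int movesAtJobCommit(int algorithmVersion, int numTasks) {
        if (algorithmVersion != 1 && algorithmVersion != 2) {
            throw new IllegalArgumentException("unknown version: " + algorithmVersion);
        }
        // Version 1: task commit only stages output; the job commit then moves
        // each task's output into the final directory, one task at a time.
        // Version 2: task commit moves output straight into the final directory,
        // so no per-task moves remain for the job commit.
        return algorithmVersion == 1 ? numTasks : 0;
    }

    /** Back-of-envelope average cost per task for an observed commit time. */
    static long averageMillisPerTask(long commitMillis, int numTasks) {
        return commitMillis / numTasks;
    }

    public static void main(String[] args) {
        int reducers = 4000;            // from the issue description
        long observedMillis = 304784L;  // from the log in the issue description
        System.out.println("v1 moves at job commit: " + movesAtJobCommit(1, reducers));
        System.out.println("v2 moves at job commit: " + movesAtJobCommit(2, reducers));
        System.out.println("avg ms per task in the logged run: "
                + averageMillisPerTask(observedMillis, reducers));
    }
}
```

The model ignores everything else a commit does (success markers, cleanup of temporary directories), so it only shows where the per-task work lands, not the absolute timings in the attached graphs.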

 DAGImpl commitOrAbortOutputs takes long time (300+ seconds) for reducer 
 vertex with 4000+ tasks
 ---

 Key: TEZ-1562
 URL: https://issues.apache.org/jira/browse/TEZ-1562
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
  Labels: performance
 Attachments: TEZ-1562-fileoutputcommitter-time.png, 
 TEZ-1562-job-runtime.png


 M7 -- R1
 M2 -- R1
 I was running a job with 4000 reducers in non-session mode.
 At the end of the job, it took 300+ seconds just to commit().  
 It appears that it's doing some cleanup work. But in session mode, this could 
 lead to the following issue:
 1. Open session
 2. Run small job1
 3. Run large job1
 4. Run small job2
 In this case, small job2 wouldn't even start running until large job1's 
 commit is over, which would be on the order of 300+ seconds.  
 Please refer to the logs below:
 2014-09-09 17:58:20,422 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: Vertex 
 vertex_1409722953518_0143_1_02 completed., numCompletedVertices=3, 
 numSuccessfulVertices=3, numFailedVertices=0, numKilledVertices=0, 
 numVertices=3
 2014-09-09 17:58:20,422 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: Calling DAG commit/abort for dag: 
 dag_1409722953518_0143_1
 2014-09-09 17:58:20,426 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: Committing output: Reducer_6_sink 
 for vertex: vertex_1409722953518_0143_1_02
 2014-09-09 18:03:25,207 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: No output committers for vertex: 
 Map_7
 2014-09-09 18:03:25,207 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: No output committers for vertex: 
 Map_2
 2014-09-09 18:03:25,207 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: Patch : Time taken to 
 commitOrAbortOutputs : 304784   In milliseconds
 2014-09-09 18:03:25,361 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: Patch : Time taken to log jobhistory 
 : 304938   Cumulative number
 2014-09-09 18:03:25,363 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: Patch : Time taken to handle 
 DAGAppMasterEventDAGFinished : 304940  Cumulative number
 2014-09-09 18:03:25,363 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: DAG: dag_1409722953518_0143_1 
 finished with state: SUCCEEDED
 2014-09-09 18:03:25,363 INFO [AsyncDispatcher event handler] 
 org.apache.tez.dag.app.dag.impl.DAGImpl: dag_1409722953518_0143_1 
 transitioned from RUNNING to SUCCEEDED
 Should this operation be moved to a separate thread as the DAG is already 
 marked as SUCCEEDED?
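
 The question above can be sketched as follows. This is a minimal illustration of taking a slow commit off the event-handling thread using only JDK classes; it is not the Tez implementation, and `AsyncCommitSketch`, `commitAsync` and the 200 ms sleep are assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Sketch: run a slow commit on a worker thread instead of the dispatcher. */
final class AsyncCommitSketch {

    /** Starts the commit on the pool and returns immediately with a future state. */
    static CompletableFuture<String> commitAsync(Runnable commit, ExecutorService pool) {
        return CompletableFuture.runAsync(commit, pool)
                // Map the outcome to a final state once the commit has finished.
                .handle((ignored, error) -> error == null ? "SUCCEEDED" : "FAILED");
    }

    /** Stand-in for a long-running output commit. */
    static void slowCommit() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService commitPool = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> state = commitAsync(AsyncCommitSketch::slowCommit, commitPool);
            // The calling (dispatcher-like) thread is free to handle other events here.
            System.out.println("commit submitted, caller not blocked");
            System.out.println("final state: " + state.join());
        } finally {
            commitPool.shutdown();
        }
    }
}
```

 In this sketch the final state is only reported after the commit finishes, so a later DAG in the same session could be accepted while the previous commit is still running.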





Failed: TEZ-2234 PreCommit Build #391

2015-04-05 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2234
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/391/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2973 lines...]
{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.mapreduce.TestMRRJobs
  org.apache.tez.test.TestSecureShuffle
  org.apache.tez.test.TestTezJobs
  org.apache.tez.test.TestDAGRecovery
  org.apache.tez.test.TestFaultTolerance
  org.apache.tez.test.TestAMRecovery

  The following test timeouts occurred in :
 org.apache.tez.mapreduce.TestMRRJobsDAGApi

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/391//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/391//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
44926ad9bcc6b2c4433527ba2fd2ffa025b94e46 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #387
Archived 44 artifacts
Archive block size is 32768
Received 0 blocks and 2741254 bytes
Compression is 0.0%
Took 0.98 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
44 tests failed.
REGRESSION:  org.apache.tez.mapreduce.TestMRRJobs.testMRRSleepJob

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.tez.mapreduce.TestMRRJobs.testMRRSleepJob(TestMRRJobs.java:132)


REGRESSION:  org.apache.tez.mapreduce.TestMRRJobs.testMRRSleepJobWithCompression

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.tez.mapreduce.TestMRRJobs.testMRRSleepJobWithCompression(TestMRRJobs.java:290)


REGRESSION:  org.apache.tez.mapreduce.TestMRRJobs.testFailingAttempt

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.tez.mapreduce.TestMRRJobs.testFailingAttempt(TestMRRJobs.java:252)


REGRESSION:  org.apache.tez.mapreduce.TestMRRJobs.testRandomWriter

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.tez.mapreduce.TestMRRJobs.testRandomWriter(TestMRRJobs.java:167)


REGRESSION:  
org.apache.tez.test.TestAMRecovery.testVertexPartiallyFinished_Broadcast

Error Message:
expected:SUCCEEDED but was:FAILED

Stack Trace:
java.lang.AssertionError: expected:SUCCEEDED but was:FAILED
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 

[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14430980#comment-14430980
 ] 

Hadoop QA commented on TEZ-2234:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12713460/TEZ-2234.1.patch
  against master revision 5e2a55f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.mapreduce.TestMRRJobs
  org.apache.tez.test.TestSecureShuffle
  org.apache.tez.test.TestTezJobs
  org.apache.tez.test.TestDAGRecovery
  org.apache.tez.test.TestFaultTolerance
  org.apache.tez.test.TestAMRecovery

  The following test timeouts occurred in :
 org.apache.tez.mapreduce.TestMRRJobsDAGApi

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/391//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/391//console

This message is automatically generated.

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14448755#comment-14448755
 ] 

Bikas Saha commented on TEZ-2234:
-

I am looking at the test failures. Not sure why TestMRRJobsDAGApi timed out and 
whether it had anything to do with other tests failing.



 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





[jira] [Updated] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Attachment: TEZ-2234.1.patch

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





[jira] [Updated] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Attachment: (was: TEZ-2234.1.patch)

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





[jira] [Updated] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Attachment: TEZ-2234.1.patch

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





[jira] [Updated] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2234:

Attachment: (was: TEZ-2234.1.patch)

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes

2015-04-05 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14438125#comment-14438125
 ] 

Rohini Palaniswamy commented on TEZ-1190:
-

With the changes in PIG-4495, we have handled both of the above scenarios in Pig 
itself, so we no longer require this for Pig. Leaving it open in case it 
makes life easier for Hive and Cascading.

 Allow multiple edges between two vertexes
 -

 Key: TEZ-1190
 URL: https://issues.apache.org/jira/browse/TEZ-1190
 Project: Apache Tez
  Issue Type: Bug
Reporter: Daniel Dai

 This will be helpful in some scenarios. As a particular example, we can merge 
 two small pipelines together into one pair of vertexes. Note that the edge 
 types between the two vertexes may be different.





[jira] [Commented] (TEZ-714) OutputCommitters should not run in the main AM dispatcher thread

2015-04-05 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14446581#comment-14446581
 ] 

Bikas Saha commented on TEZ-714:


bq. Both task finishing and commit finishing may be the last step to the 
finished state (if no commit then task finishing is the last step, and 
committing finish is the last step if there's commit), so they would both 
trigger checkForCompletion.
Today, the semantics of checkForCompletion() are actually 
checkForTasksCompletion(). It checks if all tasks have succeeded. On success, 
it would complete all commits synchronously and return the final state 
SUCCEEDED. This patch is overloading checkForCompletion() to do both 
checkForTasksCompletion() and checkForCommitsCompletion() resulting in logic 
overloading that is error-prone and makes future changes harder because the 
method is doing 2 things.
{code}
+LOG.info("All tasks are succeeded, vertex:" + vertex.logIdentifier);
+if (!vertex.commitVertexOutputs) {
+  // just finish because no vertex committing needed
+  return vertex.finished(VertexState.SUCCEEDED);
+} else if (!vertex.committed.getAndSet(true)) {
+  // start commit if there're commits or just finish if no commits
+  return commitOrFinish(vertex);
// This part belongs to checkForCommitsCompletion()
+} else if (vertex.commitFutures.isEmpty()) {
+  // move from COMMITTING to SUCCEEDED
+  return vertex.finished(VertexState.SUCCEEDED);
+} else {
+  return VertexState.COMMITTING;
 }{code}
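
The separation suggested above could look roughly like the sketch below: one check runs when a task finishes and one runs when a commit finishes. Only the method names checkForTasksCompletion() and checkForCommitsCompletion() come from this comment; the class, fields, event methods and the three-state enum are hypothetical simplifications and not the Tez VertexImpl code.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Sketch: keep the "tasks done?" and "commits done?" checks in separate methods. */
final class VertexCompletionSketch {
    enum State { RUNNING, COMMITTING, SUCCEEDED }

    private int remainingTasks;
    private final Set<String> pendingCommits;
    private State state = State.RUNNING;

    VertexCompletionSketch(int numTasks, Collection<String> outputsToCommit) {
        this.remainingTasks = numTasks;
        this.pendingCommits = new HashSet<>(outputsToCommit);
    }

    /** Event: one task succeeded. Only the task-level check runs here. */
    State onTaskSucceeded() {
        remainingTasks--;
        state = checkForTasksCompletion();
        return state;
    }

    /** Event: one output commit finished. Only the commit-level check runs here. */
    State onCommitCompleted(String output) {
        if (state != State.COMMITTING) {
            throw new IllegalStateException("commit event while " + state);
        }
        pendingCommits.remove(output);
        state = checkForCommitsCompletion();
        return state;
    }

    private State checkForTasksCompletion() {
        if (remainingTasks > 0) {
            return State.RUNNING;
        }
        // All tasks done: finish directly if there is nothing to commit.
        return pendingCommits.isEmpty() ? State.SUCCEEDED : State.COMMITTING;
    }

    private State checkForCommitsCompletion() {
        return pendingCommits.isEmpty() ? State.SUCCEEDED : State.COMMITTING;
    }

    public static void main(String[] args) {
        VertexCompletionSketch vertex =
                new VertexCompletionSketch(2, Arrays.asList("outputA", "outputB"));
        System.out.println(vertex.onTaskSucceeded());
        System.out.println(vertex.onTaskSucceeded());
        System.out.println(vertex.onCommitCompleted("outputA"));
        System.out.println(vertex.onCommitCompleted("outputB"));
    }
}
```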

bq. What i am doing is not exactly this way. I won't ignore failed commits, 
instead I will cancel pending commits and wait for them to complete and then move 
to failed state. 
Will calling cancel on the future object result in 
VertexCommitCallback#onFailure() being invoked? If not, then the vertex will 
hang waiting for commitFutures to become empty because no CommitCompleted 
event will arrive.
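
The concern can be illustrated with plain JDK futures. In the sketch below a JDK CompletableFuture stands in for the future type used by the patch, which is an assumption, and every name in the sketch is hypothetical. It shows the property the vertex depends on: a completion callback that also fires on cancel(), so the pending set drains. Guava's FutureCallback documentation describes onFailure() as also being invoked on cancellation, but whether that holds for the patch's VertexCommitCallback wiring is not confirmed here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;

/** Sketch: pending commits must be cleared on success, failure and cancellation. */
final class CancelCallbackSketch {
    final Set<String> pendingCommits = new HashSet<>();
    final List<String> events = new ArrayList<>();

    /** Registers a pending commit and a callback that fires however it completes. */
    CompletableFuture<Void> track(String output) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        pendingCommits.add(output);
        future.whenComplete((ignored, error) -> {
            pendingCommits.remove(output);
            if (error == null) {
                events.add(output + ":committed");
            } else if (error instanceof CancellationException) {
                events.add(output + ":cancelled");
            } else {
                events.add(output + ":failed");
            }
        });
        return future;
    }

    public static void main(String[] args) {
        CancelCallbackSketch vertex = new CancelCallbackSketch();
        CompletableFuture<Void> first = vertex.track("outputA");
        CompletableFuture<Void> second = vertex.track("outputB");
        first.completeExceptionally(new RuntimeException("commit failed"));
        second.cancel(true); // the callback still runs, so nothing stays pending
        System.out.println(vertex.events + " pending=" + vertex.pendingCommits);
    }
}
```

Note that cancelling a CompletableFuture only marks it as cancelled and does not interrupt work that is already running, so a real commit would need its own cancellation handling.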

bq. This behavior is consistent with other cases when there's any fail event 
happens (commit fail, vertex termination event and etc ), all the cases would 
cancel pending commits and wait for them to complete and then move to finished 
state.
Is this existing behavior or behavior in the patch? In either case, this logic 
is not consistent from a user's point of view. Depending on which async commit 
operation ran first on the threadpool and failed, the user will see anywhere 
between 1 and N-1 committed outputs. Is that observation correct? If yes, is 
that a better choice than saying: the user will see either no outputs or all 
outputs successfully committed?
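
The two user-visible outcomes being compared can be made concrete with a small model. It is a deliberately simplified sketch: commits are replayed in a given completion order, the first failure stops everything after it, and the all-or-nothing variant assumes already committed outputs can be rolled back, which the thread does not establish. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch: which outputs a user sees when one of several commits fails. */
final class PartialCommitSketch {

    /** Outputs committed before the failing one, for a given completion order. */
    static List<String> visibleAfterFailure(List<String> completionOrder, String failing) {
        List<String> committed = new ArrayList<>();
        for (String output : completionOrder) {
            if (output.equals(failing)) {
                break; // remaining commits are treated as cancelled
            }
            committed.add(output);
        }
        return committed;
    }

    /** All-or-nothing view: expose outputs only if every commit succeeded. */
    static List<String> visibleAllOrNothing(List<String> completionOrder, String failing) {
        List<String> committed = visibleAfterFailure(completionOrder, failing);
        return committed.size() == completionOrder.size()
                ? committed
                : Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        List<String> order = Arrays.asList("outputA", "outputB", "outputC");
        // The same failure yields different visible results depending on ordering.
        System.out.println(visibleAfterFailure(order, "outputC"));
        System.out.println(visibleAfterFailure(order, "outputB"));
        System.out.println(visibleAllOrNothing(order, "outputB"));
    }
}
```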



 OutputCommitters should not run in the main AM dispatcher thread
 

 Key: TEZ-714
 URL: https://issues.apache.org/jira/browse/TEZ-714
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Jeff Zhang
Priority: Critical
 Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
 TEZ-714-3.patch, TEZ-714-4.patch, TEZ-714-5.patch, TEZ-714-6.patch, 
 TEZ-714-7.patch, Vertex_2.pdf


 Follow up jira from TEZ-41.
 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
 parallel.
 2) Running an OutputCommitter in the main thread blocks all other event 
 handling, w.r.t the DAG, and causes the event queue to back up.
 3) This should also cover shared commits that happen in the DAG.





[jira] [Commented] (TEZ-2159) Tez UI: download timeline data for offline use.

2015-04-05 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14401197#comment-14401197
 ] 

Hitesh Shah commented on TEZ-2159:
--

bq. The main use case is downloading a completed/failed dag.

Does it error out if the dag is still running? I am guessing there might be 
cases with incomplete data (crashes, etc.) where the incomplete data is still 
useful. Does it make sense to show a pop-up informing the user that the data 
may be incomplete if the dag is still running, and asking whether they still 
want to download it? 

 Tez UI: download timeline data for offline use.
 ---

 Key: TEZ-2159
 URL: https://issues.apache.org/jira/browse/TEZ-2159
 Project: Apache Tez
  Issue Type: Improvement
  Components: UI
Reporter: Prakash Ramachandran
Assignee: Prakash Ramachandran
 Attachments: TEZ-2159.1.patch, TEZ-2159.2.patch, TEZ-2159.wip.1.patch


 It is useful to be able to download the timeline data for a dag for 
 offline analysis. For example, TEZ-2076 uses the timeline data to do offline 
 analysis of a Tez application run. 





[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14452286#comment-14452286
 ] 

Bikas Saha commented on TEZ-2234:
-

Attaching patch that should fix the tests.

The patch enhances the API in a simple and generic manner by providing 
input/output stats reporter interfaces for the tasks and a VertexStatistics 
provider interface to the vertex manager plugins. This allows API evolution to 
be restricted to these new interfaces. For now, only a setDataSize() has been 
added to the reporters and a getDataSize() has been added to the providers. 
That sums up the changes.
[~rajesh.balamohan] [~sseth] [~hitesh] Please review.
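
For readers without the patch at hand, the shape described above might look like the sketch below. Only VertexStatistics, setDataSize() and getDataSize() are named in this comment; the reporter interface name, the aggregating class and the per-source-vertex lookup are assumptions made for illustration and may not match the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: tasks report output sizes, a vertex manager reads them per source vertex. */
final class VertexStatsSketch {

    /** Task-side reporter; the interface name is assumed, setDataSize() is from the comment. */
    interface OutputStatsReporter {
        void setDataSize(long bytes);
    }

    /** Vertex-manager-side provider; getDataSize() is from the comment. */
    interface VertexStatistics {
        long getDataSize();
    }

    private final Map<String, Long> bytesBySourceVertex = new HashMap<>();

    /** One reporter per task; each task is assumed to report its size once. */
    OutputStatsReporter reporterFor(String sourceVertex) {
        return bytes -> bytesBySourceVertex.merge(sourceVertex, bytes, Long::sum);
    }

    /** Aggregated view of everything reported so far for one source vertex. */
    VertexStatistics statisticsFor(String sourceVertex) {
        return () -> bytesBySourceVertex.getOrDefault(sourceVertex, 0L);
    }

    public static void main(String[] args) {
        VertexStatsSketch stats = new VertexStatsSketch();
        stats.reporterFor("Map_1").setDataSize(100L);
        stats.reporterFor("Map_1").setDataSize(250L);
        stats.reporterFor("Map_2").setDataSize(40L);
        // A vertex manager could compare these totals before reconfiguring.
        System.out.println("Map_1 bytes: " + stats.statisticsFor("Map_1").getDataSize());
        System.out.println("Map_2 bytes: " + stats.statisticsFor("Map_2").getDataSize());
    }
}
```

Keeping the getters on a small provider interface means new statistics can be added later without changing the vertex manager plugin signature, which is the evolution argument made above.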

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





[jira] [Commented] (TEZ-2234) Allow vertex managers to get output size per source vertex

2015-04-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14459559#comment-14459559
 ] 

Hadoop QA commented on TEZ-2234:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12719166/TEZ-2234.1.patch
  against master revision 5e2a55f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/393//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/393//console

This message is automatically generated.

 Allow vertex managers to get output size per source vertex
 --

 Key: TEZ-2234
 URL: https://issues.apache.org/jira/browse/TEZ-2234
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2234.1.patch


 Vertex managers may need per source vertex output stats to make 
 reconfiguration decisions.





Success: TEZ-2234 PreCommit Build #393

2015-04-05 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2234
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/393/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2783 lines...]
[INFO] Final Memory: 72M/994M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12719166/TEZ-2234.1.patch
  against master revision 5e2a55f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/393//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/393//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
c7cb4dfec518506a0e0c243639af54be1d7610ab logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #387
Archived 44 artifacts
Archive block size is 32768
Received 16 blocks and 2205485 bytes
Compression is 19.2%
Took 0.83 sec
Description set: TEZ-2234
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

Failed: TEZ-2234 PreCommit Build #392

2015-04-05 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2234
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/392/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2976 lines...]
{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestTezJobs
  org.apache.tez.test.TestDAGRecovery
  org.apache.tez.test.TestSecureShuffle
  org.apache.tez.test.TestFaultTolerance
  org.apache.tez.mapreduce.TestMRRJobs
  org.apache.tez.test.TestAMRecovery

  The following test timeouts occurred in :
 org.apache.tez.mapreduce.TestMRRJobsDAGApi

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/392//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/392//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
650f052231bee0a5e61600bd8479428dc66a5b07 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #387
Archived 44 artifacts
Archive block size is 32768
Received 16 blocks and 2215863 bytes
Compression is 19.1%
Took 0.89 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
44 tests failed.
FAILED:  org.apache.tez.mapreduce.TestMRRJobs.testMRRSleepJob

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.tez.mapreduce.TestMRRJobs.testMRRSleepJob(TestMRRJobs.java:132)


FAILED:  org.apache.tez.mapreduce.TestMRRJobs.testMRRSleepJobWithCompression

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.tez.mapreduce.TestMRRJobs.testMRRSleepJobWithCompression(TestMRRJobs.java:290)


FAILED:  org.apache.tez.mapreduce.TestMRRJobs.testFailingAttempt

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.tez.mapreduce.TestMRRJobs.testFailingAttempt(TestMRRJobs.java:252)


FAILED:  org.apache.tez.mapreduce.TestMRRJobs.testRandomWriter

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.tez.mapreduce.TestMRRJobs.testRandomWriter(TestMRRJobs.java:167)


FAILED:  
org.apache.tez.test.TestAMRecovery.testVertexPartiallyFinished_Broadcast

Error Message:
expected:SUCCEEDED but was:FAILED

Stack Trace:
java.lang.AssertionError: expected:SUCCEEDED but was:FAILED
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 

[jira] [Commented] (TEZ-2159) Tez UI: download timeline data for offline use.

2015-04-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396238#comment-14396238
 ] 

Hadoop QA commented on TEZ-2159:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12709469/TEZ-2159.2.patch
  against master revision 5e2a55f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/390//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/390//console

This message is automatically generated.

 Tez UI: download timeline data for offline use.
 ---

 Key: TEZ-2159
 URL: https://issues.apache.org/jira/browse/TEZ-2159
 Project: Apache Tez
  Issue Type: Improvement
  Components: UI
Reporter: Prakash Ramachandran
Assignee: Prakash Ramachandran
 Attachments: TEZ-2159.1.patch, TEZ-2159.2.patch, TEZ-2159.wip.1.patch


 It is useful to be able to download the timeline data for a dag for 
 offline analysis. For example, TEZ-2076 uses the timeline data to do offline 
 analysis of a Tez application run. 





Failed: TEZ-2159 PreCommit Build #390

2015-04-05 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2159
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/390/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 2775 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12709469/TEZ-2159.2.patch
  against master revision 5e2a55f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/390//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/390//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
a9b16bd1904f34a8626a030672bca4955b09d859 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #387
Archived 44 artifacts
Archive block size is 32768
Received 2 blocks and 2665715 bytes
Compression is 2.4%
Took 1 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (TEZ-2159) Tez UI: download timeline data for offline use.

2015-04-05 Thread Prakash Ramachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Ramachandran updated TEZ-2159:
--
Attachment: TEZ-2159.2.patch

Thanks [~Sreenath], attaching a patch with the comments addressed.

bq. As we are batching, wouldn't the data be inconsistent / distributed across 
time. Is that OK?
The main use case is downloading a completed/failed dag. Even without 
batching, the data can be inconsistent between vertex and task.


 Tez UI: download timeline data for offline use.
 ---

 Key: TEZ-2159
 URL: https://issues.apache.org/jira/browse/TEZ-2159
 Project: Apache Tez
  Issue Type: Improvement
  Components: UI
Reporter: Prakash Ramachandran
Assignee: Prakash Ramachandran
 Attachments: TEZ-2159.1.patch, TEZ-2159.2.patch, TEZ-2159.wip.1.patch


 It is useful to be able to download the timeline data for a dag for 
 offline analysis. For example, TEZ-2076 uses the timeline data to do offline 
 analysis of a Tez application run. 


