[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge

2015-08-18 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701705#comment-14701705
 ] 

Saikat commented on TEZ-2726:
-

[~bikassaha] Yes. So would this be a proper place to raise an exception?

The idea is to throw an AMUserCodeException by checking this condition before 
sending out the CDMEs in Edge.java, in sendTezEventToDestinationTasks(), for a 
scatter-gather edge.
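
The check proposed above could look roughly like the sketch below. It assumes the condition is "the number of partitions the producer wrote differs from the number of sink tasks". `PartitionCheck`, `validate`, and the use of `IllegalStateException` are hypothetical stand-ins for illustration; this is not the Tez `Edge` code or the `AMUserCodeException` API.

```java
// Hypothetical sketch; not the actual Tez Edge.java code.
final class PartitionCheck {

    /**
     * Fails fast when the partition count written by the source does not
     * match the number of sink tasks on a scatter-gather edge, so that no
     * sink task is routed a partition index the source never wrote.
     */
    static void validate(int numPartitionsWritten, int numSinkTasks) {
        if (numPartitionsWritten != numSinkTasks) {
            // Stand-in for the Tez-specific exception discussed in the thread.
            throw new IllegalStateException(
                "SCATTER_GATHER edge: source wrote " + numPartitionsWritten
                + " partition(s) but the sink vertex has " + numSinkTasks + " task(s)");
        }
    }

    public static void main(String[] args) {
        validate(3, 3); // matching counts: passes silently
        try {
            validate(1, 3); // assumed mapping of the reported case: 1 partition, 3 sink tasks
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Running the check before any events are routed means the failure surfaces once, with a descriptive message, instead of as invalid shuffle headers in each fetcher.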

 Handle invalid number of partitions for SCATTER-GATHER edge
 ---

 Key: TEZ-2726
 URL: https://issues.apache.org/jira/browse/TEZ-2726
 Project: Apache Tez
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Saikat
Assignee: Saikat

 Encountered an issue where the source vertex has M tasks and the sink vertex 
 has N tasks (N > M) [e.g. M = 1, N = 3], and the edge is of type SCATTER-GATHER.
 This resulted in the sink vertex receiving DMEs with non-existent targetIds.
 The fetchers for the sink vertex tasks then try to retrieve the map outputs 
 and receive invalid headers due to an exception in the ShuffleHandler.
 Possible fixes:
 1. Raise a proper Tez exception to indicate this invalid scenario.
 2. Or write appropriate empty partition bits for the missing partitions 
 before sending out the DMEs to the sink vertex. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2294) Add tez-site-template.xml with description of config properties

2015-08-18 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701821#comment-14701821
 ] 

Siddharth Seth commented on TEZ-2294:
-

A couple of general questions.
- Will this be published on the website along with the release? Should the 
template file be generated and checked into the repository?
- It looks like this is analyzing all classes for an annotation, and then 
generating appropriate files. It may be simpler to generate the annotations for 
specific files (TezConfiguration and TezRuntimeConfiguration) for now. Another 
side effect of scanning all files is the creation of the apidocs/config 
directory in all modules, even though no config files exist.
- Double/float, int/long: should these be differentiated?
- Can the type be inferred from the default value, when it exists?
- Documentation on how to generate these files would be useful (outside of mvn 
site).
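
The question about inferring the type from the default value could be approached with a heuristic like the one below. It is only a sketch: `DefaultValueTypes` and `inferType` are hypothetical names, not part of the TEZ-2294 patch. Text alone cannot distinguish float from double, or int from long when the value fits in an int, which is the ambiguity the Double/float, int/long question points at.

```java
// Hypothetical helper; not part of the TEZ-2294 patch.
final class DefaultValueTypes {

    /** Guesses a config type name from the textual default value. */
    static String inferType(String defaultValue) {
        if (defaultValue == null || defaultValue.trim().isEmpty()) {
            return "string"; // nothing to infer from
        }
        String v = defaultValue.trim();
        if (v.equalsIgnoreCase("true") || v.equalsIgnoreCase("false")) {
            return "boolean";
        }
        try {
            Integer.parseInt(v);
            return "integer";
        } catch (NumberFormatException ignored) {
            // not an int; try the wider type next
        }
        try {
            Long.parseLong(v);
            return "long";
        } catch (NumberFormatException ignored) {
            // not a long either
        }
        try {
            Double.parseDouble(v);
            return "double"; // float and double look identical as text
        } catch (NumberFormatException ignored) {
            // not numeric at all
        }
        return "string";
    }

    public static void main(String[] args) {
        System.out.println(inferType("4"));            // integer
        System.out.println(inferType("4294967296"));   // long
        System.out.println(inferType("0.5"));          // double
        System.out.println(inferType("true"));         // boolean
        System.out.println(inferType("/tmp/staging")); // string
    }
}
```

Because of these ambiguities, an explicit type on the annotation would still be needed wherever the distinction matters.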

Specifics
- An empty index.html ends up getting generated in apidocs/config, which can be 
confusing.
- {code} * @see <a 
href="../../../../../../configs/TezRuntimeConfiguration.html">Detailed 
Configuration Information</a>{code} Not sure what this will end up referring 
to, and where.
- TEZ_AM_RESOURCE_CPU_VCORES - type is string instead of integer.
- TezRuntimeConfiguration has no type information.
- Nits: Space between lines on the generated XML template.
- The XML generator likely needs some escaping. It generates invalid XML at the 
moment for TezConfiguration ()
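
As a rough illustration of the escaping the XML generator may need, the sketch below replaces the five characters that XML predefines entities for. `XmlEscape` and `escape` are hypothetical names and are not taken from the patch's XmlWriter.

```java
// Hypothetical helper; not the XmlWriter from the patch.
final class XmlEscape {

    /** Replaces the five XML special characters with their predefined entities. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a &lt; b &amp;&amp; c &gt; &quot;d&quot;
        System.out.println(escape("a < b && c > \"d\""));
    }
}
```

Escaping descriptions and default values as they are written keeps the generated template well-formed even when a description contains markup-like text.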

ConfigStandardDoclet - Has references like TEZ_SITE_XML, ENDS_WITH, 
TEZ_AM_STAGING_DIR. If using the ConfigurationProperty annotation, can all of 
these special cases be skipped ? Alternately, skip using ConfigurationProperty 
altogether.

- Some commented out code in HtmlWriter and XmlWriter

 Add tez-site-template.xml with description of config properties
 ---

 Key: TEZ-2294
 URL: https://issues.apache.org/jira/browse/TEZ-2294
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Rajesh Balamohan
Assignee: Hitesh Shah
 Attachments: TEZ-2294.4.patch, TEZ-2294.5.patch, TEZ-2294.6.patch, 
 TEZ-2294.7.patch, TEZ-2294.wip.2.patch, TEZ-2294.wip.3.patch, 
 TEZ-2294.wip.patch, TezConfiguration.html, TezRuntimeConfiguration.html, 
 tez-default-template.xml, tez-runtime-default-template.xml


 Document all tez configs with descriptions and default values. 
 Also, document MR configs that can be easily translated to Tez configs via 
 Tez helpers.





[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge

2015-08-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701684#comment-14701684
 ] 

Bikas Saha commented on TEZ-2726:
-

Ah. So the producer task wrote data and that generated a composite event. The 
edge was scatter-gather, so it expanded that event based on the number of 
downstream tasks (where num tasks == num partitions). Each downstream task 
got an input with a different partition index, so the ones that got indices 1 
and 2 got the exception. 
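
The expansion described above can be modelled as follows. This is an illustrative model rather than Tez code: `CompositeExpansion`, `expand`, and `missing` are hypothetical names, and each routed event is represented only by its partition index.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of composite-event expansion; not the Tez implementation.
final class CompositeExpansion {

    /** One routed event per downstream task; task i is handed partition index i. */
    static List<Integer> expand(int numDownstreamTasks) {
        List<Integer> partitionIndices = new ArrayList<>();
        for (int task = 0; task < numDownstreamTasks; task++) {
            partitionIndices.add(task);
        }
        return partitionIndices;
    }

    /** Indices handed out by the expansion that the producer never wrote. */
    static List<Integer> missing(int numPartitionsWritten, int numDownstreamTasks) {
        List<Integer> result = new ArrayList<>();
        for (int index : expand(numDownstreamTasks)) {
            if (index >= numPartitionsWritten) {
                result.add(index);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // One partition written, three downstream tasks.
        System.out.println(missing(1, 3)); // prints [1, 2]
    }
}
```

With one partition written and three downstream tasks, the model reports indices 1 and 2 as unbacked, matching the tasks that hit the exception.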

 Handle invalid number of partitions for SCATTER-GATHER edge
 ---

 Key: TEZ-2726
 URL: https://issues.apache.org/jira/browse/TEZ-2726
 Project: Apache Tez
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Saikat
Assignee: Saikat

 Encountered an issue where the source vertex has M tasks and the sink vertex 
 has N tasks (N > M) [e.g. M = 1, N = 3], and the edge is of type SCATTER-GATHER.
 This resulted in the sink vertex receiving DMEs with non-existent targetIds.
 The fetchers for the sink vertex tasks then try to retrieve the map outputs 
 and receive invalid headers due to an exception in the ShuffleHandler.
 Possible fixes:
 1. Raise a proper Tez exception to indicate this invalid scenario.
 2. Or write appropriate empty partition bits for the missing partitions 
 before sending out the DMEs to the sink vertex. 





[jira] [Commented] (TEZ-2164) Shade the guava version used by Tez

2015-08-18 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701884#comment-14701884
 ] 

Siddharth Seth commented on TEZ-2164:
-

{code}
[ERROR] Failed to execute goal on project tez-api: Could not resolve 
dependencies for project org.apache.tez:tez-api:jar:0.8.0-SNAPSHOT: Failure to 
find org.apache.tez:guava-tez:jar:18.0 in 
https://repository.apache.org/content/repositories/snapshots was cached in the 
local repository, resolution will not be reattempted until the update interval 
of apache.snapshots.https has elapsed or updates are forced - [Help 1]
{code}
Including guava-tez in the modules set in the top level pom gets further, but 
then fails with 
{code}
[INFO] tez-job-analyzer .. SUCCESS [0.194s]
[INFO] tez-dist .. FAILURE [0.072s]
[INFO] Tez ... SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 39.757s
[INFO] Finished at: Tue Aug 18 12:54:32 PDT 2015
[INFO] Final Memory: 81M/480M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-assembly-plugin:2.4:single (package-tez) on 
project tez-dist: Failed to create assembly: Error creating assembly archive 
tez-dist: You must set at least one file. - [Help 1]
[ERROR]
{code}

One question. Does the shade plugin allow usage of the original package names 
in code, and have the shading done post compile ? Otherwise, there'll be two 
options of each guava class - and we'll have to monitor each patch to avoid 
this.




 Shade the guava version used by Tez
 ---

 Key: TEZ-2164
 URL: https://issues.apache.org/jira/browse/TEZ-2164
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Hitesh Shah
Priority: Critical
 Attachments: TEZ-2164.3.patch, TEZ-2164.wip.2.patch, 
 allow-guava-16.0.1.patch


 Should allow us to upgrade to a newer version without shipping a guava 
 dependency.
 Would be good to do this in 0.7 so that we stop shipping guava as early as 
 possible.





[jira] [Updated] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json

2015-08-18 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-2730:
-
Attachment: TEZ-2730.1.patch

 tez-api missing dependency on org.codehaus.jettison for json 
 -

 Key: TEZ-2730
 URL: https://issues.apache.org/jira/browse/TEZ-2730
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-2730.1.patch








[jira] [Commented] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json

2015-08-18 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702032#comment-14702032
 ] 

Hitesh Shah commented on TEZ-2730:
--

[~sseth] [~bikassaha] review please. 

 tez-api missing dependency on org.codehaus.jettison for json 
 -

 Key: TEZ-2730
 URL: https://issues.apache.org/jira/browse/TEZ-2730
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-2730.1.patch








[jira] [Assigned] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json

2015-08-18 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah reassigned TEZ-2730:


Assignee: Hitesh Shah

 tez-api missing dependency on org.codehaus.jettison for json 
 -

 Key: TEZ-2730
 URL: https://issues.apache.org/jira/browse/TEZ-2730
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah







[jira] [Updated] (TEZ-2687) ATS History shutdown happens before the min-held containers are released

2015-08-18 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2687:

Attachment: TEZ-2687-1.patch

 ATS History shutdown happens before the min-held containers are released
 

 Key: TEZ-2687
 URL: https://issues.apache.org/jira/browse/TEZ-2687
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.2, 0.8.0, 0.7.1
Reporter: Gopal V
Assignee: Jeff Zhang
 Attachments: TEZ-2687-1.patch


 When ATS goes into a GC pause under heavy loads and while it recovers, each 
 Tez AM holds onto a few containers even though it is shutting down and will 
 never accept any more DAGs.





[jira] [Commented] (TEZ-2687) ATS History shutdown happens before the min-held containers are released

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702420#comment-14702420
 ] 

Jeff Zhang commented on TEZ-2687:
-

Uploaded a patch to fix it.  [~hitesh] [~bikassaha] Please help review. 
* Release the held containers before DAGAppMaster#stopServices is called.
* Release a container at once if it is newly allocated while the AM is 
shutting down.
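
The ordering in the two points above can be sketched with a small model. The class and method names here are hypothetical and are not the DAGAppMaster or YarnTaskSchedulerService API; the model only records the order in which actions happen.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the shutdown ordering; not Tez code.
final class ShutdownOrder {
    private final List<String> heldContainers = new ArrayList<>();
    private final List<String> actions = new ArrayList<>();
    private boolean shuttingDown = false;

    /** Called when a container is handed to the application master. */
    void onContainerAllocated(String containerId) {
        if (shuttingDown) {
            // Nothing will run on it, so give it back at once.
            actions.add("release " + containerId);
        } else {
            heldContainers.add(containerId);
        }
    }

    /** Releases held containers first, then stops the (possibly slow) services. */
    void shutdown() {
        shuttingDown = true;
        for (String id : heldContainers) {
            actions.add("release " + id);
        }
        heldContainers.clear();
        actions.add("stop services"); // e.g. a history flush that may block
    }

    List<String> actions() {
        return actions;
    }

    public static void main(String[] args) {
        ShutdownOrder am = new ShutdownOrder();
        am.onContainerAllocated("c1");
        am.onContainerAllocated("c2");
        am.shutdown();
        am.onContainerAllocated("c3"); // arrives after shutdown started
        // prints [release c1, release c2, stop services, release c3]
        System.out.println(am.actions());
    }
}
```

Releasing before the service stop means a slow history service no longer delays returning containers to the cluster.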

I simulated the ATS hang by adding a Thread.sleep in 
HistoryEventHandler#serviceStop.
Here's the log without this patch. Containers are only released when they 
expire. 
{noformat}
2015-08-19 12:05:01,645 INFO [IPC Server handler 0 on 49920] app.DAGAppMaster: 
DAGAppMasterShutdownHandler invoked
2015-08-19 12:05:01,645 INFO [IPC Server handler 0 on 49920] app.DAGAppMaster: 
Handling DAGAppMaster shutdown
2015-08-19 12:05:01,645 INFO [AMShutdownThread] app.DAGAppMaster: Sleeping for 
5 seconds before shutting down
2015-08-19 12:05:06,646 INFO [AMShutdownThread] app.DAGAppMaster: Calling stop 
for all the services
2015-08-19 12:05:06,647 INFO [AMShutdownThread] history.HistoryEventHandler: 
Stopping HistoryEventHandler
2015-08-19 12:05:23,083 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: No taskRequests. Container's idle timeout delay 
expired or is new. Releasing container, 
containerId=container_1439946425329_0022_01_03, 
containerExpiryTime=1439957123078, idleTimeout=2, taskRequestsCount=0, 
heldContainers=4, delayedContainers=3, isNew=false
2015-08-19 12:05:23,083 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Releasing unused container: 
container_1439946425329_0022_01_03
2015-08-19 12:05:23,083 INFO [Dispatcher thread: Central] 
history.HistoryEventHandler: 
[HISTORY][DAG:dag_1439946425329_0022_1][Event:CONTAINER_STOPPED]: 
containerId=container_1439946425329_0022_01_03, stoppedTime=1439957123083, 
exitStatus=0
2015-08-19 12:05:23,083 INFO [Dispatcher thread: Central] 
container.AMContainerImpl: AMContainer container_1439946425329_0022_01_03 
transitioned from IDLE to STOP_REQUESTED via event C_STOP_REQUEST
2015-08-19 12:05:23,083 INFO [ContainerLauncher #6] 
launcher.ContainerLauncherImpl: Processing the event EventType: 
CONTAINER_STOP_REQUEST
2015-08-19 12:05:23,084 INFO [ContainerLauncher #6] 
launcher.ContainerLauncherImpl: Sending a stop request to the NM for 
ContainerId: container_1439946425329_0022_01_03
2015-08-19 12:05:23,084 INFO [ContainerLauncher #6] 
impl.ContainerManagementProtocolProxy: Opening proxy : 192.168.3.3:50421
2015-08-19 12:05:23,090 INFO [Dispatcher thread: Central] 
container.AMContainerImpl: AMContainer container_1439946425329_0022_01_03 
transitioned from STOP_REQUESTED to STOPPING via event C_NM_STOP_SENT
2015-08-19 12:05:23,222 INFO [IPC Server handler 1 on 49919] 
app.TaskAttemptListenerImpTezDag: Container with id: 
container_1439946425329_0022_01_03 is valid, but no longer registered, and 
will be killed
2015-08-19 12:05:23,373 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: Released container 
completed:container_1439946425329_0022_01_03 last allocated to task: 
attempt_1439946425329_0022_1_02_03_0
2015-08-19 12:05:23,373 INFO [Dispatcher thread: Central] 
container.AMContainerImpl: Container container_1439946425329_0022_01_03 
exited with diagnostics set to Container failed, exitCode=-100. Container 
released by application
2015-08-19 12:05:23,373 INFO [Dispatcher thread: Central] 
container.AMContainerImpl: AMContainer container_1439946425329_0022_01_03 
transitioned from STOPPING to COMPLETED via event C_COMPLETED
{noformat}

Here's the log with this patch. Containers are released explicitly when 
shutdown is invoked. 
{noformat}
2015-08-19 12:07:37,137 INFO [IPC Server handler 0 on 50138] app.DAGAppMaster: 
DAGAppMasterShutdownHandler invoked
2015-08-19 12:07:37,137 INFO [IPC Server handler 0 on 50138] app.DAGAppMaster: 
Handling DAGAppMaster shutdown
2015-08-19 12:07:37,138 INFO [AMShutdownThread] app.DAGAppMaster: Sleeping for 
5 seconds before shutting down
2015-08-19 12:07:42,139 INFO [AMShutdownThread] app.DAGAppMaster: Calling stop 
for all the services
2015-08-19 12:07:42,139 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: 
Realease held containers
2015-08-19 12:07:42,139 INFO [Dispatcher thread: Central] 
history.HistoryEventHandler: 
[HISTORY][DAG:dag_1439946425329_0023_1][Event:CONTAINER_STOPPED]: 
containerId=container_1439946425329_0023_01_05, stoppedTime=1439957262139, 
exitStatus=0
2015-08-19 12:07:42,139 INFO [Dispatcher thread: Central] 
container.AMContainerImpl: AMContainer container_1439946425329_0023_01_05 
transitioned from IDLE to STOP_REQUESTED via event C_STOP_REQUEST
2015-08-19 12:07:42,139 INFO [Dispatcher thread: Central] 
history.HistoryEventHandler: 
[HISTORY][DAG:dag_1439946425329_0023_1][Event:CONTAINER_STOPPED]: 

[jira] [Updated] (TEZ-2687) ATS History shutdown happens before the min-held containers are released

2015-08-18 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2687:

Attachment: (was: TEZ-2687-1.patch)

 ATS History shutdown happens before the min-held containers are released
 

 Key: TEZ-2687
 URL: https://issues.apache.org/jira/browse/TEZ-2687
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.2, 0.8.0, 0.7.1
Reporter: Gopal V
Assignee: Jeff Zhang

 When ATS goes into a GC pause under heavy loads and while it recovers, each 
 Tez AM holds onto a few containers even though it is shutting down and will 
 never accept any more DAGs.





Failed: TEZ-2690 PreCommit Build #1004

2015-08-18 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2690
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1004/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 25 lines...]
==
Testing patch for TEZ-2690.
==
==


HEAD is now at 24ca1de TEZ-2730. tez-api missing dependency on 
org.codehaus.jettison for json. (hitesh)
Previous HEAD position was 24ca1de... TEZ-2730. tez-api missing dependency on 
org.codehaus.jettison for json. (hitesh)
Switched to branch 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use git pull to update your local branch)
First, rewinding head to replay your work on top of it...
Fast-forwarded master to 24ca1de0e12da3f9d165f1eda44c7076de0f2f12.
TEZ-2690 patch is being downloaded at Wed Aug 19 04:32:43 UTC 2015 from
http://issues.apache.org/jira/secure/attachment/12751186/criticalPath.jpg
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
The patch does not appear to apply with p0 to p2
PATCH APPLICATION FAILED




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12751186/criticalPath.jpg
  against master revision 24ca1de.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1004//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
635728cb2868e162e9dfd38f7f347fa64616de53 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Created] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json

2015-08-18 Thread Hitesh Shah (JIRA)
Hitesh Shah created TEZ-2730:


 Summary: tez-api missing dependency on org.codehaus.jettison for 
json 
 Key: TEZ-2730
 URL: https://issues.apache.org/jira/browse/TEZ-2730
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah








[jira] [Issue Comment Deleted] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2690:

Comment: was deleted

(was: Adds a task attempt level critical path analyzer.
Uses the scheduling and data event dependencies to walk from the last attempt 
completion to first attempt creation to account for the time taken in the job.
The output of the analyzer is an svg rendering of the critical path. Attached 
sample. The svg code has been re-written to generate svg directly instead of 
using jaxb because of missing features in jaxb (e.g. setting the value of a 
text field).
Renames existing critical path analyzer to vertex level.
Adds an AnalyzerDriver to allow running analyzers from the command line using 
hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the 
driver because I am not sure how the other analyzers would behave on the 
command line. They are written to output csv results. Perhaps we can create a 
base Csv analyzer that can take the csv results and output them on the console 
or write them to a file. Then they could be run on the command line.

The goal is to get this in and have motivated developers start running it and 
finding issues/improvements.

[~rajesh.balamohan] Please review.
!criticalPath.jpg|thumbnail!)

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch, criticalPath.jpg


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Commented] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702290#comment-14702290
 ] 

Bikas Saha commented on TEZ-2690:
-

Adds a task attempt level critical path analyzer.
Uses the scheduling and data event dependencies to walk from the last attempt 
completion to first attempt creation to account for the time taken in the job.
The output of the analyzer is an svg rendering of the critical path. Attached 
sample. The svg code has been re-written to generate svg directly instead of 
using jaxb because of missing features in jaxb (e.g. setting the value of a 
text field).
Renames existing critical path analyzer to vertex level.
Adds an AnalyzerDriver to allow running analyzers from the command line using 
hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the 
driver because I am not sure how the other analyzers would behave on the 
command line. They are written to output csv results. Perhaps we can create a 
base Csv analyzer that can take the csv results and output them on the console 
or write them to a file. Then they could be run on the command line.
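
The backward walk described above (from the last attempt completion to the first attempt creation) might be sketched as below. The types are hypothetical and are not the classes used by the TEZ-2690 patch; in particular, the sketch assumes each attempt already knows which earlier attempt last unblocked it, through either a scheduling or a data event dependency.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; not the analyzer from the patch.
final class CriticalPathSketch {

    static final class Attempt {
        final String id;
        final long finishTime;
        // The attempt whose scheduling or data event last unblocked this one;
        // null for the first attempt in the job.
        final Attempt lastDependency;

        Attempt(String id, long finishTime, Attempt lastDependency) {
            this.id = id;
            this.finishTime = finishTime;
            this.lastDependency = lastDependency;
        }
    }

    /** Walks back from the attempt that finished last to the first attempt. */
    static List<String> criticalPath(List<Attempt> attempts) {
        Attempt last = null;
        for (Attempt a : attempts) {
            if (last == null || a.finishTime > last.finishTime) {
                last = a;
            }
        }
        List<String> path = new ArrayList<>();
        for (Attempt a = last; a != null; a = a.lastDependency) {
            path.add(a.id);
        }
        Collections.reverse(path); // report in execution order
        return path;
    }

    public static void main(String[] args) {
        Attempt a1 = new Attempt("a1", 10, null);
        Attempt a2 = new Attempt("a2", 25, a1);
        Attempt a3 = new Attempt("a3", 18, a1);
        Attempt a4 = new Attempt("a4", 40, a2);
        List<Attempt> all = new ArrayList<>();
        Collections.addAll(all, a1, a2, a3, a4);
        System.out.println(criticalPath(all)); // prints [a1, a2, a4]
    }
}
```

Attempt a3 finishes off the path and is therefore not reported; only the chain that determined the job's end time is.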

The goal is to get this in and have motivated developers start running it and 
finding issues/improvements.

[~rajesh.balamohan] Please review.
!criticalPath.jpg!

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch, criticalPath.jpg


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Commented] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json

2015-08-18 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702302#comment-14702302
 ] 

TezQA commented on TEZ-2730:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12751128/TEZ-2730.1.patch
  against master revision 6cb8206.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1003//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1003//console

This message is automatically generated.

 tez-api missing dependency on org.codehaus.jettison for json 
 -

 Key: TEZ-2730
 URL: https://issues.apache.org/jira/browse/TEZ-2730
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-2730.1.patch








Failed: TEZ-2730 PreCommit Build #1003

2015-08-18 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2730
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1003/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3289 lines...]



{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12751128/TEZ-2730.1.patch
  against master revision 6cb8206.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1003//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1003//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
9e41068f4e8172aa1092f4923b9989c76998a72e logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #1000
Archived 50 artifacts
Archive block size is 32768
Received 2 blocks and 3001018 bytes
Compression is 2.1%
Took 0.96 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes

2015-08-18 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702134#comment-14702134
 ] 

Jonathan Eagles commented on TEZ-2300:
--

Do we have consensus on an approach?
TezClient:stop internally calls shutdownTezAM asynchronously by sending a 
DAG_KILL event
DAGClient:tryKillDAG synchronously calls dispatcher handle event on the 
DAG_KILL event - which calls for all Vertexes, etc.


 TezClient.stop() takes a lot of time or does not work sometimes
 ---

 Key: TEZ-2300
 URL: https://issues.apache.org/jira/browse/TEZ-2300
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rohini Palaniswamy
Assignee: Jonathan Eagles
 Attachments: TEZ-2300.1.patch, TEZ-2300.2.patch, TEZ-2300.3.patch, 
 TEZ-2300.4.patch, syslog_dag_1428329756093_325099_1_post 


   Noticed this with a couple of pig scripts which were not behaving well (AM 
 close to OOM, etc.) and even with some that were running fine. Pig calls 
 TezClient.stop() in a shutdown hook. Ctrl+C to the pig script either exits 
 immediately or hangs. In both cases it takes a long time for the 
 yarn application to go to KILLED state. Many times I just end up calling yarn 
 application -kill separately after waiting for 5 mins or more for it to get 
 killed.





[jira] [Assigned] (TEZ-2687) ATS History shutdown happens before the min-held containers are released

2015-08-18 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang reassigned TEZ-2687:
---

Assignee: Jeff Zhang

 ATS History shutdown happens before the min-held containers are released
 

 Key: TEZ-2687
 URL: https://issues.apache.org/jira/browse/TEZ-2687
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.2, 0.8.0, 0.7.1
Reporter: Gopal V
Assignee: Jeff Zhang

 When ATS goes into a GC pause under heavy loads and while it recovers, each 
 Tez AM holds onto a few containers even though it is shutting down and will 
 never accept any more DAGs.





[jira] [Commented] (TEZ-2164) Shade the guava version used by Tez

2015-08-18 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702181#comment-14702181
 ] 

Rajesh Balamohan commented on TEZ-2164:
---

- Built (guava-tez-18.jar got added to $TEZ_HOME/lib/) and tested on a multi-node 
cluster setup. Ran a couple of jobs (hive workload). Works fine without 
issues. 
- Unintentional changes can possibly be removed from ConcatenatedMergedKeyValueInput, 
DAGEventStartDag, DAGRecoveredEvent, EdgeManagerForTest, 
HistoryACLPolicyException, InitialMemoryAllocator, KVDataGen, 
OnStateChangedCallback, MultiStageMRConfToTezTranslator, Output, 
StateMachineTez, SVGUtils, TaskEventScheduleTask, 
TestHistoryEventTimelineConversion, TestShuffleInputEventHandlerOrderedGrouped, 
TezBodyDeferringAsyncHandler.
- Imports need to be rearranged (e.g. ExternalSorter, InputIntializerEvent, 
MROutput, etc.), or this can be deferred for later.

 Shade the guava version used by Tez
 ---

 Key: TEZ-2164
 URL: https://issues.apache.org/jira/browse/TEZ-2164
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Hitesh Shah
Priority: Critical
 Attachments: TEZ-2164.3.patch, TEZ-2164.wip.2.patch, 
 allow-guava-16.0.1.patch


 Should allow us to upgrade to a newer version without shipping a guava 
 dependency.
 Would be good to do this in 0.7 so that we stop shipping guava as early as 
 possible.





[jira] [Updated] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2690:

Attachment: TEZ-2690.1.patch

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Commented] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702282#comment-14702282
 ] 

Bikas Saha commented on TEZ-2690:
-

Adds a task attempt level critical path analyzer.
Uses the scheduling and data event dependencies to walk from the last attempt 
completion to first attempt creation to account for the time taken in the job.
The output of the analyzer is an svg rendering of the critical path. Attached 
sample. The svg code has been re-written to generate svg directly instead of 
using jaxb because of missing features in jaxb (e.g. setting the value of a 
text field).
Renames existing critical path analyzer to vertex level.
Adds an AnalyzerDriver to allow running analyzers from the command line using 
hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the 
driver because I am not sure how the other analyzers would behave on the 
command line. They are written to output csv results. Perhaps we can create a 
base Csv analyzer that can take the csv results and output them on the console 
or write them to a file. Then they could be run on the command line.
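
A minimal sketch of the backward walk described above, assuming each attempt records its finish time, the attempt that led to it being scheduled, and the attempt that produced its last input data event. The `AttemptSketch` class and its field names are hypothetical and are not the classes in the patch; picking the dependency that finished latest is one plausible reading of the description, not a statement of what the patch does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a task attempt; not a class from the patch.
class AttemptSketch {
  final String id;
  final long creationTime;
  final long finishTime;
  AttemptSketch schedulingCause; // attempt that led to this one being scheduled (may be null)
  AttemptSketch lastDataSource;  // attempt that produced the last input data event (may be null)

  AttemptSketch(String id, long creationTime, long finishTime) {
    this.id = id;
    this.creationTime = creationTime;
    this.finishTime = finishTime;
  }
}

class CriticalPathSketch {
  // Walk from the last attempt to finish back towards the first attempt,
  // at each step following the dependency that finished latest.
  // Assumes the dependencies form a DAG (no cycles).
  static List<AttemptSketch> criticalPath(AttemptSketch lastAttempt) {
    List<AttemptSketch> path = new ArrayList<>();
    AttemptSketch current = lastAttempt;
    while (current != null) {
      path.add(current);
      AttemptSketch data = current.lastDataSource;
      AttemptSketch sched = current.schedulingCause;
      if (data == null) {
        current = sched;
      } else if (sched == null) {
        current = data;
      } else {
        current = (data.finishTime >= sched.finishTime) ? data : sched;
      }
    }
    Collections.reverse(path); // report the path in execution order
    return path;
  }
}
```

The time spent on each hop can then be attributed by comparing the finish time of one entry with the creation and finish times of the next.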

The goal is to get this in and have motivated developers start running it and 
finding issues/improvements.

[~rajesh.balamohan] Please review.

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch, criticalPath.png


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Updated] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2690:

Attachment: criticalPath.png

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch, criticalPath.png


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Comment Edited] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702282#comment-14702282
 ] 

Bikas Saha edited comment on TEZ-2690 at 8/19/15 1:39 AM:
--

Adds a task attempt level critical path analyzer.
Uses the scheduling and data event dependencies to walk from the last attempt 
completion to first attempt creation to account for the time taken in the job.
The output of the analyzer is an svg rendering of the critical path. Attached 
sample. The svg code has been re-written to generate svg directly instead of 
using jaxb because of missing features in jaxb (e.g. setting the value of a 
text field).
Renames existing critical path analyzer to vertex level.
Adds an AnalyzerDriver to allow running analyzers from the command line using 
hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the 
driver because I am not sure how the other analyzers would behave on the 
command line. They are written to output csv results. Perhaps we can create a 
base Csv analyzer that can take the csv results and output them on the console 
or write them to a file. Then they could be run on the command line.
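
The base CSV analyzer idea could look roughly like the sketch below, assuming results are held as a header row plus string rows. `CsvResultSketch` is a hypothetical name; nothing here is taken from the existing analyzers.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for CSV results that can be printed to any PrintStream.
class CsvResultSketch {
  private final List<String> headers;
  private final List<List<String>> rows = new ArrayList<>();

  CsvResultSketch(String... headers) {
    this.headers = Arrays.asList(headers);
  }

  void addRow(String... values) {
    if (values.length != headers.size()) {
      throw new IllegalArgumentException("expected " + headers.size() + " values");
    }
    rows.add(Arrays.asList(values));
  }

  // Quote a field only when it contains a comma, a quote, or a newline.
  static String escape(String field) {
    if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
      return "\"" + field.replace("\"", "\"\"") + "\"";
    }
    return field;
  }

  // Render the header and all rows as CSV text.
  String render() {
    StringBuilder sb = new StringBuilder();
    appendLine(sb, headers);
    for (List<String> row : rows) {
      appendLine(sb, row);
    }
    return sb.toString();
  }

  // Same output, written to the console or to a file-backed stream.
  void writeTo(PrintStream out) {
    out.print(render());
    out.flush();
  }

  private static void appendLine(StringBuilder sb, List<String> fields) {
    for (int i = 0; i < fields.size(); i++) {
      if (i > 0) {
        sb.append(',');
      }
      sb.append(escape(fields.get(i)));
    }
    sb.append('\n');
  }
}
```

With this shape, `writeTo(System.out)` covers the console case and a `PrintStream` opened on a file covers the file case, so the same analyzer could be run from the command line either way.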

The goal is to get this in and have motivated developers start running it and 
finding issues/improvements.

[~rajesh.balamohan] Please review.
!criticalPath.png|thumbnail!


 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch, criticalPath.png


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Commented] (TEZ-2629) LimitExceededException in Tez client when DAG exceeds the default max

2015-08-18 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702207#comment-14702207
 ] 

Hitesh Shah commented on TEZ-2629:
--

+1

 LimitExceededException in Tez client when DAG exceeds the default max
 -

 Key: TEZ-2629
 URL: https://issues.apache.org/jira/browse/TEZ-2629
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Jason Dere
Assignee: Siddharth Seth
 Attachments: TEZ-2629.1.txt


 Original issue was HIVE-11303, seeing LimitExceededException when the client 
 tries to get the counters for a completed job:
 {noformat}
 2015-07-17 18:18:11,830 INFO  [main]: counters.Limits (Limits.java:ensureInitialized(59)) - Counter limits initialized with parameters:  GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64, MAX_COUNTERS=1200
 2015-07-17 18:18:11,841 ERROR [main]: exec.Task (TezTask.java:execute(189)) - Failed to execute tez graph.
 org.apache.tez.common.counters.LimitExceededException: Too many counters: 1201 max=1200
     at org.apache.tez.common.counters.Limits.checkCounters(Limits.java:87)
     at org.apache.tez.common.counters.Limits.incrCounters(Limits.java:94)
     at org.apache.tez.common.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:76)
     at org.apache.tez.common.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:93)
     at org.apache.tez.common.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:104)
     at org.apache.tez.dag.api.DagTypeConverters.convertTezCountersFromProto(DagTypeConverters.java:567)
     at org.apache.tez.dag.api.client.DAGStatus.getDAGCounters(DAGStatus.java:148)
     at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:175)
     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1673)
     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1432)
     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1213)
     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1064)
     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
     at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
     at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:497)
     at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}
 It looks like Limits.ensureInitialized() is defaulting to an empty 
 configuration, resulting in COUNTERS_MAX being set to the default of 1200 
 (even though Hive's configuration specified tez.counters.max=16000).
 Per [~sseth]:
 {quote}
 I think the Tez client does need to make this call to setup the Configuration 
 correctly. We do this for the AM and the executing task - which is why it 
 works. Could you please open a Tez jira for this ?
 Also, Limits is making use of Configuration instead of TezConfiguration for 
 default initialization, which implies changes to tez-site on the local node 
 won't be picked up.
 {quote}
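
The failure mode can be reproduced in miniature: a lazily initialized static limit falls back to its default when nobody hands it the caller's configuration first. `CounterLimitsSketch` below is a hypothetical stand-in built on `java.util.Properties`, not the Tez `Limits` class; only the property name `tez.counters.max` and the values 1200 and 16000 come from the report.

```java
import java.util.Properties;

// Hypothetical stand-in for a statically initialized limits holder.
class CounterLimitsSketch {
  static final String KEY = "tez.counters.max"; // property name from the report
  static final int DEFAULT_MAX = 1200;          // default value seen in the log
  private static Integer max;                   // null until initialized

  // Explicit setup with the caller's configuration; the first call wins.
  static synchronized void setup(Properties conf) {
    if (max == null) {
      max = Integer.parseInt(conf.getProperty(KEY, String.valueOf(DEFAULT_MAX)));
    }
  }

  // Lazy path: without a prior setup(), an empty configuration is used,
  // so the caller's setting is silently ignored.
  static synchronized int max() {
    if (max == null) {
      setup(new Properties());
    }
    return max;
  }

  // For this demonstration only.
  static synchronized void reset() {
    max = null;
  }
}
```

In this sketch a client that reads `max()` before calling `setup(conf)` sees 1200 even if `conf` asks for 16000, which matches the behaviour described; calling `setup(conf)` first yields 16000.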





[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes

2015-08-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702216#comment-14702216
 ] 

Bikas Saha commented on TEZ-2300:
-

The current patch is useful because it ensures that the app is killed after 
some max deadline.

In addition to that, if we want to ensure ATS is flushed by keeping the AM 
alive, we could, in shutdownTezAM
1) send release containers signal to the scheduler (this will reduce resource 
usage)
2) ensure DAG Kill is initiated (may already be happening but Rohini mentioned 
she saw allocations happen during this time)
3) call stop() to asynchronously stop (this includes flush to ATS)
And return.
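
The ordering above, combined with the max deadline from the current patch, could be sketched as follows. `AmShutdownHooks` and its four methods are hypothetical placeholders for the real AM operations and are not Tez APIs; the sketch only shows the sequencing and the watchdog.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical hooks; none of these are Tez APIs.
interface AmShutdownHooks {
  void releaseContainers(); // step 1: tell the scheduler to release containers
  void killDag();           // step 2: make sure the DAG kill is initiated
  void stopServices();      // step 3: slow part, e.g. flushing history events
  void forceKill();         // last resort once the deadline has passed
}

class BoundedShutdownSketch {
  // Runs steps 1 and 2 inline, starts step 3 on a background thread and returns
  // immediately. The future completes with true if the stop finished in time,
  // or false if the watchdog had to give up at the deadline.
  static CompletableFuture<Boolean> shutdown(final AmShutdownHooks am, final long deadlineMillis) {
    if (deadlineMillis <= 0) {
      throw new IllegalArgumentException("deadlineMillis must be positive");
    }
    am.releaseContainers();
    am.killDag();

    final CompletableFuture<Boolean> done = new CompletableFuture<>();
    final Thread stopper = new Thread(() -> {
      am.stopServices();
      done.complete(true);
    }, "am-stop");
    stopper.setDaemon(true);
    stopper.start();

    Thread watchdog = new Thread(() -> {
      try {
        stopper.join(deadlineMillis); // wait for the stop, but only until the deadline
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      // complete(false) succeeds only if the stop has not reported success.
      if (done.complete(false)) {
        am.forceKill();
      }
    }, "am-stop-watchdog");
    watchdog.setDaemon(true);
    watchdog.start();
    return done;
  }
}
```

The caller is not blocked by the flush, while the deadline still bounds how long the application can linger.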

Thoughts?


 TezClient.stop() takes a lot of time or does not work sometimes
 ---

 Key: TEZ-2300
 URL: https://issues.apache.org/jira/browse/TEZ-2300
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rohini Palaniswamy
Assignee: Jonathan Eagles
 Attachments: TEZ-2300.1.patch, TEZ-2300.2.patch, TEZ-2300.3.patch, 
 TEZ-2300.4.patch, syslog_dag_1428329756093_325099_1_post 


   Noticed this with a couple of pig scripts which were not behaving well (AM 
 close to OOM, etc.) and even with some that were running fine. Pig calls 
 TezClient.stop() in a shutdown hook. Ctrl+C to the pig script either exits 
 immediately or hangs. In both cases it takes a long time for the 
 yarn application to go to the KILLED state. Many times I just end up calling yarn 
 application -kill separately after waiting for 5 mins or more for it to get 
 killed.





[jira] [Updated] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2690:

Attachment: criticalPath.jpg

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch, criticalPath.jpg


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Comment Edited] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702282#comment-14702282
 ] 

Bikas Saha edited comment on TEZ-2690 at 8/19/15 1:41 AM:
--

Adds a task attempt level critical path analyzer.
Uses the scheduling and data event dependencies to walk from the last attempt 
completion to first attempt creation to account for the time taken in the job.
The output of the analyzer is an svg rendering of the critical path. Attached 
sample. The svg code has been re-written to generate svg directly instead of 
using jaxb because of missing features in jaxb (e.g. setting the value of a 
text field).
Renames existing critical path analyzer to vertex level.
Adds an AnalyzerDriver to allow running analyzers from the command line using 
hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the 
driver because I am not sure how the other analyzers would behave on the 
command line. They are written to output csv results. Perhaps we can create a 
base Csv analyzer that can take the csv results and output them on the console 
or write them to a file. Then they could be run on the command line.
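
Generating the SVG directly needs little more than string building plus XML escaping. The sketch below is an illustration under that assumption; `SvgSketch`, the bar layout, and the colours are made up and do not reflect the SVG code in the patch.

```java
// Hypothetical direct SVG writer: one labeled bar per step, stacked top to bottom.
class SvgSketch {
  // Escape the characters that are special in XML text and attribute values.
  static String escape(String s) {
    return s.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\"", "&quot;");
  }

  static String render(String[] labels, int[] widths) {
    if (labels.length != widths.length) {
      throw new IllegalArgumentException("labels and widths must have the same length");
    }
    int rowHeight = 20;
    int maxWidth = 0;
    for (int w : widths) {
      maxWidth = Math.max(maxWidth, w);
    }
    StringBuilder sb = new StringBuilder();
    sb.append("<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"").append(maxWidth)
        .append("\" height=\"").append(labels.length * rowHeight).append("\">\n");
    for (int i = 0; i < labels.length; i++) {
      int y = i * rowHeight;
      sb.append("  <rect x=\"0\" y=\"").append(y)
          .append("\" width=\"").append(widths[i])
          .append("\" height=\"").append(rowHeight - 2)
          .append("\" fill=\"lightblue\"/>\n");
      // The text value is set directly, which is the part reported as awkward with jaxb.
      sb.append("  <text x=\"4\" y=\"").append(y + 14).append("\">")
          .append(escape(labels[i])).append("</text>\n");
    }
    sb.append("</svg>\n");
    return sb.toString();
  }
}
```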

The goal is to get this in and have motivated developers start running it and 
finding issues/improvements.

[~rajesh.balamohan] Please review.
!criticalPath.jpg|thumbnail!


 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch, criticalPath.jpg


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Updated] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2690:

Attachment: (was: TEZ-2690.1.patch)

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha

 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Updated] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2690:

Attachment: (was: criticalPath.png)

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Commented] (TEZ-2690) Add critical path analyser

2015-08-18 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702537#comment-14702537
 ] 

Rajesh Balamohan commented on TEZ-2690:
---

lgtm overall. Minor comments.

- TezAnalyzerBase 
-- Since the analyzer is not going to download the data, it might be good to 
add a comment about the DagId whose data needs to be downloaded.
-- Is the main() function needed in the base class? Or is it given mainly as an 
example?
-- Since the base class already extends Configured, Analyzer.getConfiguration() 
should be removed. But this would be a separate JIRA to let all analyzers extend 
TezAnalyzerBase.
-- Printing usage might be useful (e.g. one currently needs to refer to the code 
for the optional parameter outputDir).
- Are the changes in VertexInfo unintentional?
- SVGUtils - It might break the earlier drawVertex(DagInfo). But that can be added 
later as part of refactoring the other analyzers to extend TezAnalyzerBase.
- CriticalPathAnalyzer
-- getLastDataEventTime, getCreationTime etc. were added as part of TEZ-2701. 
So if we try to parse older logs (e.g. 0.8/0.7/0.6), it might return 0 
for currentAttempt.getLastDataEventTime().
-- Should (currentAttempt.getLastDataEventTime() > 0) checks be added for such 
cases to fail fast if the logs do not have those details? Other calculations 
(e.g. if (!Strings.isNullOrEmpty(currentAttempt.getLastDataEventSourceTA()))) 
can also become invalid. So it might be good to consider failing fast if the 
logs do not have the info that the analyzer is looking for.
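
A fail-fast guard along these lines could be sketched as follows. The getter names follow the comment above; the `AttemptTimesSketch` holder and the choice of `IllegalStateException` are assumptions for illustration. Whether 0 can also be a legitimate value (for example for attempts that receive no input events) would need checking against the parser before adopting such a check.

```java
// Hypothetical view of the fields the analyzer needs; only the getter names
// are taken from the review comment.
class AttemptTimesSketch {
  private final String attemptId;
  private final long creationTime;
  private final long lastDataEventTime;

  AttemptTimesSketch(String attemptId, long creationTime, long lastDataEventTime) {
    this.attemptId = attemptId;
    this.creationTime = creationTime;
    this.lastDataEventTime = lastDataEventTime;
  }

  String getAttemptId() { return attemptId; }
  long getCreationTime() { return creationTime; }
  long getLastDataEventTime() { return lastDataEventTime; }
}

class FailFastSketch {
  // Reject attempts whose timing fields are missing (reported as 0 for older logs)
  // instead of letting later calculations run on invalid values.
  static void requireTimingInfo(AttemptTimesSketch attempt) {
    if (attempt.getCreationTime() <= 0 || attempt.getLastDataEventTime() <= 0) {
      throw new IllegalStateException("Attempt " + attempt.getAttemptId()
          + " has no creation time or last data event time;"
          + " the history data may come from a release without these fields");
    }
  }
}
```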

Will go through the CriticalPathAnalyzer more in detail and post comments if 
any.

 Add critical path analyser
 --

 Key: TEZ-2690
 URL: https://issues.apache.org/jira/browse/TEZ-2690
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: TEZ-2690.1.patch, criticalPath.jpg


 Use input and scheduling dependencies to create critical path for a DAG.





[jira] [Updated] (TEZ-2725) Tez UI: Unit tests framework integration

2015-08-18 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2725:

Description: 
- Investigate the best UT framework for Tez UI, and integrate it into 
the codebase.
- UTs for each module would be added as part of the respective patch.

 Tez UI: Unit tests framework integration
 

 Key: TEZ-2725
 URL: https://issues.apache.org/jira/browse/TEZ-2725
 Project: Apache Tez
  Issue Type: Bug
Reporter: Sreenath Somarajapuram
Assignee: Sreenath Somarajapuram

 - Investigate the best UT framework for Tez UI, and integrate it 
 into the codebase.
 - UTs for each module would be added as part of the respective patch.





[jira] [Created] (TEZ-2729) Standalone sample UI for running tez job analyzers

2015-08-18 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created TEZ-2729:
-

 Summary: Standalone sample UI for running tez job analyzers
 Key: TEZ-2729
 URL: https://issues.apache.org/jira/browse/TEZ-2729
 Project: Apache Tez
  Issue Type: Wish
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan








[jira] [Updated] (TEZ-2729) Standalone sample UI for running tez job analyzers

2015-08-18 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2729:
--
Attachment: TEZ-2729.WIP.PoC.1.patch

Attaching a very preliminary PoC patch which uses the Vaadin web framework for 
rendering analyzer results. Users can download ATS data via ATSImportTool or 
from tez-ui. The downloaded zip file can be uploaded to this UI for analysis (i.e. 
to run a bunch of analyzers and render the results, mostly in CSV format). Please 
note that the UI is just PoC/sample code. Installation instructions are provided 
in INSTALL.txt.

 Standalone sample UI for running tez job analyzers
 --

 Key: TEZ-2729
 URL: https://issues.apache.org/jira/browse/TEZ-2729
 Project: Apache Tez
  Issue Type: Wish
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-2729.WIP.PoC.1.patch








[jira] [Updated] (TEZ-2725) Tez UI: Unit tests framework integration

2015-08-18 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2725:

Summary: Tez UI: Unit tests framework integration  (was: Tez UI: Unit tests)

 Tez UI: Unit tests framework integration
 

 Key: TEZ-2725
 URL: https://issues.apache.org/jira/browse/TEZ-2725
 Project: Apache Tez
  Issue Type: Bug
Reporter: Sreenath Somarajapuram
Assignee: Sreenath Somarajapuram







[jira] [Commented] (TEZ-2725) Tez UI: Unit tests framework integration

2015-08-18 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701328#comment-14701328
 ] 

Sreenath Somarajapuram commented on TEZ-2725:
-

Have added the details. This ticket is only for incorporating the UT 
framework.

 Tez UI: Unit tests framework integration
 

 Key: TEZ-2725
 URL: https://issues.apache.org/jira/browse/TEZ-2725
 Project: Apache Tez
  Issue Type: Bug
Reporter: Sreenath Somarajapuram
Assignee: Sreenath Somarajapuram

 - Investigate the best UT framework for Tez UI, and integrate it 
 into the codebase.
 - UTs for each module would be added as part of the respective patch.





[jira] [Commented] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json

2015-08-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702037#comment-14702037
 ] 

Bikas Saha commented on TEZ-2730:
-

lgtm

 tez-api missing dependency on org.codehaus.jettison for json 
 -

 Key: TEZ-2730
 URL: https://issues.apache.org/jira/browse/TEZ-2730
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-2730.1.patch








[jira] [Comment Edited] (TEZ-2164) Shade the guava version used by Tez

2015-08-18 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702036#comment-14702036
 ] 

Hitesh Shah edited comment on TEZ-2164 at 8/18/15 9:42 PM:
---

You will need to compile guava-tez first - it is not part of the top-level 
module list. I tried adding it but tez-dist hit some errors. 

bq.  Does the shade plugin allow usage of the original package names in code, 
and have the shading done post compile ?

I tried that approach but was not successful, for several reasons:
   - the relocation happens on jar creation
   - unit tests in other modules that reference internal APIs using guava 
break, as they use the normal guava packages and not the relocated ones
   - the only seamless way to do this is to create a fat jar in the tez-dist assembly 
with all guava relocated. However, this still does not solve the case where Tez 
wants to use a newer guava version. 

And yes, we will need to monitor each patch to see that com.google does not 
creep in - which is a possibility given that we cannot remove guava-11 as a 
compile-time dependency (caused by Hadoop YARN using guava objects in its APIs).
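
The per-patch monitoring could be partly automated. The sketch below scans source lines for imports of the unshaded Guava package `com.google.common`; the class name is hypothetical, and the relocated package name is deliberately not used because it is not stated in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for spotting unshaded Guava imports in a source file.
class UnshadedGuavaCheckSketch {
  static final String GUAVA_PREFIX = "com.google.common.";

  // Returns the 1-based line numbers of import statements (regular or static)
  // that reference the unshaded Guava package.
  static List<Integer> findUnshadedImports(List<String> sourceLines) {
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < sourceLines.size(); i++) {
      String line = sourceLines.get(i).trim();
      if (!line.startsWith("import ")) {
        continue;
      }
      String target = line.substring("import ".length()).trim();
      if (target.startsWith("static ")) {
        target = target.substring("static ".length()).trim();
      }
      if (target.startsWith(GUAVA_PREFIX)) {
        hits.add(i + 1);
      }
    }
    return hits;
  }
}
```

The input lines could come from `java.nio.file.Files.readAllLines` on each file touched by a patch. Fully qualified uses inside method bodies would not be caught by this import-only check.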


 Shade the guava version used by Tez
 ---

 Key: TEZ-2164
 URL: https://issues.apache.org/jira/browse/TEZ-2164
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Hitesh Shah
Priority: Critical
 Attachments: TEZ-2164.3.patch, TEZ-2164.wip.2.patch, 
 allow-guava-16.0.1.patch


 Should allow us to upgrade to a newer version without shipping a guava 
 dependency.
 Would be good to do this in 0.7 so that we stop shipping guava as early as 
 possible.





[jira] [Commented] (TEZ-2164) Shade the guava version used by Tez

2015-08-18 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702036#comment-14702036
 ] 

Hitesh Shah commented on TEZ-2164:
--

You will need to compile guava-tez first - it is not part of the top-level 
module list. 

bq.  Does the shade plugin allow usage of the original package names in code, 
and have the shading done post compile ?

I tried that approach but was not successful, for several reasons:
   - the relocation happens on jar creation
   - unit tests in other modules that reference internal APIs using guava 
break, as they use the normal guava packages and not the relocated ones
   - the only seamless way to do this is to create a fat jar in the tez-dist assembly 
with all guava relocated. However, this still does not solve the case where Tez 
wants to use a newer guava version. 

And yes, we will need to monitor each patch to see that com.google does not 
creep in - which is a possibility given that we cannot remove guava-11 as a 
compile-time dependency (caused by Hadoop YARN using guava objects in its APIs).

 Shade the guava version used by Tez
 ---

 Key: TEZ-2164
 URL: https://issues.apache.org/jira/browse/TEZ-2164
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Hitesh Shah
Priority: Critical
 Attachments: TEZ-2164.3.patch, TEZ-2164.wip.2.patch, 
 allow-guava-16.0.1.patch


 Should allow us to upgrade to a newer version without shipping a guava 
 dependency.
 Would be good to do this in 0.7 so that we stop shipping guava as early as 
 possible.





[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge

2015-08-18 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701505#comment-14701505
 ] 

Saikat commented on TEZ-2726:
-

[~rajesh.balamohan] [~bikassaha]
There are no empty partitions in the example I mentioned. The source vertex has 
1 task (it used an UnorderedKVOutput, so it produced only 1 partition) and the sink 
vertex has 3 tasks. The edge is of type SCATTER-GATHER.

When the HTTP fetchers sent a request to fetch the map outputs, the code in 
ShuffleHandler caught an IOException thrown in the 
IndexCache.java getIndexInformation() function for the condition 
[info.mapSpillRecord.size() <= reduce].


2015-08-10 12:36:42,314 [New I/O worker #32] ERROR mapred.ShuffleHandler: Shuffle error in populating headers :
java.io.IOException: Invalid request Map Id = attempt_1437478617943_17839_1_05_00_0_10003 Reducer = 1 Index Info Length = 1
    at org.apache.hadoop.mapred.IndexCache.getIndexInformation(IndexCache.java:84)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.getMapOutputInfo(ShuffleHandler.java:855)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.populateHeaders(ShuffleHandler.java:875)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:793)
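
The exception above is consistent with a simple bounds check: Reducer = 1 is rejected because the index holds only one partition (index 0). The sketch below shows that check and an earlier, edge-level variant that would fail before any events are routed. The class and method names are hypothetical, and treating "fewer partitions than sink tasks" as the invalid case is an assumption based on this report.

```java
// Hypothetical checks; not code from Hadoop or Tez.
class PartitionCheckSketch {
  // A reducer (partition) index is valid only if the source task wrote that partition.
  static boolean isValidReducer(int reduce, int numPartitions) {
    return reduce >= 0 && reduce < numPartitions;
  }

  // Edge-level variant: reject the combination up front instead of letting
  // the fetchers discover it through failed shuffle requests.
  static void checkScatterGather(int numSourcePartitions, int numSinkTasks) {
    if (numSourcePartitions < numSinkTasks) {
      throw new IllegalStateException("SCATTER-GATHER edge: source task wrote "
          + numSourcePartitions + " partition(s) but the sink vertex has "
          + numSinkTasks + " task(s)");
    }
  }
}
```

With the numbers from this example, `isValidReducer(1, 1)` is false and `checkScatterGather(1, 3)` throws, so the problem would surface as one clear error rather than as invalid shuffle headers.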



I'll try to get an excerpt of the Fetcher logs for the DMEs and post it here.

 Handle invalid number of partitions for SCATTER-GATHER edge
 ---

 Key: TEZ-2726
 URL: https://issues.apache.org/jira/browse/TEZ-2726
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Saikat
Assignee: Saikat

 Encountered an issue where the source vertex has M tasks and the sink vertex has N 
 tasks (N > M) [e.g. M = 1, N = 3], and the edge is of type SCATTER-GATHER.
 This resulted in the sink vertex receiving DMEs with nonexistent targetIds.
 The fetchers for the sink vertex tasks then try to retrieve the map outputs 
 and receive invalid headers due to an exception in the ShuffleHandler.
 Possible fixes:
 1. Raise a proper Tez exception to indicate this invalid scenario.
 2. Or write appropriate empty partition bits for the missing partitions 
 before sending out the DMEs to the sink vertex. 





[jira] [Updated] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge

2015-08-18 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-2726:
-
Affects Version/s: 0.7.1

 Handle invalid number of partitions for SCATTER-GATHER edge
 ---

 Key: TEZ-2726
 URL: https://issues.apache.org/jira/browse/TEZ-2726
 Project: Apache Tez
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Saikat
Assignee: Saikat

 Encountered an issue where the source vertex has M tasks and the sink vertex has N 
 tasks (N > M) [e.g. M = 1, N = 3], and the edge is of type SCATTER-GATHER.
 This resulted in the sink vertex receiving DMEs with nonexistent targetIds.
 The fetchers for the sink vertex tasks then try to retrieve the map outputs 
 and receive invalid headers due to an exception in the ShuffleHandler.
 Possible fixes:
 1. Raise a proper Tez exception to indicate this invalid scenario.
 2. Or write appropriate empty partition bits for the missing partitions 
 before sending out the DMEs to the sink vertex. 





[jira] [Updated] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge

2015-08-18 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-2726:
-
Affects Version/s: (was: 0.7.1)
   0.7.0

 Handle invalid number of partitions for SCATTER-GATHER edge
 ---

 Key: TEZ-2726
 URL: https://issues.apache.org/jira/browse/TEZ-2726
 Project: Apache Tez
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Saikat
Assignee: Saikat

 Encountered an issue where the source vertex has M tasks and the sink vertex has N 
 tasks (N > M) [e.g. M = 1, N = 3], and the edge is of type SCATTER-GATHER.
 This resulted in the sink vertex receiving DMEs with nonexistent targetIds.
 The fetchers for the sink vertex tasks then try to retrieve the map outputs 
 and receive invalid headers due to an exception in the ShuffleHandler.
 Possible fixes:
 1. Raise a proper Tez exception to indicate this invalid scenario.
 2. Or write appropriate empty partition bits for the missing partitions 
 before sending out the DMEs to the sink vertex. 


