[jira] [Created] (TEZ-2790) TEZ-UI: Enhance the DAG View with more runtime information

2015-09-09 Thread Jeff Zhang (JIRA)
Jeff Zhang created TEZ-2790:
---

 Summary: TEZ-UI: Enhance the DAG View with more runtime 
information 
 Key: TEZ-2790
 URL: https://issues.apache.org/jira/browse/TEZ-2790
 Project: Apache Tez
  Issue Type: Bug
  Components: UI
Affects Versions: 0.8.1
Reporter: Jeff Zhang


Currently the DAG view covers some general runtime info (such as 
start_time/end_time/task_num). It would be better to include more runtime 
info, such as the input/output data size, the DataSize/RecordNumber transferred on 
each edge (edge thickness could be used to visualize that), and the time spent on 
shuffle for each edge (edge length could be used to visualize that). 

Once TEZ-2760 is resolved, we can also enhance the dag view with in-progress 
status. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2776) Add a way to trigger a log dump from the AM at runtime

2015-09-09 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736354#comment-14736354
 ] 

Jeff Zhang commented on TEZ-2776:
-

I think [~Sreenath] and [~pramachandran] are working on a web service to track the 
real-time status of a DAG for tez-ui. We can consolidate the two into one web service 
in the AM.



> Add a way to trigger a log dump from the AM at runtime
> --
>
> Key: TEZ-2776
> URL: https://issues.apache.org/jira/browse/TEZ-2776
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>
> Likely a webservice which dumps out state like
> - # tasks running
> - # tasks completed per vertex
> - # events sent to each task per vertex
> - state of the scheduler
> - state of other components such as the NodeTracker - #nodes, etc





[jira] [Updated] (TEZ-2790) TEZ-UI: Enhance the DAG View with more runtime information

2015-09-09 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2790:

Description: 
Currently the DAG view covers some general runtime info (such as 
start_time/end_time/task_num). It would be better to include more runtime 
info, such as 
* Input/output data size. 
* DataSize/RecordNumber transferred on each edge (edge thickness could be used 
to visualize that). 
* Time spent on shuffle for each edge (edge length could be used to visualize 
that). 
* etc.

Once TEZ-2760 is resolved, we can also enhance the dag view with in-progress 
status. 

  was:
Currently the dag view cover some general runtime info (like 
start_time/end_time/task_num etc). It would be better to include more runtime 
info, such as the input/output data size. DataSize/RecordNumber transferred on 
each edge (may use thickness to visualize that ).  Time spent on shuffle of 
each edge. (may use the edge length to visualize that ). 

Once TEZ-2760 is resolved, we can also enhance the dag view with in-progress 
status. 


> TEZ-UI: Enhance the DAG View with more runtime information 
> ---
>
> Key: TEZ-2790
> URL: https://issues.apache.org/jira/browse/TEZ-2790
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 0.8.1
>Reporter: Jeff Zhang
>
> Currently the DAG view covers some general runtime info (like 
> start_time/end_time/task_num etc). It would be better to include more runtime 
> info, such as 
> * Input/output data size. 
> * DataSize/RecordNumber transferred on each edge (may use thickness to 
> visualize that ).  
> * Time spent on shuffle of each edge. (may use the edge length to visualize 
> that ). 
> * etc.
> Once TEZ-2760 is resolved, we can also enhance the dag view with in-progress 
> status. 





[jira] [Updated] (TEZ-2790) TEZ-UI: Enhance the DAG View with more runtime information

2015-09-09 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2790:

Description: 
Currently the DAG view covers some general runtime info (such as 
start_time/end_time/task_num). It would be better to include more runtime 
info, such as 
* Input/output data size. 
* DataSize/RecordNumber transferred on each edge (edge thickness could be used 
to visualize that). 
* Time spent on shuffle for each edge (edge length could be used to visualize 
that). 

Once TEZ-2760 is resolved, we can also enhance the dag view with in-progress 
status. 

  was:
Currently the dag view cover some general runtime info (like 
start_time/end_time/task_num etc). It would be better to include more runtime 
info, such as 
* Input/output data size. 
* DataSize/RecordNumber transferred on each edge (may use thickness to 
visualize that ).  
* Time spent on shuffle of each edge. (may use the edge length to visualize 
that ). 
* etc.

Once TEZ-2760 is resolved, we can also enhance the dag view with in-progress 
status. 


> TEZ-UI: Enhance the DAG View with more runtime information 
> ---
>
> Key: TEZ-2790
> URL: https://issues.apache.org/jira/browse/TEZ-2790
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 0.8.1
>Reporter: Jeff Zhang
>
> Currently the DAG view covers some general runtime info (like 
> start_time/end_time/task_num etc). It would be better to include more runtime 
> info, such as 
> * Input/output data size. 
> * DataSize/RecordNumber transferred on each edge (may use thickness to 
> visualize that ).  
> * Time spent on shuffle of each edge. (may use the edge length to visualize 
> that ). 
> Once TEZ-2760 is resolved, we can also enhance the dag view with in-progress 
> status. 





[jira] [Commented] (TEZ-2724) Tez Client keeps on showing old status when application is finished but RM is shutdown

2015-09-09 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736453#comment-14736453
 ] 

Jeff Zhang commented on TEZ-2724:
-

Steps to reproduce this issue:
* configuration requirements:
** yarn.timeline-service.generic-application-history.enabled=false
** yarn.resourcemanager.recovery.enabled=false
** ipc.client.connect.retry.interval=5000
** ipc.client.connect.max.retries=12
* Run command: "hadoop jar tez-tests/target/tez-tests-0.8.1-SNAPSHOT.jar 
mrrsleep -m 5 -r 5 -mt 2 -rt 1"
* Kill the AM while the job is running
* Watch the RM UI until the YARN app finishes, then restart the RM
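
As a rough check on these settings, the worst-case time the client spends retrying a dead AM address is about the retry interval multiplied by the max retries. The sketch below assumes ipc.client.connect.retry.interval is in milliseconds and ignores per-attempt connect timeouts; the class and method names are hypothetical and not part of Hadoop or Tez.

```java
/** Rough upper bound on how long an IPC client keeps retrying one address. */
final class RetryWindow {
  // Hypothetical helper: interval (ms) times number of retries, in seconds.
  static long totalSeconds(long retryIntervalMs, int maxRetries) {
    return retryIntervalMs * maxRetries / 1000;
  }

  public static void main(String[] args) {
    // Settings from the reproduction steps: 5000 ms x 12 retries.
    System.out.println(totalSeconds(5000, 12) + " s");
    // Settings seen in the original logs: 20 s x 45 retries.
    System.out.println(totalSeconds(20_000, 45) + " s");
  }
}
```

With the reproduction settings the window is about one minute, compared with the 900 seconds (20*45) seen in the original report.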


> Tez Client keeps on showing old status when application is finished but RM is 
> shutdown
> --
>
> Key: TEZ-2724
> URL: https://issues.apache.org/jira/browse/TEZ-2724
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.4
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: TEZ-2724-1.patch, amrecovery_mutlipleamrestart.txt
>
>
> From the logs, it seems the IPC retry interval is set to 20 seconds and the IPC 
> max retries to 45. This means the client will retry the RPC connection 
> for a total of 900 (20*45) seconds. During this period, the application may 
> already have completed and an RM restart may be triggered, as described in the jira 
> description. I think RM recovery is not enabled, so even after the new RM 
> is started, the original application info is lost. That means the client 
> can never get the correct application report, which makes it show the old 
> status forever. 
> {code}
> 15/05/07 19:13:43 INFO ipc.Client: Retrying connect to server: 
> maint22-tez12/100.79.80.19:52822. Already tried 26 time(s); maxRetries=45
> Deleted /user/hadoopqa/Input1
> RUNNING: call D:\hdp\hadoop-2.6.0.2.2.6.0-2782\bin\hdfs.cmd dfs -ls 
> /user/hadoopqa/Input2
> RUNNING: call D:\hdp\hadoop-2.6.0.2.2.6.0-2782\bin\hdfs.cmd dfs  -rm -r 
> -skipTrash /user/hadoopqa/Input2
> 15/05/07 19:14:03 INFO ipc.Client: Retrying connect to server: 
> maint22-tez12/100.79.80.19:52822. Already tried 27 time(s); maxRetries=45
> {code}
> Configuration to reproduce this issue
> * disable generic application history 
> (yarn.timeline-service.generic-application-history.enabled)
> * disable rm recovery (yarn.resourcemanager.recovery.enabled)
> * increase the ipc retry interval and max retry 
> (ipc.client.connect.retry.interval & ipc.client.connect.max.retries)





[jira] [Comment Edited] (TEZ-2776) Add a way to trigger a log dump from the AM at runtime

2015-09-09 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736354#comment-14736354
 ] 

Jeff Zhang edited comment on TEZ-2776 at 9/9/15 7:35 AM:
-

I think [~Sreenath] and [~pramachandran] are working on a web service to track the 
real-time status of a DAG for tez-ui. We can consolidate the two into one web service 
in the AM.
Link with TEZ-2760




was (Author: zjffdu):
I think [~Sreenath] [~pramachandran] are working on a web service to track the 
realtime status of dag for tez-ui. We can consolidate them into one web service 
in AM.



> Add a way to trigger a log dump from the AM at runtime
> --
>
> Key: TEZ-2776
> URL: https://issues.apache.org/jira/browse/TEZ-2776
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>
> Likely a webservice which dumps out state like
> - # tasks running
> - # tasks completed per vertex
> - # events sent to each task per vertex
> - state of the scheduler
> - state of other components such as the NodeTracker - #nodes, etc





[jira] [Created] (TEZ-2791) Reduce tez.runtime.shuffle.fetch.buffer.percent default value to avoid corner case OOMs

2015-09-09 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created TEZ-2791:
-

 Summary: Reduce tez.runtime.shuffle.fetch.buffer.percent default 
value to avoid corner case OOMs
 Key: TEZ-2791
 URL: https://issues.apache.org/jira/browse/TEZ-2791
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan


The default value of tez.runtime.shuffle.fetch.buffer.percent is 0.9. In 
corner cases, depending on scheduling and data sizes, the JVM can cross 
the old-gen threshold and end up throwing an OOM. It would be better to set 
the default value to 0.6.
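
To illustrate the difference between the two values, here is a rough sketch of how much memory the fetch buffer could claim at each setting. It assumes the buffer is simply the configured fraction of the memory available to the task, which may not match the actual computation in Tez; the class name and the 1 GB figure are made up for illustration.

```java
/** Back-of-the-envelope estimate of the shuffle fetch buffer size. */
final class FetchBufferEstimate {
  // Hypothetical helper: fraction of available memory reserved for fetched data.
  static long bufferBytes(long availableBytes, double fetchBufferPercent) {
    return (long) (availableBytes * fetchBufferPercent);
  }

  public static void main(String[] args) {
    long available = 1024L * 1024 * 1024; // 1 GB, illustrative only
    long mb = 1024L * 1024;
    System.out.println(bufferBytes(available, 0.9) / mb + " MB at 0.9");
    System.out.println(bufferBytes(available, 0.6) / mb + " MB at 0.6");
  }
}
```

Under that assumption, the lower default leaves a noticeably larger share of the heap for everything else that lives in the old generation.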





[jira] [Commented] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter

2015-09-09 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736527#comment-14736527
 ] 

Rajesh Balamohan commented on TEZ-2643:
---

Thanks [~saikatr]. lgtm. +1. Will commit shortly.

> Minimize number of empty spills in Pipelined Sorter
> ---
>
> Key: TEZ-2643
> URL: https://issues.apache.org/jira/browse/TEZ-2643
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Saikat
>Assignee: Saikat
> Attachments: TEZ-2643.1.patch, TEZ-2643.2.patch, TEZ-2643.patch
>
>






[jira] [Commented] (TEZ-2789) Backport events added in TEZ-2612 to branch-0.7

2015-09-09 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736633#comment-14736633
 ] 

Rajesh Balamohan commented on TEZ-2789:
---

lgtm overall. +1.

> Backport events added in TEZ-2612 to branch-0.7
> ---
>
> Key: TEZ-2789
> URL: https://issues.apache.org/jira/browse/TEZ-2789
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: TEZ-2789.1.patch
>
>
> Having the events in the 0.7 line will allow them to be persisted to ATS or 
> SimpleHistory logging. After that, the latest analyzers from master or 0.8 
> could be used to analyze them. At some point, when the analyzers are stable, 
> they could move into the UI directly or be back-ported in bulk to 0.7.





[jira] [Assigned] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram reassigned TEZ-2792:
---

Assignee: Sreenath Somarajapuram

> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagID: The complete dagID, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All tasks under the vertices (capped by limit) 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





[jira] [Commented] (TEZ-2780) Tez UI: Update All Tasks page while in progress.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736687#comment-14736687
 ] 

Sreenath Somarajapuram commented on TEZ-2780:
-

[~hitesh] Sure. Moving the API part to TEZ-2792. This ticket would be just for 
UI changes.

> Tez UI: Update All Tasks page while in progress.
> 
>
> Key: TEZ-2780
> URL: https://issues.apache.org/jira/browse/TEZ-2780
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2780.1.patch, TEZ-2780.wip.1.patch
>
>
> Modify table component to automatically update cell based on model change.





[jira] [Updated] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2792:

Description: 
Add AM API for fetching real-time task info:
- API endpoint: /ws/v2/tez/tasksInfo
- Query params:
-- dagID: The complete dagID (mandatory).
-- vertexMinID: A comma-separated list. vertexMinID = vertexIndex.
-- taskMinID: A comma-separated list. taskMinID = vertexIndex_taskIndex.
-- limit: Maximum number of items to be returned (defaults to 100).
- If taskMinID is passed: All the specified tasks (capped by limit) will be 
returned. vertexMinID, if present, won't be considered.
- If vertexMinID is passed: All tasks under the vertices (capped by limit) will 
be returned.
- If just dagID is passed: All tasks under the DAG (capped by limit) will be 
returned.
- Data returned: complete task ID, progress, status.
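
The precedence between the query parameters can be sketched as follows. This is only an illustration of the rules listed above, not the code in the patch; the class, enum, and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch of the filter precedence described above; all names are hypothetical. */
final class TaskFilterPrecedence {
  enum Mode { BY_TASK, BY_VERTEX, BY_DAG }

  // taskMinID wins over vertexMinID; with neither, all tasks of the DAG qualify.
  static Mode select(List<String> taskMinIDs, List<Integer> vertexMinIDs) {
    if (!taskMinIDs.isEmpty()) {
      return Mode.BY_TASK; // vertexMinID, if present, is not considered
    }
    if (!vertexMinIDs.isEmpty()) {
      return Mode.BY_VERTEX;
    }
    return Mode.BY_DAG;
  }

  public static void main(String[] args) {
    List<String> noTasks = Collections.emptyList();
    List<Integer> noVertices = Collections.emptyList();
    System.out.println(select(Arrays.asList("0_1", "0_2"), Arrays.asList(3)));
    System.out.println(select(noTasks, Arrays.asList(0, 1)));
    System.out.println(select(noTasks, noVertices));
  }
}
```

In every mode the result would still be capped by limit.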

> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagID: The complete dagID, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All tasks under the vertices (capped by limit) 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





[jira] [Updated] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2792:

Description: 
Add AM API for fetching real-time task info:
- API endpoint: /ws/v2/tez/tasksInfo
- Query params:
-- dagMinID: dagMinID = dagIndex (mandatory).
-- vertexMinID: A comma-separated list. vertexMinID = vertexIndex.
-- taskMinID: A comma-separated list. taskMinID = vertexIndex_taskIndex.
-- limit: Maximum number of items to be returned (defaults to 100).
- If taskMinID is passed: All the specified tasks (capped by limit) will be 
returned. vertexMinID, if present, won't be considered.
- If vertexMinID is passed: All tasks under the vertices (capped by limit) will 
be returned.
- If just dagMinID is passed: All tasks under the DAG (capped by limit) will be 
returned.
- Data returned: complete task ID, progress, status.
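
A request against this endpoint could be assembled roughly as below. The path and parameter names come from the description above; the AM host and port, the class name, and the method name are placeholders chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Builds a tasksInfo query URL; host, port, and names here are hypothetical. */
final class TasksInfoUrl {
  static String build(String amBaseUrl, int dagMinID, List<Integer> vertexMinIDs, int limit) {
    StringBuilder url = new StringBuilder(amBaseUrl)
        .append("/ws/v2/tez/tasksInfo")
        .append("?dagMinID=").append(dagMinID); // mandatory parameter
    if (!vertexMinIDs.isEmpty()) {
      List<String> ids = new ArrayList<>();
      for (Integer id : vertexMinIDs) {
        ids.add(String.valueOf(id));
      }
      // vertexMinID is a comma-separated list of vertex indexes
      url.append("&vertexMinID=").append(String.join(",", ids));
    }
    return url.append("&limit=").append(limit).toString();
  }

  public static void main(String[] args) {
    System.out.println(build("http://am-host.example:12345", 1, Arrays.asList(0, 2), 100));
  }
}
```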

  was:
Add AM API for fetching realtime tasks info:
- API endpoint : /ws/v2/tez/tasksInfo
- Query Params:
-- dagID: The complete dagID, (mandatory).
-- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
-- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
-- limit: Maximum number of items to be returned (Defaults to 100).
- If taskMinID is passed: All (capped by limit) the specified tasks will be 
returned. vertexMinID if present wont be considered.
- IF vertexMinID is passed: All (capped by limit) tasks under the vertices will 
be returned.
- If just dagID is passed: All (capped by limit) tasks under the DAG will be 
returned.
- Data returned: complete task id, progress, status


> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All tasks under the vertices (capped by limit) 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





[jira] [Updated] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2792:

Attachment: TEZ-2792.1.patch

[~pramachandran] [~hitesh] Please review.

> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All tasks under the vertices (capped by limit) 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





[jira] [Created] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2792:
---

 Summary: Add AM web service API for tasks.
 Key: TEZ-2792
 URL: https://issues.apache.org/jira/browse/TEZ-2792
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram








[jira] [Updated] (TEZ-2780) Tez UI: Update All Tasks page while in progress.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2780:

Description: Modify table component to automatically update cell based on 
model change.  (was: Add AM API for fetching realtime tasks info:
- API endpoint : /ws/v2/tez/tasksInfo
- Query Params:
-- dagID: The complete dagID, (mandatory).
-- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
-- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
-- limit: Maximum number of items to be returned (Defaults to 100).
- If taskMinID is passed: All (capped by limit) the specified tasks will be 
returned. vertexMinID if present wont be considered.
- IF vertexMinID is passed: All (capped by limit) tasks under the vertices will 
be returned.
- If just dagID is passed: All (capped by limit) tasks under the DAG will be 
returned.
- Data returned: complete task id, progress, status

Modify table component to automatically update cell based on model change.)

> Tez UI: Update All Tasks page while in progress.
> 
>
> Key: TEZ-2780
> URL: https://issues.apache.org/jira/browse/TEZ-2780
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2780.1.patch, TEZ-2780.wip.1.patch
>
>
> Modify table component to automatically update cell based on model change.





[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736794#comment-14736794
 ] 

TezQA commented on TEZ-2792:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12754887/TEZ-2792.1.patch
  against master revision 00508f8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1097//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1097//console

This message is automatically generated.

> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All tasks under the vertices (capped by limit) 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





[jira] [Comment Edited] (TEZ-2724) Tez Client keeps on showing old status when application is finished but RM is shutdown

2015-09-09 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736453#comment-14736453
 ] 

Jeff Zhang edited comment on TEZ-2724 at 9/9/15 1:57 PM:
-

Steps to reproduce this issue:
* configuration requirements:
** yarn.timeline-service.generic-application-history.enabled=false
** yarn.resourcemanager.recovery.enabled=false
** ipc.client.connect.retry.interval=5000
** ipc.client.connect.max.retries=12 
* Run command: "hadoop jar tez-tests/target/tez-tests-0.8.1-SNAPSHOT.jar 
mrrsleep -m 5 -r 5 -mt 2 -rt 1"
* Kill the AM while the job is running.
* A new app attempt will be started, and the DAG will be recovered and completed. 
Watch the RM UI until the YARN app finishes, then restart the RM. (Until the 
app completes, the client keeps trying to reconnect to the AM of the first app 
attempt.) 



was (Author: zjffdu):
Steps to reproduce this issue:
* configuration requirements:
** yarn.timeline-service.generic-application-history.enabled=false
** yarn.resourcemanager.recovery.enabled=false
** ipc.client.connect.retry.interval=5000
** ipc.client.connect.max.retries=12
* Run command: "hadoop jar tez-tests/target/tez-tests-0.8.1-SNAPSHOT.jar 
mrrsleep -m 5 -r 5 -mt 2 -rt 1"
* Kill the AM in the middle of job running
* Check the RM UI to wait for the yarn app finished, then restart RM


> Tez Client keeps on showing old status when application is finished but RM is 
> shutdown
> --
>
> Key: TEZ-2724
> URL: https://issues.apache.org/jira/browse/TEZ-2724
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.4
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: TEZ-2724-1.patch, amrecovery_mutlipleamrestart.txt
>
>
> From the logs, it seems the IPC retry interval is set to 20 seconds and the IPC 
> max retries to 45. This means the client will retry the RPC connection 
> for a total of 900 (20*45) seconds. During this period, the application may 
> already have completed and an RM restart may be triggered, as described in the jira 
> description. I think RM recovery is not enabled, so even after the new RM 
> is started, the original application info is lost. That means the client 
> can never get the correct application report, which makes it show the old 
> status forever. 
> {code}
> 15/05/07 19:13:43 INFO ipc.Client: Retrying connect to server: 
> maint22-tez12/100.79.80.19:52822. Already tried 26 time(s); maxRetries=45
> Deleted /user/hadoopqa/Input1
> RUNNING: call D:\hdp\hadoop-2.6.0.2.2.6.0-2782\bin\hdfs.cmd dfs -ls 
> /user/hadoopqa/Input2
> RUNNING: call D:\hdp\hadoop-2.6.0.2.2.6.0-2782\bin\hdfs.cmd dfs  -rm -r 
> -skipTrash /user/hadoopqa/Input2
> 15/05/07 19:14:03 INFO ipc.Client: Retrying connect to server: 
> maint22-tez12/100.79.80.19:52822. Already tried 27 time(s); maxRetries=45
> {code}
> Configuration to reproduce this issue
> * disable generic application history 
> (yarn.timeline-service.generic-application-history.enabled)
> * disable rm recovery (yarn.resourcemanager.recovery.enabled)
> * increase the ipc retry interval and max retry 
> (ipc.client.connect.retry.interval & ipc.client.connect.max.retries)





Success: TEZ-2792 PreCommit Build #1097

2015-09-09 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2792
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1097/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3454 lines...]
[INFO] Final Memory: 79M/801M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12754887/TEZ-2792.1.patch
  against master revision 00508f8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1097//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1097//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
bd3f5b0a5f1da8e16572c5adb6ba3da83e6e9895 logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #1088
Archived 53 artifacts
Archive block size is 32768
Received 6 blocks and 3053962 bytes
Compression is 6.0%
Took 1 sec
Description set: TEZ-2792
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Comment Edited] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737061#comment-14737061
 ] 

Hitesh Shah edited comment on TEZ-2792 at 9/9/15 3:49 PM:
--

Comment on this particular section: 

{code}
  else if (vertexMinIDs.size() > 0) {
    for (Integer vertexMinID : vertexMinIDs) {
      Vertex vertex = getVertexFromIndex(dag, vertexMinID);
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
  else {
    Collection<Vertex> vertices = dag.getVertices().values();
    for (Vertex vertex : vertices) {
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
}
{code}

Is there a reason why all objects are first copied over into an array list and 
then a subset is pulled out? 

Could a different approach be taken? For example, if the request is minTaskId = 501 
and limit/max = 100, then just look up each task by ID (i.e. 501 to 600) 
and put them into an array, instead of getting all 1 task objects and 
then splitting the array? This might require some changes to first check 
vertex::numTasks. 
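
The lookup-by-ID approach suggested above might look roughly like this. It is a sketch only: a plain map from task index to a task name stands in for the vertex's task table, and the class, method, and parameter names are hypothetical rather than Tez APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Fetches a bounded range of tasks by index without copying the whole collection. */
final class TaskRangeLookup {
  // tasksByIndex is a hypothetical stand-in for a vertex's task table.
  static List<String> fetchRange(Map<Integer, String> tasksByIndex, int numTasks,
                                 int minTaskIndex, int limit) {
    List<String> result = new ArrayList<>();
    // Check the vertex's task count first so the loop never runs past the last task.
    int end = Math.min(numTasks, minTaskIndex + limit); // exclusive upper bound
    for (int i = minTaskIndex; i < end; i++) {
      String task = tasksByIndex.get(i);
      if (task != null) {
        result.add(task);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<Integer, String> tasks = new HashMap<>();
    for (int i = 0; i < 1000; i++) {
      tasks.put(i, "task_" + i);
    }
    // minTaskId = 501 and limit = 100 touches only indexes 501 to 600.
    System.out.println(fetchRange(tasks, 1000, 501, 100).size());
  }
}
```

Only the requested window is visited, so the cost depends on limit rather than on the total number of tasks in the vertex.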




was (Author: hitesh):
Comment on this particular section: 

{code}
  else if (vertexMinIDs.size() > 0) {
    for (Integer vertexMinID : vertexMinIDs) {
      Vertex vertex = getVertexFromIndex(dag, vertexMinID);
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
  else {
    Collection<Vertex> vertices = dag.getVertices().values();
    for (Vertex vertex : vertices) {
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
}
{code}

Is there a reason why all objects are first copied over into an array list and 
then a subset is pulled out? 

Could a different approach be taken? For example, if the ask is minTaskId = 501 
and limit/max = 100, then just search for a given task by Id ( i.e 501 to 600 ) 
and put all of them into an array instead of getting all 1 task objects and 
then splitting the array? 



> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737122#comment-14737122
 ] 

Sreenath Somarajapuram edited comment on TEZ-2792 at 9/9/15 4:24 PM:
-

[~hitesh]
- '*'
-- Sorry that I missed that change by the IDE, will revert the same in patch 2.

- minID:
-- They are minimized version of complete entity ID.
-- Helps to shorten the url length.
-- dagMinID = dagIndex
-- vertexMinID = vertexIndex
-- taskMinID = vertexIndex_taskIndex
-- attemptMinID = vertexIndex_taskIndex_attemptNo (vertexIndex_taskIndex = 
vertexIndex_taskIndex_0 making it shorter in most cases)

- Why dagMinID:
-- The complete URL looks like 
http://address:8088/proxy/application_1441301219877_0111/ws/v2/tez/tasksInfo?dagMinID=1&taskMinID=0_48,0_49,0_46,0_47,0_44,0_45,0_43,0_42,0_41,0_40,0_39,0_38,0_36,0_37,0_35,0_34,0_33,0_32,0_31,0_30,0_28,0_29,0_27,0_26,0_24
-- dagMinID was added considering applications that might have multiple DAGs.
-- Also, the RM uses the complete application ID to proxy to the AM. Once there, 
the full dagID is not required.

- For getIntegersFromRequest() and getMinIDsFromRequest(), how do you 
distinguish a request which says give me everything from one which had invalid 
parameters passed in?
-- Invalid parameters: both functions will return null.
-- Give me everything: both functions will return an empty array.

- It seems checkAndGetDAGFromRequestMinID() and checkAndGetDAGFromRequest are 
similar except for the param name? Maybe the 2 can be combined?
-- The idea is to replace dagID with dagMinID. The function will also be removed.
-- A ticket has been created for the same: TEZ-2793.

- Making MAX_QUERIED configurable.
-- MAX_QUERIED is the default value, the UI can pass a higher limit.
-- Have created TEZ-2794 for making it configurable.
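
The null versus empty convention described above for getIntegersFromRequest() 
and getMinIDsFromRequest() could look roughly like the sketch below. This is 
not the code from the patch: QueryParamParsing and parseIntegerList are 
hypothetical names, and rejecting negative values is an assumption made here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not the code from the patch. */
final class QueryParamParsing {
  private QueryParamParsing() {}

  /**
   * Parses a comma separated list of non-negative integers.
   *
   * @return an empty list if the parameter is absent or blank ("give me
   *         everything"), null if any element is invalid, otherwise the
   *         parsed values in request order
   */
  static List<Integer> parseIntegerList(String rawValue) {
    if (rawValue == null || rawValue.trim().isEmpty()) {
      return Collections.emptyList();
    }
    List<Integer> values = new ArrayList<>();
    // A limit of -1 keeps trailing empty tokens, so "1,2," is rejected too.
    for (String token : rawValue.split(",", -1)) {
      try {
        int value = Integer.parseInt(token.trim());
        if (value < 0) {
          return null; // assumption: indexes are never negative
        }
        values.add(value);
      } catch (NumberFormatException e) {
        return null; // invalid parameter
      }
    }
    return values;
  }
}
```

A caller has to check for null before checking for emptiness, which is the 
main cost of this convention.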



was (Author: sreenath):
[~hitesh]
- '*'
-- Sorry that I missed that change by the IDE, will revert the same in patch 2.

- minID:
-- They are minimized version of complete entity ID.
-- Helps to shorten the url length.
-- dagMinID = dagIndex
-- vertexMinID = vertexIndex
-- taskMinID = vertexIndex_taskIndex
-- attemptMinID = vertexIndex_taskIndex_attemptNo (vertexIndex_taskIndex = 
vertexIndex_taskIndex_0 making it shorter in most cases)

- Why dagMinID:
-- The complete URL looks like 
http://address:8088/proxy/application_1441301219877_0111/ws/v2/tez/tasksInfo?dagMinID=1&taskMinID=0_48,0_49,0_46,0_47,0_44,0_45,0_43,0_42,0_41,0_40,0_39,0_38,0_36,0_37,0_35,0_34,0_33,0_32,0_31,0_30,0_28,0_29,0_27,0_26,0_24
-- dagMinID was added considering applications that might have multiple DAGs.
-- Also, the RM uses the complete application ID to proxy to the AM. Once there, 
the full dagID is not required.

- For getIntegersFromRequest() and getMinIDsFromRequest(), how do you 
distinguish a request which says give me everything from one which had invalid 
parameters passed in?
-- Invalid parameters: both functions will return null.
-- Give me everything: both functions will return an empty array.

- It seems checkAndGetDAGFromRequestMinID() and checkAndGetDAGFromRequest are 
similar except for the param name? Maybe the 2 can be combined?
-- The idea is to replace dagID with dagMinID. The function will also be removed.
-- A ticket has been created for the same: TEZ-2793.

- Making MAX_QUERIED configurable.
-- MAX_QUERIED is the default value, the UI can pass a higher limit.
-- Have created TEZ-2794 for the same.


> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2760) Tez UI: in-progress updates on UI

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737045#comment-14737045
 ] 

Sreenath Somarajapuram commented on TEZ-2760:
-

Take up 2776 once this master ticket is completed.

> Tez UI: in-progress updates on UI
> -
>
> Key: TEZ-2760
> URL: https://issues.apache.org/jira/browse/TEZ-2760
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Reporter: Prakash Ramachandran
>Assignee: Prakash Ramachandran
>
> This is a placeholder for the tasks required for inprogress updates of the 
> Tez UI. The users would like to see the latest status for the 
> dag/vertex/task/attempts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737122#comment-14737122
 ] 

Sreenath Somarajapuram commented on TEZ-2792:
-

[~hitesh]
- '*'
-- Sorry that I missed that change by the IDE, will revert the same in patch 2.

- minID:
-- They are minimized version of complete entity ID.
-- Helps to shorten the url length.
-- dagMinID = dagIndex
-- vertexMinID = vertexIndex
-- taskMinID = vertexIndex_taskIndex
-- attemptMinID = vertexIndex_taskIndex_attemptNo (vertexIndex_taskIndex = 
vertexIndex_taskIndex_0 making it shorter in most cases)

- Why dagMinID:
-- The complete URL looks like 
http://address:8088/proxy/application_1441301219877_0111/ws/v2/tez/tasksInfo?dagMinID=1&taskMinID=0_48,0_49,0_46,0_47,0_44,0_45,0_43,0_42,0_41,0_40,0_39,0_38,0_36,0_37,0_35,0_34,0_33,0_32,0_31,0_30,0_28,0_29,0_27,0_26,0_24
-- dagMinID was added considering applications that might have multiple DAGs.
-- Also, the RM uses the complete application ID to proxy to the AM. Once there, 
the full dagID is not required.

- For getIntegersFromRequest() and getMinIDsFromRequest(), how do you 
distinguish a request which says give me everything from one which had invalid 
parameters passed in?
-- Invalid parameters: both functions will return null.
-- Give me everything: both functions will return an empty array.

- It seems checkAndGetDAGFromRequestMinID() and checkAndGetDAGFromRequest are 
similar except for the param name? Maybe the 2 can be combined?
-- The idea is to replace dagID with dagMinID. The function will also be removed.
-- A ticket has been created for the same: TEZ-2793.

- Making MAX_QUERIED configurable.
-- Will change the count to 500 in patch 2.
-- Have created TEZ-2794 for the same.


> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2794) AM web service V2: Make default limit of queryable records configurable.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2794:

Issue Type: Sub-task  (was: Bug)
Parent: TEZ-2760

> AM web service V2: Make default limit of queryable records configurable.
> 
>
> Key: TEZ-2794
> URL: https://issues.apache.org/jira/browse/TEZ-2794
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>
> Currently it's a constant (MAX_QUERIED) in AMWebController.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737061#comment-14737061
 ] 

Hitesh Shah edited comment on TEZ-2792 at 9/9/15 3:47 PM:
--

Comment on this particular section: 

{code}
  else if (vertexMinIDs.size() > 0) {
    for (Integer vertexMinID : vertexMinIDs) {
      Vertex vertex = getVertexFromIndex(dag, vertexMinID);
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
  else {
    Collection<Vertex> vertices = dag.getVertices().values();
    for (Vertex vertex : vertices) {
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
}
{code}

Is there a reason why all objects are first copied over into an array list and 
then a subset is pulled out? 

Could a different approach be taken? For example, if the ask is minTaskId = 501 
and limit/max = 100, then just search for a given task by Id ( i.e 501 to 600 ) 
and put all of them into an array instead of getting all 1 task objects and 
then splitting the array? 




was (Author: hitesh):
Comment on this particular section: 

{code]
  else if (vertexMinIDs.size() > 0) {
    for (Integer vertexMinID : vertexMinIDs) {
      Vertex vertex = getVertexFromIndex(dag, vertexMinID);
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
  else {
    Collection<Vertex> vertices = dag.getVertices().values();
    for (Vertex vertex : vertices) {
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
}
{code}

Is there a reason why all objects are first copied over into an array list and 
then a subset is pulled out? 

Could a different approach be taken? For example, if the ask is minTaskId = 501 
and limit/max = 100, then just search for a given task by Id ( i.e 501 to 600 ) 
and put all of them into an array instead of getting all 1 task objects and 
then splitting the array? 



> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737061#comment-14737061
 ] 

Hitesh Shah commented on TEZ-2792:
--

Comment on this particular section: 

{code]
  else if (vertexMinIDs.size() > 0) {
    for (Integer vertexMinID : vertexMinIDs) {
      Vertex vertex = getVertexFromIndex(dag, vertexMinID);
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
  else {
    Collection<Vertex> vertices = dag.getVertices().values();
    for (Vertex vertex : vertices) {
      List<Task> vertexTasks = new ArrayList<>(vertex.getTasks().values());
      tasks.addAll(vertexTasks.subList(0,
          Math.min(vertexTasks.size(), limit - tasks.size())));

      if (tasks.size() >= limit) {
        break;
      }
    }
  }
}
{code}

Is there a reason why all objects are first copied over into an array list and 
then a subset is pulled out? 

Could a different approach be taken? For example, if the ask is minTaskId = 501 
and limit/max = 100, then just search for a given task by Id ( i.e 501 to 600 ) 
and put all of them into an array instead of getting all 1 task objects and 
then splitting the array? 



> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TEZ-2793) AM Web service V2: Make dagInfo & verticesInfo APIs accept dagMinID as query param

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2793:
---

 Summary: AM Web service V2: Make dagInfo & verticesInfo APIs 
accept dagMinID as query param
 Key: TEZ-2793
 URL: https://issues.apache.org/jira/browse/TEZ-2793
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram


- Currently we expect the DAG index in a query param named dagID, which 
is counterintuitive.

- The concept of MinID was brought in to minimize the URL size.
-- MinIDs are minimized versions of the complete entity IDs.
-- dagMinID = dagIndex
-- vertexMinID = vertexIndex
-- taskMinID = vertexIndex_taskIndex
-- attemptMinID = vertexIndex_taskIndex_attemptNo (vertexIndex_taskIndex = 
vertexIndex_taskIndex_0 making it shorter in most cases)
--- For instance, a complete attempt ID looks like 
attempt_1441301219877_0111_1_02_49_0. The same attempt can be referenced under 
a DAG with the minimized version 2_49_0. 

- It makes sense to accept all query params as MinIDs from API v2 onwards.
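
To make the attemptMinID format above concrete, here is a minimal parsing 
sketch. AttemptMinId is a hypothetical value type invented for this example 
and is not a Tez class. It only assumes the format described above: 
vertexIndex_taskIndex with an optional _attemptNo that defaults to 0.

```java
/** Hypothetical value type for illustration; not a Tez class. */
final class AttemptMinId {
  final int vertexIndex;
  final int taskIndex;
  final int attemptNo;

  private AttemptMinId(int vertexIndex, int taskIndex, int attemptNo) {
    this.vertexIndex = vertexIndex;
    this.taskIndex = taskIndex;
    this.attemptNo = attemptNo;
  }

  /**
   * Parses "vertexIndex_taskIndex" or "vertexIndex_taskIndex_attemptNo".
   * A missing attempt number is treated as attempt 0.
   *
   * @throws IllegalArgumentException if the value is malformed
   */
  static AttemptMinId parse(String minId) {
    String[] parts = minId.split("_", -1);
    if (parts.length != 2 && parts.length != 3) {
      throw new IllegalArgumentException("Malformed min ID: " + minId);
    }
    // NumberFormatException is a subclass of IllegalArgumentException.
    int vertexIndex = Integer.parseInt(parts[0]);
    int taskIndex = Integer.parseInt(parts[1]);
    int attemptNo = parts.length == 3 ? Integer.parseInt(parts[2]) : 0;
    if (vertexIndex < 0 || taskIndex < 0 || attemptNo < 0) {
      throw new IllegalArgumentException("Negative index in min ID: " + minId);
    }
    return new AttemptMinId(vertexIndex, taskIndex, attemptNo);
  }

  /** Renders the full three-part form, for example 2_49_0. */
  @Override
  public String toString() {
    return vertexIndex + "_" + taskIndex + "_" + attemptNo;
  }
}
```

Under these assumptions, parse("2_49") and parse("2_49_0") refer to the same 
attempt.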



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TEZ-2794) AM web service V2: Make default limit of queryable records configurable.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2794:
---

 Summary: AM web service V2: Make default limit of queryable 
records configurable.
 Key: TEZ-2794
 URL: https://issues.apache.org/jira/browse/TEZ-2794
 Project: Apache Tez
  Issue Type: Bug
Reporter: Sreenath Somarajapuram


Currently it's a constant (MAX_QUERIED) in AMWebController.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737033#comment-14737033
 ] 

Hitesh Shah commented on TEZ-2792:
--

Comments:
   - {code}import java.util.*;{code} - "*" should not be used.

   - What does "minId" signify? Why is there a dagMinId? All webservice calls 
should always be for only a specific dagId, so why is there a need for 
minId? 
   - Also, what does minId mean for vertices and tasks? 
   - For getIntegersFromRequest() and getMinIDsFromRequest(), how do you 
distinguish a request which says give me everything from one which had invalid 
parameters passed in? 

   - It seems checkAndGetDAGFromRequestMinID() and checkAndGetDAGFromRequest 
are similar except for the param name? Maybe the 2 can be combined? 

   - It might be better to set MAX_QUERIED to a higher value, say 500 or 1000. 
The UI could be restricted to 100 or so, but if the UI decides to change to a 
slightly higher value, the webservice would then not need to be fixed. Maybe 
file a follow-up jira for this value to be made configurable. 





> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737052#comment-14737052
 ] 

Hitesh Shah commented on TEZ-2792:
--

Some additional comments:
  - If a user asks for 10 IDs and one ID is not found, a 404 is returned. 
Is that intentional?

From a general programming viewpoint, I am not a fan of a called function 
setting the return code and the caller then checking for null and doing a 
return if null. Is there a better approach we could apply (it does not need to 
be done in this jira, as a lot of the code uses a similar approach) which is 
less error prone?
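
One alternative to the pattern questioned above, as a sketch and not a 
proposal for the actual controller code: the lookup throws a typed exception 
and the status code is set in a single place. LookupSketch, RequestException, 
require and handle are hypothetical names invented for this example.

```java
import java.util.Map;

/** Hypothetical sketch for illustration; not the Tez controller code. */
final class LookupSketch {
  private LookupSketch() {}

  /** Hypothetical exception type carrying an HTTP status code. */
  static final class RequestException extends RuntimeException {
    final int status;

    RequestException(int status, String message) {
      super(message);
      this.status = status;
    }
  }

  /** Returns the entity or throws; never returns null, never sets a status. */
  static <T> T require(Map<String, T> entities, String id) {
    T entity = entities.get(id);
    if (entity == null) {
      throw new RequestException(404, "Not found: " + id);
    }
    return entity;
  }

  /** The single place where a failure is translated into a status code. */
  static int handle(Map<String, String> entities, String id) {
    try {
      require(entities, id);
      return 200;
    } catch (RequestException e) {
      return e.status;
    }
  }
}
```

Callers of require() no longer need a null check, so a forgotten check cannot 
let a request continue after the error status has already been set.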





> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737122#comment-14737122
 ] 

Sreenath Somarajapuram edited comment on TEZ-2792 at 9/9/15 4:22 PM:
-

[~hitesh]
- '*'
-- Sorry that I missed that change by the IDE, will revert the same in patch 2.

- minID:
-- They are minimized version of complete entity ID.
-- Helps to shorten the url length.
-- dagMinID = dagIndex
-- vertexMinID = vertexIndex
-- taskMinID = vertexIndex_taskIndex
-- attemptMinID = vertexIndex_taskIndex_attemptNo (vertexIndex_taskIndex = 
vertexIndex_taskIndex_0 making it shorter in most cases)

- Why dagMinID:
-- The complete URL looks like 
http://address:8088/proxy/application_1441301219877_0111/ws/v2/tez/tasksInfo?dagMinID=1&taskMinID=0_48,0_49,0_46,0_47,0_44,0_45,0_43,0_42,0_41,0_40,0_39,0_38,0_36,0_37,0_35,0_34,0_33,0_32,0_31,0_30,0_28,0_29,0_27,0_26,0_24
-- dagMinID was added considering applications that might have multiple DAGs.
-- Also, the RM uses the complete application ID to proxy to the AM. Once there, 
the full dagID is not required.

- For getIntegersFromRequest() and getMinIDsFromRequest(), how do you 
distinguish a request which says give me everything from one which had invalid 
parameters passed in?
-- Invalid parameters: both functions will return null.
-- Give me everything: both functions will return an empty array.

- It seems checkAndGetDAGFromRequestMinID() and checkAndGetDAGFromRequest are 
similar except for the param name? Maybe the 2 can be combined?
-- The idea is to replace dagID with dagMinID. The function will also be removed.
-- A ticket has been created for the same: TEZ-2793.

- Making MAX_QUERIED configurable.
-- MAX_QUERIED is the default value, the UI can pass a higher limit.
-- Have created TEZ-2794 for the same.



was (Author: sreenath):
[~hitesh]
- '*'
-- Sorry that I missed that change by the IDE, will revert the same in patch 2.

- minID:
-- They are minimized version of complete entity ID.
-- Helps to shorten the url length.
-- dagMinID = dagIndex
-- vertexMinID = vertexIndex
-- taskMinID = vertexIndex_taskIndex
-- attemptMinID = vertexIndex_taskIndex_attemptNo (vertexIndex_taskIndex = 
vertexIndex_taskIndex_0 making it shorter in most cases)

- Why dagMinID:
-- The complete URL looks like 
http://address:8088/proxy/application_1441301219877_0111/ws/v2/tez/tasksInfo?dagMinID=1&taskMinID=0_48,0_49,0_46,0_47,0_44,0_45,0_43,0_42,0_41,0_40,0_39,0_38,0_36,0_37,0_35,0_34,0_33,0_32,0_31,0_30,0_28,0_29,0_27,0_26,0_24
-- dagMinID was added considering applications that might have multiple DAGs.
-- Also, the RM uses the complete application ID to proxy to the AM. Once there, 
the full dagID is not required.

- For getIntegersFromRequest() and getMinIDsFromRequest(), how do you 
distinguish a request which says give me everything from one which had invalid 
parameters passed in?
-- Invalid parameters: both functions will return null.
-- Give me everything: both functions will return an empty array.

- It seems checkAndGetDAGFromRequestMinID() and checkAndGetDAGFromRequest are 
similar except for the param name? Maybe the 2 can be combined?
-- The idea is to replace dagID with dagMinID. The function will also be removed.
-- A ticket has been created for the same: TEZ-2793.

- Making MAX_QUERIED configurable.
-- Will change the count to 500 in patch 2.
-- Have created TEZ-2794 for the same.


> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737138#comment-14737138
 ] 

Hitesh Shah commented on TEZ-2792:
--

MinID was a bit confusing as it seems to indicate "minimum id" - maybe use 
dagId or dagIndex, vertexId/vertexIndex and something more meaningful for task? 

bq. dagMinID was added considering applications that might have multiple DAGs.

No webservice call should support cross-dag queries. A single dag id/index is 
sufficient. 

Also, please use "!isEmpty()" instead of "size() > 0" (and "isEmpty()" instead 
of "size() == 0").

 




> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All (capped by limit) the specified tasks will be 
> returned. vertexMinID if present wont be considered.
> - IF vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TEZ-2799) SimpleHistoryParser NPE

2015-09-09 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-2799:

Attachment: history.txt.appattempt_1441839730184_0001_01

> SimpleHistoryParser NPE
> ---
>
> Key: TEZ-2799
> URL: https://issues.apache.org/jira/browse/TEZ-2799
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Rajesh Balamohan
> Attachments: history.txt.appattempt_1441839730184_0001_01
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2799) SimpleHistoryParser NPE

2015-09-09 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737828#comment-14737828
 ] 

Bikas Saha commented on TEZ-2799:
-

java.lang.NullPointerException
at 
org.apache.tez.history.parser.SimpleHistoryParser.parseContents(SimpleHistoryParser.java:208)
at 
org.apache.tez.history.parser.SimpleHistoryParser.getDAGData(SimpleHistoryParser.java:77)
at 
org.apache.tez.analyzer.plugins.TezAnalyzerBase.run(TezAnalyzerBase.java:170)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at 
org.apache.tez.analyzer.plugins.CriticalPathAnalyzer.main(CriticalPathAnalyzer.java:473)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
at 
org.apache.tez.analyzer.plugins.AnalyzerDriver.main(AnalyzerDriver.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)


> SimpleHistoryParser NPE
> ---
>
> Key: TEZ-2799
> URL: https://issues.apache.org/jira/browse/TEZ-2799
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Rajesh Balamohan
> Attachments: history.txt.appattempt_1441839730184_0001_01
>
>






[jira] [Updated] (TEZ-2774) Reduce logging in the AM, and parts of the runtime

2015-09-09 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2774:

Attachment: TEZ-2774.3.txt

Updated patch, with some of the commit messages relaxed for speculation.
Logs the ContainerContext and resources once per vertex.

> Reduce logging in the AM, and parts of the runtime
> --
>
> Key: TEZ-2774
> URL: https://issues.apache.org/jira/browse/TEZ-2774
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2774.1.txt, TEZ-2774.2.txt, TEZ-2774.3.txt
>
>






[jira] [Updated] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather

2015-09-09 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2798:
--
Target Version/s: 0.8.1

> NPE when executing TestMemoryWithEvents::testMemoryScatterGather
> 
>
> Key: TEZ-2798
> URL: https://issues.apache.org/jira/browse/TEZ-2798
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>
> {noformat}
> 2015-09-10 05:07:45,885 ERROR [Dispatcher thread: Central] 
> common.AsyncDispatcher (AsyncDispatcher.java:dispatch(188)) - Error in 
> dispatcher thread
> java.lang.NullPointerException
>   at 
> org.apache.tez.dag.app.ContainerLauncherContextImpl.containerLaunched(ContainerLauncherContextImpl.java:47)
>   at 
> org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launch(MockDAGAppMaster.java:280)
>   at 
> org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launchContainer(MockDAGAppMaster.java:219)
>   at 
> org.apache.tez.dag.app.launcher.ContainerLauncherManager.handle(ContainerLauncherManager.java:200)
>   at 
> org.apache.tez.dag.app.launcher.ContainerLauncherManager.handle(ContainerLauncherManager.java:46)
>   at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>   at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Wasn't caught in jenkins as these tests are very long running tests and are 
> marked as @Ignore (mainly for internal testing).
> Same exception with testMemoryBroadcast, testMemoryOneToOne, 
> testMemoryRootInputEvents





[jira] [Commented] (TEZ-2799) SimpleHistoryParser NPE

2015-09-09 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737830#comment-14737830
 ] 

Bikas Saha commented on TEZ-2799:
-

Looks like a simple null check is missing, but I could be wrong. The object 
might be expected never to be null. [~rajesh.balamohan] Could you please 
check?

> SimpleHistoryParser NPE
> ---
>
> Key: TEZ-2799
> URL: https://issues.apache.org/jira/browse/TEZ-2799
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Rajesh Balamohan
> Attachments: history.txt.appattempt_1441839730184_0001_01
>
>






[jira] [Commented] (TEZ-2799) SimpleHistoryParser NPE

2015-09-09 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737829#comment-14737829
 ] 

Bikas Saha commented on TEZ-2799:
-

hadoop jar tez-0.8.1-SNAPSHOT/tez-job-analyzer-0.8.1-SNAPSHOT.jar CriticalPath 
--dagId=dag_1441839730184_0001_16 
--eventFileName=history.txt.appattempt_1441839730184_0001_01 
--outputDir=/tmp

> SimpleHistoryParser NPE
> ---
>
> Key: TEZ-2799
> URL: https://issues.apache.org/jira/browse/TEZ-2799
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Rajesh Balamohan
> Attachments: history.txt.appattempt_1441839730184_0001_01
>
>






[jira] [Commented] (TEZ-2774) Reduce logging in the AM, and parts of the runtime

2015-09-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737886#comment-14737886
 ] 

TezQA commented on TEZ-2774:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12755020/TEZ-2774.3.txt
  against master revision 00508f8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.rm.TestContainerReuse

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1101//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1101//console

This message is automatically generated.

> Reduce logging in the AM, and parts of the runtime
> --
>
> Key: TEZ-2774
> URL: https://issues.apache.org/jira/browse/TEZ-2774
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2774.1.txt, TEZ-2774.2.txt, TEZ-2774.3.txt
>
>






[jira] [Commented] (TEZ-2164) Shade the guava version used by Tez and move to guava-18

2015-09-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737893#comment-14737893
 ] 

TezQA commented on TEZ-2164:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12755022/TEZ-2164.4.patch
  against master revision 00508f8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 89 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1102//console

This message is automatically generated.

> Shade the guava version used by Tez and move to guava-18
> 
>
> Key: TEZ-2164
> URL: https://issues.apache.org/jira/browse/TEZ-2164
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Hitesh Shah
>Priority: Critical
> Attachments: TEZ-2164.3.patch, TEZ-2164.4.patch, 
> TEZ-2164.wip.2.patch, allow-guava-16.0.1.patch
>
>
> Should allow us to upgrade to a newer version without shipping a guava 
> dependency.
> Would be good to do this in 0.7 so that we stop shipping guava as early as 
> possible.





[jira] [Commented] (TEZ-2799) SimpleHistoryParser NPE

2015-09-09 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737995#comment-14737995
 ] 

Bikas Saha commented on TEZ-2799:
-

lgtm. minor. if {==null} else {} may be better than if {==null} if {!=null}
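
The style point above can be sketched as follows. This is only an illustration: the class, the method names, and the map of attempt IDs to container strings are hypothetical and are not taken from TEZ-2799.1.patch or from SimpleHistoryParser.

```java
import java.util.Map;

// Hypothetical illustration; none of these names come from SimpleHistoryParser.
class NullCheckStyle {

    // Two independent checks: the reader has to verify that the two
    // conditions are complementary and that exactly one of them fires.
    static String describeTwoIfs(Map<String, String> containers, String attemptId) {
        String container = containers.get(attemptId);
        String result = null;
        if (container == null) {
            result = attemptId + " -> no container details";
        }
        if (container != null) {
            result = attemptId + " -> " + container;
        }
        return result;
    }

    // Single if/else: it is visible at a glance that exactly one branch runs.
    static String describeIfElse(Map<String, String> containers, String attemptId) {
        String container = containers.get(attemptId);
        if (container == null) {
            return attemptId + " -> no container details";
        } else {
            return attemptId + " -> " + container;
        }
    }
}
```

Both methods return the same result for every input; the second form just makes the mutual exclusion explicit.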

> SimpleHistoryParser NPE
> ---
>
> Key: TEZ-2799
> URL: https://issues.apache.org/jira/browse/TEZ-2799
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2799.1.patch, 
> history.txt.appattempt_1441839730184_0001_01
>
>






Failed: TEZ-2164 PreCommit Build #1102

2015-09-09 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2164
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1102/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 449 lines...]


==
==
Determining number of patched javac warnings.
==
==


/home/jenkins/tools/maven/latest/bin/mvn clean test -DskipTests -Ptest-patch > 
/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/patchJavacWarnings.txt
 2>&1




{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12755022/TEZ-2164.4.patch
  against master revision 00508f8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 89 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1102//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
49c99240e63e14332a446250f880ae9a66720e06 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #1098
Archived 3 artifacts
Archive block size is 32768
Received 0 blocks and 831007 bytes
Compression is 0.0%
Took 0.27 sec
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Updated] (TEZ-2799) SimpleHistoryParser NPE

2015-09-09 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2799:
--
Attachment: TEZ-2799.1.patch

[~bikassaha] - This happens when the container details are not present in 
the related entities. For example, {noformat} 
attempt_1441839730184_0001_16_10_06_0, 
attempt_1441839730184_0001_16_10_05_0, 
attempt_1441839730184_0001_16_10_01_0 {noformat}  in the logs did not have 
those details because they failed with "CONTAINER_EXITED".

Attaching a patch to fix this.

> SimpleHistoryParser NPE
> ---
>
> Key: TEZ-2799
> URL: https://issues.apache.org/jira/browse/TEZ-2799
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Rajesh Balamohan
> Attachments: TEZ-2799.1.patch, 
> history.txt.appattempt_1441839730184_0001_01
>
>






[jira] [Updated] (TEZ-2796) AM web service API V2: Make AM web service smarter - Support sorting, filtering, pagination...

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2796:

Summary: AM web service API V2: Make AM web service smarter - Support 
sorting, filtering, pagination...  (was: AM web service API V2: Add filtering & 
sorting)

> AM web service API V2: Make AM web service smarter - Support sorting, 
> filtering, pagination...
> --
>
> Key: TEZ-2796
> URL: https://issues.apache.org/jira/browse/TEZ-2796
> Project: Apache Tez
>  Issue Type: Task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Functionalities are yet to be finalized. Following is tentative.
> Query params:
> - fromMinID - limit number of entities from the specified ID would be fetched.
> - fields - Comma separated list. All the specified fields + ID would be 
> returned.
> - sortOn - Field to be sorted on
> - SortOrder - asc/dsc (ascending/descending). By default asc.
> * counters would be handled in another ticket.





[jira] [Commented] (TEZ-2796) AM web service API V2: Add filtering & sorting

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738178#comment-14738178
 ] 

Sreenath Somarajapuram commented on TEZ-2796:
-

[~hitesh]
- Sorry, I assumed fields to be column-wise filtering.
- Moving this out of TEZ-2760 and adding sub tickets.

> AM web service API V2: Add filtering & sorting
> --
>
> Key: TEZ-2796
> URL: https://issues.apache.org/jira/browse/TEZ-2796
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Functionalities are yet to be finalized. Following is tentative.
> Query params:
> - fromMinID - limit number of entities from the specified ID would be fetched.
> - fields - Comma separated list. All the specified fields + ID would be 
> returned.
> - sortOn - Field to be sorted on
> - SortOrder - asc/dsc (ascending/descending). By default asc.
> * counters would be handled in another ticket.





[jira] [Updated] (TEZ-2796) AM web service API V2: Add filtering & sorting

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2796:

Issue Type: Task  (was: Sub-task)
Parent: (was: TEZ-2760)

> AM web service API V2: Add filtering & sorting
> --
>
> Key: TEZ-2796
> URL: https://issues.apache.org/jira/browse/TEZ-2796
> Project: Apache Tez
>  Issue Type: Task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Functionalities are yet to be finalized. Following is tentative.
> Query params:
> - fromMinID - limit number of entities from the specified ID would be fetched.
> - fields - Comma separated list. All the specified fields + ID would be 
> returned.
> - sortOn - Field to be sorted on
> - SortOrder - asc/dsc (ascending/descending). By default asc.
> * counters would be handled in another ticket.





[jira] [Updated] (TEZ-2796) AM web service API V2: Make AM web service smarter - Support sorting, filtering, pagination...

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2796:

Description: 
Functionalities are yet to be finalized. Following is tentative.

Query parameters:
- fromMinID
-- Items from the given ID would be fetched.
- limit
-- Maximum number of items to be returned.
- fields
-- Comma separated list.
-- All the specified fields + ID would be returned.
- filter
-- Comma separated list of filters
-- A filter looks like :
- sort
-- Field to be sorted on
-- Format: [:], []-optional
-- By default asc. 

* counters would be handled in another ticket.

  was:
Functionalities are yet to be finalized. Following is tentative.

Query params:
- fromMinID - limit number of entities from the specified ID would be fetched.
- fields - Comma separated list. All the specified fields + ID would be 
returned.
- sortOn - Field to be sorted on
- SortOrder - asc/dsc (ascending/descending). By default asc.

* counters would be handled in another ticket.


> AM web service API V2: Make AM web service smarter - Support sorting, 
> filtering, pagination...
> --
>
> Key: TEZ-2796
> URL: https://issues.apache.org/jira/browse/TEZ-2796
> Project: Apache Tez
>  Issue Type: Task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Functionalities are yet to be finalized. Following is tentative.
> Query parameters:
> - fromMinID
> -- Items from the given ID would be fetched.
> - limit
> -- Maximum number of items to be returned.
> - fields
> -- Comma separated list.
> -- All the specified fields + ID would be returned.
> - filter
> -- Comma separated list of filters
> -- A filter looks like :
> - sort
> -- Field to be sorted on
> -- Format: [:], []-optional
> -- By default asc. 
> * counters would be handled in another ticket.





[jira] [Updated] (TEZ-2796) AM web service API V2: Make AM web service smarter - Support sorting, filtering, pagination...

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2796:

Description: 
Functionalities are yet to be finalized. Following is tentative.

Query parameters:
- fromMinID
-- Items from the given ID would be fetched.
- fields
-- Comma separated list.
-- All the specified fields + ID would be returned.
- filter
-- Comma separated list of filters
-- A filter looks like :
- sort
-- Field to be sorted on
-- Format: [:], []-optional
-- By default asc. 

* counters would be handled in another ticket.

  was:
Functionalities are yet to be finalized. Following is tentative.

Query parameters:
- fromMinID
-- Items from the given ID would be fetched.
- limit
-- Maximum number of items to be returned.
- fields
-- Comma separated list.
-- All the specified fields + ID would be returned.
- filter
-- Comma separated list of filters
-- A filter looks like :
- sort
-- Field to be sorted on
-- Format: [:], []-optional
-- By default asc. 

* counters would be handled in another ticket.


> AM web service API V2: Make AM web service smarter - Support sorting, 
> filtering, pagination...
> --
>
> Key: TEZ-2796
> URL: https://issues.apache.org/jira/browse/TEZ-2796
> Project: Apache Tez
>  Issue Type: Task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Functionalities are yet to be finalized. Following is tentative.
> Query parameters:
> - fromMinID
> -- Items from the given ID would be fetched.
> - fields
> -- Comma separated list.
> -- All the specified fields + ID would be returned.
> - filter
> -- Comma separated list of filters
> -- A filter looks like :
> - sort
> -- Field to be sorted on
> -- Format: [:], []-optional
> -- By default asc. 
> * counters would be handled in another ticket.





[jira] [Created] (TEZ-2800) AM web service API V2: Add fromMinID

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2800:
---

 Summary: AM web service API V2: Add fromMinID
 Key: TEZ-2800
 URL: https://issues.apache.org/jira/browse/TEZ-2800
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram








[jira] [Commented] (TEZ-2791) Reduce tez.runtime.shuffle.fetch.buffer.percent default value to avoid corner case OOMs

2015-09-09 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738026#comment-14738026
 ] 

Saikat commented on TEZ-2791:
-

[~jeagles] I believe this is the same setting I was talking about, which had 
bloated the old-gen heap space. Reducing it will ensure that fetched map 
outputs are merged and spilled to disk early by the merge manager.

> Reduce tez.runtime.shuffle.fetch.buffer.percent default value to avoid corner 
> case OOMs
> ---
>
> Key: TEZ-2791
> URL: https://issues.apache.org/jira/browse/TEZ-2791
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>
> Default value for tez.runtime.shuffle.fetch.buffer.percent is set to 0.9. In 
> corner cases & based on scheduling & data sizes, it is possible that JVM 
> crosses old-gen threshold and ends up throwing OOM. It would be better to set 
> the default value to 0.6.
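
To put rough numbers on the proposal, here is a minimal arithmetic sketch. It assumes, as a simplification, that the shuffle fetch buffer budget is just the maximum heap multiplied by the configured fraction; the actual computation inside Tez may differ, and the class and method names are hypothetical.

```java
// Hypothetical helper, not Tez code.
class FetchBufferBudget {

    // Assumed simplification: budget = maxHeapBytes * fraction.
    static long budgetBytes(long maxHeapBytes, double fraction) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]: " + fraction);
        }
        return (long) (maxHeapBytes * fraction);
    }

    public static void main(String[] args) {
        // Compare the current default (0.9) with the proposed one (0.6)
        // for the heap of the JVM running this sketch.
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("0.9 -> " + budgetBytes(heap, 0.9) + " bytes");
        System.out.println("0.6 -> " + budgetBytes(heap, 0.6) + " bytes");
    }
}
```

Under that assumption, lowering the fraction leaves a larger share of the heap for everything else that lives in the old generation, which is the headroom the ticket is after.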





[jira] [Created] (TEZ-2802) AM web service API V2: Add filter to search for specific values

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2802:
---

 Summary: AM web service API V2: Add filter to search for specific 
values
 Key: TEZ-2802
 URL: https://issues.apache.org/jira/browse/TEZ-2802
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram








[jira] [Updated] (TEZ-2800) AM web service API V2: Add fromMinID

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2800:

Description: 
Items from the given ID would be fetched.


> AM web service API V2: Add fromMinID
> 
>
> Key: TEZ-2800
> URL: https://issues.apache.org/jira/browse/TEZ-2800
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Items from the given ID would be fetched.
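
One possible reading of fromMinID, sketched under stated assumptions: IDs are treated as numeric indexes and the bound is taken as inclusive. The ticket says neither, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the AM web service implementation.
class FromMinIdSketch {

    // Assumption: numeric IDs and an inclusive lower bound.
    static List<Integer> fromMinId(List<Integer> ids, int fromMinId) {
        List<Integer> out = new ArrayList<>();
        for (int id : ids) {
            if (id >= fromMinId) {
                out.add(id);
            }
        }
        return out;
    }
}
```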





[jira] [Updated] (TEZ-2801) AM web service API V2: Add fields for specifying columns to be returned

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2801:

Description: 
- Comma separated list.
- All the specified fields + ID would be returned.

> AM web service API V2: Add fields for specifying columns to be returned
> ---
>
> Key: TEZ-2801
> URL: https://issues.apache.org/jira/browse/TEZ-2801
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> - Comma separated list.
> - All the specified fields + ID would be returned.
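
A minimal sketch of what such a projection could look like, assuming entities are plain maps and the identifier is stored under the key "id". Both are assumptions made for illustration, and the names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the AM web service implementation.
class FieldsProjection {

    // Returns the ID plus every requested field that exists on the entity.
    // fieldsParam is the raw comma separated query value, e.g. "progress,status".
    static Map<String, Object> project(Map<String, Object> entity, String fieldsParam) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("id", entity.get("id")); // the ID is always returned
        if (fieldsParam != null) {
            for (String raw : fieldsParam.split(",")) {
                String field = raw.trim();
                if (!field.isEmpty() && entity.containsKey(field)) {
                    out.put(field, entity.get(field));
                }
            }
        }
        return out;
    }
}
```

Unknown field names are silently skipped here; rejecting them with an error would be an equally reasonable choice.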





[jira] [Created] (TEZ-2801) AM web service API V2: Add fields for specifying columns to be returned

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2801:
---

 Summary: AM web service API V2: Add fields for specifying columns 
to be returned
 Key: TEZ-2801
 URL: https://issues.apache.org/jira/browse/TEZ-2801
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram








[jira] [Created] (TEZ-2803) AM web service API V2: Add sort

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2803:
---

 Summary: AM web service API V2: Add sort
 Key: TEZ-2803
 URL: https://issues.apache.org/jira/browse/TEZ-2803
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram








[jira] [Updated] (TEZ-2802) AM web service API V2: Add filter to search for specific values

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2802:

Description: 
- Comma separated list of filters
- A filter looks like :

> AM web service API V2: Add filter to search for specific values
> ---
>
> Key: TEZ-2802
> URL: https://issues.apache.org/jira/browse/TEZ-2802
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> - Comma separated list of filters
> - A filter looks like :





[jira] [Updated] (TEZ-2803) AM web service API V2: Add sort

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2803:

Description: 
- Field to be sorted on
- Format: [:], []-optional
- By default asc.

> AM web service API V2: Add sort
> ---
>
> Key: TEZ-2803
> URL: https://issues.apache.org/jira/browse/TEZ-2803
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> - Field to be sorted on
> - Format: [:], []-optional
> - By default asc.
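
The exact format string did not survive in the description above, so the sketch below assumes a hypothetical syntax of `field[:order]`, with the order values asc/dsc borrowed from the earlier SortOrder wording and asc as the default. The class name and the syntax itself are assumptions, not the ticket's definition.

```java
// Hypothetical sketch; the "field[:order]" syntax is an assumption.
class SortParam {
    final String field;
    final boolean descending;

    SortParam(String field, boolean descending) {
        this.field = field;
        this.descending = descending;
    }

    // Parses e.g. "progress" (ascending by default) or "progress:dsc".
    static SortParam parse(String param) {
        String[] parts = param.split(":", 2);
        String field = parts[0].trim();
        if (field.isEmpty()) {
            throw new IllegalArgumentException("missing sort field: " + param);
        }
        boolean descending = false; // "By default asc."
        if (parts.length == 2) {
            String order = parts[1].trim();
            if (order.equals("dsc")) {
                descending = true;
            } else if (!order.equals("asc")) {
                throw new IllegalArgumentException("unknown sort order: " + order);
            }
        }
        return new SortParam(field, descending);
    }
}
```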





[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738221#comment-14738221
 ] 

Sreenath Somarajapuram commented on TEZ-2792:
-

[~hitesh]
MinID was a bit confusing as it seems to indicate "minimum id" - maybe use 
dagId or dagIndex, vertexId/vertexIndex and something more meaningful for task?
-- For tasks/applications wouldn't it be a bit confusing to pass 5_11/7_9_13 as 
ID or index? The values are neither ID nor index.
-- What might be a better name?

dagMinID was added considering "applications" that might have multiple DAGs.
-- No web-service call should support cross-dag queries. A single dag id/index 
is sufficient.
--- Yes, dagMinID accepts just one single id.

Also, please use "!isEmpty()" instead of "size() == 0"
-- Thanks will do that in patch 2.

> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status
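
The precedence rules in the description above can be sketched as follows. This is an illustration only, not the code in TEZ-2792.1.patch; the `Scope` enum and both helper methods are hypothetical names.

```java
import java.util.List;

// Hypothetical sketch of the parameter precedence described in the ticket.
class TaskSelection {

    enum Scope { TASKS, VERTICES, DAG }

    // taskMinID wins over vertexMinID; with neither, the whole DAG is used.
    static Scope scope(List<String> taskMinIds, List<String> vertexMinIds) {
        if (taskMinIds != null && !taskMinIds.isEmpty()) {
            return Scope.TASKS;
        }
        if (vertexMinIds != null && !vertexMinIds.isEmpty()) {
            return Scope.VERTICES;
        }
        return Scope.DAG;
    }

    // Applies the "capped by limit" rule; the ticket gives 100 as the default limit.
    static <T> List<T> cap(List<T> items, int limit) {
        int end = Math.max(0, Math.min(limit, items.size()));
        return items.subList(0, end);
    }
}
```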





[jira] [Comment Edited] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738221#comment-14738221
 ] 

Sreenath Somarajapuram edited comment on TEZ-2792 at 9/10/15 5:53 AM:
--

[~hitesh]
MinID was a bit confusing as it seems to indicate "minimum id" - maybe use 
dagId or dagIndex, vertexId/vertexIndex and something more meaningful for task?
- For tasks/applications wouldn't it be a bit confusing to pass 5_11/7_9_13 as 
ID or index? The values are neither ID nor index.
- What might be a better name?

dagMinID was added considering "applications" that might have multiple DAGs.
- No web-service call should support cross-dag queries. A single dag id/index 
is sufficient.
-- Yes, dagMinID accepts just one single id.

Also, please use "!isEmpty()" instead of "size() == 0"
- Thanks will do that in patch 2.


was (Author: sreenath):
[~hitesh]
MinID was a bit confusing as it seems to indicate "minimum id" - maybe use 
dagId or dagIndex, vertexId/vertexIndex and something more meaningful for task?
-- For tasks/applications wouldn't it be a bit confusing to pass 5_11/7_9_13 as 
ID or index? The values are neither ID nor index.
-- What might be a better name?

dagMinID was added considering "applications" that might have multiple DAGs.
-- No web-service call should support cross-dag queries. A single dag id/index 
is sufficient.
--- Yes, dagMinID accepts just one single id.

Also, please use "!isEmpty()" instead of "size() == 0"
-- Thanks will do that in patch 2.

> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





[jira] [Commented] (TEZ-2097) TEZ-UI Add dag logs

2015-09-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737370#comment-14737370
 ] 

TezQA commented on TEZ-2097:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12754787/TEZ-2097.4.patch
  against master revision 00508f8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1098//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1098//console

This message is automatically generated.

> TEZ-UI Add dag logs
> ---
>
> Key: TEZ-2097
> URL: https://issues.apache.org/jira/browse/TEZ-2097
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Reporter: Jeff Zhang
>Assignee: Jonathan Eagles
>Priority: Critical
> Attachments: TEZ-2097.1.patch, TEZ-2097.2.patch, TEZ-2097.3.patch, 
> TEZ-2097.4.patch
>
>
> If dag fails due to AM error, there's no way to check the dag logs on tez-ui. 
> Users have to grab the app logs. 





[jira] [Commented] (TEZ-2097) TEZ-UI Add dag logs

2015-09-09 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737376#comment-14737376
 ] 

Jonathan Eagles commented on TEZ-2097:
--

[~hitesh], can you give another review now that Hadoop QA has given its 
blessing? I'll file another ticket for the corresponding Tez UI change.

> TEZ-UI Add dag logs
> ---
>
> Key: TEZ-2097
> URL: https://issues.apache.org/jira/browse/TEZ-2097
> Project: Apache Tez
>  Issue Type: Bug
>  Components: UI
>Reporter: Jeff Zhang
>Assignee: Jonathan Eagles
>Priority: Critical
> Attachments: TEZ-2097.1.patch, TEZ-2097.2.patch, TEZ-2097.3.patch, 
> TEZ-2097.4.patch
>
>
> If dag fails due to AM error, there's no way to check the dag logs on tez-ui. 
> Users have to grab the app logs. 





[jira] [Commented] (TEZ-2787) Tez AM should have java.io.tmpdir=./tmp to be consistent with tasks

2015-09-09 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737428#comment-14737428
 ] 

Jonathan Eagles commented on TEZ-2787:
--

With the patch above, the DAGAppMaster Java process is launched with this 
additional argument:

{code}
-Djava.io.tmpdir=/Users/jeagles/run/tmp/nm-local-dir/usercache/jeagles/appcache/application_1441825297926_0002/container_1441825297926_0002_01_01/tmp
{code}

This is currently used for Jetty in the AM and is not too large.
{code}
du -sch /Users/jeagles/run/tmp/nm-local-dir/usercache/jeagles/appcache/application_1441825297926_0002/container_1441825297926_0002_01_01/tmp
404K /Users/jeagles/run/tmp/nm-local-dir/usercache/jeagles/appcache/application_1441825297926_0002/container_1441825297926_0002_01_01/tmp
{code}

With this setup, temp files are cleaned up more reliably, and the AM is less 
susceptible to a tmpwatch process deleting unarchived files with old 
timestamps from the /tmp dir.
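
For illustration only, here is a minimal sketch of the idea in Python rather than the actual TezClient code. The helper name with_container_tmpdir is hypothetical, and leaving an existing java.io.tmpdir setting untouched is an assumption of this sketch, not something the patch is known to do.

```python
import shlex

# Relative path, resolved against the container's working directory.
TMPDIR_OPT = "-Djava.io.tmpdir=./tmp"


def with_container_tmpdir(java_opts: str) -> str:
    """Return java_opts with the tmpdir flag appended.

    Assumption for this sketch: if the caller already set
    java.io.tmpdir, the options are returned unchanged.
    """
    tokens = shlex.split(java_opts)
    if any(t.startswith("-Djava.io.tmpdir=") for t in tokens):
        return java_opts
    return shlex.join(tokens + [TMPDIR_OPT])
```

Because the path is relative, the temp files land inside the container directory and go away with it when the container is cleaned up.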

> Tez AM should have java.io.tmpdir=./tmp to be consistent with tasks
> ---
>
> Key: TEZ-2787
> URL: https://issues.apache.org/jira/browse/TEZ-2787
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Jason Lowe
> Attachments: TEZ-2787.1.patch
>
>
> TezRuntimeChildJVM ensures that tasks are launched with 
> -Djava.io.tmpdir=./tmp, but there's no corresponding code to ensure the Tez 
> AM also has a similar tmpdir setting.  The client should set up the AM launch 
> context to have -Djava.io.tmpdir=./tmp to be consistent with the tasks and to 
> prevent accidental leaking of files in /tmp by the Tez AM if it crashes.





[jira] [Comment Edited] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737275#comment-14737275
 ] 

Hitesh Shah edited comment on TEZ-2792 at 9/9/15 5:45 PM:
--

bq. Should that be graceful?

What will the UI do? Will it retry? Should it show everything as unavailable 
even if 9 out of 10 tasks had data available? 

bq. do we have a final idea on how filters and sorting must work?

Can be added in a follow-up jira. If the current plan is to get all task Ids 
from ATS and then update them live via data from the AM, that should be fine 
for an initial approach. 



was (Author: hitesh):
bq. Should that be graceful?

What will the UI do? Will it retry? Should it show everything as unavailable 
even if 9 out of 10 tasks had data available? 

bq. do we have a final idea on how filters and sorting must work?

Can be added in a follow-up jira. 


> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagMinID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status
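
The parameter precedence described above could be sketched roughly as follows. This is a Python illustration, not the AM web service code. The function select_tasks and the in-memory tasks mapping are hypothetical, and silently skipping unknown task ids is an assumption (the thread leaves open whether that case should be a 404).

```python
from typing import Dict, List, Optional, Tuple

TaskKey = Tuple[int, int]  # (vertexIndex, taskIndex)


def select_tasks(tasks: Dict[TaskKey, dict],
                 vertex_min_ids: Optional[List[int]] = None,
                 task_min_ids: Optional[List[str]] = None,
                 limit: int = 100) -> List[dict]:
    """Pick tasks of one DAG: taskMinID wins over vertexMinID."""
    if task_min_ids:
        # taskMinID = vertexIndex_taskIndex; assumes well-formed ids.
        keys = [tuple(int(p) for p in tid.split("_")) for tid in task_min_ids]
        # Assumption: ids that are not found are skipped, not an error.
        selected = [tasks[k] for k in keys if k in tasks]
    elif vertex_min_ids:
        wanted = set(vertex_min_ids)
        selected = [t for (v, _), t in sorted(tasks.items()) if v in wanted]
    else:
        # Only the DAG was given: every task under it.
        selected = [t for _, t in sorted(tasks.items())]
    return selected[:limit]  # every branch is capped by limit
```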





[jira] [Updated] (TEZ-2780) Tez UI: Update All Tasks page while in progress.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2780:

Description: 
Modify table component to automatically update cell based on model change.
#. Upgraded pollster to manage polling for a specific entity.
#. Added progress column to All Tasks table.
#. Updated table to automatically reflect model change - Just need to set 
observePath to true in defaultColumnConfigs.
#. Updated table logic to send action on row change.

  was:Modify table component to automatically update cell based on model change.


> Tez UI: Update All Tasks page while in progress.
> 
>
> Key: TEZ-2780
> URL: https://issues.apache.org/jira/browse/TEZ-2780
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2780.1.patch, TEZ-2780.2.patch, TEZ-2780.wip.1.patch
>
>
> Modify table component to automatically update cell based on model change.
> #. Upgraded pollster to manage polling for a specific entity.
> #. Added progress column to All Tasks table.
> #. Updated table to automatically reflect model change - Just need to set 
> observePath to true in defaultColumnConfigs.
> #. Updated table logic to send action on row change.





[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737336#comment-14737336
 ] 

Sreenath Somarajapuram commented on TEZ-2792:
-

What will the UI do? Will it retry?
- It's graceful on the UI side: the UI will retry. I guess the server can also be 
made to fail gracefully.
- Will make it a part of TEZ-2795 (Error management)

Follow-up jira for filtering & sorting: Created TEZ-2796 for the same.

> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagMinID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





Success: TEZ-2097 PreCommit Build #1098

2015-09-09 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-2097
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1098/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3460 lines...]
[INFO] Final Memory: 82M/893M
[INFO] 




{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12754787/TEZ-2097.4.patch
  against master revision 00508f8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1098//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1098//console

This message is automatically generated.


==
==
Adding comment to Jira.
==
==


Comment added.
e28bfb41df6b68e37efc7673bbe3bc34f2ad logged out


==
==
Finished build.
==
==


Archiving artifacts
Sending artifact delta relative to PreCommit-TEZ-Build #1097
Archived 53 artifacts
Archive block size is 32768
Received 8 blocks and 2989381 bytes
Compression is 8.1%
Took 3.4 sec
Description set: TEZ-2097
Recording test results
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (TEZ-2787) Tez AM should have java.io.tmpdir=./tmp to be consistent with tasks

2015-09-09 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-2787:
-
Attachment: TEZ-2787.1.patch

> Tez AM should have java.io.tmpdir=./tmp to be consistent with tasks
> ---
>
> Key: TEZ-2787
> URL: https://issues.apache.org/jira/browse/TEZ-2787
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Jason Lowe
> Attachments: TEZ-2787.1.patch
>
>
> TezRuntimeChildJVM ensures that tasks are launched with 
> -Djava.io.tmpdir=./tmp, but there's no corresponding code to ensure the Tez 
> AM also has a similar tmpdir setting.  The client should set up the AM launch 
> context to have -Djava.io.tmpdir=./tmp to be consistent with the tasks and to 
> prevent accidental leaking of files in /tmp by the Tez AM if it crashes.





[jira] [Resolved] (TEZ-2789) Backport events added in TEZ-2612 to branch-0.7

2015-09-09 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha resolved TEZ-2789.
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 0.7.1

Thanks!
commit bc56ca3157973971b7e0e21ed834d56ecc7cdd46
Author: Bikas Saha 
Date:   Wed Sep 9 13:54:23 2015 -0700

TEZ-2789. Backport events added in TEZ-2612 to branch-0.7 (bikas)



> Backport events added in TEZ-2612 to branch-0.7
> ---
>
> Key: TEZ-2789
> URL: https://issues.apache.org/jira/browse/TEZ-2789
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Fix For: 0.7.1
>
> Attachments: TEZ-2789.1.patch
>
>
> Having the events in the 0.7 line will allow them to be persisted to ATS or 
> SimpleHistory logging. After that, the latest analyzers from master or 0.8 
> could be used to analyze them. At some point when the analyzers are stable, 
> they could move into the UI directly or be back-ported in bulk to the 0.7.





[jira] [Commented] (TEZ-2787) Tez AM should have java.io.tmpdir=./tmp to be consistent with tasks

2015-09-09 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737581#comment-14737581
 ] 

Jonathan Eagles commented on TEZ-2787:
--

[~hitesh], [~jlowe], the current behavior is for the Tez client to simply 
hard-code the tmpdir to the YARN default container tmp dir setting, exactly as 
the AM hard-codes the Tez child setting for java.io.tmpdir. Should we be doing 
more with this setting?

> Tez AM should have java.io.tmpdir=./tmp to be consistent with tasks
> ---
>
> Key: TEZ-2787
> URL: https://issues.apache.org/jira/browse/TEZ-2787
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Jason Lowe
>Assignee: Jonathan Eagles
> Attachments: TEZ-2787.1.patch
>
>
> TezRuntimeChildJVM ensures that tasks are launched with 
> -Djava.io.tmpdir=./tmp, but there's no corresponding code to ensure the Tez 
> AM also has a similar tmpdir setting.  The client should set up the AM launch 
> context to have -Djava.io.tmpdir=./tmp to be consistent with the tasks and to 
> prevent accidental leaking of files in /tmp by the Tez AM if it crashes.





[jira] [Updated] (TEZ-2780) Tez UI: Update All Tasks page while in progress.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2780:

Attachment: TEZ-2780.2.patch

> Tez UI: Update All Tasks page while in progress.
> 
>
> Key: TEZ-2780
> URL: https://issues.apache.org/jira/browse/TEZ-2780
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2780.1.patch, TEZ-2780.2.patch, TEZ-2780.wip.1.patch
>
>
> Modify table component to automatically update cell based on model change.





[jira] [Created] (TEZ-2797) API V2: DAG details bug fixes

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2797:
---

 Summary: API V2: DAG details bug fixes
 Key: TEZ-2797
 URL: https://issues.apache.org/jira/browse/TEZ-2797
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram


- On connecting the UI with old APIs, the UI still tries to hit the new URLs 
and the progress in DAG details doesn't get updated.
- amVertexInfo remains empty on moving from DAG details -> any vertex details 
-> back to DAG details. Looks like the records were unloaded. They get 
populated only after the next poll request.





[jira] [Commented] (TEZ-2784) optimize TaskImpl.isFinished()

2015-09-09 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737250#comment-14737250
 ] 

Bikas Saha commented on TEZ-2784:
-

Does this need a backport to branch-0.7?

> optimize TaskImpl.isFinished()
> --
>
> Key: TEZ-2784
> URL: https://issues.apache.org/jira/browse/TEZ-2784
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Fix For: 0.8.1
>
> Attachments: AM_Profiler_snapshot.jpg, TEZ-2784.1.patch
>
>
> getInternalState() gets called multiple times within the same method while 
> holding the read lock. This shows up in the AM profiler when executing large jobs.
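
A rough sketch of the optimization, written in Python for illustration and not taken from the patch: read the state once under the lock and compare that single value. The Task class, the read counter, and the set of terminal states (SUCCEEDED, FAILED, KILLED) are assumptions of this sketch, and a plain threading.Lock stands in for the read lock.

```python
import threading
from enum import Enum


class TaskState(Enum):
    RUNNING = 1
    SUCCEEDED = 2
    FAILED = 3
    KILLED = 4


class Task:
    # Assumed terminal states for this sketch.
    _FINISHED = frozenset(
        {TaskState.SUCCEEDED, TaskState.FAILED, TaskState.KILLED})

    def __init__(self, state: TaskState):
        self._lock = threading.Lock()  # stand-in for the read lock
        self._state = state
        self.state_reads = 0  # counts lookups, for demonstration only

    def _internal_state(self) -> TaskState:
        self.state_reads += 1
        return self._state

    def is_finished(self) -> bool:
        with self._lock:
            # One lookup instead of one per comparison.
            state = self._internal_state()
            return state in self._FINISHED
```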





[jira] [Commented] (TEZ-2796) AM web service API V2: Add filtering & sorting

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737281#comment-14737281
 ] 

Hitesh Shah commented on TEZ-2796:
--

Support for fields is not related to filtering/sorting. Any reason why it is 
bundled into this and not a separate jira?

> AM web service API V2: Add filtering & sorting
> --
>
> Key: TEZ-2796
> URL: https://issues.apache.org/jira/browse/TEZ-2796
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Functionalities are yet to be finalized. Following is tentative.
> Query params:
> - fromMinID - Up to 'limit' entities, starting from the specified ID, would be fetched.
> - fields - Comma separated list. All the specified fields + ID would be 
> returned.
> - sortOn - Field to be sorted on
> - SortOrder - asc/dsc (ascending/descending). By default asc.
> * counters would be handled in another ticket.





[jira] [Updated] (TEZ-2795) AM web service V2: Refactoring & optimization

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2795:

Description: 
Make error handling better
- Currently, on error the function returns null and the caller checks for a null 
value, which is not the best approach.
- I guess the best option is to modify the exception to give a meaningful message, and 
then re-throw it. A base try..catch block can be put in to handle all 
exceptions.
- Gracefully fail invalid ids -  Server shouldn't return an error even if ids are 
invalid.

  was:
Make error handling better
- Currently, on error the function returns null and the caller checks for a null 
value, which is not the best approach.
- I guess the best option is to modify the exception to give a meaningful message, and 
then re-throw it. A base try..catch block can be put in to handle all 
exceptions.


> AM web service V2: Refactoring & optimization
> -
>
> Key: TEZ-2795
> URL: https://issues.apache.org/jira/browse/TEZ-2795
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>
> Make error handling better
> - Currently, on error the function returns null and the caller checks for a 
> null value, which is not the best approach.
> - I guess the best option is to modify the exception to give a meaningful message, 
> and then re-throw it. A base try..catch block can be put in to handle 
> all exceptions.
> - Gracefully fail invalid ids -  Server shouldn't return an error even if ids 
> are invalid.





[jira] [Created] (TEZ-2795) AM web service V2: Refactoring & optimization

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2795:
---

 Summary: AM web service V2: Refactoring & optimization
 Key: TEZ-2795
 URL: https://issues.apache.org/jira/browse/TEZ-2795
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram


Make error handling better
- Currently, on error the function returns null and the caller checks for a null 
value, which is not the best approach.
- I guess the best option is to modify the exception to give a meaningful message, and 
then re-throw it. A base try..catch block can be put in to handle all 
exceptions.
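
The approach above might look something like this sketch (Python, for illustration only, not the AM web service code). WebServiceError, parse_dag_id and handle_request are hypothetical names, and mapping a bad id to status 400 is an assumption.

```python
class WebServiceError(Exception):
    """Carries a status code and a message meant for the client."""

    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status


def parse_dag_id(raw):
    # Instead of returning None on bad input, re-throw with a
    # meaningful message and keep the original cause attached.
    try:
        return int(raw)
    except (TypeError, ValueError) as exc:
        raise WebServiceError(400, f"Invalid dagMinID: {raw!r}") from exc


def handle_request(params: dict) -> dict:
    # Single base handler: helpers never need null checks.
    try:
        dag_id = parse_dag_id(params.get("dagMinID"))
        return {"status": 200, "dagMinID": dag_id}
    except WebServiceError as err:
        return {"status": err.status, "error": str(err)}
```

With one handler at the top, each helper can simply raise, and the message reaches the client without any per-call null checks.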





[jira] [Commented] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737184#comment-14737184
 ] 

Sreenath Somarajapuram commented on TEZ-2792:
-

[~hitesh]
- if a user asks for 10 ids, if one id is not found, then a 404 is thrown. Is 
that intentional?
-- Yes. Receiving an invalid id suggests something is wrong, so I thought it 
might be better to throw.
-- Should that be graceful?

- Better error management
-- Totally agree. Would it be better to modify and re-throw the exception, 
so that it can be caught by a base try-catch?
-- Have created a ticket for the same: TEZ-2795

- getRequestedTasks
-- The function is something I would like to make better. 
-- Probably better error management might help a bit.

- minTaskId:
-- Sounds good, will add the same in a later patch. Do we have a final idea on 
how filters and sorting must work?


> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagMinID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





[jira] [Created] (TEZ-2796) AM web service API V2: Add filtering & sorting

2015-09-09 Thread Sreenath Somarajapuram (JIRA)
Sreenath Somarajapuram created TEZ-2796:
---

 Summary: AM web service API V2: Add filtering & sorting
 Key: TEZ-2796
 URL: https://issues.apache.org/jira/browse/TEZ-2796
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Sreenath Somarajapuram


Functionalities are yet to be finalized. Following is tentative.

Query params:
- fromMinID - Up to 'limit' entities, starting from the specified ID, would be fetched.
- fields - Comma separated list. All the specified fields + ID would be 
returned.
- sortOn - Field to be sorted on
- SortOrder - asc/dsc (ascending/descending). By default asc.

* counters would be handled in another ticket.
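
Since the functionality is still tentative, the following Python sketch is only one possible reading of the parameters. query_entities is a hypothetical name. Treating fromMinID as inclusive, and taking the fromMinID/limit window before applying sortOn, are both assumptions.

```python
def query_entities(entities, from_min_id=None, limit=100,
                   fields=None, sort_on=None, sort_order="asc"):
    """Window by id, then sort, then project the requested fields."""
    rows = sorted(entities, key=lambda e: e["id"])
    if from_min_id is not None:
        # Assumption: the specified id itself is included.
        rows = [e for e in rows if e["id"] >= from_min_id]
    rows = rows[:limit]
    if sort_on is not None:
        # "dsc" means descending; anything else is treated as ascending.
        rows = sorted(rows, key=lambda e: e[sort_on],
                      reverse=(sort_order == "dsc"))
    if fields is not None:
        # The id is always returned along with the requested fields.
        keep = ["id"] + [f for f in fields if f != "id"]
        rows = [{k: e[k] for k in keep if k in e} for e in rows]
    return rows
```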





[jira] [Comment Edited] (TEZ-2792) Add AM web service API for tasks.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737184#comment-14737184
 ] 

Sreenath Somarajapuram edited comment on TEZ-2792 at 9/9/15 5:01 PM:
-

[~hitesh]
- if a user asks for 10 ids, if one id is not found, then a 404 is thrown. Is 
that intentional?
-- Yes. Receiving an invalid id suggests something is wrong, so I thought it 
might be better to throw.
-- Should that be graceful?

- Better error management
-- Totally agree. Would it be better to modify and re-throw the exception, 
so that it can be caught by a base try-catch?
-- Have created a ticket for the same: TEZ-2795

- getRequestedTasks
-- The function is something I would like to make better. 
-- Probably better error management might help a bit.

- minTaskId:
-- Sounds good, will add the same in a later patch. Do we have a final idea on 
how filters and sorting must work?
-- Created TEZ-2796 for the same.



was (Author: sreenath):
[~hitesh]
- if a user asks for 10 ids, if one id is not found, then a 404 is thrown. Is 
that intentional?
-- Yes. Receiving an invalid id suggests something is wrong, so I thought it 
might be better to throw.
-- Should that be graceful?

- Better error management
-- Totally agree. Would it be better to modify and re-throw the exception, 
so that it can be caught by a base try-catch?
-- Have created a ticket for the same: TEZ-2795

- getRequestedTasks
-- The function is something I would like to make better. 
-- Probably better error management might help a bit.

- minTaskId:
-- Sounds good, will add the same in a later patch. Do we have a final idea on 
how filters and sorting must work?


> Add AM web service API for tasks.
> -
>
> Key: TEZ-2792
> URL: https://issues.apache.org/jira/browse/TEZ-2792
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2792.1.patch
>
>
> Add AM API for fetching realtime tasks info:
> - API endpoint : /ws/v2/tez/tasksInfo
> - Query Params:
> -- dagMinID: dagMinID = dagIndex, (mandatory).
> -- vertexMinID: A comma separated list. vertexMinID = vertexIndex.
> -- taskMinID: A comma separated list. taskMinID = vertexIndex_taskIndex
> -- limit: Maximum number of items to be returned (Defaults to 100).
> - If taskMinID is passed: All the specified tasks (capped by limit) will be 
> returned. vertexMinID, if present, won't be considered.
> - If vertexMinID is passed: All (capped by limit) tasks under the vertices 
> will be returned.
> - If just dagMinID is passed: All (capped by limit) tasks under the DAG will be 
> returned.
> - Data returned: complete task id, progress, status





[jira] [Commented] (TEZ-2774) Reduce logging in the AM, and parts of the runtime

2015-09-09 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737237#comment-14737237
 ] 

Siddharth Seth commented on TEZ-2774:
-

Scenarios make all the difference. If speculation is the concern, some of these 
messages can be enabled only when speculation is turned on - instead of turning 
them on for all running jobs.
Task completions are already available via the history events.
The task scheduler logs for min-held and prewarm are already at INFO level - 
requested by other reviewers as well, for the same reason.
The input spec log is already present in the TaskSpec log.

The intent is to reduce the size of the AM log - which otherwise has a lot of 
duplicated information, and becomes difficult to read. Keeping all log lines 
wholesale because they've been helpful in the past means we'll keep these later 
as well. Instead of keeping all of them, it's better to look at where else this 
information is available, how relevant it is, and whether it can be obtained 
without excessive logging.
The task request to the scheduler - I don't really see how this helps anything. 
Logging the equivalent of the CLC once per vertex should give us enough 
information on what tasks are requesting.
So once again, please look at what is available after the changes go in, and 
suggest potential improvements - with scenarios in mind - rather than trying to 
keep all log lines with duplicated information.

> Reduce logging in the AM, and parts of the runtime
> --
>
> Key: TEZ-2774
> URL: https://issues.apache.org/jira/browse/TEZ-2774
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2774.1.txt, TEZ-2774.2.txt
>
>






[jira] [Commented] (TEZ-2780) Tez UI: Update All Tasks page while in progress.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737324#comment-14737324
 ] 

Sreenath Somarajapuram commented on TEZ-2780:
-

[~pramachandran]
- The mixin was created for all table pages with the assumption that only the 
respective entity type would have to be polled.
- It was created with the intention to add polling functionality to a table 
controller with the least amount of code.
- That said, I totally agree that on considering a future use-case of multiple 
polling this would be of less use. Hence I have moved all the functionalities 
into a new class named EntityArrayPollster (inherited from pollster).

The changes can be found in patch 2. Please review.

> Tez UI: Update All Tasks page while in progress.
> 
>
> Key: TEZ-2780
> URL: https://issues.apache.org/jira/browse/TEZ-2780
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2780.1.patch, TEZ-2780.2.patch, TEZ-2780.wip.1.patch
>
>
> Modify table component to automatically update cell based on model change.





[jira] [Updated] (TEZ-2780) Tez UI: Update All Tasks page while in progress.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2780:

Description: 
Modify table component to automatically update cell based on in-progress data.
#. Upgrade polling logic to manage a specific entity.
#. Added progress column to All Tasks table.
#. Updated table to automatically reflect model change - Just need to set 
observePath to true in defaultColumnConfigs.
#. Updated table logic to send action on row change - so that polling logic can 
query for fresh records.

  was:
Modify table component to automatically update cell based on model change.
#. Upgraded pollster to manage polling for a specific entity.
#. Added progress column to All Tasks table.
#. Updated table to automatically reflect model change - Just need to set 
observePath to true in defaultColumnConfigs.
#. Updated table logic to send action on row change.


> Tez UI: Update All Tasks page while in progress.
> 
>
> Key: TEZ-2780
> URL: https://issues.apache.org/jira/browse/TEZ-2780
> Project: Apache Tez
>  Issue Type: Sub-task
>  Components: UI
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-2780.1.patch, TEZ-2780.2.patch, TEZ-2780.wip.1.patch
>
>
> Modify table component to automatically update cell based on in-progress data.
> #. Upgrade polling logic to manage a specific entity.
> #. Added progress column to All Tasks table.
> #. Updated table to automatically reflect model change - Just need to set 
> observePath to true in defaultColumnConfigs.
> #. Updated table logic to send action on row change - so that polling logic 
> can query for fresh records.





[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats

2015-09-09 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737124#comment-14737124
 ] 

Rohini Palaniswamy commented on TEZ-2658:
-

[~saikatr],
  We will need external documentation similar to the following:
http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html#job
Can you also attach a document with the output of each command so that it 
is easy to get an idea of all the features added and then review them?

> Create a CLI utility tool to track Tez DAG/Application Stats
> 
>
> Key: TEZ-2658
> URL: https://issues.apache.org/jira/browse/TEZ-2658
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Saikat
>Assignee: Saikat
> Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, 
> TEZ-2658.4.patch, TEZ-2658.5.patch
>
>






[jira] [Updated] (TEZ-2785) Tez UI: Implement an abstract controller & route to enclose basic generic functionalities.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2785:

Description: Move setup, reset & isActive from TablePageController to this 
abstract controller. Same for route.  (was: Move setup, reset is Active from 
TablePageController to this abstract controller.)

> Tez UI: Implement an abstract controller & route to enclose basic generic 
> functionalities.
> --
>
> Key: TEZ-2785
> URL: https://issues.apache.org/jira/browse/TEZ-2785
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Sreenath Somarajapuram
>Priority: Minor
>
> Move setup, reset & isActive from TablePageController to this abstract 
> controller. Same for route.





[jira] [Updated] (TEZ-2785) Tez UI: Implement an abstract controller & route to enclose basic generic functionalities.

2015-09-09 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-2785:

Summary: Tez UI: Implement an abstract controller & route to enclose basic 
generic functionalities.  (was: Tez UI: Implement an abstract controller to 
enclose basic generic functionalities.)

> Tez UI: Implement an abstract controller & route to enclose basic generic 
> functionalities.
> --
>
> Key: TEZ-2785
> URL: https://issues.apache.org/jira/browse/TEZ-2785
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Sreenath Somarajapuram
>Priority: Minor
>
> Move setup, reset is Active from TablePageController to this abstract 
> controller.





[jira] [Updated] (TEZ-2775) Reduce logging in runtime components

2015-09-09 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-2775:

Attachment: TEZ-2775.1.txt

First cut of the patch.

[~rajesh.balamohan] - please review.
The patch retains a log message for each successful fetch, which causes the 
logs to be quite large. It also retains the HttpConnection string used for 
fetches. Can either of these be removed? Also, does the patch remove any 
information that is critical to debugging?
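
As an illustration of the trade-off above, one possible way to shrink the logs without losing the information entirely is to demote the per-fetch message to debug level and emit a periodic summary at info level. The sketch below is only that, a sketch: the class FetchLogSummarySketch, the method onFetchSuccess and the summaryInterval parameter are hypothetical names, not part of Tez or of the attached patch, and java.util.logging is used only to keep the example self-contained.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; the class and method names are illustrative, not Tez APIs.
class FetchLogSummarySketch {
    private static final Logger LOG = Logger.getLogger("FetchLogSummarySketch");

    private final int summaryInterval; // assumed to be a positive number of fetches
    private long fetchCount;
    private long totalBytes;

    FetchLogSummarySketch(int summaryInterval) {
        this.summaryInterval = summaryInterval;
    }

    // Records one successful fetch. Returns the summary line when one is
    // emitted, otherwise null.
    String onFetchSuccess(String source, long bytes) {
        fetchCount++;
        totalBytes += bytes;

        // Per-fetch detail is only written when debug-level logging is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Fetched " + bytes + " bytes from " + source);
        }

        // One consolidated line every summaryInterval fetches.
        if (fetchCount % summaryInterval == 0) {
            String summary = "Completed " + fetchCount + " fetches, "
                    + totalBytes + " bytes total";
            LOG.info(summary);
            return summary;
        }
        return null;
    }
}
```

With a scheme like this, the per-fetch detail remains available for debugging when the debug level is switched on, while the default log volume grows with the number of summaries rather than the number of fetches.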

> Reduce logging in runtime components
> 
>
> Key: TEZ-2775
> URL: https://issues.apache.org/jira/browse/TEZ-2775
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
> Attachments: TEZ-2775.1.txt
>
>
> Specifically Shuffle, which logs a lot for each event being processed and 
> data being fetched.
> Also PipelinedShuffle is fairly noisy - some of the information from here 
> could be consolidated.





[jira] [Assigned] (TEZ-2775) Reduce logging in runtime components

2015-09-09 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned TEZ-2775:
---

Assignee: Siddharth Seth

> Reduce logging in runtime components
> 
>
> Key: TEZ-2775
> URL: https://issues.apache.org/jira/browse/TEZ-2775
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: TEZ-2775.1.txt
>
>
> Specifically Shuffle, which logs a lot for each event being processed and 
> data being fetched.
> Also PipelinedShuffle is fairly noisy - some of the information from here 
> could be consolidated.





[jira] [Commented] (TEZ-2787) Tez AM should have java.io.tmpdir=./tmp to be consistent with tasks

2015-09-09 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737646#comment-14737646
 ] 

Hitesh Shah commented on TEZ-2787:
--

+1 to [~jlowe]'s comment. This could be done by adding the code changes into 
the constructAMLaunchOpts() function before the user configs are applied. That 
would also make it easier to unit test this fix.
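
A minimal sketch of the ordering suggested above, assuming the launch options are assembled as a single space-separated string. The class AmLaunchOptsSketch and the helper buildAmLaunchOpts are hypothetical stand-ins; the real constructAMLaunchOpts() signature is not shown in this thread, so nothing here should be read as the actual Tez client code.

```java
import java.util.StringJoiner;

// Hypothetical sketch; not the actual Tez client code.
class AmLaunchOptsSketch {
    // Default taken from the issue description.
    static final String DEFAULT_TMPDIR_OPT = "-Djava.io.tmpdir=./tmp";

    // Hypothetical stand-in for the option-building step: the default tmpdir
    // option is added first, and any user-configured options are appended
    // after it.
    static String buildAmLaunchOpts(String userOpts) {
        StringJoiner opts = new StringJoiner(" ");
        opts.add(DEFAULT_TMPDIR_OPT);
        if (userOpts != null && !userOpts.trim().isEmpty()) {
            opts.add(userOpts.trim());
        }
        return opts.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildAmLaunchOpts("-Xmx1024m"));
        System.out.println(buildAmLaunchOpts(null));
    }
}
```

Keeping the step in one small function that takes the user options as input is what makes the fix easy to unit test: a test can assert that the default is present and that it precedes whatever the user configured.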

> Tez AM should have java.io.tmpdir=./tmp to be consistent with tasks
> ---
>
> Key: TEZ-2787
> URL: https://issues.apache.org/jira/browse/TEZ-2787
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Jason Lowe
>Assignee: Jonathan Eagles
> Attachments: TEZ-2787.1.patch
>
>
> TezRuntimeChildJVM ensures that tasks are launched with 
> -Djava.io.tmpdir=./tmp, but there's no corresponding code to ensure the Tez 
> AM also has a similar tmpdir setting.  The client should set up the AM launch 
> context to have -Djava.io.tmpdir=./tmp to be consistent with the tasks and to 
> prevent accidental leaking of files in /tmp by the Tez AM if it crashes.


