[ 
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706543#comment-14706543
 ] 

Rajesh Balamohan commented on TEZ-2628:
---------------------------------------


Tested this on a small-scale cluster. It works fine, but there are some issues 
when downloading data from ATS. There are two ways to download Tez data from 
ATS: 1. via the UI "download data" button, and 2. via "ATSImportTool", which 
is a command-line utility.

In both cases, the data downloaded via ATS (with 1.5 patch YARN-3942) is 
incomplete. If "fromId" is specified in the URL, the data comes back in no 
predictable pattern. (e.g., 
http://atsmachine:8188/ws/v1/timeline/TEZ_TASK_ID?limit=3&primaryFilter=TEZ_VERTEX_ID:vertex_1439860407967_0054_1_11&fromId=task_1439860407967_0054_1_11_000420
 will return different data than 
http://atsmachine:8188/ws/v1/timeline/TEZ_TASK_ID?limit=3&primaryFilter=TEZ_VERTEX_ID:vertex_1439860407967_0054_1_11).
 So when pagination is involved (e.g., downloading 100 tasks at a time), the 
download runs into issues and cannot retrieve the complete data.
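
For reference, the kind of paging that runs into this can be sketched roughly 
as below. This is only an illustration, not the actual ATSImportTool or UI 
code. It assumes that "fromId" is inclusive and that each returned entity 
carries its ID under the "entity" key; fetch_page, build_url and fetch_all are 
hypothetical names made up for this sketch.

```python
from urllib.parse import urlencode


def build_url(base, entity_type, vertex_id, limit, from_id=None):
    """Build a timeline query URL shaped like the ones quoted above."""
    params = [("limit", limit), ("primaryFilter", "TEZ_VERTEX_ID:" + vertex_id)]
    if from_id is not None:
        params.append(("fromId", from_id))
    # Keep ":" unescaped so the URL matches the form used in this comment.
    return "%s/ws/v1/timeline/%s?%s" % (base, entity_type, urlencode(params, safe=":"))


def fetch_all(fetch_page, page_size=100):
    """Collect all entities by paging with limit and fromId.

    fetch_page(from_id, limit) is a hypothetical callable that performs the
    request and returns a list of entity dicts (each with an "entity" key).
    """
    seen = {}          # entity ID -> entity, keeps first occurrence in order
    cursors = set()    # fromId values already used, to detect a paging loop
    from_id = None
    while True:
        # Ask for one extra entity; its ID becomes the next page's fromId
        # (assumption: fromId is inclusive).
        page = fetch_page(from_id, page_size + 1)
        for entity in page[:page_size]:
            seen.setdefault(entity["entity"], entity)
        if len(page) <= page_size:
            break  # last page reached
        next_id = page[page_size]["entity"]
        if next_id in cursors:
            break  # store returned an inconsistent page; stop instead of looping
        cursors.add(next_id)
        from_id = next_id
    return list(seen.values())
```

If the store honours "fromId" consistently, fetch_all returns every entity 
exactly once; with the behaviour described above, pages overlap or skip 
entities and the result is incomplete.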

It is possible that with YARN-3942 it ends up using MemoryTimelineStore, whose 
getEntities implementation differs from the one in LevelDBStore. This could be 
causing the issue, but I am not sure.

An alternate workaround would be to specify limit=100000, so that all tasks 
are downloaded in a single fetch, but I am not sure whether leveldb imposes 
any default restriction on the limit. TEZ-UI does not have the issue, as it 
uses a very high value for "limit" during the first pull.


> History logging plugin to write ATS events to HDFS
> --------------------------------------------------
>
>                 Key: TEZ-2628
>                 URL: https://issues.apache.org/jira/browse/TEZ-2628
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: TEZ-2628.001.patch, TEZ-2628.002.patch, 
> hive-timeline.json
>
>
> This provides another history logging alternative that is conceptually the 
> same as the timeline logging service but logs the entities to a file rather 
> than posting the events to the timeline server directly.  When coupled with 
> the timeline store plugin from YARN-3942, it allows the Tez job to be 
> decoupled from the timeline server, yet the Tez UI can still function 
> properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
