[ 
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695937#comment-14695937
 ] 

Jason Lowe commented on TEZ-2628:
---------------------------------

bq. It seems to me that all the code in EntityLogger relies on YARN configs and 
there does not seem to be anything Tez-specific, i.e. could all of this just be 
wrapped in a different impl of TimelineClient itself?

[~zjshen] brought up a similar point in YARN-3942.  I plan on updating the YARN 
patch to move almost all the logic there so it's easier for app frameworks to 
use it.  The Tez-side changes should be much smaller after that.

bq. How do you see say the Hive client writing data to Timeline via HDFS? Would 
that be feasible or is the assumption that the existing write path via the 
Timeline webservices continues to work as long as all entities being written to 
via webservices are configured in the list of summary entity types?

The timeline store still accepts data posted via the web interface, and that 
data is stored in the traditional backend store (e.g., leveldb).  Data posted 
in that manner can be queried properly as long as we select the traditional 
backend store as the query source.  That will occur if any of the following are 
true:
* the entity type is configured as a summary entity type
* nothing that can be construed as an app ID is found in the query string
* an app ID is derived from the query string but no corresponding data for 
that app ID is located in HDFS
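
As a rough illustration of the three rules above (a sketch only, not the code 
from the YARN-3942 patch), the choice of query source could look like the 
following.  The class and method names, the use of a regular expression to 
find the app ID, and the predicate standing in for the HDFS lookup are all 
assumptions made for this example; only the 
application_<clusterTimestamp>_<sequence> ID format comes from YARN itself.

```java
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decides which store should answer a timeline query.
final class QuerySourceSelector {
    enum Source { BACKEND_STORE, HDFS }

    // YARN application IDs have the form application_<clusterTimestamp>_<sequence>.
    private static final Pattern APP_ID = Pattern.compile("application_\\d+_\\d+");

    // Returns the first thing in the query string that looks like an app ID, if any.
    static Optional<String> findAppId(String query) {
        Matcher m = APP_ID.matcher(query);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // summaryTypes: entity types configured as summary entity types.
    // hasHdfsData: stand-in (assumption) for "app ID data exists in HDFS".
    static Source choose(String entityType, String query,
                         Set<String> summaryTypes, Predicate<String> hasHdfsData) {
        // Rule 1: summary entity types are always served by the backend store.
        if (summaryTypes.contains(entityType)) {
            return Source.BACKEND_STORE;
        }
        // Rule 2: no app ID can be derived from the query string.
        Optional<String> appId = findAppId(query);
        if (!appId.isPresent()) {
            return Source.BACKEND_STORE;
        }
        // Rule 3: an app ID was derived, but HDFS holds no data for it.
        if (!hasHdfsData.test(appId.get())) {
            return Source.BACKEND_STORE;
        }
        // Otherwise the per-application data in HDFS is the query source.
        return Source.HDFS;
    }
}
```

In this sketch the backend store is the fallback in every uncertain case, 
which is why data posted through the web service stays queryable as long as 
one of the three conditions holds.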

There will be issues if something posts data via one interface (e.g., HDFS) 
and something else tries to update or override that data via another (e.g., 
the web service).  I'm not totally familiar with the Hive server use case, so 
there might be some problems there.  We might be able to make the query 
processing more sophisticated by looking at the main store _and_ HDFS as data 
sources rather than one or the other.  However, I'm not sure whether we'll 
have issues resolving ordering problems (e.g., if an entity is in both 
sources, which one is correct, or is it some combination of the two?).

> History logging plugin to write ATS events to HDFS
> --------------------------------------------------
>
>                 Key: TEZ-2628
>                 URL: https://issues.apache.org/jira/browse/TEZ-2628
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: TEZ-2628.001.patch
>
>
> This provides another history logging alternative that is conceptually the same 
> as the timeline logging service but logs the entities to a file rather than 
> posting the events to the timeline server directly.  When coupled with the 
> timeline store plugin from YARN-3942 it allows the Tez job to be decoupled 
> from the timeline server yet the Tez UI can still function properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
