[
https://issues.apache.org/jira/browse/TEZ-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695937#comment-14695937
]
Jason Lowe commented on TEZ-2628:
---------------------------------
bq. It seems to me that all the code in EntityLogger relies on YARN configs and
there does not seem to be anything Tez-specific, i.e. could all of this just be
wrapped in a different impl of TimelineClient itself?
[~zjshen] brought up a similar point in YARN-3942. I plan on updating the YARN
patch to move almost all the logic there so it's easier for app frameworks to
use it. The Tez-side changes should be much smaller after that.
bq. How do you see, say, the Hive client writing data to Timeline via HDFS? Would
that be feasible, or is the assumption that the existing write path via the
Timeline webservices continues to work as long as all entities being written
via webservices are configured in the list of summary entity types?
The timeline store still accepts data posted via the web interface, which is
stored in the traditional backend store (e.g.: leveldb). Data posted in that
manner can be queried properly as long as the traditional backend store is
selected as the query source. That will occur if any of the following are
true:
* the entity type is configured as a summary entity type
* nothing that can be construed as an app ID is found in the query string
* an app ID is derived from the query string but no corresponding app ID data
is located in HDFS
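
The selection rules above can be sketched roughly as follows. This is only an illustrative sketch, not the code from the patch: the function names, the "backend"/"hdfs" labels, and the entity type names in the usage lines are hypothetical, and it assumes the app ID appears in the query string in YARN's usual application_<timestamp>_<sequence> form.

```python
import re
from typing import Callable, Optional, Set

# Assumption: an app ID shows up in the query in YARN's standard
# application_<clusterTimestamp>_<sequence> form. How the real store
# derives an app ID from a query may differ.
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")


def extract_app_id(query: str) -> Optional[str]:
    """Return the first thing in the query that looks like an app ID, if any."""
    match = APP_ID_PATTERN.search(query)
    return match.group(0) if match else None


def choose_query_source(
    entity_type: str,
    query: str,
    summary_entity_types: Set[str],
    hdfs_has_app_data: Callable[[str], bool],
) -> str:
    """Pick "backend" (e.g. leveldb) or "hdfs" as the query source.

    hdfs_has_app_data is a hypothetical callback that reports whether
    data for the given app ID has been located in HDFS.
    """
    # Rule 1: summary entity types are always served by the backend store.
    if entity_type in summary_entity_types:
        return "backend"
    # Rule 2: no recognizable app ID in the query string.
    app_id = extract_app_id(query)
    if app_id is None:
        return "backend"
    # Rule 3: an app ID was derived, but HDFS holds no data for it.
    if not hdfs_has_app_data(app_id):
        return "backend"
    # Otherwise the per-application data in HDFS is the query source.
    return "hdfs"


# Usage with made-up entity type names:
summary_types = {"SUMMARY_TYPE"}
query = "primaryFilter=applicationId:application_1439000000000_0001"
source = choose_query_source("DETAIL_TYPE", query, summary_types, lambda app_id: True)
```

In this sketch, anything posted through the web service stays queryable only while one of the three rules routes the query to the backend store.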
There will be issues if something is posting data via one interface (e.g.:
HDFS) and another tries to update/override that data via another (e.g.: web
service). I'm not totally familiar with the Hive server use case, so there
might be some problems there. We might be able to make the query processing
more sophisticated by looking at the main store _and_ HDFS as data sources
rather than one or the other. However, I'm not sure whether we'd have trouble
resolving ordering problems (e.g.: if an entity is in both sources, which one
is correct, or is it some combination of the two?).
> History logging plugin to write ATS events to HDFS
> --------------------------------------------------
>
> Key: TEZ-2628
> URL: https://issues.apache.org/jira/browse/TEZ-2628
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: TEZ-2628.001.patch
>
>
> This provides another history logging alternative that is conceptually the same
> as the timeline logging service but logs the entities to a file rather than
> posting the events to the timeline server directly. When coupled with the
> timeline store plugin from YARN-3942 it allows the Tez job to be decoupled
> from the timeline server yet the Tez UI can still function properly.