[
https://issues.apache.org/jira/browse/TEZ-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494529#comment-14494529
]
Jeff Zhang edited comment on TEZ-2319 at 4/14/15 6:24 PM:
----------------------------------------------------------
[~rohini]
* Regarding the jobconf.xml, I think it is serialized into the payload of the
Processor, and should already be in the dagPlan of the DAGSubmittedEvent.
* Regarding the job history data, are the TaskAttemptFinishedEvent and
TaskFinishedEvent sufficient for you?
BTW, shouldn't the job history data analysis be based on the ATS? Currently the
data written to HDFS is only for recovery.
> DAG history in HDFS
> -------------------
>
> Key: TEZ-2319
> URL: https://issues.apache.org/jira/browse/TEZ-2319
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Rohini Palaniswamy
>
> We have processes that parse jobconf.xml and job history details (map and
> reduce task details, etc.) in Avro files from HDFS and load them into Hive
> tables for analysis of MapReduce jobs. We would like Tez to also write this
> information to a history file in HDFS when the AM or each DAG completes,
> so that we can do analytics on Tez jobs.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)