[ 
https://issues.apache.org/jira/browse/TEZ-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-2076:
----------------------------------
    Attachment: TEZ-2076.12.patch

Usage guidelines and javadocs still point to "./target" - this should be 
removed. A simple "tez-history-parser-x.y.z.jar" should be sufficient.
- Fixed. 

org.apache.tez.history.ATSImportTool is the main class for the jar - why does 
it still need to be specified on the command line?
- Fixed doc/usage to reflect this.

s/atsAddress/ with something related to yarn timeline
- Renamed to yarnTimelineAddress

batchSize is obtained from command line but "this.batchSize = 
conf.getInt(BATCH_SIZE, BATCH_SIZE_DEFAULT);" still exists?
- Fixed.

download() still has 4 catch exception blocks. Not sure why they all need a 
TezException wrapper.
- Fixed.

yarnConfig.useHttps - does this compile against older versions of hadoop?
- I believe useHttps() was introduced in Hadoop 2.4. Using a local 
implementation now.
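
For illustration only, a local check of this kind could look roughly like the 
sketch below. It is not the code in the patch. It assumes the scheme is derived 
from the "yarn.http.policy" setting, with "HTTPS_ONLY" meaning https; the class 
and method names are hypothetical, and a plain Map stands in for the Hadoop 
Configuration so the snippet stays self-contained.

```java
import java.util.Map;

// Hypothetical helper, not taken from the patch.
final class HttpPolicySketch {
    // Assumed property name and values, following yarn-site.xml conventions.
    static final String YARN_HTTP_POLICY_KEY = "yarn.http.policy";
    static final String HTTP_ONLY = "HTTP_ONLY";
    static final String HTTPS_ONLY = "HTTPS_ONLY";

    // Returns true only when the configured policy is HTTPS_ONLY.
    // A missing or null value falls back to HTTP_ONLY.
    static boolean useHttps(Map<String, String> conf) {
        String policy = conf.get(YARN_HTTP_POLICY_KEY);
        if (policy == null) {
            policy = HTTP_ONLY;
        }
        return HTTPS_ONLY.equalsIgnoreCase(policy.trim());
    }

    // Picks the URL scheme prefix for the timeline server address.
    static String scheme(Map<String, String> conf) {
        return useHttps(conf) ? "https://" : "http://";
    }
}
```

Keeping the check local avoids a compile-time dependency on a method that 
older Hadoop releases may not provide.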

no checks on atsAddress being empty or null before using it.
- Fixed
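
As a rough sketch of the kind of guard meant here (the class and method names 
are hypothetical and the actual check in the patch may differ), the address can 
be validated once before it is used to build any URL:

```java
// Hypothetical helper, not taken from the patch.
final class TimelineAddressCheck {
    // Rejects null, empty, or whitespace-only addresses and returns the
    // trimmed value otherwise.
    static String requireAddress(String yarnTimelineAddress) {
        if (yarnTimelineAddress == null || yarnTimelineAddress.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "yarnTimelineAddress must not be null or empty");
        }
        return yarnTimelineAddress.trim();
    }
}
```

Failing fast with a clear message is an assumption about the desired behaviour; 
falling back to a configured default would be another reasonable choice.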

Why does IOException need to be wrapped into TezException in logErrorMessage?
- There is no need to catch IOException as it is logged in run(). Removed from 
logErrorMessage.

logErrorMessage ends up using multiple lines for a single error response. 
Shouldn't it be on a single log line?
- Sometimes ATS returns the error in plain HTML format. It might be easier to 
read it as-is than on a single line.

Why is this needed: "LOG.error(Throwables.getStackTraceAsString(e));" ?
- It was a System.out statement earlier which was replaced with LOG.error 
later.  Removed this unwanted line.

Might be good to log both started and finished parsing instead of just at the 
end e.g. "LOG.info("Parsed task information");"
- Fixed. Set it to debug level.

dag info should also have tez version information - might be good to capture 
that into the object model.
- Version info is present in TEZ_APPLICATION. Added this under the 
"TEZ_APPLICATION" tag when downloading data from ATS. However, this also needs 
to be added in Tez-UI on "download" click. Will create a separate JIRA for that.
Also added version details to DagInfo when parsing.

> Tez framework to extract/analyze data stored in ATS for specific dag
> --------------------------------------------------------------------
>
>                 Key: TEZ-2076
>                 URL: https://issues.apache.org/jira/browse/TEZ-2076
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-2076.1.patch, TEZ-2076.10.patch, TEZ-2076.11.patch, 
> TEZ-2076.12.patch, TEZ-2076.2.patch, TEZ-2076.3.patch, TEZ-2076.4.patch, 
> TEZ-2076.5.patch, TEZ-2076.6.patch, TEZ-2076.7.patch, TEZ-2076.8.patch, 
> TEZ-2076.9.patch, TEZ-2076.WIP.2.patch, TEZ-2076.WIP.3.patch, 
> TEZ-2076.WIP.patch
>
>
> - Users should be able to download ATS data pertaining to a DAG from Tez-UI 
> (more like a zip file containing DAG/Vertex/Task/TaskAttempt info).
> - This can be plugged into an analyzer which parses the data, adds semantics 
> and provides an in-memory representation for further analysis.
> - This will enable writing different analyzer rules, which can be run on top 
> of this in-memory representation to come up with analysis on the DAG.
> - Results of these analyzer rules can be rendered in a UI (standalone webapp) 
> at a later point in time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
