[ https://issues.apache.org/jira/browse/TEZ-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540874#comment-14540874 ]
Hitesh Shah commented on TEZ-2076:
----------------------------------
Comments:
- usage guidelines and javadocs still point to "./target" - should be
removed. A simple "tez-history-parser-x.y.z.jar" should be sufficient.
- org.apache.tez.history.ATSImportTool is the main class for the jar - why
does it still need to be specified on the command line?
- Rename atsAddress to something related to the YARN timeline.
- batchSize is obtained from command line but "this.batchSize =
conf.getInt(BATCH_SIZE, BATCH_SIZE_DEFAULT);" still exists?
- there is both a ctor and init() which is causing some confusion in how some
fields are initialized. Not sure why there is a need for both.
- download() still has 4 catch exception blocks. Not sure why they all need a
TezException wrapper.
- yarnConfig.useHttps - does this compile against older versions of hadoop?
- no checks on atsAddress being empty or null before using it.
- Why does IOException need to be wrapped into TezException in
logErrorMessage?
- logErrorMessage ends up using multiple lines for a single error response.
Shouldn't it be on a single log line?
- Why is this needed: "LOG.error(Throwables.getStackTraceAsString(e));" ?
- Might be good to log both the start and the end of parsing instead of only
the end, e.g. "LOG.info("Parsed task information");"
- dag info should also have tez version information - might be good to
capture that into the object model.
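
To make two of the points above concrete (the missing null/empty check on the address, and keeping an error response on one log line), here is a minimal sketch. The class and method names are hypothetical and are not taken from the patch; it only illustrates the kind of check and flattening being asked for, using plain JDK calls.

```java
// Hypothetical helper; names are illustrative and do not come from the patch.
final class ImportToolChecks {

  private ImportToolChecks() {}

  /** Rejects a null or blank timeline address before it is used to build URLs. */
  static String requireTimelineAddress(String address) {
    if (address == null || address.trim().isEmpty()) {
      throw new IllegalArgumentException(
          "YARN timeline address must not be null or empty");
    }
    return address.trim();
  }

  /** Collapses a multi-line error response into one line for a single log call. */
  static String toSingleLine(String responseBody) {
    if (responseBody == null) {
      return "";
    }
    // \R matches any line break; surrounding whitespace is folded into one space.
    return responseBody.trim().replaceAll("\\s*\\R\\s*", " ");
  }
}
```

With something like this, the tool could fail fast on a bad address and call LOG.error once with toSingleLine(body) instead of emitting one log line per line of the response.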
> Tez framework to extract/analyze data stored in ATS for specific dag
> --------------------------------------------------------------------
>
> Key: TEZ-2076
> URL: https://issues.apache.org/jira/browse/TEZ-2076
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2076.1.patch, TEZ-2076.10.patch, TEZ-2076.11.patch,
> TEZ-2076.2.patch, TEZ-2076.3.patch, TEZ-2076.4.patch, TEZ-2076.5.patch,
> TEZ-2076.6.patch, TEZ-2076.7.patch, TEZ-2076.8.patch, TEZ-2076.9.patch,
> TEZ-2076.WIP.2.patch, TEZ-2076.WIP.3.patch, TEZ-2076.WIP.patch
>
>
> - Users should be able to download ATS data pertaining to a DAG from Tez-UI
> (more like a zip file containing DAG/Vertex/Task/TaskAttempt info).
> - This can be plugged into an analyzer which parses the data, adds semantics
> and provides an in-memory representation for further analysis.
> - This will make it possible to write different analyzer rules, which can be
> run on top of this in-memory representation to produce an analysis of the DAG.
> - Results of these analyzer rules can be rendered in a UI (standalone webapp)
> at a later point in time.
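
The flow in the quoted description (an in-memory representation of a DAG, with analyzer rules run on top of it) could look roughly like the sketch below. All types here (DagSnapshot, AnalyzerRule, SlowTaskRule) are hypothetical and exist only for illustration; they are not the object model from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model: task id -> runtime in milliseconds for one DAG.
final class DagSnapshot {
  final String dagId;
  final Map<String, Long> taskRuntimesMs = new LinkedHashMap<>();

  DagSnapshot(String dagId) {
    this.dagId = dagId;
  }
}

// Hypothetical rule contract: inspect the snapshot and return findings as text.
interface AnalyzerRule {
  List<String> analyze(DagSnapshot dag);
}

// Example rule: flag every task that ran longer than a fixed threshold.
final class SlowTaskRule implements AnalyzerRule {
  private final long thresholdMs;

  SlowTaskRule(long thresholdMs) {
    this.thresholdMs = thresholdMs;
  }

  @Override
  public List<String> analyze(DagSnapshot dag) {
    List<String> findings = new ArrayList<>();
    for (Map.Entry<String, Long> e : dag.taskRuntimesMs.entrySet()) {
      if (e.getValue() > thresholdMs) {
        findings.add(e.getKey() + " ran for " + e.getValue() + " ms");
      }
    }
    return findings;
  }
}
```

Keeping each rule behind a small interface means new rules can be added without touching the parser, and the list of findings can be handed to whatever renders the results later.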