[
https://issues.apache.org/jira/browse/TEZ-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513352#comment-14513352
]
Hitesh Shah commented on TEZ-2076:
----------------------------------
Comments:
- ATSImportTool
- no docs or usage information on how to invoke it
- It seems to take params via config rather than the command line. There is no
way for a user to figure out what the parameters are.
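For instance (a sketch only; the option names below are made up, not taken from the patch), a couple of commons-cli options plus a help printout would make the parameters discoverable from the command line:
{code}
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.GnuParser;
import org.apache.commons.cli.HelpFormatter;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

public class UsageSketch {
  public static void main(String[] args) throws ParseException {
    // Hypothetical options; the real tool should expose whatever it
    // currently reads from the Configuration.
    Options options = new Options();
    options.addOption("d", "dagId", true, "DAG id to download");
    options.addOption("o", "downloadDir", true, "directory to write the zip to");
    options.addOption("h", "help", false, "print usage");

    CommandLine cmd = new GnuParser().parse(options, args);
    if (cmd.hasOption("help") || !cmd.hasOption("dagId")) {
      new HelpFormatter().printHelp("ATSImportTool", options);
      return;
    }
    // ... proceed with the download using cmd.getOptionValue("dagId"), etc.
  }
}
{code}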
{code}
} catch (TezException e) {
return -1;
}
{code}
- not logging the exception means the user is blind when an error happens.
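A minimal sketch of what that could look like (assuming an slf4j-style LOG is already available in the class):
{code}
} catch (TezException e) {
  // log before bailing out so the user can see why the import failed
  LOG.error("Error while importing data from ATS", e);
  return -1;
}
{code}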
{code}
yarn jar ./target/tez-history-parser-X.X.X.jar org.apache.tez.history.ATSImportTool
{code}
- does the jar have other executable classes, such that a mainClass always has
to be provided?
- also please remove target from the usage info
- s/X.X.X/x.y.z/g
The jar is not a fat jar but it does need additional dependencies. Are you
expecting all dependencies to be provided in the yarn classpath? If yes, why
and how?
What happens when some of the data is downloaded but the rest fails to download?
What happens if the tool is run while a DAG is still in progress? Will it give
invalid data back? Should that case be handled by throwing an error, or is it
enough to warn the user as needed?
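If warning is enough, a sketch along these lines would do; note that the "otherinfo"/"status" key names and the terminal-state list are assumptions about the ATS JSON layout, not taken from the patch:
{code}
import org.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DagStateCheck {
  private static final Logger LOG = LoggerFactory.getLogger(DagStateCheck.class);

  // Returns true only when the downloaded DAG has reached a terminal state.
  static boolean isTerminal(JSONObject dagJson, String dagId) {
    JSONObject otherInfo = dagJson.optJSONObject("otherinfo");
    String status = (otherInfo != null) ? otherInfo.optString("status") : "";
    boolean terminal = "SUCCEEDED".equals(status) || "FAILED".equals(status)
        || "KILLED".equals(status) || "ERROR".equals(status);
    if (!terminal) {
      LOG.warn("DAG " + dagId + " is not in a terminal state (status=" + status
          + "); the downloaded data may be incomplete or invalid");
    }
    return terminal;
  }
}
{code}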
AbstractInfo seems like a bad classname - the name does not signify what it is
for. Maybe rename it to BaseInfo and keep it an abstract class?
Should all the info objects representing the data be moved to a package, say
parser.datamodel?
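To make that concrete, a purely illustrative layout (names and fields are placeholders, not the patch contents) could be:
{code}
package org.apache.tez.history.parser.datamodel;

// Base type for the info objects; DagInfo, VertexInfo, TaskInfo and
// TaskAttemptInfo would extend this within the same package.
public abstract class BaseInfo {
  private final String name;

  protected BaseInfo(String name) {
    this.name = name;
  }

  public String getName() {
    return name;
  }
}
{code}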
How is versioning being handled in the serialized zip structure? Also, why JSON
as compared to, say, a protobuf structure?
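One option (purely a sketch; the field name is hypothetical) is to stamp a format version into the serialized JSON so the parser can reject or adapt to older archives:
{code}
import org.json.JSONException;
import org.json.JSONObject;

public class ArchiveVersion {
  // Bump whenever the serialized layout changes.
  static final int DATA_FORMAT_VERSION = 1;

  static void stamp(JSONObject root) throws JSONException {
    root.put("dataFormatVersion", DATA_FORMAT_VERSION);
  }

  static void verify(JSONObject root) throws JSONException {
    int version = root.optInt("dataFormatVersion", -1);
    if (version != DATA_FORMAT_VERSION) {
      throw new JSONException("Unsupported archive version: " + version);
    }
  }
}
{code}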
{code}
JSONArray attemptsJson = jsonObject.optJSONArray(Constants.TASK_ATTEMPTS);
{code}
- What if there are 100,000 attempts? or more? Does this require a large
memory footprint?
- Should serialized data be loaded on demand? Or does the analyser always take
an initial hit to load all data into memory?
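If the per-attempt footprint turns out to matter, a streaming parse is one option. The sketch below uses Jackson's streaming API rather than the org.json calls in the patch, and the "taskAttempts" field name is an assumption about the archive layout; it visits attempts one at a time instead of materializing the whole array:
{code}
import java.io.IOException;
import java.io.InputStream;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

public class AttemptStreamReader {
  // Walks the task attempt array without holding the whole document in memory.
  static long countAttempts(InputStream in) throws IOException {
    long count = 0;
    try (JsonParser parser = new JsonFactory().createParser(in)) {
      while (parser.nextToken() != null) {
        if (parser.getCurrentToken() == JsonToken.FIELD_NAME
            && "taskAttempts".equals(parser.getCurrentName())
            && parser.nextToken() == JsonToken.START_ARRAY) {
          while (parser.nextToken() == JsonToken.START_OBJECT) {
            parser.skipChildren();  // handle one attempt here, then move on
            count++;
          }
        }
      }
    }
    return count;
  }
}
{code}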
It seems like we have 2 data models. The runtime model and the analyser data
model. It is going to be hard to keep them in sync. Any suggestions on how we
can re-use a common model?
getAbsoluteSubmitTime() - is there a non-absolute timestamp elsewhere? Maybe
simplify function names?
Could you clarify why most classes are marked public?
{code}
void setTaskInfo(TaskInfo taskInfo) {
  Preconditions.checkArgument(taskInfo != null, "Provide valid taskInfo");
  this.taskInfo = taskInfo;
  taskInfo.addTaskAttemptInfo(this);
}
{code}
- this seems a bit non-intuitive. Both the parent and child are being updated
in the same function, and the same applies at the vertex level too. Is there a
reason why this bottom-up approach is taken, and why a reference to the parent
is needed, as compared to just looking up the required object when needed?
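For comparison, a top-down wiring (sketch only; the ...Sketch class names are placeholders) keeps the mutation in one place and the attempt carries no back-pointer to its parent:
{code}
import java.util.ArrayList;
import java.util.List;

import com.google.common.base.Preconditions;

// The parent owns the relationship; anything that needs the parent looks it
// up through the enclosing TaskInfo instead of holding a reference.
class TaskInfoSketch {
  private final List<TaskAttemptInfoSketch> attempts = new ArrayList<>();

  void addTaskAttemptInfo(TaskAttemptInfoSketch attempt) {
    Preconditions.checkArgument(attempt != null, "Provide valid attempt");
    attempts.add(attempt);
  }

  List<TaskAttemptInfoSketch> getTaskAttempts() {
    return attempts;
  }
}

class TaskAttemptInfoSketch {
  // no taskInfo field and no setTaskInfo(...) needed in this variant
}
{code}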
General comments:
- it would be good to try the tool with invalid data, corrupt zip files, etc.
to ensure that there are useful error messages.
- the import tool should also be run against an invalid timeline server: e.g.
point it at the RM on port 8088 so that there is a valid webserver responding
but it returns 404s, or at a server which times out, etc.
- invalid arguments should also be tested. A downloadDir named " " would be
problematic for the tool.
- pom file needs 2-space indentation, not 4.
- pom file contains too many dependencies. Not sure why there is a dependency
on the MR and HDFS jars, for example.
> Tez framework to extract/analyze data stored in ATS for specific dag
> --------------------------------------------------------------------
>
> Key: TEZ-2076
> URL: https://issues.apache.org/jira/browse/TEZ-2076
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2076.1.patch, TEZ-2076.2.patch, TEZ-2076.3.patch,
> TEZ-2076.4.patch, TEZ-2076.5.patch, TEZ-2076.6.patch, TEZ-2076.7.patch,
> TEZ-2076.8.patch, TEZ-2076.9.patch, TEZ-2076.WIP.2.patch,
> TEZ-2076.WIP.3.patch, TEZ-2076.WIP.patch
>
>
> - Users should be able to download ATS data pertaining to a DAG from Tez-UI
> (more like a zip file containing DAG/Vertex/Task/TaskAttempt info).
> - This can be plugged into an analyzer which parses the data, adds semantics,
> and provides an in-memory representation for further analysis.
> - This will enable writing different analyzer rules, which can be run on top
> of this in-memory representation to come up with an analysis of the DAG.
> - Results of these analyzer rules can be rendered in the UI (standalone webapp)
> at a later point in time.