[ https://issues.apache.org/jira/browse/TEZ-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063277#comment-17063277 ]

László Bodor edited comment on TEZ-4105 at 4/7/20, 12:16 PM:
-------------------------------------------------------------

{code}
wget -qO- "https://issues.apache.org/jira/secure/attachment/12998769/TEZ-4105.09.patch" | git apply -p0 -3

mvn clean install -DskipTests -Ptools

mvn dependency:copy-dependencies -Ptools \
  -pl ./tez-plugins/tez-history-parser \
  -pl ./tez-tools/analyzers/job-analyzer

wget https://issues.apache.org/jira/secure/attachment/12997207/dag_1583980529217_0000_18_1

wget https://issues.apache.org/jira/secure/attachment/12997208/dag_1583980529217_0000_18_1_1

# assuming that the downloaded files are in /Users/user/Downloads/
java -cp "./tez-tools/analyzers/job-analyzer/target/*:./tez-tools/analyzers/job-analyzer/target/dependency/*" \
  org.apache.tez.analyzer.plugins.AnalyzerDriver TaskAssignmentAnalyzer \
  --dagId=dag_1583980529217_0000_18 --fromProtoHistory \
  --eventFileName=/Users/user/Downloads/ --outputDir /tmp --saveResults
{code}

output under /tmp
{code}
vertex,node,numTasks,load
Map 2,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,2,100.00
Map 3,query-executor-0-8.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,832,240.29
Map 3,query-executor-1-5.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,173,49.96
Map 3,query-executor-0-6.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,402,116.10
Map 3,query-executor-1-7.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,132,38.12
Map 3,query-executor-1-9.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,147,42.45
Map 3,query-executor-0-4.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,28,8.09
Map 3,query-executor-1-3.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,184,53.14
Map 3,query-executor-0-0.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,810,233.94
Map 3,query-executor-1-1.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,151,43.61
Map 3,query-executor-1-6.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,118,34.08
Map 3,query-executor-0-2.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,427,123.32
Map 3,query-executor-0-7.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,25,7.22
Map 3,query-executor-1-4.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,170,49.10
Map 3,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,579,167.22
Map 3,query-executor-1-2.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,168,48.52
Map 3,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,1788,516.39
Map 3,query-executor-1-8.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,175,50.54
Map 3,query-executor-0-1.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,48,13.86
Map 3,query-executor-0-3.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,492,142.09
Map 3,query-executor-1-0.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,76,21.95
Map 1,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,89,100.00
Reducer 4,query-executor-0-8.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,65,88.98
Reducer 4,query-executor-1-5.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,63,86.24
Reducer 4,query-executor-0-6.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,80,109.51
Reducer 4,query-executor-1-7.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,114,156.06
Reducer 4,query-executor-1-9.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,35,47.91
Reducer 4,query-executor-0-4.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,4,5.48
Reducer 4,query-executor-1-3.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,68,93.09
Reducer 4,query-executor-0-0.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,109,149.21
Reducer 4,query-executor-1-1.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,120,164.27
Reducer 4,query-executor-1-6.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,60,82.14
Reducer 4,query-executor-0-2.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,94,128.68
Reducer 4,query-executor-0-7.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,12,16.43
Reducer 4,query-executor-1-4.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,94,128.68
Reducer 4,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,95,130.05
Reducer 4,query-executor-1-2.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,102,139.63
Reducer 4,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,66,90.35
Reducer 4,query-executor-1-8.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,66,90.35
Reducer 4,query-executor-0-1.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,5,6.84
Reducer 4,query-executor-0-3.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,81,110.88
Reducer 4,query-executor-1-0.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,128,175.22
{code}
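
For reading the load column: the numbers above look consistent with load being a node's numTasks as a percentage of the average numTasks per node within the same vertex. For Map 3, 6925 tasks over 20 nodes give an average of 346.25, and 832 / 346.25 is about 240.29%. This is inferred from the sample output only, not from the analyzer source. A minimal sketch that recomputes the column under that assumption (the function name recompute_load and the node names in the sample call are hypothetical):

```shell
#!/usr/bin/env bash
# Recompute "load" as numTasks / (average numTasks per node of the same vertex) * 100.
# Assumption: the formula is inferred from the sample output above,
# it is not taken from TaskAssignmentAnalyzer itself.
recompute_load() {
  LC_ALL=C awk -F, '
    NR == 1 { next }                          # skip the CSV header
    {
      vertex[NR] = $1; node[NR] = $2; tasks[NR] = $3
      sum[$1] += $3; count[$1]++              # per-vertex totals
      last = NR
    }
    END {
      for (i = 2; i <= last; i++) {
        avg = sum[vertex[i]] / count[vertex[i]]
        printf "%s,%s,%d,%.2f\n", vertex[i], node[i], tasks[i], tasks[i] * 100 / avg
      }
    }'
}

# Toy input with made-up node names: the average is 20 tasks per node.
printf '%s\n' 'vertex,node,numTasks,load' 'Map 1,node-a,30,0' 'Map 1,node-b,10,0' \
  | recompute_load
# prints:
# Map 1,node-a,30,150.00
# Map 1,node-b,10,50.00
```

Feeding the saved CSV from /tmp through the same function and diffing the result against the original rows is a quick way to check whether the assumption holds for a given run.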

Could you please take a look, [~rajesh.balamohan], [~gopalv]?
The patch does the following:

1. HistoryEventProtoJsonConversion: converts a HistoryEventProto into a 
JSONObject for the analyzers. (I would not recommend reviewing it in detail, 
it is long; it was basically copied from HistoryEventJsonConversion and 
adapted, see the javadoc comment on the class.)

2. ProtoHistoryParser extends SimpleHistoryParser: the new parser works almost 
exactly like the simple history logging parser, because 
HistoryEventProtoJsonConversion was written to output the same JSON as 
HistoryEventJsonConversion.

3. Added some missing serialized values to HistoryEventProtoConverter and 
HistoryEventJsonConversion.

4. Extended the parsers to accept multiple files (in my use case, the history 
events were split into 2 files) and implemented this for ProtoHistoryParser. 
A directory is also accepted; the files inside it are found by pattern (the 
file name contains the dagId).

5. Added a new analyzer, TaskAttemptResultStatisticsAnalyzer, which helped me 
a lot in getting basic insights into the outcome of task attempts.
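
To illustrate the directory handling from item 4, here is a minimal shell sketch of picking the history files for one DAG out of a directory. It assumes that "contains dagId" means a plain substring match on the file name and that only the top level of the directory is searched; the pattern actually used by the patch may differ. The function name find_dag_files is hypothetical.

```shell
#!/usr/bin/env bash
# List the regular files directly inside a directory whose names contain the DAG id.
# Assumptions: substring match on the file name, no recursion into subdirectories.
find_dag_files() {
  local dir="$1" dag_id="$2"
  find "$dir" -maxdepth 1 -type f -name "*${dag_id}*" | sort
}

# Example with the two attachment names from this issue plus an unrelated file.
demo_dir=$(mktemp -d)
touch "$demo_dir/dag_1583980529217_0000_18_1" \
      "$demo_dir/dag_1583980529217_0000_18_1_1" \
      "$demo_dir/dag_1583980529217_0000_19_1"
find_dag_files "$demo_dir" dag_1583980529217_0000_18
rm -rf "$demo_dir"
```

Note that a plain substring match on dag_1583980529217_0000_18 would also pick up files of a DAG such as dag_1583980529217_0000_180, so a stricter pattern may be preferable when a directory holds many DAGs.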

The patch can be tested with the attached protobuf history files, as shown above.


was (Author: abstractdog):
{code}
wget -qO- "https://issues.apache.org/jira/secure/attachment/12998769/TEZ-4105.09.patch" | git apply -p0 -3

mvn clean install -DskipTests -Ptools

mvn dependency:copy-dependencies -Ptools \
  -pl ./tez-plugins/tez-history-parser \
  -pl ./tez-tools/analyzers/job-analyzer

java -cp "./tez-tools/analyzers/job-analyzer/target/*:./tez-tools/analyzers/job-analyzer/target/dependency/*" \
  org.apache.tez.analyzer.plugins.AnalyzerDriver TaskAssignmentAnalyzer \
  --dagId=dag_1583980529217_0000_18 --fromProtoHistory \
  --eventFileName=/Users/lbodor/Downloads/ --outputDir /tmp --saveResults
{code}

> Tez job-analyzer tool to support proto logging history
> ------------------------------------------------------
>
>                 Key: TEZ-4105
>                 URL: https://issues.apache.org/jira/browse/TEZ-4105
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: TEZ-4105.01.patch, TEZ-4105.02.patch, TEZ-4105.03.patch, 
> TEZ-4105.04.patch, TEZ-4105.05.patch, TEZ-4105.06.patch, TEZ-4105.07.patch, 
> TEZ-4105.08.patch, TEZ-4105.09.patch, dag_1583980529217_0000_18_1, 
> dag_1583980529217_0000_18_1_1
>
>
> Currently, the analyzers in tez-tools can only work with the output of ATS 
> (zipped JSON files) and simple history logging (plain-text JSON files). It 
> would be nice to have a parser that can create the info needed by the 
> analyzers from a DAG protobuf file. To achieve this, we need at least a 
> converter that can convert HistoryEventProto instances to a format that can 
> be read by TezAnalyzerBase (plus a new parser, similar to 
> SimpleHistoryParser).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
