[ https://issues.apache.org/jira/browse/TEZ-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063277#comment-17063277 ]
László Bodor edited comment on TEZ-4105 at 4/7/20, 12:13 PM:
-------------------------------------------------------------
{code}
wget -qO- "https://issues.apache.org/jira/secure/attachment/12998769/TEZ-4105.09.patch" | git apply -p0 -3
mvn clean install -DskipTests -Ptools
mvn dependency:copy-dependencies -Ptools -pl ./tez-plugins/tez-history-parser -pl ./tez-tools/analyzers/job-analyzer
java -cp "./tez-tools/analyzers/job-analyzer/target/*:./tez-tools/analyzers/job-analyzer/target/dependency/*" \
  org.apache.tez.analyzer.plugins.AnalyzerDriver TaskAssignmentAnalyzer \
  --dagId=dag_1583980529217_0000_18 --fromProtoHistory \
  --eventFileName=/Users/lbodor/Downloads/ --outputDir /tmp --saveResults
{code}
Output under /tmp:
{code}
vertex,node,numTasks,load
Map 2,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,2,100.00
Map 3,query-executor-0-8.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,832,240.29
Map 3,query-executor-1-5.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,173,49.96
Map 3,query-executor-0-6.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,402,116.10
Map 3,query-executor-1-7.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,132,38.12
Map 3,query-executor-1-9.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,147,42.45
Map 3,query-executor-0-4.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,28,8.09
Map 3,query-executor-1-3.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,184,53.14
Map 3,query-executor-0-0.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,810,233.94
Map 3,query-executor-1-1.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,151,43.61
Map 3,query-executor-1-6.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,118,34.08
Map 3,query-executor-0-2.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,427,123.32
Map 3,query-executor-0-7.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,25,7.22
Map 3,query-executor-1-4.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,170,49.10
Map 3,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,579,167.22
Map 3,query-executor-1-2.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,168,48.52
Map 3,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,1788,516.39
Map 3,query-executor-1-8.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,175,50.54
Map 3,query-executor-0-1.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,48,13.86
Map 3,query-executor-0-3.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,492,142.09
Map 3,query-executor-1-0.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,76,21.95
Map 1,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,89,100.00
Reducer 4,query-executor-0-8.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,65,88.98
Reducer 4,query-executor-1-5.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,63,86.24
Reducer 4,query-executor-0-6.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,80,109.51
Reducer 4,query-executor-1-7.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,114,156.06
Reducer 4,query-executor-1-9.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,35,47.91
Reducer 4,query-executor-0-4.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,4,5.48
Reducer 4,query-executor-1-3.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,68,93.09
Reducer 4,query-executor-0-0.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,109,149.21
Reducer 4,query-executor-1-1.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,120,164.27
Reducer 4,query-executor-1-6.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,60,82.14
Reducer 4,query-executor-0-2.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,94,128.68
Reducer 4,query-executor-0-7.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,12,16.43
Reducer 4,query-executor-1-4.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,94,128.68
Reducer 4,query-executor-0-9.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,95,130.05
Reducer 4,query-executor-1-2.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,102,139.63
Reducer 4,query-executor-0-5.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,66,90.35
Reducer 4,query-executor-1-8.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,66,90.35
Reducer 4,query-executor-0-1.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,5,6.84
Reducer 4,query-executor-0-3.query-executor-0-service.compute-1583980296-q4zs.svc.cluster.local,81,110.88
Reducer 4,query-executor-1-0.query-executor-1-service.compute-1583980296-q4zs.svc.cluster.local,128,175.22
{code}
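A note on reading the load column: the numbers are consistent with load being numTasks expressed as a percentage of the mean numTasks per node within the same vertex. For Map 3, the 20 nodes ran 6925 tasks in total, so the mean is 346.25 and 832 / 346.25 gives 240.29. This is inferred from the output only, not taken from the analyzer source, so treat the following Python sketch as an assumption. The function name relative_load and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def relative_load(csv_text):
    """Return {(vertex, node): load} for CSV text with vertex,node,numTasks columns.

    Assumption (inferred from the output above, not from the analyzer source):
    load = numTasks as a percentage of the mean numTasks per node of the vertex.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Collect the task counts of every node, grouped by vertex.
    per_vertex = defaultdict(list)
    for row in rows:
        per_vertex[row["vertex"]].append(int(row["numTasks"]))

    loads = {}
    for row in rows:
        counts = per_vertex[row["vertex"]]
        mean = sum(counts) / len(counts)
        load = 100.0 * int(row["numTasks"]) / mean
        loads[(row["vertex"], row["node"])] = round(load, 2)
    return loads


# Hypothetical sample data; the node names are placeholders.
SAMPLE = """vertex,node,numTasks
Map 1,node-a,89
Reducer 4,node-a,30
Reducer 4,node-b,10
"""

loads = relative_load(SAMPLE)
```

Under this reading, a vertex that ran on a single node always shows 100.00 (as Map 1 and Map 2 do above), and values far above 100 point at nodes that received a disproportionate share of the vertex's tasks.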
Could you please take a look, [~rajesh.balamohan], [~gopalv]?
The patch covers the following:
1. HistoryEventProtoJsonConversion: converts HistoryEventProto into JSONObject
for analyzers (I would not recommend reviewing it in detail, as it's long; it
was basically copied from HistoryEventJsonConversion and adapted, see the
javadoc comment on the class).
2. ProtoHistoryParser extends SimpleHistoryParser: the new parser works almost
exactly the same as the simple history logging parser, because
HistoryEventProtoJsonConversion was written so that it outputs the same JSON
as HistoryEventJsonConversion.
3. Added some missing serialized values to HistoryEventProtoConverter and
HistoryEventJsonConversion.
4. Extended the parsers to accept multiple files (in my use case, history
events were split into 2 files) and implemented this for ProtoHistoryParser,
which also accepts a directory and finds the files inside it by a pattern
(containing the dagId).
5. Added a new analyzer, TaskAttemptResultStatisticsAnalyzer, which helped me a
lot with getting basic insights into the outcome of task attempts.
The patch can be tested with the attached protobuf history files, as shown above.
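To illustrate the file discovery described in point 4, here is a minimal Python sketch of the idea only; the actual implementation is the Java code in the patch and may differ. The function name find_history_files and the plain substring match on the file name are assumptions.

```python
from pathlib import Path


def find_history_files(path, dag_id):
    """Return the history files to parse for a DAG.

    If `path` is a regular file it is returned as-is; if it is a directory,
    every regular file whose name contains `dag_id` is returned, sorted by name.
    Assumption: a substring match stands in for the patch's file pattern.
    """
    p = Path(path)
    if p.is_file():
        return [p]
    return sorted(f for f in p.iterdir() if f.is_file() and dag_id in f.name)
```

With the two attachments of this issue placed in one directory, find_history_files(directory, "dag_1583980529217_0000_18") would return both dag_1583980529217_0000_18_1 and dag_1583980529217_0000_18_1_1. Note that a plain substring match would also pick up a longer id such as dag_1583980529217_0000_180, so a real pattern likely needs to be stricter.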
> Tez job-analyzer tool to support proto logging history
> ------------------------------------------------------
>
> Key: TEZ-4105
> URL: https://issues.apache.org/jira/browse/TEZ-4105
> Project: Apache Tez
> Issue Type: Bug
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4105.01.patch, TEZ-4105.02.patch, TEZ-4105.03.patch,
> TEZ-4105.04.patch, TEZ-4105.05.patch, TEZ-4105.06.patch, TEZ-4105.07.patch,
> TEZ-4105.08.patch, TEZ-4105.09.patch, dag_1583980529217_0000_18_1,
> dag_1583980529217_0000_18_1_1
>
>
> Currently, the analyzers in tez-tools can only work with the output of ATS
> (zipped JSON files) and simple history logging (plain-text JSON files). It
> would be nice to have a parser that can create the info the analyzers need
> from a DAG protobuf file. To achieve this, we need at least a converter that
> can convert HistoryProtoEvent instances to a format that can be read by
> TezAnalyzerBase (plus a new parser, similar to SimpleHistoryParser).