[
https://issues.apache.org/jira/browse/HIVE-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493897#comment-16493897
]
Sahil Takiar commented on HIVE-19508:
-------------------------------------
A single Spark stage can be attempted multiple times - e.g. something like
{{Stage-1_0: ... Stage-1_1 ... Stage-2_0 ...}}. The comparator needs to compare
based on both stage id and attempt id.
If you are up for a bit of refactoring, the implementation of {{getStageNum}}
isn't ideal. We shouldn't rely on string parsing to extract the stage id and
attempt id. {{SparkJobStatus#getSparkStageProgress}} should return a {{Map}}
whose key is not a string but a POJO that contains the stage id and the
attempt id.
Please add a unit test for this.
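As a rough illustration of the suggestion above, the key could look something
like the sketch below. The class name {{StageAttempt}}, its fields, and its
accessors are hypothetical and are not taken from the Hive code base or the
attached patch. The sketch only shows a key that orders numerically by stage
id first and attempt id second, so no string parsing is needed.

```java
import java.util.Comparator;

// Hypothetical map key: the name and shape are assumptions for illustration,
// not the actual Hive API.
final class StageAttempt implements Comparable<StageAttempt> {
    // Order by stage id first, then by attempt id, both compared as integers.
    private static final Comparator<StageAttempt> ORDER =
        Comparator.comparingInt(StageAttempt::getStageId)
                  .thenComparingInt(StageAttempt::getAttemptId);

    private final int stageId;
    private final int attemptId;

    StageAttempt(int stageId, int attemptId) {
        this.stageId = stageId;
        this.attemptId = attemptId;
    }

    int getStageId() {
        return stageId;
    }

    int getAttemptId() {
        return attemptId;
    }

    @Override
    public int compareTo(StageAttempt other) {
        return ORDER.compare(this, other);
    }

    // equals and hashCode are consistent with compareTo so the key behaves
    // the same way in hash-based and sorted maps.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof StageAttempt)) {
            return false;
        }
        StageAttempt that = (StageAttempt) o;
        return stageId == that.stageId && attemptId == that.attemptId;
    }

    @Override
    public int hashCode() {
        return 31 * stageId + attemptId;
    }

    // Mirrors the "Stage-<id>_<attempt>" label used in the progress output.
    @Override
    public String toString() {
        return "Stage-" + stageId + "_" + attemptId;
    }
}
```

With a key like this, a sorted map such as {{java.util.TreeMap}} would iterate
Stage-8_0 and Stage-9_0 before Stage-10_0, and Stage-1_0 before Stage-1_1,
because the ids are compared as numbers rather than as strings.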
> SparkJobMonitor getReport doesn't print stage progress in order
> ---------------------------------------------------------------
>
> Key: HIVE-19508
> URL: https://issues.apache.org/jira/browse/HIVE-19508
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Bharathkrishna Guruvayoor Murali
> Priority: Major
> Attachments: HIVE-19508.1.patch
>
>
> You can end up with a progress output like this:
> {code}
> Stage-10_0: 0/29 Stage-11_0: 0/44 Stage-12_0: 0/11
> Stage-13_0: 0/1 Stage-8_0: 258(+76)/468 Stage-9_0: 0/165
> {code}