[
https://issues.apache.org/jira/browse/HIVE-18684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361440#comment-16361440
]
Sahil Takiar commented on HIVE-18684:
-------------------------------------
Right now, it looks like the code in {{RemoteSparkJobMonitor}} is very
poll-based. It polls the {{RemoteDriver}} for information every second and
displays it. Ideally this would be event-driven: whenever the
{{SparkClient}} receives an update from the {{RemoteDriver}}, the update
would be logged immediately. However, implementing an event-driven model
would require rewriting a lot of this code. Unless there is a more compelling reason to
implement an event-based model, we should probably just stick to the current
code. There should be a simpler workaround for the bug reported in this JIRA
anyway.
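
As a rough illustration of what such a workaround might look like, here is a
minimal, self-contained sketch. It is not the actual {{RemoteSparkJobMonitor}}
code: the class name, the {{State}} enum, the {{monitor}} method and the log
strings are all hypothetical stand-ins, and the list of states simulates what
successive one-second polls would observe. The idea is only that the loop
remembers whether the {{STARTED}} info was printed and, if a poll jumps
straight to {{SUCCEEDED}}, prints it before the completion message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Hive code base.
class PollingMonitorSketch {

    // Simplified stand-in for the job states seen by the monitor.
    enum State { QUEUED, STARTED, SUCCEEDED }

    // Each element of polledStates is the state observed by one poll.
    // Returns the log lines in the order they would be printed.
    static List<String> monitor(List<State> polledStates) {
        List<String> log = new ArrayList<>();
        boolean startedLogged = false;
        for (State state : polledStates) {
            switch (state) {
                case STARTED:
                    if (!startedLogged) {
                        log.add("started info");   // placeholder message
                        startedLogged = true;
                    }
                    break;
                case SUCCEEDED:
                    // The job may start and finish between two polls, so the
                    // STARTED state is never observed. Emit its info here.
                    if (!startedLogged) {
                        log.add("started info");
                        startedLogged = true;
                    }
                    log.add("finished info");      // placeholder message
                    return log;
                default:
                    break;                          // nothing to print yet
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // A fast job: the monitor never sees STARTED.
        System.out.println(monitor(Arrays.asList(State.QUEUED, State.SUCCEEDED)));
        // A slow job: STARTED is seen on several polls but logged once.
        System.out.println(monitor(Arrays.asList(
                State.QUEUED, State.STARTED, State.STARTED, State.SUCCEEDED)));
    }
}
```

In this sketch both the fast and the slow job produce the same two log lines
in the same order, which is the invariant the current log output implicitly
assumes.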
> Race condition in RemoteSparkJobMonitor
> ---------------------------------------
>
> Key: HIVE-18684
> URL: https://issues.apache.org/jira/browse/HIVE-18684
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
>
> There is a race condition in {{RemoteSparkJobMonitor}}. Sometimes the info in
> {{RemoteSparkJobMonitor#startMonitor.STARTED}} gets printed out, sometimes it
> doesn't. This can be easily verified by running a qtest on
> {{TestMiniSparkOnYarnCliDriver}} and counting the number of times {{Query
> Hive on Spark job}} is printed vs. the number of times {{Finished
> successfully in}} gets printed.
> The issue is that {{RemoteSparkJobMonitor}} runs once per second and checks
> the state of {{JobHandle}}. Depending on the state, it prints out some
> logging info. The content of the logs contains an implicit assumption that
> logs in the {{STARTED}} state are printed before the logs in the
> {{SUCCEEDED}} state. However, this isn't always the case. The state
> transitions are driven by how long the remote Spark job takes to run, and if
> it finishes within one second then the logs in the {{STARTED}} state are never
> printed.
> This can be confusing to users, and there is key debugging information that
> is printed in the {{STARTED}} state.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)