[ https://issues.apache.org/jira/browse/SPARK-23785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412130#comment-16412130 ]
Marcelo Vanzin commented on SPARK-23785:
----------------------------------------
This is a little trickier than just the checks you have in the PR.
The check that is triggering in Hive is on the {{LauncherBackend}} side. So the
connection has somehow already been closed when a {{setState}} call happens.
That can happen if there are two calls to {{LocalSchedulerBackend.stop}}, which
can happen if someone with a launcher handle calls {{stop()}} on the handle. But
the code should be safe against that and just ignore subsequent calls.
The race you describe also exists; it's not what the exception in the Hive bug
shows, though.
So perhaps it's better to do a few different things:
- add the checks in your PR
- in LauncherBackend.BackendConnection, set "isDisconnected" before calling
super.close()
- in that same class, override the "send()" method to ignore "SocketException",
to handle the second race.
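
As a rough illustration of the three suggestions above, here is a minimal,
self-contained Java sketch. It is not the actual Spark code: {{Connection}},
{{Backend}} and the string messages are hypothetical stand-ins, and the
"checks in your PR" are assumed to be a test of the disconnected flag before
sending. Only the names {{BackendConnection}}, {{isDisconnected}},
{{setState}}, {{send()}} and {{close()}} come from the discussion.

```java
import java.io.IOException;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.List;

public class BackendSketch {

  /** Hypothetical stand-in for the base connection class. */
  static class Connection {
    private boolean closed = false;
    final List<String> sent = new ArrayList<>();

    void send(String msg) throws IOException {
      if (closed) {
        // Simulates a write on a socket that has already been closed.
        throw new SocketException("Socket closed");
      }
      sent.add(msg);
    }

    void close() throws IOException {
      closed = true;
    }
  }

  /** Sketch of the two changes proposed for the backend's connection. */
  static class BackendConnection extends Connection {
    volatile boolean isDisconnected = false;

    @Override
    void close() throws IOException {
      // Set the flag first, so callers that check it stop sending
      // before the underlying socket is actually torn down.
      isDisconnected = true;
      super.close();
    }

    @Override
    void send(String msg) throws IOException {
      try {
        super.send(msg);
      } catch (SocketException e) {
        // The socket was closed between the caller's check and the
        // write (the second race); drop the message instead of failing.
      }
    }
  }

  /** Hypothetical backend with the state check assumed from the PR. */
  static class Backend {
    final BackendConnection connection = new BackendConnection();

    void setState(String state) throws IOException {
      if (!connection.isDisconnected) {
        connection.send(state);
      }
    }

    void stop() throws IOException {
      // Safe to call more than once: later calls skip the send.
      setState("FINISHED");
      connection.close();
    }
  }
}
```

With this shape, a second {{stop()}} is a no-op on the wire, and a send that
loses the race with {{close()}} is silently dropped rather than surfacing as
an exception.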
> LauncherBackend doesn't check state of connection before setting state
> ----------------------------------------------------------------------
>
> Key: SPARK-23785
> URL: https://issues.apache.org/jira/browse/SPARK-23785
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.3.0
> Reporter: Sahil Takiar
> Priority: Major
>
> Found in HIVE-18533 while trying to integrate with the
> {{InProcessLauncher}}. {{LauncherBackend}} doesn't check the state of its
> connection to the {{LauncherServer}} before trying to run {{setState}} -
> which sends a {{SetState}} message on the connection.