[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312482#comment-16312482
 ] 

Sahil Takiar commented on HIVE-18214:
-------------------------------------

[~aihuaxu] yes, that's correct. It sends a shutdown message to the 
{{RemoteDriver}} asynchronously. Then it creates another {{RemoteDriver}}, 
which leads to the exception.

Yeah, we could add logic to do that, but again it's not something that would 
happen in production because every {{RemoteDriver}} is spawned in a separate 
container. The {{RemoteDriver#main(String args[])}} is run in a YARN container. 
And each {{RemoteDriver}} creates a single {{SparkContext}} in its constructor.
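
To make the race concrete, here is a minimal, self-contained sketch. It does not use Hive or Spark code: {{FakeContext}}, {{stopAsync}} and the latch-based "gate" are hypothetical stand-ins for a resource that allows only one live instance per JVM and is shut down asynchronously. It only illustrates why creating a second instance before the asynchronous shutdown has finished fails, and why waiting for the shutdown to complete avoids the failure.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class SingleContextRace {
    // Hypothetical stand-in for a resource limited to one live instance per JVM.
    static final class FakeContext {
        private static final AtomicBoolean ACTIVE = new AtomicBoolean(false);

        FakeContext() {
            if (!ACTIVE.compareAndSet(false, true)) {
                throw new IllegalStateException(
                    "Only one FakeContext may be running in this JVM");
            }
        }

        // Asynchronous shutdown: returns immediately, and the per-JVM slot is
        // released only after 'gate' has been opened.
        CompletableFuture<Void> stopAsync(CountDownLatch gate) {
            return CompletableFuture.runAsync(() -> {
                try {
                    gate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                ACTIVE.set(false);
            });
        }
    }

    // Models the flaky case: the next instance is created while the previous
    // one is still shutting down. Returns true if that creation fails.
    static boolean secondCreateFailsWithoutWaiting() {
        CountDownLatch gate = new CountDownLatch(1);
        FakeContext first = new FakeContext();
        CompletableFuture<Void> stopped = first.stopAsync(gate);
        boolean failed = false;
        try {
            new FakeContext(); // shutdown still pending, so this throws
        } catch (IllegalStateException e) {
            failed = true;
        }
        gate.countDown(); // let the pending shutdown finish
        stopped.join();
        return failed;
    }

    // Models one possible fix: block until shutdown has completed before
    // creating the next instance. Returns true if the creation succeeds.
    static boolean secondCreateSucceedsAfterWaiting() {
        FakeContext first = new FakeContext();
        first.stopAsync(new CountDownLatch(0)).join(); // wait for shutdown
        try {
            FakeContext second = new FakeContext();
            second.stopAsync(new CountDownLatch(0)).join(); // clean up
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("fails without waiting: " + secondCreateFailsWithoutWaiting());
        System.out.println("succeeds after waiting: " + secondCreateSucceedsAfterWaiting());
    }
}
```

The latch makes the ordering deterministic here; in the real test the outcome depends on timing, which is why the failure is intermittent.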

We could just change {{TestSparkClient}} so that it always spawns the 
{{RemoteDriver}} in a separate process. I checked, and it only makes the test 
take an extra 20 seconds. The code to run the {{RemoteDriver}} in the 
local process was only ever meant for test purposes.
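
As a rough illustration of what "a separate process" buys, the sketch below launches a main class in a child JVM with the standard {{ProcessBuilder}} API, so each run gets its own JVM and the one-context-per-JVM limit cannot be hit across runs. This is not Hive's actual launch code: {{ChildJvmLauncher}}, {{buildCommand}} and {{runAndWait}} are hypothetical names, and reusing the parent's classpath is an assumption made to keep the example short.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class ChildJvmLauncher {
    // Builds the command line for a child JVM that reuses the parent's
    // java binary and classpath (an assumption for this sketch).
    static List<String> buildCommand(String mainClass, List<String> args) {
        String javaBin = System.getProperty("java.home")
            + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));
        cmd.add(mainClass);
        cmd.addAll(args);
        return cmd;
    }

    // Starts the child JVM, forwards its output to the parent's console,
    // and blocks until it exits. Returns the child's exit code.
    static int runAndWait(String mainClass, List<String> args)
            throws IOException, InterruptedException {
        Process child = new ProcessBuilder(buildCommand(mainClass, args))
            .inheritIO()
            .start();
        return child.waitFor();
    }
}
```

Because the child JVM exits when its main method returns, nothing it created can collide with the next run; the cost is JVM startup time for each test.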

> Flaky test: TestSparkClient
> ---------------------------
>
>                 Key: HIVE-18214
>                 URL: https://issues.apache.org/jira/browse/HIVE-18214
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shut down, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
