[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-15 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326077#comment-16326077
 ] 

Peter Vary commented on HIVE-18214:
---

+1

Thanks for the explanation [~stakiar]!

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18214.1.patch, HIVE-18214.2.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-13 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325386#comment-16325386
 ] 

Sahil Takiar commented on HIVE-18214:
-

Yes, I basically removed the non-production code from the unit tests, so now 
the tests exercise a setup that more closely mimics what is run in production.

Are you referring to the {{TestLocalSparkCliDriver}}? Yes, that was added so 
everything is run in a single process, which simplifies debugging, but it was 
mainly geared at debugging q-tests. These changes won't affect that because it 
follows a different code path ({{LocalHiveSparkClient}} vs. 
{{RemoteHiveSparkClient}}).

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch, HIVE-18214.2.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-11 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322113#comment-16322113
 ] 

Peter Vary commented on HIVE-18214:
---

[~stakiar]: I am happy to remove non-production code from here if we have a 
good solution which can be used in case of problems. I seem to remember, that 
you mentioned somewhere that it helped debugging remote driver problems. Do we 
have a better solution for that? 

Thanks,
Peter

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch, HIVE-18214.2.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-10 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320885#comment-16320885
 ] 

Sahil Takiar commented on HIVE-18214:
-

[~aihuaxu], [~pvary] any other comments?

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch, HIVE-18214.2.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16316891#comment-16316891
 ] 

Sahil Takiar commented on HIVE-18214:
-

[~aihuaxu], [~pvary] updated - RB: https://reviews.apache.org/r/64986/

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch, HIVE-18214.2.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314463#comment-16314463
 ] 

Hive QA commented on HIVE-18214:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904843/HIVE-18214.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 11548 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoin_union_remove_1] 
(batchId=85)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_mapjoin_15]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=177)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8477/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8477/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8477/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904843 - PreCommit-HIVE-Build

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch, HIVE-18214.2.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314434#comment-16314434
 ] 

Hive QA commented on HIVE-18214:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  
9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m  
9s{color} | {color:red} spark-client: The patch generated 13 new + 17 unchanged 
- 14 fixed = 30 total (was 31) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  8m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / a6b88d9 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8477/yetus/diff-checkstyle-spark-client.txt
 |
| modules | C: spark-client U: spark-client |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8477/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch, HIVE-18214.2.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313619#comment-16313619
 ] 

Aihua Xu commented on HIVE-18214:
-

That also works for me to bring the tests similar to the production.

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-05 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312774#comment-16312774
 ] 

Peter Vary commented on HIVE-18214:
---

[~stakiar]: Running the {{RemoteDriver}} in a separate process sounds good to 
me.
Thanks,
Peter

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-04 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312482#comment-16312482
 ] 

Sahil Takiar commented on HIVE-18214:
-

[~aihuaxu] yes thats correct. It sends a shutdown message to the 
{{RemoteDriver}} asynchronously. Then it creates another {{RemoteDriver}}, 
which leads to the exception.

Yeah, we could add logic to do that, but again its not something that would 
happen in production because every {{RemoteDriver}} is spawned in a separate 
container. The {{RemoteDriver#main(String args[])}} is run in a YARN container. 
And each {{RemoteDriver}} creates a single {{SparkContext}} in its constructor.

We could just change {{TestSparkClient}} so that it always spawns the 
{{RemoteDriver}} in a separate process, I checked and it only makes the test 
take an extra 20 seconds. The code to run the {{RemoteDriver}} in the 
local-process was only ever meant for test purposes.

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-04 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312261#comment-16312261
 ] 

Aihua Xu commented on HIVE-18214:
-

[~stakiar] Try to understand the issue. So when the one test finishes, rpc is 
closing and it will close RemoteDriver to stop the SparkContext, but since it 
happens asynchronously, we don't know when it really shutdowns?

One thought: maybe we should add the logic to always make sure there is only 
JavaSparkContext instance created in one JVM. If there is one existing and we 
try to create a new one, we can shutdown the existing one and create a new one. 

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-04 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312200#comment-16312200
 ] 

Sahil Takiar commented on HIVE-18214:
-

[~pvary] thanks for taking a look.

* In production, {{RemoteDriver}} is run in a dedicated container; however, we 
have some unit tests which run it in the local process; so in production its 
not really possible to hit this issue
* I'm not a fan of exposing these methods publicly either, I can add a 
{{\@VisibleForTesting}} annotation; {{RemoteDriver}} is already marked as 
{{\@Private}}

The only other way I can think of doing this is to change the 
{{TestSparkClient}} so it runs the {{RemoteDriver}} in a dedicated process (so 
similar to what we do in production). The test will take longer to run, but we 
won't hit this issue.

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-04 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310956#comment-16310956
 ] 

Peter Vary commented on HIVE-18214:
---

Hi [~stakiar],
Thanks for looking into this.
Quick question since I am not entirely comfortable with how RemoteDriver works:
- Is it possible to run into this race condition in normal use case. For 
example changing spark configuration values and issuing commands which 
reinitializes the Driver in quick successions?

If this is only a testing issue then I would be more comfortable with a 
solution where we do not expose new methods only for this purpose. If it is an 
issue which we can have in a production environment we might want to 
encapsulate it into the RemoteDriver object.

What do you think?
Peter

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-03 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310134#comment-16310134
 ] 

Sahil Takiar commented on HIVE-18214:
-

Test failures look un-related.

[~pvary], [~aihuaxu] could you review?

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309017#comment-16309017
 ] 

Hive QA commented on HIVE-18214:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904278/HIVE-18214.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_mapjoin_15]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=177)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8406/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8406/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8406/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904278 - PreCommit-HIVE-Build

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308944#comment-16308944
 ] 

Hive QA commented on HIVE-18214:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} spark-client: The patch generated 0 new + 38 
unchanged - 1 fixed = 38 total (was 39) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  8m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| modules | C: spark-client U: spark-client |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8406/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shutdown, so when the next test starts running it creates another 
> {{JavaSparkContext}} which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)