[jira] [Commented] (SPARK-19988) Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by Hive

2017-03-19 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932133#comment-15932133
 ] 

Xiao Li commented on SPARK-19988:
---------------------------------

I might need to reopen this soon. The bug is in the code itself, not in the 
test case.

> Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column 
> written by Hive
> 
>
>                 Key: SPARK-19988
>                 URL: https://issues.apache.org/jira/browse/SPARK-19988
>             Project: Spark
>          Issue Type: Test
>          Components: SQL, Tests
>    Affects Versions: 2.2.0
>            Reporter: Imran Rashid
>            Assignee: Xiao Li
>              Labels: flaky-test
>             Fix For: 2.2.0
>
>         Attachments: trimmed-unit-test.log
>
>
> "OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by 
> Hive" fails a lot -- right now, I see about a 50% pass rate in the last 3 
> days here:
> https://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.sql.hive.orc.OrcSourceSuite&test_name=SPARK-19459%2FSPARK-18220%3A+read+char%2Fvarchar+column+written+by+Hive
> e.g.
> https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74683/testReport/junit/org.apache.spark.sql.hive.orc/OrcSourceSuite/SPARK_19459_SPARK_18220__read_char_varchar_column_written_by_Hive/
> {noformat}
> sbt.ForkMain$ForkError: org.apache.spark.sql.execution.QueryExecutionException: FAILED: SemanticException [Error 10072]: Database does not exist: db2
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:637)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:621)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:288)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:621)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:611)
>   at org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply$mcV$sp(OrcSourceSuite.scala:160)
>   at org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply(OrcSourceSuite.scala:155)
>   at org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply(OrcSourceSuite.scala:155)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19988) Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by Hive

2017-03-19 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932043#comment-15932043
 ] 

Apache Spark commented on SPARK-19988:
--------------------------------------

User 'gatorsmile' has created a pull request for this issue:
https://github.com/apache/spark/pull/17344




[jira] [Commented] (SPARK-19988) Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by Hive

2017-03-16 Thread Kay Ousterhout (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929334#comment-15929334
 ] 

Kay Ousterhout commented on SPARK-19988:
----------------------------------------

With some help from [~joshrosen] I spent some time digging into this and found:

(1) If you look at the failures, they're all from the maven build.  In fact, 
100% of the maven builds shown there fail (and none of the SBT ones).  This is 
odd because the test is also failing on the PR builder, which uses SBT.

(2) Each maven build failure is accompanied by failures in 3 other tests; the 
group of 4 tests seems to consistently fail together.  3 of them fail with 
errors similar to this one (saying that some database does not exist).  The 
4th test, org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite: create 
temporary view using, fails with a more substantive error.  I filed 
SPARK-19990 for that issue.

(3) A commit from right around the time the tests started failing: 
https://github.com/apache/spark/commit/09829be621f0f9bb5076abb3d832925624699fa9#diff-b7094baa12601424a5d19cb930e3402fR46
added code to remove all of the databases after each test.  I wonder if that 
cleanup is somehow getting run concurrently or asynchronously in the maven 
build (after the HiveCatalogedDDLSuite fails), which would explain why the 
error in the DDLSuite causes the other tests to fail saying that a database 
can't be found.  I have extremely limited knowledge of both (a) how the maven 
tests are executed and (b) the SQL code, so it's possible these are totally 
unrelated issues.

None of this explains why the test is failing in the PR builder, where the 
failures have been isolated to this test.
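
To make the hypothesis in (3) concrete, here is a minimal toy model in Python. 
It is not Spark or Hive code: ToyCatalog, its methods, and the statement text 
are all made up for illustration, and the ordering of steps is just one 
assumed interleaving. It only shows how a cleanup step that drops every 
database, while the shared "current database" still points at one of them, 
could produce the same "Database does not exist: db2" message for an 
unrelated test.

```python
class ToyCatalog:
    """Hypothetical stand-in for a shared metastore; not a Spark or Hive API."""

    def __init__(self):
        self.databases = {"default"}
        self.current = "default"

    def create_database(self, name):
        self.databases.add(name)

    def use(self, name):
        if name not in self.databases:
            raise LookupError(f"Database does not exist: {name}")
        self.current = name

    def run(self, statement):
        # Every statement is resolved against the current database.
        if self.current not in self.databases:
            raise LookupError(f"Database does not exist: {self.current}")
        return f"ran {statement!r} in {self.current}"

    def reset(self):
        # Models the per-test cleanup: drop every database except default.
        # Assumption: the cleanup does not reset `current`.
        self.databases = {"default"}


def interleaved_run(catalog):
    """Replay one assumed ordering of two suites sharing one catalog."""
    catalog.create_database("db2")  # suite A: test setup
    catalog.use("db2")              # suite A: switch the current database
    catalog.reset()                 # suite A: cleanup after its test
    try:
        # suite B: an unrelated test that never mentions db2
        return catalog.run("CREATE TABLE t (c CHAR(10))")
    except LookupError as err:
        return str(err)


result = interleaved_run(ToyCatalog())
print(result)
```

In this model the second suite fails even though it never refers to db2, 
which matches the shape of the reported error but says nothing about whether 
this is what actually happens in the maven build.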

> Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column 
> written by Hive
> 
>
>                 Key: SPARK-19988
>                 URL: https://issues.apache.org/jira/browse/SPARK-19988
>             Project: Spark
>          Issue Type: Test
>          Components: SQL, Tests
>    Affects Versions: 2.2.0
>            Reporter: Imran Rashid
>              Labels: flaky-test
>         Attachments: trimmed-unit-test.log



[jira] [Commented] (SPARK-19988) Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by Hive

2017-03-16 Thread Herman van Hovell (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929107#comment-15929107
 ] 

Herman van Hovell commented on SPARK-19988:
-------------------------------------------

It is probably some other test changing the current database to {{db2}}. This 
is super annoying to debug, and the only solution I see is that we fix the 
database names in the test.
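
One possible reading of "fix the database names" is to give each test its own 
uniquely named database and to restore the previous current database when the 
test finishes. The sketch below shows that pattern in Python against a toy 
session. ToySession, temp_database, and the "orc_test" prefix are hypothetical 
names chosen for illustration, not Spark test utilities.

```python
import contextlib
import uuid


class ToySession:
    """Hypothetical session with a mutable current database (not a Spark API)."""

    def __init__(self):
        self.databases = {"default"}
        self.current = "default"

    def create(self, name):
        self.databases.add(name)

    def drop(self, name):
        self.databases.discard(name)

    def use(self, name):
        if name not in self.databases:
            raise LookupError(f"Database does not exist: {name}")
        self.current = name


@contextlib.contextmanager
def temp_database(session, prefix="orc_test"):
    """Create a uniquely named database, switch to it, and always switch back."""
    name = f"{prefix}_{uuid.uuid4().hex[:8]}"
    previous = session.current
    session.create(name)
    session.use(name)
    try:
        yield name
    finally:
        # Restore the previous database before dropping the temporary one,
        # so later tests never inherit a dangling current database.
        session.use(previous)
        session.drop(name)


session = ToySession()
with temp_database(session) as db:
    inside = session.current  # the test body would run here
after = session.current
```

The unique name avoids collisions with databases such as db2 created by other 
suites, and the finally block means the current database is restored even if 
the test body throws.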

> Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column 
> written by Hive
> 
>
>                 Key: SPARK-19988
>                 URL: https://issues.apache.org/jira/browse/SPARK-19988
>             Project: Spark
>          Issue Type: Test
>          Components: Tests
>    Affects Versions: 2.2.0
>            Reporter: Imran Rashid
>              Labels: flaky-test
>         Attachments: trimmed-unit-test.log