GitHub user JoshRosen commented on the issue:
https://github.com/apache/spark/pull/14733
This looks like a legitimate test failure:
```
[info] - A cached table preserves the partitioning and ordering of its cached SparkPlan *** FAILED *** (1 second, 305 milliseconds)
[info] Exception thrown while executing query:
[info] == Parsed Logical Plan ==
[info] 'Project [*]
[info] +- 'Join Inner, ('t1.key = 't2.a)
[info]    :- 'UnresolvedRelation `t1`, t1
[info]    +- 'UnresolvedRelation `t2`, t2
[info]
[info] == Analyzed Logical Plan ==
[info] key: int, value: string, a: int, b: int
[info] Project [key#17868, value#17869, a#19094, b#19095]
[info] +- Join Inner, (key#17868 = a#19094)
[info]    :- SubqueryAlias t1, `t1`
[info]    :  +- RepartitionByExpression [key#17868], 5
[info]    :     +- SerializeFromObject [assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true], top level non-flat input object).key AS key#17868, staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true], top level non-flat input object).value, true) AS value#17869]
[info]    :        +- ExternalRDD [obj#17867]
[info]    +- SubqueryAlias t2, `t2`
[info]       +- RepartitionByExpression [a#19094], 5
[info]          +- SerializeFromObject [assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true], top level non-flat input object).a AS a#19094, assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true], top level non-flat input object).b AS b#19095]
[info]             +- ExternalRDD [obj#19093]
[info]
[info] == Optimized Logical Plan ==
[info] Join Inner, (key#17868 = a#19094)
[info] :- Filter isnotnull(key#17868)
[info] :  +- InMemoryRelation [key#17868, value#17869], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas), t1
[info] :     +- Exchange hashpartitioning(key#17868, 5)
[info] :        +- *SerializeFromObject [assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true], top level non-flat input object).key AS key#17868, staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true], top level non-flat input object).value, true) AS value#17869]
[info] :           +- Scan ExternalRDDScan[obj#17867]
[info] +- Filter isnotnull(a#19094)
[info]    +- InMemoryRelation [a#19094, b#19095], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas), t2
[info]       +- Exchange hashpartitioning(a#19094, 5)
[info]          +- *SerializeFromObject [assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true], top level non-flat input object).a AS a#19094, assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true], top level non-flat input object).b AS b#19095]
[info]             +- Scan ExternalRDDScan[obj#19093]
[info]
[info] == Physical Plan ==
[info] *SortMergeJoin [key#17868], [a#19094], Inner
[info] :- *Sort [key#17868 ASC], false, 0
[info] :  +- *Filter isnotnull(key#17868)
[info] :     +- InMemoryTableScan [key#17868, value#17869], [isnotnull(key#17868)]
[info] :        +- InMemoryRelation [key#17868, value#17869], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas), t1
[info] :           +- Exchange hashpartitioning(key#17868, 5)
[info] :              +- *SerializeFromObject [assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true], top level non-flat input object).key AS key#17868, staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true], top level non-flat input object).value, true) AS value#17869]
[info] :                 +- Scan ExternalRDDScan[obj#17867]
[info] +- *Sort [a#19094 ASC], false, 0
[info]    +- *Filter isnotnull(a#19094)
[info]       +- InMemoryTableScan [a#19094, b#19095], [isnotnull(a#19094)]
[info]          +- InMemoryRelation [a#19094, b#19095], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas), t2
[info]             +- Exchange hashpartitioning(a#19094, 5)
[info]                +- *SerializeFromObject [assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true], top level non-flat input object).a AS a#19094, assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true], top level non-flat input object).b AS b#19095]
[info]                   +- Scan ExternalRDDScan[obj#19093]
[info] == Exception ==
[info] java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions: List(5, 3)
[info] java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions: List(5, 3)
[info] 	at org.apache.spark.rdd.ZippedPartitionsBaseRDD.getPartitions(ZippedPartitionsRDD.scala:57)
[info] 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
[info] 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
[info] 	at scala.Option.getOrElse(Option.scala:121)
[info] 	at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
[info] 	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
[info] 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
[info] 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
[info] 	at scala.Option.getOrElse(Option.scala:121)
[info] 	at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
[info] 	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1924)
[info] 	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:912)
[info] 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
[info] 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
[info] 	at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
[info] 	at org.apache.spark.rdd.RDD.collect(RDD.scala:911)
[info] 	at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:276)
[info] 	at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2226)
[info] 	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
[info] 	at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2576)
[info] 	at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2225)
[info] 	at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2230)
[info] 	at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2230)
[info] 	at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2589)
[info] 	at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2230)
[info] 	at org.apache.spark.sql.Dataset.collect(Dataset.scala:2206)
[info] 	at org.apache.spark.sql.QueryTest$.checkAnswer(QueryTest.scala:389)
[info] 	at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:175)
[info] 	at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:186)
[info] 	at org.apache.spark.sql.CachedTableSuite$$anonfun$25$$anonfun$apply$mcV$sp$10$$anonfun$apply$mcVI$sp$1.apply$mcV$sp(CachedTableSuite.scala:424)
[info] 	at org.apache.spark.sql.test.SQLTestUtils$class.withTempView(SQLTestUtils.scala:155)
```
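
For anyone skimming the trace: the `Can't zip RDDs with unequal numbers of partitions` error comes from `ZippedPartitionsBaseRDD`, which sort-merge join relies on (via `zipPartitions`) to pair up the two sorted sides partition-by-partition; here the two sides ended up with 5 and 3 physical partitions. Below is a minimal RDD-level sketch of that constraint, not the failing test itself — the object name, `local[2]` master, and sample data are illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Sketch of the RDD-level rule behind the exception above: zipPartitions
// requires both RDDs to expose the same number of partitions, and the check
// fires lazily when partitions are first computed (e.g. at collect()).
object ZipPartitionsMismatch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("zip-partitions-mismatch")
      .getOrCreate()
    val sc = spark.sparkContext

    val left  = sc.parallelize(1 to 10, numSlices = 5) // 5 partitions
    val right = sc.parallelize(1 to 10, numSlices = 3) // 3 partitions

    // Throws java.lang.IllegalArgumentException:
    //   Can't zip RDDs with unequal numbers of partitions: List(5, 3)
    left.zipPartitions(right) { (a, b) => a.zip(b) }.collect()

    spark.stop()
  }
}
```

Since both sides of the join are planned as `hashpartitioning(..., 5)`, a `List(5, 3)` mismatch at execution time suggests one side's cached plan isn't actually producing the partitioning the physical plan claims, which would make this a real problem with the change rather than test flakiness.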