andygrove commented on PR #1298:
URL: https://github.com/apache/datafusion-comet/pull/1298#issuecomment-2598503895
@viirya, I wonder if you could help me and @parthchandra understand why the
test in this PR is failing.
The error is:
```
Can't zip RDDs with unequal numbers of partitions: ArrayBuffer(10, 0)
java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions: ArrayBuffer(10, 0)
    at org.apache.spark.rdd.ZippedPartitionsBaseRDD.getPartitions(ZippedPartitionsRDD.scala:58)
```
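For anyone not familiar with this error: it is raised when RDDs are zipped
partition by partition and the inputs do not all have the same number of
partitions. The following is only a plain-Python model of that kind of check,
not Spark's actual code; `check_zip_partitions` is a hypothetical name used
for illustration.

```python
from typing import Sequence


def check_zip_partitions(partition_counts: Sequence[int]) -> int:
    """Return the shared partition count, or raise if the inputs disagree.

    Hypothetical stand-in for the validation that produces the error above.
    """
    if not partition_counts:
        raise ValueError("need at least one input")
    expected = partition_counts[0]
    # Every input must have exactly as many partitions as the first one.
    if any(n != expected for n in partition_counts):
        raise ValueError(
            "Can't zip RDDs with unequal numbers of partitions: "
            f"{list(partition_counts)}"
        )
    return expected
```

With inputs of 10 and 0 partitions, as in the failing test, this model raises;
with 26 and 26 it returns 26.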
From the debug logging (shown below), we can see that `CometUnionExec` always
reports its partitioning as `UnknownPartitioning(0)`, which seems to be at
least part of the issue.
Debug logging:
```
------------------
wrapped UnknownPartitioning(0)
CometScanExec num Parts = 5
class org.apache.spark.sql.comet.CometScanExec has UnknownPartitioning(5)
------------------
containsBroadcastInput = false
firstNonBroadcastPlanNumPartitions (CometScan parquet spark_catalog.default.dim_store) = 5
Setting broadcast partitions to 5
------------------
wrapped UnknownPartitioning(0)
CometScanExec num Parts = 26
class org.apache.spark.sql.comet.CometUnionExec has UnknownPartitioning(0); first child has UnknownPartitioning(26)
class org.apache.spark.sql.execution.adaptive.BroadcastQueryStageExec has UnknownPartitioning(0)
------------------
containsBroadcastInput = true
firstNonBroadcastPlanNumPartitions (CometUnion) = 0
Setting broadcast partitions to 0
------------------
class org.apache.spark.sql.comet.CometScanExec has UnknownPartitioning(26)
------------------
containsBroadcastInput = false
firstNonBroadcastPlanNumPartitions (CometScan parquet spark_catalog.default.fact_sk) = 26
Setting broadcast partitions to 26
------------------
wrapped UnknownPartitioning(0)
CometScanExec num Parts = 26
class org.apache.spark.sql.comet.CometScanExec has UnknownPartitioning(26)
------------------
containsBroadcastInput = false
firstNonBroadcastPlanNumPartitions (CometScan parquet spark_catalog.default.fact_stats) = 26
Setting broadcast partitions to 26
setNumPartitions(0)
```
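Reading the log, the broadcast partition count appears to be taken from
whatever the first non-broadcast input reports. Here is a rough plain-Python
model of that selection, assuming that reading is correct; `PlanNode` and
`first_non_broadcast_partitions` are hypothetical names, not Comet's actual
classes or methods.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a physical plan node."""
    name: str
    reported_partitions: int  # the N in UnknownPartitioning(N)
    is_broadcast: bool = False


def first_non_broadcast_partitions(children: List[PlanNode]) -> Optional[int]:
    """Return the partition count reported by the first non-broadcast child."""
    for child in children:
        if not child.is_broadcast:
            return child.reported_partitions
    return None  # every input is a broadcast


# Mirrors the second block of the log: the union reports 0 partitions.
inputs = [
    PlanNode("CometUnion", reported_partitions=0),
    PlanNode("BroadcastQueryStage", reported_partitions=0, is_broadcast=True),
]
broadcast_partitions = first_non_broadcast_partitions(inputs)
```

In this model `broadcast_partitions` ends up as 0, which matches the
"Setting broadcast partitions to 0" line in the log.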
If I modify `CometUnionExec` to report the same partitioning as its first
child, the partition counts still do not match and the error becomes:
```
Can't zip RDDs with unequal numbers of partitions: ArrayBuffer(10, 26)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]