andygrove commented on issue #2408:
URL: https://github.com/apache/datafusion-comet/issues/2408#issuecomment-3304827570

> The error message looks similar to [#2389](https://github.com/apache/datafusion-comet/issues/2389). Do you have the full exception stack trace?

Unfortunately this is all I see from PySpark:

```
25/09/17 17:18:05 INFO ShufflePartitionsUtil: For shuffle(4, advisory target size: 67108864, actual target size 13055850, minimum partition size: 1048576
25/09/17 17:18:05 INFO CodeGenerator: Code generated in 10.130001 ms
Traceback (most recent call last):
  File "/home/andy/git/apache/datafusion-comet/dev/benchmarks/tpcbench.py", line 120, in <module>
    main(args.benchmark, args.data, args.queries, int(args.iterations), args.output, args.name)
  File "/home/andy/git/apache/datafusion-comet/dev/benchmarks/tpcbench.py", line 83, in main
    rows = df.collect()
  File "/opt/spark-4.0.0-bin-hadoop3/python/lib/pyspark.zip/pyspark/sql/classic/dataframe.py", line 443, in collect
  File "/opt/spark-4.0.0-bin-hadoop3/python/lib/py4j-0.10.9.9-src.zip/py4j/java_gateway.py", line 1362, in __call__
  File "/opt/spark-4.0.0-bin-hadoop3/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", line 288, in deco
pyspark.errors.exceptions.captured.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions: List(9, 9, 9, 17)
25/09/17 17:18:05 INFO SparkContext: Invoking stop() from shutdown hook
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: github-h...@datafusion.apache.org