ShreyeshArangath opened a new issue, #2029: URL: https://github.com/apache/datafusion-comet/issues/2029
### Describe the bug

I built the most recent version of Comet for some internal benchmarking, but I am running into the following issue:

```
org.apache.spark.SparkException: Parquet column cannot be converted in file hdfs://cluster/jobs/x/y/tpcds-unpartitioned/tpcds-1000/store_returns/part-00088-85e008a2-66d5-4ca4-a957-da1b024bc0ae-c000.snappy.parquet. Column: [sr_return_amt], Expected: decimal(7,2), Found: DOUBLE.
```

I also tried setting `spark.conf.set("spark.sql.parquet.enableVectorizedReader","false")`, but that didn't fix the issue (and I do not think it is the recommended path). Any help would be great! 🙂

Submission conf:

```
./tpcds/spark-tpcds-datagen/bin/run-tpcds-benchmark \
  --master=yarn \
  --conf spark.submit.deployMode=cluster \
  --conf spark.yarn.queue=<redacted> \
  --conf spark.driver.memory=4G \
  --conf spark.executor.cores=8 \
  --conf spark.cores.max=8 \
  --conf spark.executor.memory=16g \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=16g \
  --conf "spark.files=$COMET_ROOT/$COMET_JAR" \
  --conf spark.driver.extraJavaOptions='-Dspark.sql.test.master=yarn -Dspark.sql.shuffle.partitions=10' \
  --conf "spark.dynamicAllocation.enabled=false" \
  --conf "spark.executor.instances=50" \
  --conf "spark.driver.extraClassPath=./$COMET_JAR" \
  --conf "spark.executor.extraClassPath=./$COMET_JAR" \
  --conf spark.plugins=org.apache.spark.CometPlugin \
  --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
  --conf spark.comet.enabled=true \
  --conf spark.comet.cast.allowIncompatible=true \
  --conf spark.comet.exec.replaceSortMergeJoin=true \
  --conf spark.comet.exec.shuffle.enabled=true \
  --conf spark.comet.exec.shuffle.fallbackToColumnar=true \
  --conf spark.comet.exec.shuffle.compression.codec=lz4 \
  --conf spark.comet.exec.shuffle.compression.level=1 \
  --conf spark.comet.exec.shuffle.mode=auto \
  --data-location hdfs://cluster/jobs/x/y/tpcds-unpartitioned/tpcds-1000
```

### Steps to reproduce

_No response_

### Expected behavior

_No response_

### Additional context

_No response_
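
For triage across many failing tasks, the three fields in the exception text (column, expected Spark type, and the physical type found in the file) can be extracted programmatically. The following is only a sketch: the helper name `parse_parquet_mismatch` and the regular expression are assumptions derived from the single message quoted in this issue, not part of any Spark or Comet API, and other Spark versions may word the message differently.

```python
import re
from typing import NamedTuple, Optional


class ParquetMismatch(NamedTuple):
    """Fields pulled out of a 'Parquet column cannot be converted' message."""
    path: str
    column: str
    expected: str
    found: str


# Pattern inferred from the one error message quoted in this issue (assumption).
_MISMATCH_RE = re.compile(
    r"Parquet column cannot be converted in file (?P<path>\S+?)\.\s+"
    r"Column: \[(?P<column>[^\]]+)\], "
    r"Expected: (?P<expected>.+?), "
    r"Found: (?P<found>\w+)"
)


def parse_parquet_mismatch(message: str) -> Optional[ParquetMismatch]:
    """Return the mismatch fields, or None if the message has another shape."""
    match = _MISMATCH_RE.search(message)
    if match is None:
        return None
    return ParquetMismatch(
        path=match.group("path"),
        column=match.group("column"),
        expected=match.group("expected"),
        found=match.group("found"),
    )


# The message reported in this issue, copied verbatim.
REPORTED_MESSAGE = (
    "org.apache.spark.SparkException: Parquet column cannot be converted in file "
    "hdfs://cluster/jobs/x/y/tpcds-unpartitioned/tpcds-1000/store_returns/"
    "part-00088-85e008a2-66d5-4ca4-a957-da1b024bc0ae-c000.snappy.parquet. "
    "Column: [sr_return_amt], Expected: decimal(7,2), Found: DOUBLE."
)
```

Calling `parse_parquet_mismatch(REPORTED_MESSAGE)` yields the column `sr_return_amt` with expected type `decimal(7,2)` and found type `DOUBLE`, which makes it easy to group failures by column when several files are affected.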