andygrove commented on code in PR #1756:
URL: https://github.com/apache/datafusion-comet/pull/1756#discussion_r2098530916


##########
spark/src/test/scala/org/apache/comet/CometFuzzTestSuite.scala:
##########
@@ -99,6 +100,56 @@ class CometFuzzTestSuite extends CometTestBase with AdaptiveSparkPlanHelper {
     }
   }
 
+  test("select column with default value") {
+    // This test fails in Spark's vectorized Parquet reader for DECIMAL(36,18) or BINARY default values.
+    withSQLConf(SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> "false") {
+      // This test relies on two tables: 1) t1, the Parquet file generated by ParquetGenerator with random values, and
+      // 2) t2, a new table created with one column, to which we add a second column with different types and random values.
+      // We use the schema and values of t1 to simplify random value generation for the default column value in t2.
+      val df = spark.read.parquet(filename)
+      df.createOrReplaceTempView("t1")
+      for (col <- df.columns
+          .slice(1, 14)) { // All the primitive columns, based on ParquetGenerator.makeParquetFile.

Review Comment:
   This code could break if we change the Parquet generator in the future. It would probably be better to filter instead. Something like:
   ```scala
   val columns = df.schema.fields.filter(f => !isComplexType(f.dataType)).map(_.name)
   ```
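
   For illustration, here is a minimal, self-contained sketch of the schema-based filtering the comment suggests. The `isComplexType` helper, the local SparkSession setup, and the Parquet path are assumptions for the example, not code from the Comet test suite:

   ```scala
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.sql.types.{ArrayType, DataType, MapType, StructType}

   object PrimitiveColumnFilterExample {

     // Treat nested Spark SQL types as "complex"; everything else is considered a primitive leaf type.
     // This helper is an assumption for the sketch; Comet's own isComplexType may differ.
     def isComplexType(dt: DataType): Boolean = dt match {
       case _: StructType | _: ArrayType | _: MapType => true
       case _ => false
     }

     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .master("local[1]")
         .appName("primitive-column-filter")
         .getOrCreate()

       // Hypothetical input path; in the test this would be the ParquetGenerator output file.
       val df = spark.read.parquet("/tmp/fuzz-test.parquet")

       // Derive the primitive (non-nested) column names from the schema instead of relying on
       // a fixed positional slice, so changes to the generator's column layout do not break the test.
       val primitiveColumns = df.schema.fields.filter(f => !isComplexType(f.dataType)).map(_.name)

       primitiveColumns.foreach(println)
       spark.stop()
     }
   }
   ```

   Filtering on the schema keeps the test correct even if ParquetGenerator later changes the number or order of generated columns.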
########## spark/src/test/scala/org/apache/comet/CometFuzzTestSuite.scala: ########## @@ -99,6 +100,56 @@ class CometFuzzTestSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } + test("select column with default value") { + // This test fails in Spark's vectorized Parquet reader for DECIMAL(36,18) or BINARY default values. + withSQLConf(SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> "false") { + // This test relies on two tables: 1) t1 the Parquet file generated by ParquetGenerator with random values, and + // 2) t2 is a new table created with one column which we add a second column with different types and random values. + // We use the schema and values of t1 to simplify random value generation for the default column value in t2. + val df = spark.read.parquet(filename) + df.createOrReplaceTempView("t1") + for (col <- df.columns + .slice(1, 14)) { // All the primitive columns based on ParquetGenerator.makeParquetFile. Review Comment: this code could break if we change the parquet generator in the future. It would probably be best to filter instead. Something like: ```scala val columns = df.schema.fields.filter(f => !isComplexType(f.dataType)).map(_.name) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org