Yohahaha commented on code in PR #10697:
URL: https://github.com/apache/incubator-gluten/pull/10697#discussion_r2389631210
##########
backends-velox/src/test/scala/org/apache/gluten/execution/VeloxScanSuite.scala:
##########
@@ -207,4 +207,78 @@ class VeloxScanSuite extends VeloxWholeStageTransformerSuite {
     }
   }
 }
+
+  test("parquet index based schema evolution") {
+    withSQLConf(VeloxConfig.PARQUET_USE_COLUMN_NAMES.key -> "false") {
+      withTempDir {
+        dir =>
+          val path = dir.getCanonicalPath
+          spark
+            .range(2)
+            .selectExpr("id as a", "cast(id + 10 as string) as b")
+            .write
+            .mode("overwrite")
+            .parquet(path)
+
+          withTable("test") {
+            sql(s"""create table test (c long, d string, e float) using parquet options
+                   |(path '$path')""".stripMargin)
+            var df = sql("select c, d from test")
+            checkAnswer(df, Seq(Row(0L, "10"), Row(1L, "11")))
+
+            df = sql("select d from test")
+            checkAnswer(df, Seq(Row("10"), Row("11")))
+
+            df = sql("select c from test")
+            checkAnswer(df, Seq(Row(0L), Row(1L)))
+
+            df = sql("select d, c from test")
+            checkAnswer(df, Seq(Row("10", 0L), Row("11", 1L)))
+
+            df = sql("select c, d, e from test")
+            checkAnswer(df, Seq(Row(0L, "10", null), Row(1L, "11", null)))
+
+            df = sql("select e, d, c from test")
+            checkAnswer(df, Seq(Row(null, "10", 0L), Row(null, "11", 1L)))
+          }
+      }
+    }
+  }
+
+  test("ORC index based schema evolution") {
+    withSQLConf(VeloxConfig.ORC_USE_COLUMN_NAMES.key -> "false") {
+      withTempDir {
+        dir =>
+          val path = dir.getCanonicalPath
+          spark
+            .range(2)
+            .selectExpr("id as a", "cast(id + 10 as string) as b")
+            .write
+            .mode("overwrite")
+            .orc(path)
+
+          withTable("test") {
+            sql(s"""create table test (c long, d string, e float) using orc options
+                   |(path '$path')""".stripMargin)
+            var df = sql("select c, d from test")
+            checkAnswer(df, Seq(Row(0L, "10"), Row(1L, "11")))

Review Comment:
   > Unfortunately, I tried and it doesn't work because this compares against Vanilla Spark and Vanilla Spark doesn't support this out of the box (see the discussion with Rui above).

   Does vanilla Spark need any extra configs? Users always expect Gluten's behavior to align with Spark; if not, I think we need to rethink this feature. @rui-mo
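   For reference, a minimal sketch of the scenario under discussion, runnable in a plain spark-shell with no Gluten and no extra configs; the table name and path are illustrative placeholders, not part of the PR:

   ```scala
   // Sketch only: reproduce the test scenario against vanilla Spark.
   // The file is written with columns (a, b) while the table declares (c, d, e),
   // so the query only returns the file's values if columns are matched by position.
   val path = "/tmp/index_based_evolution" // hypothetical location

   spark
     .range(2)
     .selectExpr("id as a", "cast(id + 10 as string) as b")
     .write
     .mode("overwrite")
     .parquet(path)

   spark.sql(
     s"create table test_vanilla (c long, d string, e float) using parquet options (path '$path')")
   spark.sql("select c, d from test_vanilla").show()
   // Per the discussion above, vanilla Spark resolves parquet columns by name by default,
   // so c and d would be expected to come back as nulls here rather than as 0/"10" and 1/"11".
   ```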