andygrove commented on code in PR #958: URL: https://github.com/apache/datafusion-comet/pull/958#discussion_r1822970181

##########
spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala
##########

@@ -1707,6 +1707,29 @@ class CometExecSuite extends CometTestBase {

```scala
  test("SparkToColumnar override node name for row input") {
    withSQLConf(
      SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
      CometConf.COMET_SHUFFLE_MODE.key -> "jvm") {
      val df = spark
        .range(1000)
        .selectExpr("id as key", "id % 8 as value")
        .toDF("key", "value")
        .groupBy("key")
        .count()
      df.collect()

      val planAfter = df.queryExecution.executedPlan
      assert(planAfter.toString.startsWith("AdaptiveSparkPlan isFinalPlan=true"))
      val adaptivePlan = planAfter.asInstanceOf[AdaptiveSparkPlanExec].executedPlan
      val nodeNames = adaptivePlan.collect { case c: CometSparkToColumnarExec =>
        c.nodeName
      }
      assert(nodeNames.length == 1)
      assert(nodeNames.head == "CometSparkRowToColumnar")
```

Review Comment:

Could you also add a test that generates a plan using `CometSparkColumnarToColumnar`, so that we cover both cases? I think you could make a copy of this test that writes the DataFrame to a Parquet file and then reads it back with the following configs. This will use Spark's vectorized Parquet reader, which returns Spark columnar batches.

```scala
      SQLConf.USE_V1_SOURCE_LIST.key -> "",
      CometConf.COMET_NATIVE_SCAN_ENABLED.key -> "false",
      CometConf.COMET_CONVERT_FROM_PARQUET_ENABLED.key -> "true") {
```

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
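For reference, the companion test the reviewer asks for could look roughly like the sketch below. This is a hedged illustration, not the final test: it assumes the same `CometTestBase` helpers used elsewhere in the suite (`withSQLConf`, `withTempPath`, `spark`), and it assumes `CometSparkToColumnarExec` reports the node name `CometSparkColumnarToColumnar` when its input is already columnar (the behavior this PR is adding).

```scala
  test("SparkToColumnar override node name for columnar input") {
    withTempPath { path =>
      // Write some data to Parquet so we can read it back through
      // Spark's vectorized Parquet reader.
      spark
        .range(1000)
        .selectExpr("id as key", "id % 8 as value")
        .write
        .parquet(path.toString)

      withSQLConf(
        SQLConf.USE_V1_SOURCE_LIST.key -> "",
        CometConf.COMET_NATIVE_SCAN_ENABLED.key -> "false",
        CometConf.COMET_CONVERT_FROM_PARQUET_ENABLED.key -> "true") {
        // With the native scan disabled, Spark's vectorized reader produces
        // Spark columnar batches, so Comet should insert a
        // columnar-to-columnar conversion rather than row-to-columnar.
        val df = spark.read.parquet(path.toString)
        df.collect()

        val nodeNames = df.queryExecution.executedPlan.collect {
          case c: CometSparkToColumnarExec => c.nodeName
        }
        assert(nodeNames.nonEmpty)
        assert(nodeNames.forall(_ == "CometSparkColumnarToColumnar"))
      }
    }
  }
```

The key difference from the row-input test above is the data source: scanning Parquet through Spark's vectorized reader yields columnar input, whereas `spark.range` produces rows.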
########## spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala: ########## @@ -1707,6 +1707,29 @@ class CometExecSuite extends CometTestBase { } } + test("SparkToColumnar override node name for row input") { + withSQLConf( + SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true", + CometConf.COMET_SHUFFLE_MODE.key -> "jvm") { + val df = spark + .range(1000) + .selectExpr("id as key", "id % 8 as value") + .toDF("key", "value") + .groupBy("key") + .count() + df.collect() + + val planAfter = df.queryExecution.executedPlan + assert(planAfter.toString.startsWith("AdaptiveSparkPlan isFinalPlan=true")) + val adaptivePlan = planAfter.asInstanceOf[AdaptiveSparkPlanExec].executedPlan + val nodeNames = adaptivePlan.collect { case c: CometSparkToColumnarExec => + c.nodeName + } + assert(nodeNames.length == 1) + assert(nodeNames.head == "CometSparkRowToColumnar") Review Comment: Could you also add a test that will generate a plan that uses `CometSparkColumnarToColumnar` so that we are testing both cases? I think you could have a copy of this test that writes the dataframe to a Parquet file and then reads the Parquet file back with the following configs. This will use Spark's vectorized Parquet reader which returns Spark columns. ``` SQLConf.USE_V1_SOURCE_LIST.key -> "", CometConf.COMET_NATIVE_SCAN_ENABLED.key -> "false", CometConf.COMET_CONVERT_FROM_PARQUET_ENABLED.key -> "true") { ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org