Copilot commented on code in PR #12294:
URL: https://github.com/apache/gluten/pull/12294#discussion_r3457350250


##########
cpp-ch/local-engine/Parser/aggregate_function_parser/SimpleStatisticsFunctions.cpp:
##########
@@ -63,7 +63,14 @@ class AggregateFunctionParserStddev final : public 
AggregateFunctionParser
         return func_node;
     }
 };
+// for skewness

Review Comment:
   The registration reuses `AggregateFunctionParserStddev` for skewness. Since 
the class name suggests it’s stddev-specific, it’d help future readers to 
document (at the call site) that this is intentional and primarily provides the 
NaN→NULL conversion needed for Spark compatibility (count < 3).



##########
backends-clickhouse/src/test/scala/org/apache/gluten/execution/GlutenClickhouseCountDistinctSuite.scala:
##########
@@ -100,12 +100,17 @@ class GlutenClickhouseCountDistinctSuite extends 
GlutenClickHouseWholeStageTrans
   }
 
   test("check count distinct with agg fallback") {
-    // skewness agg is not supported, will cause fallback
     val sql = "select count(distinct(a,b)) , skewness(b) from " +
       "values (0, null,1), (0,null,1), (1, 1,1), (2, 2, 1) ,(2,2,2),(3,3,3) as 
data(a,b,c)"
-    assertThrows[UnsupportedOperationException] {
-      spark.sql(sql).show
-    }
+    val df = spark.sql(sql)
+    GlutenQueryComparisonTest.checkFallBack(df, noFallback = true)

Review Comment:
   The new skewness coverage here doesn’t exercise the key behavioral change 
described in the PR (ClickHouse `skewSamp` returning NaN when there are fewer 
than 3 non-null inputs, and Gluten converting that NaN to NULL to match Spark). 
With the current data (4 non-null `b` values), this can pass even if NaN→NULL 
handling is broken.
   
   Consider adding a small query (and a grouped variant) where each aggregation 
sees < 3 non-null values, and also rename the test since it’s no longer 
asserting fallback.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to