gemini-code-assist[bot] commented on code in PR #38871:
URL: https://github.com/apache/beam/pull/38871#discussion_r3404550075


##########
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/agg/VarianceFn.java:
##########
@@ -133,7 +147,15 @@ public Coder<VarianceAccumulator> getAccumulatorCoder(
 
   @Override
   public T extractOutput(VarianceAccumulator accumulator) {
-    return decimalConverter.apply(getVariance(accumulator));
+    BigDecimal result = getVariance(accumulator);
+    if (isStddev) {
+      // Take the square root in IEEE double precision so the result matches 
Spark / numpy bit for
+      // bit (both compute stddev as Math.sqrt over a double). 
BigDecimal.sqrt(MATH_CTX) would round
+      // to 10 significant digits and fail the test harness's exact comparison 
of standard
+      // deviation.
+      result = BigDecimal.valueOf(Math.sqrt(result.doubleValue()));
+    }
+    return decimalConverter.apply(result);

Review Comment:
   ![high](https://www.gstatic.com/codereviewagent/high-priority.svg)
   
   If the accumulator is empty (or contains only 1 element for sample standard 
deviation), `getVariance(accumulator)` returns `null`. Calling 
`result.doubleValue()` on a `null` reference will throw a 
`NullPointerException`.
   
   We should add a null check for `result` before attempting to compute the 
square root.
   
   ```suggestion
       BigDecimal result = getVariance(accumulator);
       if (isStddev && result != null) {
         // Take the square root in IEEE double precision so the result matches 
Spark / numpy bit for
         // bit (both compute stddev as Math.sqrt over a double). 
BigDecimal.sqrt(MATH_CTX) would round
         // to 10 significant digits and fail the test harness's exact 
comparison of standard
         // deviation.
         result = BigDecimal.valueOf(Math.sqrt(result.doubleValue()));
       }
       return decimalConverter.apply(result);
   ```



##########
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/agg/VarianceFn.java:
##########
@@ -76,6 +76,10 @@ public class VarianceFn<T extends Number> extends 
Combine.CombineFn<T, VarianceA
   private static final boolean POP = false;
 
   private boolean isSample; // flag to determine return value should be 
Variance Pop or Sample
+  // When true, extractOutput returns the square root of the variance (i.e. 
standard deviation).
+  // Beam's enumerable bridge cannot translate a SQRT call layered on top of a 
window VAR_SAMP, so
+  // STDDEV_SAMP / STDDEV_POP are computed end-to-end inside this combiner 
instead.
+  private boolean isStddev;

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   Since `VarianceFn` is a `CombineFn` that is serialized and distributed 
across workers, its fields should be immutable and thread-safe. Marking 
`isStddev` as `final` ensures proper visibility and prevents accidental 
modification.
   
   ```suggestion
     private final boolean isStddev;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to