gemini-code-assist[bot] commented on code in PR #38871:
URL: https://github.com/apache/beam/pull/38871#discussion_r3404550075
##########
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/agg/VarianceFn.java:
##########
@@ -133,7 +147,15 @@ public Coder<VarianceAccumulator> getAccumulatorCoder(
@Override
public T extractOutput(VarianceAccumulator accumulator) {
- return decimalConverter.apply(getVariance(accumulator));
+ BigDecimal result = getVariance(accumulator);
+ if (isStddev) {
+ // Take the square root in IEEE double precision so the result matches
Spark / numpy bit for
+ // bit (both compute stddev as Math.sqrt over a double).
BigDecimal.sqrt(MATH_CTX) would round
+ // to 10 significant digits and fail the test harness's exact comparison
of standard
+ // deviation.
+ result = BigDecimal.valueOf(Math.sqrt(result.doubleValue()));
+ }
+ return decimalConverter.apply(result);
Review Comment:

If the accumulator is empty (or contains only 1 element for sample standard
deviation), `getVariance(accumulator)` returns `null`. Calling
`result.doubleValue()` on a `null` reference will throw a
`NullPointerException`.
We should add a null check for `result` before attempting to compute the
square root.
```suggestion
BigDecimal result = getVariance(accumulator);
if (isStddev && result != null) {
// Take the square root in IEEE double precision so the result matches
Spark / numpy bit for
// bit (both compute stddev as Math.sqrt over a double).
BigDecimal.sqrt(MATH_CTX) would round
// to 10 significant digits and fail the test harness's exact
comparison of standard
// deviation.
result = BigDecimal.valueOf(Math.sqrt(result.doubleValue()));
}
return decimalConverter.apply(result);
```
##########
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/agg/VarianceFn.java:
##########
@@ -76,6 +76,10 @@ public class VarianceFn<T extends Number> extends
Combine.CombineFn<T, VarianceA
private static final boolean POP = false;
private boolean isSample; // flag to determine return value should be
Variance Pop or Sample
+ // When true, extractOutput returns the square root of the variance (i.e.
standard deviation).
+ // Beam's enumerable bridge cannot translate a SQRT call layered on top of a
window VAR_SAMP, so
+ // STDDEV_SAMP / STDDEV_POP are computed end-to-end inside this combiner
instead.
+ private boolean isStddev;
Review Comment:

Since `VarianceFn` is a `CombineFn` that is serialized and distributed
across workers, its fields should be immutable and thread-safe. Marking
`isStddev` as `final` ensures proper visibility and prevents accidental
modification.
```suggestion
private final boolean isStddev;
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]