morrySnow commented on code in PR #64820: URL: https://github.com/apache/doris/pull/64820#discussion_r3519122664
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/Ipv4StringToNumOrDefault.java:
##########
@@ -36,7 +37,7 @@
  * scalar function ipv4_string_to_num_or_default
  */
 public class Ipv4StringToNumOrDefault extends ScalarFunction

Review Comment:
   **Redundant `NullToNonNullFunction`:** `Ipv4StringToNumOrDefault` already implements `AlwaysNotNullable` and has input slots, so the `canConvertNullToNonNull()` fallback branch (`e instanceof AlwaysNotNullable && !e.getInputSlots().isEmpty()`) already catches this class. Adding `NullToNonNullFunction` here is decorative: it provides no additional filtering. Consider either: (1) dropping `NullToNonNullFunction` from all `AlwaysNotNullable`-with-input classes and relying on the fallback alone, or (2) consistently marking ALL `AlwaysNotNullable`-with-input classes (including `IsTrue`, `IsFalse`, `NonNullable`, `Array`). The current selective marking creates confusion about which classes need the marker.

##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/NotNullOrEmpty.java:
##########
@@ -36,7 +37,7 @@
  * ScalarFunction 'not_null_or_empty'. This class is generated by GenerateFunction.
  */
 public class NotNullOrEmpty extends ScalarFunction

Review Comment:
   Same redundancy as the `OrDefault` classes: `NotNullOrEmpty` implements both `AlwaysNotNullable` and `NullToNonNullFunction`, but the `canConvertNullToNonNull()` fallback already covers it. If the intent is documentation, consider using Javadoc instead of a marker interface.

##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/NullToNonNullFunction.java:
##########
@@ -0,0 +1,65 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.trees.expressions;
+
+import org.apache.doris.nereids.trees.expressions.functions.AlwaysNotNullable;
+import org.apache.doris.nereids.trees.expressions.functions.agg.AggregateFunction;
+
+/**
+ * Marker interface for expressions that can convert NULL input into a non-NULL output.
+ *
+ * For example: Coalesce(NULL, 2) → 2, Nvl(NULL, 0) → 0, NullOrEmpty(NULL) → true.
+ *
+ * This is significant for outer-join push-down safety: when an aggregate function contains
+ * a NullToNonNull expression wrapping a column from the nullable side of an outer join,
+ * the aggregation must NOT be pushed down. Null-extended rows (produced by the join for
+ * unmatched rows) have NULL for all nullable-side columns. The NullToNonNull expression
+ * would convert those NULLs to non-NULL values, and the pre-aggregation would miss those
+ * contributions because null-extended rows do not exist in the base table.
+ *
+ * <p>Note: {@link AlwaysNotNullable} expressions with input slots (e.g. Array, JsonArray,
+ * JsonObject, CreateStruct, CreateMap) are also blocked from being pushed to the nullable
+ * side of outer joins via a separate check in {@link #canConvertNullToNonNull(Expression)}.
+ */
+public interface NullToNonNullFunction {
+
+    /**
+     * Check whether an expression can convert NULL input to non-NULL output.

Review Comment:
   **Fragile `AggregateFunction` exclusion:** The `if (e instanceof AggregateFunction) return false` guard exists because `anyMatch` also tests the aggregate function node itself (e.g., `Count extends NotNullableAggregateFunction extends AggregateFunction implements AlwaysNotNullable`), which would incorrectly match. This is a band-aid for the wrong traversal depth. A cleaner fix would be to change the call site so that only the arguments are checked, e.g. `aggregateFunc.children().stream().anyMatch(child -> child.anyMatch(e -> canConvertNullToNonNull((Expression) e)))` instead of `aggregateFunc.anyMatch(...)`, eliminating the need for type-based exclusion entirely. This matters because: (a) if a new `AggregateFunction` subclass also legitimately implements `NullToNonNullFunction` in the future, it would be silently skipped; (b) the exclusion hardcodes knowledge about which expression types are "containers" into the utility method.

##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/PushDownAggregation.java:
##########
@@ -265,9 +266,7 @@ public Plan visitLogicalAggregate(LogicalAggregate<? extends Plan> agg, JobConte
         }
         LogicalAggregate<Plan> eagerAgg = agg.withAggOutputChild(newOutputExpressions, child);

Review Comment:
   **`normalizeAgg` removal and `normalized` flag propagation:** The eager aggregate `eagerAgg = agg.withAggOutputChild(newOutputExpressions, child)` inherits the `normalized` flag from `agg` (typically `true` at this point in the pipeline). The global `NormalizeAggregate` rule at `NormalizeAggregate.java:124` only fires `whenNot(LogicalAggregate::isNormalized)`, so it will **not** re-normalize this aggregate. The new aggregate has different output expressions and a different child. If any of these require normalization (e.g., complex expressions in aggregate function arguments that should be projected out), they will remain unnormalized. Please verify that the new output expressions are always simple enough to not need normalization, or add a re-normalization step specifically for the newly constructed aggregate.

##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/NullOrEmpty.java:
##########
@@ -36,7 +37,7 @@
  * ScalarFunction 'null_or_empty'. This class is generated by GenerateFunction.
  */
 public class NullOrEmpty extends ScalarFunction

Review Comment:
   Same redundancy: `NullOrEmpty` implements `AlwaysNotNullable`, so the second branch of `canConvertNullToNonNull()` catches it. The `NullToNonNullFunction` marker adds no behavioral difference here.

##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/NullToNonNullFunction.java:
##########
@@ -0,0 +1,65 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.trees.expressions;
+
+import org.apache.doris.nereids.trees.expressions.functions.AlwaysNotNullable;
+import org.apache.doris.nereids.trees.expressions.functions.agg.AggregateFunction;
+
+/**
+ * Marker interface for expressions that can convert NULL input into a non-NULL output.
+ *
+ * For example: Coalesce(NULL, 2) → 2, Nvl(NULL, 0) → 0, NullOrEmpty(NULL) → true.
+ *
+ * This is significant for outer-join push-down safety: when an aggregate function contains
+ * a NullToNonNull expression wrapping a column from the nullable side of an outer join,
+ * the aggregation must NOT be pushed down. Null-extended rows (produced by the join for
+ * unmatched rows) have NULL for all nullable-side columns. The NullToNonNull expression
+ * would convert those NULLs to non-NULL values, and the pre-aggregation would miss those
+ * contributions because null-extended rows do not exist in the base table.
+ *
+ * <p>Note: {@link AlwaysNotNullable} expressions with input slots (e.g. Array, JsonArray,
+ * JsonObject, CreateStruct, CreateMap) are also blocked from being pushed to the nullable
+ * side of outer joins via a separate check in {@link #canConvertNullToNonNull(Expression)}.
+ */
+public interface NullToNonNullFunction {
+
+    /**
+     * Check whether an expression can convert NULL input to non-NULL output.
+     * This covers both {@link NullToNonNullFunction} (e.g. Coalesce, Nvl, NullOrEmpty)
+     * and {@link AlwaysNotNullable} expressions with input slots (e.g. Array, JsonArray,

Review Comment:
   **Missing `IsTrue` and `IsFalse` from the set of marked classes:** `IsNull` is marked `NullToNonNullFunction` in this PR, but `IsTrue` and `IsFalse` are not, although they also implement `AlwaysNotNullable` and convert NULL input to non-NULL output (`NULL IS TRUE → FALSE`, `NULL IS FALSE → FALSE`). All three have identical null-to-non-null semantics. While the `AlwaysNotNullable` fallback catches them, the inconsistency with `IsNull` being explicitly marked suggests the selection is arbitrary. Also, `NonNullable` (Javadoc: "change nullable input col to non_nullable col") is the canonical null-to-non-null function but is not marked `NullToNonNullFunction`.
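To illustrate the traversal-depth point in the `AggregateFunction` exclusion comment, here is a minimal sketch on a toy expression tree. Everything in it is hypothetical: `Expr`, `Slot`, and the simplified `Count` and `NullOrEmpty` are stand-ins rather than the Doris classes, and `convertsNull` models only the `AlwaysNotNullable` fallback branch, with no `AggregateFunction` guard. It assumes `anyMatch` tests the node it is called on before descending, which is what makes the aggregate itself match.

```java
import java.util.List;
import java.util.function.Predicate;

/** Toy model (hypothetical types, not the Doris classes) of the two traversal depths. */
public class TraversalDepthSketch {

    /** Stand-in for the marker interface of the same name. */
    interface AlwaysNotNullable {}

    /** Minimal expression node; anyMatch tests the node itself, then its descendants. */
    static class Expr {
        final List<Expr> children;

        Expr(Expr... children) {
            this.children = List.of(children);
        }

        boolean anyMatch(Predicate<Expr> predicate) {
            if (predicate.test(this)) {
                return true;
            }
            for (Expr child : children) {
                if (child.anyMatch(predicate)) {
                    return true;
                }
            }
            return false;
        }

        /** True if this subtree references at least one column slot. */
        boolean hasInputSlots() {
            return anyMatch(e -> e instanceof Slot);
        }
    }

    static class Slot extends Expr {}

    static class AggregateFunction extends Expr {
        AggregateFunction(Expr... args) {
            super(args);
        }
    }

    /** Like the real Count, this aggregate is itself AlwaysNotNullable. */
    static class Count extends AggregateFunction implements AlwaysNotNullable {
        Count(Expr... args) {
            super(args);
        }
    }

    static class NullOrEmpty extends Expr implements AlwaysNotNullable {
        NullOrEmpty(Expr arg) {
            super(arg);
        }
    }

    /** Fallback branch only; deliberately has no AggregateFunction exclusion. */
    static boolean convertsNull(Expr e) {
        return e instanceof AlwaysNotNullable && e.hasInputSlots();
    }

    /** Starts at the aggregate node, so the aggregate itself is tested. */
    static boolean checkIncludingRoot(AggregateFunction agg) {
        return agg.anyMatch(TraversalDepthSketch::convertsNull);
    }

    /** Starts at each argument, so only the aggregate's inputs are tested. */
    static boolean checkArgumentsOnly(AggregateFunction agg) {
        return agg.children.stream()
                .anyMatch(arg -> arg.anyMatch(TraversalDepthSketch::convertsNull));
    }

    public static void main(String[] args) {
        AggregateFunction plain = new Count(new Slot());
        AggregateFunction wrapped = new Count(new NullOrEmpty(new Slot()));

        System.out.println(checkIncludingRoot(plain));   // true: Count matches itself
        System.out.println(checkArgumentsOnly(plain));   // false
        System.out.println(checkArgumentsOnly(wrapped)); // true
    }
}
```

With the check started from the arguments, the plain `Count` over a slot is no longer flagged, while the wrapped argument still is, so the predicate needs no knowledge of `AggregateFunction` at all.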
