github-actions[bot] commented on code in PR #65031:
URL: https://github.com/apache/doris/pull/65031#discussion_r3519169657


##########
be/src/exprs/aggregate/aggregate_function_histogram.h:
##########
@@ -221,7 +231,8 @@ class AggregateFunctionHistogram final
 
     void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn& 
to) const override {
         const std::string bucket_json = this->data(place).get(_argument_type);
-        assert_cast<ColumnString&>(to).insert_data(bucket_json.c_str(), 
bucket_json.length());
+        assert_cast<ColumnString&, 
TypeCheckOnRelease::DISABLE>(to).insert_data(
+                bucket_json.c_str(), bucket_json.length());

Review Comment:
   Please add a physical result-column check here, or support `ColumnString64`. 
The new evaluator guard calls the default `check_result_column_type()`, which 
delegates to `DataTypeString::check_column()` and accepts `ColumnString64`, but 
this result path still writes through a release-disabled `ColumnString&` cast. 
That leaves `histogram` with the same unchecked result-layout mismatch the PR 
is trying to remove.



##########
be/src/exprs/aggregate/aggregate_function_datasketches_hll_union_agg.h:
##########
@@ -204,7 +204,8 @@ class AggregateFunctionDataSketchesHllUnionAgg final
         this->data(place).read(buf);
     }
     void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn& 
to) const override {
-        
assert_cast<ColumnFloat64&>(to).get_data().push_back(this->data(place).get_result());
+        assert_cast<ColumnFloat64&, 
TypeCheckOnRelease::DISABLE>(to).get_data().push_back(
+                this->data(place).get_result());

Review Comment:
   Please add an input-column override for the string/varchar variants, or make 
`add_one()` handle `ColumnString64`. This aggregate inherits the default 
logical input check, and for `String`/`Varchar` that check accepts 
`ColumnString64`; the hot path later reads the input with 
`PrimitiveTypeTraits<T>::ColumnType`, i.e. `ColumnString`, using 
release-disabled casts. The new evaluator guard can therefore approve a 
physical string layout this aggregate still cannot read safely.



##########
be/src/exprs/aggregate/aggregate_function_collect.h:
##########
@@ -460,9 +460,9 @@ class AggregateFunctionCollect final
     }
 
     void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn& 
to) const override {
-        auto& to_arr = assert_cast<ColumnArray&>(to);
+        auto& to_arr = assert_cast<ColumnArray&, 
TypeCheckOnRelease::DISABLE>(to);
         auto& to_nested_col = to_arr.get_data();
-        auto* col_null = assert_cast<ColumnNullable*>(&to_nested_col);
+        auto* col_null = assert_cast<ColumnNullable*, 
TypeCheckOnRelease::DISABLE>(&to_nested_col);
         this->data(place).insert_result_into(col_null->get_nested_column());

Review Comment:
   Please add a result-column check that unwraps the array and nullable layers 
and validates the nested string storage, or support `ColumnString64`. 
`collect_list`/`collect_set` return `Array(Nullable(String))`, so the default 
check can accept `Array(Nullable(ColumnString64))`; after this unwrap, the 
string collect helpers cast the nested column to `ColumnString` with release 
checks disabled. That leaves the new result guard unable to catch the physical 
layout this implementation requires.



##########
be/src/exprs/aggregate/aggregate_function_ai_agg.h:
##########
@@ -331,7 +338,8 @@ class AggregateFunctionAIAgg final
     void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn& 
to) const override {
         std::string result = data(place)._execute_task();
         DCHECK(!result.empty()) << "AI returns an empty result";
-        assert_cast<ColumnString&>(to).insert_data(result.data(), 
result.size());
+        assert_cast<ColumnString&, 
TypeCheckOnRelease::DISABLE>(to).insert_data(result.data(),
+                                                                               
 result.size());

Review Comment:
   Please add a physical result-column check here, or support `ColumnString64`. 
`check_result_column_type()` still falls back to 
`DataTypeString::check_column()`, and that returns OK for `ColumnString64`, but 
this path writes through `assert_cast<ColumnString&, 
TypeCheckOnRelease::DISABLE>`. A `ColumnString64` result column can therefore 
pass the new evaluator guard and then be treated as `ColumnString` in release 
builds.



##########
be/src/exprs/aggregate/aggregate_function_map_combinator.cpp:
##########
@@ -284,6 +285,47 @@ class AggregateFunctionMapCombinatorTyped final
         offsets.push_back(value_column.size());
     }
 
+    void check_input_columns_type(const IColumn** columns) const override {
+        const IColumn* column = columns[0];
+        if (const auto* const_column = 
check_and_get_column<ColumnConst>(*column)) {
+            column = &const_column->get_data_column();
+        }
+
+        const IColumn* checked_columns[1] = {column};
+        this->check_columns_type(this->argument_types, checked_columns);
+

Review Comment:
   Please also validate the physical key column for string-key map combinators, 
both on input and result, or support `ColumnString64` in the key helpers. This 
new check still relies on `DataTypeMap::check_column()` for keys, which 
recurses to `DataTypeString::check_column()` and accepts `ColumnString64`; 
later `get_key()` and `insert_non_null_key()` cast the unwrapped key storage to 
`ColumnString` with release checks disabled. The value-column checks below do 
not cover that key layout.



##########
be/src/exprs/aggregate/aggregate_function_array_agg.h:
##########
@@ -177,10 +179,11 @@ struct AggregateFunctionArrayAggData<T> {
     }
 
     void insert_result_into(IColumn& to) const {
-        auto& to_arr = assert_cast<ColumnArray&>(to);
+        auto& to_arr = assert_cast<ColumnArray&, 
TypeCheckOnRelease::DISABLE>(to);
         auto& to_nested_col = to_arr.get_data();
-        auto* col_null = assert_cast<ColumnNullable*>(&to_nested_col);
-        auto& vec = assert_cast<ColVecType&>(col_null->get_nested_column());
+        auto* col_null = assert_cast<ColumnNullable*, 
TypeCheckOnRelease::DISABLE>(&to_nested_col);
+        auto& vec = assert_cast<ColVecType&, TypeCheckOnRelease::DISABLE>(
+                col_null->get_nested_column());

Review Comment:
   Please add physical validation for the nested string storage, or support 
`ColumnString64`. The return type is `Array(Nullable(String))`, so the default 
result check recurses through `DataTypeArray` and `DataTypeNullable` to 
`DataTypeString::check_column()`, which accepts `ColumnString64`; this 
implementation then casts the nested column to `ColumnString` here with release 
checks disabled. The nullable input and streaming paths have the same nested 
`ColumnString` assumption.



##########
be/src/exprs/aggregate/aggregate_function_map.h:
##########
@@ -120,10 +120,11 @@ struct AggregateFunctionMapAggData {
     }
 
     void insert_result_into(IColumn& to) const {
-        auto& dst = assert_cast<ColumnMap&>(to);
+        auto& dst = assert_cast<ColumnMap&, TypeCheckOnRelease::DISABLE>(to);
         size_t num_rows = _key_column->size();

Review Comment:
   Please add a physical input check for string/varchar keys, including the 
nullable-key path, or make the key read support `ColumnString64`. `map_agg_v1` 
uses the default logical input check, which accepts `ColumnString64` for 
`String`, but `add()` later casts either the nullable nested key or direct key 
to `ColumnString` with release checks disabled. That leaves the new input guard 
unable to catch the key layout this aggregate requires.



##########
be/src/exprs/aggregate/aggregate_function_linear_histogram.h:
##########
@@ -171,7 +173,7 @@ struct AggregateFunctionLinearHistogramData {
         rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
         doc.Accept(writer);
 
-        auto& column = assert_cast<ColumnString&>(to);
+        auto& column = assert_cast<ColumnString&, 
TypeCheckOnRelease::DISABLE>(to);
         column.insert_data(buffer.GetString(), buffer.GetSize());

Review Comment:
   Please add a physical result-column check here, or support `ColumnString64`. 
`linear_histogram` returns `DataTypeString`, so the default result validation 
accepts `ColumnString64`, but this helper still writes through 
`assert_cast<ColumnString&, TypeCheckOnRelease::DISABLE>`. A `ColumnString64` 
result column can pass validation and then be accessed as the wrong physical 
string column.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to