github-actions[bot] commented on code in PR #65031:
URL: https://github.com/apache/doris/pull/65031#discussion_r3519169657
##########
be/src/exprs/aggregate/aggregate_function_histogram.h:
##########
@@ -221,7 +231,8 @@ class AggregateFunctionHistogram final
void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn&
to) const override {
const std::string bucket_json = this->data(place).get(_argument_type);
- assert_cast<ColumnString&>(to).insert_data(bucket_json.c_str(),
bucket_json.length());
+ assert_cast<ColumnString&,
TypeCheckOnRelease::DISABLE>(to).insert_data(
+ bucket_json.c_str(), bucket_json.length());
Review Comment:
Please add a physical result-column check here, or support `ColumnString64`.
The new evaluator guard calls the default `check_result_column_type()`, which
delegates to `DataTypeString::check_column()` and accepts `ColumnString64`, but
this result path still writes through a release-disabled `ColumnString&` cast.
That leaves `histogram` with the same unchecked result-layout mismatch the PR
is trying to remove.
##########
be/src/exprs/aggregate/aggregate_function_datasketches_hll_union_agg.h:
##########
@@ -204,7 +204,8 @@ class AggregateFunctionDataSketchesHllUnionAgg final
this->data(place).read(buf);
}
void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn&
to) const override {
-
assert_cast<ColumnFloat64&>(to).get_data().push_back(this->data(place).get_result());
+ assert_cast<ColumnFloat64&,
TypeCheckOnRelease::DISABLE>(to).get_data().push_back(
+ this->data(place).get_result());
Review Comment:
Please add an input-column override for the string/varchar variants, or make
`add_one()` handle `ColumnString64`. This aggregate inherits the default
logical input check, and for `String`/`Varchar` that check accepts
`ColumnString64`; the hot path later reads the input with
`PrimitiveTypeTraits<T>::ColumnType`, i.e. `ColumnString`, using
release-disabled casts. The new evaluator guard can therefore approve a
physical string layout this aggregate still cannot read safely.
##########
be/src/exprs/aggregate/aggregate_function_collect.h:
##########
@@ -460,9 +460,9 @@ class AggregateFunctionCollect final
}
void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn&
to) const override {
- auto& to_arr = assert_cast<ColumnArray&>(to);
+ auto& to_arr = assert_cast<ColumnArray&,
TypeCheckOnRelease::DISABLE>(to);
auto& to_nested_col = to_arr.get_data();
- auto* col_null = assert_cast<ColumnNullable*>(&to_nested_col);
+ auto* col_null = assert_cast<ColumnNullable*,
TypeCheckOnRelease::DISABLE>(&to_nested_col);
this->data(place).insert_result_into(col_null->get_nested_column());
Review Comment:
Please add a result-column check that unwraps the array and nullable layers
and validates the nested string storage, or support `ColumnString64`.
`collect_list`/`collect_set` return `Array(Nullable(String))`, so the default
check can accept `Array(Nullable(ColumnString64))`; after this unwrap, the
string collect helpers cast the nested column to `ColumnString` with release
checks disabled. That leaves the new result guard unable to catch the physical
layout this implementation requires.
##########
be/src/exprs/aggregate/aggregate_function_ai_agg.h:
##########
@@ -331,7 +338,8 @@ class AggregateFunctionAIAgg final
void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn&
to) const override {
std::string result = data(place)._execute_task();
DCHECK(!result.empty()) << "AI returns an empty result";
- assert_cast<ColumnString&>(to).insert_data(result.data(),
result.size());
+ assert_cast<ColumnString&,
TypeCheckOnRelease::DISABLE>(to).insert_data(result.data(),
+
result.size());
Review Comment:
Please add a physical result-column check here, or support `ColumnString64`.
`check_result_column_type()` still falls back to
`DataTypeString::check_column()`, and that returns OK for `ColumnString64`, but
this path writes through `assert_cast<ColumnString&,
TypeCheckOnRelease::DISABLE>`. A `ColumnString64` result column can therefore
pass the new evaluator guard and then be treated as `ColumnString` in release
builds.
##########
be/src/exprs/aggregate/aggregate_function_map_combinator.cpp:
##########
@@ -284,6 +285,47 @@ class AggregateFunctionMapCombinatorTyped final
offsets.push_back(value_column.size());
}
+ void check_input_columns_type(const IColumn** columns) const override {
+ const IColumn* column = columns[0];
+ if (const auto* const_column =
check_and_get_column<ColumnConst>(*column)) {
+ column = &const_column->get_data_column();
+ }
+
+ const IColumn* checked_columns[1] = {column};
+ this->check_columns_type(this->argument_types, checked_columns);
+
Review Comment:
Please also validate the physical key column for string-key map combinators,
both on input and result, or support `ColumnString64` in the key helpers. This
new check still relies on `DataTypeMap::check_column()` for keys, which
recurses to `DataTypeString::check_column()` and accepts `ColumnString64`;
later `get_key()` and `insert_non_null_key()` cast the unwrapped key storage to
`ColumnString` with release checks disabled. The value-column checks below do
not cover that key layout.
##########
be/src/exprs/aggregate/aggregate_function_array_agg.h:
##########
@@ -177,10 +179,11 @@ struct AggregateFunctionArrayAggData<T> {
}
void insert_result_into(IColumn& to) const {
- auto& to_arr = assert_cast<ColumnArray&>(to);
+ auto& to_arr = assert_cast<ColumnArray&,
TypeCheckOnRelease::DISABLE>(to);
auto& to_nested_col = to_arr.get_data();
- auto* col_null = assert_cast<ColumnNullable*>(&to_nested_col);
- auto& vec = assert_cast<ColVecType&>(col_null->get_nested_column());
+ auto* col_null = assert_cast<ColumnNullable*,
TypeCheckOnRelease::DISABLE>(&to_nested_col);
+ auto& vec = assert_cast<ColVecType&, TypeCheckOnRelease::DISABLE>(
+ col_null->get_nested_column());
Review Comment:
Please add physical validation for the nested string storage, or support
`ColumnString64`. The return type is `Array(Nullable(String))`, so the default
result check recurses through `DataTypeArray` and `DataTypeNullable` to
`DataTypeString::check_column()`, which accepts `ColumnString64`; this
implementation then casts the nested column to `ColumnString` here with release
checks disabled. The nullable input and streaming paths have the same nested
`ColumnString` assumption.
##########
be/src/exprs/aggregate/aggregate_function_map.h:
##########
@@ -120,10 +120,11 @@ struct AggregateFunctionMapAggData {
}
void insert_result_into(IColumn& to) const {
- auto& dst = assert_cast<ColumnMap&>(to);
+ auto& dst = assert_cast<ColumnMap&, TypeCheckOnRelease::DISABLE>(to);
size_t num_rows = _key_column->size();
Review Comment:
Please add a physical input check for string/varchar keys, including the
nullable-key path, or make the key read support `ColumnString64`. `map_agg_v1`
uses the default logical input check, which accepts `ColumnString64` for
`String`, but `add()` later casts either the nullable nested key or direct key
to `ColumnString` with release checks disabled. That leaves the new input guard
unable to catch the key layout this aggregate requires.
##########
be/src/exprs/aggregate/aggregate_function_linear_histogram.h:
##########
@@ -171,7 +173,7 @@ struct AggregateFunctionLinearHistogramData {
rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
doc.Accept(writer);
- auto& column = assert_cast<ColumnString&>(to);
+ auto& column = assert_cast<ColumnString&,
TypeCheckOnRelease::DISABLE>(to);
column.insert_data(buffer.GetString(), buffer.GetSize());
Review Comment:
Please add a physical result-column check here, or support `ColumnString64`.
`linear_histogram` returns `DataTypeString`, so the default result validation
accepts `ColumnString64`, but this helper still writes through
`assert_cast<ColumnString&, TypeCheckOnRelease::DISABLE>`. A `ColumnString64`
result column can pass validation and then be accessed as the wrong physical
string column.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]