github-actions[bot] commented on code in PR #63001:
URL: https://github.com/apache/doris/pull/63001#discussion_r3260858575
##########
be/src/core/column/column_const.cpp:
##########
@@ -108,7 +109,10 @@ void ColumnConst::get_permutation(bool /*reverse*/, size_t
/*limit*/, int /*nan_
}
void ColumnConst::replace_float_special_values() {
- data->replace_float_special_values();
+ // COW: get exclusive ownership of data before mutating
Review Comment:
This moves the live nested column out of `ColumnConst::data` before the
mutation is guaranteed to finish. If `IColumn::mutate()` has to clone and
throws, or if `replace_float_special_values()` throws while normalizing a
nested nullable/complex column, `data` remains moved-from and the `ColumnConst`
object is left invalid under the surrounding catch-and-convert paths. Please
detach from a copied `ColumnPtr` (or use a restore guard) and assign back only
after the throwing work succeeds.
##########
be/src/exec/common/variant_util.cpp:
##########
@@ -2147,10 +2146,15 @@ Status _parse_and_materialize_variant_columns(Block&
block,
for (size_t i = 0; i < variant_pos.size(); ++i) {
auto column_ref = block.get_by_position(variant_pos[i]).column;
bool is_nullable = column_ref->is_nullable();
Review Comment:
This detaches a clone from the block slot, but the non-scalar `continue`
path below drops that clone after `finalize()`. `column_ref` is a copy of
`block.get_by_position(...).column`, so the block still owns the original
column while `mutate()` can finalize a detached copy; non-scalar variants then
skip the assignment at the end and lose the finalize side effects. Please write
the finalized owner back to the block before continuing, or only
detach/finalize on the scalar-conversion path that actually replaces the slot.
This is distinct from the existing nullable-variant COW thread because it
affects the already-parsed/non-scalar early-exit path.
##########
be/src/core/column/column_variant.cpp:
##########
@@ -2101,7 +2114,24 @@ void ColumnVariant::clear_sparse_column() {
}
#endif
- serialized_sparse_column->clear();
+ serialized_sparse_column = ColumnPtr(create_binary_column_fn());
+}
+
+void ColumnVariant::ensure_binary_columns_rows() {
+ auto resize_if_empty = [this](WrappedPtr& column) {
+ const auto& const_column = static_cast<const IColumn::Ptr&>(column);
+ if (const_column->size() == num_rows) {
Review Comment:
`ensure_binary_columns_rows()` moves the live binary column member out
before `resize(num_rows)` succeeds. If detaching or resizing the
sparse/doc-value binary column throws, the corresponding `ColumnVariant` member
stays moved-from, and callers such as `finalize()` continue unwinding with a
partially corrupted variant. Please resize a detached temporary from a copied
pointer, or install a scoped restore guard before consuming the member.
##########
be/src/core/block/block.cpp:
##########
@@ -655,9 +815,26 @@ void Block::clear_column_data(int64_t column_size)
noexcept {
}
for (auto& d : data) {
if (d.column) {
- // Temporarily disable reference count check because a column
might be referenced multiple times within a block.
Review Comment:
This fast path only checks top-level exclusivity, but composite columns can
be uniquely owned at the parent while sharing child columns. For example, a
cloned `ColumnArray` can have an exclusive array wrapper with shared
data/offset children; `assume_mutable()->clear()` then clears the shared
children and corrupts the alias. The new `scoped_mutate_column()` already uses
recursive exclusivity for the same reason, so `clear_column_data()` should
either use that recursive check or clone an empty column whenever the whole
column tree is not exclusive. The vector overload below has the same pattern.
##########
be/src/core/column/column_const.h:
##########
@@ -321,7 +322,11 @@ class ColumnConst final : public COWHelper<IColumn,
ColumnConst> {
size_t deserialize_impl(const char* pos) override {
++s;
- return data->deserialize_impl(pos);
Review Comment:
`deserialize_impl()` now increments `s` and moves `data` out before nested
deserialization succeeds. A malformed input or allocation failure inside
`IColumn::mutate()`/`mutable_data->deserialize_impl(pos)` leaves the const
column with `s` advanced and/or `data` null because the write-back is skipped.
Please mutate a copied owner and update both `data` and `s` only after
successful nested deserialization, or restore them on every exception path.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]