github-actions[bot] commented on code in PR #63522: URL: https://github.com/apache/doris/pull/63522#discussion_r3295847695
########## regression-test/suites/variant_p0/test_variant_timestamptz_sparse.groovy: ########## @@ -0,0 +1,127 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +// Reproduces DORIS-25915: +// When VARIANT has typed paths that exceed variant_max_subcolumns_count and +// variant_enable_typed_paths_to_sparse=true, timestamptz values that fall to +// the sparse path are read back as only the timezone suffix (e.g. "+00:00") +// instead of the full timestamp. +suite("test_variant_timestamptz_sparse", "p0"){ + sql " set time_zone = '+08:00' " + + def tableName = "test_variant_timestamptz_sparse_repro" Review Comment: This new regression test does not follow the Doris regression-test rules in `AGENTS.md`: simple table names should be hardcoded instead of `def tableName`, deterministic checks should use `qt_*`/`.out` instead of manual `assertTrue`, and tests should not drop tables at the end (drop before use only, to preserve the environment for debugging). Please rewrite this to hardcode `test_variant_timestamptz_sparse_repro`, express the checked values through ordered `qt_*` output, and remove the final drop. ########## be/src/core/data_type_serde/data_type_timestamptz_serde.cpp: ########## @@ -250,4 +250,21 @@ std::string DataTypeTimeStampTzSerDe::to_olap_string(const Field& field) const { return CastToString::from_timestamptz(field.get<TYPE_TIMESTAMPTZ>(), 6); } +void DataTypeTimeStampTzSerDe::write_one_cell_to_binary(const IColumn& src_column, + ColumnString::Chars& chars, + int64_t row_num) const { + const auto type = static_cast<uint8_t>(FieldType::OLAP_FIELD_TYPE_TIMESTAMPTZ); + const auto& data_ref = assert_cast<const ColumnTimeStampTz&>(src_column).get_data_at(row_num); + const auto sc = static_cast<uint8_t>(_scale); + + const size_t old_size = chars.size(); + const size_t new_size = old_size + sizeof(uint8_t) + sizeof(uint8_t) + data_ref.size; Review Comment: This fixes the encoding for newly written sparse TIMESTAMPTZ values, but it also changes a persisted sparse-column value layout from the old `[type][8-byte value]` bytes to `[type][scale][8-byte value]` without any reader compatibility for rowsets already written by the old code. The variant sparse path stores these bytes in `ColumnString` (`ColumnVariant::serialize_to_binary_column` / `SparseColumnMergeIterator::_serialize_nullable_column_to_sparse`), and `ColumnVariant::deserialize_from_binary_column` later calls `DataTypeSerDe::deserialize_binary_to_field` followed by `CHECK_EQ(end - start_data, data_ref.size)`. For an existing 9-byte TIMESTAMPTZ value, the current reader consumes a scale byte plus 8 value bytes (10 bytes total), so after this change old persisted sparse values can read past the StringRef and/or trip the size check during upgrade. Please add a compatibility read path for the legacy 9-byte TIMESTAMPTZ encoding, and add a test that deserializes those old bytes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
