github-actions[bot] commented on code in PR #64010:
URL: https://github.com/apache/doris/pull/64010#discussion_r3377703990


##########
be/src/storage/segment/segment_iterator.cpp:
##########
@@ -586,6 +470,48 @@ Status SegmentIterator::_init_impl(const StorageReadOptions& opts) {
     return Status::OK();
 }
 
+void SegmentIterator::_init_schema_block_id_map() {
+    _schema_block_id_map.assign(_schema->columns().size(), -1);
+    for (int i = 0; i < _schema->num_column_ids(); i++) {
+        auto cid = _schema->column_id(i);
+        _schema_block_id_map[cid] = i;
+    }
+}
+
+void SegmentIterator::_init_project_schema() {
+    _init_schema_block_id_map();
+    if (_opts.project_columns == nullptr || *_opts.project_columns == _schema->column_ids()) {
+        _project_schema = _schema;
+    } else {
+        _project_schema =
+                std::make_shared<Schema>(_opts.tablet_schema->columns(), *_opts.project_columns);
+    }
+}
+
+Block* SegmentIterator::_build_project_block(Block* block, Block* project_block) {
+    DORIS_CHECK(_project_schema != nullptr);
+    if (_project_schema == _schema) {
+        return block;
+    }
+
+    project_block->clear();
+    const auto& project_column_ids = _project_schema->column_ids();
+    for (size_t i = 0; i < project_column_ids.size(); ++i) {
+        auto cid = project_column_ids[i];
+        auto& output_column = block->get_by_position(i);
+        auto type = output_column.type;

Review Comment:
   This still assumes that a column's ordinal in the projected schema matches 
its ordinal in the current block. That does not hold for non-direct aggregation 
reads: `BetaRowsetReader` now sets `project_columns` from 
`origin_return_columns`, but the segment block is built from the expanded 
`return_columns` (all key columns first, then the requested value columns). For 
an AGG_KEYS table with keys `k1,k2` and a query projecting only the value column 
`v1`, `project_column_ids[0] == v1` while `block->get_by_position(0)` is `k1`, 
so a pushed-down expression on `v1` would be evaluated against `k1`'s column 
data and type. Please map each project cid to its position in the actual block 
layout (the expanded reader/output schema, excluding delete-only extras) 
instead of using `i` directly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

