github-actions[bot] commented on code in PR #61518:
URL: https://github.com/apache/doris/pull/61518#discussion_r3408185640


##########
be/src/io/cache/fs_file_cache_storage.cpp:
##########
@@ -879,6 +891,7 @@ void 
FSFileCacheStorage::load_cache_info_into_memory_from_db(BlockFileCache* mgr
         LOG(WARNING) << "Failed to create iterator for meta store";
         return;
     }
+    std::unordered_map<uint64_t, std::pair<std::string, std::string>> 
context_cache;
 
     while (iterator->valid()) {

Review Comment:
   This restores the persisted context for the async DB iterator load, but 
`load_blocks_directly_unlocked()` still builds `context_original` from the same 
`BlockMeta` without copying `context_id` or resolving table/partition. During 
async open, a query can hit a persisted block before this iterator has loaded 
it; that on-demand path adds the in-memory cell with `context_id = 0`. If that 
block later changes cache type, `change_key_meta_type()` rewrites RocksDB with 
the zero context id and permanently drops the table/partition context for this 
cache block. Please apply the same `context_id` / `get_context()` restoration 
in `load_blocks_directly_unlocked()` before `add_cell()`.



##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java:
##########
@@ -884,9 +886,11 @@ private Split createIcebergSplit(FileScanTask 
fileScanTask) {
                 split.setPartitionDataJson(IcebergUtils.getPartitionDataJson(
                         partitionData, partitionSpec, 
sessionVariable.getTimeZone()));
                 Map<String, String> partitionInfoMap = 
partitionMapInfos.computeIfAbsent(
-                        partitionData, k -> 
IcebergUtils.getIdentityPartitionInfoMap(

Review Comment:
   This helper returns keys from the Iceberg partition field names, while 
`getOrderedPathPartitionKeys()` still returns source column names via 
`getIdentityPartitionColumns()`. For an identity partition whose partition 
field name differs from the source column name, `partitionValues` contains the 
partition field key but `fillPartitionContextFromMap()` looks for the source 
column key, skips it, and leaves both `columns_from_path` and the cache 
partition context empty. The previous `getIdentityPartitionInfoMap()` used 
`partitionField.sourceId()` and `table.schema().findColumnName(...)`, so it 
produced source-column keys. Please preserve source-column keys while still 
rejecting non-identity specs, for example by adding a helper that returns null 
on non-identity fields but maps identity values through the source column name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to