github-actions[bot] commented on code in PR #61518:
URL: https://github.com/apache/doris/pull/61518#discussion_r3408185640
##########
be/src/io/cache/fs_file_cache_storage.cpp:
##########
@@ -879,6 +891,7 @@ void
FSFileCacheStorage::load_cache_info_into_memory_from_db(BlockFileCache* mgr
LOG(WARNING) << "Failed to create iterator for meta store";
return;
}
+ std::unordered_map<uint64_t, std::pair<std::string, std::string>>
context_cache;
while (iterator->valid()) {
Review Comment:
This restores the persisted context for the async DB iterator load, but
`load_blocks_directly_unlocked()` still builds `context_original` from the same
`BlockMeta` without copying `context_id` or resolving table/partition. During
async open, a query can hit a persisted block before this iterator has loaded
it; that on-demand path adds the in-memory cell with `context_id = 0`. If that
block later changes cache type, `change_key_meta_type()` rewrites RocksDB with
the zero context id and permanently drops the table/partition context for this
cache block. Please apply the same `context_id` / `get_context()` restoration
in `load_blocks_directly_unlocked()` before `add_cell()`.
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java:
##########
@@ -884,9 +886,11 @@ private Split createIcebergSplit(FileScanTask
fileScanTask) {
split.setPartitionDataJson(IcebergUtils.getPartitionDataJson(
partitionData, partitionSpec,
sessionVariable.getTimeZone()));
Map<String, String> partitionInfoMap =
partitionMapInfos.computeIfAbsent(
- partitionData, k ->
IcebergUtils.getIdentityPartitionInfoMap(
Review Comment:
This helper returns keys from the Iceberg partition field names, while
`getOrderedPathPartitionKeys()` still returns source column names via
`getIdentityPartitionColumns()`. For an identity partition whose partition
field name differs from the source column name, `partitionValues` contains the
partition field key but `fillPartitionContextFromMap()` looks for the source
column key, skips it, and leaves both `columns_from_path` and the cache
partition context empty. The previous `getIdentityPartitionInfoMap()` used
`partitionField.sourceId()` and `table.schema().findColumnName(...)`, so it
produced source-column keys. Please preserve source-column keys while still
rejecting non-identity specs, for example by adding a helper that returns null
on non-identity fields but maps identity values through the source column name.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]