wenzhenghu opened a new pull request, #64579:
URL: https://github.com/apache/doris/pull/64579
### What problem does this PR solve?
Issue Number: None
Related PR: #64035
Problem Summary:
This PR aims to reduce the time that Doris FE holds internal table plan-time
read locks during Nereids planning.
Problem
In mixed queries that involve both internal tables and external tables, FE
may access external table metadata while internal table read locks are already
held. For some external catalogs, metadata loading (for example schema
initialization or latest snapshot loading) is lazy and may be slow. As a result, the
internal table lock can be held much longer than necessary, which increases the
chance of blocking concurrent operations that need the table write lock.
Solution
This PR introduces an external metadata preload phase before internal table
locks are acquired in Nereids planning.
The main idea is:
1. Collect external tables during relation collection.
2. Preload the metadata that will likely be needed later, before locking
internal tables.
3. Acquire internal table locks only after the preload phase finishes.
4. Continue the normal analysis flow with the preloaded metadata already
cached.
The preload currently covers:
- Hive/Hudi external tables
- Iceberg external tables
- Paimon external tables
- JDBC external tables
For snapshot-based engines, this PR only preloads the latest snapshot when
the query is using the latest view of the table. It does not preload explicit
historical snapshots.
For JDBC catalogs, this PR preloads schema metadata before the lock phase,
so the lazy schema initialization no longer extends the internal table lock
holding window.
### Implementation summary
This PR reduces the internal table plan-time read lock window by moving
eligible external metadata loading ahead of `statementContext.lock()`. The
implementation now follows this flow:
`collect relations -> register external preload candidates -> preload
external metadata -> lock internal tables -> analyze`
The key point is that the external metadata work is no longer triggered
lazily after internal table locks are acquired. Instead, for eligible external
tables, it is executed before the lock stage.
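The stage ordering above can be sketched as follows. This is a minimal illustration rather than the PR's code: `PlanFlowSketch` and its members are hypothetical names, and only the order of the stages is taken from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: records the order of the planning stages described above.
// None of these names are Doris APIs.
final class PlanFlowSketch {
    private final List<String> trace = new ArrayList<>();
    private boolean internalLocksHeld = false;

    // Slow external metadata work must happen while no internal lock is held.
    private void preloadExternalMetadata() {
        if (internalLocksHeld) {
            throw new IllegalStateException("preload must run before locking");
        }
        trace.add("preload external metadata");
    }

    List<String> plan() {
        trace.add("collect relations");
        trace.add("register external preload candidates");
        preloadExternalMetadata();
        internalLocksHeld = true;      // stands in for statementContext.lock()
        trace.add("lock internal tables");
        try {
            trace.add("analyze");      // analysis now reads already-cached metadata
        } finally {
            internalLocksHeld = false; // release, in this sketch only
        }
        return trace;
    }
}
```

Calling `new PlanFlowSketch().plan()` returns the five stages in the order listed in the flow, with the preload entry ahead of the lock entry.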
### 1. Preload capability is implemented as a table trait
Instead of hard-coding table type checks in planner logic, preload
capability is now declared on `TableIf`:
- `supportsExternalMetadataPreload()`
- `supportsLatestSnapshotPreload()`
This keeps the capability decision close to the table implementation itself.
Current coverage is:
- `HMSExternalTable`
  - supports preload for Hive/Hudi
  - supports latest-snapshot preload only for Hudi
- `IcebergExternalTable`
  - supports preload
  - supports latest-snapshot preload
- `PaimonExternalTable`
  - supports preload
  - supports latest-snapshot preload
- `PluginDrivenExternalTable`
  - preload is currently limited to JDBC plugin catalogs only
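A rough sketch of the trait approach, for illustration only. The two method names come from this description; the `boolean` return type, the default-`false` behavior, and every `*Sketch` type are assumptions, not the actual `TableIf` code.

```java
// Hypothetical stand-in for the capability methods declared on TableIf.
// Assumption: both methods return boolean and default to false.
interface PreloadCapableSketch {
    default boolean supportsExternalMetadataPreload() { return false; }
    default boolean supportsLatestSnapshotPreload() { return false; }
}

// A table type that does not opt in simply keeps the defaults.
final class InternalTableSketch implements PreloadCapableSketch {}

// Iceberg-like table: supports preload and latest-snapshot preload.
final class IcebergTableSketch implements PreloadCapableSketch {
    @Override public boolean supportsExternalMetadataPreload() { return true; }
    @Override public boolean supportsLatestSnapshotPreload() { return true; }
}

// HMS-like table: preload for Hive and Hudi, latest-snapshot preload only for Hudi.
final class HmsTableSketch implements PreloadCapableSketch {
    enum Format { HIVE, HUDI, OTHER } // hypothetical format tag

    private final Format format;

    HmsTableSketch(Format format) { this.format = format; }

    @Override public boolean supportsExternalMetadataPreload() {
        return format == Format.HIVE || format == Format.HUDI;
    }

    @Override public boolean supportsLatestSnapshotPreload() {
        return format == Format.HUDI;
    }
}
```

With this shape, planner code can ask the table what it supports without checking concrete table classes.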
### 2. StatementContext only records preload candidates
`StatementContext` no longer owns the preload execution logic.
During relation collection, when an external table is encountered, it
records relation-level preload metadata through
`registerExternalTableForPreload(...)`.
This metadata is represented by `ExternalTablePreloadInfo`, which tracks
whether the same external table is referenced as:
- a latest relation
- a non-latest relation (for example snapshot / branch / tag / time-travel
style access)
This distinction is important because snapshot-aware external tables should
not warm latest schema/partitions when they are referenced only through
non-latest relations.
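As an illustration of relation-level tracking, the sketch below merges several references to the same external table into one entry. The class and method names are hypothetical and only loosely modeled on `ExternalTablePreloadInfo` and `registerExternalTableForPreload(...)`, whose real signatures are not shown in this description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical per-table record: how was this table referenced in the statement?
final class PreloadInfoSketch {
    private boolean latestReferenced;
    private boolean nonLatestReferenced;

    void record(boolean latest) {
        if (latest) {
            latestReferenced = true;
        } else {
            nonLatestReferenced = true;
        }
    }

    boolean hasLatestRelation() { return latestReferenced; }
    boolean hasNonLatestRelation() { return nonLatestReferenced; }
}

// Hypothetical registry keyed by a qualified table name (an assumed key choice).
final class PreloadRegistrySketch {
    private final Map<String, PreloadInfoSketch> candidates = new LinkedHashMap<>();

    // Multiple relations on the same table are merged into one entry.
    void register(String qualifiedTableName, boolean latest) {
        candidates.computeIfAbsent(qualifiedTableName, k -> new PreloadInfoSketch())
                  .record(latest);
    }

    PreloadInfoSketch get(String qualifiedTableName) {
        return candidates.get(qualifiedTableName);
    }

    int size() { return candidates.size(); }
}
```

A table referenced once as latest and once through a tag ends up as a single candidate with both flags set, while a table referenced only through time travel has `hasLatestRelation()` equal to `false`.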
### 3. Preload execution is implemented as a Nereids analysis rule
The actual preload logic is implemented in a dedicated rule:
`PreloadExternalMetadata`.
This rule runs after relation collection and before internal table locks are
acquired.
It executes at most once per statement context and produces an
`ExternalMetadataPreloadResult`, which records:
- whether preload actually ran
- candidate table count
- preloaded table count
- skip reason
- elapsed time
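The result shape and the "at most once" behavior could look roughly like this. The field list mirrors the bullets above; the field names, the factory methods, and the memoizing holder are assumptions made for this sketch, not the actual `ExternalMetadataPreloadResult` or `PreloadExternalMetadata` code.

```java
import java.util.function.Supplier;

// Hypothetical value object mirroring the recorded fields listed above.
final class PreloadResultSketch {
    final boolean ran;
    final int candidateTables;
    final int preloadedTables;
    final String skipReason;   // null when preload actually ran
    final long elapsedMillis;

    private PreloadResultSketch(boolean ran, int candidateTables, int preloadedTables,
                                String skipReason, long elapsedMillis) {
        this.ran = ran;
        this.candidateTables = candidateTables;
        this.preloadedTables = preloadedTables;
        this.skipReason = skipReason;
        this.elapsedMillis = elapsedMillis;
    }

    static PreloadResultSketch skipped(String reason, int candidateTables) {
        return new PreloadResultSketch(false, candidateTables, 0, reason, 0L);
    }

    static PreloadResultSketch completed(int candidateTables, int preloadedTables,
                                         long elapsedMillis) {
        return new PreloadResultSketch(true, candidateTables, preloadedTables,
                                       null, elapsedMillis);
    }
}

// Hypothetical per-statement holder: the first call computes the result,
// later calls return the stored result without running preload again.
final class OncePerStatementSketch {
    private PreloadResultSketch result;

    PreloadResultSketch runOnce(Supplier<PreloadResultSketch> preload) {
        if (result == null) {
            result = preload.get();
        }
        return result;
    }
}
```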
### 4. Preload is gated by explicit conditions
The preload rule skips execution when any of the following is true:
- `enable_preload_external_metadata` is disabled
- no eligible external preload candidates were collected
- the statement does not involve any internal table that requires a
plan-time read lock
This means the optimization only runs when it can actually help reduce
internal lock holding time.
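The three skip conditions can be expressed as a small gate. This is a sketch under assumptions: the class name, the parameter names, the reason strings, and the order in which the conditions are checked are illustrative only.

```java
import java.util.Optional;

// Hypothetical gate: returns a skip reason, or empty when preload should run.
final class PreloadGateSketch {
    static Optional<String> skipReason(boolean switchEnabled,
                                       int eligibleCandidates,
                                       boolean needsInternalReadLock) {
        if (!switchEnabled) {
            return Optional.of("enable_preload_external_metadata is disabled");
        }
        if (eligibleCandidates == 0) {
            return Optional.of("no eligible external preload candidates");
        }
        if (!needsInternalReadLock) {
            // Without an internal plan-time read lock there is no lock window to shorten.
            return Optional.of("no internal table requires a plan-time read lock");
        }
        return Optional.empty();
    }
}
```

For example, a query that touches only external tables yields a skip reason even when the switch is on, because no internal lock would be held during the lazy metadata load.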
### 5. Preload behavior is table-type aware
For each external table candidate, preload may do one or more of the
following:
- preload latest snapshot metadata
- preload schema
- preload selected partition metadata
For snapshot-aware tables, latest snapshot/schema/partition warmup is now
gated by how the table is referenced in the statement.
In particular, if a table is referenced only by non-latest relations, this
PR skips warming the latest schema/partitions, which prevents useless cache
warmup for time-travel / branch / tag queries.
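One way to picture the per-table decision is a function from capabilities and reference kind to a set of warmup actions. Everything here is hypothetical. In particular, treating a table that has at least one latest reference as eligible, and limiting tables without latest-snapshot support to schema preload, are simplifying assumptions of this sketch rather than documented behavior.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical decision helper; the names are not Doris APIs.
final class PreloadActionsSketch {
    enum Action { LATEST_SNAPSHOT, SCHEMA, PARTITIONS }

    static Set<Action> decide(boolean supportsPreload,
                              boolean supportsLatestSnapshot,
                              boolean referencedAsLatest) {
        if (!supportsPreload) {
            return EnumSet.noneOf(Action.class);
        }
        if (!supportsLatestSnapshot) {
            // Assumption: tables without snapshots (JDBC-like) only warm schema.
            return EnumSet.of(Action.SCHEMA);
        }
        if (!referencedAsLatest) {
            // Time-travel / branch / tag only: do not warm latest metadata.
            return EnumSet.noneOf(Action.class);
        }
        return EnumSet.allOf(Action.class);
    }
}
```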
### 6. Planner/profile integration was adjusted to avoid double counting
`NereidsPlanner` now reads the preload result produced by the collect-phase
rule and records it into a dedicated profile counter:
- `Nereids Preload External Metadata Time`
At the same time, `Nereids Lock Table Time` was narrowed to cover only the
actual `statementContext.lock()` call.
This avoids double counting preload time into both:
- `Nereids Preload External Metadata Time`
- `Nereids Lock Table Time`
After this change:
- when preload is disabled, external schema initialization can still show up
in `Nereids Analysis Time`
- when preload is enabled, that cost is shifted into `Nereids Preload
External Metadata Time`
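The double-counting fix amounts to measuring two disjoint intervals. The sketch below uses an injectable clock so the split is easy to check; `PlanTimersSketch` and its fields are hypothetical and do not reflect how `NereidsPlanner` or the summary profile store these values.

```java
import java.util.function.LongSupplier;

// Hypothetical timers: preload and lock are measured as separate,
// non-overlapping intervals so neither is counted twice.
final class PlanTimersSketch {
    long preloadNanos; // would feed "Nereids Preload External Metadata Time"
    long lockNanos;    // would feed "Nereids Lock Table Time"

    private final LongSupplier clock;

    PlanTimersSketch(LongSupplier clock) {
        this.clock = clock;
    }

    void run(Runnable preload, Runnable lock) {
        long t0 = clock.getAsLong();
        preload.run();
        long t1 = clock.getAsLong();
        lock.run();                 // only the lock call itself is timed here
        long t2 = clock.getAsLong();
        preloadNanos = t1 - t0;
        lockNanos = t2 - t1;
    }
}
```

In real use the clock would be `System::nanoTime`. With a fake clock that advances by 50 during preload and by 5 during locking, the two counters read 50 and 5 respectively.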
### 7. Session variable
This PR introduces the session variable:
- `enable_preload_external_metadata`
It is currently off by default and acts as the main switch for this
optimization.
### Why this helps
Before this change, slow external metadata operations could extend the
duration for which internal table plan-time read locks were held.
After this change, the eligible external metadata work is moved before lock
acquisition, so the internal lock window is shorter and less sensitive to slow
external metadata paths.
### Release note
Improve FE planning by moving external metadata preload ahead of internal
table plan-time read locks.
### Check List (For Author)
- Test: FE Unit Test / Manual test
  - FE Unit Test: `./run-fe-ut.sh --run org.apache.doris.nereids.StatementContextTest,org.apache.doris.common.profile.SummaryProfileTest,org.apache.doris.datasource.PluginDrivenExternalTableEngineTest`
  - Manual test: validated the JDBC preload profile case against a custom
    Doris environment with an equivalent setup
- Behavior changed: Yes
- Does this need documentation: No
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]