wenzhenghu opened a new pull request, #64579:
URL: https://github.com/apache/doris/pull/64579

   ### What problem does this PR solve?
   
   Issue Number: None
   
   Related PR: #64035
   
   Problem Summary:
   
   This PR aims to reduce the time that Doris FE holds internal table plan-time 
read locks during Nereids planning.
   
   Problem
   In mixed queries that involve both internal tables and external tables, FE 
may access external table metadata while internal table read locks are already 
held. For some external catalogs, metadata loading is lazy and may be slow, 
such as schema initialization or latest snapshot loading. As a result, the 
internal table lock can be held much longer than necessary, which increases the 
chance of blocking concurrent operations that need the table write lock.
   
   Solution
   This PR introduces an external metadata preload phase before internal table 
locks are acquired in Nereids planning.
   
   The main idea is:
   1. Collect external tables during relation collection.
   2. Preload the metadata that will likely be needed later, before locking 
internal tables.
   3. Acquire internal table locks only after the preload phase finishes.
   4. Continue the normal analysis flow with the preloaded metadata already 
cached.
   
   The preload currently covers:
   - Hive/Hudi external tables
   - Iceberg external tables
   - Paimon external tables
   - JDBC external tables
   
   For snapshot-based engines, this PR preloads the latest snapshot only when 
the query reads the latest version of the table. It does not preload explicit 
historical snapshots.
   
   For JDBC catalogs, this PR preloads schema metadata before the lock phase, 
so the lazy schema initialization no longer extends the internal table lock 
holding window.
   
   ### Implementation summary
   
   This PR reduces the internal table plan-time read lock window by moving 
eligible external metadata loading ahead of `statementContext.lock()`. The 
implementation now follows this flow:
   
   `collect relations -> register external preload candidates -> preload 
external metadata -> lock internal tables -> analyze`
   
   The key point is that the external metadata work is no longer triggered 
lazily after internal table locks are acquired. Instead, for eligible external 
tables, it is executed before the lock stage.
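
   The ordering above can be sketched as follows. This is a minimal 
illustration, not the PR's code: `PlanFlowSketch`, `loadExternalMetadata`, and 
the single `ReentrantReadWriteLock` are hypothetical stand-ins for the planner, 
the lazy external metadata load, and the internal table read locks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the planning flow; not the actual Doris classes.
final class PlanFlowSketch {
    final List<String> events = new ArrayList<>();
    private final ReentrantReadWriteLock internalTableLock = new ReentrantReadWriteLock();

    // Stand-in for a slow, lazy external metadata load (schema, latest snapshot).
    private void loadExternalMetadata(String table) {
        events.add("preload:" + table);
    }

    void plan(List<String> externalTables) {
        events.add("collect");
        // The slow external work runs while no internal table lock is held.
        for (String table : externalTables) {
            loadExternalMetadata(table);
        }
        internalTableLock.readLock().lock();
        try {
            events.add("lock");
            // Analysis is expected to hit warm caches, keeping the lock window short.
            events.add("analyze");
        } finally {
            internalTableLock.readLock().unlock();
        }
    }
}
```

   In this sketch every `preload:` event is recorded before `lock`, which is 
the property the PR relies on.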
   
   ### 1. Preload capability is implemented as a table trait
   
   Instead of hard-coding table type checks in planner logic, preload 
capability is now declared on `TableIf`:
   
   - `supportsExternalMetadataPreload()`
   - `supportsLatestSnapshotPreload()`
   
   This keeps the capability decision close to the table implementation itself.
   
   Current coverage is:
   
   - `HMSExternalTable`
     - supports preload for Hive/Hudi
     - supports latest-snapshot preload only for Hudi
   - `IcebergExternalTable`
     - supports preload
     - supports latest-snapshot preload
   - `PaimonExternalTable`
     - supports preload
     - supports latest-snapshot preload
   - `PluginDrivenExternalTable`
     - preload is currently limited to JDBC plugin catalogs only
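
   A rough sketch of the trait idea is shown below. Only the two method names 
come from this PR; the `*Sketch` types are hypothetical, the `false` defaults 
are an assumption, and the HMS stand-in models only the Hive and Hudi cases.

```java
// Hypothetical, simplified stand-ins; the real TableIf has many more methods.
interface TableIfSketch {
    // Assumed defaults: a table stays out of the preload path unless it opts in.
    default boolean supportsExternalMetadataPreload() { return false; }
    default boolean supportsLatestSnapshotPreload() { return false; }
}

// Internal table stand-in: inherits the defaults and never preloads.
final class OlapTableSketch implements TableIfSketch { }

final class HmsTableSketch implements TableIfSketch {
    private final boolean hudi;

    HmsTableSketch(boolean hudi) { this.hudi = hudi; }

    @Override public boolean supportsExternalMetadataPreload() { return true; }

    // Per the coverage list, only Hudi gets latest-snapshot preload.
    @Override public boolean supportsLatestSnapshotPreload() { return hudi; }
}

final class IcebergTableSketch implements TableIfSketch {
    @Override public boolean supportsExternalMetadataPreload() { return true; }
    @Override public boolean supportsLatestSnapshotPreload() { return true; }
}
```

   With this shape the planner can ask the table directly instead of 
branching on concrete table types.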
   
   ### 2. StatementContext only records preload candidates
   
   `StatementContext` no longer owns the preload execution logic.
   
   During relation collection, when an external table is encountered, it 
records relation-level preload metadata through 
`registerExternalTableForPreload(...)`.
   
   This metadata is represented by `ExternalTablePreloadInfo`, which tracks 
whether the same external table is referenced as:
   
   - a latest relation
   - a non-latest relation (for example snapshot / branch / tag / time-travel 
style access)
   
   This distinction is important because snapshot-aware external tables should 
not warm latest schema/partitions when they are referenced only through 
non-latest relations.
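
   The bookkeeping described above might look roughly like this. The class and 
method names are hypothetical, and the actual `ExternalTablePreloadInfo` may 
hold different fields; the sketch only shows how repeated references to one 
table can be merged into a latest / non-latest pair of flags.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shape; the PR's ExternalTablePreloadInfo may differ.
final class PreloadInfoSketch {
    private boolean latest;
    private boolean nonLatest;

    void record(boolean isLatestRelation) {
        if (isLatestRelation) {
            latest = true;
        } else {
            nonLatest = true;
        }
    }

    boolean referencedAsLatest() { return latest; }
    boolean referencedAsNonLatest() { return nonLatest; }
}

final class PreloadRegistrySketch {
    private final Map<String, PreloadInfoSketch> byTable = new LinkedHashMap<>();

    // Called once per relation; references to the same table share one entry.
    void register(String qualifiedName, boolean isLatestRelation) {
        byTable.computeIfAbsent(qualifiedName, k -> new PreloadInfoSketch())
                .record(isLatestRelation);
    }

    PreloadInfoSketch get(String qualifiedName) { return byTable.get(qualifiedName); }
    int size() { return byTable.size(); }
}
```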
   
   ### 3. Preload execution is implemented as a Nereids analysis rule
   
   The actual preload logic is implemented in a dedicated rule: 
`PreloadExternalMetadata`.
   
   This rule runs after relation collection and before internal table locks are 
acquired.
   
   It executes at most once per statement context and produces an 
`ExternalMetadataPreloadResult`, which records:
   
   - whether preload actually ran
   - candidate table count
   - preloaded table count
   - skip reason
   - elapsed time
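
   As an illustration of the "at most once" behavior, the sketch below caches 
the first result and returns it on later calls. All names here are 
hypothetical, and the field types are assumptions based only on the list above.

```java
import java.util.function.Supplier;

// Hypothetical result holder mirroring the fields listed above.
final class PreloadResultSketch {
    final boolean ran;
    final int candidateCount;
    final int preloadedCount;
    final String skipReason; // null when preload actually ran (assumption)
    final long elapsedNanos;

    PreloadResultSketch(boolean ran, int candidateCount, int preloadedCount,
                        String skipReason, long elapsedNanos) {
        this.ran = ran;
        this.candidateCount = candidateCount;
        this.preloadedCount = preloadedCount;
        this.skipReason = skipReason;
        this.elapsedNanos = elapsedNanos;
    }
}

// Keeps the first result so that re-running the rule does not preload twice.
final class PreloadOnceSketch {
    private PreloadResultSketch result;

    PreloadResultSketch runOnce(Supplier<PreloadResultSketch> preload) {
        if (result == null) {
            result = preload.get();
        }
        return result;
    }
}
```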
   
   ### 4. Preload is gated by explicit conditions
   
   The preload rule skips execution when any of the following is true:
   
   - `enable_preload_external_metadata` is disabled
   - no eligible external preload candidates were collected
   - the statement does not involve any internal table that requires a 
plan-time read lock
   
   This means the optimization only runs when it can actually help reduce 
internal lock holding time.
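
   The three skip conditions can be expressed as a small gate. The enum 
constants and the order of the checks are assumptions made for illustration; 
the PR description does not name the actual skip reasons.

```java
// Hypothetical skip reasons; the PR does not list the actual names.
enum SkipReasonSketch { NONE, DISABLED, NO_CANDIDATES, NO_INTERNAL_TABLE_LOCK }

final class PreloadGateSketch {
    // Returns NONE when preload should run, otherwise the first matching skip reason.
    static SkipReasonSketch check(boolean enabled, int candidateCount,
                                  boolean needsInternalReadLock) {
        if (!enabled) {
            return SkipReasonSketch.DISABLED;
        }
        if (candidateCount == 0) {
            return SkipReasonSketch.NO_CANDIDATES;
        }
        if (!needsInternalReadLock) {
            // No internal lock window exists, so there is nothing to shorten.
            return SkipReasonSketch.NO_INTERNAL_TABLE_LOCK;
        }
        return SkipReasonSketch.NONE;
    }
}
```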
   
   ### 5. Preload behavior is table-type aware
   
   For each external table candidate, preload may do one or more of the 
following:
   
   - preload latest snapshot metadata
   - preload schema
   - preload selected partition metadata
   
   For snapshot-aware tables, latest snapshot/schema/partition warmup is now 
gated by whether the table is referenced by at least one latest relation.
   
   In particular, if a table is referenced only by non-latest relations, this 
PR avoids warming the latest schema/partitions. That prevents useless cache 
warmup for time-travel / branch / tag queries.
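
   One way to picture the decision is the sketch below. The names are 
hypothetical, and the branch for tables that are not snapshot-aware (schema 
only) is an assumption; the PR description does not spell out which actions 
apply to each table type.

```java
import java.util.EnumSet;
import java.util.Set;

enum PreloadActionSketch { LATEST_SNAPSHOT, SCHEMA, PARTITIONS }

final class PreloadDeciderSketch {
    static Set<PreloadActionSketch> decide(boolean supportsPreload,
                                           boolean snapshotAware,
                                           boolean referencedAsLatest) {
        Set<PreloadActionSketch> actions = EnumSet.noneOf(PreloadActionSketch.class);
        if (!supportsPreload) {
            return actions;
        }
        if (snapshotAware) {
            // Warm latest metadata only if some relation reads the latest version.
            if (referencedAsLatest) {
                actions.add(PreloadActionSketch.LATEST_SNAPSHOT);
                actions.add(PreloadActionSketch.SCHEMA);
                actions.add(PreloadActionSketch.PARTITIONS);
            }
            return actions;
        }
        // Assumption: tables without snapshots only need their schema warmed.
        actions.add(PreloadActionSketch.SCHEMA);
        return actions;
    }
}
```

   Under these assumptions, a table referenced only through time-travel, 
branch, or tag relations yields an empty action set.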
   
   ### 6. Planner/profile integration was adjusted to avoid double counting
   
   `NereidsPlanner` now reads the preload result produced by the collect-phase 
rule and records it into a dedicated profile counter:
   
   - `Nereids Preload External Metadata Time`
   
   At the same time, `Nereids Lock Table Time` was narrowed to cover only the 
actual `statementContext.lock()` call.
   
   This avoids counting preload time in both:
   
   - `Nereids Preload External Metadata Time`
   - `Nereids Lock Table Time`
   
   After this change:
   
   - when preload is disabled, external schema initialization can still show up 
in `Nereids Analysis Time`
   - when preload is enabled, that cost is shifted into `Nereids Preload 
External Metadata Time`
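
   The timing split can be illustrated with two adjacent, non-overlapping 
measurements. `PlanTimersSketch` and its injected clock are hypothetical and 
exist only to make the example deterministic; they are not the profile classes 
used by `NereidsPlanner`.

```java
import java.util.function.LongSupplier;

// Hypothetical timer pair; the clock is injected so the example is deterministic.
final class PlanTimersSketch {
    private final LongSupplier clock;
    long preloadTime; // would feed "Nereids Preload External Metadata Time"
    long lockTime;    // would feed "Nereids Lock Table Time"

    PlanTimersSketch(LongSupplier clock) {
        this.clock = clock;
    }

    void run(Runnable preload, Runnable lock) {
        long start = clock.getAsLong();
        preload.run();
        long afterPreload = clock.getAsLong();
        lock.run();
        long afterLock = clock.getAsLong();
        // The two intervals share a boundary but never overlap.
        preloadTime = afterPreload - start;
        lockTime = afterLock - afterPreload;
    }
}
```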
   
   ### 7. Session variable
   
   This PR introduces the session variable:
   
   - `enable_preload_external_metadata`
   
   It is currently off by default and acts as the main switch for this 
optimization.
   
   ### Why this helps
   
   Before this change, slow external metadata operations could extend the 
duration for which internal table plan-time read locks were held.
   
   After this change, the eligible external metadata work is moved before lock 
acquisition, so the internal lock window is shorter and less sensitive to slow 
external metadata paths.
   
   ### Release note
   
   Improve FE planning by moving external metadata preload ahead of internal 
table plan-time read locks.
   
   ### Check List (For Author)
   
   - Test: FE Unit Test / Manual test
       - FE Unit Test: `./run-fe-ut.sh --run 
org.apache.doris.nereids.StatementContextTest,org.apache.doris.common.profile.SummaryProfileTest,org.apache.doris.datasource.PluginDrivenExternalTableEngineTest`
       - Manual test: validated the JDBC preload profile case against a custom 
Doris environment with an equivalent setup
   - Behavior changed: Yes
   - Does this need documentation: No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

