924060929 commented on code in PR #63612:
URL: https://github.com/apache/doris/pull/63612#discussion_r3315193467
##########
fe/fe-core/src/main/java/org/apache/doris/persist/OperationType.java:
##########
@@ -433,6 +433,11 @@ public class OperationType {
public static final short OP_BEGIN_SNAPSHOT = 1100;
public static final short OP_META_SYNC_POINT = 1101;
+ // Generic "an operation modified this table's metadata" signal broadcast
from
+ // master to followers so that every FE-local cache keyed by table can be
+ // invalidated (NereidsSqlCacheManager,
NereidsSortedPartitionsCacheManager, …).
+ public static final short OP_TABLE_META_CHANGE = 1102;
Review Comment:
No existing `OperationType` matches the semantics of `OP_TABLE_META_CHANGE`.
The closest candidates and why they don't fit:
- `OP_REFRESH_EXTERNAL_TABLE` / `OP_REFRESH_CATALOG` /
`OP_REFRESH_EXTERNAL_DB` — scoped to external catalogs by name. Repurposing
them for internal OLAP tables would conflate "go re-fetch from upstream
metadata source" with "your local table-keyed cache is stale", and existing
handlers would need to branch on table type, which is worse than a new op.
- `OP_MODIFY_TABLE_PROPERTIES` / `OP_MODIFY_TABLE_LIGHT_SCHEMA_CHANGE` /
other `OP_MODIFY_TABLE_*` — each carries an operation-specific payload (the
property bag, the schema change, etc.). Folding a generic "this table changed,
invalidate downstream caches" into any one of them would require expanding its
payload schema and would make replay handlers ambiguous about whether a real
DDL happened.
`OP_TABLE_META_CHANGE` is intentionally a single generic broadcast that any
DDL on any table type can fan into, with a minimal payload (catalog/db/table
ids+names). Subscribers (`NereidsSqlCacheManager`,
`NereidsSortedPartitionsCacheManager`, and future per-table caches) all key off
the same triplet, so one op carries them all. Keeping it separate avoids
overloading existing ops.
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/Env.java:
##########
@@ -7131,7 +7132,9 @@ public void setTableStatusInternal(String dbName, String
tableName, OlapTableSta
LOG.warn("ignore set same state {} for table {}. is replay:
{}.",
olapTable.getState(), tableName, isReplay);
}
-
Env.getCurrentEnv().getSqlCacheManager().invalidateAboutTable(olapTable);
+ if (!isReplay) {
+ notifyTableMetaChange(olapTable);
Review Comment:
SQL cache invalidation is best-effort by design — every cache entry already
goes through `tablesOrDataChanged` on lookup, which compares `visibleVersion`
(Layer 1) and now `baseSchemaVersion` (Layer 2) against the live table. These
two layers cover the correctness-critical surface (data writes + schema DDLs)
without any dependency on `OP_TABLE_META_CHANGE` and without any crash /
replay-format / version-skew window.
Layer 3 (`OP_TABLE_META_CHANGE`) only adds coverage for DDLs that do not
bump either version — alter properties, alter partition status, etc. The
rolling-upgrade case you describe affects exactly this Layer-3-only set: during
the upgrade, the new follower replays an old master's journal that lacks the
companion record, and entries for those rarer DDLs may remain in cache until
the next lookup naturally evicts them or until the master finishes upgrading
and resumes emitting `OP_TABLE_META_CHANGE`. The window is bounded by the
upgrade duration, and Layer 1 + Layer 2 still hold throughout, so we accept
this gap rather than keeping the unconditional replay-side invalidation.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]