github-actions[bot] commented on code in PR #63298:
URL: https://github.com/apache/doris/pull/63298#discussion_r3290002811
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/MaterializedIndex.java:
##########
@@ -109,18 +111,26 @@ public Tablet getTablet(long tabletId) {
return idToTablets.get(tabletId);
}
- public void clearTabletsForRestore() {
+ public synchronized void clearTabletsForRestore() {
idToTablets.clear();
- tablets.clear();
+ tablets = new ArrayList<>(); // volatile write
}
- public void addTablet(Tablet tablet, TabletMeta tabletMeta) {
+ public synchronized void addTablet(Tablet tablet, TabletMeta tabletMeta) {
addTablet(tablet, tabletMeta, false);
}
- public void addTablet(Tablet tablet, TabletMeta tabletMeta, boolean
isRestore) {
+ // Writers are synchronized on this index to prevent concurrent
lost-update:
+ // some callers (e.g. InternalCatalog.createTablets) do NOT hold the
OlapTable
+ // write lock when adding tablets.
+ // Copy-on-write makes incremental add O(n); bulk creation is thus O(n^2),
but n
+ // (bucket count) is small and creation cost is dominated by replica/RPC
work — the
+ // copy is negligible, and CME-safe reads on the hot query path are worth
it.
+ public synchronized void addTablet(Tablet tablet, TabletMeta tabletMeta,
boolean isRestore) {
idToTablets.put(tablet.getId(), tablet);
Review Comment:
This makes bulk tablet creation quadratic. `InternalCatalog.createTablets()`
loops over `distributionInfo.getBucketNum()` and calls `index.addTablet(...)`
once per bucket, so this new `new ArrayList<>(tablets)` copies 0 + 1 + ... +
(bucketNum - 1) entries while holding the synchronized index monitor. The
default bucket limit is already 768 and `max_bucket_num_per_partition` can be
raised or disabled, so CREATE TABLE / ADD PARTITION / restore with many buckets
or rollups now spend avoidable FE time and allocations just rebuilding the same
prefix repeatedly. Please avoid per-tablet copy-on-write for bulk creation, for
example by adding a bulk publish path that builds the new list once, or by
otherwise preserving append as O(1) while still returning reader snapshots.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]