bobhan1 opened a new pull request, #60832:
URL: https://github.com/apache/doris/pull/60832
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Fix a race condition in `CloudTabletMgr::get_tablet()` introduced by #57922
(commit `0918952c70`) that causes tablets to permanently disappear from
`_tablet_map`, making them invisible to the compaction scheduler. This leads to
MOW (Merge-on-Write) tables accumulating
hundreds of rowsets without any compaction under high-frequency import.
## Root Cause
Commit `0918952c70` refactored `get_tablet()` by moving `_cache->insert()`
and `_tablet_map->put()` from inside the `SingleFlight` lambda to outside it.
This introduced a race condition:
1. When N concurrent `get_tablet()` calls arrive for the same `tablet_id`,
`SingleFlight` executes the load lambda only once (by the "leader"), but all N
callers receive a `shared_ptr` pointing to the same `CloudTablet` object.
2. After `SingleFlight::load()` returns, all N callers independently
execute:
```cpp
_cache->insert(key, value, ...) // each creates a competing LRU cache
entry
_tablet_map->put(tablet) // each inserts into tablet_map
```
3. Each `_cache->insert()` evicts the previous entry for the same key. The
evicted `Value`'s destructor calls `_tablet_map.erase(tablet.get())`. The
safety check `it->second.get() == tablet` was designed to prevent erasing a
newer tablet object — but here all callers
share the same raw pointer from `SingleFlight`, so the check always passes,
and the erase succeeds.
4. After the last caller's old cache handle is released (when its returned
`shared_ptr` goes out of scope), the destructor erases the entry from
`_tablet_map`.
5. Crucially, subsequent `get_tablet()` calls find the tablet in the LRU
cache (cache hit path), which never touches `_tablet_map`. So the tablet is
permanently invisible to `_tablet_map` and can never re-enter it.
6. The compaction scheduler uses `get_weak_tablets()` which iterates
`_tablet_map`, so it never sees these tablets and never schedules compaction
for them.
## Before (original correct code, prior to #57922):
```cpp
auto load_tablet = [this, &key, ...](int64_t tablet_id) {
// load from meta service...
// Cache insert + tablet_map put INSIDE lambda — only leader executes
auto* handle = _cache->insert(key, value.release(), ...);
_tablet_map->put(std::move(tablet));
return ret;
};
s_singleflight_load_tablet.load(tablet_id, std::move(load_tablet));
```
## After #57922 (buggy code):
```cpp
auto load_tablet = [this, ...](int64_t tablet_id) {
// load from meta service...
return tablet; // just return raw tablet
};
auto result = s_singleflight_load_tablet.load(tablet_id,
std::move(load_tablet));
// Cache insert + tablet_map put OUTSIDE lambda — ALL concurrent callers
execute
_cache->insert(key, value.release(), ...);
_tablet_map->put(std::move(tablet));
```
## Fix
Move `_cache->insert()` and `_tablet_map->put()` back inside the
`SingleFlight` lambda, ensuring only the leader caller performs cache insertion
and `_tablet_map` registration. This restores the invariant that a single
`get_tablet()` cache miss produces exactly one
LRU cache entry and one `_tablet_map` entry, eliminating the race condition.
### Release note
None
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
- [ ] Regression test
- [ ] Unit Test
- [ ] Manual test (add detailed scripts or steps below)
- [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
- [ ] Previous test can cover this change.
- [ ] No code files have been changed.
- [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
- [ ] No.
- [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
- [ ] No.
- [ ] Yes. <!-- Add document PR link here. eg:
https://github.com/apache/doris-website/pull/1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should
merge into -->
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]