[PR] [fix](cloud) Fix tablets permanently invisible to compaction scheduler due to race condition in `CloudTabletMgr::get_tablet` [doris]

via GitHub Wed, 25 Feb 2026 03:35:14 -0800


bobhan1 opened a new pull request, #60832:
URL: https://github.com/apache/doris/pull/60832


   ### What problem does this PR solve?
   
   Issue Number: close #xxx
   
   Related PR: #xxx
   
   Fix a race condition in `CloudTabletMgr::get_tablet()` introduced by #57922 
(commit `0918952c70`) that causes tablets to permanently disappear from 
`_tablet_map`, making them invisible to the compaction scheduler. This leads to 
MOW (Merge-on-Write) tables accumulating
     hundreds of rowsets without any compaction under high-frequency import.
   
    ## Root Cause
   
    Commit `0918952c70` refactored `get_tablet()` by moving `_cache->insert()` 
and `_tablet_map->put()` from inside the `SingleFlight` lambda to outside it. 
This introduced a race condition:
   
    1. When N concurrent `get_tablet()` calls arrive for the same `tablet_id`, 
`SingleFlight` executes the load lambda only once (by the "leader"), but all N 
callers receive a `shared_ptr` pointing to the same `CloudTablet` object.
   
    2. After `SingleFlight::load()` returns, all N callers independently 
execute:
   
    ```cpp
    _cache->insert(key, value, ...)   // each creates a competing LRU cache 
entry
    _tablet_map->put(tablet)          // each inserts into tablet_map
    ```
   
    3. Each `_cache->insert()` evicts the previous entry for the same key. The 
evicted `Value`'s destructor calls `_tablet_map.erase(tablet.get())`. The 
safety check `it->second.get() == tablet` was designed to prevent erasing a 
newer tablet object — but here all callers
    share the same raw pointer from `SingleFlight`, so the check always passes, 
and the erase succeeds.
   
    4. After the last caller's old cache handle is released (when its returned 
`shared_ptr` goes out of scope), the destructor erases the entry from 
`_tablet_map`.
   
    5. Crucially, subsequent `get_tablet()` calls find the tablet in the LRU 
cache (cache hit path), which never touches `_tablet_map`. So the tablet is 
permanently invisible to `_tablet_map` and can never re-enter it.
   
    6. The compaction scheduler uses `get_weak_tablets()` which iterates 
`_tablet_map`, so it never sees these tablets and never schedules compaction 
for them.
   
    ## Before (original correct code, prior to #57922):
   
    ```cpp
    auto load_tablet = [this, &key, ...](int64_t tablet_id) {
        // load from meta service...
        // Cache insert + tablet_map put INSIDE lambda — only leader executes
        auto* handle = _cache->insert(key, value.release(), ...);
        _tablet_map->put(std::move(tablet));
        return ret;
    };
    s_singleflight_load_tablet.load(tablet_id, std::move(load_tablet));
    ```
   
    ## After #57922 (buggy code):
   
    ```cpp
    auto load_tablet = [this, ...](int64_t tablet_id) {
        // load from meta service...
        return tablet;  // just return raw tablet
    };
    auto result = s_singleflight_load_tablet.load(tablet_id, 
std::move(load_tablet));
    // Cache insert + tablet_map put OUTSIDE lambda — ALL concurrent callers 
execute
    _cache->insert(key, value.release(), ...);
    _tablet_map->put(std::move(tablet));
    ```
   
    ## Fix
   
    Move `_cache->insert()` and `_tablet_map->put()` back inside the 
`SingleFlight` lambda, ensuring only the leader caller performs cache insertion 
and `_tablet_map` registration. This restores the invariant that a single 
`get_tablet()` cache miss produces exactly one
    LRU cache entry and one `_tablet_map` entry, eliminating the race condition.
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test <!-- At least one of them must be included. -->
       - [ ] Regression test
       - [ ] Unit Test
       - [ ] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason <!-- Add your reason?  -->
   
   - Behavior changed:
       - [ ] No.
       - [ ] Yes. <!-- Explain the behavior change -->
   
   - Does this need documentation?
       - [ ] No.
       - [ ] Yes. <!-- Add document PR link here. eg: 
https://github.com/apache/doris-website/pull/1214 -->
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label <!-- Add branch pick label that this PR should 
merge into -->
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] [fix](cloud) Fix tablets permanently invisible to compaction scheduler due to race condition in `CloudTabletMgr::get_tablet` [doris]

Reply via email to