Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-03-05 Thread via GitHub


github-actions[bot] commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-4010127793

   ## Code Review Summary
   
   **PR #60533: [fix](filecache) avoid SIGSEGV in background LRU update when 
clear cache**
   
   ### Overview
   This PR fixes a use-after-free / dangling pointer crash in the background 
LRU update thread. The root cause: `update_block_lru()` previously dereferenced 
`block->cell` — a raw pointer back into the `_files` map — which becomes 
dangling when `clear_file_cache_directly()` destroys all cells.
   
   ### Review Checklist
   
   **Correctness**: PASS  
   The fix replaces direct `block->cell` dereference with a safe `get_cell()` 
lookup into `_files` under the cache lock. If the cell was removed, 
`get_cell()` returns `nullptr` and the function returns early. The identity 
check (`cell->file_block.get() != block.get()`) correctly guards against 
ABA-style reuse of the same hash+offset slot by a different block.
   
   **Thread Safety**: PASS  
   `update_block_lru()` is called only from 
`run_background_block_lru_update()`, which holds `_mutex` via 
`SCOPED_CACHE_LOCK`. Both cache clearing paths (`clear_file_cache_async` and 
`clear_file_cache_directly`) also hold `_mutex`. The `get_cell()` lookup is 
therefore safe under the same lock.
   
   **Initialization**: PASS  
   `FileBlockCell* cell` was previously uninitialized (`FileBlockCell* cell;`) 
— reading it before assignment was undefined behavior. The `{nullptr}` 
initializer is the correct fix.
   
   **Behavioral Change**: Minor, acceptable  
   `cell->update_atime()` is now called even when `cell->queue_iterator` is not 
set (but only after confirming the cell exists and matches). Previously it was 
inside the `if (cell)` block but also conditional on cell existing. This is a 
benign change — updating atime even without a queue iterator is reasonable.
   
   **Test Coverage**: PASS  
   The new test covers both the null-block path and the stale/corrupted cell 
pointer path. The test directly accesses `mgr._mutex` (via friend 
relationship), which is acceptable for this kind of internal unit test.
   
   **No issues found.** The fix is well-targeted and correct.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-03-05 Thread via GitHub


dataroaring commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-4010108540

   /review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-03-05 Thread via GitHub


github-actions[bot] commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-4010100974

   PR approved by anyone and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-03-05 Thread via GitHub


github-actions[bot] commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-4010100833

   PR approved by at least one committer and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-02-10 Thread via GitHub


hello-stephen commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-3878729869

   # BE Regression && UT Coverage Report
   Increment line coverage `100.00% (14/14)` :tada:
   
   [Increment coverage 
report](http://coverage.selectdb-in.cc/coverage/60533_86c0a22d28ad54ddd809f806891a3ac1e729508c_merge/increment_report/index.html)
   [Complete coverage 
report](http://coverage.selectdb-in.cc/coverage/60533_86c0a22d28ad54ddd809f806891a3ac1e729508c_merge/report/index.html)
   | Category  | Coverage   |
   |---||
   | Function Coverage | 71.58% (25859/36124) |
   | Line Coverage | 54.23% (270120/498085) |
   | Region Coverage   | 51.69% (224784/434878) |
   | Branch Coverage   | 53.14% (96487/181570) |


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-02-10 Thread via GitHub


freemandealer commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-3877646992

   run nonConcurrent


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-02-05 Thread via GitHub


hello-stephen commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-3854284239

   # BE Regression && UT Coverage Report
   Increment line coverage `100.00% (14/14)` :tada:
   
   [Increment coverage 
report](http://coverage.selectdb-in.cc/coverage/60533_86c0a22d28ad54ddd809f806891a3ac1e729508c_merge/increment_report/index.html)
   [Complete coverage 
report](http://coverage.selectdb-in.cc/coverage/60533_86c0a22d28ad54ddd809f806891a3ac1e729508c_merge/report/index.html)
   | Category  | Coverage   |
   |---||
   | Function Coverage | 71.56% (25850/36124) |
   | Line Coverage | 54.20% (269976/498085) |
   | Region Coverage   | 51.68% (224742/434878) |
   | Branch Coverage   | 53.11% (96438/181570) |


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-02-05 Thread via GitHub


hello-stephen commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-3853148394

   # BE UT Coverage Report
   Increment line coverage `100.00% (14/14)` :tada:
   
   [Increment coverage 
report](http://coverage.selectdb-in.cc/coverage/86c0a22d28ad54ddd809f806891a3ac1e729508c_86c0a22d28ad54ddd809f806891a3ac1e729508c/increment_report/index.html)
   [Complete coverage 
report](http://coverage.selectdb-in.cc/coverage/86c0a22d28ad54ddd809f806891a3ac1e729508c_86c0a22d28ad54ddd809f806891a3ac1e729508c/report/index.html)
   | Category  | Coverage   |
   |---||
   | Function Coverage | 52.55% (19371/36861) |
   | Line Coverage | 36.04% (179953/499273) |
   | Region Coverage   | 32.43% (139585/430460) |
   | Branch Coverage   | 33.41% (60425/180838) |


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-02-05 Thread via GitHub


doris-robot commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-3852519743

   
   
   ClickBench: Total hot run time: 28.41 s
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 86c0a22d28ad54ddd809f806891a3ac1e729508c, 
data reload: false
   
   query1   0.060.050.05
   query2   0.100.050.05
   query3   0.250.080.08
   query4   1.600.120.11
   query5   0.260.240.25
   query6   1.160.680.68
   query7   0.030.030.03
   query8   0.060.040.04
   query9   0.570.510.49
   query10  0.540.540.54
   query11  0.130.090.09
   query12  0.130.100.10
   query13  0.640.610.62
   query14  1.061.091.04
   query15  0.880.860.86
   query16  0.390.390.40
   query17  1.101.121.15
   query18  0.220.210.20
   query19  2.101.951.95
   query20  0.030.020.02
   query21  15.43   0.280.15
   query22  5.110.050.06
   query23  15.95   0.300.11
   query24  1.890.250.42
   query25  0.080.110.09
   query26  0.140.130.13
   query27  0.070.070.05
   query28  4.701.150.97
   query29  12.55   3.923.16
   query30  0.290.140.12
   query31  2.820.640.42
   query32  3.250.600.50
   query33  3.233.313.31
   query34  16.24   5.424.77
   query35  4.824.784.81
   query36  0.630.490.48
   query37  0.120.070.07
   query38  0.080.040.04
   query39  0.050.030.03
   query40  0.200.180.15
   query41  0.100.030.03
   query42  0.040.040.03
   query43  0.050.040.03
   Total cold run time: 99.15 s
   Total hot run time: 28.41 s
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-02-05 Thread via GitHub


doris-robot commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-3852405109

   
   
   TPC-H: Total hot run time: 31410 ms
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 86c0a22d28ad54ddd809f806891a3ac1e729508c, 
data reload: false
   
   -- Round 1 --
   q1   17658   445043624362
   q2   2009390 237 237
   q3   10120   1321721 721
   q4   10217   923 338 338
   q5   7560222319741974
   q6   195 180 147 147
   q7   886 732 602 602
   q8   9270144111981198
   q9   5280488348404840
   q10  6899197515741574
   q11  554 295 288 288
   q12  381 388 227 227
   q13  17783   410032063206
   q14  235 240 214 214
   q15  914 815 818 815
   q16  685 686 634 634
   q17  645 820 522 522
   q18  7333663863886388
   q19  11191005609 609
   q20  392 366 254 254
   q21  2725216019861986
   q22  372 320 274 274
   Total cold run time: 103232 ms
   Total hot run time: 31410 ms
   
   - Round 2, with runtime_filter_mode=off -
   q1   4353436743424342
   q2   277 344 261 261
   q3   2140265822262226
   q4   1390175212821282
   q5   4310421142254211
   q6   219 179 137 137
   q7   1827181620251816
   q8   2736249124712471
   q9   7548746174547454
   q10  2994330725822582
   q11  539 474 454 454
   q12  684 765 615 615
   q13  3987436635813581
   q14  285 323 284 284
   q15  864 810 806 806
   q16  682 733 693 693
   q17  1140130313441303
   q18  8076800380478003
   q19  893 878 856 856
   q20  2186216220012001
   q21  4934457244184418
   q22  602 577 525 525
   Total cold run time: 52666 ms
   Total hot run time: 50321 ms
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-02-05 Thread via GitHub


Thearas commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-3851699065

   
   Thank you for your contribution to Apache Doris.
   Don't know what should be done next? See [How to process your 
PR](https://cwiki.apache.org/confluence/display/DORIS/How+to+process+your+PR).
   
   Please clearly describe your PR:
   1. What problem was fixed (it's best to include specific error reporting 
information). How it was fixed.
   2. Which behaviors were modified. What was the previous behavior, what is it 
now, why was it modified, and what possible impacts might there be.
   3. What features were added. Why was this function added?
   4. Which code was refactored and why was this part of the code refactored?
   5. Which functions were optimized and what is the difference before and 
after the optimization?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [fix](filecache) avoid SIGSEGV in background LRU update when clear cache [doris]

2026-02-05 Thread via GitHub


freemandealer commented on PR #60533:
URL: https://github.com/apache/doris/pull/60533#issuecomment-3851699215

   run buildall


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]