dataroaring commented on PR #59947:
URL: https://github.com/apache/doris/pull/59947#issuecomment-3774162439

   # Code Review: [fix](storage) fix IOContext Use-After-Free
   
   ## Summary
   
   This PR addresses a use-after-free crash by changing `IOContext` lifetime 
management from raw pointers to `shared_ptr`. The core issue is that 
`IOContext` objects were being destroyed while file readers still held 
references to them.
   
   `★ Insight ─────────────────────────────────────`
   This is a classic C++ lifetime management problem. When async operations 
(like prefetch) capture raw pointers to stack-allocated objects, the objects 
may be destroyed before the async operation completes. Using `shared_ptr` with 
weak capture or explicit lifetime extension is the standard solution.
   `─────────────────────────────────────────────────`
   
   ---
   
   ## Critical Issues
   
   ### 1. Incomplete Fix - IOContext Contains Pointers (HIGH)
   
   The backward-compatible API copies `IOContext` to the heap:
   
   ```cpp
   // buffered_reader.cpp:853-856
   if (io_ctx != nullptr) {
       // Old API: best-effort safety by copying the IOContext onto the heap.
       io_ctx_holder = std::make_shared<IOContext>(*io_ctx);
   }
   ```
   
   However, `IOContext` contains pointer members that point to other objects:
   
   ```cpp
   // From io_common.h (inferred from internal_service.cpp changes)
   io_ctx->file_cache_stats = &file_cache_statis;  // raw pointer
   io_ctx->file_reader_stats = &file_reader_stats; // raw pointer
   ```
   
   **Copying IOContext copies these pointers, not the objects they point to.** 
If the original `FileCacheStatistics` or `FileReaderStats` objects are 
destroyed, the copied `IOContext` now has dangling pointers.
   
   **Evidence**: The `internal_service.cpp` change shows these need to also be 
heap-allocated:
   ```cpp
   auto file_cache_statis = std::make_shared<io::FileCacheStatistics>();
   auto file_reader_stats = std::make_shared<io::FileReaderStats>();
   io_ctx->file_cache_stats = file_cache_statis.get();
   io_ctx->file_reader_stats = file_reader_stats.get();
   ```
   
   **Recommendation**: 
   - Document that the "old API" copy is NOT safe if `IOContext` contains 
pointers to stack-allocated stats objects
   - Consider making `IOContext` store `shared_ptr` to stats instead of raw 
pointers
   - Or add validation/assertion in the copy path
   
   ### 2. Dual-Ownership Pattern is Error-Prone (MEDIUM)
   
   The pattern of keeping both `_io_ctx_holder` (shared_ptr) and `_io_ctx` (raw 
pointer) is fragile:
   
   ```cpp
   std::shared_ptr<const IOContext> _io_ctx_holder;
   const IOContext* _io_ctx = nullptr;
   ```
   
   With initialization like:
   ```cpp
   _io_ctx_holder(std::move(io_ctx_holder)),
   _io_ctx(_io_ctx_holder.get()),
   ```
   
   **Risk**: If `_io_ctx_holder` is moved from or reset, `_io_ctx` becomes 
dangling.
   
   **Recommendation**: Either:
   - Always access through `_io_ctx_holder.get()` (no raw pointer member), or
   - Use a getter method that returns `_io_ctx_holder.get()` 
   
   ### 3. Inconsistent Constructor Patterns (MEDIUM)
   
   The constructors have inconsistent handling:
   
   **CsvReader**:
   ```cpp
   CsvReader(..., io::IOContext* io_ctx, std::shared_ptr<io::IOContext> 
io_ctx_holder = nullptr)
       : ..., _io_ctx(io_ctx), _io_ctx_holder(std::move(io_ctx_holder)) {
       if (_io_ctx == nullptr && _io_ctx_holder) {
           _io_ctx = _io_ctx_holder.get();
       }
   }
   ```
   
   This allows callers to pass mismatched parameters (raw pointer to one 
object, shared_ptr to another). 
   
   **Recommendation**: Use a single parameter - either always `shared_ptr` or a 
factory method that handles both cases.
   
   ---
   
   ## Code Quality Issues
   
   ### 4. Code Duplication in `_create_file_reader`
   
   Multiple files have this duplicated pattern:
   
   ```cpp
   io::FileReaderSPtr file_reader;
   if (_io_ctx_holder) {
       file_reader = DORIS_TRY(io::DelegateReader::create_file_reader(
               ..., std::static_pointer_cast<const 
io::IOContext>(_io_ctx_holder), ...));
   } else {
       file_reader = DORIS_TRY(io::DelegateReader::create_file_reader(
               ..., _io_ctx, ...));
   }
   ```
   
   **Recommendation**: Centralize this in a helper or always use the shared_ptr 
overload (which already handles nullptr).
   
   ### 5. Missing Documentation
   
   The PR description lacks:
   - Issue number (what crash was observed?)
   - Stack trace or reproduction steps
   - Which code paths triggered the UAF
   
   **Recommendation**: Add details for future debugging reference.
   
   ---
   
   ## Positive Aspects
   
   1. **Backward Compatibility**: The old API signature is preserved with 
best-effort safety
   2. **Consistent Pattern**: All file readers (CSV, JSON, ORC, Parquet) are 
updated
   3. **Benchmark Results**: No performance regression observed (TPC-H, TPC-DS, 
ClickBench results look normal)
   
   ---
   
   ## Suggestions
   
   ### Short-term
   1. Add comments explaining when to use shared_ptr vs raw pointer API
   2. Add assertion in copy path to warn about non-null stats pointers
   3. Consider deprecating the raw pointer API
   
   ### Long-term
   1. Refactor `IOContext` to own its stats objects via shared_ptr
   2. Remove the dual-pointer pattern in favor of single shared_ptr member
   
   ---
   
   ## Verdict
   
   The fix addresses the immediate UAF issue, but the solution has gaps that 
could lead to similar issues:
   
   1. **The "copy IOContext" backward-compat path doesn't deep-copy pointer 
members** - this can still cause UAF if stats objects are stack-allocated
   2. **The dual-ownership pattern (`_io_ctx` + `_io_ctx_holder`) is fragile**
   
   **Recommendation**: Address issue #1 before merging, or at minimum add clear 
documentation that the old API requires heap-allocated stats objects.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to