junrushao opened a new issue, #457:
URL: https://github.com/apache/tvm-ffi/issues/457

   ## Summary
   
   The destructors of `SeqBaseObj`, `SmallMapObj`, and `DenseMapObj` iterate 
over their elements and invoke `Any::~Any()` in a plain loop without any 
exception-safety guard. If any single `Any::~Any()` call propagates an 
exception (via `Object::DecRef()` → user-defined `deleter`), the remaining 
elements in the container are never destroyed, and the backing buffer is never 
deallocated. This constitutes a resource leak — and in practice, a 
reference-count leak on every surviving element's pointee, which in turn 
prevents those objects from ever being freed.
   
   ## Affected Code Paths
   
   ### 1. `SeqBaseObj::~SeqBaseObj()` — impacts `Array` and `List`
   
   
https://github.com/apache/tvm-ffi/blob/main/include/tvm/ffi/container/seq_base.h#L53-L61
   
   ```cpp
   ~SeqBaseObj() {
     Any* begin = MutableBegin();
     for (int64_t i = 0; i < TVMFFISeqCell::size; ++i) {
       (begin + i)->Any::~Any();          // ← can throw
     }
     if (data_deleter != nullptr) {
       data_deleter(data);                // ← never reached on throw
     }
   }
   ```
   
   If the destructor of element `i` throws, elements `i+1 … size-1` are never 
destroyed and `data_deleter` is never called.
   
   ### 2. `SmallMapObj::~SmallMapObj()`
   
   
https://github.com/apache/tvm-ffi/blob/main/include/tvm/ffi/container/map.h#L272-L284
   
   ```cpp
   ~SmallMapObj() {
     KVType* begin = static_cast<KVType*>(data_);
     for (uint64_t index = 0; index < size_; ++index) {
       (begin + index)->first.Any::~Any();   // ← can throw
       (begin + index)->second.Any::~Any();  // ← can throw; also skipped if 
`first` throws
     }
     if (data_deleter_ != nullptr) {
       data_deleter_(data_);                 // ← never reached on throw
     }
   }
   ```
   
   Identical pattern. An additional subtlety: if `.first`'s destructor throws 
at some index, `.second` of that same entry is also leaked.
   
   ### 3. `DenseMapObj::Reset()` (called from `~DenseMapObj()`)
   
   
https://github.com/apache/tvm-ffi/blob/main/include/tvm/ffi/container/map.h#L857-L871
   
   ```cpp
   void Reset() {
     uint64_t n_blocks = CalcNumBlocks(this->NumSlots());
     for (uint64_t bi = 0; bi < n_blocks; ++bi) {
       uint8_t* meta_ptr = GetBlock(bi)->bytes;
       ItemType* data_ptr = reinterpret_cast<ItemType*>(GetBlock(bi)->bytes + 
kBlockCap);
       for (int j = 0; j < kBlockCap; ++j, ++meta_ptr, ++data_ptr) {
         uint8_t& meta = *meta_ptr;
         if (meta != kProtectedSlot && meta != kEmptySlot) {
           meta = kEmptySlot;
           data_ptr->ItemType::~ItemType();  // ← can throw
         }
       }
     }
     ReleaseMemory();                        // ← never reached on throw
   }
   ```
   
   Same issue across a two-level loop (blocks × slots). An exception at any 
slot leaks the rest of that block plus all subsequent blocks, plus the backing 
allocation itself.
   
   ### 4. `ListObj::Reserve()` — secondary site
   
   
https://github.com/apache/tvm-ffi/blob/main/include/tvm/ffi/container/list.h#L78-L93
   
   ```cpp
   void Reserve(int64_t n) {
     // ...
     for (int64_t j = 0; j < TVMFFISeqCell::size; ++j) {
       (old_data + j)->Any::~Any();   // ← can throw
     }
     data_deleter(data);              // ← never reached on throw
     // ...
   }
   ```
   
   The comment above this function correctly notes that the *move* loop is safe 
because `Any`'s move constructor is `noexcept`, but the subsequent 
*destruction* loop of the old buffer has no such guarantee.
   
   ## Root Cause
   
   `Any::~Any()` calls `Any::reset()`:
   
   ```cpp
   // any.h:249-256
   void reset() {
     if (data_.type_index >= TVMFFITypeIndex::kTVMFFIStaticObjectBegin) {
       details::ObjectUnsafe::DecRefObjectHandle(data_.v_obj);
     }
     // ...
   }
   ```
   
   `DecRefObjectHandle` calls `Object::DecRef()`, which invokes the 
user-registered `header_.deleter` function pointer when the reference count 
drops to zero. This deleter is declared as a bare C function pointer with no 
`noexcept` specification:
   
   ```c
   // c_api.h:266
   void (*deleter)(void* self, int flags);
   ```
   
   Nothing in the contract or the type system prevents a deleter from throwing 
(e.g., custom C++ deleters registered via `make_object` or language bindings 
that throw on cleanup errors). Even if we *intend* deleters to be non-throwing, 
the compiler cannot enforce this with a bare function pointer, and a single 
misbehaving deleter is enough to trigger the leak across an arbitrarily large 
container.
   
   Furthermore, none of the destructors listed above are marked 
`noexcept(false)` — they inherit the implicit `noexcept` from C++11. A throwing 
destructor in a `noexcept` context calls `std::terminate()`, which is arguably 
worse than a leak, but the real concern is that if anyone wraps these in a 
`try`/`catch` or if the `noexcept` default is relaxed in a build configuration, 
the leak becomes silently reachable.
   
   ## Impact
   
   - **Memory leak**: Backing buffers (`data`) are never freed.
   - **Reference-count leak**: Every un-destroyed `Any` that holds an 
`ObjectRef` retains a strong reference. The pointed-to objects are never 
deallocated, which may transitively keep alive entire object graphs.
   - **Cascading resource leak**: If the leaked objects themselves own OS 
resources (file descriptors, GPU memory, etc.), those are leaked as well.
   - **Non-deterministic**: The bug only manifests when `DecRef` triggers 
deletion *and* that deletion throws, so it depends on aliasing patterns and the 
specific deleter registered on the object — making it extremely difficult to 
reproduce or diagnose in production.
   
   ## Suggested Fix
   
   Wrap each element-destruction loop in a pattern that guarantees all elements 
are destroyed regardless of exceptions. A minimal approach:
   
   ```cpp
   ~SeqBaseObj() {
     Any* begin = MutableBegin();
     std::exception_ptr first_exception;
     for (int64_t i = 0; i < TVMFFISeqCell::size; ++i) {
       try {
         (begin + i)->Any::~Any();
       } catch (...) {
         if (!first_exception) {
           first_exception = std::current_exception();
         }
       }
     }
     if (data_deleter != nullptr) {
       try {
         data_deleter(data);
       } catch (...) {
         if (!first_exception) {
           first_exception = std::current_exception();
         }
       }
     }
     // Optionally: log or rethrow first_exception, though rethrowing
     // from a destructor is generally inadvisable. At minimum, all
     // resources are now cleaned up.
   }
   ```
   
   Alternatively (and preferably as a defense-in-depth measure), enforce 
`noexcept` on the deleter contract:
   
   1. Make `Any::reset()` explicitly `noexcept`.
   2. Document and/or `static_assert` that `Object::deleter` must be 
non-throwing.
   3. Add a `noexcept` wrapper around `header_.deleter()` invocations in 
`DecRef()` that calls `std::terminate()` on violation, making the contract 
self-documenting and fail-fast rather than silently leaking.
   
   The two approaches are complementary: the loop-level guard ensures 
robustness, while the `noexcept` deleter contract prevents the issue at the 
source.
   
   ## Labels
   
   `bug`, `exception-safety`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to