cyx-6 opened a new pull request, #593:
URL: https://github.com/apache/tvm-ffi/pull/593
## Summary
Make `a.x is a.x` and `id(a.x)` stable in Python by attaching every Python
wrapper to its underlying C++ object via a 32-byte `PyCustomizeAllocHeader`
prepended to every Object allocation. Reported by Junru Shao as the largest
source of agent / OAI-monorepo migration failures. Implements the design in
Tianqi Chen's *PyObjectTying* doc (2026-05-01).
Before:
```python
a = MyClass(Inner(...))
assert a.x is a.x # False — fresh wrapper per attribute access
assert id(a.x) == id(a.x) # flaky
```
After: both hold, and identity is preserved across a wrapper
finalize-and-revive cycle whenever the C++ object outlives the wrapper.
## Design
The prepended header has a three-state machine:
| State | `py_obj` | `cached_mem` | Meaning |
|-------|----------|--------------|---------|
| (a) | NULL | NULL | C++ obj exists; never wrapped |
| (b) | non-NULL | non-NULL | Live Python wrapper |
| (c) | NULL | non-NULL | Wrapper finalized; memory cached |
```
malloc start
+------------------------------+--------+----------------------+
| PyCustomizeAllocHeader | <pad> | Object (T) |
| void* py_obj | | TVMFFIObject header_|
| void* cached_mem | | ... |
| void(*)(void*) free_cb | | |
+------------------------------+--------+----------------------+
^ T*, Object*, TVMFFIObject*
```
`kPyHeaderOffset = 32` bytes on x86_64 (rounded up to
`alignof(max_align_t)`). Header recovery is `(char*)tptr - kPyHeaderOffset`.
## Implementation
- **Allocator funnel** — `SimpleObjAllocator::Handler<T>::New` and the array
variant in `include/tvm/ffi/memory.h` allocate `kPyHeaderOffset + sizeof(T)`,
zero-init the header, place T at the offset. The Weak-branch deleter calls
`free_cb(cached_mem)` if set, then frees the whole block.
- **Python-class bypass sites** — `PyClassDeleter`, `__ffi_new__`, and
`__ffi_shallow_copy__` in `src/ffi/extra/dataclass.cc` were patched in parallel
to keep the layout uniform across all Object allocations.
- **State-(c) tp_finalize hook** — `def __del__` on `CObject` (object.pxi)
is mapped to `tp_finalize` (PEP 442). On wrapper death with C++ strong-count >
1, it clears `h.py_obj`, DecRefs the chandle, and `Py_INCREF(self)` to
resurrect — CPython then skips `tp_dealloc` so the wrapper memory survives in
place.
- **CYTHON_USE_TP_FINALIZE=1 under USE_SABI** — Cython 3.x disables this for
limited-API targets because `PyObject_CallFinalizerFromDealloc` was added to
the limited API only in 3.13. A 16-line shim in `tvm_ffi_python_helpers.h`
reimplements it from limited-ABI primitives (`PyType_GetSlot`, `Py_SET_REFCNT`,
`Py_REFCNT`). Comment in CMakeLists.txt explains the dependency.
- **Cache opt-in per call site** — `make_ret` gains a `cache_lookup`
parameter. Default False. `FieldGetter` (attribute access) sets True. FFI
return values, callback arg unpacking, and rvalue-ref paths use
`make_ret_object_no_cache` so callbacks see wrappers distinct from the caller,
preserving classical move/refcount semantics.
- **RValueRef path** — `TVMFFIPyArgSetterObjectRValueRef_` eagerly clears
the source's header binding via `_detach_chandle_binding` before the C++ side
nulls `chandle`. Without this, the deleter would later call `PyObject_GC_Del`
on a still-live, still-referenced Python wrapper. Regression test included.
- **PyNativeObject exemption** — `String`, `Bytes` discard the transient
`Object` wrapper after construction; no header binding is installed and no
identity stability is needed.
## Test plan
- [x] Full Python suite (`uv run pytest tests/python`) passes
- [x] New `tests/python/test_pyobject_tying.py` covers:
- State (b): `a.x is a.x`, stable `id(a.x)` over 100 accesses, distinct
wrappers for distinct chandles
- State (b→c→b) revive: 2000-cycle stress test asserts exactly one wrapper
address; chandle preserved across revive
- State (c) skipped when wrapper holds last C++ ref (regression for
clean-free use-after-free)
- RValueRef move (regression for the original `test_rvalue_ref` blocker)
- Pickle round-trip preserves attribute identity
- PyNativeObject (String/Bytes) exempt — no state (c) accumulation
- Type-mismatch fallthrough on revive
- Field setter after revive (mutable dataclass)
- [ ] CI: lint, clang-tidy, doc, C++/Python/Rust on Linux x86_64 + aarch64,
macOS arm64, Windows AMD64
## Out-of-scope follow-ups
Documented in the design doc, not in this PR:
- **Per-type opt-out** for hot small types via a `kSkipPyHeader` trait
- **Header compression to 2 pointers** by replacing per-instance `free_cb`
with a global module-init callback
- **Free-threaded CPython 3.13t+** — replace GIL assumption with CAS on
`h->py_obj` for the fast path
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]