The GitHub Actions job "CI" on tvm-ffi.git/pyobject has failed.
Run started by GitHub user cyx-6 (triggered by cyx-6).

Head commit for run:
1b751b2ae49beffbf94adaecc974500e5b32978d / Yaxing Cai <[email protected]>
[FEAT][Python] Tie Python wrapper lifetime to underlying C++ FFI object

Make a.x is a.x, id(a.x) stable across drop+refetch, and f(x) is x for
FFI returns by attaching a 16-byte tagged-pointer header to every
Object allocation. Implements Tianqi's PyObjectTying doc (2026-05-01)
with adaptations forced by Py_LIMITED_API + Cython 3.x constraints.

Design:
  - Two-layer custom-allocator hook in core libtvm_ffi.so:
    `TVMFFICustomAllocHeader { delete_space }`,
    `TVMFFICustomAllocator { allocate }`, plus
    `TVMFFIGetCustomAllocator` / `TVMFFISetCustomAllocator`. libtvm_ffi
    installs a builtin default at registry construction so every
    Object always carries the 8-byte base header. The Python Cython
    module overrides the global default at module load with
    `TVMFFIPyAllocate`, which prepends a 16-byte `PyCustomAllocHeader`
    encoding the Python wrapper binding.
  - Tagged-pointer state machine. `PyCustomAllocHeader` is a single
    `uintptr_t tagged_wrapper` field plus the base header (16 B
    total). Low bit of `tagged_wrapper` is the "live" tag:
      (a) tagged_wrapper == 0                  : never wrapped
      (b) tagged_wrapper == (uintptr_t)w | 1   : live wrapper at w
      (c) tagged_wrapper == (uintptr_t)w       : preserved-mem at w
    CPython PyObjects are >= 8-byte aligned (PyObject_HEAD = 16 B), so
    bit 0 is always free for the tag. Eliminates the 8-byte reserved
    pad an earlier draft needed.
  - Universal cache-on for `make_ret`. Every FFI return (function
    results, callback args, FieldGetter) funnels through
    `make_ret_object` which returns the canonical wrapper for a
    chandle when one exists. Combined with state-(c) revive on the
    cached path, this delivers `a.x is a.x`, stable `id(a.x)` across
    drop+refetch, and `f(x) is x` when a function returns the same
    chandle it received. The doc proposed a `tp_alloc` + TLS hand-off
    for revive; not viable under Py_LIMITED_API
    (`PyType_SetSlot` not in 3.12 SABI; Cython 3.x forces
    `CYTHON_USE_TYPE_SPECS=0` for limited-API targets). The inline
    `make_ret_object` revive branch is the SABI-compatible equivalent
    -- bypasses `tp_alloc` by returning the cached PyObject directly,
    strictly cheaper than the doc's design (no alloc+dealloc cycle on
    revive).
  - State (c) via Cython `def __del__` on `CObject`, mapping to
    `tp_finalize` (PEP 442). When other C++ holders keep the chandle
    alive past the wrapper's Python refcount hitting 0, clear the
    live bit (keep the pointer), DecRef the chandle, `Py_INCREF(self)`
    to resurrect. Wrapper memory survives at the same address until
    revive or chandle death. Shim for
    `PyObject_CallFinalizerFromDealloc` (added to SABI in 3.13)
    using `PyType_GetSlot` + `Py_SET_REFCNT`/`Py_REFCNT` on 3.12;
    paired with `-DCYTHON_USE_TP_FINALIZE=1`.
  - Shutdown guard. `g_tvm_ffi_python_alive` flag is set to 0 by an
    atexit handler registered at Cython module init; read by
    `TVMFFIPyDeleteSpace` before `PyGILState_Ensure` to avoid GIL
    acquire after `Py_Finalize` has begun. State-(c) wrapper bytes
    on chandles still alive at that point are intentionally leaked
    -- process is exiting.
  - Frontend-allocation detection by `delete_space` pointer comparison
    (`TVMFFIPyIsCanonical`) avoids a flag bit on `TVMFFIObject`.
    Pre-Python-init chandles (statically-initialized global functions
    in libtvm_ffi.so) carry only the base header; the Python side
    detects this and skips the binding install.

Trade-off on `_move()` semantics. Universal cache-on means callback
args alias the caller's wrapper (one wrapper, one chandle ref) and
function returns of the same chandle alias the caller's wrapper.
`_move()` semantics are kept as an API: the rvalue setter on either
caller or callback side eager-detaches the canonical-wrapper binding
before the C++ AnyViewToOwnedAny transfer nulls the source chandle.
`test_function.py::test_rvalue_ref` refactored for the new
aliasing-aware use_count expectations.

Refcount fix. The `make_ret_object` state-(b) cache hit path had a
redundant `Py_INCREF(cached)` leaking +1 wrapper ref per cache hit
(verified via `sys.getrefcount`). The `<object>(<PyObject*>...)` cast
already INCREFs for the owned reference; `return cached` transfers it
to the caller. Removed the explicit INCREF.

`PyClassDeleter` in `extra/dataclass.cc` and the `__ffi_new__` /
`__ffi_shallow_copy__` paths are routed through the same allocator
registry so Python-defined types share the layout and lifetime
semantics.

`TVMFFIPyArgSetterObjectRValueRef_` clears the source's binding
eagerly before the C++ side nulls its `chandle`; otherwise a
downstream cache lookup on the (still-alive) chandle would briefly
see a stale back-pointer.

Tests: full Python suite passes (2332 passed, 19 skipped, 2 xfailed).
New `tests/python/test_pyobject_tying.py` (32 tests) covers state
b/c, cache-on aliasing, `_move()` under cache-on, pickle stress,
threading stress, GC integration with state (c), multi-chandle
isolation, and the weakref limitation. `test_function.py::test_rvalue_ref`
refactored for cache-on aliasing semantics.

TODO carried in `function.pxi::_get_global_func`: name-keyed dict
cache for static-init Function id-stability. Most registry-resident
Functions are allocated at C++ static init (before the Python
allocator is registered) so their chandle has only the base header --
make_ret_object's cache can't reach them. Deferred to keep this
change small.

Report URL: https://github.com/apache/tvm-ffi/actions/runs/26362014999

With regards,
GitHub Actions via GitBox


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to