andygrove opened a new pull request, #1769:
URL: https://github.com/apache/datafusion-ballista/pull/1769
# Which issue does this PR close?
Closes #.
# Rationale for this change
The published `ballista==53.0.0` cp310-abi3 macOS wheel on TestPyPI
segfaults the moment Python constructs a `BallistaScheduler` or
`BallistaExecutor`:
```
uv run --python 3.10 --with "ballista==53.0.0" \
  --index-url https://test.pypi.org/simple/ \
  --extra-index-url https://pypi.org/simple/ \
  --index-strategy unsafe-best-match \
  python -X faulthandler -u -c \
  "from ballista import BallistaScheduler; BallistaScheduler()"
```
Output:
```
Fatal Python error: Segmentation fault
Current thread 0x00000001effd1e80 (most recent call first):
  File "<string>", line 1 in <module>
Extension modules: pyarrow.lib (total: 1)
```
The exact same API works on Linux (manylinux x86_64 wheel via Docker).
## Root cause
LLDB shows ~104,500 frames of the same symbol at offset `+696` before
`_PyType_call`. The instruction at that offset is a `bl` back to the function's
own entry (a self-call), and the binary's strings include `mimalloc: warning:`
and `aligned allocation request is too large (size %zu, alignment %zu)`, among
others. The crash is `libmimalloc` recursing into itself during the first
`malloc` Python issues after the extension loads.
On macOS, `libmimalloc` installs a static constructor that registers itself
as a malloc zone, so every `malloc` in the process (including Python's
`PyObject_Malloc`) is intercepted. On Linux that auto-interposition does not
happen, so the linked mimalloc code is dead unless explicitly declared as
`#[global_allocator]`, which the Python wheel never does. Hence the platform
asymmetry.
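As an illustration of why the linked allocator is inert on Linux, the sketch below shows that a Rust allocator only takes effect once a crate declares it with `#[global_allocator]`. `CountingAlloc`, `ALLOC_CALLS`, and `allocations_during` are hypothetical names for this example; the type is a std-only stand-in that forwards to the system allocator and is not mimalloc's actual implementation.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for an allocator crate such as mimalloc:
/// forwards to the system allocator and counts `alloc` calls.
struct CountingAlloc;

static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// The explicit opt-in: only this attribute routes Rust heap
// allocations in the final binary through `CountingAlloc`.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns how many `alloc` calls were observed while `f` ran.
fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    f();
    ALLOC_CALLS.load(Ordering::Relaxed) - before
}

fn main() {
    let n = allocations_during(|| {
        let v: Vec<u64> = Vec::with_capacity(1024);
        // Keep the optimizer from eliding the unused allocation.
        black_box(v);
    });
    println!("allocations routed through GLOBAL: {n}");
}
```

Removing the `#[global_allocator]` static leaves `CountingAlloc` compiled in but never called, which is the situation the wheel is in on Linux. The macOS malloc-zone registration described above happens outside this Rust mechanism, so this sketch does not reproduce it.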
`mimalloc` was reaching the wheel through two independent paths:
```
pyballista -> ballista (default = standalone)
                -> ballista-executor (default features include `mimalloc`)
pyballista -> datafusion-python
                (default features include `mimalloc`)
```
Cutting either alone is not enough; both must be cut.
# What changes are included in this PR?
* `ballista/client/Cargo.toml`: depend on `ballista-executor` and
`ballista-scheduler` with `default-features = false`. The `standalone` feature
only needs the in-process constructors. `arrow-ipc-optimizations` is re-enabled
on the executor dependency so the in-process executor keeps its IPC read
performance optimization.
* `ballista/executor/Cargo.toml`: move `mimalloc` from the crate's default
feature set into the `build-binary` feature, since `#[global_allocator]` is
only set in `src/bin/main.rs`. The `ballista-executor` binary still pulls in
mimalloc via `cargo build` defaults.
* `python/Cargo.toml`: depend on `datafusion-python` with
`default-features = false`.
* `python/Cargo.lock`: regenerated.
After the change, `cargo tree -p pyballista --invert mimalloc` returns *no
match*. A locally rebuilt cp310-abi3 macOS wheel constructs and starts both
`BallistaScheduler` and `BallistaExecutor`, runs `CREATE EXTERNAL TABLE` +
`SELECT COUNT(*)` end-to-end, and exits cleanly. `cargo check` passes for the
default features, `--no-default-features`, `ballista` with
`--features standalone`, and the `ballista-executor` binary build.
# Are there any user-facing changes?
No public API change. The Python wheel will no longer link `libmimalloc`, so
memory allocation in the Python process stays on the system `malloc`.
Users running `ballista-executor` as a binary continue to use mimalloc as the
global allocator.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]