Thanks for the feedback so far! I have a PR up to fix the segfault issue. https://github.com/apache/datafusion-ballista/pull/1769
On Sun, May 24, 2026 at 2:47 PM Kevin Liu <[email protected]> wrote:
>
> Found an issue with running on MacOS for `ballista==53.0.0` wheels on
> TestPyPI. Tested on linux successfully, passed all the examples from
> python/README.
>
> MacOS Issue:
> Constructing the in-process BallistaScheduler or BallistaExecutor
> segfaults. Reproduced on Python 3.10 and 3.12, with both uv and pip/venv.
> Minimal repro on MacOS:
> ```
> uv run --python 3.10 --with "ballista==53.0.0" \
>   --index-url https://test.pypi.org/simple/ \
>   --extra-index-url https://pypi.org/simple/ \
>   --index-strategy unsafe-best-match \
>   python -X faulthandler -u -c \
>   "from ballista import BallistaScheduler; BallistaScheduler()"
> ```
> Output:
> """
> Fatal Python error: Segmentation fault
>
> Current thread 0x00000001effd1e80 (most recent call first):
>   File "<string>", line 1 in <module>
>
> Extension modules: pyarrow.lib (total: 1)
> """
>
> Linux via docker:
> ```
> docker run --rm --platform linux/amd64 -v "$PWD/python/testdata:/data:ro" python:3.10-slim bash -c '
> pip install --quiet --index-url https://test.pypi.org/simple/ \
>   --extra-index-url https://pypi.org/simple/ "ballista==53.0.0" datafusion &&
> python -u <<PY
> from ballista import BallistaSessionContext, BallistaScheduler, BallistaExecutor
> from datafusion import col, lit
> import time
>
> sched = BallistaScheduler(bind_port=50050); sched.start()
> execu = BallistaExecutor(bind_port=50051, scheduler_port=50050); execu.start()
> time.sleep(3)
>
> ctx = BallistaSessionContext("df://localhost:50050")
> ctx.sql("create external table t stored as parquet location \"/data/test.parquet\"")
> ctx.sql("select * from t limit 5").show()
> ctx.sql("select count(*) as n from t").show()
>
> df = ctx.read_parquet("/data/test.parquet").filter(col("id") > lit(4)).limit(5)
> batches = df.collect()
> print("rows:", sum(b.num_rows for b in batches))
>
> execu.close(); sched.close()
> PY'
> ```
> Outputs:
> """
> DataFrame()
> +----+----------+-------------+--------------+---------+------------+-----------+------------+------------------+------------+---------------------+
> | id | bool_col | tinyint_col | smallint_col | int_col | bigint_col | float_col | double_col | date_string_col  | string_col | timestamp_col       |
> +----+----------+-------------+--------------+---------+------------+-----------+------------+------------------+------------+---------------------+
> | 4  | true     | 0           | 0            | 0       | 0          | 0.0       | 0.0        | 30332f30312f3039 | 30         | 2009-03-01T00:00:00 |
> | 5  | false    | 1           | 1            | 1       | 10         | 1.1       | 10.1       | 30332f30312f3039 | 31         | 2009-03-01T00:01:00 |
> | 6  | true     | 0           | 0            | 0       | 0          | 0.0       | 0.0        | 30342f30312f3039 | 30         | 2009-04-01T00:00:00 |
> | 7  | false    | 1           | 1            | 1       | 10         | 1.1       | 10.1       | 30342f30312f3039 | 31         | 2009-04-01T00:01:00 |
> | 2  | true     | 0           | 0            | 0       | 0          | 0.0       | 0.0        | 30322f30312f3039 | 30         | 2009-02-01T00:00:00 |
> +----+----------+-------------+--------------+---------+------------+-----------+------------+------------------+------------+---------------------+
> DataFrame()
> +---+
> | n |
> +---+
> | 8 |
> +---+
> rows: 3
> """
>
> Best,
> Kevin Liu
>
> On Sun, May 24, 2026 at 7:49 AM Shekhar Rajak <[email protected]> wrote:
>
> > +1 (non-binding) - installed from test.pypi.org and ran the smoke import.
> > $ pip show ballista
> > python -c "import ballista; print(ballista.__version__)"
> > Name: ballista
> > Version: 53.0.0
> > Summary: Python client for Apache Arrow Ballista Distributed SQL Query Engine
> > Home-page: https://datafusion.apache.org/ballista/
> > Author:
> > Author-email:
> > License:
> > Location: /private/tmp/ballista-rc-verify/lib/python3.13/site-packages
> > Requires: datafusion, pyarrow
> > Required-by: 53.0.0
> > Result: `from ballista import BallistaSessionContext; print('ok')` -> ok
> >
> > Regards,
> > Shekharrajak
> >
> > On Sunday 24 May 2026 at 06:53:18 am GMT+5:30, Andy Grove <[email protected]> wrote:
> >
> > I have published a test version of Ballista to test.pypi.org [1] and I
> > am looking for help testing this.
> >
> > Instructions for installing Ballista from test.pypi.org can be found
> > in the release verification documentation [2].
> >
> > Please note that this is NOT an official Apache release. This is a
> > test of the new PyPi publishing process.
> >
> > This release was built from GitHub tag 53.0.0-rc1-pypitest-3.
> >
> > I plan on creating an official 53.x.x release to PyPi pretty soon,
> > once I have feedback from this test.
> >
> > Thanks,
> >
> > Andy.
> >
> > [1] https://test.pypi.org/project/ballista/
> > [2] https://github.com/apache/datafusion-ballista/blob/main/dev/release/README.md#optional-verify-the-python-wheels-from-testpypi
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
