The current Ballista Python bindings [1] were created by cloning the DataFusion Python bindings and then making some modifications. The resulting codebase proved challenging to maintain and has not been updated in almost a year. This repository contains around 1,100 lines of Rust code.

I propose that we archive this repository and adopt a new Python client that only exposes SQL capabilities rather than providing both SQL and DataFrame APIs. I have a PR [2] up for a new client, which contains only 75 lines of Rust code. This new client uses the datafusion-python crate as a dependency rather than duplicating code.

My hope is that this much leaner implementation will be easier to maintain and keep up to date with Ballista releases. We can add the DataFrame API in the future as a thin wrapper around the datafusion-python dependency if the project gains enough traction.

If there are no objections, I will go ahead and archive the old repository in the next week or two (and update the README to point to the new client).

Thanks,

Andy.

[1] https://github.com/apache/arrow-ballista-python
[2] https://github.com/apache/arrow-ballista/pull/970