alamb opened a new issue, #2679: URL: https://github.com/apache/arrow-datafusion/issues/2679
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

The tests added in https://github.com/apache/arrow-datafusion/pull/2582 to keep Ballista and DataFusion in sync add a significant burden to DataFusion development, and I propose removing them, at least temporarily.

Here is a description of the process: https://github.com/apache/arrow-datafusion/blob/907504c/.github/pull_request_template.md?plain=1#L31-L47

I think the rationale for the new CI was to add friction to DataFusion API changes in order to encourage a more stable API. However, with the ongoing efforts to rework the object store and parquet reading, I think all the process is doing is slowing things down.

The alternative, having Ballista keep up with changes in DataFusion, sounds daunting at first, but my firsthand experience suggests it is not that bad.

Specifically, https://github.com/influxdata/influxdb_iox, my project, uses DataFusion similarly to Ballista (as the core query engine) and pins DataFusion directly from master. Instead of impinging on the DataFusion development process, we keep IOx up to date with DataFusion by [manually updating the DataFusion pin in IOx about once a week](https://github.com/influxdata/influxdb_iox/pulls?q=is%3Apr+update+datafusion+is%3Aclosed) and sorting out any API changes. This does take time, but it is mostly mechanical.

We do occasionally find bugs that were introduced into DataFusion, such as in our most recent update, https://github.com/influxdata/influxdb_iox/pull/4743, and we then contribute a fix back upstream (e.g. https://github.com/apache/arrow-datafusion/pull/2674).

I would be interested to hear how others keep up with pre-release DataFusion as well (maybe @ovr and cube-js?)
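
To illustrate how mechanical the pin update itself can be, here is a minimal sketch of the text edit involved. It assumes the downstream project pins DataFusion with a `git` / `rev` dependency line in its `Cargo.toml`; the helper names `bump_pin` and `replace_rev` and the `OLD_REV` / `NEW_REV` placeholders are hypothetical and not part of any existing tooling. Sorting out the resulting API changes remains manual work.

```rust
/// Rewrite the `rev = "..."` value on every manifest line that declares `dep`.
/// Lines that do not declare `dep`, or that carry no `rev` key, stay unchanged.
/// This is plain string matching, not a TOML parser, and the trailing newline
/// of the input is not preserved.
fn bump_pin(manifest: &str, dep: &str, new_rev: &str) -> String {
    manifest
        .lines()
        .map(|line| {
            // Match `dep = ...` but not `dep-something = ...`.
            let declares_dep = line
                .trim_start()
                .strip_prefix(dep)
                .map_or(false, |rest| rest.trim_start().starts_with('='));
            if !declares_dep {
                return line.to_string();
            }
            replace_rev(line, new_rev).unwrap_or_else(|| line.to_string())
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Replace the quoted value that follows `rev = "` in a single line, if present.
fn replace_rev(line: &str, new_rev: &str) -> Option<String> {
    let key = "rev = \"";
    let start = line.find(key)? + key.len();
    let end = start + line[start..].find('"')?;
    Some(format!("{}{}{}", &line[..start], new_rev, &line[end..]))
}

fn main() {
    // Hypothetical manifest fragment; the revisions are placeholders.
    let manifest = r#"[dependencies]
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", rev = "OLD_REV" }
tokio = "1""#;
    println!("{}", bump_pin(manifest, "datafusion", "NEW_REV"));
}
```

A scheduled job could run something like this, open a pull request with the new revision, and leave any resulting compile errors for a person to fix.
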

**Describe the solution you'd like**

I propose removing the Ballista CI check in DataFusion. Specifically, this check: https://github.com/apache/arrow-datafusion/blob/907504c5aa768601f9d70ad2c8f928bedfa9b069/.github/workflows/rust.yml#L128-L172

I also propose writing up instructions (maybe even automation) on how to upgrade the DataFusion pin in Ballista manually.

**Describe alternatives you've considered**

* Do nothing
* Bring Ballista back into the DataFusion repository

**Additional context**

The move of Ballista to a new repo is tracked in: https://github.com/apache/arrow-datafusion/issues/2502

There are several discussions about this pain:

* https://github.com/apache/arrow-ballista/pull/48#discussion_r885486298

cc @andygrove @thinkharderdev @ming535 @Ted-Jiang @xudong963 @tustvold @korowa

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
