alamb opened a new issue, #2679:
URL: https://github.com/apache/arrow-datafusion/issues/2679

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   The tests added in 
https://github.com/apache/arrow-datafusion/pull/2582 to keep Ballista and 
DataFusion in sync add a significant burden to DataFusion development, and I 
propose removing them, at least temporarily.
   
   Here is a description of the process:
   
https://github.com/apache/arrow-datafusion/blob/907504c/.github/pull_request_template.md?plain=1#L31-L47
   
   I think the rationale for the new CI was to add friction on DataFusion API 
changes to encourage a more stable API. However, with the ongoing 
efforts to rework the object store and parquet reading, I think all the 
process does is slow things down. 
   
   The alternative, having Ballista keep up with changes in DataFusion, sounds 
daunting at first, but my firsthand experience suggests it is not that bad. 
Specifically, https://github.com/influxdata/influxdb_iox, my project, uses 
DataFusion similarly to Ballista (as the core query engine) and uses a 
DataFusion pin directly from master. Instead of impinging on the DataFusion 
development process, we keep IOx up to date with DataFusion by [manually 
updating the DataFusion pin in IOx about once a 
week](https://github.com/influxdata/influxdb_iox/pulls?q=is%3Apr+update+datafusion+is%3Aclosed) 
and sorting out any API changes. 
   
   This does take time, but it is mostly mechanical. We do occasionally find 
bugs that were introduced into DataFusion, as happened during our most recent 
update (https://github.com/influxdata/influxdb_iox/pull/4743), and we then 
contribute a fix back upstream (e.g. 
https://github.com/apache/arrow-datafusion/pull/2674).
   
   I would be interested to hear how others keep up with pre-release DataFusion 
as well (maybe @ovr and cube-js?)
   
   **Describe the solution you'd like**
   I propose removing the Ballista CI check in DataFusion 
   
   Specifically this check: 
https://github.com/apache/arrow-datafusion/blob/907504c5aa768601f9d70ad2c8f928bedfa9b069/.github/workflows/rust.yml#L128-L172
   
   I also propose writing up instructions (and maybe even automation) on how 
to upgrade the DataFusion pin in Ballista manually.
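
As a rough illustration of what such automation could look like, here is a minimal sketch that rewrites the `rev` of git-pinned DataFusion dependencies in the text of a `Cargo.toml`. It assumes the pin is written as an inline table with a `rev = "..."` key on one line; the helper name `bump_datafusion_pin` and the sample manifest are hypothetical, not something Ballista or IOx actually ships.

```python
import re
from typing import Tuple

# Assumption: each pin is a single-line inline table, e.g.
#   datafusion = { git = "...", rev = "907504c" }
# Dependencies whose name starts with "datafusion" are matched.
PIN_RE = re.compile(
    r'^(?P<head>\s*datafusion[\w-]*\s*=\s*\{[^}\n]*\brev\s*=\s*")'
    r'(?P<rev>[0-9a-f]+)'
    r'(?P<tail>"[^}\n]*\})',
    re.MULTILINE,
)


def bump_datafusion_pin(cargo_toml: str, new_rev: str) -> Tuple[str, int]:
    """Return the updated manifest text and the number of pins rewritten."""
    def _swap(match):
        # Keep everything around the old revision, replace only the hash.
        return match.group("head") + new_rev + match.group("tail")

    return PIN_RE.subn(_swap, cargo_toml)


if __name__ == "__main__":
    # Illustrative manifest fragment (hypothetical contents).
    manifest = (
        'datafusion = { git = "https://github.com/apache/arrow-datafusion", '
        'rev = "907504c" }\n'
    )
    updated, count = bump_datafusion_pin(manifest, "0123abc")
    print(f"rewrote {count} pin(s)")
    print(updated, end="")
```

After bumping the pin, the remaining (manual) step is building the project and sorting out any API changes, as described above.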
   
   **Describe alternatives you've considered**
   * Do nothing
   * Bring Ballista back into the DataFusion repository
   
   **Additional context**
   The move of Ballista to a new repo is tracked in: 
https://github.com/apache/arrow-datafusion/issues/2502
   
   There are several discussions about this pain:
   * https://github.com/apache/arrow-ballista/pull/48#discussion_r885486298
   
   
   
   cc @andygrove @thinkharderdev @ming535 @Ted-Jiang @xudong963 @tustvold 
@korowa 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

