jorgecarleitao commented on issue #1176: URL: https://github.com/apache/arrow-rs/issues/1176#issuecomment-1430883886
> I would just like to get away from this situation where we have two concurrent projects. [...]

I agree that the situation is not productive. I am sorry that I caused frustration to people here.

> Whilst I do not like the idea of porting stuff across, and yes it would be an annoying use of time, I am willing to contribute to such an effort if it sees an end to this situation.

I am also willing to contribute to such an effort.

What do you think about something to the effect of:

* Arrow2 is donated to Apache Arrow and its development ceases in jorgecarleitao/arrow2.
* The core of arrow2 (`array/`, `bitmap/`, `offsets.rs`, `types/`) is lifted to a crate living in this repo (e.g. `arrow-core` or something).
* The core receives the relevant methods from arrow-rs; methods existing in arrow-rs are added as "deprecated" to give arrow-rs users time to migrate.
* The FFI of arrow2 is moved to a separate crate and replaces arrow-rs' one.
* Arrow-rs' compute is migrated to use `arrow-core`.
* Arrow-rs' IO except IPC is migrated to use `arrow-core`.
* Arrow-rs' IPC IO is replaced by arrow2's implementation with the necessary adjustments.
* `arrow-core` will add `RunEndArray` (this is missing there atm).
* `RecordBatch` (arrow-rs) and `Chunk` (arrow2) co-exist to give room for both communities to use them (in core or somewhere else).
* Development and governance follow the community-driven model of Apache and this repo.

This could result in the following changes to arrow-rs:

* `ArrayData` is removed.
* It becomes interoperable with `Vec` but is no longer aligned with cache lines.
* IPC support is improved with e.g. `unsafe`-free code, big endian support, and mmap of IPC files.
* There is code churn related to `from` vs `from_slice` (we can switch to arrow-rs names).
* Slicing of arrays becomes easier (handling of offsets).

It would also end the arrow-arrow2 split, e.g. removing the unproductive discussions around "which is better", and combine development efforts.

Some challenges:

* Arrow2 uses `Box<dyn Array>` as children to allow easy mutation; arrow-rs uses `Arc<dyn Array>`.
* Arrow2 has neither `TimestampArray` nor `DecimalArray`, and instead sticks to the physical types only.
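
To illustrate the `Box<dyn Array>` vs `Arc<dyn Array>` point, here is a minimal sketch using only the standard library. The `Array` trait, `Int32Array` struct, and the two `grow_*` functions are hypothetical stand-ins invented for this example; they are not the arrow2 or arrow-rs definitions.

```rust
use std::sync::Arc;

// Hypothetical stand-in for an array trait (NOT the arrow2 / arrow-rs trait).
trait Array {
    fn len(&self) -> usize;
    fn push_null(&mut self);
}

// Hypothetical concrete array holding optional 32-bit integers.
struct Int32Array {
    values: Vec<Option<i32>>,
}

impl Array for Int32Array {
    fn len(&self) -> usize {
        self.values.len()
    }

    fn push_null(&mut self) {
        self.values.push(None);
    }
}

// A boxed child is uniquely owned, so it can always be mutated in place.
fn grow_boxed(child: &mut Box<dyn Array>) {
    child.push_null();
}

// A shared child can only be mutated in place when no other `Arc` (or `Weak`)
// points to it; otherwise `Arc::get_mut` returns `None` and nothing changes.
fn grow_shared(child: &mut Arc<dyn Array>) -> bool {
    match Arc::get_mut(child) {
        Some(array) => {
            array.push_null();
            true
        }
        None => false,
    }
}
```

With `Box`, mutation always succeeds but sharing a child requires a deep copy; with `Arc`, sharing is a cheap reference-count bump but in-place mutation needs a uniqueness check (or a copy). A merged crate would presumably have to pick one of these trade-offs for nested arrays.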
