Following on from the email thread "Rust sync meeting", I would like to
start a new discussion about moving the Rust components out to new GitHub
repositories and using a new process for issues and release management.
I have started a Google document [1] that lays out the details and tracks
the work required for this effort, but I will summarize the key points of
the proposal here:
- Move existing Rust code into two new repositories
  - apache/arrow-rs
    - Arrow + Parquet crates
  - apache/datafusion
    - DataFusion + Ballista crates (which are expected to merge to some
      degree over time)
    - TPC-H benchmarks
- Use GitHub issues for issue tracking
- Decouple release process
  - Crates are released individually
  - A vote on the source release of the released crate is held over the
    mailing list as usual.
  - Rust does not need to release a new version when the rest of Arrow
    releases; we bundle our latest released crates to the signed tar.
  - Crates can depend on GitHub commit hashes between releases
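
To illustrate the last point, here is a rough sketch of how a crate could
pin its dependencies to a specific commit between releases, using Cargo's
git dependency syntax. It assumes the proposed apache/arrow-rs repository
is available at the URL shown; the revision value is a placeholder and
not a real commit hash.

```toml
# Cargo.toml of a downstream crate (illustrative only)
[dependencies]
# Assumes the proposed apache/arrow-rs repository; replace the placeholder
# with the commit hash you want to pin to.
arrow = { git = "https://github.com/apache/arrow-rs", rev = "<commit-hash>" }
parquet = { git = "https://github.com/apache/arrow-rs", rev = "<commit-hash>" }
```

Once the crates are released, these entries would presumably switch back
to ordinary crates.io version requirements.
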
The Google document may be the best place to collaborate on the proposal,
but I can also update the document based on any comments in this email
thread.
Note that I have excluded discussion about arrow2/parquet2 from this
proposal, and I believe we should cover that separately in a follow-on
discussion.
I look forward to hearing opinions on this both from current Rust
maintainers and contributors and also from the wider Arrow community.
Thanks,
Andy.
[1] https://docs.google.com/document/d/1TyrUP8_UWXqk97a8Hvb1d0UYWigch0HAephIjW7soSI/edit?usp=sharing