Here is the report that was submitted

## Description:
The mission of Apache Arrow is the creation and maintenance of software related to columnar in-memory processing and data interchange.

## Issues:
The lack of an ASF-sponsored, invite-free chat service is a minor source of friction for community building. Most subprojects now use GitHub for tickets to lower the barrier to entry for new / casual contributors, but we still have fragmented stories for group chat. ASF Slack requires an invite and some sub-communities use other chat-like services.

## Membership Data:
Apache Arrow was founded 2016-01-20 (7 years ago)
There are currently 89 committers and 45 PMC members in this project.
The Committer-to-PMC ratio is roughly 2:1.

Community changes, past quarter:
- Kun Liu was added to the PMC on 2022-11-13
- Jacob Quinn was added to the PMC on 2022-10-25
- Nicola Crane was added to the PMC on 2022-10-25
- Jacob Wujciak was added as committer on 2022-12-19
- Ben Baumgold was added as committer on 2022-10-26
- Bogumił Kamiński was added as committer on 2022-10-24
- Eric Hanson was added as committer on 2022-10-26
- Jie Wen was added as committer on 2023-01-08
- Jarrett Revels was added as committer on 2022-11-02
- Curtis Vogt was added as committer on 2022-11-02
- Raúl Cumplido was added as committer on 2022-12-05
- Will Jones was added as committer on 2022-10-28
- Yang Jiang was added as committer on 2022-11-02

## Project Activity:
* Switching from JIRA to GitHub issues in order to keep the overhead for new contributors low (no need to register for an ASF JIRA account)
* [ADBC] (Arrow Database Connectivity) first release
* Community voted to add RLE to the specification
* Additional subproject updates are below
* We continue to ship several releases across different products each quarter

[ADBC]: https://arrow.apache.org/blog/2023/01/05/introducing-arrow-adbc/

Recent releases:
ADBC-0.1.0 was released on 2023-01-10.
RS-30.0.1 was released on 2023-01-08.
RS-OS-0.5.3 was released on 2023-01-08.
RS-30.0.0 was released on 2023-01-03.
RS-29.0.0 was released on 2022-12-12.
RS-OS-0.5.2 was released on 2022-12-07.
RS-DATAFUSION-15.0.0 was released on 2022-12-05.
DATAFUSION-PYTHON-0.7.0 was released on 2022-11-29.
RS-28.0.0 was released on 2022-11-28.
10.0.1 was released on 2022-11-22.
RS-BALLISTA-0.10.0 was released on 2022-11-21.
JULIA-2.4.1 was released on 2022-11-18.
RS-27.0.0 was released on 2022-11-15.
RS-DATAFUSION-14.0.0 was released on 2022-11-07.
RS-26.0.0 was released on 2022-11-03.
10.0.0 was released on 2022-10-26.
JULIA-2.4.0 was released on 2022-10-26.
RS-BALLISTA-0.9.0 was released on 2022-10-26.
RS-25.0.0 was released on 2022-10-17.

## Community Health:
The community health appears good; discussions on the mailing lists and GitHub are productive.

We recently had a nice discussion on the State of the Project: https://lists.apache.org/thread/r8gl3wvjgy9k8n2t194r0bbdbxx6ksqc and discussed various ways to keep encouraging the community.

## Language Area Updates

Arrow has at least 12 different language bindings, as explained in https://arrow.apache.org/overview/

Arrow 10.0.0 release: https://arrow.apache.org/blog/2022/10/31/10.0.0-release/

### C++

### C#

### Go
We’re seeing significant increases in interest and usage of the Arrow Go library. It is used by startups like Spice.AI and has been incorporated into Google BigQuery’s quickstart example, among others. 2022 was a big year of updates, fixes, and drumming up interest for the Go module, and we hope to continue that momentum to increase adoption and usage.

The Go module, along with C++, serves as the initial implementation of the Run-End Encoding array.

Future development plans are to continue to expand the compute capabilities of the Go module and extend integration with Substrait.

### Java

### JavaScript

### Julia
We’ve worked again on simplifying and streamlining the administrative side for the Julia implementation: adding additional committers, simplifying the release process, etc. This has increased the rate of contributions, as expected. There’s interest in finishing the C data/stream interfaces for the Julia implementation soon.

### Rust
Rust has several projects:
* arrow-rs (arrow, parquet, arrow-flight, object_store implementations)
* arrow-datafusion: Rust query engine
* arrow-ballista: distributed query engine

We are working to incorporate Substrait into DataFusion.

We are working on external communication, with several blog posts about the technology: "Fast and Memory Efficient Multi-Column Sorts in Apache Arrow Rust, Part 1" (on sorting) and "Querying Parquet with Millisecond Latency".

We also continue the calendar-based release train, with good results.

### C (GLib)
We’ve added support for 16-bit float type.

### MATLAB
1. We have been focusing development efforts on implementing an "object dispatch layer" that uses MEX to "connect" MATLAB objects with corresponding C++ objects. This code is being actively developed at github.com/mathworks/libmexclass. See the following Arrow mailing list discussion for more context. We hope to upstream the changes needed to make the MATLAB Interface to Arrow use libmexclass under the hood in the coming months. This should enable the MATLAB interface to wrap relevant Arrow C++ objects (e.g. arrow::Array, arrow::Table) and expose them to MATLAB.
2. We have been continuing to investigate Windows CI support for the MATLAB interface. Currently, only Linux and macOS are supported.

### Python

### R

### Ruby
We’ve added support for 16-bit float type.

A new contributor is developing a data frame library based on the Ruby bindings and has upstreamed some improvements first implemented in that downstream library.

On Sun, Jan 8, 2023 at 10:09 PM Andrew Lamb <al...@influxdata.com> wrote:

> Thank you Kevin.
>
> As a reminder to anyone else who may be interested in contributing I plan
> to submit this report in 2 days or so on Jan 11
>
> Andrew
>
> On Fri, Jan 6, 2023 at 9:05 PM Kevin Gurney <kgur...@mathworks.com> wrote:
>
>> Sreehari, Fiona, and I added a few notes about progress on the MATLAB
>> interface.
>>
>> Best Regards,
>>
>> Kevin Gurney
>> ------------------------------
>> *From:* Andrew Lamb <al...@influxdata.com>
>> *Sent:* Wednesday, January 4, 2023 7:24 PM
>> *To:* u...@arrow.apache.org <u...@arrow.apache.org>; dev <dev@arrow.apache.org>
>> *Subject:* Re: Apache Arrow Board Report, by Jan 11 2023
>>
>> Thank you Jacob and Matthew -- the level of detail in your suggestions
>> looks just about perfect. 🙇‍♂️
>>
>> On Wed, Jan 4, 2023 at 12:20 PM Jacob Quinn <quinn.jac...@gmail.com> wrote:
>>
>> > I added a few notes on the Julia implementation.
>> >
>> > -Jacob
>> >
>> > On Tue, Dec 27, 2022 at 2:45 PM Andrew Lamb <al...@influxdata.com> wrote:
>> >
>> >> Hello Arrow Community,
>> >>
>> >> One of the (possibly the only) responsibilities of the PMC chair is to
>> >> collect information on the project and submit quarterly updates to the ASF
>> >> board. The next one is due on January 11, 2023
>> >>
>> >> Historically[1], Arrow has crowd sourced the content and I plan to
>> >> continue the tradition.
>> >>
>> >> Please feel free to add your comments directly to [2] or reply to this
>> >> email and I will incorporate your comments.
>> >>
>> >> I think it would be especially interesting if anyone from the following
>> >> implementation communities wanted to provide any updates:
>> >>
>> >> ### C++
>> >> ### C#
>> >> ### Go
>> >> ### Java
>> >> ### JavaScript
>> >> ### Julia
>> >> ### Rust
>> >> ### C (GLib)
>> >> ### MATLAB
>> >> ### Python
>> >> ### R
>> >> ### Ruby
>> >>
>> >> Thank you,
>> >> Andrew
>> >>
>> >> [1] https://lists.apache.org/thread/w7lwr7t979oqsqb8qz4smtg9wmj9f48s
>> >> [2] https://docs.google.com/document/d/12ybofzyB8FGlsWV6IxefAUqRDgxFR-Kk53kPk7LuKDY/edit?usp=sharing