Thanks to everyone who contributed to the board update on such short
notice.  I always enjoy reading what the rest of the project has been up to
-- if anyone else is interested, the final report that was submitted can be
found below.

Thanks again and have a nice weekend,
Andrew




## Description:
The mission of Apache Arrow is the creation and maintenance of software
related to columnar in-memory processing and data interchange. More
information can be found at https://arrow.apache.org/overview/

## Project Status:

Current project status: Ongoing (high activity)

Issues for the board: None

## Membership Data:
Apache Arrow was founded 2016-01-19 (7 years ago)
There are currently 97 committers and 50 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:4.

Community changes, past quarter:
- Ben Baumgold was added to the PMC on 2023-06-19
- Jie Wen was added to the PMC on 2023-06-10
- Dewey Dunnington was added to the PMC on 2023-06-22
- Matthew Topol was added to the PMC on 2023-05-02
- Gang Wu was added as committer on 2023-05-15
- Kevin Gurney was added as committer on 2023-07-04
- Marco Neumann was added as committer on 2023-05-11
- Mehmet Ozan Kabak was added as committer on 2023-06-10
- Ruihang Xia was added as committer on 2023-04-15

## Project Activity:

There has been healthy debate about adding new formats, [StringArray] and
[ListView], focused on increasing Arrow’s appeal in high-performance
computation engines.

We have completed the transition from JIRA to using GitHub issues for the
mono repo and that appears to be going well.

The DataFusion subproject is considering applying to become its own top
level Apache project (see DataFusion update below).

[StringArray]: https://lists.apache.org/thread/c6frlr9gcxy8qdhbmv8cn3rdjbrqxb1v
[ListView]: https://lists.apache.org/thread/r28rw5n39jwtvn08oljl09d4q2c1ysvb


## Sub Project Updates
Arrow has several subprojects, as listed on https://arrow.apache.org/

### ADBC

We have released 2 new minor versions. They include new drivers and new
implementations.

### Arrow Flight

We have added new features to the Arrow Flight specification:

1. Ordered data support: https://github.com/apache/arrow/issues/34852
2. Resultset expiration support: https://github.com/apache/arrow/issues/35500

### Arrow Flight SQL

We have updated the Arrow Flight SQL specifications based on the above Arrow
Flight update.

### DataFusion

DataFusion continues to grow and mature. The community added many new
features, as described in the latest [blog] post, came to consensus on the
[goals] of the project, and is discussing a [move to its own top level
Apache project]. Current development focus is on performance and adding
better support for structured types such as Lists and Structs. We expect
more work on improving documentation and communicating externally over the
next quarter.

[blog]: https://arrow.apache.org/blog/2023/06/24/datafusion-25.0.0/
[goals]: https://github.com/apache/arrow-datafusion/discussions/6441
[move to its own top level Apache project]: https://github.com/apache/arrow-datafusion/discussions/6475


## Language Area Updates

Arrow has at least 12 different language implementations, as explained in
https://arrow.apache.org/overview/

Arrow 12.0.0 was released from the monorepo:
https://arrow.apache.org/blog/2023/05/02/12.0.0-release/

### C++

PRs have been created with example implementations of two new layouts, Array
View and String View.  These layouts are motivated by Arrow-compatible
engines that have found them to be more efficient for their workflows.

As mentioned in the previous report, the C++ compute engine Acero was broken
out into a separate module and Arrow-C++ can now be built without it,
allowing for more modular feature configuration.

### C#

C# now has a complete implementation of the C data interface, allowing for
efficient intra-process communication between C# and other languages.  In
addition, there has been some early discussion

### Go

PRs were created with an example implementation of StringView for Go, to
serve as the second implementation so that the layout can be voted on.
Changes were introduced to improve compatibility with x86 (32-bit) systems
and with TinyGo builds targeting WebAssembly, along with corresponding CI
builds.

A default Arrow Flight middleware was added for handling Cookies via gRPC
headers.

Usage of the Go implementation continues to grow and expand in the
community.

### Java

Ongoing maintenance of the Arrow Java implementation remains steady.

### JavaScript

### Julia
We have released new versions promptly whenever a problem was fixed.

A new PMC member who focuses on Julia has joined. There are now 2 PMC
members who focus on Julia.

### nanoarrow

The 0.2.0 release of nanoarrow featured support for decoding the Arrow IPC
format and included a number of interface improvements and bugfixes
resulting from early usage. Ongoing work includes support for non-CPU data
via the Arrow C Device interface and documentation improvements suggested by
early users of the library.


### Rust

The Rust implementation has been focused on improving the UX of the API and
the speed, consistency, and correctness (timezones!) of the kernels.

### C (GLib)

We have continued to add new bindings, as usual.

### MATLAB

A new committer who focuses on MATLAB has joined, the first committer with
that focus. We’ll expand the MATLAB community.

We integrated support for mathworks/libmexclass, enabling streamlined
development of the MATLAB interface. As a result, significant progress has
been made on public MATLAB APIs, including support for Array and RecordBatch
construction from equivalent MATLAB types (e.g. table).

We recently merged Windows and ccache CI support, bridging the platform gap
for MATLAB qualification. This will help ensure quality of PRs and improve
developer confidence when making changes.

Next steps for the MATLAB interface include working on compound / nested
data types and tabular file I/O workflows.

### Python

The Python community is embracing “protocols”, which allow for
library-agnostic interchange and duck-typing.  PyArrow has added support for
the dataframe interchange protocol, which maps to PyArrow’s Table class.  In
addition, some early discussion has begun around a dataset protocol based on
PyArrow’s datasets API.

### R

The R bindings now support JSON Datasets and continue to benefit from
ongoing performance enhancements and feature additions in the C++ library.

### Ruby

The number of Ruby-related questions and issue reports has increased, which
suggests that the user base of the Ruby bindings is growing.

### Swift

We have started implementing Arrow Flight.

## Community Health:
Community communication continues to be strong.

There have been 9 blog posts published to https://arrow.apache.org/blog/ in
the last 3 months, including two from community members on their use of
Arrow.

The mailing lists are active:

* dev@arrow.apache.org had a 10% decrease in traffic in the past quarter
  (779 emails compared to 858)
* j...@arrow.apache.org had a 100% decrease in traffic in the past quarter
  (0 emails compared to 10778)

For the mono repo:

* 2275 commits in the past quarter (5% increase)
* 254 code contributors in the past quarter (1% increase)
* 1986 PRs opened on GitHub, past quarter (-6% change)
* 1954 PRs closed on GitHub, past quarter (-11% change)
* 1573 issues opened on GitHub, past quarter (-11% change)
* 1342 issues closed on GitHub, past quarter (-5% change)




On Thu, Jul 13, 2023 at 1:28 PM Kevin Gurney <kgur...@mathworks.com> wrote:

> Hi All,
>
> Thanks for putting this together, Andrew!
>
> Sarah, Fiona, and I added some notes about the MATLAB interface.
>
> Best Regards,
>
> Kevin Gurney
> ________________________________
> From: Sutou Kouhei <k...@clear-code.com>
> Sent: Wednesday, July 12, 2023 9:36 PM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Subject: Re: [CROWDSOURCING] Board Report -- 2 DAYS -- Please provide
> feedback
>
> Hi,
>
> Thanks! I've added something.
>
> --
> kou
>
> In <CAFhtnRxMnFoX5EJ8_7P9XtrmPT7n8AgPSkGTS7cju=vzyqb...@mail.gmail.com>
> "Re: [CROWDSOURCING] Board Report -- 2 DAYS -- Please provide feedback" on
> Wed, 12 Jul 2023 16:32:23 -0400,
> Andrew Lamb <al...@influxdata.com> wrote:
>
> > I apologize, I sent the link out for the last board report
> >
> > This correct link is [1]
> >
> > [1]
> >
> https://docs.google.com/document/d/1-VRSKq6xeBdg8uvZPLk-aMW8XzuwI4-AEnUSK1vssDQ/edit#heading=h.gv1c2bcucuam
> <
> https://docs.google.com/document/d/1-VRSKq6xeBdg8uvZPLk-aMW8XzuwI4-AEnUSK1vssDQ/edit#heading=h.gv1c2bcucuam
> >
> >
> > On Wed, Jul 12, 2023 at 5:49 AM Andrew Lamb <al...@influxdata.com>
> wrote:
> >
> >> Hello Arrow Community,
> >>
> >> TLDR: Please add any comments or board content directly to [2] or reply
> to
> >> this email and I will incorporate your comments. You can see what we
> >> currently have at the end of this email.
> >>
> >> In an epic scheduling fail, I forgot to organize this report a few weeks
> >> ago, so now the deadline is tight.
> >>
> >> One of the responsibilities of being part of the Apache Software
> Foundation
> >> (ASF) is to regularly summarize the state of the project in a quarterly
> >> update to the ASF board. I plan to submit the next report on July 14,
> 2023
> >> (in 2 days time -- I am sorry for the late notice)
> >>
> >> Historically[1], Arrow has crowd sourced the content which has worked
> >> well. While this is partly an administrative reporting exercise, I
> think it
> >> is also valuable to reflect on the past and think about goals for the
> >> future.
> >>
> >> It would be especially interesting if anyone from the various language
> >> implementation communities could provide an update of a sentence or two.
> >>
> >> Andrew
> >>
> >> [1]: https://lists.apache.org/thread/xg7pgj4stt4l2sblyt81y9s6h0cl8hw5<
> https://lists.apache.org/thread/xg7pgj4stt4l2sblyt81y9s6h0cl8hw5>
> >>
> >> [2]:
> >>
> >>
> https://docs.google.com/document/d/13FSDydEVXT2UUFdy4XKjVKNJW-WR8ylvG3aI6lD-dNI/edit#
> <
> https://docs.google.com/document/d/13FSDydEVXT2UUFdy4XKjVKNJW-WR8ylvG3aI6lD-dNI/edit#
> >
> >>
> >>
> >>
> >> ## Description:
> >> The mission of Apache Arrow is the creation and maintenance of software
> >> related
> >> to columnar in-memory processing and data interchange. More information
> >> can be found at https://arrow.apache.org/overview/<
> https://arrow.apache.org/overview>
> >>
> >> ## Issues:
> >>
> >>
> >> ## Membership Data:
> >> Apache Arrow was founded 2016-01-19 (7 years ago)
> >> There are currently 97 committers and 50 PMC members in this project.
> >> The Committer-to-PMC ratio is roughly 7:4.
> >>
> >> Community changes, past quarter:
> >> - Ben Baumgold was added to the PMC on 2023-06-19
> >> - Jie Wen was added to the PMC on 2023-06-10
> >> - Dewey Dunnington was added to the PMC on 2023-06-22
> >> - Matthew Topol was added to the PMC on 2023-05-02
> >> - Gang Wu was added as committer on 2023-05-15
> >> - Kevin Gurney was added as committer on 2023-07-04
> >> - Marco Neumann was added as committer on 2023-05-11
> >> - Mehmet Ozan Kabak was added as committer on 2023-06-10
> >> - Ruihang Xia was added as committer on 2023-04-15
> >>
> >>
> >>
> >> ## Project Activity:
> >>
> >> There has been healthy debate about adding new formats, [StringArray]
> and
> >> [ListView], focused on increasing Arrow’s appeal in high performance
> >> computation engines.
> >> We have completed the transition from JIRA to using Github issues for
> the
> >> mono repo and that appears to be going well.
> >>
> >> The DataFusion subproject is considering applying to become its own top
> >> level Apache project (see DataFusion update below)
> >> [StringArray]:
> >> https://lists.apache.org/thread/c6frlr9gcxy8qdhbmv8cn3rdjbrqxb1v<
> https://lists.apache.org/thread/c6frlr9gcxy8qdhbmv8cn3rdjbrqxb1v>
> >> [ListView]:
> >> https://lists.apache.org/thread/r28rw5n39jwtvn08oljl09d4q2c1ysvb<
> https://lists.apache.org/thread/r28rw5n39jwtvn08oljl09d4q2c1ysvb>
> >>
> >>
> >>
> >> ## Community Health:
> >>
> >>
> >> There have been 9 blog posts published to
> https://arrow.apache.org/blog/<https://arrow.apache.org/blog>
> >> in the last 3 months, including two from community members on their use
> of
> >> Arrow
> >>
> >>
> >> ## Sub Project Updates
> >> Arrow has several subprojects, as listed on https://arrow.apache.org/<
> https://arrow.apache.org>
> >>
> >> ### ADBC
> >>
> >> ### Arrow Flight
> >>
> >> ### Arrow Flight SQL
> >>
> >> ### DataFusion
> >>
> >> DataFusion continues to grow and mature. The community added many new
> >> features as described in the latest [blog] post, and discussed and came
> to
> >> consensus on the [goals] of the project and is discussing a [move to its
> >> own top level Apache project]. Current development focus is on
> performance
> >> and adding better support for structured types such as LIsts and
> Structs.
> >> We expect more work on improving documentation and communicating
> externally
> >> over the next quarter.
> >>
> >> [blog]: https://arrow.apache.org/blog/2023/06/24/datafusion-25.0.0/<
> https://arrow.apache.org/blog/2023/06/24/datafusion-25.0.0>
> >> [goals]: https://github.com/apache/arrow-datafusion/discussions/6441<
> https://github.com/apache/arrow-datafusion/discussions/6441>
> >> [move to its own top level Apache project]:
> >> https://github.com/apache/arrow-datafusion/discussions/6475<
> https://github.com/apache/arrow-datafusion/discussions/6475>
> >>
> >>
> >> ## Language Area Updates
> >>
> >>
> >> Arrow has at least 12 different language implementations, as explained
> in
> >> https://arrow.apache.org/overview/<https://arrow.apache.org/overview/>
> >>
> >> Arrow 12.0.0 was released from the monorepo:
> >> https://arrow.apache.org/blog/2023/05/02/12.0.0-release/<
> https://arrow.apache.org/blog/2023/05/02/12.0.0-release>
> >>
> >>
> >>
> >> ### C++
> >>
> >>
> >>
> >> ### C#
> >>
> >>
> >>
> >> ### Go
> >>
> >>
> >> ### Java
> >>
> >>
> >>
> >>
> >> ### JavaScript
> >>
> >> ### Julia
> >>
> >> ### nanoarrow
> >>
> >>
> >>
> >> ### Rust
> >>
> >>
> >> ### C (GLib)
> >>
> >>
> >> ### MATLAB
> >>
> >>
> >>
> >>
> >> ### Python
> >>
> >>
> >>
> >> ### R
> >>
> >>
> >>
> >> ### Ruby
> >>
> >>
> >> ### Swift
> >>
> >>
> >> ## Release activity
> >>
> >> (This is automatically generated):
> >>
> >> RS-DATAFUSION-PYTHON-27.0.0 was released on 2023-07-08.
> >> RS-43.0.0 was released on 2023-07-03.
> >> RS-DATAFUSION-27.0.0 was released on 2023-06-30.
> >> ADBC-0.5.1 was released on 2023-06-26.
> >> NANOARROW-0.2.0 was released on 2023-06-22.
> >> ADBC-0.5.0 was released on 2023-06-20.
> >> RS-42.0.0 was released on 2023-06-20.
> >> 12.0.1 was released on 2023-06-13.
> >> JULIA-2.6.2 was released on 2023-06-12.
> >> JULIA-2.6.1 was released on 2023-06-08.
> >> RS-DATAFUSION-26.0.0 was released on 2023-06-07.
> >> RS-41.0.0 was released on 2023-06-06.
> >> RS-OS-0.6.1 was released on 2023-06-06.
> >> JULIA-2.6.0 was released on 2023-06-05.
> >> RS-DATAFUSION-25.0.0 was released on 2023-05-23.
> >> RS-40.0.0 was released on 2023-05-22.
> >> RS-OS-0.6.0 was released on 2023-05-22.
> >> ADBC-0.4.0 was released on 2023-05-12.
> >> RS-39.0.0 was released on 2023-05-09.
> >> RS-DATAFUSION-24.0.0 was released on 2023-05-09.
> >> 12.0.0 was released on 2023-05-01.
> >> RS-DATAFUSION-PYTHON-23.0.0 was released on 2023-04-28.
> >> RS-38.0.0 was released on 2023-04-25.
> >> RS-DATAFUSION-23.0.0 was released on 2023-04-24.
> >> JULIA-2.5.2 was released on 2023-04-19.
> >> JULIA-2.5.1 was released on 2023-04-16.
> >> RS-DATAFUSION-PYTHON-22.0.0 was released on 2023-04-14.
> >>
> >>
>
