Hi,

Does the parquet rust implementation have a similar issue?

Like the C++ implementation, the Rust implementation lives under the Apache Arrow umbrella and, as far as I am aware, has no direct affiliation with the Apache Parquet project beyond using the same format specification. However, as almost all of the users and contributions relate to the Arrow interfaces rather than the Parquet record APIs, there is perhaps not the same ambiguity as with the C++ implementation. I would expect all issues to be raised in the arrow-rs repository, with a PARQUET Jira raised, likely by myself or whoever is triaging the issue, only if there is some issue or ambiguity pertaining to the format itself.

Kind Regards,

Raphael

On 02/02/2023 01:58, Gang Wu wrote:
Hi Will,

AFAIK, the Apache Parquet community no longer considers contributions to
parquet-cpp when promoting new committers after the donation to Apache
Arrow.

It would be a dilemma for the parquet-cpp contributors if neither the
Apache Arrow community nor the Apache Parquet community recognizes their work.

Does the parquet rust implementation have a similar issue?

Best,
Gang

On Thu, Feb 2, 2023 at 3:27 AM Will Jones <will.jones...@gmail.com> wrote:

Hello,

A while back, the Parquet C++ implementation was merged into the Apache
Arrow monorepo [1]. As I understand it, this helped the development process
immensely. However, I am noticing some governance issues because of it.

First, it's not obvious where issues are supposed to be opened: in Parquet
Jira or Arrow GitHub issues. Looking back at some of the original
discussion [2], it looks like the intention was

* use PARQUET-XXX for issues relating to Parquet core
* use ARROW-XXX for issues relating to Arrow's consumption of Parquet
core (e.g. changes that are in parquet/arrow right now)

The README for the old parquet-cpp repo [3] states instead in its
migration note:

  JIRA issues should continue to be opened in the PARQUET JIRA project.


Either way, it doesn't seem like this process is obvious to people. Perhaps
we could clarify this and add notices to Arrow's GitHub issues template?

Second, committer status is a little unclear. I am a committer on Arrow,
but not on Parquet right now. Does that mean I should only merge Parquet
C++ PRs for code changes in parquet/arrow? Or that I shouldn't merge
Parquet changes at all?

Also, are the contributions to Arrow C++ Parquet being actively reviewed
for potential new committers?

Best,

Will Jones

[1] https://lists.apache.org/thread/76wzx2lsbwjl363bg066g8kdsocd03rw
[2] https://lists.apache.org/thread/dkh6vjomcfyjlvoy83qdk9j5jgxk7n4j
[3] https://github.com/apache/parquet-cpp
