> I am a committer on Arrow, but not on Parquet right now. Does that mean
> I should only merge Parquet C++ PRs for code changes in parquet/arrow?

FWIW, this was the mode I was operating under. My preference here would be to continue to operate under this mode from a governance perspective. As it is, it seems the current Parquet PMC [1] doesn't have a lot of active C++ contributors, so it might be harder to continue growing out the C++ committer base.

Thanks,
Micah

[1] https://projects.apache.org/committee.html?parquet

On Thu, Feb 2, 2023 at 7:31 AM Will Jones <will.jones...@gmail.com> wrote:

> Day to day, I think having Parquet-cpp under the Apache Arrow project could
> make sense. Though I worry about two risks:
>
> 1. Would that lead to the governance of the format itself to be primarily
> the responsibility of the developers of Parquet-MR?
> 2. Would C++ developers interested in working with Parquet outside of Arrow
> recognize it as a relevant library?
>
> On Thu, Feb 2, 2023 at 6:03 AM Neal Richardson <neal.p.richard...@gmail.com>
> wrote:
>
> > Would it make sense to transfer all governance of the parquet-cpp
> > implementation to Apache Arrow? It seems like that's where we de facto are
> > already, so that would resolve these ambiguities and put it in line with
> > the Rust implementation.
> >
> > Would the Parquet PMC be opposed to formalizing this change?
> >
> > Neal
> >
> > On Thu, Feb 2, 2023 at 6:48 AM Raphael Taylor-Davies
> > <r.taylordav...@googlemail.com.invalid> wrote:
> >
> > > Hi,
> > >
> > > > Does the parquet rust implementation have a similar issue?
> > >
> > > Similar to the C++ implementation, the Rust implementation lives under
> > > the Apache Arrow umbrella and does not have any direct affiliation with
> > > the Apache Parquet project that I am aware of, beyond using the same
> > > format specification. However, as almost all of the users and
> > > contributions are with respect to the arrow interfaces, and not the
> > > parquet record APIs, there perhaps isn't the same ambiguity as
> > > encountered with the C++ implementation.
> > > I would expect all issues to be
> > > raised in the arrow-rs repository, and a PARQUET Jira only raised,
> > > likely by myself or whoever is triaging the issue, if there is some
> > > issue/ambiguity pertaining to the format itself.
> > >
> > > Kind Regards,
> > >
> > > Raphael
> > >
> > > On 02/02/2023 01:58, Gang Wu wrote:
> > > > Hi Will,
> > > >
> > > > AFAIK, the Apache Parquet community no longer considers contribution to
> > > > parquet-cpp when promoting new committers after the donation to Apache
> > > > Arrow.
> > > >
> > > > It would be a dilemma for the parquet-cpp contributors if none of the
> > > > Apache Arrow community or Apache Parquet community recognizes their work.
> > > >
> > > > Does the parquet rust implementation have a similar issue?
> > > >
> > > > Best,
> > > > Gang
> > > >
> > > > On Thu, Feb 2, 2023 at 3:27 AM Will Jones <will.jones...@gmail.com> wrote:
> > > >
> > > >> Hello,
> > > >>
> > > >> A while back, the Parquet C++ implementation was merged into the Apache
> > > >> Arrow monorepo [1]. As I understand it, this helped the development process
> > > >> immensely. However, I am noticing some governance issues because of it.
> > > >>
> > > >> First, it's not obvious where issues are supposed to be opened: in Parquet
> > > >> Jira or Arrow GitHub issues. Looking back at some of the original
> > > >> discussion, it looks like the intention was
> > > >>
> > > >>> * use PARQUET-XXX for issues relating to Parquet core
> > > >>> * use ARROW-XXX for issues relation to Arrow's consumption of Parquet
> > > >>> core (e.g. changes that are in parquet/arrow right now)
> > > >>>
> > > >> The README for the old parquet-cpp repo [3] states instead in its
> > > >> migration note:
> > > >>
> > > >> JIRA issues should continue to be opened in the PARQUET JIRA project.
> > > >>
> > > >>
> > > >> Either way, it doesn't seem like this process is obvious to people.
> > > >> Perhaps we could clarify this and add notices to Arrow's GitHub issues
> > > >> template?
> > > >>
> > > >> Second, committer status is a little unclear. I am a committer on Arrow,
> > > >> but not on Parquet right now. Does that mean I should only merge Parquet
> > > >> C++ PRs for code changes in parquet/arrow? Or that I shouldn't merge
> > > >> Parquet changes at all?
> > > >>
> > > >> Also, are the contributions to Arrow C++ Parquet being actively reviewed
> > > >> for potential new committers?
> > > >>
> > > >> Best,
> > > >>
> > > >> Will Jones
> > > >>
> > > >> [1] https://lists.apache.org/thread/76wzx2lsbwjl363bg066g8kdsocd03rw
> > > >> [2] https://lists.apache.org/thread/dkh6vjomcfyjlvoy83qdk9j5jgxk7n4j
> > > >> [3] https://github.com/apache/parquet-cpp