To Uwe's point, I think we might wait 6-12 months before merging C++ into the main Parquet repo, until we've reached feature completeness in our Arrow reader/writer. Until then we will have quite frequent releases with incremental new functionality and possibly API changes.

On Thu, Aug 3, 2017 at 12:03 PM, Julien Le Dem <[email protected]> wrote:
> +1 on merging the repos assuming we find a sane way of doing so that
> somewhat preserves history.
> big +1 on more frequent releases. Reducing friction for releases is a big
> win.
> I'm fine with doing this in 2 steps (mr + format then cpp) or 1 (mr +
> format + cpp).
>
>
> On Thu, Aug 3, 2017 at 5:18 AM, Uwe L. Korn <[email protected]> wrote:
>
>> I'm in favour of merging parquet-format and parquet-mr but at the
>> moment, I would not merge MR and CPP, development speeds and release
>> cycles differ and thus it would be more an inconvenience to have them in
>> the same repo.
>>
>> Uwe
>>
>> On Thu, Aug 3, 2017, at 02:37 AM, Deepak Majeti wrote:
>> > +1. I like the idea of a common repository as well. This will ease the
>> > Java and C++ interoperability. Currently, Java treats parquet files
>> > written by C++ differently.
>> >
>> > On Wed, Aug 2, 2017 at 7:59 PM, Wes McKinney <[email protected]> wrote:
>> >
>> > > +1. In doing so we may want to rename the repository to apache/parquet
>> > > to reflect the expanded scope.
>> > >
>> > > We could also discuss merging in the C++ implementation, though the
>> > > main reservation I would have would be version numbers as we will
>> > > likely be releasing parquet-cpp more frequently than parquet-java has
>> > > been releasing since the implementation continues to evolve. If the
>> > > Java folks are comfortable with more frequent releases (and we would
>> > > want to add a document explaining the respective API stability of each
>> > > component, e.g. C++ will be a bit less stable for a while) then this
>> > > seems OK to me.
>> > >
>> > > On Wed, Aug 2, 2017 at 1:26 PM, Nong Li <[email protected]> wrote:
>> > > > Hi,
>> > > >
>> > > > I'd like to propose retiring the parquet-format repo and moving the
>> > > > code into parquet-mr. Having the splits repos causes unnecessary
>> > > > complexity and doesn't seem to offer much benefit. For example:
>> > > > 1. Making changes that require format changes and implementation is
>> > > >    split. Things go out of sync.
>> > > > 2. More release version/release process management
>> > > > 3. More things to do and understand getting started
>> > > >
>> > > > I don't recall why it was originally split; probably an artifact of
>> > > > how it was born. If this makes sense, we can consider merging
>> > > > parquet-cpp as well.
>> > > >
>> > > > The specific proposal is to add a commit to parquet-format to
>> > > > indicate it is moved and merged into parquet-mr and move the current
>> > > > parquet-format files into parquet-mr.
>> > > > The next release of parquet-mr would release both, with the same
>> > > > version.
>> > > >
>> > > > Thoughts?
>> > > > Nong
>> > >
>> >
>> > --
>> > regards,
>> > Deepak Majeti
>>
