+1. I like the idea of a common repository as well. This will ease Java and C++ interoperability. Currently, Java treats Parquet files written by C++ differently.

On Wed, Aug 2, 2017 at 7:59 PM, Wes McKinney <[email protected]> wrote:

> +1. In doing so we may want to rename the repository to apache/parquet
> to reflect the expanded scope.
>
> We could also discuss merging in the C++ implementation, though the
> main reservation I would have would be version numbers as we will
> likely be releasing parquet-cpp more frequently than parquet-java has
> been releasing since the implementation continues to evolve. If the
> Java folks are comfortable with more frequent releases (and we would
> want to add a document explaining the respective API stability of each
> component, e.g. C++ will be a bit less stable for a while) then this
> seems OK to me.
>
> On Wed, Aug 2, 2017 at 1:26 PM, Nong Li <[email protected]> wrote:
>
> > Hi,
> >
> > I'd like to propose retiring the parquet-format repo and moving the
> > code into parquet-mr. Having the splits repos causes unnecessary
> > complexity and doesn't seem to offer much benefit. For example:
> > 1. Making changes that require format changes and implementation is
> >    split. Things go out of sync.
> > 2. More release version/release process management
> > 3. More things to do and understand getting started
> >
> > I don't recall why it was originally split; probably an artifact of
> > how it was born. If this makes sense, we can consider merging
> > parquet-cpp as well.
> >
> > The specific proposal is to add a commit to parquet-format to
> > indicate it is moved and merged into parquet-mr and move the current
> > parquet-format files into parquet-mr. The next release of parquet-mr
> > would release both, with the same version.
> >
> > Thoughts?
> > Nong

--
regards,
Deepak Majeti
