Re: [C++] Changing the versioning string for Parquet-CPP
Hi,

As a first step, I went ahead and renamed the unreleased version "cpp-1.6.0" to "cpp-4.0.0" on the Parquet JIRA. Now we need to solve https://issues.apache.org/jira/browse/ARROW-7830.

Best regards

Antoine.

On Fri, 12 Mar 2021 22:09:27 +0100 "Uwe L. Korn" wrote:
> When we merged this into the Arrow repo, at least from my side, there was the
> intention to revert that maybe at some stage again. The thought behind moving
> parquet-cpp out of the Arrow repo again was based on the idea that Parquet
> was one of the many interfaces Arrow does provide access to but not one of
> the outstanding ones. Nowadays, I have the feeling that Parquet and Arrow
> have a much more bound-together relationship than I initially expected. Thus
> we should probably accept that parquet-cpp will stay for a very long time in
> the Arrow repo and this should continue with the versioning.
>
> Also we had the assumption that from time to time the parquet community would
> make separate releases. I have no memory anymore how we assumed that these
> releases would happen or why though.
>
> Basically, we had some assumptions that made keeping the version numbers
> separate seem sensible. All of the assumptions I can think of turned out to be
> false, thus keeping the version in line with Arrow (C++) makes total sense
> nowadays.
>
> Uwe
Re: [C++] Changing the versioning string for Parquet-CPP
When we merged this into the Arrow repo, at least from my side, there was the intention to revert that maybe at some stage again. The thought behind moving parquet-cpp out of the Arrow repo again was based on the idea that Parquet was one of the many interfaces Arrow does provide access to but not one of the outstanding ones. Nowadays, I have the feeling that Parquet and Arrow have a much more bound-together relationship than I initially expected. Thus we should probably accept that parquet-cpp will stay for a very long time in the Arrow repo and this should continue with the versioning.

Also we had the assumption that from time to time the parquet community would make separate releases. I have no memory anymore how we assumed that these releases would happen or why though.

Basically, we had some assumptions that made keeping the version numbers separate seem sensible. All of the assumptions I can think of turned out to be false, thus keeping the version in line with Arrow (C++) makes total sense nowadays.

Uwe

On Tue, Mar 9, 2021, at 7:57 PM, Micah Kornfield wrote:
> I think there might have been some old agreement on this when parquet-cpp
> was moved into the Arrow repo. I can't seem to find the thread, but it
> would be nice for some PMC members to chime in to make sure this seems OK
> to them.
Re: [C++] Changing the versioning string for Parquet-CPP
I think there might have been some old agreement on this when parquet-cpp was moved into the Arrow repo. I can't seem to find the thread, but it would be nice for some PMC members to chime in to make sure this seems OK to them.

On Sat, Mar 6, 2021 at 7:38 AM Antoine Pitrou wrote:
> On Fri, 5 Mar 2021 10:26:57 -0800 Micah Kornfield wrote:
> >
> > I'd like to propose that we change the default version string [1] for
> > parquet-cpp to reflect arrow releases (e.g. "parquet-cpp-arrow version
> > 3.0.0" instead of "parquet-cpp version 1.5.1-snapshot").
>
> +1. This definitely makes the most sense.
>
> Regards
>
> Antoine.
Re: [C++] Changing the versioning string for Parquet-CPP
On Fri, 5 Mar 2021 10:26:57 -0800 Micah Kornfield wrote:
>
> I'd like to propose that we change the default version string [1] for
> parquet-cpp to reflect arrow releases (e.g. "parquet-cpp-arrow version
> 3.0.0" instead of "parquet-cpp version 1.5.1-snapshot").

+1. This definitely makes the most sense.

Regards

Antoine.
Re: [C++] Changing the versioning string for Parquet-CPP
There is an issue about this: https://issues.apache.org/jira/browse/ARROW-7830

+1 on changing this to follow the Arrow version number (the current non-changing number is not particularly useful).

Joris

On Fri, 5 Mar 2021 at 19:27, Micah Kornfield wrote:
> There has not been an official release of the Parquet C++ library in quite
> some time. I don't think this is a huge issue as the parquet bits are
> packaged into each Arrow release.
>
> However, one practical concern is when bugs crop up for a particular
> version writing a parquet file, it is impossible for readers to mitigate
> them. One practical example is a long-standing bug (with a fix recently
> merged) where the comparator for ByteArray/FLBA encoded Decimals was
> incorrectly implemented. This means min/max statistics for these Decimal
> values cannot be relied on.
>
> I'd like to propose that we change the default version string [1] for
> parquet-cpp to reflect arrow releases (e.g. "parquet-cpp-arrow version
> 3.0.0" instead of "parquet-cpp version 1.5.1-snapshot").
>
> Any objections? An alternative would be to try to do releases of
> parquet-cpp on the same timeline as Arrow releases.
>
> Thanks,
> Micah
>
> [1]
> https://github.com/apache/arrow/blob/25c736d48dc289f457e74d15d05db65f6d539447/cpp/src/parquet/parquet_version.h.in
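
To make the practical concern in the proposal concrete, here is a rough sketch of how a reader might use a per-release version string to decide whether Decimal min/max statistics can be trusted. It is not code from Arrow or parquet-cpp: `WriterVersion`, `ParseCreatedBy` and `DecimalStatsTrusted` are hypothetical names, the string shape "<application> version <major>.<minor>.<patch>" is assumed from the two examples in the thread, and the release that first contains the comparator fix is passed in by the caller because the thread does not name it.

```cpp
#include <array>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical parsed form of a writer string such as
// "parquet-cpp-arrow version 3.0.0".
struct WriterVersion {
  std::string application;       // e.g. "parquet-cpp-arrow"
  std::array<int, 3> version{};  // major, minor, patch
};

// Parse "<application> version <major>.<minor>.<patch>[suffix]".
// Anything after the patch number (e.g. "-snapshot") is ignored.
// Returns std::nullopt when the string does not follow that shape.
std::optional<WriterVersion> ParseCreatedBy(const std::string& created_by) {
  std::istringstream in(created_by);
  WriterVersion out;
  std::string keyword;
  char dot1 = 0, dot2 = 0;
  if (!(in >> out.application >> keyword) || keyword != "version") {
    return std::nullopt;
  }
  if (!(in >> out.version[0] >> dot1 >> out.version[1] >> dot2 >>
        out.version[2])) {
    return std::nullopt;
  }
  if (dot1 != '.' || dot2 != '.') return std::nullopt;
  return out;
}

// Conservative check: trust Decimal min/max statistics only when the file was
// written by "parquet-cpp-arrow" at or after `first_fixed`.
// `first_fixed` is an assumption supplied by the caller.
bool DecimalStatsTrusted(const std::string& created_by,
                         const std::array<int, 3>& first_fixed) {
  const auto parsed = ParseCreatedBy(created_by);
  if (!parsed || parsed->application != "parquet-cpp-arrow") return false;
  return parsed->version >= first_fixed;  // lexicographic comparison
}
```

With the old fixed string, every file reports "parquet-cpp version 1.5.1-snapshot", so a check like `DecimalStatsTrusted` can never distinguish files written before the fix from files written after it. That is why the sketch treats the old string as untrusted.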