Great! I did a final check on the Avro logical-types support on my end,
using the latest parquet-1.13.x branch, and all looks well.

Best,
Claire

On Thu, May 11, 2023 at 9:28 AM Fokko Driesprong <[email protected]> wrote:

> Let me send out the email. I did a final check on any performance
> regressions, and it looks good:
> https://github.com/apache/iceberg/pull/7301#issuecomment-1544008745
>
> Cheers,
> Fokko
>
> On Thu, May 11, 2023 at 3:38 AM Gang Wu <[email protected]> wrote:
>
> > There are no new blockers on my side. I think we are good to go
> > if Apache Iceberg is blocked by this.
> >
> > Best,
> > Gang
> >
> > On Thu, May 11, 2023 at 5:33 AM Fokko Driesprong <[email protected]> wrote:
> >
> > > Hey everyone,
> > >
> > > What would be a good time to kick off the 1.13.1 release? Happy to run the
> > > release.
> > >
> > > Cheers,
> > > Fokko Driesprong
> > >
> > > On Sat, May 6, 2023 at 4:03 AM Claire McGinty <[email protected]> wrote:
> > >
> > > > Great, thank you! I realized that that PR does depend on PARQUET-2265
> > > > <https://github.com/apache/parquet-mr/pull/1049/files> for the writes to
> > > > work -- it's a very minor PR bringing the default Avro write behavior in
> > > > line with Avro read behavior. It unlocks new functionality (being able to
> > > > specify an Avro DataModel via a Configuration key) and has no user-facing
> > > > effect on any existing behavior.
> > > >
> > > > I created a cherry-pick PR containing both tickets here
> > > > <https://github.com/apache/parquet-mr/pull/1091> -- I tested my
> > > > installation locally and on Hadoop, using Avro data containing logical
> > > > types, and confirmed that it's working as expected.
> > > >
> > > > Thanks,
> > > > Claire
> > > >
> > > > On Fri, May 5, 2023 at 6:27 PM Fokko Driesprong <[email protected]> wrote:
> > > >
> > > > > Hey all,
> > > > >
> > > > > Thanks for reviewing the PR on bringing back Hadoop 2.7.x support. I've
> > > > > created a backport <https://github.com/apache/parquet-mr/pull/1090> to the
> > > > > 1.13.1 branch.
> > > > >
> > > > > Claire: Thanks for working on that. I looked through the PR and it looks
> > > > > well-tested. Personally, I'm comfortable with cherry-picking this back to
> > > > > 1.13.1. If you can create a PR to the parquet-1.13.x branch, we can take it
> > > > > from there.
> > > > >
> > > > > Cheers, Fokko
> > > > >
> > > > > On Fri, May 5, 2023 at 3:14 PM Claire McGinty <[email protected]> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I wanted to add another +1 to the requests for a 1.13.1 release. I just had
> > > > > > a PR merged to add built-in support for Avro logical types in
> > > > > > Avro{Read,Write}Support (https://github.com/apache/parquet-mr/pull/1078).
> > > > > > For context on this, my org, Spotify, is currently migrating many datasets
> > > > > > from Avro to Parquet, but since many of our Avro datasets use logical
> > > > > > types, it’s not working out of the box without custom config per
> > > > > > dataset/consumer. I know it’s unconventional to include new features in a
> > > > > > patch release, but having this change available would really help us speed
> > > > > > up our Parquet adoption. In addition, the implementation is on a strictly
> > > > > > “best-effort” basis: any errors fetching logical types result in Parquet
> > > > > > reverting to its default read/write behavior.
> > > > > >
> > > > > > Thanks,
> > > > > > Claire
> > > > > >
> > > > > >
> > > > > > On 2023/05/01 06:56:37 Fokko Driesprong wrote:
> > > > > > > Hey everyone,
> > > > > > >
> > > > > > > I also got some pushback in Iceberg, so I've taken the time to revert the
> > > > > > > change to continue <https://github.com/apache/parquet-mr/pull/1084/> to
> > > > > > > support Hadoop <2.9. We now have two mechanisms to check if the stream is
> > > > > > > byte buffer readable. First, it will use the new mechanism (for Hadoop 2.9+
> > > > > > > and Hadoop 3.3+). Otherwise, it will fall back to the previous method.
> > > > > > > Please review and let me know what you think. Once in, I can backport this
> > > > > > > to 1.13.1.
> > > > > > >
> > > > > > > Kind regards,
> > > > > > > Fokko Driesprong
> > > > > > >
> > > > > > > On Fri, Apr 28, 2023 at 10:06 AM Fokko Driesprong <[email protected]> wrote:
> > > > > > >
> > > > > > > > And it is in Hive as well: https://github.com/apache/hive/pull/427
> > > > > > > >
> > > > > > > > Kind regards,
> > > > > > > > Fokko
> > > > > > > >
> > > > > > > > On Wed, Apr 26, 2023 at 6:25 PM Fokko Driesprong <[email protected]> wrote:
> > > > > > > >
> > > > > > > >> Hi Xinli,
> > > > > > > >>
> > > > > > > > >> I know that some folks are waiting on running Apache Flink without Hadoop
> > > > > > > > >> (but using the parquet-mr library in Iceberg). Spark already upgraded to
> > > > > > > > >> Parquet 1.13.0, and I did some checks today with Hive, and I don't see any
> > > > > > > > >> blockers there. What's your understanding of a reasonable time window?
> > > > > > > > >> Personally, I don't mind running a release; a patch release in particular
> > > > > > > > >> should be quite straightforward.
> > > > > > > >>
> > > > > > > >> Kind regards,
> > > > > > > >> Fokko Driesprong
> > > > > > > >>
> > > > > > > > >> On Wed, Apr 26, 2023 at 4:37 AM Xinli Shang <[email protected]> wrote:
> > > > > > > >>
> > > > > > > >>> Hi Fokko,
> > > > > > > >>>
> > > > > > > > >>> Thanks for volunteering to release 1.13.1! That would be great, and I am
> > > > > > > > >>> looking forward to you being the release manager for that.
> > > > > > > >>>
> > > > > > > > >>> We can have the 1.13.1 release add back support for the old Hadoop
> > > > > > > > >>> version, but the question is: should we release ASAP or wait for a
> > > > > > > > >>> reasonable time window? Version 1.13.0 was just released, and I am not
> > > > > > > > >>> sure whether more issues are coming; waiting would let us bundle those
> > > > > > > > >>> fixes into 1.13.1. Is Iceberg urgently blocked on this?
> > > > > > > >>>
> > > > > > > >>> Xinli Shang
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > > >>> On Tue, Apr 25, 2023 at 6:51 PM Gang Wu <[email protected]> wrote:
> > > > > > > >>>
> > > > > > > >>> > That sounds good to me.
> > > > > > > >>> >
> > > > > > > > >>> > I have just released 1.13.0, so just let me know if you need anything
> > > > > > > > >>> > on my end to make the next release.
> > > > > > > >>> >
> > > > > > > >>> > Best,
> > > > > > > >>> > Gang
> > > > > > > >>> >
> > > > > > > > >>> > On Tue, Apr 25, 2023 at 10:31 PM Fokko Driesprong <[email protected]> wrote:
> > > > > > > >>> >
> > > > > > > >>> > > Hey Gang,
> > > > > > > >>> > >
> > > > > > > > >>> > > Thanks for the quick reply. I think 2.8.x is water under the bridge, but I
> > > > > > > > >>> > > can be convinced otherwise. I also spent a few cycles to see if we can get
> > > > > > > > >>> > > compatibility with 2.7.3+, but it doesn't seem trivial
> > > > > > > > >>> > > <https://github.com/apache/parquet-mr/pull/1075#issuecomment-1514518094>.
> > > > > > > > >>> > > As Gabor said on the ticket, it is fine to drop support for older systems
> > > > > > > > >>> > > from time to time. The public Hadoop 2.8
> > > > > > > > >>> > > <https://github.com/apache/hadoop/tree/branch-2.8> doesn't seem to get any
> > > > > > > > >>> > > active updates. I don't fully agree with the ticket: you can still read
> > > > > > > > >>> > > Parquet, but using an older version of the library.
> > > > > > > >>> > >
> > > > > > > >>> > > Kind regards,
> > > > > > > >>> > > Fokko Driesprong
> > > > > > > >>> > >
> > > > > > > > >>> > > On Tue, Apr 25, 2023 at 4:13 PM Gang Wu <[email protected]> wrote:
> > > > > > > >>> > >
> > > > > > > >>> > > > Hi Fokko,
> > > > > > > >>> > > >
> > > > > > > > >>> > > > There is an issue with the 1.13.0 release:
> > > > > > > >>> > > > https://issues.apache.org/jira/browse/PARQUET-2276.
> > > > > > > >>> > > >
> > > > > > > > >>> > > > It seems that Hadoop 2.8.x is no longer supported as of 1.13.0. I have seen
> > > > > > > > >>> > > > that you have added CI checks for Hadoop 2.9.x. Not sure if this is a
> > > > > > > > >>> > > > blocking issue.
> > > > > > > >>> > > >
> > > > > > > >>> > > > Best,
> > > > > > > >>> > > > Gang
> > > > > > > >>> > > >
> > > > > > > >>> > > >
> > > > > > > >>> > > >
> > > > > > > > >>> > > > On Tue, Apr 25, 2023 at 3:25 PM Fokko Driesprong <[email protected]> wrote:
> > > > > > > >>> > > >
> > > > > > > >>> > > > > Hi all,
> > > > > > > >>> > > > >
> > > > > > > > >>> > > > > I would like to discuss releasing Parquet 1.13.1. For Iceberg we ran into
> > > > > > > > >>> > > > > two things:
> > > > > > > >>> > > > >
> > > > > > > > >>> > > > >    - We noticed that support for Hadoop 2 was dropped. Iceberg is still on
> > > > > > > > >>> > > > >    2.7.3, and we're aware of the fact that it was released in August 2016.
> > > > > > > > >>> > > > >    The PR that I've created
> > > > > > > > >>> > > > >    <https://github.com/apache/parquet-mr/pull/1083/> bumps the lower bound
> > > > > > > > >>> > > > >    to Hadoop 2.9.2, which is also old, but if possible we would like to cater
> > > > > > > > >>> > > > >    to the widest audience possible.
> > > > > > > > >>> > > > >    - At Iceberg we also have the Apache Flink integration, and Flink is
> > > > > > > > >>> > > > >    able to run without Hadoop. This required some minor changes (#1074
> > > > > > > > >>> > > > >    <https://github.com/apache/parquet-mr/pull/1074>, #1073
> > > > > > > > >>> > > > >    <https://github.com/apache/parquet-mr/pull/1073>) that already have been
> > > > > > > > >>> > > > >    backported. It would be awesome to get these out.
> > > > > > > >>> > > > >
> > > > > > > > >>> > > > > My question is: after the release of 1.13.0, are there any issues that
> > > > > > > > >>> > > > > came up, or anything that you would like to see released? I'm happy to
> > > > > > > > >>> > > > > volunteer as a release manager for 1.13.1. Let us know!
> > > > > > > >>> > > > >
> > > > > > > >>> > > > > Kind regards,
> > > > > > > >>> > > > > Fokko
> > > > > > > >>> > > > >
> > > > > > > >>> > > >
> > > > > > > >>> > >
> > > > > > > >>> >
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>> --
> > > > > > > >>> Xinli Shang
> > > > > > > >>>
> > > > > > > >>
> > > > > > >
> > > > >
> > > >
> > >
> >
>
