Hey everyone,

There is a preference to mark parquet-thrift as deprecated and remove it in
a subsequent release. I created PR #1175
<https://github.com/apache/parquet-mr/pull/1175> for this. Some parts of
parquet-thrift were already marked as deprecated, and I went ahead and
removed those. Do we want to move this forward? Or should I raise a VOTE
thread?

Kind regards,
Fokko

On Fri, Oct 13, 2023 at 11:15, Fokko Driesprong <fo...@apache.org> wrote:

> Looking at the history
> <https://github.com/apache/parquet-mr/commits/master/parquet-thrift>, the
> last contribution to the module was in Jan 2021
> <https://github.com/apache/parquet-mr/pull/832>. This has been released
> to the public. My main concern is that Parquet is slow-moving regarding
> releases, so first deprecating and then removing it will take quite a bit
> of time. It is orthogonal to one of the main reasons to bump Thrift, which
> is to make the release process easier :)
>
> Cheers, Fokko
>
> On Fri, Oct 13, 2023 at 03:35, Gang Wu <ust...@gmail.com> wrote:
>
>> If we cannot drop parquet-thrift immediately, I am inclined to go with [1]
>> and mark the entire module as deprecated (though I don't know if there
>> is an alternative approach to annotating all classes as deprecated).
>>
>> [1] https://github.com/apache/parquet-mr/pull/1156
>>
>> Gang
>>
>> On Tue, Oct 3, 2023 at 11:18 PM Xinli shang <sha...@uber.com.invalid>
>> wrote:
>>
>> > Hi Fokko,
>> >
>> > Thanks for looking into this! I generally agree we probably should retire
>> > parquet-thrift. The only thing is we need to find out what is still using
>> > it, which is hard to do because of the large user base of parquet-mr. What
>> > we did earlier was to mark the module as deprecated first. Then, after one
>> > release, we officially removed it. But I don't know whether that process
>> > would block you for too long.
>> >
>> > Xinli
>> >
>> > On Thu, Sep 28, 2023 at 2:20 AM Fokko Driesprong <fo...@apache.org> wrote:
>> >
>> > > Hey Gang,
>> > >
>> > > It is also used in some of the code:
>> > >
>> > >    - org.apache.parquet.hadoop.thrift.AbstractThriftWriteSupport
>> > >    - org.apache.parquet.thrift.AbstractThriftWriteSupport
>> > >    - org.apache.parquet.thrift.ThriftSchemaConverter
>> > >    - org.apache.parquet.thrift.TupleToThriftWriteSupport
>> > >
>> > > Yesterday I tried to factor it out, but I ended up removing most of the
>> > > codebase. I'm not aware of any alternative to Elephantbird. I tried to ping
>> > > the original author
>> > > <https://github.com/apache/parquet-mr/pull/1068#issuecomment-1729434254>,
>> > > but the GitHub account seems to be abandoned.
>> > >
>> > > Kind regards,
>> > > Fokko
>> > >
>> > > On Thu, Sep 28, 2023 at 11:13, Gang Wu <ust...@gmail.com> wrote:
>> > >
>> > > > Hi Fokko,
>> > > >
>> > > > Is there any alternative to Elephantbird? Since it is only used in the
>> > > > tests, could we rewrite those test cases using the alternative, if any?
>> > > > The effort may be huge though.
>> > > >
>> > > > Best,
>> > > > Gang
>> > > >
>> > > > On Thu, Sep 28, 2023 at 5:03 PM Fokko Driesprong <fo...@apache.org> wrote:
>> > > >
>> > > > > Hi everyone,
>> > > > >
>> > > > > I was in the process of updating to the latest version of Thrift
>> > > > > <https://github.com/apache/parquet-mr/pull/1138> (from 0.16.0 to 0.19.0),
>> > > > > mostly because the current version contains CVEs and because the update
>> > > > > makes the release process easier: you don't have to install Thrift from
>> > > > > source (it is just available on Homebrew etc.).
>> > > > >
>> > > > > While working on this, I ran into an issue with Elephantbird, which is
>> > > > > using a very old version of Thrift (0.7.0). Trying to bump this, I noticed
>> > > > > that a lot of classes that we use in the tests have been made private
>> > > > > <https://github.com/apache/parquet-mr/pull/1156>. Therefore it is hard to
>> > > > > test if we break anything.
>> > > > >
>> > > > > It looks like parquet-thrift is not used by anyone anymore
>> > > > > <https://mvnrepository.com/artifact/org.apache.parquet/parquet-thrift>. I
>> > > > > would suggest removing the module from the repository
>> > > > > <https://github.com/apache/parquet-mr/pull/1158> unless anyone objects.
>> > > > >
>> > > > > Kind regards, Fokko
>> > > > >
>> > > >
>> > >
>> >
>> >
>> > --
>> > Xinli Shang
>> >
>>
>
