Hi,

I agree with Fokko. It would be nice to drop these modules but only in the
next major release.

On Tue, Jan 29, 2019 at 11:57 AM Uwe L. Korn <[email protected]> wrote:

> Hello Fokko,
>
> I have put up a PR for the Scala update
> https://github.com/apache/parquet-mr/pull/605. parquet-scrooge fails due
> to a Thrift parsing error, but parquet-scala succeeds with Scala 2.12. By
> dropping scrooge, we could at least move this forward.
>
> Uwe
>
> > On 29.01.2019 at 11:40, Nandor Kollar <[email protected]> wrote:
> >
> > Removing parquet-hive-* is a great idea; the code in Parquet is no longer
> > maintained and is just a burden there.
> >
> > As for parquet-pig, I'd prefer moving it to Pig (if the Pig community
> > accepts it as it is) instead of dropping it or moving it to a separate
> > project. I know people who still use Pig with Parquet.
> >
> > Regards,
> > Nandor
> >
> >> On Mon, Jan 28, 2019 at 6:29 PM Ryan Blue <[email protected]> wrote:
> >>
> >> Hi everyone,
> >>
> >> I’m working on the 1.10.1 build and I’ve noticed that we will have several
> >> modules that are not maintained or are very old. This includes all of the
> >> Hive modules that moved into Hive years ago, and also modules like
> >> parquet-scrooge and parquet-scala that are based on Scala 2.10, which has
> >> been EOL for years.
> >>
> >> We also have 2 command-line utilities, parquet-tools and parquet-cli. The
> >> parquet-cli version is friendlier to use, but I’m clearly biased. In any
> >> case, I don’t think we need to maintain both, and it is confusing for users
> >> to have two modules that do the same thing.
> >>
> >> I propose we remove the following modules:
> >>
> >>   - parquet-hive-*
> >>   - parquet-scrooge
> >>   - parquet-scala
> >>   - parquet-tools
> >>   - parquet-hadoop-bundle (shaded deps)
> >>   - parquet-cascading (in favor of parquet-cascading3, if we keep it)
> >>
> >> There are also modules that I’m not sure about. Does anyone use these?
> >>
> >>   - parquet-thrift
> >>   - parquet-pig
> >>   - parquet-cascading3
> >>
> >> Pig hasn’t had an update (other than project-wide changes) since Oct 2017.
> >> I think it may be time to drop support for Pig and allow that to exist as a
> >> separate project if anyone is still interested in it.
> >>
> >> In the last few years, we’ve moved more to a model where processing
> >> frameworks and engines maintain their own integration. Spark, Presto,
> >> Iceberg, and Hive fall into this category. So I would prefer to drop Pig
> >> and Cascading3. I’m fine keeping thrift if people think it is useful.
> >>
> >> Thoughts?
> >>
> >> rb
> >> --
> >> Ryan Blue
> >> Software Engineer
> >> Netflix
> >>
>
