Thanks all for the feedback. I've updated the PR to shade Jackson 2.x as
well based on the feedback of the thread. Personally, I haven't had that
much issues with the latest versions of Jackson, but let us stay on the
safe side.

Jackson is still being used in:
- parquet-cli
- parquet-hadoop
- parquet-pig
- parquet-thrift
- parquet-tools

I've removed it from parquet-scrooge since it was included in the pom, but
wasn't used in the code.

Cheers, Fokko


Op di 19 feb. 2019 om 18:34 schreef Ryan Blue <[email protected]>:

> The difference between Jackson and fastutil is that Jackson is used in
> several modules. Instead of shading the classes into each one, we created a
> module. If I remember correctly, fastutil is used and shaded in just one
> module.
>
> If you want to improve how the plugin is called, that would be great. But
> it doesn't change the need for shading Jackson.
>
> Sorry if this is confusing, but this is how Parquet has worked for a long
> time and it doesn't leak into downstream projects. Most people working with
> the Parquet codebase don't have to worry about how Jackson is handled
> because we use the normal package names. And an even better reason for this
> is that we never have to worry about accidentally using the wrong import
> and having a bad release.
>
> rb
>
> On Tue, Feb 19, 2019 at 4:56 AM XU Qinghui <[email protected]>
> wrote:
>
> > Hi, Ryan
> >
> > The ticket that I mentioned is indeed outdated, thanks for closing it. In
> > fact it is solved as a side effect of PARQUET-1529(
> > https://github.com/apache/parquet-mr/pull/617), which adds
> > maven-shade-plugin to the package phase and thus relocates jackson
> packages.
> >
> > But my point is:
> > It seems that it does not make sense to keep the module parquet-jackson
> > which just provides relocated jackson as a dependency --
> > * We also shade other dependencies in parquet, such as `
> > it.unimi.dsi:fastutil`, but we do not provide a module for just shading
> > it. So it would be more consistent to do it the same with jackson, just
> > shade it in modules where it is used.
> > * We should probably introduce maven-shade-plugin in all modules so that
> > the relocation (managed by the root pom) will be done where necessary
> (but
> > that could be a separate topic)
> > * Introducing parquet-jackson as a dependency but not using relocated
> > jackson package in the code is kind of confusing.
> >
> > Qinghui
> >
> > Le lun. 18 févr. 2019 à 18:14, Ryan Blue <[email protected]> a écrit :
> >
> >> Qinghui,
> >>
> >> Parquet source uses the unshaded dependencies, but those dependencies
> are
> >> rewritten in every module's build. That way source code remains
> compatible
> >> with Jackson and shading is just part of the build. I've closed
> >> PARQUET-1281 with "Not a problem".
> >>
> >> rb
> >>
> >> On Mon, Feb 18, 2019 at 9:04 AM XU Qinghui <
> [email protected]>
> >> wrote:
> >>
> >>> I think it's necessary to shade jackson, as it is used a lot in
> >>> different environment (with different versions).
> >>> But it turns out that the parquet-jackson is not used everywhere inside
> >>> parquet-mr (PARQUET-1281
> >>> <https://issues.apache.org/jira/browse/PARQUET-1281>).
> >>> I brought up this subject some while ago, but it seems that it would be
> >>> more convenient / friendly for developers to use directly jackson and
> then
> >>> shade it when packaging. In fact, IDEs often have problem to deal with
> >>> "shaded" package in your code appropriately. So we might just remove
> the
> >>> parquet-jackson module, and shade our artifacts instead.
> >>>
> >>> Best wishes,
> >>> Qinghui
> >>>
> >>> Le lun. 18 févr. 2019 à 17:47, Ryan Blue <[email protected]> a
> >>> écrit :
> >>>
> >>>> I don't think that removing the shading is a good idea. Jackson is a
> >>>> very
> >>>> common dependency and pulling in projects that use different versions
> >>>> has
> >>>> caused a lot of headache. Why go backward and make Parquet vulnerable
> to
> >>>> those problems? I don't see a good justification for it.
> >>>>
> >>>> On Mon, Feb 18, 2019 at 8:29 AM Jacques Nadeau <[email protected]>
> >>>> wrote:
> >>>>
> >>>> > I haven't looked at the usage but would wonder if the core modules
> >>>> truly
> >>>> > need jackson. I don't think most of the systems that read Parquet
> use
> >>>> the
> >>>> > jackson part (?). If so, maybe the code could be refactored to
> remove
> >>>> the
> >>>> > dependency and it be moved to an optional component. We want to do
> >>>> the same
> >>>> > thing with Jackson in Arrow (and did it recently for Guava).
> >>>> >
> >>>> > On Mon, Feb 18, 2019 at 3:09 AM Driesprong, Fokko
> >>>> <[email protected]>
> >>>> > wrote:
> >>>> >
> >>>> > > Hi all,
> >>>> > >
> >>>> > > Recently I've opened a PR to move from Jackson 1.x to Jackson 2.9
> >>>> > > <https://github.com/apache/parquet-mr/pull/616>. I've also
> removed
> >>>> the
> >>>> > > shading project since most libraries are up to date with Jackson
> >>>> 2.x.
> >>>> > Gabor
> >>>> > > suggested having a discussion on the mailing list to discuss the
> >>>> removal
> >>>> > of
> >>>> > > the shading of Jackson.
> >>>> > >
> >>>> > > Spark 2.x is at 2.6, Spark 3.0 at 2.9.6, Hadoop at 2.9.x, Flink at
> >>>> 2.7.9,
> >>>> > > but that one is shaded anyway :-) One problem might be Apache Avro
> >>>> which
> >>>> > is
> >>>> > > still using Jackson 1.x (codehause), until we release Avro 1.9.
> >>>> > >
> >>>> > > What are the thoughts on this subject, should we still shade
> >>>> Jackson, or
> >>>> > > not?
> >>>> > >
> >>>> > > Cheers, Fokko
> >>>> > >
> >>>> >
> >>>>
> >>>>
> >>>> --
> >>>> Ryan Blue
> >>>> Software Engineer
> >>>> Netflix
> >>>>
> >>>
> >>
> >> --
> >> Ryan Blue
> >> Software Engineer
> >> Netflix
> >>
> >
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

Reply via email to