Hi Ryan,

The ticket that I mentioned is indeed outdated; thanks for closing it. In
fact, it was resolved as a side effect of PARQUET-1529
(https://github.com/apache/parquet-mr/pull/617), which adds
maven-shade-plugin to the package phase and thus relocates the jackson
packages.

But my point is this: it does not seem to make sense to keep the
parquet-jackson module, which only provides relocated jackson as a
dependency:
* We also shade other dependencies in parquet, such as
`it.unimi.dsi:fastutil`, but we do not provide a separate module just for
shading them. It would be more consistent to treat jackson the same way and
simply shade it in the modules where it is used.
* We should probably introduce maven-shade-plugin in all modules, so that
the relocation (managed by the root pom) is done wherever necessary (but
that could be a separate topic).
* Declaring parquet-jackson as a dependency while not using the relocated
jackson packages in the code is confusing.
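
To make the "shade it where it is used" idea concrete, here is a minimal
sketch of a maven-shade-plugin configuration for a single module. The
artifact pattern, the jackson package name and the `shaded.parquet` prefix
are assumptions for illustration; they are not copied from the actual root
pom and would need to match whatever the build really uses:

```xml
<!-- Hypothetical fragment for the <build><plugins> section of a module pom. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <!-- Run the shade goal when the module is packaged. -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <includes>
            <!-- Assumed coordinates: bundle only the jackson artifacts. -->
            <include>com.fasterxml.jackson.core:*</include>
          </includes>
        </artifactSet>
        <relocations>
          <relocation>
            <!-- Rewrite references to jackson classes in the packaged jar. -->
            <pattern>com.fasterxml.jackson</pattern>
            <!-- Assumed prefix, chosen for illustration only. -->
            <shadedPattern>shaded.parquet.com.fasterxml.jackson</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a setup along these lines, the source code keeps importing the plain
jackson packages, and only the packaged jar refers to the relocated names.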

Qinghui

On Mon, Feb 18, 2019 at 6:14 PM, Ryan Blue <[email protected]> wrote:

> Qinghui,
>
> Parquet source uses the unshaded dependencies, but those dependencies are
> rewritten in every module's build. That way source code remains compatible
> with Jackson and shading is just part of the build. I've closed
> PARQUET-1281 with "Not a problem".
>
> rb
>
> On Mon, Feb 18, 2019 at 9:04 AM XU Qinghui <[email protected]>
> wrote:
>
>> I think it's necessary to shade jackson, as it is used a lot in different
>> environments (with different versions).
>> But it turns out that parquet-jackson is not used everywhere inside
>> parquet-mr (PARQUET-1281
>> <https://issues.apache.org/jira/browse/PARQUET-1281>).
>> I brought up this subject a while ago, but it seems that it would be
>> more convenient / friendly for developers to use jackson directly and then
>> shade it when packaging. In fact, IDEs often have trouble handling
>> "shaded" packages in your code appropriately. So we might just remove the
>> parquet-jackson module, and shade our artifacts instead.
>>
>> Best wishes,
>> Qinghui
>>
>> On Mon, Feb 18, 2019 at 5:47 PM, Ryan Blue <[email protected]> wrote:
>>
>>> I don't think that removing the shading is a good idea. Jackson is a very
>>> common dependency and pulling in projects that use different versions has
>>> caused a lot of headaches. Why go backward and make Parquet vulnerable to
>>> those problems? I don't see a good justification for it.
>>>
>>> On Mon, Feb 18, 2019 at 8:29 AM Jacques Nadeau <[email protected]>
>>> wrote:
>>>
>>> > I haven't looked at the usage but would wonder if the core modules
>>> > truly need jackson. I don't think most of the systems that read Parquet
>>> > use the jackson part (?). If so, maybe the code could be refactored to
>>> > remove the dependency and it be moved to an optional component. We want
>>> > to do the same thing with Jackson in Arrow (and did it recently for
>>> > Guava).
>>> >
>>> > On Mon, Feb 18, 2019 at 3:09 AM Driesprong, Fokko <[email protected]>
>>> > wrote:
>>> >
>>> > > Hi all,
>>> > >
>>> > > Recently I've opened a PR to move from Jackson 1.x to Jackson 2.9
>>> > > <https://github.com/apache/parquet-mr/pull/616>. I've also removed the
>>> > > shading project since most libraries are up to date with Jackson 2.x.
>>> > > Gabor suggested having a discussion on the mailing list about removing
>>> > > the shading of Jackson.
>>> > >
>>> > > Spark 2.x is at 2.6, Spark 3.0 at 2.9.6, Hadoop at 2.9.x, Flink at
>>> > > 2.7.9, but that one is shaded anyway :-) One problem might be Apache
>>> > > Avro, which is still using Jackson 1.x (codehaus) until we release
>>> > > Avro 1.9.
>>> > >
>>> > > What are the thoughts on this subject: should we still shade Jackson,
>>> > > or not?
>>> > >
>>> > > Cheers, Fokko
>>> > >
>>> >
>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
