The difference between Jackson and fastutil is that Jackson is used in
several modules. Instead of shading the classes into each one, we created a
module. If I remember correctly, fastutil is used and shaded in just one
module.

If you want to improve how the plugin is called, that would be great. But
it doesn't change the need for shading Jackson.

Sorry if this is confusing, but this is how Parquet has worked for a long
time and it doesn't leak into downstream projects. Most people working with
the Parquet codebase don't have to worry about how Jackson is handled
because we use the normal package names. And an even better reason for this
is that we never have to worry about accidentally using the wrong import
and having a bad release.

rb

On Tue, Feb 19, 2019 at 4:56 AM XU Qinghui <[email protected]>
wrote:

> Hi, Ryan
>
> The ticket that I mentioned is indeed outdated, thanks for closing it. In
> fact it is solved as a side effect of PARQUET-1529(
> https://github.com/apache/parquet-mr/pull/617), which adds
> maven-shade-plugin to the package phase and thus relocates jackson packages.
>
> But my point is:
> It seems that it does not make sense to keep the module parquet-jackson
> which just provides relocated jackson as a dependency --
> * We also shade other dependencies in parquet, such as `
> it.unimi.dsi:fastutil`, but we do not provide a module for just shading
> it. So it would be more consistent to do it the same with jackson, just
> shade it in modules where it is used.
> * We should probably introduce maven-shade-plugin in all modules so that
> the relocation (managed by the root pom) will be done where necessary (but
> that could be a separate topic)
> * Introducing parquet-jackson as a dependency but not using relocated
> jackson package in the code is kind of confusing.
>
> Qinghui
>
> Le lun. 18 févr. 2019 à 18:14, Ryan Blue <[email protected]> a écrit :
>
>> Qinghui,
>>
>> Parquet source uses the unshaded dependencies, but those dependencies are
>> rewritten in every module's build. That way source code remains compatible
>> with Jackson and shading is just part of the build. I've closed
>> PARQUET-1281 with "Not a problem".
>>
>> rb
>>
>> On Mon, Feb 18, 2019 at 9:04 AM XU Qinghui <[email protected]>
>> wrote:
>>
>>> I think it's necessary to shade jackson, as it is used a lot in
>>> different environment (with different versions).
>>> But it turns out that the parquet-jackson is not used everywhere inside
>>> parquet-mr (PARQUET-1281
>>> <https://issues.apache.org/jira/browse/PARQUET-1281>).
>>> I brought up this subject some while ago, but it seems that it would be
>>> more convenient / friendly for developers to use directly jackson and then
>>> shade it when packaging. In fact, IDEs often have problem to deal with
>>> "shaded" package in your code appropriately. So we might just remove the
>>> parquet-jackson module, and shade our artifacts instead.
>>>
>>> Best wishes,
>>> Qinghui
>>>
>>> Le lun. 18 févr. 2019 à 17:47, Ryan Blue <[email protected]> a
>>> écrit :
>>>
>>>> I don't think that removing the shading is a good idea. Jackson is a
>>>> very
>>>> common dependency and pulling in projects that use different versions
>>>> has
>>>> caused a lot of headache. Why go backward and make Parquet vulnerable to
>>>> those problems? I don't see a good justification for it.
>>>>
>>>> On Mon, Feb 18, 2019 at 8:29 AM Jacques Nadeau <[email protected]>
>>>> wrote:
>>>>
>>>> > I haven't looked at the usage but would wonder if the core modules
>>>> truly
>>>> > need jackson. I don't think most of the systems that read Parquet use
>>>> the
>>>> > jackson part (?). If so, maybe the code could be refactored to remove
>>>> the
>>>> > dependency and it be moved to an optional component. We want to do
>>>> the same
>>>> > thing with Jackson in Arrow (and did it recently for Guava).
>>>> >
>>>> > On Mon, Feb 18, 2019 at 3:09 AM Driesprong, Fokko
>>>> <[email protected]>
>>>> > wrote:
>>>> >
>>>> > > Hi all,
>>>> > >
>>>> > > Recently I've opened a PR to move from Jackson 1.x to Jackson 2.9
>>>> > > <https://github.com/apache/parquet-mr/pull/616>. I've also removed
>>>> the
>>>> > > shading project since most libraries are up to date with Jackson
>>>> 2.x.
>>>> > Gabor
>>>> > > suggested having a discussion on the mailing list to discuss the
>>>> removal
>>>> > of
>>>> > > the shading of Jackson.
>>>> > >
>>>> > > Spark 2.x is at 2.6, Spark 3.0 at 2.9.6, Hadoop at 2.9.x, Flink at
>>>> 2.7.9,
>>>> > > but that one is shaded anyway :-) One problem might be Apache Avro
>>>> which
>>>> > is
>>>> > > still using Jackson 1.x (codehause), until we release Avro 1.9.
>>>> > >
>>>> > > What are the thoughts on this subject, should we still shade
>>>> Jackson, or
>>>> > > not?
>>>> > >
>>>> > > Cheers, Fokko
>>>> > >
>>>> >
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Netflix
>>>>
>>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>

-- 
Ryan Blue
Software Engineer
Netflix

Reply via email to