Thanks all for the feedback. I've updated the PR to shade Jackson 2.x as well based on the feedback of the thread. Personally, I haven't had that much issues with the latest versions of Jackson, but let us stay on the safe side.
Jackson is still being used in: - parquet-cli - parquet-hadoop - parquet-pig - parquet-thrift - parquet-tools I've removed it from parquet-scrooge since it was included in the pom, but wasn't used in the code. Cheers, Fokko Op di 19 feb. 2019 om 18:34 schreef Ryan Blue <[email protected]>: > The difference between Jackson and fastutil is that Jackson is used in > several modules. Instead of shading the classes into each one, we created a > module. If I remember correctly, fastutil is used and shaded in just one > module. > > If you want to improve how the plugin is called, that would be great. But > it doesn't change the need for shading Jackson. > > Sorry if this is confusing, but this is how Parquet has worked for a long > time and it doesn't leak into downstream projects. Most people working with > the Parquet codebase don't have to worry about how Jackson is handled > because we use the normal package names. And an even better reason for this > is that we never have to worry about accidentally using the wrong import > and having a bad release. > > rb > > On Tue, Feb 19, 2019 at 4:56 AM XU Qinghui <[email protected]> > wrote: > > > Hi, Ryan > > > > The ticket that I mentioned is indeed outdated, thanks for closing it. In > > fact it is solved as a side effect of PARQUET-1529( > > https://github.com/apache/parquet-mr/pull/617), which adds > > maven-shade-plugin to the package phase and thus relocates jackson > packages. > > > > But my point is: > > It seems that it does not make sense to keep the module parquet-jackson > > which just provides relocated jackson as a dependency -- > > * We also shade other dependencies in parquet, such as ` > > it.unimi.dsi:fastutil`, but we do not provide a module for just shading > > it. So it would be more consistent to do it the same with jackson, just > > shade it in modules where it is used. > > * We should probably introduce maven-shade-plugin in all modules so that > > the relocation (managed by the root pom) will be done where necessary > (but > > that could be a separate topic) > > * Introducing parquet-jackson as a dependency but not using relocated > > jackson package in the code is kind of confusing. > > > > Qinghui > > > > Le lun. 18 févr. 2019 à 18:14, Ryan Blue <[email protected]> a écrit : > > > >> Qinghui, > >> > >> Parquet source uses the unshaded dependencies, but those dependencies > are > >> rewritten in every module's build. That way source code remains > compatible > >> with Jackson and shading is just part of the build. I've closed > >> PARQUET-1281 with "Not a problem". > >> > >> rb > >> > >> On Mon, Feb 18, 2019 at 9:04 AM XU Qinghui < > [email protected]> > >> wrote: > >> > >>> I think it's necessary to shade jackson, as it is used a lot in > >>> different environment (with different versions). > >>> But it turns out that the parquet-jackson is not used everywhere inside > >>> parquet-mr (PARQUET-1281 > >>> <https://issues.apache.org/jira/browse/PARQUET-1281>). > >>> I brought up this subject some while ago, but it seems that it would be > >>> more convenient / friendly for developers to use directly jackson and > then > >>> shade it when packaging. In fact, IDEs often have problem to deal with > >>> "shaded" package in your code appropriately. So we might just remove > the > >>> parquet-jackson module, and shade our artifacts instead. > >>> > >>> Best wishes, > >>> Qinghui > >>> > >>> Le lun. 18 févr. 2019 à 17:47, Ryan Blue <[email protected]> a > >>> écrit : > >>> > >>>> I don't think that removing the shading is a good idea. Jackson is a > >>>> very > >>>> common dependency and pulling in projects that use different versions > >>>> has > >>>> caused a lot of headache. Why go backward and make Parquet vulnerable > to > >>>> those problems? I don't see a good justification for it. > >>>> > >>>> On Mon, Feb 18, 2019 at 8:29 AM Jacques Nadeau <[email protected]> > >>>> wrote: > >>>> > >>>> > I haven't looked at the usage but would wonder if the core modules > >>>> truly > >>>> > need jackson. I don't think most of the systems that read Parquet > use > >>>> the > >>>> > jackson part (?). If so, maybe the code could be refactored to > remove > >>>> the > >>>> > dependency and it be moved to an optional component. We want to do > >>>> the same > >>>> > thing with Jackson in Arrow (and did it recently for Guava). > >>>> > > >>>> > On Mon, Feb 18, 2019 at 3:09 AM Driesprong, Fokko > >>>> <[email protected]> > >>>> > wrote: > >>>> > > >>>> > > Hi all, > >>>> > > > >>>> > > Recently I've opened a PR to move from Jackson 1.x to Jackson 2.9 > >>>> > > <https://github.com/apache/parquet-mr/pull/616>. I've also > removed > >>>> the > >>>> > > shading project since most libraries are up to date with Jackson > >>>> 2.x. > >>>> > Gabor > >>>> > > suggested having a discussion on the mailing list to discuss the > >>>> removal > >>>> > of > >>>> > > the shading of Jackson. > >>>> > > > >>>> > > Spark 2.x is at 2.6, Spark 3.0 at 2.9.6, Hadoop at 2.9.x, Flink at > >>>> 2.7.9, > >>>> > > but that one is shaded anyway :-) One problem might be Apache Avro > >>>> which > >>>> > is > >>>> > > still using Jackson 1.x (codehause), until we release Avro 1.9. > >>>> > > > >>>> > > What are the thoughts on this subject, should we still shade > >>>> Jackson, or > >>>> > > not? > >>>> > > > >>>> > > Cheers, Fokko > >>>> > > > >>>> > > >>>> > >>>> > >>>> -- > >>>> Ryan Blue > >>>> Software Engineer > >>>> Netflix > >>>> > >>> > >> > >> -- > >> Ryan Blue > >> Software Engineer > >> Netflix > >> > > > > -- > Ryan Blue > Software Engineer > Netflix >
