Hi Ryan,

The ticket that I mentioned is indeed outdated; thanks for closing it. In fact, it was solved as a side effect of PARQUET-1529 (https://github.com/apache/parquet-mr/pull/617), which adds maven-shade-plugin to the package phase and thus relocates the jackson packages.
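
For anyone not familiar with relocation, here is a rough sketch of what such a maven-shade-plugin setup can look like. It is not copied from the PR: the `shaded.parquet` prefix and the artifact filter are assumptions for illustration, and the pattern shown is the Jackson 1.x package (Jackson 2.x would use `com.fasterxml.jackson`).

```xml
<!-- Illustrative sketch only, not the actual parquet-mr configuration. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <!-- Run the shade goal when the module is packaged -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <includes>
            <!-- Assumed filter: bundle only the Jackson 1.x artifacts -->
            <include>org.codehaus.jackson:*</include>
          </includes>
        </artifactSet>
        <relocations>
          <relocation>
            <!-- Package prefix as written in the source code -->
            <pattern>org.codehaus.jackson</pattern>
            <!-- Hypothetical prefix the classes are rewritten to in the jar -->
            <shadedPattern>shaded.parquet.org.codehaus.jackson</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a setup along these lines, the source keeps importing the plain jackson packages, and only the bytecode in the packaged jar refers to the relocated names.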
But my point is that it does not seem to make sense to keep the parquet-jackson module, which just provides relocated jackson as a dependency:

* We also shade other dependencies in parquet, such as `it.unimi.dsi:fastutil`, but we do not provide a module just for shading them. It would be more consistent to treat jackson the same way and shade it in the modules where it is used.
* We should probably introduce maven-shade-plugin in all modules so that the relocation (managed by the root pom) is done wherever necessary (but that could be a separate topic).
* Introducing parquet-jackson as a dependency without using the relocated jackson package in the code is rather confusing.

Qinghui

On Mon, Feb 18, 2019 at 18:14, Ryan Blue <[email protected]> wrote:

> Qinghui,
>
> Parquet source uses the unshaded dependencies, but those dependencies are
> rewritten in every module's build. That way source code remains compatible
> with Jackson and shading is just part of the build. I've closed
> PARQUET-1281 with "Not a problem".
>
> rb
>
> On Mon, Feb 18, 2019 at 9:04 AM XU Qinghui <[email protected]>
> wrote:
>
>> I think it's necessary to shade jackson, as it is used a lot in different
>> environments (with different versions).
>> But it turns out that parquet-jackson is not used everywhere inside
>> parquet-mr (PARQUET-1281
>> <https://issues.apache.org/jira/browse/PARQUET-1281>).
>> I brought up this subject a while ago, but it seems that it would be
>> more convenient / friendly for developers to use jackson directly and then
>> shade it when packaging. In fact, IDEs often have problems dealing with
>> "shaded" packages in your code appropriately. So we might just remove the
>> parquet-jackson module, and shade our artifacts instead.
>>
>> Best wishes,
>> Qinghui
>>
>> On Mon, Feb 18, 2019 at 17:47, Ryan Blue <[email protected]> wrote:
>>
>>> I don't think that removing the shading is a good idea. Jackson is a very
>>> common dependency and pulling in projects that use different versions has
>>> caused a lot of headaches. Why go backward and make Parquet vulnerable to
>>> those problems? I don't see a good justification for it.
>>>
>>> On Mon, Feb 18, 2019 at 8:29 AM Jacques Nadeau <[email protected]>
>>> wrote:
>>>
>>> > I haven't looked at the usage but would wonder if the core modules truly
>>> > need jackson. I don't think most of the systems that read Parquet use the
>>> > jackson part (?). If so, maybe the code could be refactored to remove the
>>> > dependency and it be moved to an optional component. We want to do the
>>> > same thing with Jackson in Arrow (and did it recently for Guava).
>>> >
>>> > On Mon, Feb 18, 2019 at 3:09 AM Driesprong, Fokko <[email protected]>
>>> > wrote:
>>> >
>>> > > Hi all,
>>> > >
>>> > > Recently I've opened a PR to move from Jackson 1.x to Jackson 2.9
>>> > > <https://github.com/apache/parquet-mr/pull/616>. I've also removed the
>>> > > shading project since most libraries are up to date with Jackson 2.x.
>>> > > Gabor suggested having a discussion on the mailing list to discuss the
>>> > > removal of the shading of Jackson.
>>> > >
>>> > > Spark 2.x is at 2.6, Spark 3.0 at 2.9.6, Hadoop at 2.9.x, Flink at
>>> > > 2.7.9, but that one is shaded anyway :-) One problem might be Apache
>>> > > Avro, which is still using Jackson 1.x (codehaus), until we release
>>> > > Avro 1.9.
>>> > >
>>> > > What are the thoughts on this subject, should we still shade Jackson,
>>> > > or not?
>>> > >
>>> > > Cheers, Fokko
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>
> --
> Ryan Blue
> Software Engineer
> Netflix
