In any case, IMHO, even updating Jackson version numbers in two places is preferable to compiling against shaded packages.
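
To make the "compile against the original Jackson, relocate afterwards" idea concrete, here is a rough sketch of what it could look like in a Gradle Kotlin DSL build. It is only an illustration under assumptions: it presumes the Shadow plugin is already applied to the Spark Client module and that the task is named "shadowJar", and "<iceberg-jackson-version>" is a placeholder for whatever version the Iceberg POM reports, not a real value. The actual Polaris build may be organized differently.

```kotlin
// build.gradle.kts fragment (sketch only; assumes the Shadow plugin is
// already applied to this module, which is not confirmed in this thread)
import com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar

dependencies {
    // Compile against the original, unshaded Jackson. compileOnly keeps it
    // out of the published jar. "<iceberg-jackson-version>" is a placeholder
    // that would have to match the Jackson version shaded into
    // iceberg-spark-runtime.
    compileOnly("com.fasterxml.jackson.core:jackson-databind:<iceberg-jackson-version>")
}

tasks.named<ShadowJar>("shadowJar") {
    // Rewrite Jackson references in the client classes so that at runtime
    // they resolve to the copies relocated inside iceberg-spark-runtime.
    relocate("com.fasterxml.jackson", "org.apache.iceberg.shaded.com.fasterxml.jackson")
}
```

With something like this, the sources would import plain com.fasterxml.jackson.* classes, and only the packaged classes would point at the relocated names. The cost is the one Yufei mentioned: the placeholder version has to be kept in step with Iceberg.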

On Fri, Jun 20, 2025 at 3:25 PM Dmitri Bourlatchkov <di...@apache.org> wrote:

> I suppose we should be able to get the version of Jackson used by Iceberg
> from Iceberg POM information, right?
>
> Cheers,
> Dmitri.
>
> On Fri, Jun 20, 2025 at 3:08 PM Yufei Gu <flyrain...@gmail.com> wrote:
>
>> That's an interesting idea. But it requires us to maintain the consistency
>> of the Jackson version in two places instead of one. The original Jackson
>> version has to match with the one shaded in Iceberg spark runtime. Every
>> time we update one, we have to remember to update another. I'm not sure if
>> it improves the situation.
>>
>> Yufei
>>
>>
>> On Fri, Jun 20, 2025 at 11:43 AM Dmitri Bourlatchkov <di...@apache.org>
>> wrote:
>>
>> > Hi Yun and Yufei,
>> >
>> > > Specifically, why does CreateGenericTableRESTRequest use the shaded
>> > > Jackson?
>> >
>> > As discussed off list, request / response payload classes have to work
>> > with the version of Jackson included with the Iceberg Spark jars
>> > (because they own the RESTClient).
>> >
>> > That in itself is fine.
>> >
>> > I'd like to propose a different approach to implementing that in
>> > Polaris, though.
>> >
>> > Instead of compiling against relocated classes, why don't we compile
>> > against the original Jackson jar, and later relocate the Spark Client to
>> > "org.apache.iceberg.shaded.com.fasterxml.jackson.*" ?
>> >
>> > I believe Jackson is the only relocation concern.
>> >
>> > After relocation we can publish both the "thin" client for use with
>> > --package in Spark, and the "fat" jar for use with --jar. Both artifacts
>> > will depend on the relocated Iceberg artifacts.
>> >
>> > WDYT?
>> >
>> > Cheers,
>> > Dmitri.
>> >
>> > On Fri, Jun 20, 2025 at 1:05 PM Dmitri Bourlatchkov <di...@apache.org>
>> > wrote:
>> >
>> > > Thanks for the quick response, Yun!
>> > >
>> > > > org.apache.polaris#polaris-core
>> > > > org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
>> > >
>> > > IIRC, polaris-core uses Jackson. iceberg-spark-runtime also uses
>> > > Jackson, but it shades it.
>> > >
>> > > I believe I saw issues with using both shaded and non-shaded Jackson
>> > > in the same Spark env. with Iceberg.
>> > >
>> > > This may or may not be a concern for our Spark Client. What I mean is
>> > > that it may need some more consideration to be sure.
>> > >
>> > > Specifically, why does CreateGenericTableRESTRequest use the shaded
>> > > Jackson?
>> > >
>> > > WDYT?
>> > >
>> > > Thanks,
>> > > Dmitri.
>> > >
>> > > On Fri, Jun 20, 2025 at 12:47 PM yun zou <yunzou.colost...@gmail.com>
>> > > wrote:
>> > >
>> > >> *-- What is the maven artifact that Spark can automatically pull
>> > >> (via --packages)*
>> > >>
>> > >> Our spark client pulls the following:
>> > >>
>> > >> org.apache.polaris#polaris-spark-3.5_2.12
>> > >>
>> > >> org.apache.polaris#polaris-core
>> > >>
>> > >> org.apache.polaris#polaris-api-management-model
>> > >>
>> > >> org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
>> > >>
>> > >>
>> > >> Prior to the change, it also pulled iceberg-core and avro 1.20.0.
>> > >>
>> > >>
>> > >> *-- Does that artifact use shaded dependencies*
>> > >>
>> > >> Any usage of classes from iceberg-spark-runtime uses the shaded
>> > >> libraries shipped along with the artifacts.
>> > >>
>> > >>
>> > >> *-- Does that artifact depend on the Iceberg Spark bundle?*
>> > >>
>> > >> If you are referring to our spark client, it depends on
>> > >> iceberg-spark-runtime, not other bundles.
>> > >>
>> > >>
>> > >> *-- Is the _code_ running in Spark the same when the Polaris Spark
>> > >> Client is pulled via --packages and via --jars?*
>> > >>
>> > >> yes, the jar and package will use the same code, where the jar simply
>> > >> packs everything for the user and there is no need to download any
>> > >> other dependency.
>> > >>
>> > >>
>> > >> Best Regards,
>> > >>
>> > >> Yun
>> > >>
>> > >>
>> > >> On Fri, Jun 20, 2025 at 9:18 AM Dmitri Bourlatchkov <di...@apache.org>
>> > >> wrote:
>> > >>
>> > >> > Some questions for clarification:
>> > >> >
>> > >> > * What is the maven artifact that Spark can automatically pull (via
>> > >> >   --packages)?
>> > >> > * Does that artifact use shaded dependencies?
>> > >> > * Does that artifact depend on the Iceberg Spark bundle?
>> > >> > * Is the _code_ running in Spark the same when the Polaris Spark
>> > >> >   Client is pulled via --packages and via --jars?
>> > >> >
>> > >> > I know I could have figured that out from code, but I'm asking here
>> > >> > because I think we may need to review our approach to publishing
>> > >> > these artifacts.
>> > >> >
>> > >> > I believe that regardless of the method of including the Client into
>> > >> > Spark runtime, the code has to be exactly the same.... and I doubt it
>> > >> > is the same now. WDYT?
>> > >> >
>> > >> > Thanks,
>> > >> > Dmitri.
>> > >> >
>> > >> >
>> > >> > On Fri, Jun 20, 2025 at 10:15 AM Dmitri Bourlatchkov <di...@apache.org>
>> > >> > wrote:
>> > >> >
>> > >> > > Hi All,
>> > >> > >
>> > >> > > Re: PR [1908] let's use this thread to clarify the problems we're
>> > >> > > trying to solve and options for solutions.
>> > >> > >
>> > >> > > As for me, it looks like some refactoring in the way the Spark
>> > >> > > Client is built and published may be needed.
>> > >> > >
>> > >> > > I think it makes sense to clarify this before 1.0 to avoid changes
>> > >> > > to Maven coordinates right after 1.0
>> > >> > >
>> > >> > > [1908] https://github.com/apache/polaris/pull/1908
>> > >> > >
>> > >> > > Thanks,
>> > >> > > Dmitri.