I suppose we should be able to get the version of Jackson used by Iceberg
from Iceberg POM information, right?
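
For example, a build could probe the (non-shaded) iceberg-core POM and reuse
whatever Jackson version it declares. A minimal Gradle Kotlin DSL sketch; the
configuration name and the Iceberg version here are only illustrative:

  // Resolvable configuration used only to inspect Iceberg's declared dependencies.
  // iceberg-spark-runtime shades Jackson, so the probe goes through iceberg-core.
  val icebergJacksonProbe by configurations.creating {
      isCanBeConsumed = false
  }

  dependencies {
      icebergJacksonProbe("org.apache.iceberg:iceberg-core:1.9.0") // version illustrative
  }

  val jacksonVersionFromIceberg: String? = icebergJacksonProbe
      .resolvedConfiguration.resolvedArtifacts
      .map { it.moduleVersion.id }
      .firstOrNull { it.group == "com.fasterxml.jackson.core" && it.name == "jackson-databind" }
      ?.version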

Cheers,
Dmitri.

On Fri, Jun 20, 2025 at 3:08 PM Yufei Gu <flyrain...@gmail.com> wrote:

> That's an interesting idea, but it requires us to keep the Jackson version
> consistent in two places instead of one. The original Jackson version has to
> match the one shaded in the Iceberg Spark runtime, so every time we update
> one, we have to remember to update the other. I'm not sure it improves the
> situation.
>
> Yufei
>
>
> On Fri, Jun 20, 2025 at 11:43 AM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
> > Hi Yun and Yufei,
> >
> > > Specifically, why does CreateGenericTableRESTRequest use the shaded
> > > Jackson?
> >
> > As discussed off list, request / response payload classes have to work
> > with the version of Jackson included with the Iceberg Spark jars (because
> > they own the RESTClient).
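> >
> > As a rough illustration (not the actual Polaris source; the field name is
> > invented), a payload class ends up annotated with the relocated Jackson
> > classes, e.g.:
> >
> >   // Sketch only: the annotation comes from the package relocated inside
> >   // iceberg-spark-runtime, so Iceberg's RESTClient can (de)serialize it.
> >   import org.apache.iceberg.shaded.com.fasterxml.jackson.annotation.JsonProperty
> >
> >   class CreateGenericTableRESTRequest {
> >       @JsonProperty("name")
> >       var name: String? = null
> >   }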
> >
> > That in itself is fine.
> >
> > I'd like to propose a different approach to implementing that in Polaris,
> > though.
> >
> > Instead of compiling against relocated classes, why don't we compile
> > against the original Jackson jar, and later relocate Jackson in the Spark
> > Client to "org.apache.iceberg.shaded.com.fasterxml.jackson.*"?
> >
> > I believe Jackson is the only relocation concern.
> >
> > After relocation we can publish both the "thin" client for use with
> > --packages in Spark, and the "fat" jar for use with --jars. Both artifacts
> > will depend on the relocated Iceberg artifacts.
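> >
> > For illustration, the relocation piece could look roughly like this with
> > the Shadow plugin (a sketch, not a finished build change):
> >
> >   // build.gradle.kts of the Spark client, assuming the Shadow plugin is applied
> >   tasks.shadowJar {
> >       // Rewrite our plain Jackson references to the package that
> >       // iceberg-spark-runtime already ships, so only one Jackson copy is in play.
> >       relocate("com.fasterxml.jackson", "org.apache.iceberg.shaded.com.fasterxml.jackson")
> >   }
> >
> > Presumably the "thin" artifact would need the same relocation applied, so
> > the classes stay identical between --packages and --jars.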
> >
> > WDYT?
> >
> > Cheers,
> > Dmitri.
> >
> > On Fri, Jun 20, 2025 at 1:05 PM Dmitri Bourlatchkov <di...@apache.org>
> > wrote:
> >
> > > Thanks for the quick response, Yun!
> > >
> > > > org.apache.polaris#polaris-core
> > > > org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
> > >
> > > IIRC, polaris-core uses Jackson. iceberg-spark-runtime also uses
> > > Jackson, but it shades it.
> > >
> > > I believe I saw issues with using both shaded and non-shaded Jackson in
> > > the same Spark env. with Iceberg.
> > >
> > > This may or may not be a concern for our Spark Client. What I mean is
> > > that it may need some more consideration to be sure.
> > >
> > > Specifically, why does CreateGenericTableRESTRequest use the shaded
> > > Jackson?
> > >
> > > WDYT?
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > > On Fri, Jun 20, 2025 at 12:47 PM yun zou <yunzou.colost...@gmail.com>
> > > wrote:
> > >
> > >> *-- What is the maven artifact that Spark can automatically pull
> > >> (via --packages)*
> > >>
> > >> Our spark client pulls the following:
> > >>
> > >> org.apache.polaris#polaris-spark-3.5_2.12
> > >>
> > >> org.apache.polaris#polaris-core
> > >>
> > >> org.apache.polaris#polaris-api-management-model
> > >>
> > >> org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
> > >>
> > >>
> > >> Prior to the change, it also pulled iceberg-core and avro 1.20.0.
> > >>
> > >>
> > >> *-- Does that artifact use shaded dependencies*
> > >>
> > >> Any usage of classes from iceberg-spark-runtime uses the shaded
> > >> libraries shipped along with the artifacts.
> > >>
> > >>
> > >>
> > >> *-- Does that artifact depend on the Iceberg Spark bundle?*
> > >>
> > >> If you are referring to our spark client, it depends on
> > >> iceberg-spark-runtime,
> > >> not other bundles.
> > >>
> > >>
> > >>
> > >> *-- Is the _code_ running in Spark the same when the Polaris Spark
> > >> Client is pulled via --packages and via --jars?*
> > >>
> > >>
> > >> Yes, the jar and the package use the same code; the jar simply packs
> > >> everything for the user, so there is no need to download any other
> > >> dependencies.
> > >>
> > >>
> > >> Best Regards,
> > >>
> > >> Yun
> > >>
> > >>
> > >>
> > >> On Fri, Jun 20, 2025 at 9:18 AM Dmitri Bourlatchkov <di...@apache.org>
> > >> wrote:
> > >>
> > >> > Some questions for clarification:
> > >> >
> > >> > * What is the maven artifact that Spark can automatically pull (via
> > >> > --packages)?
> > >> > * Does that artifact use shaded dependencies?
> > >> > * Does that artifact depend on the Iceberg Spark bundle?
> > >> > * Is the _code_ running in Spark the same when the Polaris Spark
> > >> > Client is pulled via --packages and via --jars?
> > >> >
> > >> > I know I could have figured that out from code, but I'm asking here
> > >> > because I think we may need to review our approach to publishing these
> > >> > artifacts.
> > >> >
> > >> > I believe that regardless of the method of including the Client into
> > >> > the Spark runtime, the code has to be exactly the same... and I doubt
> > >> > it is the same now. WDYT?
> > >> >
> > >> > Thanks,
> > >> > Dmitri.
> > >> >
> > >> >
> > >> > On Fri, Jun 20, 2025 at 10:15 AM Dmitri Bourlatchkov <di...@apache.org>
> > >> > wrote:
> > >> >
> > >> > > Hi All,
> > >> > >
> > >> > > Re: PR [1908] let's use this thread to clarify the problems we're
> > >> > > trying to solve and options for solutions.
> > >> > >
> > >> > > As for me, it looks like some refactoring in the way the Spark
> > >> > > Client is built and published may be needed.
> > >> > >
> > >> > > I think it makes sense to clarify this before 1.0 to avoid changes
> > >> > > to Maven coordinates right after 1.0
> > >> > >
> > >> > > [1908] https://github.com/apache/polaris/pull/1908
> > >> > >
> > >> > > Thanks,
> > >> > > Dmitri.
> > >> > >
> > >> > >
> > >> >
> > >>
> > >
> >
>
