In any case, IMHO, even updating Jackson version numbers in two places is
preferable to compiling against shaded packages.

On Fri, Jun 20, 2025 at 3:25 PM Dmitri Bourlatchkov <di...@apache.org>
wrote:

> I suppose we should be able to get the version of Jackson used by Iceberg
> from Iceberg POM information, right?
>
> Cheers,
> Dmitri.
>
> On Fri, Jun 20, 2025 at 3:08 PM Yufei Gu <flyrain...@gmail.com> wrote:
>
>> That's an interesting idea. But it requires us to maintain the consistency
>> of the Jackson version in two places instead of one. The original Jackson
>> version has to match the one shaded in the Iceberg Spark runtime. Every
>> time we update one, we have to remember to update the other. I'm not sure
>> it improves the situation.
>>
>> Yufei
>>
>>
>> On Fri, Jun 20, 2025 at 11:43 AM Dmitri Bourlatchkov <di...@apache.org>
>> wrote:
>>
>> > Hi Yun and Yufei,
>> >
>> > > Specifically, why does CreateGenericTableRESTRequest use the shaded
>> > Jackson?
>> >
>> > As discussed off list, request / response payload classes have to work
>> > with the version of Jackson included with the Iceberg Spark jars (because
>> > they own the RESTClient).
>> >
>> > That in itself is fine.
>> >
>> > I'd like to propose a different approach to implementing that in
>> > Polaris, though.
>> >
>> > Instead of compiling against relocated classes, why don't we compile
>> > against the original Jackson jar, and later relocate the Spark Client to
>> > "org.apache.iceberg.shaded.com.fasterxml.jackson.*" ?
>> >
>> > I believe Jackson is the only relocation concern.
>> >
>> > After relocation we can publish both the "thin" client for use with
>> > --packages in Spark, and the "fat" jar for use with --jars. Both artifacts
>> > will depend on the relocated Iceberg artifacts.
>> >
>> > WDYT?
>> >
>> > Cheers,
>> > Dmitri.
>> >
>> > On Fri, Jun 20, 2025 at 1:05 PM Dmitri Bourlatchkov <di...@apache.org>
>> > wrote:
>> >
>> > > Thanks for the quick response, Yun!
>> > >
>> > > > org.apache.polaris#polaris-core
>> > > > org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
>> > >
>> > > IIRC, polaris-core uses Jackson. iceberg-spark-runtime also uses
>> > > Jackson, but it shades it.
>> > >
>> > > I believe I saw issues with using both shaded and non-shaded Jackson
>> > > in the same Spark env. with Iceberg.
>> > >
>> > > This may or may not be a concern for our Spark Client. What I mean is
>> > > that it may need some more consideration to be sure.
>> > >
>> > > Specifically, why does CreateGenericTableRESTRequest use the shaded
>> > > Jackson?
>> > >
>> > > WDYT?
>> > >
>> > > Thanks,
>> > > Dmitri.
>> > >
>> > > On Fri, Jun 20, 2025 at 12:47 PM yun zou <yunzou.colost...@gmail.com>
>> > > wrote:
>> > >
>> > >> *-- What is the maven artifact that Spark can automatically pull
>> > >> (via --packages)?*
>> > >>
>> > >> Our Spark client pulls the following:
>> > >>
>> > >> org.apache.polaris#polaris-spark-3.5_2.12
>> > >>
>> > >> org.apache.polaris#polaris-core
>> > >>
>> > >> org.apache.polaris#polaris-api-management-model
>> > >>
>> > >> org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
>> > >>
>> > >>
>> > >> Prior to the change, it also pulled iceberg-core and avro 1.20.0.
>> > >>
>> > >>
>> > >> *-- Does that artifact use shaded dependencies?*
>> > >>
>> > >> Any usage of classes from iceberg-spark-runtime uses the shaded
>> > >> libraries shipped along with the artifacts.
>> > >>
>> > >>
>> > >>
>> > >> *-- Does that artifact depend on the Iceberg Spark bundle?*
>> > >>
>> > >> If you are referring to our Spark client, it depends on
>> > >> iceberg-spark-runtime, not other bundles.
>> > >>
>> > >>
>> > >>
>> > >> *-- Is the _code_ running in Spark the same when the Polaris Spark Client
>> > >> is pulled via --packages and via --jars?*
>> > >>
>> > >>
>> > >> Yes, the jar and the package use the same code; the jar simply packs
>> > >> everything for the user, so there is no need to download any other
>> > >> dependency.
>> > >>
>> > >>
>> > >> Best Regards,
>> > >>
>> > >> Yun
>> > >>
>> > >>
>> > >>
>> > >> On Fri, Jun 20, 2025 at 9:18 AM Dmitri Bourlatchkov <di...@apache.org>
>> > >> wrote:
>> > >>
>> > >> > Some questions for clarification:
>> > >> >
>> > >> > * What is the maven artifact that Spark can automatically pull (via
>> > >> > --packages)?
>> > >> > * Does that artifact use shaded dependencies?
>> > >> > * Does that artifact depend on the Iceberg Spark bundle?
>> > >> > * Is the _code_ running in Spark the same when the Polaris Spark Client
>> > >> > is pulled via --packages and via --jars?
>> > >> >
>> > >> > I know I could have figured that out from code, but I'm asking here
>> > >> > because I think we may need to review our approach to publishing these
>> > >> > artifacts.
>> > >> >
>> > >> > I believe that regardless of the method of including the Client into
>> > >> > Spark runtime, the code has to be exactly the same... and I doubt it is
>> > >> > the same now. WDYT?
>> > >> >
>> > >> > Thanks,
>> > >> > Dmitri.
>> > >> >
>> > >> >
>> > >> > On Fri, Jun 20, 2025 at 10:15 AM Dmitri Bourlatchkov <di...@apache.org>
>> > >> > wrote:
>> > >> >
>> > >> > > Hi All,
>> > >> > >
>> > >> > > Re: PR [1908] let's use this thread to clarify the problems we're
>> > >> > > trying to solve and options for solutions.
>> > >> > >
>> > >> > > As for me, it looks like some refactoring in the way the Spark
>> > >> > > Client is built and published may be needed.
>> > >> > >
>> > >> > > I think it makes sense to clarify this before 1.0 to avoid
>> > >> > > changes to Maven coordinates right after 1.0.
>> > >> > >
>> > >> > > [1908] https://github.com/apache/polaris/pull/1908
>> > >> > >
>> > >> > > Thanks,
>> > >> > > Dmitri.
>> > >> > >
>> > >> > >
>> > >> >
>> > >>
>> > >
>> >
>>
>
