Hi Yun,

Current docs [1] suggest using `--packages
org.apache.polaris:polaris-spark-3.5_2.12:1.0.0` but PR 1908 produces
`polaris-spark-3.5_2.12-1.1.0-incubating-SNAPSHOT-bundle.jar` (note the
`bundle` classifier; disregard the version difference).

At least this is what I saw in my local build. Is that a concern?
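For concreteness, these are the two usage modes I'm comparing (coordinates
are illustrative, taken from the current docs and my local build output):

```shell
# Option 1: let Spark resolve the client and its dependencies from Maven,
# as the current docs suggest
spark-shell --packages org.apache.polaris:polaris-spark-3.5_2.12:1.0.0

# Option 2: pass the locally built bundle jar directly, with all
# dependencies packed inside
spark-shell --jars polaris-spark-3.5_2.12-1.1.0-incubating-SNAPSHOT-bundle.jar
```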

[1]
https://github.com/apache/polaris/blob/main/site/content/in-dev/unreleased/polaris-spark-client.md

Thanks,
Dmitri.

On Fri, Jun 20, 2025 at 12:47 PM yun zou <yunzou.colost...@gmail.com> wrote:

> *-- What is the maven artifact that Spark can automatically pull (via
> --packages)*
>
> Our spark client pulls the following:
>
> org.apache.polaris#polaris-spark-3.5_2.12
>
> org.apache.polaris#polaris-core
>
> org.apache.polaris#polaris-api-management-model
>
> org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
>
>
> Prior to the change, it also pulled iceberg-core and avro 1.20.0.
>
>
> *-- Does that artifact use shaded dependencies*
>
> Any usage of classes from iceberg-spark-runtime uses the shaded libraries
> shipped along with the artifacts.
>
>
>
> *-- Does that artifact depend on the Iceberg Spark bundle?*
>
> If you are referring to our spark client, it depends on
> iceberg-spark-runtime,
> not other bundles.
>
>
>
> *-- Is the _code_ running in Spark the same when the Polaris Spark Client
> is pulled via --packages and via --jars?*
>
>
> Yes, the jar and the package use the same code; the jar simply packs
> everything for the user, so there is no need to download any other
> dependencies.
>
>
> Best Regards,
>
> Yun
>
>
>
> On Fri, Jun 20, 2025 at 9:18 AM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
> > Some questions for clarification:
> >
> > * What is the maven artifact that Spark can automatically pull (via
> > --packages)?
> > * Does that artifact use shaded dependencies?
> > * Does that artifact depend on the Iceberg Spark bundle?
> > * Is the _code_ running in Spark the same when the Polaris Spark Client
> is
> > pulled via --packages and via --jars?
> >
> > I know I could have figured that out from code, but I'm asking here
> because
> > I think we may need to review our approach to publishing these artifacts.
> >
> > I believe that regardless of the method of including the Client into
> Spark
> > runtime, the code has to be exactly the same.... and I doubt it is the
> same
> > now. WDYT?
> >
> > Thanks,
> > Dmitri.
> >
> >
> > On Fri, Jun 20, 2025 at 10:15 AM Dmitri Bourlatchkov <di...@apache.org>
> > wrote:
> >
> > > Hi All,
> > >
> > > Re: PR [1908] let's use this thread to clarify the problems we're
> trying
> > > to solve and options for solutions.
> > >
> > > As for me, it looks like some refactoring in the way the Spark Client
> is
> > > built and published may be needed.
> > >
> > > I think it makes sense to clarify this before 1.0 to avoid changes to
> > > Maven coordinates right after 1.0.
> > >
> > > [1908] https://github.com/apache/polaris/pull/1908
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > >
> >
>