I definitely agree that we should resolve this issue for 1.0.

... but I'm not sure that running different code with --jars and --packages
is a good idea, even if the differences are only in references to shaded
classes.

If one works without shading, why can the other not work without shading?

Thanks,
Dmitri.

On Fri, Jun 20, 2025 at 12:52 PM yun zou <yunzou.colost...@gmail.com> wrote:

> As for the following point:
> I believe that regardless of the method of including the Client into Spark
> runtime, the code has to be exactly the same.... and I doubt it is the same
> now. WDYT?
>
> With the change, the code included in the jar for the Spark Client is now
> different, because it uses a class in a different package, even though the
> two classes do the same thing. However, I think it is a good change: it
> simplifies our dependencies and avoids potential compatibility issues due
> to the shading of iceberg-spark-runtime. I definitely agree we should
> include this in 1.0 as well.
>
> Best Regards,
> Yun
>
> On Fri, Jun 20, 2025 at 9:47 AM yun zou <yunzou.colost...@gmail.com>
> wrote:
>
> >
> > *-- What is the maven artifact that Spark can automatically pull
> > (via --packages)*
> >
> > Our spark client pulls the following:
> >
> > org.apache.polaris#polaris-spark-3.5_2.12
> >
> > org.apache.polaris#polaris-core
> >
> > org.apache.polaris#polaris-api-management-model
> >
> > org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
> >
> >
> > Prior to the change, it also pulled iceberg-core and avro 1.20.0.
> >
> >
> > *-- Does that artifact use shaded dependencies*
> >
> > Any usage of classes from iceberg-spark-runtime uses the shaded libraries
> > shipped along with the artifacts.
> >
> >
> >
> > *-- Does that artifact depend on the Iceberg Spark bundle?*
> >
> > If you are referring to our spark client, it depends on
> > iceberg-spark-runtime, not other bundles.
> >
> >
> >
> > *-- Is the _code_ running in Spark the same when the Polaris Spark Client
> > is pulled via --packages and via --jars?*
> >
> >
> > Yes, the jar and the package will use the same code; the jar simply packs
> > everything for the user, so there is no need to download any other
> > dependency.
> >
> >
> > Best Regards,
> >
> > Yun
> >
> >
> >
> > On Fri, Jun 20, 2025 at 9:18 AM Dmitri Bourlatchkov <di...@apache.org>
> > wrote:
> >
> >> Some questions for clarification:
> >>
> >> * What is the maven artifact that Spark can automatically pull (via
> >> --packages)?
> >> * Does that artifact use shaded dependencies?
> >> * Does that artifact depend on the Iceberg Spark bundle?
> >> * Is the _code_ running in Spark the same when the Polaris Spark Client is
> >> pulled via --packages and via --jars?
> >>
> >> I know I could have figured that out from code, but I'm asking here because
> >> I think we may need to review our approach to publishing these artifacts.
> >>
> >> I believe that regardless of the method of including the Client into Spark
> >> runtime, the code has to be exactly the same.... and I doubt it is the same
> >> now. WDYT?
> >>
> >> Thanks,
> >> Dmitri.
> >>
> >>
> >> On Fri, Jun 20, 2025 at 10:15 AM Dmitri Bourlatchkov <di...@apache.org>
> >> wrote:
> >>
> >> > Hi All,
> >> >
> >> > Re: PR [1908] let's use this thread to clarify the problems we're trying
> >> > to solve and options for solutions.
> >> >
> >> > As for me, it looks like some refactoring in the way the Spark Client is
> >> > built and published may be needed.
> >> >
> >> > I think it makes sense to clarify this before 1.0 to avoid changes to
> >> > Maven coordinates right after 1.0.
> >> >
> >> > [1908] https://github.com/apache/polaris/pull/1908
> >> >
> >> > Thanks,
> >> > Dmitri.
> >> >
> >> >
> >>
> >
>
