As for the following point:

> I believe that regardless of the method of including the Client into Spark
> runtime, the code has to be exactly the same.... and I doubt it is the same
> now. WDYT?

The code included in the jar for the Spark client is indeed different after
the change, because it now uses a class from a different package, even though
the two classes do the same thing. However, I think it is a good change: it
simplifies our dependencies and avoids potential compatibility issues caused
by the shading of iceberg-spark-runtime. I definitely agree we should also
include this in 1.0.
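
For illustration, here is a minimal Scala sketch of what the relocation means
for code compiled against iceberg-spark-runtime. It assumes the commonly
observed Iceberg relocation prefix (org.apache.iceberg.shaded); the class
below is a bundled Jackson class used purely as an example, not the actual
Polaris class in question:

    // Hedged example: iceberg-spark-runtime relocates its bundled
    // third-party dependencies, so consumers must import the relocated
    // class names. The relocation prefix is an assumption here, not taken
    // from the Polaris build.
    import org.apache.iceberg.shaded.com.fasterxml.jackson.databind.ObjectMapper

    object ShadedImportSketch {
      def main(args: Array[String]): Unit = {
        // Same Jackson code as the unshaded artifact, different package name.
        val mapper = new ObjectMapper()
        println(mapper.writeValueAsString(
          java.util.Collections.singletonMap("catalog", "polaris")))
      }
    }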

Best Regards,
Yun

On Fri, Jun 20, 2025 at 9:47 AM yun zou <yunzou.colost...@gmail.com> wrote:

>
> *-- What is the maven artifact that Spark can automatically pull (via
> --packages)?*
>
> Our Spark client pulls the following:
>
>   org.apache.polaris#polaris-spark-3.5_2.12
>   org.apache.polaris#polaris-core
>   org.apache.polaris#polaris-api-management-model
>   org.apache.iceberg#iceberg-spark-runtime-3.5_2.12
>
> Prior to the change, it also pulled iceberg-core and avro 1.20.0.
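>
> As a hedged sketch (not from the Polaris docs) of what that pull looks like
> from application code, with a placeholder version string:
>
>     // Minimal Scala sketch: resolving the Polaris Spark client the same
>     // way --packages does, via spark.jars.packages. The version is a
>     // placeholder, not a published coordinate. Transitive artifacts
>     // (polaris-core, iceberg-spark-runtime, ...) come from its POM.
>     import org.apache.spark.sql.SparkSession
>
>     object PolarisPackagesSketch {
>       def main(args: Array[String]): Unit = {
>         val spark = SparkSession.builder()
>           .appName("polaris-packages-sketch")
>           .master("local[*]")
>           .config("spark.jars.packages",
>             "org.apache.polaris:polaris-spark-3.5_2.12:<version>")
>           .getOrCreate()
>         spark.stop()
>       }
>     }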
>
>
> *-- Does that artifact use shaded dependencies?*
>
> Any usage of classes from iceberg-spark-runtime goes through the shaded
> libraries shipped inside that artifact.
>
> *-- Does that artifact depend on the Iceberg Spark bundle?*
>
> If you are referring to our Spark client, it depends on
> iceberg-spark-runtime, not on any other bundle.
>
> *-- Is the _code_ running in Spark the same when the Polaris Spark Client
> is pulled via --packages and via --jars?*
>
> Yes, the jar and the package use the same code; the jar simply packs
> everything for the user, so there is no need to download any other
> dependencies.
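>
> For comparison, a hedged sketch of the --jars route; the jar path below is
> a placeholder for a locally built bundle, not an actual artifact name:
>
>     // Same application code as the --packages case; only the distribution
>     // mechanism differs: spark.jars points at a pre-packed bundle jar.
>     import org.apache.spark.sql.SparkSession
>
>     object PolarisJarsSketch {
>       def main(args: Array[String]): Unit = {
>         val spark = SparkSession.builder()
>           .appName("polaris-jars-sketch")
>           .master("local[*]")
>           .config("spark.jars", "/path/to/polaris-spark-bundle.jar") // placeholder
>           .getOrCreate()
>         spark.stop()
>       }
>     }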
>
>
> Best Regards,
>
> Yun
>
>
>
> On Fri, Jun 20, 2025 at 9:18 AM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
>> Some questions for clarification:
>>
>> * What is the maven artifact that Spark can automatically pull (via
>> --packages)?
>> * Does that artifact use shaded dependencies?
>> * Does that artifact depend on the Iceberg Spark bundle?
>> * Is the _code_ running in Spark the same when the Polaris Spark Client is
>> pulled via --packages and via --jars?
>>
>> I know I could have figured that out from code, but I'm asking here
>> because
>> I think we may need to review our approach to publishing these artifacts.
>>
>> I believe that regardless of the method of including the Client into Spark
>> runtime, the code has to be exactly the same.... and I doubt it is the
>> same
>> now. WDYT?
>>
>> Thanks,
>> Dmitri.
>>
>>
>> On Fri, Jun 20, 2025 at 10:15 AM Dmitri Bourlatchkov <di...@apache.org>
>> wrote:
>>
>> > Hi All,
>> >
>> > Re: PR [1908], let's use this thread to clarify the problems we're
>> > trying to solve and the options for solutions.
>> >
>> > As for me, it looks like some refactoring in the way the Spark Client is
>> > built and published may be needed.
>> >
>> > I think it makes sense to clarify this before 1.0, to avoid changing the
>> > Maven coordinates right after 1.0.
>> >
>> > [1908] https://github.com/apache/polaris/pull/1908
>> >
>> > Thanks,
>> > Dmitri.
>> >
>> >
>>
>
