Hi,

Interesting question. Off the top of my head, I'm not 100% sure SparkTable is expected to support a concurrent schema evolution case (i.e., that Spark can handle a relation changing its underlying schema during query analysis). The refresh code seems intended more to handle concurrent writes, i.e., to pick up the latest snapshot/data for the scan.
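To make the timing concrete, here is a rough sketch of the pattern you're describing. It is not the actual SparkTable source; the class and field names are made up for illustration, but the shape is the same: the Spark-facing schema is converted once and cached, while a later scan may refresh the underlying table and see newer metadata.

import org.apache.iceberg.Table;
import org.apache.iceberg.spark.SparkSchemaUtil;
import org.apache.spark.sql.types.StructType;

class RefreshTimingSketch {
  private final Table icebergTable;
  private final boolean refreshEagerly;
  private StructType lazySparkSchema;  // cached from the Iceberg schema seen at first access

  RefreshTimingSketch(Table icebergTable, boolean refreshEagerly) {
    this.icebergTable = icebergTable;
    this.refreshEagerly = refreshEagerly;
  }

  // Analogue of SparkTable.schema(): convert the Iceberg schema to a Spark StructType
  // and cache it for the lifetime of this object.
  StructType schema() {
    if (lazySparkSchema == null) {
      lazySparkSchema = SparkSchemaUtil.convert(icebergTable.schema());
    }
    return lazySparkSchema;
  }

  // Analogue of newScanBuilder(): with refreshEagerly=true the underlying table is
  // refreshed here, so icebergTable.schema() may now differ from the cached
  // StructType above if the table's schema evolved in between.
  void beforeBuildingScan() {
    if (refreshEagerly) {
      icebergTable.refresh();
    }
    // ... a ScanBuilder would be created here against the (possibly newer) table state
  }
}

So yes, in principle the refreshed table can carry a newer schema than the one the relation was analyzed with; as far as I can tell that race is about picking up fresh data rather than supporting mid-query schema evolution.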
Thanks,
Szehon

On Fri, Aug 8, 2025 at 8:32 AM Ma, Limin <l...@akamai.com.invalid> wrote:
> Hi all,
>
> I have an inquiry about the proper use of SparkTable's refreshEagerly.
>
> When SparkTable is instantiated, it derives the Spark schema from the wrapped
> icebergTable object's current snapshot. If refreshEagerly=true, then by the time
> newScanBuilder is called, icebergTable.refresh is invoked. If the Iceberg table
> schema has evolved in between, won't a newer schema be passed to SparkScanBuilder
> to build the scan? Won't this cause a potential schema mismatch?
>
> Any clarifications/suggestions?
>
> Thanks,
> Limin