Re: SPIP - Asynchronous Metadata Resolution & Lazy Prefetching for Spark Connect

vaquar khan Wed, 07 Jan 2026 07:47:36 -0800

Hi Herman,

I have enabled the comments and appreciate your feedback.


Regards,
Vaquar khan

On Wed, 7 Jan 2026 at 07:53, Herman van Hovell via dev <[email protected]>
wrote:

> Hi Vaquar,
>
> Can you enable comments on the doc?
>
> In general I am not against making improvements in this area. However the
> devil is very much in the details here.
>
> Cheers,
> Herman
>
> On Mon, Dec 29, 2025 at 1:15 PM vaquar khan <[email protected]> wrote:
>
>> Hi everyone,
>>
>> I’ve been following the rapid maturation of *Spark Connect* in the 4.x
>> release line and have been identifying areas where remote execution can
>> reach parity with Spark Classic.
>>
>> While the remote execution model elegantly decouples the client from the
>> JVM, I am concerned about a performance regression in interactive and
>> high-complexity workloads.
>>
>> Specifically, the current implementation of *Eager Analysis* (df.columns,
>> df.schema, etc.) relies on synchronous gRPC round-trips that block the
>> client thread. In environments with high network latency, these blocking
>> calls create a "Death by 1000 RPCs" bottleneck, often forcing developers to
>> write suboptimal, "Connect-specific" code to avoid metadata requests.
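
The pattern described above can be illustrated with a toy model. This is only a sketch: `FakeRemoteFrame`, `with_column`, and both helper functions are hypothetical stand-ins written for this example, not pyspark classes, and each `columns` access merely increments a counter where a real client would make a blocking round trip.

```python
class FakeRemoteFrame:
    """Stand-in for a Spark Connect DataFrame; NOT the real pyspark class."""

    rpc_count = 0  # total simulated analysis round trips

    def __init__(self, cols):
        self._cols = list(cols)

    @property
    def columns(self):
        # Each access models one blocking gRPC round trip to the server.
        FakeRemoteFrame.rpc_count += 1
        return list(self._cols)

    def with_column(self, name):
        # Building the plan is lazy and client-side only: no round trip here.
        return FakeRemoteFrame(self._cols + [name])


def add_missing_eager(df, names):
    # Checks df.columns on every iteration: one round trip per name.
    for name in names:
        if name not in df.columns:
            df = df.with_column(name)
    return df


def add_missing_tracked(df, names):
    # "Connect-specific" workaround: fetch columns once, then track locally.
    known = set(df.columns)
    for name in names:
        if name not in known:
            df = df.with_column(name)
            known.add(name)
    return df


names = [f"f{i}" for i in range(100)]

FakeRemoteFrame.rpc_count = 0
eager = add_missing_eager(FakeRemoteFrame(["id"]), names)
eager_rpcs = FakeRemoteFrame.rpc_count

FakeRemoteFrame.rpc_count = 0
tracked = add_missing_tracked(FakeRemoteFrame(["id"]), names)
tracked_rpcs = FakeRemoteFrame.rpc_count

print(eager_rpcs, tracked_rpcs)  # 100 1
```

Both functions build the same plan; only the number of simulated round trips differs, which is the cost that grows with network latency.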
>>
>> *Proposal*:
>>
>> I propose we introduce a Client-Side Metadata Skip-Layer (Lazy
>> Prefetching) within the Spark Connect protocol. Key pillars include:
>>
>>    1. *Plan-Piggybacking:* Allowing the *SparkConnectService* to return
>>       resolved schemas of relations during standard plan execution.
>>    2. *Local Schema Cache:* A configurable client-side cache in the
>>       *SparkSession* to store resolved schemas.
>>    3. *Batched Analysis API:* An extension to the *AnalyzePlan* protocol to
>>       allow schema resolution for multiple DataFrames in a single batch call.
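
As a rough illustration of how the three pillars could fit together on the client, here is a minimal sketch. Everything in it is an assumption made for the example: `SchemaCache`, `put`, `resolve`, the string plan IDs, and `fake_analyze_batch` are hypothetical names, a schema is reduced to a list of column names, and the server is replaced by a function that records each request. It does not reflect the SPIP's actual design or any existing Spark Connect API, and it ignores cache invalidation, which a real design would have to address.

```python
from typing import Callable, Dict, List

Schema = List[str]  # simplified: a schema is just a list of column names


class SchemaCache:
    """Hypothetical client-side schema cache keyed by a plan identifier."""

    def __init__(self, analyze_batch: Callable[[List[str]], Dict[str, Schema]]):
        self._analyze_batch = analyze_batch  # one call models one round trip
        self._schemas: Dict[str, Schema] = {}

    def put(self, plan_id: str, schema: Schema) -> None:
        # Plan-piggybacking: keep a schema that arrived with an execution response.
        self._schemas[plan_id] = schema

    def resolve(self, plan_ids: List[str]) -> Dict[str, Schema]:
        # Batched analysis: send only the cache misses, in a single request.
        missing = [p for p in dict.fromkeys(plan_ids) if p not in self._schemas]
        if missing:
            self._schemas.update(self._analyze_batch(missing))
        return {p: self._schemas[p] for p in plan_ids}


requests: List[List[str]] = []  # every entry is one simulated round trip


def fake_analyze_batch(plan_ids: List[str]) -> Dict[str, Schema]:
    # Stand-in for the server side; records the request instead of using gRPC.
    requests.append(list(plan_ids))
    return {p: [f"{p}_col"] for p in plan_ids}


cache = SchemaCache(fake_analyze_batch)
cache.put("plan-a", ["id", "name"])                    # piggybacked schema
first = cache.resolve(["plan-a", "plan-b", "plan-c"])  # one batched round trip
second = cache.resolve(["plan-b", "plan-c"])           # served from the cache

print(len(requests))  # 1
```

In this toy setup, five schema lookups across two calls cost a single simulated round trip: one schema was piggybacked, two were fetched in one batch, and the rest were cache hits.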
>>
>> This shift would ensure that Spark Connect provides the same "fluid"
>> interactive experience as Spark Classic, removing the O(N) network
>> latency overhead (N being the number of metadata calls) for
>> metadata-heavy operations.
>>
>> I have drafted a full SPIP document, ready for review, which includes
>> the proposed changes for the *SparkConnectService* and *AnalyzePlan*
>> handlers.
>>
>> *SPIP Doc:*
>>
>>
>> https://docs.google.com/document/d/1xTvL5YWnHu1jfXvjlKk2KeSv8JJC08dsD7mdbjjo9YE/edit?usp=sharing
>>
>> Before I finalize the JIRA, has there been any recent internal discussion
>> regarding metadata prefetching or batching analysis requests in the current
>> Spark Connect roadmap?
>>
>>
>> Regards,
>> Vaquar Khan
>> https://www.linkedin.com/in/vaquar-khan-b695577/
>>
>

-- 
Regards,
Vaquar Khan
