Re: [DISCUSS] SPIP: Faster queries in local laptop mode for Apache Spark

Felix Cheung Thu, 07 May 2026 21:23:35 -0700

Interesting

________________________________
From: Cheng Pan <[email protected]>
Sent: Wednesday, May 6, 2026 8:01 PM
To: [email protected] <[email protected]>
Subject: Re: [DISCUSS] SPIP: Faster queries in local laptop mode for Apache 
Spark


+1. I left a comment in the doc about the Hadoop client improvement, 
which should also benefit running Spark on a laptop.

Thanks,
Cheng Pan



On May 6, 2026, at 15:01, John Zhuge <[email protected]> wrote:

+1 worthwhile to lower Spark small-data overhead

On Mon, May 4, 2026 at 11:47 PM Ángel Álvarez Pascua 
<[email protected]> wrote:
Love it. Please count on me if any help is needed.

On Tue, May 5, 2026, 7:31, DB Tsai <[email protected]> wrote:
Thanks Daniel and Liang-Chi for driving this. This is an exciting proposal that 
can significantly speed up local experimentation and development on laptops. It 
also helps make Spark a great fit for both big-data workloads and small-data 
exploratory workflows.

DB Tsai  |  https://www.dbtsai.com/  |  PGP 0x9FB9FAA3

On Monday, May 4th, 2026 at 3:39 PM, Daniel Tenedorio 
<[email protected]> wrote:

Hi Spark community,

We’d like to propose a new SPIP to improve the experience of running Apache 
Spark on laptops.

SPIP doc:

https://docs.google.com/document/d/1Nphejrf_vh4YRECn0JPgKClqxDS_lB6wufZFJQxyY98/edit?tab=t.0#heading=h.hj76akdx5ul

Summary:

Spark’s execution model is optimized for distributed workloads, but this 
introduces noticeable overhead for small datasets (e.g., <100MB), where even 
simple queries can take multiple seconds. This makes Spark less suitable for 
interactive and exploratory use cases on laptops, and often pushes users toward 
alternative single-node tools.

This proposal aims to reduce that overhead in local mode, improving latency for 
small queries and making Spark more usable as an entry point for new users and 
iterative workflows.

We’d appreciate your review and feedback.

Thanks,
Daniel Tenedorio and Liang-Chi Hsieh



--
John Zhuge
