Greetings fellow practitioners of the art,

We believe we are ready to call a vote on the SPIP *Faster queries in local
laptop mode for Apache Spark*.

Motivation:
We want to enhance Spark's usability and interactivity for small-data
queries, specifically on laptops. This can make Spark more useful for
individual users and for beginners building prototypes.

Proposal:
This SPIP covers three categories of performance improvements:
optimization improvements for single-file scans, an Arrow-based df.cache
reimplementation, and shuffle-free local execution for small queries. The
community has also suggested a couple of other ideas in a similar spirit on
the document, and a couple of members have volunteered to help with the
implementation.

SPIP Document:
https://docs.google.com/document/d/1Nphejrf_vh4YRECn0JPgKClqxDS_lB6wufZFJQxyY98/edit?tab=t.0#heading=h.hj76akdx5ul

The vote will be open for at least 72 hours and passes if a majority of the
PMC votes cast are +1, with a minimum of 3 +1 votes.
Please vote:
[ ] +1: Accept the proposal as an official SPIP
[ ] +0
[ ] -1: I don't think this is a good idea because ...

Best,
Daniel Tenedorio and Liang-Chi Hsieh
