Hi Ángel,

Yeah, the ChecksumCheckpointFileManager creates background threads to
parallelize checksum file uploads for tasks that use the state store.
This only applies to stateful queries (i.e., those using a state store).
You can turn it off by setting
`spark.sql.streaming.checkpoint.fileChecksum.enabled` to `false` and see
how your experiment behaves.
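
For example, in Scala you could disable it before starting the query
(the app name and builder details here are just illustrative):

    import org.apache.spark.sql.SparkSession

    // Turn off checkpoint file checksums so the background
    // uploader threads are never created.
    val spark = SparkSession.builder()
      .appName("checksum-off-test")   // illustrative name
      .config("spark.sql.streaming.checkpoint.fileChecksum.enabled", "false")
      .getOrCreate()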

Also, for stateful queries, Spark creates one state store instance per
partition. If you're using the default store provider, each instance is
an in-memory map. As you increase the number of partitions, the number
of store instances grows as well, and so does resource usage on the
node.
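
As a rough sketch (the rate source and grouping key are just for
illustration), any keyed streaming aggregation like this becomes
stateful, so every shuffle partition gets its own store instance:

    import spark.implicits._

    // Stateful aggregation: one state store instance per shuffle
    // partition, i.e. spark.sql.shuffle.partitions instances in total.
    val counts = spark.readStream
      .format("rate")            // built-in test source
      .load()
      .groupBy($"value" % 10)    // keyed aggregation keeps state
      .count()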

In reality, you typically won't run that many partitions on a single
node; you'd set it based on the number of cores you have available. Or
do you have a specific use case that requires 1000 state partitions on a
single node?
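
One way to size it (using defaultParallelism as a stand-in for your
core count; note that for a streaming query this has to be set before
the first run, since the value is baked into the checkpoint):

    // Illustrative: match shuffle partitions to the cluster's core
    // count instead of the 200 default.
    spark.conf.set(
      "spark.sql.shuffle.partitions",
      spark.sparkContext.defaultParallelism.toString)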

But yeah, a lower default might be useful.

On Mon, Feb 2, 2026 at 10:10 PM Ángel Álvarez Pascua <
[email protected]> wrote:

> Hi,
>
> While running some tests for an article last weekend, I came across the
> new ChecksumCheckpointFileManager feature in Spark 4.1.
>
> This feature spawns *4 threads per output partition* in streaming jobs.
> Since streaming jobs also use the *default 200 shuffle partitions*, this
> could pose a significant resource risk. In fact, I tested it by increasing
> spark.sql.shuffle.partitions to 1000 in a simple streaming job and ran
> into an *OOM error*—likely due to the creation of 4,000 extra threads (4
> threads × 1000 partitions).
>
> I’ve opened *SPARK-55311
> <https://issues.apache.org/jira/browse/SPARK-55311>* regarding this. My
> suggestion is that, similar to how *AQE is disabled in streaming jobs*,
> we might consider *defaulting to a lower spark.sql.shuffle.partitions
> value* (e.g., 25) for streaming workloads.
>
> I’d love to hear your thoughts on this.
>
> Regards,
> Ángel Álvarez
>
