kazuyukitanimura opened a new pull request #33721:
URL: https://github.com/apache/spark/pull/33721
### What changes were proposed in this pull request?
The current `MapOutputTracker` class may throw `NegativeArraySizeException`
when the number of partitions is large. The `serializeOutputStatuses()` method
compresses an array of `mapStatuses` and writes the binary data into an
(Apache) `ByteArrayOutputStream`. Inside the (Apache)
`ByteArrayOutputStream.toByteArray()`, the exception is thrown because the
index is an `int` and overflows (the 2GB limit) when the binary output is too
large.
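As an illustration of the overflow itself (not the actual Spark or Apache code), the sketch below shows how a byte count just past `Int.MaxValue` becomes negative once it is held in an `Int`, and how allocating an array with such a size fails. The object and helper names are hypothetical and exist only for this example.

```scala
import scala.util.{Failure, Try}

// Illustration only: names here are hypothetical, not part of Spark.
object IntOverflowDemo {
  // The size an Int-indexed buffer would end up with for a given byte count.
  def asIntSize(byteCount: Long): Int = byteCount.toInt

  // True if allocating a byte array of this size throws NegativeArraySizeException.
  def allocationFails(size: Int): Boolean =
    Try(new Array[Byte](size)) match {
      case Failure(_: NegativeArraySizeException) => true
      case _ => false
    }

  def main(args: Array[String]): Unit = {
    val twoGiB: Long = 1L << 31 // one byte more than Int.MaxValue
    val wrapped = asIntSize(twoGiB)
    println(wrapped < 0) // the Int wrapped around to a negative value
    println(allocationFails(-1)) // a negative size cannot be allocated
  }
}
```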
This PR proposes two high-level ideas:
1. Use `org.apache.spark.util.io.ChunkedByteBufferOutputStream`, which has
a way to output the underlying buffer as `Array[Array[Byte]]`.
2. Change the signatures from `Array[Byte]` to `Array[Array[Byte]]` in
order to handle over 2GB compressed data.
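To make the chunking idea concrete, here is a minimal sketch of an output stream that stores data in fixed-size chunks and exposes it as `Array[Array[Byte]]`. It is a simplified, hypothetical stand-in written for this description, not the implementation of `ChunkedByteBufferOutputStream`; the class name `ChunkedOutput` and its methods are assumptions.

```scala
import java.io.OutputStream
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified chunked stream; NOT Spark's ChunkedByteBufferOutputStream.
final class ChunkedOutput(chunkSize: Int) extends OutputStream {
  require(chunkSize > 0, "chunkSize must be positive")

  private val chunks = ArrayBuffer.empty[Array[Byte]]
  // Starting at chunkSize forces allocation of the first chunk on the first write.
  private var posInLast = chunkSize
  // Total bytes written, kept as a Long so it can exceed 2GB.
  private var total = 0L

  override def write(b: Int): Unit = {
    if (posInLast == chunkSize) {
      chunks += new Array[Byte](chunkSize)
      posInLast = 0
    }
    chunks.last(posInLast) = b.toByte
    posInLast += 1
    total += 1
  }

  def size: Long = total

  // Returns the written data as chunks; the last chunk is trimmed to its used length.
  def toArrays: Array[Array[Byte]] =
    if (chunks.isEmpty) Array.empty[Array[Byte]]
    else {
      val out = chunks.toArray
      out(out.length - 1) = java.util.Arrays.copyOf(chunks.last, posInLast)
      out
    }
}

object ChunkedOutputDemo {
  def main(args: Array[String]): Unit = {
    val out = new ChunkedOutput(chunkSize = 4)
    out.write(Array.tabulate[Byte](10)(_.toByte))
    println(out.toArrays.map(_.length).mkString(",")) // prints 4,4,2
  }
}
```

Because no single array has to hold the whole payload, the total size is bounded by the number of chunks rather than by the `Int` index of one array.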
### Why are the changes needed?
This issue seems to have been missed in the earlier effort to address 2GB
limitations, [SPARK-6235](https://issues.apache.org/jira/browse/SPARK-6235).
Without this fix, `spark.default.parallelism` needs to be kept low. The
drawback of a smaller `spark.default.parallelism` is that it requires more
executor memory (more data per partition). Setting
`spark.io.compression.zstd.level` to a higher number (default 1) hardly helps.
That essentially means there is a limit on the data size for shuffling, and it
does not scale.
### Does this PR introduce _any_ user-facing change?
No. Optionally users can tune the chunk size through the
`spark.shuffle.mapStatus.output.chunkSize` config.
### How was this patch tested?
Passed existing tests
```
build/sbt "core/testOnly org.apache.spark.MapOutputTrackerSuite"
```
Also added a new unit test
```
build/sbt "core/testOnly org.apache.spark.MapOutputTrackerSuite -- -z SPARK-32210"
```
Ran the benchmark using GitHub Actions and did not observe any
performance penalties. The results are attached in this PR:
```
core/benchmarks/MapStatusesSerDeserBenchmark-jdk11-results.txt
core/benchmarks/MapStatusesSerDeserBenchmark-results.txt
```