[
https://issues.apache.org/jira/browse/SPARK-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740816#comment-14740816
]
Daniel Darabos commented on SPARK-5077:
---------------------------------------
Hi Josh,
The MapOutputTracker errors are a source of pain for us. Today we hit the 100
MB frame size with a 30,000-partition stage. The natural fix is to raise the
frame size setting to 1 GB, but this got us thinking about what problems a
larger value could cause.
My reading of the code is that raising the limit only affects messages that
are larger than the current frame size. That is, it will not cause smaller
messages to suddenly start using more memory, for example by allocating a 1 GB
buffer for each message. It would be reassuring if you could confirm that. It
might also be a good addition to the documentation: as it stands, it's not
obvious why this setting should not simply be set to infinity. Thanks!
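
For reference, a rough sketch of the arithmetic behind the limit. It assumes
the setting is given in MB and converted to bytes as value * 1024 * 1024, and
that a message is rejected only when its serialized size exceeds that number.
The helper names are made up for illustration and are not Spark APIs.

```python
MIB = 1024 * 1024  # assumption: one "MB" of frame size means 2**20 bytes


def frame_limit_bytes(frame_size_mb: int) -> int:
    """Convert a frame size setting in MB to a byte limit (hypothetical helper)."""
    return frame_size_mb * MIB


def fits(payload_bytes: int, frame_size_mb: int) -> bool:
    """True if a serialized message of this size is within the limit."""
    return payload_bytes <= frame_limit_bytes(frame_size_mb)


if __name__ == "__main__":
    reported = 11141547  # size from the error message quoted in the issue below
    for setting_mb in (10, 100, 1024):
        print(setting_mb, frame_limit_bytes(setting_mb), fits(reported, setting_mb))
```

Under these assumptions the 11141547-byte message from the issue description
is just over a 10 MB limit (10485760 bytes) and well under 100 MB or 1 GB.
Nothing here says anything about buffer allocation, which is the part that
still needs confirming.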
> Map output statuses can still exceed spark.akka.frameSize
> ---------------------------------------------------------
>
> Key: SPARK-5077
> URL: https://issues.apache.org/jira/browse/SPARK-5077
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 1.2.0, 1.3.0, 1.4.1
> Reporter: Josh Rosen
>
> Since HighlyCompressedMapOutputStatuses uses a bitmap for tracking empty
> blocks, its size is not bounded and thus Spark is still susceptible to
> "MapOutputTrackerMasterActor: Map output statuses were 11141547 bytes which
> exceeds spark.akka.frameSize"-type errors, even in 1.2.0.
> We needed to use a bitmap for tracking zero-sized blocks (see SPARK-3740;
> this isn't just a performance issue; it's necessary for correctness). This
> will require a bit more effort to fix, since we'll either have to find a way
> to use a fixed size / capped size encoding for MapOutputStatuses (which might
> require changes to let us fetch empty blocks safely) or figure out some other
> strategy for shipping these statuses.
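
To illustrate why a per-block bitmap is not bounded, here is a crude
upper-bound sketch. It assumes one bit per (map task, reduce partition) pair
with no compression at all, and it assumes the 30,000-partition stage has
30,000 tasks on both the map and the reduce side. Neither assumption comes
from the issue; the real encoding is compressed and should be much smaller
when few blocks are empty. The function name is made up for illustration.

```python
def dense_bitmap_bytes(num_map_tasks: int, num_reduce_partitions: int) -> int:
    """Bytes needed for one uncompressed bit per (map, reduce) block.

    Hypothetical worst-case estimate, not how Spark actually serializes
    map output statuses.
    """
    bits = num_map_tasks * num_reduce_partitions
    return (bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    size = dense_bitmap_bytes(30_000, 30_000)
    print(size, "bytes, about", round(size / (1024 * 1024), 1), "MiB")
```

The point is only the growth rate: the estimate scales with the product of
the two task counts, so doubling both sides of the shuffle quadruples it, and
any fixed frame size is eventually exceeded.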