[ 
https://issues.apache.org/jira/browse/SPARK-12831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-12831:
------------------------------
    Component/s: Spark Core

> akka.remote.OversizedPayloadException on DirectTaskResult
> ---------------------------------------------------------
>
>                 Key: SPARK-12831
>                 URL: https://issues.apache.org/jira/browse/SPARK-12831
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Brett Stime
>
> Getting the following error in my executor logs:
> ERROR akka.ErrorMonitor: Transient association error (association remains live)
> akka.remote.OversizedPayloadException: Discarding oversized payload sent to Actor[akka.tcp://sparkDriver@172.21.25.199:51562/user/CoarseGrainedScheduler#-2039547722]: max allowed size 134217728 bytes, actual size of encoded class org.apache.spark.rpc.akka.AkkaMessage was 134419636 bytes.
> Seems like the quick fix would be to make AkkaUtils.reservedSizeBytes a 
> little bigger--maybe proportional to spark.akka.frameSize and/or user 
> configurable.
> A more robust solution might be to catch OversizedPayloadException and retry 
> using the BlockManager.
> I should also mention that this stalls the entire job (my use case also 
> requires fairly liberal timeouts). For now, I'll see if setting 
> spark.akka.frameSize a little smaller leaves proportionally more headroom 
> for the message overhead.
> Thanks.
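
The sizes in the log and the proposed quick fix can be checked with a little arithmetic. What follows is only a sketch in Python, not Spark code: the two byte counts are copied from the log line above, while the 200 KiB fixed reserve, the 1% default, and the helper names (overshoot, proportional_reserve, should_send_directly) are assumptions made up for illustration. Check AkkaUtils in your Spark version for the real reserve value.

```python
# Byte counts copied from the OversizedPayloadException message in the report.
MAX_FRAME_BYTES = 134217728        # "max allowed size" (128 * 1024 * 1024)
ENCODED_MESSAGE_BYTES = 134419636  # "actual size of encoded class ... AkkaMessage"

# ASSUMPTION: a fixed reserve of 200 KiB, used here purely for illustration.
ASSUMED_FIXED_RESERVE_BYTES = 200 * 1024


def overshoot(encoded_bytes, max_frame_bytes):
    """Bytes by which the encoded message exceeds the frame limit."""
    return encoded_bytes - max_frame_bytes


def proportional_reserve(max_frame_bytes, percent=1,
                         floor_bytes=ASSUMED_FIXED_RESERVE_BYTES):
    """Hypothetical reserve that scales with the frame size.

    Takes `percent` of the frame size (integer arithmetic), but never
    drops below the fixed floor.
    """
    return max(floor_bytes, max_frame_bytes * percent // 100)


def should_send_directly(result_bytes, max_frame_bytes, reserve_bytes):
    """True if a result leaves at least `reserve_bytes` of headroom.

    A result that fails this check would be handed off another way
    (the report suggests the BlockManager) instead of being sent inline.
    """
    return result_bytes < max_frame_bytes - reserve_bytes


if __name__ == "__main__":
    print("overshoot:", overshoot(ENCODED_MESSAGE_BYTES, MAX_FRAME_BYTES))
    print("1% reserve:", proportional_reserve(MAX_FRAME_BYTES))
```

With the numbers from the log, the message is 201908 bytes over the 128 MiB limit. A reserve of 1% of the frame size would be about 1.3 MB at this setting, which is the kind of proportional headroom the report asks for. Whether 1% is a sensible figure is a guess, not a measured value.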



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

