[
https://issues.apache.org/jira/browse/SPARK-18404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-18404.
----------------------------------
Resolution: Incomplete
> RPC call from executor to driver blocks when getting map output locations
> (Netty Only)
> --------------------------------------------------------------------------------------
>
> Key: SPARK-18404
> URL: https://issues.apache.org/jira/browse/SPARK-18404
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.6.0
> Reporter: Jeffrey Shmain
> Priority: Major
> Labels: bulk-closed
>
> Compared an identical application run on Spark 1.5 and Spark 1.6 and noticed
> that jobs became slower. On closer inspection, 75% of tasks finished in the
> same time or better, while 25% had significant delays (unrelated to data
> skew or GC).
> After more debugging, noticed that the executors block for a few seconds
> (sometimes 25 seconds) on this call:
> https://github.com/apache/spark/blob/39e2bad6a866d27c3ca594d15e574a1da3ee84cc/core/src/main/scala/org/apache/spark/MapOutputTracker.scala#L199
> logInfo("Doing the fetch; tracker endpoint = " + trackerEndpoint)
> // This try-finally prevents hangs due to timeouts:
> try {
>   val fetchedBytes = askTracker[Array[Byte]](GetMapOutputStatuses(shuffleId))
>   fetchedStatuses = MapOutputTracker.deserializeMapStatuses(fetchedBytes)
>   logInfo("Got the output locations")
> So the regression seems to be related to changing the default RPC
> implementation from Akka to Netty.
> This was an application working with RDDs, submitting 10 concurrent queries
> at a time.
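One way to quantify the stall described above is to wrap the blocking call in a small timing helper and log the elapsed time per fetch. The following is only a minimal sketch under stated assumptions: `FetchTimingDemo` and `timed` are hypothetical names that are not part of Spark, and the `Thread.sleep` merely stands in for the `askTracker` RPC, which cannot be called outside of Spark itself.

```scala
object FetchTimingDemo {
  /** Runs `body` and returns its result with the elapsed wall-clock time in ms. */
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the blocking RPC (assumption: in Spark this would be the
    // askTracker call quoted above; here it is simulated with a sleep).
    val (bytes, ms) = timed {
      Thread.sleep(50)
      Array[Byte](1, 2, 3)
    }
    println(s"Fetched ${bytes.length} bytes in $ms ms")
  }
}
```

Logging the elapsed time for each fetch on both Spark versions would show whether the delay is concentrated in this call or spread across the task.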