[
https://issues.apache.org/jira/browse/SPARK-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hong Shen updated SPARK-5421:
-----------------------------
Description:
ExternalAppendOnlyMap is only used for Spark jobs whose aggregator isDefined, but
Spark SQL's ShuffledRDD doesn't define an aggregator, so Spark SQL won't spill at
shuffle, and it's very easy to hit an OOM at shuffle.
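
To illustrate the difference, here is a toy Python model. It is not Spark's code: the names SpillingCombiner and read_shuffle are hypothetical, "disk" is just a list of dicts, and it assumes the consumer of a non-aggregated shuffle buffers every row. It only shows that the spill logic is reachable when an aggregator is supplied and is skipped entirely when it is not.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Record = Tuple[str, int]


class SpillingCombiner:
    """Toy stand-in for a spillable map (hypothetical, not a Spark class)."""

    def __init__(self, merge: Callable[[int, int], int], max_keys_in_memory: int):
        self.merge = merge
        self.max_keys = max_keys_in_memory
        self.in_memory: Dict[str, int] = {}
        self.spills: List[Dict[str, int]] = []  # each dict stands in for a spill file
        self.peak_in_memory = 0

    def insert(self, key: str, value: int) -> None:
        if key in self.in_memory:
            self.in_memory[key] = self.merge(self.in_memory[key], value)
        else:
            # A new key would exceed the budget: flush the current map first.
            if len(self.in_memory) >= self.max_keys:
                self.spills.append(self.in_memory)
                self.in_memory = {}
            self.in_memory[key] = value
        self.peak_in_memory = max(self.peak_in_memory, len(self.in_memory))

    def result(self) -> Dict[str, int]:
        # A real implementation would stream-merge the spills; the toy merges eagerly.
        merged: Dict[str, int] = {}
        for part in self.spills + [self.in_memory]:
            for k, v in part.items():
                merged[k] = self.merge(merged[k], v) if k in merged else v
        return merged


def read_shuffle(
    records: Iterable[Record],
    aggregator: Optional[Callable[[int, int], int]] = None,
    max_keys_in_memory: int = 2,
):
    """Return (output, peak entries held in memory, number of spills)."""
    if aggregator is None:
        # No aggregator defined: the spillable map is never used, so in this
        # model every fetched record ends up buffered on the heap.
        buffered = list(records)
        return buffered, len(buffered), 0
    combiner = SpillingCombiner(aggregator, max_keys_in_memory)
    for key, value in records:
        combiner.insert(key, value)
    return sorted(combiner.result().items()), combiner.peak_in_memory, len(combiner.spills)
```

With an aggregator the in-memory footprint stays bounded by max_keys_in_memory; without one it grows with the input, which is the situation described above.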
Log from one of the executors; here is stderr:
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Don't have map outputs for shuffle 1, fetching them
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Doing the fetch; tracker actor = Actor[akka.tcp://[email protected]:40952/user/MapOutputTracker#1435377484]
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Got the output locations
15/01/27 07:02:19 INFO storage.ShuffleBlockFetcherIterator: Getting 143 non-empty blocks out of 143 blocks
15/01/27 07:02:19 INFO storage.ShuffleBlockFetcherIterator: Started 4 remote fetches in 72 ms
15/01/27 07:47:29 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
here is stdout:
2015-01-27T07:44:43.487+0800: [Full GC 3961343K->3959868K(3961344K), 29.8959290 secs]
2015-01-27T07:45:13.460+0800: [Full GC 3961343K->3959992K(3961344K), 27.9218150 secs]
2015-01-27T07:45:41.407+0800: [GC 3960347K(3961344K), 3.0457450 secs]
2015-01-27T07:45:52.950+0800: [Full GC 3961343K->3960113K(3961344K), 29.3894670 secs]
2015-01-27T07:46:22.393+0800: [Full GC 3961118K->3960240K(3961344K), 28.9879600 secs]
2015-01-27T07:46:51.393+0800: [Full GC 3960240K->3960213K(3961344K), 34.1530900 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill %p"
# Executing /bin/sh -c "kill 9050"...
2015-01-27T07:47:25.921+0800: [GC 3960214K(3961344K), 3.3959300 secs]
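
To make the stdout excerpt easier to read, here is a small Python sketch that sums up the Full GC lines. The helper name summarize_full_gc is hypothetical, and the regular expression assumes exactly the line format shown in the excerpt above; other GC log formats would need a different pattern.

```python
import re

# Full GC lines copied from the stdout excerpt, plus one plain GC line
# to show that the pattern skips it.
GC_LOG = """\
2015-01-27T07:44:43.487+0800: [Full GC 3961343K->3959868K(3961344K), 29.8959290 secs]
2015-01-27T07:45:13.460+0800: [Full GC 3961343K->3959992K(3961344K), 27.9218150 secs]
2015-01-27T07:45:41.407+0800: [GC 3960347K(3961344K), 3.0457450 secs]
2015-01-27T07:45:52.950+0800: [Full GC 3961343K->3960113K(3961344K), 29.3894670 secs]
2015-01-27T07:46:22.393+0800: [Full GC 3961118K->3960240K(3961344K), 28.9879600 secs]
2015-01-27T07:46:51.393+0800: [Full GC 3960240K->3960213K(3961344K), 34.1530900 secs]
"""

# before K -> after K (heap capacity K), pause in seconds
FULL_GC = re.compile(r"\[Full GC (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")


def summarize_full_gc(log: str) -> dict:
    """Aggregate Full GC events: count, KB reclaimed, pause time, heap occupancy."""
    events = [
        (int(before), int(after), int(capacity), float(secs))
        for before, after, capacity, secs in FULL_GC.findall(log)
    ]
    return {
        "count": len(events),
        "reclaimed_kb": sum(before - after for before, after, _, _ in events),
        "pause_secs": sum(secs for _, _, _, secs in events),
        # Lowest fraction of the heap still in use right after a Full GC.
        "min_occupancy_after": min(after / capacity for _, after, capacity, _ in events),
    }


if __name__ == "__main__":
    print(summarize_full_gc(GC_LOG))
```

On these five lines the collector spends roughly 150 seconds in Full GC and reclaims under 5 MB in total, with the heap never dropping below 99.9% occupancy, which is consistent with the OutOfMemoryError that follows.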
was: ExternalAppendOnlyMap is only used for Spark jobs whose aggregator
isDefined, but Spark SQL's ShuffledRDD doesn't define an aggregator, so Spark SQL
won't spill at shuffle, and it's very easy to hit an OOM at shuffle.
> SparkSql throw OOM at shuffle
> -----------------------------
>
> Key: SPARK-5421
> URL: https://issues.apache.org/jira/browse/SPARK-5421
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.2.0
> Reporter: Hong Shen
>
> ExternalAppendOnlyMap is only used for Spark jobs whose aggregator isDefined,
> but Spark SQL's ShuffledRDD doesn't define an aggregator, so Spark SQL won't
> spill at shuffle, and it's very easy to hit an OOM at shuffle.