Hi, "Why did you though you have enough memory for your task? You checked task statistics in your WebUI?". I mean that I have jnly about 5Gb data but spark.driver memory in 60Gb. I check task statistics in web UI. But really spark says that *"05-16 17:50:06.254 127.0.0.1:54321 <http://127.0.0.1:54321> 1534 #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:18.97 GB + FREE:8.79 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!Exception in thread "Heartbeat" java.lang.OutOfMemoryError: Java heap space"* But why spark doesn't split data into a disk?
On Mon, May 16, 2016 at 5:11 PM, Takeshi Yamamuro <linguin....@gmail.com> wrote: > Hi, > > Why did you though you have enough memory for your task? You checked task > statistics in your WebUI? > Anyway, If you get stuck with the GC issue, you'd better off increasing > the number of partitions. > > // maropu > > On Mon, May 16, 2016 at 10:00 PM, AlexModestov < > aleksandrmodes...@gmail.com> wrote: > >> I get the error in the apache spark... >> >> "spark.driver.memory 60g >> spark.python.worker.memory 60g >> spark.master local[*]" >> >> The amount of data is about 5Gb, but spark says that "GC overhead limit >> exceeded". I guess that my conf-file gives enought resources. >> >> "16/05/16 15:13:02 WARN NettyRpcEndpointRef: Error sending message >> [message >> = Heartbeat(driver,[Lscala.Tuple2;@87576f9,BlockManagerId(driver, >> localhost, >> 59407))] in 1 attempts >> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 >> seconds]. This timeout is controlled by spark.executor.heartbeatInterval >> at >> org.apache.spark.rpc.RpcTimeout.org >> $apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48) >> at >> >> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63) >> at >> >> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59) >> at >> >> scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33) >> at >> org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76) >> at >> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101) >> at >> org.apache.spark.executor.Executor.org >> $apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449) >> at >> >> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470) >> at >> >> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470) >> at >> >> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470) >> at >> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765) >> at >> org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470) >> at >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) >> at >> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) >> at >> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) >> at >> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) >> at >> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745) >> Caused by: java.util.concurrent.TimeoutException: Futures timed out after >> [10 seconds] >> at >> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219) >> at >> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) >> at >> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107) >> at >> >> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) >> at scala.concurrent.Await$.result(package.scala:107) >> at >> org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75) >> ... 14 more >> 16/05/16 15:13:02 WARN NettyRpcEnv: Ignored message: >> HeartbeatResponse(false) >> 05-16 15:13:26.398 127.0.0.1:54321 2059 #e Thread WARN: Swapping! 
>> GC CALLBACK, (K/V:29.74 GB + POJO:16.74 GB + FREE:11.03 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> 05-16 15:13:44.528 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:16.86 GB + FREE:10.90 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> 05-16 15:13:56.847 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:16.88 GB + FREE:10.88 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> 05-16 15:14:10.215 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:16.90 GB + FREE:10.86 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> 05-16 15:14:33.622 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:16.91 GB + FREE:10.85 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> 05-16 15:14:47.075 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> 05-16 15:15:10.555 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:16.92 GB + FREE:10.84 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> 05-16 15:15:25.520 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> 05-16 15:15:39.087 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == >> MEM_MAX:57.50 >> GB), desiredKV=7.19 GB OOM! >> Exception in thread "HashSessionScavenger-0" java.lang.OutOfMemoryError: >> GC >> overhead limit exceeded >> at >> >> java.util.concurrent.ConcurrentHashMap$ValuesView.iterator(ConcurrentHashMap.java:4683) >> at >> >> org.eclipse.jetty.server.session.HashSessionManager.scavenge(HashSessionManager.java:314) >> at >> >> org.eclipse.jetty.server.session.HashSessionManager$2.run(HashSessionManager.java:285) >> at java.util.TimerThread.mainLoop(Timer.java:555) >> at java.util.TimerThread.run(Timer.java:505) >> 16/05/16 15:22:26 ERROR Executor: Exception in task 0.0 in stage 10.0 (TID >> 107) >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> at java.lang.Double.valueOf(Double.java:519) >> at scala.runtime.BoxesRunTime.boxToDouble(BoxesRunTime.java:84) >> at >> >> org.apache.spark.sql.catalyst.expressions.MutableRow.setDouble(rows.scala:176) >> at >> >> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown >> Source) >> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) >> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) >> at >> scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) >> at >> org.spark-project.guava.collect.Ordering.leastOf(Ordering.java:665) >> at >> org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37) >> at >> >> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1391) >> at >> >> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1388) >> at >> >> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) >> at >> >> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) >> at >> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) >> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) >> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) >> at >> 
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) >> at org.apache.spark.scheduler.Task.run(Task.scala:89) >> at >> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) >> at >> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745) >> 05-16 15:22:26.947 127.0.0.1:54321 2059 #e Thread WARN: Unblock >> allocations; cache below desired, but also OOM: GC CALLBACK, (K/V:29.74 >> GB + >> POJO:16.93 GB + FREE:10.83 GB == MEM_MAX:57.50 GB), desiredKV=38.52 GB >> OOM! >> 05-16 15:22:26.948 127.0.0.1:54321 2059 #e Thread WARN: Swapping! >> GC CALLBACK, (K/V:29.74 GB + POJO:14.94 GB + FREE:12.83 GB == >> MEM_MAX:57.50 >> GB), desiredKV=8.65 GB OOM! >> 16/05/16 15:22:26 WARN HeartbeatReceiver: Removing executor driver with no >> recent heartbeats: 144662 ms exceeds timeout 120000 ms >> 16/05/16 15:22:26 ERROR ActorSystemImpl: exception on LARS’ timer thread >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> at >> akka.dispatch.AbstractNodeQueue.<init>(AbstractNodeQueue.java:22) >> at >> >> akka.actor.LightArrayRevolverScheduler$TaskQueue.<init>(Scheduler.scala:443) >> at >> >> akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409) >> at >> akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375) >> at java.lang.Thread.run(Thread.java:745) >> 16/05/16 15:22:26 INFO ActorSystemImpl: starting new LARS thread >> 16/05/16 15:22:26 ERROR TaskSchedulerImpl: Lost executor driver on >> localhost: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 3.0 in stage 10.0 (TID >> 110, >> localhost): ExecutorLostFailure (executor driver exited caused by one of >> the >> running tasks) Reason: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 ERROR TaskSetManager: Task 3 in stage 10.0 failed 1 >> times; >> aborting job >> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 6.0 in stage 10.0 (TID >> 113, >> localhost): ExecutorLostFailure (executor driver exited caused by one of >> the >> running tasks) Reason: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 0.0 in stage 10.0 (TID >> 107, >> localhost): ExecutorLostFailure (executor driver exited caused by one of >> the >> running tasks) Reason: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 2.0 in stage 10.0 (TID >> 109, >> localhost): ExecutorLostFailure (executor driver exited caused by one of >> the >> running tasks) Reason: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 5.0 in stage 10.0 (TID >> 112, >> localhost): ExecutorLostFailure (executor driver exited caused by one of >> the >> running tasks) Reason: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 7.0 in stage 10.0 (TID >> 114, >> localhost): ExecutorLostFailure (executor driver exited caused by one of >> the >> running tasks) Reason: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 1.0 in stage 10.0 (TID >> 108, >> localhost): ExecutorLostFailure (executor driver exited caused by one of >> the >> running tasks) Reason: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 4.0 in stage 10.0 (TID >> 111, >> localhost): 
ExecutorLostFailure (executor driver exited caused by one of >> the >> running tasks) Reason: Executor heartbeat timed out after 144662 ms >> 16/05/16 15:22:26 INFO TaskSchedulerImpl: Removed TaskSet 10.0, whose >> tasks >> have all completed, from pool >> 16/05/16 15:22:26 ERROR ActorSystemImpl: Uncaught fatal error from thread >> [sparkDriverActorSystem-scheduler-1] shutting down ActorSystem >> [sparkDriverActorSystem] >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> at >> akka.dispatch.AbstractNodeQueue.<init>(AbstractNodeQueue.java:22) >> at >> >> akka.actor.LightArrayRevolverScheduler$TaskQueue.<init>(Scheduler.scala:443) >> at >> >> akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409) >> at >> akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375) >> at java.lang.Thread.run(Thread.java:745) >> 16/05/16 15:22:27 INFO RemoteActorRefProvider$RemotingTerminator: Shutting >> down remote daemon. >> 16/05/16 15:22:27 INFO RemoteActorRefProvider$RemotingTerminator: Remote >> daemon shut down; proceeding with flushing remote transports. >> 16/05/16 15:22:27 WARN NettyRpcEnv: Ignored message: true >> 16/05/16 15:22:27 WARN NettyRpcEnv: Ignored message: true >> 16/05/16 15:22:27 ERROR SparkUncaughtExceptionHandler: Uncaught exception >> in >> thread Thread[Executor task launch worker-14,5,main] >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> at java.lang.Double.valueOf(Double.java:519) >> at scala.runtime.BoxesRunTime.boxToDouble(BoxesRunTime.java:84) >> at >> >> org.apache.spark.sql.catalyst.expressions.MutableRow.setDouble(rows.scala:176) >> at >> >> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown >> Source) >> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) >> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) >> at >> scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) >> at >> org.spark-project.guava.collect.Ordering.leastOf(Ordering.java:665) >> at >> org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37) >> at >> >> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1391) >> at >> >> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1388) >> at >> >> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) >> at >> >> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) >> at >> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) >> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) >> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) >> at >> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) >> at org.apache.spark.scheduler.Task.run(Task.scala:89) >> at >> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) >> at >> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745) >> 16/05/16 15:22:27 INFO TaskSchedulerImpl: Cancelling stage 10 >> 16/05/16 15:22:27 WARN SparkContext: Killing executors is only supported >> in >> coarse-grained mode >> 16/05/16 15:22:27 INFO DAGScheduler: ResultStage 10 (head at >> <ipython-input-13-f753ebdb6b0f>:13) failed in 667.824 s >> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >> 16/05/16 15:22:27 INFO BlockManager: 
BlockManager re-registering with >> master >> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register BlockManager >> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >> 16/05/16 15:22:27 INFO DAGScheduler: Executor lost: driver (epoch 2) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Trying to remove >> executor >> driver from BlockManagerMaster. >> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Removing block manager >> BlockManagerId(driver, localhost, 59407) >> 16/05/16 15:22:27 INFO DAGScheduler: Job 8 failed: head at >> <ipython-input-13-f753ebdb6b0f>:13, took 667.845630 s >> 16/05/16 15:22:27 ERROR BlockManager: Failed to report broadcast_15_piece0 >> to master; giving up. >> 16/05/16 15:22:27 INFO BlockManagerMaster: Removed driver successfully in >> removeExecutor >> 16/05/16 15:22:27 INFO DAGScheduler: Host added was in lost list earlier: >> localhost >> 16/05/16 15:22:27 INFO SparkContext: Invoking stop() from shutdown hook >> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >> master >> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register BlockManager >> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Registering block >> manager >> localhost:59407 with 51.5 GB RAM, BlockManagerId(driver, localhost, 59407) >> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >> memory >> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >> master >> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register BlockManager >> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. 
>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >> memory >> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >> master >> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register BlockManager >> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >> memory >> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >> master >> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register BlockManager >> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >> memory >> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat >> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with >> master >> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register BlockManager >> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager >> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master. >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO SparkUI: Stopped Spark web UI at >> http://192.168.107.30:4040 >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in >> memory >> on localhost:59407 (size: 8.2 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in >> memory >> on localhost:59407 (size: 19.3 KB, free: 51.5 GB) >> 16/05/16 15:22:56 INFO MapOutputTrackerMasterEndpoint: >> MapOutputTrackerMasterEndpoint stopped! >> 05-16 15:22:56.111 127.0.0.1:54321 2059 #e Thread WARN: Swapping! 
>> GC CALLBACK, (K/V:29.74 GB + POJO:15.20 GB + FREE:12.56 GB == >> MEM_MAX:57.50 >> GB), desiredKV=8.12 GB OOM! >> 16/05/16 15:22:56 INFO RemoteActorRefProvider$RemotingTerminator: Remoting >> shut down. >> 16/05/16 15:22:56 WARN NettyRpcEndpointRef: Error sending message >> [message = >> Heartbeat(driver,[Lscala.Tuple2;@797268e9,BlockManagerId(driver, >> localhost, >> 59407))] in 1 attempts >> org.apache.spark.SparkException: Could not find HeartbeatReceiver or it >> has >> been stopped. >> at >> org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:161) >> at >> >> org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:126) >> at >> org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:227) >> at >> org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:511) >> at >> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:100) >> at >> org.apache.spark.executor.Executor.org >> $apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449) >> at >> >> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470) >> at >> >> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470) >> at >> >> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470) >> at >> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765) >> at >> org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470) >> at >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) >> at >> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) >> at >> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) >> at >> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) >> at >> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745)" >> >> >> >> -- >> View this message in context: >> http://apache-spark-user-list.1001560.n3.nabble.com/GC-overhead-limit-exceeded-tp26966.html >> Sent from the Apache Spark User List mailing list archive at Nabble.com. >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> For additional commands, e-mail: user-h...@spark.apache.org >> >> > > > -- > --- > Takeshi Yamamuro >