[
https://issues.apache.org/jira/browse/SPARK-18343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luke Miner updated SPARK-18343:
-------------------------------
Description:
I have a driver program where I read data in from Cassandra using Spark,
perform some operations, and then write out to JSON on S3. The program runs
fine when I use Spark 1.6.1 and the spark-cassandra-connector 1.6.0-M1.
However, if I try to upgrade to Spark 2.0.1 (hadoop 2.7.1) and
spark-cassandra-connector 2.0.0-M3, the program completes in the sense that all
the expected files are written to S3, but the program never terminates.
I do run `sc.stop()` at the end of the program. I am also using Mesos 1.0.1. In
both cases I use the default output committer.
From the thread dump (included below) it seems like it could be waiting on:
`org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner`
Code snippet:
{code}
// get MongoDB oplog operations
val operations = sc.cassandraTable[JsonOperation](keyspace, namespace)
  .where("ts >= ? AND ts < ?", minTimestamp, maxTimestamp)

// replay oplog operations into documents
val documents = operations
  .spanBy(op => op.id)
  .map { case (id: String, ops: Iterable[T]) => (id, apply(ops)) }
  .filter { case (id, result) => result.isInstanceOf[Document] }
  .map { case (id, document) =>
    MergedDocument(id = id, document = document.asInstanceOf[Document])
  }

// write documents to json on s3
documents
  .map(document => document.toJson)
  .coalesce(partitions)
  .saveAsTextFile(path, classOf[GzipCodec])

sc.stop()
{code}
Thread dump on the driver:
{code}
60 context-cleaner-periodic-gc TIMED_WAITING
46 dag-scheduler-event-loop WAITING
4389 DestroyJavaVM RUNNABLE
12 dispatcher-event-loop-0 WAITING
13 dispatcher-event-loop-1 WAITING
14 dispatcher-event-loop-2 WAITING
15 dispatcher-event-loop-3 WAITING
47 driver-revive-thread TIMED_WAITING
3 Finalizer WAITING
82 ForkJoinPool-1-worker-17 WAITING
43 heartbeat-receiver-event-loop-thread TIMED_WAITING
93 java-sdk-http-connection-reaper TIMED_WAITING
4387 java-sdk-progress-listener-callback-thread WAITING
25 map-output-dispatcher-0 WAITING
26 map-output-dispatcher-1 WAITING
27 map-output-dispatcher-2 WAITING
28 map-output-dispatcher-3 WAITING
29 map-output-dispatcher-4 WAITING
30 map-output-dispatcher-5 WAITING
31 map-output-dispatcher-6 WAITING
32 map-output-dispatcher-7 WAITING
48 MesosCoarseGrainedSchedulerBackend-mesos-driver RUNNABLE
44 netty-rpc-env-timeout TIMED_WAITING
92 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner WAITING
62 pool-19-thread-1 TIMED_WAITING
2 Reference Handler WAITING
61 Scheduler-1112394071 TIMED_WAITING
20 shuffle-server-0 RUNNABLE
55 shuffle-server-0 RUNNABLE
21 shuffle-server-1 RUNNABLE
56 shuffle-server-1 RUNNABLE
22 shuffle-server-2 RUNNABLE
57 shuffle-server-2 RUNNABLE
23 shuffle-server-3 RUNNABLE
58 shuffle-server-3 RUNNABLE
4 Signal Dispatcher RUNNABLE
59 Spark Context Cleaner TIMED_WAITING
9 SparkListenerBus WAITING
35 SparkUI-35-selector-ServerConnectorManager@651d3734/0 RUNNABLE
36 SparkUI-36-acceptor-0@467924cb-ServerConnector@3b5eaf92{HTTP/1.1}{0.0.0.0:4040} RUNNABLE
37 SparkUI-37-selector-ServerConnectorManager@651d3734/1 RUNNABLE
38 SparkUI-38 TIMED_WAITING
39 SparkUI-39 TIMED_WAITING
40 SparkUI-40 TIMED_WAITING
41 SparkUI-41 RUNNABLE
42 SparkUI-42 TIMED_WAITING
438 task-result-getter-0 WAITING
450 task-result-getter-1 WAITING
489 task-result-getter-2 WAITING
492 task-result-getter-3 WAITING
75 threadDeathWatcher-2-1 TIMED_WAITING
45 Timer-0 WAITING
{code}
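The driver dump above includes a `DestroyJavaVM` thread, which typically means that `main` has already returned and the JVM is waiting for the remaining non-daemon threads to finish. A minimal sketch for narrowing down which threads those are, assuming it is called in the driver right after `sc.stop()`; `ThreadAudit` and `nonDaemonThreads` are hypothetical names for illustration and are not part of Spark or Hadoop:

```scala
import scala.collection.JavaConverters._

// Hypothetical helper (not a Spark API): reports threads that can keep the JVM alive.
object ThreadAudit {
  /** Names of live non-daemon threads, excluding the calling thread. */
  def nonDaemonThreads(): List[String] =
    Thread.getAllStackTraces.keySet.asScala.toList
      .filter(t => t.isAlive && !t.isDaemon && (t ne Thread.currentThread))
      .map(_.getName)
      .sorted

  def main(args: Array[String]): Unit =
    // In the driver this would run after sc.stop(); here it just prints the current state.
    nonDaemonThreads().foreach(name => println(s"non-daemon thread still alive: $name"))
}
```

Only non-daemon threads prevent JVM shutdown, so any name this prints is a candidate for what keeps the driver process running.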
Thread dump on the executors. It's the same on all of them:
{code}
24 dispatcher-event-loop-0 WAITING
25 dispatcher-event-loop-1 WAITING
26 dispatcher-event-loop-2 RUNNABLE
27 dispatcher-event-loop-3 WAITING
39 driver-heartbeater TIMED_WAITING
3 Finalizer WAITING
58 java-sdk-http-connection-reaper TIMED_WAITING
75 java-sdk-progress-listener-callback-thread WAITING
1 main TIMED_WAITING
33 netty-rpc-env-timeout TIMED_WAITING
55 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner WAITING
59 pool-17-thread-1 TIMED_WAITING
2 Reference Handler WAITING
28 shuffle-client-0 RUNNABLE
35 shuffle-client-0 RUNNABLE
41 shuffle-client-0 RUNNABLE
37 shuffle-server-0 RUNNABLE
5 Signal Dispatcher RUNNABLE
23 threadDeathWatcher-2-1 TIMED_WAITING
{code}
Jstack of an executor:
{code}
ubuntu@ip-10-0-230-88:~$ sudo jstack 21811
2016-11-08 21:38:02
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f8234003800 nid=0x5a4c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"java-sdk-progress-listener-callback-thread" daemon prio=10 tid=0x00007f8218001000 nid=0x55c5 waiting on condition [0x00007f81e98d5000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000078797f4f8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"pool-17-thread-1" daemon prio=10 tid=0x00007f82141f9000 nid=0x5597 waiting on condition [0x00007f81fc2bb000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000074d9008e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"java-sdk-http-connection-reaper" daemon prio=10 tid=0x00007f820837e000 nid=0x5596 waiting on condition [0x00007f81fc3bc000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.amazonaws.http.IdleConnectionReaper.run(IdleConnectionReaper.java:112)

"org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" daemon prio=10 tid=0x00007f8208352800 nid=0x5594 in Object.wait() [0x00007f824cc13000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x0000000756803100> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    - locked <0x0000000756803100> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
    at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3063)
    at java.lang.Thread.run(Thread.java:745)

"shuffle-client-0" daemon prio=10 tid=0x00007f8208110800 nid=0x5593 runnable [0x00007f824ca11000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x0000000756803238> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x0000000756803258> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000007568031f0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)

"shuffle-client-0" daemon prio=10 tid=0x00007f820803b800 nid=0x5578 runnable [0x00007f824c704000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000007568033e0> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x0000000756803400> (a java.util.Collections$UnmodifiableSet)
    - locked <0x0000000756803398> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)

"driver-heartbeater" daemon prio=10 tid=0x00007f8200047800 nid=0x5573 waiting on condition [0x00007f81fdefb000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000007568036b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"shuffle-server-0" daemon prio=10 tid=0x00007f8200044000 nid=0x5572 runnable [0x00007f81fdffc000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000007568038c0> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x00000007568038e0> (a java.util.Collections$UnmodifiableSet)
    - locked <0x0000000756803878> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)

"shuffle-client-0" daemon prio=10 tid=0x000000000222c000 nid=0x5571 runnable [0x00007f824c1ff000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x0000000756803a68> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x0000000756803a88> (a java.util.Collections$UnmodifiableSet)
    - locked <0x0000000756803a20> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)

"netty-rpc-env-timeout" daemon prio=10 tid=0x00007f8285248000 nid=0x5570 waiting on condition [0x00007f824c300000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000756803b80> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-3" daemon prio=10 tid=0x00007f82851f4800 nid=0x556e waiting on condition [0x00007f824c502000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-2" daemon prio=10 tid=0x00007f82851f3800 nid=0x556d waiting on condition [0x00007f824c805000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-1" daemon prio=10 tid=0x00007f82851f3000 nid=0x556c waiting on condition [0x00007f824cf15000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-0" daemon prio=10 tid=0x00007f82851f2000 nid=0x556b waiting on condition [0x00007f824c906000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"threadDeathWatcher-2-1" daemon prio=10 tid=0x00007f820400e000 nid=0x5567 waiting on condition [0x00007f824c603000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at io.netty.util.ThreadDeathWatcher$Watcher.run(ThreadDeathWatcher.java:137)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
    at java.lang.Thread.run(Thread.java:745)

"Service Thread" daemon prio=10 tid=0x00007f82842ae000 nid=0x555a runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007f82842ab000 nid=0x5559 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007f82842a9000 nid=0x5558 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f82842a6800 nid=0x5557 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Surrogate Locker Thread (Concurrent GC)" daemon prio=10 tid=0x00007f82842a4800 nid=0x5556 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f8284282800 nid=0x5555 in Object.wait() [0x00007f8280dfc000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007568040a0> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    - locked <0x00000007568040a0> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" daemon prio=10 tid=0x00007f8284280800 nid=0x5554 in Object.wait() [0x00007f8280efd000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007568040e0> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x00000007568040e0> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f8284021000 nid=0x5547 waiting on condition [0x00007f828da05000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000756804ac8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1468)
    at org.apache.spark.rpc.netty.Dispatcher.awaitTermination(Dispatcher.scala:180)
    at org.apache.spark.rpc.netty.NettyRpcEnv.awaitTermination(NettyRpcEnv.scala:273)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:217)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:71)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:174)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:270)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)

"VM Thread" prio=10 tid=0x00007f828427c000 nid=0x5553 runnable
"Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x00007f8284035800 nid=0x5548 runnable
"Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x00007f8284037800 nid=0x5549 runnable
"Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x00007f8284039000 nid=0x554a runnable
"Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x00007f828403b000 nid=0x554b runnable
"G1 Main Concurrent Mark GC Thread" prio=10 tid=0x00007f828404f800 nid=0x5551 runnable
"Gang worker#0 (G1 Parallel Marking Threads)" prio=10 tid=0x00007f8284062000 nid=0x5552 runnable
"G1 Concurrent Refinement Thread#0" prio=10 tid=0x00007f8284045800 nid=0x5550 runnable
"G1 Concurrent Refinement Thread#1" prio=10 tid=0x00007f8284043800 nid=0x554f runnable
"G1 Concurrent Refinement Thread#2" prio=10 tid=0x00007f8284041800 nid=0x554e runnable
"G1 Concurrent Refinement Thread#3" prio=10 tid=0x00007f828403f800 nid=0x554d runnable
"G1 Concurrent Refinement Thread#4" prio=10 tid=0x00007f828403e000 nid=0x554c runnable
"VM Periodic Task Thread" prio=10 tid=0x00007f82842b8800 nid=0x555b waiting on condition
JNI global references: 358
{code}
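In jstack output, a thread's header line carries the token `daemon` when the thread is a daemon thread, and daemon threads do not by themselves keep a JVM alive. In the executor dump above, the `StatisticsDataReferenceCleaner` header is marked `daemon`. A minimal sketch for pulling the non-daemon thread names out of a saved dump, assuming each thread header sits on a single line (that is, the email line wrapping has been undone); `JstackNonDaemon` is a hypothetical helper name, not an existing tool:

```scala
// Hypothetical helper: lists threads in a jstack dump whose header lacks "daemon".
object JstackNonDaemon {
  // A jstack thread header starts with the quoted thread name.
  private val Header = "^\"([^\"]+)\"(.*)$".r

  /** Names of threads whose header line does not contain the `daemon` token. */
  def nonDaemonThreads(dump: String): List[String] =
    dump.split("\n").toList.collect {
      case Header(name, rest) if !rest.trim.split("\\s+").contains("daemon") => name
    }

  def main(args: Array[String]): Unit = {
    // Header lines taken from the executor dump above (stack frames omitted).
    val sample = Seq(
      "\"Attach Listener\" daemon prio=10 tid=0x00007f8234003800 nid=0x5a4c waiting on condition [0x0000000000000000]",
      "\"shuffle-server-0\" daemon prio=10 tid=0x00007f8200044000 nid=0x5572 runnable [0x00007f81fdffc000]",
      "\"main\" prio=10 tid=0x00007f8284021000 nid=0x5547 waiting on condition [0x00007f828da05000]"
    ).mkString("\n")
    nonDaemonThreads(sample).foreach(println)
  }
}
```

JVM-internal threads such as `VM Thread` also lack the `daemon` token, so they show up in the result and can be ignored when looking for application threads.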
> FileSystem$Statistics$StatisticsDataReferenceCleaner hangs on s3 write
> ----------------------------------------------------------------------
>
> Key: SPARK-18343
> URL: https://issues.apache.org/jira/browse/SPARK-18343
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.0.1
> Environment: Spark 2.0.1
> Hadoop 2.7.1
> Mesos 1.0.1
> Ubuntu 14.04
> Reporter: Luke Miner
>
> I have a driver program where I read data in from Cassandra using
> Spark, perform some operations, and then write out to JSON on S3. The program
> runs fine when I use Spark 1.6.1 and the spark-cassandra-connector 1.6.0-M1.
> However, if I try to upgrade to Spark 2.0.1 (hadoop 2.7.1) and
> spark-cassandra-connector 2.0.0-M3, the program completes in the sense that
> all the expected files are written to S3, but the program never terminates.
> I do run `sc.stop()` at the end of the program. I am also using Mesos 1.0.1.
> In both cases I use the default output committer.
> From the thread dump (included below) it seems like it could be waiting on:
> `org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner`
> Code snippet:
> {code}
> // get MongoDB oplog operations
> val operations = sc.cassandraTable[JsonOperation](keyspace, namespace)
>   .where("ts >= ? AND ts < ?", minTimestamp, maxTimestamp)
>
> // replay oplog operations into documents
> val documents = operations
>   .spanBy(op => op.id)
>   .map { case (id: String, ops: Iterable[T]) => (id, apply(ops)) }
>   .filter { case (id, result) => result.isInstanceOf[Document] }
>   .map { case (id, document) =>
>     MergedDocument(id = id, document = document.asInstanceOf[Document])
>   }
>
> // write documents to json on s3
> documents
>   .map(document => document.toJson)
>   .coalesce(partitions)
>   .saveAsTextFile(path, classOf[GzipCodec])
>
> sc.stop()
> {code}
> Thread dump on the driver:
> {code}
> 60 context-cleaner-periodic-gc TIMED_WAITING
> 46 dag-scheduler-event-loop WAITING
> 4389 DestroyJavaVM RUNNABLE
> 12 dispatcher-event-loop-0 WAITING
> 13 dispatcher-event-loop-1 WAITING
> 14 dispatcher-event-loop-2 WAITING
> 15 dispatcher-event-loop-3 WAITING
> 47 driver-revive-thread TIMED_WAITING
> 3 Finalizer WAITING
> 82 ForkJoinPool-1-worker-17 WAITING
> 43 heartbeat-receiver-event-loop-thread TIMED_WAITING
> 93 java-sdk-http-connection-reaper TIMED_WAITING
> 4387 java-sdk-progress-listener-callback-thread WAITING
> 25 map-output-dispatcher-0 WAITING
> 26 map-output-dispatcher-1 WAITING
> 27 map-output-dispatcher-2 WAITING
> 28 map-output-dispatcher-3 WAITING
> 29 map-output-dispatcher-4 WAITING
> 30 map-output-dispatcher-5 WAITING
> 31 map-output-dispatcher-6 WAITING
> 32 map-output-dispatcher-7 WAITING
> 48 MesosCoarseGrainedSchedulerBackend-mesos-driver RUNNABLE
> 44 netty-rpc-env-timeout TIMED_WAITING
> 92 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner WAITING
> 62 pool-19-thread-1 TIMED_WAITING
> 2 Reference Handler WAITING
> 61 Scheduler-1112394071 TIMED_WAITING
> 20 shuffle-server-0 RUNNABLE
> 55 shuffle-server-0 RUNNABLE
> 21 shuffle-server-1 RUNNABLE
> 56 shuffle-server-1 RUNNABLE
> 22 shuffle-server-2 RUNNABLE
> 57 shuffle-server-2 RUNNABLE
> 23 shuffle-server-3 RUNNABLE
> 58 shuffle-server-3 RUNNABLE
> 4 Signal Dispatcher RUNNABLE
> 59 Spark Context Cleaner TIMED_WAITING
> 9 SparkListenerBus WAITING
> 35 SparkUI-35-selector-ServerConnectorManager@651d3734/0 RUNNABLE
> 36 SparkUI-36-acceptor-0@467924cb-ServerConnector@3b5eaf92{HTTP/1.1}{0.0.0.0:4040} RUNNABLE
> 37 SparkUI-37-selector-ServerConnectorManager@651d3734/1 RUNNABLE
> 38 SparkUI-38 TIMED_WAITING
> 39 SparkUI-39 TIMED_WAITING
> 40 SparkUI-40 TIMED_WAITING
> 41 SparkUI-41 RUNNABLE
> 42 SparkUI-42 TIMED_WAITING
> 438 task-result-getter-0 WAITING
> 450 task-result-getter-1 WAITING
> 489 task-result-getter-2 WAITING
> 492 task-result-getter-3 WAITING
> 75 threadDeathWatcher-2-1 TIMED_WAITING
> 45 Timer-0 WAITING
> {code}
> Thread dump on the executors. It's the same on all of them:
> {code}
> 24 dispatcher-event-loop-0 WAITING
> 25 dispatcher-event-loop-1 WAITING
> 26 dispatcher-event-loop-2 RUNNABLE
> 27 dispatcher-event-loop-3 WAITING
> 39 driver-heartbeater TIMED_WAITING
> 3 Finalizer WAITING
> 58 java-sdk-http-connection-reaper TIMED_WAITING
> 75 java-sdk-progress-listener-callback-thread WAITING
> 1 main TIMED_WAITING
> 33 netty-rpc-env-timeout TIMED_WAITING
> 55 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner WAITING
> 59 pool-17-thread-1 TIMED_WAITING
> 2 Reference Handler WAITING
> 28 shuffle-client-0 RUNNABLE
> 35 shuffle-client-0 RUNNABLE
> 41 shuffle-client-0 RUNNABLE
> 37 shuffle-server-0 RUNNABLE
> 5 Signal Dispatcher RUNNABLE
> 23 threadDeathWatcher-2-1 TIMED_WAITING
> {code}
> Jstack of an executor:
> {code}
> ubuntu@ip-10-0-230-88:~$ sudo jstack 21811
> 2016-11-08 21:38:02
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x00007f8234003800 nid=0x5a4c waiting on
> condition [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "java-sdk-progress-listener-callback-thread" daemon prio=10
> tid=0x00007f8218001000 nid=0x55c5 waiting on condition [0x00007f81e98d5000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x000000078797f4f8> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "pool-17-thread-1" daemon prio=10 tid=0x00007f82141f9000 nid=0x5597 waiting
> on condition [0x00007f81fc2bb000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x000000074d9008e8> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
> at
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "java-sdk-http-connection-reaper" daemon prio=10 tid=0x00007f820837e000
> nid=0x5596 waiting on condition [0x00007f81fc3bc000]
> java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at
> com.amazonaws.http.IdleConnectionReaper.run(IdleConnectionReaper.java:112)
> "org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" daemon prio=10 tid=0x00007f8208352800 nid=0x5594 in Object.wait() [0x00007f824cc13000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x0000000756803100> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> - locked <0x0000000756803100> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3063)
> at java.lang.Thread.run(Thread.java:745)
> "shuffle-client-0" daemon prio=10 tid=0x00007f8208110800 nid=0x5593 runnable
> [0x00007f824ca11000]
> java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> - locked <0x0000000756803238> (a
> io.netty.channel.nio.SelectedSelectionKeySet)
> - locked <0x0000000756803258> (a java.util.Collections$UnmodifiableSet)
> - locked <0x00000007568031f0> (a sun.nio.ch.EPollSelectorImpl)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:745)
> "shuffle-client-0" daemon prio=10 tid=0x00007f820803b800 nid=0x5578 runnable
> [0x00007f824c704000]
> java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> - locked <0x00000007568033e0> (a
> io.netty.channel.nio.SelectedSelectionKeySet)
> - locked <0x0000000756803400> (a java.util.Collections$UnmodifiableSet)
> - locked <0x0000000756803398> (a sun.nio.ch.EPollSelectorImpl)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:745)
> "driver-heartbeater" daemon prio=10 tid=0x00007f8200047800 nid=0x5573 waiting
> on condition [0x00007f81fdefb000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000007568036b8> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
> at
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "shuffle-server-0" daemon prio=10 tid=0x00007f8200044000 nid=0x5572 runnable
> [0x00007f81fdffc000]
> java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> - locked <0x00000007568038c0> (a
> io.netty.channel.nio.SelectedSelectionKeySet)
> - locked <0x00000007568038e0> (a java.util.Collections$UnmodifiableSet)
> - locked <0x0000000756803878> (a sun.nio.ch.EPollSelectorImpl)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:745)
> "shuffle-client-0" daemon prio=10 tid=0x000000000222c000 nid=0x5571 runnable
> [0x00007f824c1ff000]
> java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> - locked <0x0000000756803a68> (a
> io.netty.channel.nio.SelectedSelectionKeySet)
> - locked <0x0000000756803a88> (a java.util.Collections$UnmodifiableSet)
> - locked <0x0000000756803a20> (a sun.nio.ch.EPollSelectorImpl)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:745)
> "netty-rpc-env-timeout" daemon prio=10 tid=0x00007f8285248000 nid=0x5570
> waiting on condition [0x00007f824c300000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000756803b80> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
> at
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "dispatcher-event-loop-3" daemon prio=10 tid=0x00007f82851f4800 nid=0x556e
> waiting on condition [0x00007f824c502000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000756802418> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at
> org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "dispatcher-event-loop-2" daemon prio=10 tid=0x00007f82851f3800 nid=0x556d
> waiting on condition [0x00007f824c805000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000756802418> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at
> org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "dispatcher-event-loop-1" daemon prio=10 tid=0x00007f82851f3000 nid=0x556c
> waiting on condition [0x00007f824cf15000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000756802418> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at
> org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "dispatcher-event-loop-0" daemon prio=10 tid=0x00007f82851f2000 nid=0x556b
> waiting on condition [0x00007f824c906000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000756802418> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at
> org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "threadDeathWatcher-2-1" daemon prio=10 tid=0x00007f820400e000 nid=0x5567
> waiting on condition [0x00007f824c603000]
> java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at
> io.netty.util.ThreadDeathWatcher$Watcher.run(ThreadDeathWatcher.java:137)
> at
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> at java.lang.Thread.run(Thread.java:745)
> "Service Thread" daemon prio=10 tid=0x00007f82842ae000 nid=0x555a runnable
> [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread1" daemon prio=10 tid=0x00007f82842ab000 nid=0x5559 waiting
> on condition [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread0" daemon prio=10 tid=0x00007f82842a9000 nid=0x5558 waiting
> on condition [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "Signal Dispatcher" daemon prio=10 tid=0x00007f82842a6800 nid=0x5557 runnable
> [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "Surrogate Locker Thread (Concurrent GC)" daemon prio=10
> tid=0x00007f82842a4800 nid=0x5556 waiting on condition [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "Finalizer" daemon prio=10 tid=0x00007f8284282800 nid=0x5555 in Object.wait()
> [0x00007f8280dfc000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000007568040a0> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> - locked <0x00000007568040a0> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
> "Reference Handler" daemon prio=10 tid=0x00007f8284280800 nid=0x5554 in
> Object.wait() [0x00007f8280efd000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000007568040e0> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:503)
> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
> - locked <0x00000007568040e0> (a java.lang.ref.Reference$Lock)
> "main" prio=10 tid=0x00007f8284021000 nid=0x5547 waiting on condition
> [0x00007f828da05000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000756804ac8> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1468)
> at
> org.apache.spark.rpc.netty.Dispatcher.awaitTermination(Dispatcher.scala:180)
> at
> org.apache.spark.rpc.netty.NettyRpcEnv.awaitTermination(NettyRpcEnv.scala:273)
> at
> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:217)
> at
> org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:71)
> at
> org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> at
> org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:70)
> at
> org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:174)
> at
> org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:270)
> at
> org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> "VM Thread" prio=10 tid=0x00007f828427c000 nid=0x5553 runnable
> "Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x00007f8284035800
> nid=0x5548 runnable
> "Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x00007f8284037800
> nid=0x5549 runnable
> "Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x00007f8284039000
> nid=0x554a runnable
> "Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x00007f828403b000
> nid=0x554b runnable
> "G1 Main Concurrent Mark GC Thread" prio=10 tid=0x00007f828404f800 nid=0x5551
> runnable
> "Gang worker#0 (G1 Parallel Marking Threads)" prio=10 tid=0x00007f8284062000
> nid=0x5552 runnable
> "G1 Concurrent Refinement Thread#0" prio=10 tid=0x00007f8284045800 nid=0x5550
> runnable
> "G1 Concurrent Refinement Thread#1" prio=10 tid=0x00007f8284043800 nid=0x554f
> runnable
> "G1 Concurrent Refinement Thread#2" prio=10 tid=0x00007f8284041800 nid=0x554e
> runnable
> "G1 Concurrent Refinement Thread#3" prio=10 tid=0x00007f828403f800 nid=0x554d
> runnable
> "G1 Concurrent Refinement Thread#4" prio=10 tid=0x00007f828403e000 nid=0x554c
> runnable
> "VM Periodic Task Thread" prio=10 tid=0x00007f82842b8800 nid=0x555b waiting
> on condition
> JNI global references: 358
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)