[ https://issues.apache.org/jira/browse/SPARK-18343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Miner updated SPARK-18343:
-------------------------------
    Description: 
I have a driver program where I read data in from Cassandra using Spark, 
perform some operations, and then write out to JSON on S3. The program runs 
fine when I use Spark 1.6.1 and the spark-cassandra-connector 1.6.0-M1.

However, if I try to upgrade to Spark 2.0.1 (hadoop 2.7.1) and 
spark-cassandra-connector 2.0.0-M3, the program completes in the sense that all 
the expected files are written to S3, but the program never terminates.

I do run `sc.stop()` at the end of the program. I am also using Mesos 1.0.1. In 
both cases I use the default output committer.

From the thread dump (included below), it seems like it could be waiting on: 
`org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner`

Code snippet:
{code}
    // get MongoDB oplog operations
    val operations = sc.cassandraTable[JsonOperation](keyspace, namespace)
      .where("ts >= ? AND ts < ?", minTimestamp, maxTimestamp)
    
    // replay oplog operations into documents
    val documents = operations
      .spanBy(op => op.id)
      .map { case (id: String, ops: Iterable[T]) => (id, apply(ops)) }
      .filter { case (id, result) => result.isInstanceOf[Document] }
      .map { case (id, document) =>
        MergedDocument(id = id, document = document.asInstanceOf[Document])
      }
    
    // write documents to json on s3
    documents
      .map(document => document.toJson)
      .coalesce(partitions)
      .saveAsTextFile(path, classOf[GzipCodec])
    sc.stop()
{code}

Thread dump on the driver:

{code}
    60  context-cleaner-periodic-gc TIMED_WAITING
    46  dag-scheduler-event-loop    WAITING
    4389    DestroyJavaVM   RUNNABLE
    12  dispatcher-event-loop-0 WAITING
    13  dispatcher-event-loop-1 WAITING
    14  dispatcher-event-loop-2 WAITING
    15  dispatcher-event-loop-3 WAITING
    47  driver-revive-thread    TIMED_WAITING
    3   Finalizer   WAITING
    82  ForkJoinPool-1-worker-17    WAITING
    43  heartbeat-receiver-event-loop-thread    TIMED_WAITING
    93  java-sdk-http-connection-reaper TIMED_WAITING
    4387    java-sdk-progress-listener-callback-thread  WAITING
    25  map-output-dispatcher-0 WAITING
    26  map-output-dispatcher-1 WAITING
    27  map-output-dispatcher-2 WAITING
    28  map-output-dispatcher-3 WAITING
    29  map-output-dispatcher-4 WAITING
    30  map-output-dispatcher-5 WAITING
    31  map-output-dispatcher-6 WAITING
    32  map-output-dispatcher-7 WAITING
    48  MesosCoarseGrainedSchedulerBackend-mesos-driver RUNNABLE
    44  netty-rpc-env-timeout   TIMED_WAITING
    92  org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner   WAITING
    62  pool-19-thread-1    TIMED_WAITING
    2   Reference Handler   WAITING
    61  Scheduler-1112394071    TIMED_WAITING
    20  shuffle-server-0    RUNNABLE
    55  shuffle-server-0    RUNNABLE
    21  shuffle-server-1    RUNNABLE
    56  shuffle-server-1    RUNNABLE
    22  shuffle-server-2    RUNNABLE
    57  shuffle-server-2    RUNNABLE
    23  shuffle-server-3    RUNNABLE
    58  shuffle-server-3    RUNNABLE
    4   Signal Dispatcher   RUNNABLE
    59  Spark Context Cleaner   TIMED_WAITING
    9   SparkListenerBus    WAITING
    35  SparkUI-35-selector-ServerConnectorManager@651d3734/0   RUNNABLE
    36  SparkUI-36-acceptor-0@467924cb-ServerConnector@3b5eaf92{HTTP/1.1}{0.0.0.0:4040} RUNNABLE
    37  SparkUI-37-selector-ServerConnectorManager@651d3734/1   RUNNABLE
    38  SparkUI-38  TIMED_WAITING
    39  SparkUI-39  TIMED_WAITING
    40  SparkUI-40  TIMED_WAITING
    41  SparkUI-41  RUNNABLE
    42  SparkUI-42  TIMED_WAITING
    438 task-result-getter-0    WAITING
    450 task-result-getter-1    WAITING
    489 task-result-getter-2    WAITING
    492 task-result-getter-3    WAITING
    75  threadDeathWatcher-2-1  TIMED_WAITING
    45  Timer-0 WAITING
{code}
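
The driver dump above lists thread names and states but not daemon status, and only non-daemon threads can keep a JVM from exiting. One way to narrow down the candidates is to list the live non-daemon threads from inside the driver, for example right after `sc.stop()`. The following is a minimal sketch using only the standard `java.lang.Thread` API; the object name `NonDaemonThreads` and its `list` method are hypothetical helpers, not part of Spark or Hadoop.

```scala
import scala.collection.JavaConverters._

// Hypothetical helper (not part of Spark): returns the live threads that are
// not daemons, since only non-daemon threads keep a JVM from exiting.
object NonDaemonThreads {
  def list(): Seq[Thread] =
    Thread.getAllStackTraces.keySet.asScala.toSeq
      .filter(t => t.isAlive && !t.isDaemon)
      .sortBy(_.getName)

  // Prints id, name, and state for each remaining non-daemon thread.
  def main(args: Array[String]): Unit =
    list().foreach { t =>
      println(s"${t.getId}\t${t.getName}\t${t.getState}")
    }
}
```

Calling `NonDaemonThreads.list()` as the last statement of the driver would show which threads are still able to hold the process open at that point.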

Thread dump on the executors. It's the same on all of them:

{code}
    24  dispatcher-event-loop-0 WAITING
    25  dispatcher-event-loop-1 WAITING
    26  dispatcher-event-loop-2 RUNNABLE
    27  dispatcher-event-loop-3 WAITING
    39  driver-heartbeater  TIMED_WAITING
    3   Finalizer   WAITING
    58  java-sdk-http-connection-reaper TIMED_WAITING
    75  java-sdk-progress-listener-callback-thread  WAITING
    1   main    TIMED_WAITING
    33  netty-rpc-env-timeout   TIMED_WAITING
    55  org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner   WAITING
    59  pool-17-thread-1    TIMED_WAITING
    2   Reference Handler   WAITING
    28  shuffle-client-0    RUNNABLE
    35  shuffle-client-0    RUNNABLE
    41  shuffle-client-0    RUNNABLE
    37  shuffle-server-0    RUNNABLE
    5   Signal Dispatcher   RUNNABLE
    23  threadDeathWatcher-2-1  TIMED_WAITING
{code}

Jstack of an executor:

{code}
ubuntu@ip-10-0-230-88:~$ sudo jstack  21811
2016-11-08 21:38:02
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f8234003800 nid=0x5a4c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"java-sdk-progress-listener-callback-thread" daemon prio=10 tid=0x00007f8218001000 nid=0x55c5 waiting on condition [0x00007f81e98d5000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000078797f4f8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"pool-17-thread-1" daemon prio=10 tid=0x00007f82141f9000 nid=0x5597 waiting on condition [0x00007f81fc2bb000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000074d9008e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"java-sdk-http-connection-reaper" daemon prio=10 tid=0x00007f820837e000 nid=0x5596 waiting on condition [0x00007f81fc3bc000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.amazonaws.http.IdleConnectionReaper.run(IdleConnectionReaper.java:112)

"org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" daemon prio=10 tid=0x00007f8208352800 nid=0x5594 in Object.wait() [0x00007f824cc13000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x0000000756803100> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x0000000756803100> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3063)
        at java.lang.Thread.run(Thread.java:745)

"shuffle-client-0" daemon prio=10 tid=0x00007f8208110800 nid=0x5593 runnable [0x00007f824ca11000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x0000000756803238> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x0000000756803258> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000007568031f0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)

"shuffle-client-0" daemon prio=10 tid=0x00007f820803b800 nid=0x5578 runnable [0x00007f824c704000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000007568033e0> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x0000000756803400> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000756803398> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)

"driver-heartbeater" daemon prio=10 tid=0x00007f8200047800 nid=0x5573 waiting on condition [0x00007f81fdefb000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007568036b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"shuffle-server-0" daemon prio=10 tid=0x00007f8200044000 nid=0x5572 runnable [0x00007f81fdffc000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000007568038c0> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000007568038e0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000756803878> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)

"shuffle-client-0" daemon prio=10 tid=0x000000000222c000 nid=0x5571 runnable [0x00007f824c1ff000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x0000000756803a68> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x0000000756803a88> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000756803a20> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)

"netty-rpc-env-timeout" daemon prio=10 tid=0x00007f8285248000 nid=0x5570 waiting on condition [0x00007f824c300000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000756803b80> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-3" daemon prio=10 tid=0x00007f82851f4800 nid=0x556e waiting on condition [0x00007f824c502000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-2" daemon prio=10 tid=0x00007f82851f3800 nid=0x556d waiting on condition [0x00007f824c805000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-1" daemon prio=10 tid=0x00007f82851f3000 nid=0x556c waiting on condition [0x00007f824cf15000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-0" daemon prio=10 tid=0x00007f82851f2000 nid=0x556b waiting on condition [0x00007f824c906000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"threadDeathWatcher-2-1" daemon prio=10 tid=0x00007f820400e000 nid=0x5567 waiting on condition [0x00007f824c603000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at io.netty.util.ThreadDeathWatcher$Watcher.run(ThreadDeathWatcher.java:137)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)

"Service Thread" daemon prio=10 tid=0x00007f82842ae000 nid=0x555a runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007f82842ab000 nid=0x5559 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007f82842a9000 nid=0x5558 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f82842a6800 nid=0x5557 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Surrogate Locker Thread (Concurrent GC)" daemon prio=10 tid=0x00007f82842a4800 nid=0x5556 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f8284282800 nid=0x5555 in Object.wait() [0x00007f8280dfc000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000007568040a0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x00000007568040a0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" daemon prio=10 tid=0x00007f8284280800 nid=0x5554 in Object.wait() [0x00007f8280efd000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000007568040e0> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00000007568040e0> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f8284021000 nid=0x5547 waiting on condition [0x00007f828da05000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000756804ac8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1468)
        at org.apache.spark.rpc.netty.Dispatcher.awaitTermination(Dispatcher.scala:180)
        at org.apache.spark.rpc.netty.NettyRpcEnv.awaitTermination(NettyRpcEnv.scala:273)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:217)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:71)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:70)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:174)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:270)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)

"VM Thread" prio=10 tid=0x00007f828427c000 nid=0x5553 runnable

"Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x00007f8284035800 nid=0x5548 runnable

"Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x00007f8284037800 nid=0x5549 runnable

"Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x00007f8284039000 nid=0x554a runnable

"Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x00007f828403b000 nid=0x554b runnable

"G1 Main Concurrent Mark GC Thread" prio=10 tid=0x00007f828404f800 nid=0x5551 runnable

"Gang worker#0 (G1 Parallel Marking Threads)" prio=10 tid=0x00007f8284062000 nid=0x5552 runnable

"G1 Concurrent Refinement Thread#0" prio=10 tid=0x00007f8284045800 nid=0x5550 runnable

"G1 Concurrent Refinement Thread#1" prio=10 tid=0x00007f8284043800 nid=0x554f runnable

"G1 Concurrent Refinement Thread#2" prio=10 tid=0x00007f8284041800 nid=0x554e runnable

"G1 Concurrent Refinement Thread#3" prio=10 tid=0x00007f828403f800 nid=0x554d runnable

"G1 Concurrent Refinement Thread#4" prio=10 tid=0x00007f828403e000 nid=0x554c runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f82842b8800 nid=0x555b waiting on condition

JNI global references: 358
{code}



> FileSystem$Statistics$StatisticsDataReferenceCleaner hangs on s3 write
> ----------------------------------------------------------------------
>
>                 Key: SPARK-18343
>                 URL: https://issues.apache.org/jira/browse/SPARK-18343
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.1
>         Environment: Spark 2.0.1
> Hadoop 2.7.1
> Mesos 1.0.1
> Ubuntu 14.04
>            Reporter: Luke Miner
>
> I have a driver program where I read data in from Cassandra using 
> Spark, perform some operations, and then write out to JSON on S3. The program 
> runs fine when I use Spark 1.6.1 and the spark-cassandra-connector 1.6.0-M1.
> However, if I try to upgrade to Spark 2.0.1 (hadoop 2.7.1) and 
> spark-cassandra-connector 2.0.0-M3, the program completes in the sense that 
> all the expected files are written to S3, but the program never terminates.
> I do run `sc.stop()` at the end of the program. I am also using Mesos 1.0.1. 
> In both cases I use the default output committer.
> From the thread dump (included below) it seems like it could be waiting on: 
> `org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner`
> Code snippet:
> {code}
>     // get MongoDB oplog operations
>     val operations = sc.cassandraTable[JsonOperation](keyspace, namespace)
>       .where("ts >= ? AND ts < ?", minTimestamp, maxTimestamp)
>     
>     // replay oplog operations into documents
>     val documents = operations
>       .spanBy(op => op.id)
>       .map { case (id: String, ops: Iterable[T]) => (id, apply(ops)) }
>       .filter { case (id, result) => result.isInstanceOf[Document] }
>       .map { case (id, document) =>
>         MergedDocument(id = id, document = document.asInstanceOf[Document])
>       }
>     
>     // write documents to json on s3
>     documents
>       .map(document => document.toJson)
>       .coalesce(partitions)
>       .saveAsTextFile(path, classOf[GzipCodec])
>     sc.stop()
> {code}
> Thread dump on the driver:
> {code}
>     60  context-cleaner-periodic-gc TIMED_WAITING
>     46  dag-scheduler-event-loop    WAITING
>     4389    DestroyJavaVM   RUNNABLE
>     12  dispatcher-event-loop-0 WAITING
>     13  dispatcher-event-loop-1 WAITING
>     14  dispatcher-event-loop-2 WAITING
>     15  dispatcher-event-loop-3 WAITING
>     47  driver-revive-thread    TIMED_WAITING
>     3   Finalizer   WAITING
>     82  ForkJoinPool-1-worker-17    WAITING
>     43  heartbeat-receiver-event-loop-thread    TIMED_WAITING
>     93  java-sdk-http-connection-reaper TIMED_WAITING
>     4387    java-sdk-progress-listener-callback-thread  WAITING
>     25  map-output-dispatcher-0 WAITING
>     26  map-output-dispatcher-1 WAITING
>     27  map-output-dispatcher-2 WAITING
>     28  map-output-dispatcher-3 WAITING
>     29  map-output-dispatcher-4 WAITING
>     30  map-output-dispatcher-5 WAITING
>     31  map-output-dispatcher-6 WAITING
>     32  map-output-dispatcher-7 WAITING
>     48  MesosCoarseGrainedSchedulerBackend-mesos-driver RUNNABLE
>     44  netty-rpc-env-timeout   TIMED_WAITING
>     92  org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner   WAITING
>     62  pool-19-thread-1    TIMED_WAITING
>     2   Reference Handler   WAITING
>     61  Scheduler-1112394071    TIMED_WAITING
>     20  shuffle-server-0    RUNNABLE
>     55  shuffle-server-0    RUNNABLE
>     21  shuffle-server-1    RUNNABLE
>     56  shuffle-server-1    RUNNABLE
>     22  shuffle-server-2    RUNNABLE
>     57  shuffle-server-2    RUNNABLE
>     23  shuffle-server-3    RUNNABLE
>     58  shuffle-server-3    RUNNABLE
>     4   Signal Dispatcher   RUNNABLE
>     59  Spark Context Cleaner   TIMED_WAITING
>     9   SparkListenerBus    WAITING
>     35  SparkUI-35-selector-ServerConnectorManager@651d3734/0   RUNNABLE
>     36  SparkUI-36-acceptor-0@467924cb-ServerConnector@3b5eaf92{HTTP/1.1}{0.0.0.0:4040}   RUNNABLE
>     37  SparkUI-37-selector-ServerConnectorManager@651d3734/1   RUNNABLE
>     38  SparkUI-38  TIMED_WAITING
>     39  SparkUI-39  TIMED_WAITING
>     40  SparkUI-40  TIMED_WAITING
>     41  SparkUI-41  RUNNABLE
>     42  SparkUI-42  TIMED_WAITING
>     438 task-result-getter-0    WAITING
>     450 task-result-getter-1    WAITING
>     489 task-result-getter-2    WAITING
>     492 task-result-getter-3    WAITING
>     75  threadDeathWatcher-2-1  TIMED_WAITING
>     45  Timer-0 WAITING
> {code}
> Thread dump on the executors. It's the same on all of them:
> {code}
>     24  dispatcher-event-loop-0 WAITING
>     25  dispatcher-event-loop-1 WAITING
>     26  dispatcher-event-loop-2 RUNNABLE
>     27  dispatcher-event-loop-3 WAITING
>     39  driver-heartbeater  TIMED_WAITING
>     3   Finalizer   WAITING
>     58  java-sdk-http-connection-reaper TIMED_WAITING
>     75  java-sdk-progress-listener-callback-thread  WAITING
>     1   main    TIMED_WAITING
>     33  netty-rpc-env-timeout   TIMED_WAITING
>     55  org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner   WAITING
>     59  pool-17-thread-1    TIMED_WAITING
>     2   Reference Handler   WAITING
>     28  shuffle-client-0    RUNNABLE
>     35  shuffle-client-0    RUNNABLE
>     41  shuffle-client-0    RUNNABLE
>     37  shuffle-server-0    RUNNABLE
>     5   Signal Dispatcher   RUNNABLE
>     23  threadDeathWatcher-2-1  TIMED_WAITING
> {code}
> Jstack of an executor:
> {code}
> ubuntu@ip-10-0-230-88:~$ sudo jstack  21811
> 2016-11-08 21:38:02
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x00007f8234003800 nid=0x5a4c waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> "java-sdk-progress-listener-callback-thread" daemon prio=10 tid=0x00007f8218001000 nid=0x55c5 waiting on condition [0x00007f81e98d5000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x000000078797f4f8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>       at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>       at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> "pool-17-thread-1" daemon prio=10 tid=0x00007f82141f9000 nid=0x5597 waiting on condition [0x00007f81fc2bb000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x000000074d9008e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
>       at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> "java-sdk-http-connection-reaper" daemon prio=10 tid=0x00007f820837e000 nid=0x5596 waiting on condition [0x00007f81fc3bc000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>       at java.lang.Thread.sleep(Native Method)
>       at com.amazonaws.http.IdleConnectionReaper.run(IdleConnectionReaper.java:112)
> "org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" daemon prio=10 tid=0x00007f8208352800 nid=0x5594 in Object.wait() [0x00007f824cc13000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0x0000000756803100> (a java.lang.ref.ReferenceQueue$Lock)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>       - locked <0x0000000756803100> (a java.lang.ref.ReferenceQueue$Lock)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>       at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3063)
>       at java.lang.Thread.run(Thread.java:745)
> "shuffle-client-0" daemon prio=10 tid=0x00007f8208110800 nid=0x5593 runnable [0x00007f824ca11000]
>    java.lang.Thread.State: RUNNABLE
>       at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>       at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>       at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>       at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>       - locked <0x0000000756803238> (a io.netty.channel.nio.SelectedSelectionKeySet)
>       - locked <0x0000000756803258> (a java.util.Collections$UnmodifiableSet)
>       - locked <0x00000007568031f0> (a sun.nio.ch.EPollSelectorImpl)
>       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>       at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
>       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>       at java.lang.Thread.run(Thread.java:745)
> "shuffle-client-0" daemon prio=10 tid=0x00007f820803b800 nid=0x5578 runnable [0x00007f824c704000]
>    java.lang.Thread.State: RUNNABLE
>       at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>       at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>       at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>       at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>       - locked <0x00000007568033e0> (a io.netty.channel.nio.SelectedSelectionKeySet)
>       - locked <0x0000000756803400> (a java.util.Collections$UnmodifiableSet)
>       - locked <0x0000000756803398> (a sun.nio.ch.EPollSelectorImpl)
>       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>       at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
>       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>       at java.lang.Thread.run(Thread.java:745)
> "driver-heartbeater" daemon prio=10 tid=0x00007f8200047800 nid=0x5573 waiting on condition [0x00007f81fdefb000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000007568036b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
>       at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> "shuffle-server-0" daemon prio=10 tid=0x00007f8200044000 nid=0x5572 runnable [0x00007f81fdffc000]
>    java.lang.Thread.State: RUNNABLE
>       at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>       at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>       at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>       at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>       - locked <0x00000007568038c0> (a io.netty.channel.nio.SelectedSelectionKeySet)
>       - locked <0x00000007568038e0> (a java.util.Collections$UnmodifiableSet)
>       - locked <0x0000000756803878> (a sun.nio.ch.EPollSelectorImpl)
>       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>       at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
>       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>       at java.lang.Thread.run(Thread.java:745)
> "shuffle-client-0" daemon prio=10 tid=0x000000000222c000 nid=0x5571 runnable [0x00007f824c1ff000]
>    java.lang.Thread.State: RUNNABLE
>       at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>       at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>       at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>       at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>       - locked <0x0000000756803a68> (a io.netty.channel.nio.SelectedSelectionKeySet)
>       - locked <0x0000000756803a88> (a java.util.Collections$UnmodifiableSet)
>       - locked <0x0000000756803a20> (a sun.nio.ch.EPollSelectorImpl)
>       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>       at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
>       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>       at java.lang.Thread.run(Thread.java:745)
> "netty-rpc-env-timeout" daemon prio=10 tid=0x00007f8285248000 nid=0x5570 waiting on condition [0x00007f824c300000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x0000000756803b80> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
>       at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
>       at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> "dispatcher-event-loop-3" daemon prio=10 tid=0x00007f82851f4800 nid=0x556e waiting on condition [0x00007f824c502000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>       at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>       at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> "dispatcher-event-loop-2" daemon prio=10 tid=0x00007f82851f3800 nid=0x556d waiting on condition [0x00007f824c805000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>       at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>       at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> "dispatcher-event-loop-1" daemon prio=10 tid=0x00007f82851f3000 nid=0x556c waiting on condition [0x00007f824cf15000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>       at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>       at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> "dispatcher-event-loop-0" daemon prio=10 tid=0x00007f82851f2000 nid=0x556b waiting on condition [0x00007f824c906000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x0000000756802418> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>       at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>       at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:207)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> "threadDeathWatcher-2-1" daemon prio=10 tid=0x00007f820400e000 nid=0x5567 waiting on condition [0x00007f824c603000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>       at java.lang.Thread.sleep(Native Method)
>       at io.netty.util.ThreadDeathWatcher$Watcher.run(ThreadDeathWatcher.java:137)
>       at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>       at java.lang.Thread.run(Thread.java:745)
> "Service Thread" daemon prio=10 tid=0x00007f82842ae000 nid=0x555a runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread1" daemon prio=10 tid=0x00007f82842ab000 nid=0x5559 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread0" daemon prio=10 tid=0x00007f82842a9000 nid=0x5558 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> "Signal Dispatcher" daemon prio=10 tid=0x00007f82842a6800 nid=0x5557 runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> "Surrogate Locker Thread (Concurrent GC)" daemon prio=10 tid=0x00007f82842a4800 nid=0x5556 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> "Finalizer" daemon prio=10 tid=0x00007f8284282800 nid=0x5555 in Object.wait() [0x00007f8280dfc000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0x00000007568040a0> (a java.lang.ref.ReferenceQueue$Lock)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>       - locked <0x00000007568040a0> (a java.lang.ref.ReferenceQueue$Lock)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>       at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
> "Reference Handler" daemon prio=10 tid=0x00007f8284280800 nid=0x5554 in Object.wait() [0x00007f8280efd000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0x00000007568040e0> (a java.lang.ref.Reference$Lock)
>       at java.lang.Object.wait(Object.java:503)
>       at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>       - locked <0x00000007568040e0> (a java.lang.ref.Reference$Lock)
> "main" prio=10 tid=0x00007f8284021000 nid=0x5547 waiting on condition [0x00007f828da05000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x0000000756804ac8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>       at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1468)
>       at org.apache.spark.rpc.netty.Dispatcher.awaitTermination(Dispatcher.scala:180)
>       at org.apache.spark.rpc.netty.NettyRpcEnv.awaitTermination(NettyRpcEnv.scala:273)
>       at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:217)
>       at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:71)
>       at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>       at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:70)
>       at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:174)
>       at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:270)
>       at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> "VM Thread" prio=10 tid=0x00007f828427c000 nid=0x5553 runnable 
> "Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x00007f8284035800 nid=0x5548 runnable 
> "Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x00007f8284037800 nid=0x5549 runnable 
> "Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x00007f8284039000 nid=0x554a runnable 
> "Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x00007f828403b000 nid=0x554b runnable 
> "G1 Main Concurrent Mark GC Thread" prio=10 tid=0x00007f828404f800 nid=0x5551 runnable 
> "Gang worker#0 (G1 Parallel Marking Threads)" prio=10 tid=0x00007f8284062000 nid=0x5552 runnable 
> "G1 Concurrent Refinement Thread#0" prio=10 tid=0x00007f8284045800 nid=0x5550 runnable 
> "G1 Concurrent Refinement Thread#1" prio=10 tid=0x00007f8284043800 nid=0x554f runnable 
> "G1 Concurrent Refinement Thread#2" prio=10 tid=0x00007f8284041800 nid=0x554e runnable 
> "G1 Concurrent Refinement Thread#3" prio=10 tid=0x00007f828403f800 nid=0x554d runnable 
> "G1 Concurrent Refinement Thread#4" prio=10 tid=0x00007f828403e000 nid=0x554c runnable 
> "VM Periodic Task Thread" prio=10 tid=0x00007f82842b8800 nid=0x555b waiting on condition 
> JNI global references: 358
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

