[ https://issues.apache.org/jira/browse/CASSANDRA-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Shuler updated CASSANDRA-13522: --------------------------------------- Environment: Cassandra 2.1.13, Ubuntu 14.04.5 LTS, Docker version 1.9.1, run as a container, 4 core server with 16GB memory. (was: Ubuntu 14.04.5 LTS, Docker version 1.9.1, run as a container, 4 core server with 16GB memory.) > AbstractTracingAwareExecutorService - Uncaught exception on thread - leads to > JVM exit > -------------------------------------------------------------------------------------- > > Key: CASSANDRA-13522 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13522 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 2.1.13, Ubuntu 14.04.5 LTS, Docker version > 1.9.1, run as a container, 4 core server with 16GB memory. > Reporter: Matthew O'Riordan > Priority: Major > Labels: bug, crash > Fix For: 2.1.x > > > Initially saw the following exception numerous times: > {code} > WARN [SharedPool-Worker-8] 2017-05-09 23:04:00,018 > AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-8,5,main]: {} > java.lang.NullPointerException: null > at java.lang.Double.compareTo(Double.java:49) ~[na:1.8.0_101] > at > java.util.concurrent.ConcurrentSkipListMap.cpr(ConcurrentSkipListMap.java:655) > ~[na:1.8.0_101] > at > java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:835) > ~[na:1.8.0_101] > at > java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1962) > ~[na:1.8.0_101] > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:104) > ~[metrics-core-2.2.0.jar:na] > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) > ~[metrics-core-2.2.0.jar:na] > at com.yammer.metrics.core.Histogram.update(Histogram.java:110) > ~[metrics-core-2.2.0.jar:na] > at com.yammer.metrics.core.Timer.update(Timer.java:198) > ~[metrics-core-2.2.0.jar:na] > at 
com.yammer.metrics.core.Timer.update(Timer.java:76) > ~[metrics-core-2.2.0.jar:na] > at > org.apache.cassandra.metrics.LatencyMetrics.addNano(LatencyMetrics.java:108) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at > org.apache.cassandra.metrics.LatencyMetrics.addNano(LatencyMetrics.java:114) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at > org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1863) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at > org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:53) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at > org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_101] > at > org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101] > {code} > This then led to a high rate of these warnings: > {code} > WARN [SharedPool-Worker-91] 2017-05-09 23:04:14,682 > AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-91,5,main]: {} > java.lang.ClassCastException: null > WARN [SharedPool-Worker-92] 2017-05-09 23:04:14,704 > AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-92,5,main]: {} > java.lang.RuntimeException: java.lang.ClassCastException > at > 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2244) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_101] > at > org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101] > {code} > The same errors continued until the last error reported: > {code} > WARN [SharedPool-Worker-161] 2017-05-09 23:06:18,617 > AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-161,5,main]: {} > java.lang.RuntimeException: java.lang.ClassCastException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2244) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_101] > at > org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) > ~[cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [cassandra-all-2.1.13.1218.jar:2.1.13.1218] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101] > Caused by: java.lang.ClassCastException: null > {code} > At which point the JVM crashed completely and exited. 
Looking at error.log, this > is an extract: > {code} > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x00007fe80626a955, pid=79, tid=0x00007fe80433d700 > # > # JRE version: Java(TM) SE Runtime Environment (8.0_101-b13) (build > 1.8.0_101-b13) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.101-b13 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # V [libjvm.so+0x5c3955] > G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, > markOopDesc*)+0x45 > # > # Failed to write core dump. Core dumps have been disabled. To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # If you would like to submit a bug report, please visit: > # http://bugreport.java.com/bugreport/crash.jsp > # > --------------- T H R E A D --------------- > Current thread (0x00007fe800035800): GCTaskThread [stack: > 0x00007fe80423d000,0x00007fe80433e000] [id=256] > siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: > 0x000000030b252310 > Registers: > RAX=0x00007fe806c34b90, RBX=0x00007fe806c34b80, RCX=0x0000000000000003, > RDX=0x0000000000000001 > RSP=0x00007fe80433c2d0, RBP=0x00007fe80433c350, RSI=0x0000000000000001, > RDI=0x000000030b252308 > R8 =0x00007fe80002e8f0, R9 =0x00000000f7b096ba, R10=0x00000006f1f78f50, > R11=0x00007fe80433c5a0 > R12=0x00000005cd63de64, R13=0x00000007bd84b5d0, R14=0x00007fe80433c5a0, > R15=0x000000000000378f > RIP=0x00007fe80626a955, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, > ERR=0x0000000000000004 > TRAPNO=0x000000000000000e > Top of Stack: (sp=0x00007fe80433c2d0) > 0x00007fe80433c2d0: 00000006f1f78f50 746e756f63a5826c > 0x00007fe80433c2e0: 01007fe800030c80 0000000100000000 > 0x00007fe80433c2f0: 00007fe80433c320 0000000c06639e22 > 0x00007fe80433c300: 00007fe800030c80 00007fe800030c80 > 0x00007fe80433c310: 00007fe80433c5b0 01007fe7e4a7c1f0 > 0x00007fe80433c320: 00007fe80433c350 00007fe806c34b80 > 0x00007fe80433c330: 
00000005cd63de64 00007fe8000ba970 > 0x00007fe80433c340: 00007fe80433c5a0 000000000000378f > 0x00007fe80433c350: 00007fe80433c430 00007fe80626b50b > 0x00007fe80433c360: 00007fe80433c3f0 00007fe80433caa0 > 0x00007fe80433c370: 00007fe80433c390 00007fe80433c3d0 > 0x00007fe80433c380: 00007fe80433c3c0 00007fe80433c3b0 > 0x00007fe80433c390: 00007fe80433c3e0 00007fe80433c5b0 > 0x00007fe80433c3a0: 00007fe80433c710 00007fe80433c3f0 > 0x00007fe80433c3b0: 00007fe806c0c120 00007fe800030d50 > 0x00007fe80433c3c0: 0000000727504742 0000000000000000 > 0x00007fe80433c3d0: 0000000000000000 0000000000000800 > 0x00007fe80433c3e0: 00007fe7c819b400 00007fe80433c4b0 > 0x00007fe80433c3f0: 00000005cd63de65 00007fe80433c4b0 > 0x00007fe80433c400: 00007fe80433ca00 00007fe80433caa0 > 0x00007fe80433c410: 0000000000000000 00007fe80433c8d0 > 0x00007fe80433c420: 00007fe80433c5a0 00007fe80433ca00 > 0x00007fe80433c430: 00007fe80433c500 00007fe806245d17 > 0x00007fe80433c440: 00007fe80433c460 00007fe80625fb18 > 0x00007fe80433c450: 00007fe80433ca00 0000000000000000 > 0x00007fe80433c460: 00007fe80433c500 00007fe806271049 > 0x00007fe80433c470: 00007fe806c001d0 00007fe80433cb20 > 0x00007fe80433c480: 00007fe806c001f0 00000000043c9800 > 0x00007fe80433c490: 00007fe80002e8f0 00007fe80433ca00 > 0x00007fe80433c4a0: 00007fe7f203ea50 00007fe80433caa0 > 0x00007fe80433c4b0: 0000000000000000 00007fe80433c8d0 > 0x00007fe80433c4c0: 00007fe7d0bdf580 00007fe80433ca00 > Instructions: (pc=0x00007fe80626a955) > 0x00007fe80626a935: 88 0f b6 10 84 d2 0f 84 3f 01 00 00 48 8b 05 40 > 0x00007fe80626a945: b1 9a 00 41 8b 7d 08 8b 48 08 48 d3 e7 48 03 38 > 0x00007fe80626a955: 8b 77 08 83 fe 00 0f 8e 2f 01 00 00 40 f6 c6 01 > 0x00007fe80626a965: 0f 85 35 01 00 00 89 f0 c1 f8 03 4c 63 f8 49 8b > Register to memory mapping: > RAX=0x00007fe806c34b90: <offset 0xf8db90> in > /usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so at > 0x00007fe805ca7000 > RBX=0x00007fe806c34b80: <offset 0xf8db80> in > 
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so at > 0x00007fe805ca7000 > RCX=0x0000000000000003 is an unknown value > RDX=0x0000000000000001 is an unknown value > RSP=0x00007fe80433c2d0 is an unknown value > RBP=0x00007fe80433c350 is an unknown value > RSI=0x0000000000000001 is an unknown value > RDI=0x000000030b252308 is an unknown value > R8 =0x00007fe80002e8f0 is an unknown value > R9 =0x00000000f7b096ba is an unknown value > R10=0x00000006f1f78f50 is an oop > java.util.concurrent.ConcurrentSkipListMap$Node > - klass: 'java/util/concurrent/ConcurrentSkipListMap$Node' > R11=0x00007fe80433c5a0 is an unknown value > R12=0x00000005cd63de64 is pointing into object: 0x00000005cd63de50 > java.util.concurrent.ConcurrentSkipListMap$Node > - klass: 'java/util/concurrent/ConcurrentSkipListMap$Node' > R13=0x00000007bd84b5d0 is pointing into object: 0x00000007bd83c1e8 > [B > - klass: {type array byte} > - length: 65536 > R14=0x00007fe80433c5a0 is an unknown value > R15=0x000000000000378f is an unknown value > Stack: [0x00007fe80423d000,0x00007fe80433e000], sp=0x00007fe80433c2d0, free > space=1020k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native > code) > V [libjvm.so+0x5c3955] > G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, > markOopDesc*)+0x45 > V [libjvm.so+0x5c450b] G1ParScanThreadState::trim_queue()+0x4ab > V [libjvm.so+0x59ed17] G1ParEvacuateFollowersClosure::do_void()+0x27 > V [libjvm.so+0x5aa923] G1ParTask::work(unsigned int)+0x463 > V [libjvm.so+0xae5a6f] GangWorker::loop()+0xcf > V [libjvm.so+0x924698] java_start(Thread*)+0x108 > --------------- P R O C E S S --------------- > Java Threads: ( => current thread ) > 0x00007fe543263000 JavaThread "MemtablePostFlush:36560" daemon > [_thread_blocked, id=9747, stack(0x00007fe7070c5000,0x00007fe707106000)] > 0x00007fe540c31800 JavaThread "ReadRepairStage:33062" daemon > [_thread_blocked, id=9746, stack(0x00007fe7c8286000,0x00007fe7c82c7000)] > 0x00007fe7c4ced800 
JavaThread "CompactionExecutor:16930" daemon > [_thread_blocked, id=9743, stack(0x00007fe7c89e8000,0x00007fe7c8a29000)] > 0x00007fe802ffa000 JavaThread "CompactionExecutor:16929" daemon > [_thread_blocked, id=9742, stack(0x00007fe7c3005000,0x00007fe7c3046000)] > 0x00007fe4f4f09800 JavaThread "ReadRepairStage:33061" daemon > [_thread_blocked, id=9739, stack(0x00007fe52eab6000,0x00007fe52eaf7000)] > 0x00007fe64a846000 JavaThread "ReadRepairStage:33059" daemon > [_thread_blocked, id=9737, stack(0x00007fe7c2f83000,0x00007fe7c2fc4000)] > 0x00007fe64a823000 JavaThread "ReadRepairStage:33057" daemon > [_thread_blocked, id=9735, stack(0x00007fe7c8245000,0x00007fe7c8286000)] > 0x00007fe540cac000 JavaThread "StreamConnectionEstablisher:4" daemon > [_thread_blocked, id=31748, stack(0x00007fe7c325d000,0x00007fe7c329e000)] > 0x00007fe521bb4000 JavaThread "StreamingTransferTaskTimeouts:1" daemon > [_thread_blocked, id=31693, stack(0x00007fe7c81ad000,0x00007fe7c81ee000)] > 0x00007fe50e479000 JavaThread "StreamConnectionEstablisher:3" daemon > [_thread_blocked, id=31688, stack(0x00007fe7c9926000,0x00007fe7c9967000)] > 0x00007fe4f729c000 JavaThread "StreamConnectionEstablisher:2" daemon > [_thread_blocked, id=31686, stack(0x00007fe7c9dee000,0x00007fe7c9e2f000)] > 0x00007fe4f5087800 JavaThread "StreamConnectionEstablisher:1" daemon > [_thread_blocked, id=31681, stack(0x00007fe707f50000,0x00007fe707f91000)] > 0x00007fe7151a1000 JavaThread "MessagingService-Incoming-/52.221.228.170" > [_thread_in_native, id=16239, stack(0x00007fe707f91000,0x00007fe707fd2000)] > 0x00007fe716a41800 JavaThread "MessagingService-Incoming-/54.169.103.14" > [_thread_in_native, id=16235, stack(0x00007fe7c1f8f000,0x00007fe7c1fd0000)] > 0x00007fe50e38b800 JavaThread "MessagingService-Incoming-/54.179.183.26" > [_thread_in_native, id=16231, stack(0x00007fe7c8055000,0x00007fe7c8096000)] > 0x00007fe7ecaf5000 JavaThread "MessagingService-Incoming-/52.221.228.170" > [_thread_blocked, id=16230, 
stack(0x00007fe7bfaf0000,0x00007fe7bfb31000)] > 0x00007fe50fc71000 JavaThread "MessagingService-Incoming-/54.169.103.14" > [_thread_blocked, id=16229, stack(0x00007fe7c2b39000,0x00007fe7c2b7a000)] > 0x00007fe71748a800 JavaThread "MessagingService-Incoming-/52.221.217.27" > [_thread_in_native, id=16226, stack(0x00007fe7c846d000,0x00007fe7c84ae000)] > 0x00007fe50f5ff000 JavaThread "MessagingService-Incoming-/52.221.217.27" > [_thread_blocked, id=16223, stack(0x00007fe53fbf0000,0x00007fe53fc31000)] > 0x00007fe71732f000 JavaThread "MessagingService-Incoming-/54.179.183.26" > [_thread_blocked, id=16222, stack(0x00007fe707188000,0x00007fe7071c9000)] > 0x00007fe521ea9000 JavaThread "SharedPool-Worker-1641" daemon > [_thread_blocked, id=7064, stack(0x00007fe7c84ef000,0x00007fe7c8530000)] > 0x00007fe50dee4800 JavaThread "SharedPool-Worker-1638" daemon > [_thread_blocked, id=7063, stack(0x00007fe7c8e04000,0x00007fe7c8e45000)] > 0x00007fe521fda800 JavaThread "SharedPool-Worker-1637" daemon > [_thread_blocked, id=7062, stack(0x00007fe7c90d3000,0x00007fe7c9114000)] > 0x00007fe52090d800 JavaThread "SharedPool-Worker-1639" daemon > [_thread_blocked, id=7061, stack(0x00007fe7c9167000,0x00007fe7c91a8000)] > 0x00007fe542cc3800 JavaThread "SharedPool-Worker-1640" daemon > [_thread_blocked, id=7060, stack(0x00007fe7c9386000,0x00007fe7c93c7000)] > 0x00007fe543ea2000 JavaThread "SharedPool-Worker-1621" daemon > [_thread_blocked, id=7059, stack(0x00007fe7c93c7000,0x00007fe7c9408000)] > 0x00007fe52337e000 JavaThread "SharedPool-Worker-1623" daemon > [_thread_blocked, id=7058, stack(0x00007fe7c9408000,0x00007fe7c9449000)] > 0x00007fe50de15000 JavaThread "SharedPool-Worker-1625" daemon > [_thread_blocked, id=7057, stack(0x00007fe7c9449000,0x00007fe7c948a000)] > 0x00007fe52290d000 JavaThread "SharedPool-Worker-1627" daemon > [_thread_blocked, id=7056, stack(0x00007fe7c94bb000,0x00007fe7c94fc000)] > 0x00007fe5216e5800 JavaThread "SharedPool-Worker-1629" daemon > [_thread_blocked, id=7055, 
stack(0x00007fe7c9724000,0x00007fe7c9765000)] > 0x00007fe5208df000 JavaThread "SharedPool-Worker-1631" daemon > [_thread_blocked, id=7054, stack(0x00007fe7c9765000,0x00007fe7c97a6000)] > 0x00007fe714b44000 JavaThread "SharedPool-Worker-1633" daemon > [_thread_blocked, id=7053, stack(0x00007fe7c97a6000,0x00007fe7c97e7000)] > 0x00007fe521230000 JavaThread "SharedPool-Worker-1635" daemon > [_thread_blocked, id=7052, stack(0x00007fe7c99df000,0x00007fe7c9a20000)] > 0x00007fe542898000 JavaThread "SharedPool-Worker-1619" daemon > [_thread_blocked, id=7051, stack(0x00007fe7c9a20000,0x00007fe7c9a61000)] > 0x00007fe714fa3000 JavaThread "SharedPool-Worker-1636" daemon > [_thread_blocked, id=7050, stack(0x00007fe7c9ac8000,0x00007fe7c9b09000)] > 0x00007fe5213f7000 JavaThread "SharedPool-Worker-1634" daemon > [_thread_blocked, id=7049, stack(0x00007fe7c9b09000,0x00007fe7c9b4a000)] > 0x00007fe7edc55000 JavaThread "SharedPool-Worker-1632" daemon > [_thread_blocked, id=7048, stack(0x00007fe7c9b4a000,0x00007fe7c9b8b000)] > 0x00007fe50ccb3000 JavaThread "SharedPool-Worker-1630" daemon > [_thread_blocked, id=7047, stack(0x00007fe7c9b8b000,0x00007fe7c9bcc000)] > 0x00007fe50c641800 JavaThread "SharedPool-Worker-1628" daemon > [_thread_blocked, id=7046, stack(0x00007fe7c9cac000,0x00007fe7c9ced000)] > ... > {code} > Complete JVM crash report at > https://dl.dropboxusercontent.com/u/1575409/Ably/logs/2017-05-10-cassandra-crash/us-west-1/error.log > I also have the entire log from Cassandra at the time if useful, although > looking at it there was nothing logged for a few minutes before this happened > so no clear indication what triggered it. 
> There were no CPU, load, or memory issues at the time (the crash occurred at > 2017-05-09 23:06): > https://dl.dropboxusercontent.com/u/1575409/Ably/logs/2017-05-10-cassandra-crash/us-west-1/Voila_Capture%202017-05-11_08-18-57_am.png
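A note on the first stack trace in the report: the NullPointerException surfaces inside Double.compareTo, reached through ConcurrentSkipListMap.cpr and doPut, rather than at the point where the key is handed to the map. The documented contract of ConcurrentSkipListMap is that null keys are rejected, so on a healthy JVM a null key offered to putIfAbsent never becomes a stored node. The minimal sketch below illustrates that contract only; it is not taken from Cassandra or metrics-core, and the class name SkipListNullKeyDemo and the helper rejectsNullKey are hypothetical names chosen for this example. One possible reading, not confirmed by anything in the report, is that a key already held by the map read back as null, which would be in line with the later SIGSEGV in the G1 collector while it was handling ConcurrentSkipListMap$Node objects.

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListNullKeyDemo {

    // Hypothetical helper: returns true if the map refuses a null key
    // by throwing NullPointerException, as its Javadoc specifies.
    static boolean rejectsNullKey(ConcurrentSkipListMap<Double, Long> map) {
        try {
            map.putIfAbsent(null, 1L);
            return false; // the null key was accepted, which the contract forbids
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Same key and value types as in the reported trace (Double keys).
        ConcurrentSkipListMap<Double, Long> values = new ConcurrentSkipListMap<>();
        values.putIfAbsent(0.5, 10L);
        values.putIfAbsent(0.25, 20L);

        System.out.println("null key rejected: " + rejectsNullKey(values));
        // The failed insert leaves the existing entries untouched.
        System.out.println("entries: " + values.size());
        System.out.println("first key: " + values.firstKey());
    }
}
```

Running the class on a standard JDK should print that the null key was rejected, with the two existing entries left intact and 0.25 as the first key.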