[
https://issues.apache.org/jira/browse/CASSANDRA-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brandon Williams updated CASSANDRA-19668:
-----------------------------------------
Fix Version/s: 4.1.x
5.0-rc
5.x
> SIGSEGV originating in Paxos V2 Scheduled Task
> ----------------------------------------------
>
> Key: CASSANDRA-19668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19668
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Lightweight Transactions
> Reporter: Jon Haddad
> Assignee: Jon Haddad
> Priority: Urgent
> Fix For: 4.1.x, 5.0-rc, 5.x
>
>
> I haven't gotten to the root cause of this yet. Several 4.1 nodes have
> crashed in production. I'm not sure whether this is related to Paxos v2,
> but it is enabled. offheap_objects is also enabled.
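> For reference, the two settings mentioned above correspond to
> {{cassandra.yaml}} entries along these lines. This is only a sketch: the
> option names are the standard 4.1 ones, but the exact configuration of the
> affected nodes is assumed here rather than copied from them.
> {noformat}
> # Assumed values, not taken from the crashed nodes' actual config
> paxos_variant: v2
> memtable_allocation_type: offheap_objects
> {noformat}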
> I'm not sure yet whether this affects 5.0.
> Most of the crashes don't have a stack trace; they only reference this:
> {noformat}
> Stack: [0x00007fabf4c34000,0x00007fabf4d34000], sp=0x00007fabf4d31f00, free
> space=1015k
> Native frames: (J=compiled Java code, A=aot compiled Java code,
> j=interpreted, Vv=VM code, C=native code)
> v ~StubRoutines::jint_disjoint_arraycopy
> {noformat}
> They are all in the {{ScheduledTasks}} thread.
> However, one node does have this in the crash log:
> {noformat}
> --------------- T H R E A D ---------------
> Current thread (0x000078b375eac800): JavaThread "ScheduledTasks:1" daemon
> [_thread_in_Java, id=151791, stack(0x000078b34b780000,0x000078b34b880000)]
> Stack: [0x000078b34b780000,0x000078b34b880000], sp=0x000078b34b87c350, free
> space=1008k
> Native frames: (J=compiled Java code, A=aot compiled Java code,
> j=interpreted, Vv=VM code, C=native code)
> J 29467 c2
> org.apache.cassandra.db.rows.AbstractCell.clone(Lorg/apache/cassandra/utils/memory/ByteBufferCloner;)Lorg/apache/cassandra/db/rows/Cell;
> (50 bytes) @ 0x000078b3dd40a42f [0x000078b3dd409de0+0x000000000000064f]
> J 17669 c2
> org.apache.cassandra.db.rows.Cell.clone(Lorg/apache/cassandra/utils/memory/Cloner;)Lorg/apache/cassandra/db/rows/ColumnData;
> (6 bytes) @ 0x000078b3dc54edc0 [0x000078b3dc54ed40+0x0000000000000080]
> J 17816 c2
> org.apache.cassandra.db.rows.BTreeRow$$Lambda$845.apply(Ljava/lang/Object;)Ljava/lang/Object;
> (12 bytes) @ 0x000078b3dbed01a4 [0x000078b3dbed0120+0x0000000000000084]
> J 17828 c2
> org.apache.cassandra.utils.btree.BTree.transform([Ljava/lang/Object;Ljava/util/function/Function;)[Ljava/lang/Object;
> (194 bytes) @ 0x000078b3dc5f35f0 [0x000078b3dc5f34a0+0x0000000000000150]
> J 35096 c2
> org.apache.cassandra.db.rows.BTreeRow.clone(Lorg/apache/cassandra/utils/memory/Cloner;)Lorg/apache/cassandra/db/rows/Row;
> (37 bytes) @ 0x000078b3dda9111c [0x000078b3dda90fe0+0x000000000000013c]
> J 30500 c2
> org.apache.cassandra.utils.memory.EnsureOnHeap$CloneToHeap.applyToRow(Lorg/apache/cassandra/db/rows/Row;)Lorg/apache/cassandra/db/rows/Row;
> (16 bytes) @ 0x000078b3dd59b91c [0x000078b3dd59b8c0+0x000000000000005c]
> J 26498 c2 org.apache.cassandra.db.transform.BaseRows.hasNext()Z (215 bytes)
> @ 0x000078b3dcf1c454 [0x000078b3dcf1c180+0x00000000000002d4]
> J 30775 c2
> org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext()Ljava/lang/Object;
> (49 bytes) @ 0x000078b3dc789020 [0x000078b3dc788fc0+0x0000000000000060]
> J 9082 c2 org.apache.cassandra.utils.AbstractIterator.hasNext()Z (80 bytes) @
> 0x000078b3dbb3c544 [0x000078b3dbb3c440+0x0000000000000104]
> J 35593 c2
> org.apache.cassandra.service.paxos.uncommitted.PaxosRows$PaxosMemtableToKeyStateIterator.computeNext()Lorg/apache/cassandra/service/paxos/uncommitted/PaxosKeyState;
> (126 bytes) @ 0x000078b3dc7ceeec [0x000078b3dc7cee20+0x00000000000000cc]
> J 35591 c2
> org.apache.cassandra.service.paxos.uncommitted.PaxosRows$PaxosMemtableToKeyStateIterator.computeNext()Ljava/lang/Object;
> (5 bytes) @ 0x000078b3dc7d09e4 [0x000078b3dc7d09a0+0x0000000000000044]
> J 9082 c2 org.apache.cassandra.utils.AbstractIterator.hasNext()Z (80 bytes) @
> 0x000078b3dbb3c544 [0x000078b3dbb3c440+0x0000000000000104]
> J 34146 c2
> com.google.common.collect.Iterators.addAll(Ljava/util/Collection;Ljava/util/Iterator;)Z
> (41 bytes) @ 0x000078b3dd9197e8 [0x000078b3dd919680+0x0000000000000168]
> J 38256 c1
> org.apache.cassandra.service.paxos.uncommitted.PaxosRows.toIterator(Lorg/apache/cassandra/db/partitions/UnfilteredPartitionIterator;Lorg/apache/cassandra/schema/TableId;Z)Lorg/apache/cassandra/utils/CloseableIterator;
> (49 bytes) @ 0x000078b3d6b677ac [0x000078b3d6b672e0+0x00000000000004cc]
> J 34823 c1
> org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedIndex.repairIterator(Lorg/apache/cassandra/schema/TableId;Ljava/util/Collection;)Lorg/apache/cassandra/utils/CloseableIterator;
> (212 bytes) @ 0x000078b3d5675e0c [0x000078b3d5673be0+0x000000000000222c]
> J 38259 c1
> org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.uncommittedKeyIterator(Lorg/apache/cassandra/schema/TableId;Ljava/util/Collection;)Lorg/apache/cassandra/utils/CloseableIterator;
> (116 bytes) @ 0x000078b3d6b6bc54 [0x000078b3d6b6b7e0+0x0000000000000474]
> J 38257 c1
> org.apache.cassandra.service.StorageService.autoRepairPaxos(Lorg/apache/cassandra/schema/TableId;)Lorg/apache/cassandra/utils/concurrent/Future;
> (57 bytes) @ 0x000078b3d6b6902c [0x000078b3d6b68e00+0x000000000000022c]
> j
> org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.schedulePaxosAutoRepairs()V+146
> j
> org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker$$Lambda$1773.run()V+4
> J 39703 c1
> org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.runAndLogException(Ljava/lang/String;Ljava/lang/Runnable;)V
> (39 bytes) @ 0x000078b3d435adfc [0x000078b3d435ad00+0x00000000000000fc]
> j
> org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.maintenance()V+19
> j
> org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker$$Lambda$1534.run()V+4
> J 30376 c2
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V
> [email protected] (57 bytes) @ 0x000078b3dd56543c
> [0x000078b3dd565100+0x000000000000033c]
> J 27255% c2
> java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
> [email protected] (187 bytes) @ 0x000078b3dd114d58
> [0x000078b3dd114ac0+0x0000000000000298]
> j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 [email protected]
> j io.netty.util.concurrent.FastThreadLocalRunnable.run()V+4
> j java.lang.Thread.run()V+11 [email protected]
> v ~StubRoutines::call_stub
> V [libjvm.so+0x877453] JavaCalls::call_helper(JavaValue*, methodHandle
> const&, JavaCallArguments*, Thread*)+0x373
> V [libjvm.so+0x875a96] JavaCalls::call_virtual(JavaValue*, Handle, Klass*,
> Symbol*, Symbol*, Thread*)+0x186
> V [libjvm.so+0x925653] thread_entry(JavaThread*, Thread*)+0xa3
> V [libjvm.so+0xe41391] JavaThread::thread_main_inner()+0x131
> V [libjvm.so+0xe3d790] Thread::call_run()+0x140
> V [libjvm.so+0xbf97de] thread_native_entry(Thread*)+0xee
> {noformat}