Jon Haddad created CASSANDRA-19668:
--------------------------------------

             Summary: SIGSEGV originating in Paxos Scheduled Task
                 Key: CASSANDRA-19668
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19668
             Project: Cassandra
          Issue Type: Bug
            Reporter: Jon Haddad


I haven't gotten to the root cause of this yet. Several 4.1 nodes have crashed 
in production. I'm not sure whether this is related to Paxos v2, but it 
is enabled.

Most of the crashes don't have a stack trace; they only reference this:

{noformat}
Stack: [0x00007fabf4c34000,0x00007fabf4d34000],  sp=0x00007fabf4d31f00,  free 
space=1015k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, 
Vv=VM code, C=native code)
v  ~StubRoutines::jint_disjoint_arraycopy

{noformat}

They are all in the {{ScheduledTasks}} thread.

From the crash log:

{noformat}
---------------  T H R E A D  ---------------

Current thread (0x000078b375eac800):  JavaThread "ScheduledTasks:1" daemon 
[_thread_in_Java, id=151791, stack(0x000078b34b780000,0x000078b34b880000)]

Stack: [0x000078b34b780000,0x000078b34b880000],  sp=0x000078b34b87c350,  free 
space=1008k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, 
Vv=VM code, C=native code)
J 29467 c2 
org.apache.cassandra.db.rows.AbstractCell.clone(Lorg/apache/cassandra/utils/memory/ByteBufferCloner;)Lorg/apache/cassandra/db/rows/Cell;
 (50 bytes) @ 0x000078b3dd40a42f [0x000078b3dd409de0+0x000000000000064f]
J 17669 c2 
org.apache.cassandra.db.rows.Cell.clone(Lorg/apache/cassandra/utils/memory/Cloner;)Lorg/apache/cassandra/db/rows/ColumnData;
 (6 bytes) @ 0x000078b3dc54edc0 [0x000078b3dc54ed40+0x0000000000000080]
J 17816 c2 
org.apache.cassandra.db.rows.BTreeRow$$Lambda$845.apply(Ljava/lang/Object;)Ljava/lang/Object;
 (12 bytes) @ 0x000078b3dbed01a4 [0x000078b3dbed0120+0x0000000000000084]
J 17828 c2 
org.apache.cassandra.utils.btree.BTree.transform([Ljava/lang/Object;Ljava/util/function/Function;)[Ljava/lang/Object;
 (194 bytes) @ 0x000078b3dc5f35f0 [0x000078b3dc5f34a0+0x0000000000000150]
J 35096 c2 
org.apache.cassandra.db.rows.BTreeRow.clone(Lorg/apache/cassandra/utils/memory/Cloner;)Lorg/apache/cassandra/db/rows/Row;
 (37 bytes) @ 0x000078b3dda9111c [0x000078b3dda90fe0+0x000000000000013c]
J 30500 c2 
org.apache.cassandra.utils.memory.EnsureOnHeap$CloneToHeap.applyToRow(Lorg/apache/cassandra/db/rows/Row;)Lorg/apache/cassandra/db/rows/Row;
 (16 bytes) @ 0x000078b3dd59b91c [0x000078b3dd59b8c0+0x000000000000005c]
J 26498 c2 org.apache.cassandra.db.transform.BaseRows.hasNext()Z (215 bytes) @ 
0x000078b3dcf1c454 [0x000078b3dcf1c180+0x00000000000002d4]
J 30775 c2 
org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext()Ljava/lang/Object;
 (49 bytes) @ 0x000078b3dc789020 [0x000078b3dc788fc0+0x0000000000000060]
J 9082 c2 org.apache.cassandra.utils.AbstractIterator.hasNext()Z (80 bytes) @ 
0x000078b3dbb3c544 [0x000078b3dbb3c440+0x0000000000000104]
J 35593 c2 
org.apache.cassandra.service.paxos.uncommitted.PaxosRows$PaxosMemtableToKeyStateIterator.computeNext()Lorg/apache/cassandra/service/paxos/uncommitted/PaxosKeyState;
 (126 bytes) @ 0x000078b3dc7ceeec [0x000078b3dc7cee20+0x00000000000000cc]
J 35591 c2 
org.apache.cassandra.service.paxos.uncommitted.PaxosRows$PaxosMemtableToKeyStateIterator.computeNext()Ljava/lang/Object;
 (5 bytes) @ 0x000078b3dc7d09e4 [0x000078b3dc7d09a0+0x0000000000000044]
J 9082 c2 org.apache.cassandra.utils.AbstractIterator.hasNext()Z (80 bytes) @ 
0x000078b3dbb3c544 [0x000078b3dbb3c440+0x0000000000000104]
J 34146 c2 
com.google.common.collect.Iterators.addAll(Ljava/util/Collection;Ljava/util/Iterator;)Z
 (41 bytes) @ 0x000078b3dd9197e8 [0x000078b3dd919680+0x0000000000000168]
J 38256 c1 
org.apache.cassandra.service.paxos.uncommitted.PaxosRows.toIterator(Lorg/apache/cassandra/db/partitions/UnfilteredPartitionIterator;Lorg/apache/cassandra/schema/TableId;Z)Lorg/apache/cassandra/utils/CloseableIterator;
 (49 bytes) @ 0x000078b3d6b677ac [0x000078b3d6b672e0+0x00000000000004cc]
J 34823 c1 
org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedIndex.repairIterator(Lorg/apache/cassandra/schema/TableId;Ljava/util/Collection;)Lorg/apache/cassandra/utils/CloseableIterator;
 (212 bytes) @ 0x000078b3d5675e0c [0x000078b3d5673be0+0x000000000000222c]
J 38259 c1 
org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.uncommittedKeyIterator(Lorg/apache/cassandra/schema/TableId;Ljava/util/Collection;)Lorg/apache/cassandra/utils/CloseableIterator;
 (116 bytes) @ 0x000078b3d6b6bc54 [0x000078b3d6b6b7e0+0x0000000000000474]
J 38257 c1 
org.apache.cassandra.service.StorageService.autoRepairPaxos(Lorg/apache/cassandra/schema/TableId;)Lorg/apache/cassandra/utils/concurrent/Future;
 (57 bytes) @ 0x000078b3d6b6902c [0x000078b3d6b68e00+0x000000000000022c]
j  
org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.schedulePaxosAutoRepairs()V+146
j  
org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker$$Lambda$1773.run()V+4
J 39703 c1 
org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.runAndLogException(Ljava/lang/String;Ljava/lang/Runnable;)V
 (39 bytes) @ 0x000078b3d435adfc [0x000078b3d435ad00+0x00000000000000fc]
j  
org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.maintenance()V+19
j  
org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker$$Lambda$1534.run()V+4
J 30376 c2 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V 
[email protected] (57 bytes) @ 0x000078b3dd56543c 
[0x000078b3dd565100+0x000000000000033c]
J 27255% c2 
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
 [email protected] (187 bytes) @ 0x000078b3dd114d58 
[0x000078b3dd114ac0+0x0000000000000298]
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 [email protected]
j  io.netty.util.concurrent.FastThreadLocalRunnable.run()V+4
j  java.lang.Thread.run()V+11 [email protected]
v  ~StubRoutines::call_stub
V  [libjvm.so+0x877453]  JavaCalls::call_helper(JavaValue*, methodHandle 
const&, JavaCallArguments*, Thread*)+0x373
V  [libjvm.so+0x875a96]  JavaCalls::call_virtual(JavaValue*, Handle, Klass*, 
Symbol*, Symbol*, Thread*)+0x186
V  [libjvm.so+0x925653]  thread_entry(JavaThread*, Thread*)+0xa3
V  [libjvm.so+0xe41391]  JavaThread::thread_main_inner()+0x131
V  [libjvm.so+0xe3d790]  Thread::call_run()+0x140
V  [libjvm.so+0xbf97de]  thread_native_entry(Thread*)+0xee
{noformat}



