Michael Stack created HBASE-26062:
-------------------------------------

             Summary: SIGSEGV in AsyncFSWAL consume
                 Key: HBASE-26062
                 URL: https://issues.apache.org/jira/browse/HBASE-26062
             Project: HBase
          Issue Type: Sub-task
            Reporter: Michael Stack


Seems related to the parent issue. It's happened a few times on one of our 
clusters here. Below are two examples. Need more detail, but perhaps the call 
has timed out and the buffer has thus been freed, while the late consume on 
the other side of the ringbuffer doesn't know that and goes ahead (just 
speculation).
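
To make the suspected sequence concrete, here is a minimal sketch of it. It is 
not HBase code: LateConsumeSketch, RefCountedBuffer and simulate are 
hypothetical names, and a plain queue stands in for the ringbuffer. It assumes 
the buffer is reference counted and compares a handoff where the consumer 
holds no reference of its own with one where a reference is taken at append 
time.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of the suspected race; none of these names are HBase classes. */
public class LateConsumeSketch {

  /** Stand-in for a pooled off-heap buffer backing a Cell. */
  static final class RefCountedBuffer {
    private final ByteBuffer data;
    private final AtomicInteger refCnt = new AtomicInteger(1);

    RefCountedBuffer(byte[] bytes) {
      this.data = ByteBuffer.allocateDirect(bytes.length);
      this.data.put(bytes);
      this.data.flip();
    }

    /** Takes an extra reference; returns false if the buffer was already released. */
    boolean retain() {
      for (;;) {
        int c = refCnt.get();
        if (c == 0) {
          return false;
        }
        if (refCnt.compareAndSet(c, c + 1)) {
          return true;
        }
      }
    }

    /** Drops one reference; returns true if this was the last one. */
    boolean release() {
      int c = refCnt.decrementAndGet();
      if (c < 0) {
        throw new IllegalStateException("released too many times");
      }
      return c == 0;
    }

    /**
     * Guarded read. The check alone is still racy across threads; it only
     * turns this single-threaded demo's late read into an exception.
     */
    byte get(int index) {
      if (refCnt.get() == 0) {
        throw new IllegalStateException("buffer already released");
      }
      return data.get(index);
    }
  }

  /** Returns true if the consumer could still read the entry after the simulated timeout. */
  static boolean simulate(boolean retainOnAppend) {
    Queue<RefCountedBuffer> ring = new ArrayDeque<>();
    RefCountedBuffer buf = new RefCountedBuffer(new byte[] {42});

    // Producer side (the RPC handler) hands the entry to the WAL consumer.
    if (retainOnAppend) {
      buf.retain();
    }
    ring.add(buf);

    // The call times out and the handler drops its reference.
    buf.release();

    // Late consume on the other side of the queue.
    RefCountedBuffer entry = ring.remove();
    try {
      return entry.get(0) == 42;
    } catch (IllegalStateException e) {
      return false; // detected here; with raw memory this would be the bad read
    } finally {
      if (retainOnAppend) {
        entry.release();
      }
    }
  }

  public static void main(String[] args) {
    System.out.println("no retain:   readable=" + simulate(false));
    System.out.println("with retain: readable=" + simulate(true));
  }
}
```

If the speculation is right, the case without the extra reference is the one 
that would correspond to the crashes below, where the consumer touches memory 
that has already been handed back.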

 
{code:java}
#  SIGSEGV (0xb) at pc=0x00007f8b3ef5b77c, pid=37631, tid=0x00007f61560ed700

RAX=0x00000000ffffdf6e is an unknown value
RBX=0x00007f8a38d7b6f8 is an oop
java.nio.DirectByteBuffer
 - klass: 'java/nio/DirectByteBuffer'
RCX=0x00007f60e2767898 is pointing into metadata
RDX=0x0000000000000de7 is an unknown value
RSP=0x00007f61560ec6f0 is pointing into the stack for thread: 0x00007f8b3017b800
RBP=[error occurred during error reporting (printing register info), id 0xb]

Stack: [0x00007f6155fed000,0x00007f61560ee000],  sp=0x00007f61560ec6f0,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 23901 C2 java.util.stream.MatchOps$1MatchSink.accept(Ljava/lang/Object;)V (44 bytes) @ 0x00007f8b3ef5b77c [0x00007f8b3ef5b640+0x13c]
J 16165 C2 java.util.ArrayList$ArrayListSpliterator.tryAdvance(Ljava/util/function/Consumer;)Z (79 bytes) @ 0x00007f8b3d67b344 [0x00007f8b3d67b2c0+0x84]
J 16160 C2 java.util.stream.MatchOps$MatchOp.evaluateSequential(Ljava/util/stream/PipelineHelper;Ljava/util/Spliterator;)Ljava/lang/Object; (7 bytes) @ 0x00007f8b3d67bc9c [0x00007f8b3d67b900+0x39c]
J 17729 C2 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALActionListener.visitLogEntryBeforeWrite(Lorg/apache/hadoop/hbase/wal/WALKey;Lorg/apache/hadoop/hbase/wal/WALEdit;)V (10 bytes) @ 0x00007f8b3fc39010 [0x00007f8b3fc388a0+0x770]
J 29991 C2 org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.appendAndSync()V (261 bytes) @ 0x00007f8b3fd03d90 [0x00007f8b3fd039e0+0x3b0]
J 20773 C2 org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.consume()V (474 bytes) @ 0x00007f8b40283728 [0x00007f8b40283480+0x2a8]
J 15191 C2 org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL$$Lambda$76.run()V (8 bytes) @ 0x00007f8b3ed69ecc [0x00007f8b3ed69ea0+0x2c]
J 17383% C2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x00007f8b3d9423f8 [0x00007f8b3d942260+0x198]
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub
V  [libjvm.so+0x66b9ba]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0xe1a
V  [libjvm.so+0x669073]  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x263
V  [libjvm.so+0x669647]  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)+0x57
V  [libjvm.so+0x6aaa4c]  thread_entry(JavaThread*, Thread*)+0x6c
V  [libjvm.so+0xa224cb]  JavaThread::thread_main_inner()+0xdb
V  [libjvm.so+0xa22816]  JavaThread::run()+0x316
V  [libjvm.so+0x8c4202]  java_start(Thread*)+0x102
C  [libpthread.so.0+0x76ba]  start_thread+0xca
{code}
 

This one is from a month earlier and has a deeper stack: we're trying to 
read a Cell.

 
{code:java}
Stack: [0x00007fa1d5fb8000,0x00007fa1d60b9000],  sp=0x00007fa1d60b7660,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 30665 C2 org.apache.hadoop.hbase.PrivateCellUtil.matchingFamily(Lorg/apache/hadoop/hbase/Cell;[BII)Z (59 bytes) @ 0x00007fcc2d29eeb2 [0x00007fcc2d29e7c0+0x6f2]
J 25816 C2 org.apache.hadoop.hbase.CellUtil.matchingFamily(Lorg/apache/hadoop/hbase/Cell;[B)Z (28 bytes) @ 0x00007fcc2a0430f8 [0x00007fcc2a0430e0+0x18]
J 17236 C2 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALActionListener$$Lambda$254.test(Ljava/lang/Object;)Z (8 bytes) @ 0x00007fcc2b40bc68 [0x00007fcc2b40bc20+0x48]
J 13735 C2 java.util.ArrayList$ArrayListSpliterator.tryAdvance(Ljava/util/function/Consumer;)Z (79 bytes) @ 0x00007fcc2b7d936c [0x00007fcc2b7d92c0+0xac]
J 17162 C2 java.util.stream.MatchOps$MatchOp.evaluateSequential(Ljava/util/stream/PipelineHelper;Ljava/util/Spliterator;)Ljava/lang/Object; (7 bytes) @ 0x00007fcc29bc05e8 [0x00007fcc29bbfe80+0x768]
J 16934 C2 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALActionListener.visitLogEntryBeforeWrite(Lorg/apache/hadoop/hbase/wal/WALKey;Lorg/apache/hadoop/hbase/wal/WALEdit;)V (10 bytes) @ 0x00007fcc2bb313f8 [0x00007fcc2bb30c60+0x798]
J 30732 C2 org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.appendAndSync()V (261 bytes) @ 0x00007fcc2ae5a420 [0x00007fcc2ae59d60+0x6c0]
J 22203 C2 org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.consume()V (474 bytes) @ 0x00007fcc2a987420 [0x00007fcc2a987200+0x220]
J 16857 C2 org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL$$Lambda$126.run()V (8 bytes) @ 0x00007fcc2b0bf28c [0x00007fcc2b0bf260+0x2c]
J 13721% C2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x00007fcc2b7d77c0 [0x00007fcc2b7d7240+0x580]
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
