[
https://issues.apache.org/jira/browse/HBASE-26036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371823#comment-17371823
]
Xiaolin Ha commented on HBASE-26036:
------------------------------------
Hi [~stack], the scanner being closed in get() stood out when I was looking
through the code based on the coredump logs. But the load in the UT is not
heavy enough to confirm the suspicion by letting the program reuse and
overwrite the released DBBs on its own. So I added a test
BYTEBUFF_ALLOCATOR_CLASS that overwrites the DBBs right after they are
released, and the results show that checkAndMutate has read dirty data.
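
To illustrate the idea of overwriting buffers right after release, here is a
minimal, self-contained sketch. It does not use the HBase ByteBuffAllocator
API: PoisoningPool, PoisonOnReleaseDemo, and the EXPECTED value are
hypothetical names made up for this example. The pool fills every released
buffer with 0xFF, so any reader that still holds a reference after release
sees garbage immediately instead of only under heavy load.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

/** Hypothetical stand-in for a pooled allocator; NOT the HBase ByteBuffAllocator API. */
class PoisoningPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int size;

    PoisoningPool(int size) {
        this.size = size;
    }

    /** Hands out a pooled buffer if one is free, otherwise allocates a new direct buffer. */
    ByteBuffer allocate() {
        ByteBuffer b = free.poll();
        return b != null ? b : ByteBuffer.allocateDirect(size);
    }

    /** Overwrites every byte with 0xFF before pooling, so stale reads become visible at once. */
    void release(ByteBuffer b) {
        for (int i = 0; i < b.capacity(); i++) {
            b.put(i, (byte) 0xFF);
        }
        free.add(b);
    }
}

class PoisonOnReleaseDemo {
    /** Arbitrary value standing in for a cell timestamp (assumption for this sketch). */
    static final long EXPECTED = 1625000000000L;

    /** Buggy order: the buffer is released before the value is read from it. */
    static long readAfterRelease(PoisoningPool pool) {
        ByteBuffer buf = pool.allocate();
        buf.putLong(0, EXPECTED);
        pool.release(buf);
        return buf.getLong(0); // reads the poisoned bytes
    }

    /** Correct order: the value is copied out before the buffer is released. */
    static long readBeforeRelease(PoisoningPool pool) {
        ByteBuffer buf = pool.allocate();
        buf.putLong(0, EXPECTED);
        long value = buf.getLong(0);
        pool.release(buf);
        return value;
    }

    public static void main(String[] args) {
        PoisoningPool pool = new PoisoningPool(64);
        System.out.println(readAfterRelease(pool) == EXPECTED);  // prints "false"
        System.out.println(readBeforeRelease(pool) == EXPECTED); // prints "true"
    }
}
```

In this sketch the comparison against the expected value fails whenever the
read happens after release, which is analogous to the checkAndMutate check
returning false on dirty data.
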
> DBB released too early and dirty data for checkAndMutate
> --------------------------------------------------------
>
> Key: HBASE-26036
> URL: https://issues.apache.org/jira/browse/HBASE-26036
> Project: HBase
> Issue Type: Bug
> Components: rpc
> Affects Versions: 3.0.0-alpha-1, 2.0.0
> Reporter: Xiaolin Ha
> Assignee: Xiaolin Ha
> Priority: Critical
>
> Before HBASE-25187, we saw regionserver JVM crashes on our production
> clusters. The coredump info is as follows:
> {code:java}
> Stack: [0x00007f621ba8d000,0x00007f621bb8e000], sp=0x00007f621bb8c0e0, free
> space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
> code)
> J 10829 C2 org.apache.hadoop.hbase.ByteBufferKeyValue.getTimestamp()J (9
> bytes) @ 0x00007f6a5ee11b2d [0x00007f6a5ee11ae0+0x4d]
> J 22844 C2
> org.apache.hadoop.hbase.regionserver.HRegion.doCheckAndRowMutate([B[B[BLorg/apache/hadoop/hbase/filter/CompareFilter$CompareOp;Lorg/apache/hadoop/hbase/filter/ByteArrayComparable;Lorg/apache/hadoop/hbase/client/RowMutations;Lorg/apache/hadoop/hbase/client/Mutation;Z)Z
> (540 bytes) @ 0x00007f6a60bed144 [0x00007f6a60beb320+0x1e24]
> J 17972 C2
> org.apache.hadoop.hbase.regionserver.RSRpcServices.checkAndRowMutate(Lorg/apache/hadoop/hbase/regionserver/Region;Ljava/util/List;Lorg/apache/hadoop/hbase/CellScanner;[B[B[BLorg/apache/hadoop/hbase/filter/CompareFilter$CompareOp;Lorg/apache/hadoop/hbase/filter/ByteArrayComparable;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionActionResult$Builder;)Z
> (312 bytes) @ 0x00007f6a5f4a7ed0 [0x00007f6a5f4a6f40+0xf90]
> J 26197 C2
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(Lorg/apache/hbase/thirdparty/com/google/protobuf/RpcController;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MultiRequest;)Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MultiResponse;
> (644 bytes) @ 0x00007f6a61538b0c [0x00007f6a61537940+0x11cc]
> J 26332 C2
> org.apache.hadoop.hbase.ipc.RpcServer.call(Lorg/apache/hadoop/hbase/ipc/RpcCall;Lorg/apache/hadoop/hbase/monitoring/MonitoredRPCHandler;)Lorg/apache/hadoop/hbase/util/Pair;
> (566 bytes) @ 0x00007f6a615e8228 [0x00007f6a615e79c0+0x868]
> J 20563 C2 org.apache.hadoop.hbase.ipc.CallRunner.run()V (1196 bytes) @
> 0x00007f6a60711a4c [0x00007f6a60711000+0xa4c]
> J 19656% C2
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(Ljava/util/concurrent/BlockingQueue;Ljava/util/concurrent/atomic/AtomicInteger;)V
> (338 bytes) @ 0x00007f6a6039a414 [0x00007f6a6039a320+0xf4]
> j org.apache.hadoop.hbase.ipc.RpcExecutor$1.run()V+24
> j java.lang.Thread.run()V+11
> v ~StubRoutines::call_stub
> {code}
> I have written a UT that reproduces this error 100% of the time.
> After HBASE-25187, the check result of checkAndMutate will be false,
> because it reads wrong/dirty data from the released ByteBuff.