[ https://issues.apache.org/jira/browse/HBASE-25628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
SweetCandy updated HBASE-25628:
-------------------------------
Summary: Regionserver crash occasionally due to SIGSEGV (was: RegionServer
crash occasionally due to SIGSEGV)
> Regionserver crash occasionally due to SIGSEGV
> ----------------------------------------------
>
> Key: HBASE-25628
> URL: https://issues.apache.org/jira/browse/HBASE-25628
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.1.0
> Reporter: SweetCandy
> Priority: Major
> Attachments: hs_err_pid15967.log, hs_err_pid3375.log
>
>
> The regionserver JVM crashes after running for a period of time. According
> to the error dump logs, the crash is caused by a SIGSEGV signal. How should
> we deal with it?
> BTW, from the start of the regionserver to the crash, no error is recorded
> in the hbase log.
> After many crashes, there are some common points:
> # Thread is AsyncFSWAL-0.
> # Problematic frame is v ~StubRoutines::jshort_disjoint_arraycopy
> # Error method is
> org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter$OutputStreamWrapper.write.
> jdk: Oracle JDK 1.8.0_181
> os: CentOS Linux release 7.4.1708 (Core)
> hbase: 2.1.0-cdh6.3.0
>
> {code:java}
> # JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 )
> # Problematic frame:
> # v ~StubRoutines::jshort_disjoint_arraycopy
> {code}
>
> {code:java}
> --------------- T H R E A D ---------------
> Current thread (0x00007f4758820800): JavaThread "AsyncFSWAL-0" daemon [_thread_in_Java, id=17396, stack(0x00007f36f0aa3000,0x00007f36f0ae4000)]
> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00007f36bc3dd93a
> Stack: [0x00007f36f0aa3000,0x00007f36f0ae4000], sp=0x00007f36f0ae2310, free space=252k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> v ~StubRoutines::jshort_disjoint_arraycopy
> J 13350 C2 org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter$OutputStreamWrapper.write(Ljava/nio/ByteBuffer;II)V (34 bytes) @ 0x00007f475f113fd1 [0x00007f475f113cc0+0x311]
> J 16839 C2 org.apache.hadoop.hbase.ByteBufferKeyValue.write(Ljava/io/OutputStream;Z)I (21 bytes) @ 0x00007f475f4329dc [0x00007f475f432960+0x7c]
> J 12594 C2 org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$EnsureKvEncoder.write(Lorg/apache/hadoop/hbase/Cell;)V (27 bytes) @ 0x00007f475ef80414 [0x00007f475ef7fc20+0x7f4]
> VM Arguments:
> jvm_args:
> -Dproc_regionserver -XX:OnOutOfMemoryError=kill -9 %p
> -Djava.net.preferIPv4Stack=true -Xms33285996544 -Xmx33285996544 -Xmx64g
> -Xms32g -Xmn6g -Xss256k
> -XX:MaxPermSize=384m -XX:SurvivorRatio=6 -XX:+UseParNewGC
> -XX:ParallelGCThreads=10 -XX:+UseConcMarkSweepGC
> -XX:ParallelCMSThreads=16 -XX:+CMSParallelRemarkEnabled
> -XX:+UseCMSCompactAtFullCollection -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSInitiatingOccupancyFraction=70 -XX:CMSMaxAbortablePrecleanTime=500
> -XX:CMSFullGCsBeforeCompaction=5 -XX:+CMSClassUnloadingEnabled
> -XX:+HeapDumpOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=/log/tmp/hbase/hbase_hbase-REGIONSERVER-58f025f30c5505722d8dee21c92b9469_pid15967.hprof
> -XX:OnOutOfMemoryError=/opt/cloudera/cm-agent/service/common/killparent.sh
> -Dhbase.log.dir=/log/hbase
> -Dhbase.log.file=hbase-cmf-hbase-REGIONSERVER-spbsjzy05.log.out
> -Dhbase.home.dir=/opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/hbase
> -Dhbase.id.str= -Dhbase.root.logger=INFO,RFA
> -Djava.library.path=/opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/hadoop/lib/native:/opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/hbase/lib/native/Linux-amd64-64
> -Dhbase.security.logger=INFO,RFAS
> java_command: org.apache.hadoop.hbase.regionserver.HRegionServer start
> {code}
>
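To check whether the common points listed above really hold across several crash dumps, a small helper along these lines could pull the thread name, signal, and stub frame out of each hs_err file. This is only a sketch: the class name HsErrSummary and its summarize method are hypothetical, and the patterns assume the hs_err layout shown in the excerpts above (other JVM versions may format these lines differently).

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase or the JDK.
final class HsErrSummary {
    // Thread name from the "Current thread (...): JavaThread "name" ..." line.
    private static final Pattern THREAD = Pattern.compile("JavaThread \"([^\"]+)\"");
    // Signal name from the "siginfo: si_signo: 11 (SIGSEGV), ..." line.
    private static final Pattern SIGNAL = Pattern.compile("si_signo:\\s*\\d+\\s*\\((\\w+)\\)");
    // Stub frame such as "v ~StubRoutines::jshort_disjoint_arraycopy".
    private static final Pattern FRAME = Pattern.compile("~(StubRoutines::\\w+)");

    // Returns the first thread, signal, and stub frame found in the given lines.
    static Map<String, String> summarize(List<String> lines) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String line : lines) {
            capture(out, "thread", THREAD, line);
            capture(out, "signal", SIGNAL, line);
            capture(out, "frame", FRAME, line);
        }
        return out;
    }

    // Stores group 1 of the first match for a key; later matches are ignored.
    private static void capture(Map<String, String> out, String key, Pattern p, String line) {
        if (out.containsKey(key)) {
            return;
        }
        Matcher m = p.matcher(line);
        if (m.find()) {
            out.put(key, m.group(1));
        }
    }

    public static void main(String[] args) throws Exception {
        for (String path : args) {
            // ISO-8859-1 decodes any byte sequence, so odd bytes in a dump cannot abort the read.
            List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.ISO_8859_1);
            System.out.println(path + " -> " + summarize(lines));
        }
    }
}
```

Running it over the two attachments, for example `java HsErrSummary hs_err_pid15967.log hs_err_pid3375.log`, prints one summary per file, which makes it easy to see whether every crash shares the same thread and frame.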
--
This message was sent by Atlassian Jira
(v8.3.4#803005)