Gokhan Cagrici created PHOENIX-2719:
---------------------------------------
Summary: RS crashed and HBase is not recovering during log split
Key: PHOENIX-2719
URL: https://issues.apache.org/jira/browse/PHOENIX-2719
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.5.2
Environment: We are using phoenix 4.5.2 in CDH 5.5 as a parcel.
Reporter: Gokhan Cagrici
Priority: Blocker
Hi,
Several RSs crashed and HBase is now trying to recover, but the log splitting
phase fails with the following exception:
Caught throwable while processing event RS_LOG_REPLAY
java.lang.NoSuchFieldError: in
    at org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec$IndexKeyValueDecoder.parseCell(IndexedWALEditCodec.java:98)
    at org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec$IndexKeyValueDecoder.parseCell(IndexedWALEditCodec.java:85)
    at org.apache.hadoop.hbase.codec.BaseDecoder.advance(BaseDecoder.java:67)
    at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFromCells(WALEdit.java:244)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:343)
    at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:104)
    at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:87)
    at org.apache.hadoop.hbase.wal.WALSplitter.getNextLogLine(WALSplitter.java:799)
    at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:332)
    at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:242)
    at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:104)
    at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
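In general, a NoSuchFieldError means a class was compiled against a version of
another class whose field (here named "in") is missing or has a different type
in the version loaded at runtime. The following is only a minimal diagnostic
sketch, not a fix: the FieldCheck class and its methods are hypothetical
helpers, and the assumption that the field belongs to
org.apache.hadoop.hbase.codec.BaseDecoder is inferred from the stack trace, not
confirmed. Run on a region server's classpath, it reports the field's declared
type and the jar the class was loaded from.

```java
import java.lang.reflect.Field;
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Phoenix or HBase.
final class FieldCheck {

    // Returns the declared type name of the field, searching the class and its
    // superclasses; returns null if the class or the field cannot be found.
    static String fieldType(String className, String fieldName) {
        Class<?> c;
        try {
            // 'false' avoids running static initializers of the inspected class.
            c = Class.forName(className, false, FieldCheck.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
        for (; c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(fieldName);
                return f.getType().getName();
            } catch (NoSuchFieldException e) {
                // Not declared here; keep looking in the superclass.
            }
        }
        return null;
    }

    // Returns the location (typically a jar) the class was loaded from,
    // or a placeholder when that information is unavailable.
    static String loadedFrom(String className) {
        try {
            Class<?> c = Class.forName(className, false, FieldCheck.class.getClassLoader());
            CodeSource cs = c.getProtectionDomain().getCodeSource();
            return cs == null ? "(bootstrap or unknown)" : String.valueOf(cs.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(class not found)";
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions taken from the stack trace above.
        String cls = args.length > 0 ? args[0] : "org.apache.hadoop.hbase.codec.BaseDecoder";
        String field = args.length > 1 ? args[1] : "in";
        System.out.println("class:       " + cls);
        System.out.println("loaded from: " + loadedFrom(cls));
        System.out.println("field '" + field + "' type: " + fieldType(cls, field));
    }
}
```

If the reported type or jar differs from what the Phoenix server jar was built
against, that would be consistent with a version mismatch between the Phoenix
build and the HBase shipped in the parcel.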
I believe this bug is related to PHOENIX-2629; however, we cannot build the jar
file from GitHub since this is a CDH parcel and definitely needs some
intervention.
Our system is completely down at the moment.
What needs to be done?