churro morales created HBASE-13724:
--------------------------------------
Summary: ReplicationSource dies under certain conditions reading a
sequence file
Key: HBASE-13724
URL: https://issues.apache.org/jira/browse/HBASE-13724
Project: HBase
Issue Type: Bug
Reporter: churro morales
A little background:
We run our servers in -ea mode (Java assertions enabled) and have seen quite a
few replication sources silently die over the past few months.
Note: the stack trace posted below comes from a regionserver running 0.94, but
from a quick look at the code I believe this will happen in 0.98 too.
Should we harden ReplicationSource to deal with these types of assertion
errors by catching Throwable, or should we deal with this at the sequence
file reader level? I am still looking into the root cause, but when we
manually shut down our regionservers, the regionserver that recovered the queue
replicated that log just fine. So in our case a simple retry would have worked.
{code}
2015-05-08 11:04:23,348 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationSource, currentPath=hdfs://hm6.xxx.flurry.com:9000/hbase/.logs/xxxxx.yy.flurry.com,60020,1426792702998/xxxxx.atl.flurry.com%2C60020%2C1426792702998.1431107922449
java.lang.AssertionError
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader$WALReaderFSDataInputStream.getPos(SequenceFileLogReader.java:121)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1489)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1479)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1474)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:178)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:734)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:69)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:583)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:373)
{code}
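To make the "simple retry" idea concrete, here is a rough sketch of what
retrying a reader open could look like. This is not HBase code and not a
proposed patch: the class RetryingOpen, the method openWithRetry, and its
parameters are hypothetical names used only for illustration. It assumes the
failure is transient and that an AssertionError (thrown only when running with
-ea) should be treated like an IOException for retry purposes.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; not part of HBase.
final class RetryingOpen {

    private RetryingOpen() {}

    /**
     * Calls opener until it succeeds or maxAttempts is reached.
     * Retries on IOException and on AssertionError, since with -ea an
     * assertion failure would otherwise escape a catch (IOException) block
     * and kill the calling thread.
     */
    static <T> T openWithRetry(Callable<T> opener, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Throwable last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return opener.call();
            } catch (IOException | AssertionError e) {
                // Remember the failure and back off before the next attempt.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        // All attempts failed: surface the last failure as the cause.
        throw new IOException("Giving up after " + maxAttempts + " attempts", last);
    }
}
```

A caller would wrap whatever opens the log reader in a Callable and pass it to
openWithRetry. Whether the retry belongs in ReplicationSource or lower down in
the sequence file reader is exactly the open question above; the sketch only
shows the catch-and-retry shape.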
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)