[
https://issues.apache.org/jira/browse/HBASE-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jean-Daniel Cryans reassigned HBASE-8615:
-----------------------------------------
Assignee: Jean-Daniel Cryans
Assigning to myself; it failed again in this build:
http://54.241.6.143/job/HBase-TRUNK-Hadoop-2/org.apache.hbase$hbase-server/421/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationKillMasterRSCompressed/killOneMasterRS/
I tried to reproduce it on Hadoop 1 without success. Even though it shouldn't
matter, I'll give it a shot on Hadoop 2.
The cause of this issue seems to be that there is one case where we clear
the compression context in the middle of reading a file.
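
To illustrate the suspected failure mode, here is a toy sketch. It is not
HBase code: the names DictionaryResetSketch, ToyDictionary and decode, and the
"+value" / "#index" token format, are all made up for the example. It only
shows how a reader that resolves back-references against a dictionary can hit
an IndexOutOfBoundsException if that dictionary is emptied partway through a
file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DictionaryResetSketch {

    /** Hypothetical stand-in for a compression dictionary: index -> entry. */
    static final class ToyDictionary {
        private final List<String> entries = new ArrayList<>();

        int add(String value) {
            entries.add(value);
            return entries.size() - 1;
        }

        String get(int index) {
            if (index < 0 || index >= entries.size()) {
                throw new IndexOutOfBoundsException(
                    "index (" + index + ") must be less than size (" + entries.size() + ")");
            }
            return entries.get(index);
        }

        void clear() {
            entries.clear();
        }
    }

    /**
     * Decodes tokens in order. "+value" is a literal that is also added to the
     * dictionary; "#n" is a back-reference to dictionary entry n.
     * If clearBeforeToken >= 0, the dictionary is cleared just before that
     * token, simulating a context reset in the middle of a file.
     */
    static List<String> decode(List<String> tokens, ToyDictionary dict, int clearBeforeToken) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (i == clearBeforeToken) {
                dict.clear();
            }
            String token = tokens.get(i);
            if (token.startsWith("+")) {
                String value = token.substring(1);
                dict.add(value);
                out.add(value);
            } else {
                out.add(dict.get(Integer.parseInt(token.substring(1))));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("+rowA", "+rowB", "#0", "#1");

        // Dictionary left alone: every back-reference resolves.
        System.out.println(decode(tokens, new ToyDictionary(), -1));

        // Dictionary cleared before the last token: the reference "#1"
        // now points past the end of an empty dictionary.
        try {
            decode(tokens, new ToyDictionary(), 3);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In the toy version the stale index is small; in the trace quoted below the
reader asked for index 30062 against a dictionary of size 1, which would be
consistent with the same kind of mismatch between the reader's position and
the dictionary state, though that is an assumption until it is reproduced.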
> TestReplicationQueueFailoverCompressed#queueFailover fails on hadoop 2.0 due
> to IndexOutOfBoundsException
> ---------------------------------------------------------------------------------------------------------
>
> Key: HBASE-8615
> URL: https://issues.apache.org/jira/browse/HBASE-8615
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Ted Yu
> Assignee: Jean-Daniel Cryans
> Attachments:
> org.apache.hadoop.hbase.replication.TestReplicationQueueFailoverCompressed-output.txt
>
>
> In a recent test run, I noticed the following in test output:
> {code}
> 2013-05-24 22:01:02,424 DEBUG [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] fs.HFileSystem$ReorderWALBlocks(327): /user/hortonzy/hbase/.logs/kiyo.gq1.ygridcore.net,42690,1369432806911/kiyo.gq1.ygridcore.net%2C42690%2C1369432806911.1369432840428 is an HLog file, so reordering blocks, last hostname will be:kiyo.gq1.ygridcore.net
> 2013-05-24 22:01:02,429 DEBUG [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] wal.ProtobufLogReader(118): After reading the trailer: walEditsStopOffset: 132235, fileLength: 132243, trailerPresent: true
> 2013-05-24 22:01:02,438 ERROR [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] wal.ProtobufLogReader(236): Error while reading 691 WAL KVs; started reading at 53272 and read up to 65538
> 2013-05-24 22:01:02,438 WARN [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] regionserver.ReplicationSource(324): 2 Got:
> java.io.IOException: Error while reading 691 WAL KVs; started reading at 53272 and read up to 65538
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:237)
>     at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:96)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:89)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:404)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:320)
> Caused by: java.lang.IndexOutOfBoundsException: index (30062) must be less than size (1)
>     at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
>     at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
>     at org.apache.hadoop.hbase.regionserver.wal.LRUDictionary$BidirectionalLRUMap.get(LRUDictionary.java:124)
>     at org.apache.hadoop.hbase.regionserver.wal.LRUDictionary$BidirectionalLRUMap.access$000(LRUDictionary.java:71)
>     at org.apache.hadoop.hbase.regionserver.wal.LRUDictionary.getEntry(LRUDictionary.java:42)
>     at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvDecoder.readIntoArray(WALCellCodec.java:210)
>     at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvDecoder.parseCell(WALCellCodec.java:184)
>     at org.apache.hadoop.hbase.codec.BaseDecoder.advance(BaseDecoder.java:46)
>     at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFromCells(WALEdit.java:213)
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:217)
>     ... 4 more
> 2013-05-24 22:01:02,439 DEBUG [RegionServer:0;kiyo.gq1.ygridcore.net,42690,1369432806911.replicationSource,2] regionserver.ReplicationSource(583): Nothing to replicate, sleeping 100 times 10
> {code}
> Will attach test output.