[ https://issues.apache.org/jira/browse/HBASE-28390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819406#comment-17819406 ]
Bryan Beaudreault commented on HBASE-28390:
-------------------------------------------

Oh, another thing – we initially noticed this as dropped data. So unless you were doing a full VerifyReplication run between the 2 clusters, you might not have seen a failure. I need to dig more into this to be 100% sure, but I believe it's possible that the WAL reader code will interpret the error as an EOF.

> WAL value compression fails for cells with large values
> -------------------------------------------------------
>
>                 Key: HBASE-28390
>                 URL: https://issues.apache.org/jira/browse/HBASE-28390
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Bryan Beaudreault
>            Priority: Major
>
> We are testing out WAL compression and noticed that it fails for large values
> when both features (wal compression and wal value compression) are enabled.
> It works fine with either feature independently, but not when combined. It
> seems to fail for all of the value compressor types, and the failure is in
> the LRUDictionary of wal key compression:
>
> {code:java}
> java.io.IOException: Error while reading 2 WAL KVs; started reading at 230 and read up to 396
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufWALStreamReader.next(ProtobufWALStreamReader.java:94) ~[classes/:?]
>     at org.apache.hadoop.hbase.wal.CompressedWALTestBase.doTest(CompressedWALTestBase.java:181) ~[test-classes/:?]
>     at org.apache.hadoop.hbase.wal.CompressedWALTestBase.testForSize(CompressedWALTestBase.java:129) ~[test-classes/:?]
>     at org.apache.hadoop.hbase.wal.CompressedWALTestBase.testLarge(CompressedWALTestBase.java:94) ~[test-classes/:?]
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
>     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
>     at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) ~[junit-4.13.2.jar:4.13.2]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.lang.Thread.run(Thread.java:829) ~[?:?]
> Caused by: java.lang.IndexOutOfBoundsException: index (21) must be less than size (1)
>     at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1371) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1353) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hadoop.hbase.io.util.LRUDictionary$BidirectionalLRUMap.get(LRUDictionary.java:153) ~[classes/:?]
>     at org.apache.hadoop.hbase.io.util.LRUDictionary$BidirectionalLRUMap.access$000(LRUDictionary.java:79) ~[classes/:?]
>     at org.apache.hadoop.hbase.io.util.LRUDictionary.getEntry(LRUDictionary.java:43) ~[classes/:?]
>     at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvDecoder.readIntoArray(WALCellCodec.java:366) ~[classes/:?]
>     at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvDecoder.parseCell(WALCellCodec.java:307) ~[classes/:?]
>     at org.apache.hadoop.hbase.codec.BaseDecoder.advance(BaseDecoder.java:66) ~[classes/:?]
>     at org.apache.hadoop.hbase.wal.WALEdit.readFromCells(WALEdit.java:313) ~[classes/:?]
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufWALStreamReader.next(ProtobufWALStreamReader.java:84) ~[classes/:?]
>     ... 27 more
> {code}
> We've created a unit test which reproduces this for each compressor type. It
> seems to fail around the 200 KB value size for each.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
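
To illustrate the concern in the comment above (a decode error being interpreted as an EOF, so data is dropped without any visible failure), here is a minimal toy sketch. It is not HBase code: the class `WalEofSketch`, its methods, and the 4-byte entry format are all hypothetical, and a negative value merely stands in for an entry that cannot be decoded. Whether the real WAL reader behaves this way is still unconfirmed.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model, not the HBase WAL reader.
class WalEofSketch {

  /** Encodes each value as a 4-byte int; a negative value stands in for an undecodable entry. */
  static byte[] encode(int... values) {
    ByteBuffer buf = ByteBuffer.allocate(4 * values.length);
    for (int v : values) {
      buf.putInt(v);
    }
    return buf.array();
  }

  /** Reads one entry; throws EOFException at the true end, IOException on a bad entry. */
  static int readEntry(DataInputStream in) throws IOException {
    int v = in.readInt();
    if (v < 0) {
      throw new IOException("cannot decode entry " + v);
    }
    return v;
  }

  /** Lenient reader: any IOException, including a decode error, ends the read quietly. */
  static List<Integer> readLenient(byte[] data) {
    List<Integer> entries = new ArrayList<>();
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    try {
      while (true) {
        entries.add(readEntry(in));
      }
    } catch (IOException e) {
      // Treated as end of file: everything after the bad entry is silently dropped.
    }
    return entries;
  }

  /** Strict reader: only a genuine EOFException ends the read; other errors propagate. */
  static List<Integer> readStrict(byte[] data) throws IOException {
    List<Integer> entries = new ArrayList<>();
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    try {
      while (true) {
        entries.add(readEntry(in));
      }
    } catch (EOFException e) {
      // Real end of stream.
    }
    return entries;
  }

  public static void main(String[] args) {
    byte[] wal = encode(1, 2, -1, 4);
    System.out.println("lenient: " + readLenient(wal));
    try {
      System.out.println("strict: " + readStrict(wal));
    } catch (IOException e) {
      System.out.println("strict reader failed: " + e.getMessage());
    }
  }
}
```

In this toy model the lenient reader returns only the entries before the bad one and reports nothing, which is the kind of loss that would only show up in an end-to-end comparison such as a VerifyReplication run.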