[ https://issues.apache.org/jira/browse/HBASE-27706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell reassigned HBASE-27706:
-------------------------------------------

    Assignee:     (was: Andrew Kyle Purtell)

> Additional Zstandard codec compatible with the Hadoop native one
> ----------------------------------------------------------------
>
>                 Key: HBASE-27706
>                 URL: https://issues.apache.org/jira/browse/HBASE-27706
>             Project: HBase
>          Issue Type: Bug
>          Components: compatibility
>    Affects Versions: 2.5.3
>            Reporter: Frens Jan Rumph
>            Priority: Major
>
> We're in the process of upgrading an HBase installation from 2.2.4 to 2.5.3. We're currently using Zstd compression from our Hadoop installation. Due to some other class path issues (Netty issues in relation to the async WAL provider), we would like to remove Hadoop from the class path.
> However, using the Zstd compression from HBase (which uses [https://github.com/luben/zstd-jni]), we seem to hit an incompatibility. When restarting a node to use this implementation we saw errors like the following:
> {code:java}
> 2023-03-10 16:33:01,925 WARN [RS_OPEN_REGION-regionserver/n2:16020-0] handler.AssignRegionHandler: Failed to open region NAMESPACE:TABLE,,1673888962751.cdb726dad4eaabf765969f195e91c737., will report to master
> java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1148)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1091)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:994)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:941)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7228)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7183)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7159)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7118)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7074)
>     at org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:288)
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:338)
>     at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:297)
>     at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6359)
>     at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1114)
>     at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1111)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     ... 3 more
> Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading data index and meta index from file hdfs://CLUSTER/hbase/data/NAMESPACE/TABLE/cdb726dad4eaabf765969f195e91c737/e/aea6eddaa8ee476197d064a4b4c345b9
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.initMetaAndIndex(HFileInfo.java:392)
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:394)
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:518)
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:225)
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:266)
>     ... 6 more
> Caused by: java.io.IOException: Premature EOF from inputStream, but still need 2883 bytes
>     at org.apache.hadoop.hbase.io.util.BlockIOUtils.readFullyWithHeapBuffer(BlockIOUtils.java:153)
>     at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:104)
>     at org.apache.hadoop.hbase.io.hfile.HFileBlock.unpack(HFileBlock.java:644)
>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl$1.nextBlock(HFileBlock.java:1397)
>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl$1.nextBlockWithBlockType(HFileBlock.java:1407)
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.initMetaAndIndex(HFileInfo.java:365)
>     ... 10 more {code}
> I've been able to reproduce the issue with something like:
> {code:java}
> Configuration conf = HBaseConfiguration.create();
> conf.set("hbase.io.compress.zstd.codec",
>     "org.apache.hadoop.hbase.io.compress.zstd.ZstdCodec");
> HFileSystem fs = (HFileSystem) HFileSystem.get(conf);
> HFile.createReader(fs, new Path(...), conf); {code}
> with a file from HDFS that was created with the native compressor from Hadoop.
> Note that I only _suspect_ that this issue is caused by Zstd! In our test environment we are already running 2.5.3 with reasonable success.
> This issue arises when we drop Hadoop from the class path and use the 'built-in' compression. But that's not hard evidence of Zstd being the root cause, of course.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
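For readers of this thread: the innermost "Premature EOF from inputStream, but still need 2883 bytes" means a fixed-length read of a block payload hit end-of-stream before the expected byte count arrived — consistent with the reader and writer disagreeing on how compressed payloads are framed. The sketch below is a simplified, hypothetical stand-in I wrote for illustration (it is *not* HBase's actual `BlockIOUtils.readFullyWithHeapBuffer`); it only reproduces the read-fully semantics behind that error message:

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyDemo {
    // Simplified read-fully loop: keep reading until 'len' bytes arrive,
    // or fail if the stream ends first. Illustrative only.
    static void readFully(InputStream in, byte[] buf, int len) throws IOException {
        int off = 0;
        while (off < len) {
            int n = in.read(buf, off, len - off);
            if (n < 0) {
                throw new EOFException(
                    "Premature EOF from inputStream, but still need "
                        + (len - off) + " bytes");
            }
            off += n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] onlyTen = new byte[10]; // the stream holds 10 bytes...
        byte[] dest = new byte[13];    // ...but the block header promised 13
        try {
            readFully(new ByteArrayInputStream(onlyTen), dest, 13);
        } catch (EOFException e) {
            // Mirrors the shape of the error in the log above.
            System.out.println(e.getMessage());
        }
    }
}
```

In the reported failure the block metadata evidently promises more bytes than the zstd-jni-decoded stream delivers, which would fit a codec framing mismatch — but, as noted above, that is a suspicion, not a confirmed root cause.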