Dear Folks,
We recently hit a production issue where Get/Scan operations fail while
reading an HFile block.
The following exception occurred:
Caused by: java.io.IOException: Could not reseek StoreFileScanner[HFileScanner
for reader reader=<HfileName>, compression=snappy,
cacheConf=blockCache=LruBlockCache{blockCount=100604, currentSize=8077097536,
freeSize=137037248, maxSize=8214134784, heapSize=8077097536,
minSize=7803427840, minFactor=0.95, multiSize=3901713920, multiFactor=0.5,
singleSize=1950856960, singleFactor=0.25}, cacheDataOnRead=true,
cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false,
cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false,
firstKey=-----, lastKey=-----, avgKeyLen=49, avgValueLen=642, entries=2822972,
length=1038938707, cur=----- to key -----
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:228)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:423)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:363)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:309)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:271)
at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:988)
at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:977)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:658)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:150)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:6076)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6236)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6010)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2879)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3080)
... 5 more
Caused by: java.io.IOException: Passed in onDiskSizeWithHeader=85372 != 33,
offset=761929468, fileContext=HFileContext [ usesHBaseChecksum=true
checksumType=CRC32C bytesPerChecksum=16384 blocksize=65536 encoding=NONE
includesMvcc=true includesTags=false compressAlgo=SNAPPY compressTags=false
cryptoContext=[ cipher=NONE keyHash=NONE ] ]
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.verifyOnDiskSizeMatchesHeader(HFileBlock.java:1673)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1734)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1567)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:454)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:271)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:651)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:631)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:303)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:211)
... 18 more
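To illustrate what the inner check seems to compare, here is a minimal
standalone sketch. It is not the HBase code. The header layout (8-byte block
magic followed by a 4-byte on-disk size that excludes the header), the 33-byte
header size with HBase checksums enabled, and the class and method names are
all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

public class HeaderSizeCheck {
    // Assumption: the block header starts with an 8-byte magic.
    static final int MAGIC_LENGTH = 8;
    // Assumption: header size when usesHBaseChecksum=true.
    static final int HEADER_SIZE_WITH_CHECKSUM = 33;

    // Reads the size field right after the magic and adds the header size.
    static int onDiskSizeWithHeader(ByteBuffer headerBuf) {
        return headerBuf.getInt(MAGIC_LENGTH) + HEADER_SIZE_WITH_CHECKSUM;
    }

    // Compares the size the caller expected with the size found in the header.
    static void verify(int passedIn, ByteBuffer headerBuf, long offset)
            throws IOException {
        int fromHeader = onDiskSizeWithHeader(headerBuf);
        if (passedIn != fromHeader) {
            throw new IOException("Passed in onDiskSizeWithHeader=" + passedIn
                + " != " + fromHeader + ", offset=" + offset);
        }
    }

    public static void main(String[] args) {
        // A freshly allocated buffer is zero-filled, so the size field reads 0.
        ByteBuffer zeroed = ByteBuffer.allocate(HEADER_SIZE_WITH_CHECKSUM);
        try {
            verify(85372, zeroed, 761929468L);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, a value of 33 would mean the size field in the
header bytes that were read was 0, but we have not confirmed this against the
1.3.1 source.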
This exception occurred multiple times in the same environment, and the same
problem was observed even after the region was reassigned to a different RS.
To check whether the HFile was corrupted, we copied it to another cluster,
where the read succeeded.
The problem was resolved after a major compaction in the production
environment.
The production environment runs HBase 1.3.1 plus OS JIRA bug fixes.
Has anyone met this problem in their environment? Any lead would be greatly
appreciated.
Regards,
Pankaj