[jira] [Commented] (HBASE-21607) HBase Region server read fails in case of datanode disk error.
[ https://issues.apache.org/jira/browse/HBASE-21607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721916#comment-16721916 ]

Lars Hofhansl commented on HBASE-21607:
---
I guess the question I have is: why is there a NULL-checksum option in the first place? We turned on HBase checksums, and there's an option to configure a no-op checksum... Seems at least somewhat pointless. :)

> HBase Region server read fails in case of datanode disk error.
> --
>
> Key: HBASE-21607
> URL: https://issues.apache.org/jira/browse/HBASE-21607
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 1.3.2
> Reporter: Rushabh S Shah
> Priority: Major
>
> HBase region server reads failed with the following error.
> {noformat}
> 2018-11-30 16:49:18,760 WARN [,queue=12,port=60020] hdfs.BlockReaderFactory -
> BlockReaderFactory(fileName=, block=BP-1618467445--1516873463430:blk_1090719164_16982933): error creating ShortCircuitReplica.
> java.io.IOException: invalid metadata header version 0. Can only handle version 1.
> at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitReplica.<init>(ShortCircuitReplica.java:129)
> at org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:558)
> at org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490)
> at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782)
> at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716)
> at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
> at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)
> at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1145)
> at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:1087)
> at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1444)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1407)
> at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:89)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock.positionalReadWithExtra(HFileBlock.java:834)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1530)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1781)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1624)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:455)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.seekTo(HFileReaderV2.java:1263)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:297)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:189)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:372)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:220)
> at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2164)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5916)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5890)
> at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2739)
> at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2719)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7197)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7156)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7149)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2250)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35068)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2373)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
>
> 2018-11-30 16:49:18,760 WARN [,queue=12,port=60020] shortcircuit.ShortCircuitCache - ShortCircuitCache(0x16fd768f): failed to load 1090719164_BP-1618467445--1516873463430
> 2018-11-30 16:49:18,761 DEBUG [,queue=12,port=60020] ipc.RpcServer - RpcServer.FifoWFPBQ.default.handler=246,queue=12,port=60020: callId: 46940 service: ClientService methodName: Get size: 443 connection: :48798 deadline: 1543596678759
> 2018-11-30 16:49:18,761 DEBUG
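As an aside to the comment above: a minimal, self-contained sketch of why a no-op checksum defeats the purpose of HBase-level checksums. This is not HBase code; the class and method names are invented for illustration, and a plain `java.util.zip.CRC32` stands in for HBase's checksum types. A single flipped bit (the kind of corruption a failing datanode disk produces) is caught by CRC32 but sails straight through a checksum that always returns 0:

```java
import java.util.zip.CRC32;

public class ChecksumDemo {
    // Hypothetical no-op checksum, analogous to a NULL checksum type:
    // always returns 0 regardless of the data.
    static long noOpChecksum(byte[] data) {
        return 0L;
    }

    // Real CRC32 over the block payload.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] block = "hbase block payload".getBytes();
        long storedCrc = crc32(block);          // checksum written at flush time
        long storedNoOp = noOpChecksum(block);  // always 0

        block[3] ^= 0x40; // simulate a single-bit disk error on read

        // CRC32 detects the corruption; the no-op checksum cannot.
        System.out.println("crc32 detects:  " + (crc32(block) != storedCrc));       // true
        System.out.println("no-op detects: " + (noOpChecksum(block) != storedNoOp)); // false
    }
}
```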
[jira] [Commented] (HBASE-21607) HBase Region server read fails in case of datanode disk error.
[ https://issues.apache.org/jira/browse/HBASE-21607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721914#comment-16721914 ]

Lars Hofhansl commented on HBASE-21607:
---
This of course is a rare case... Choosing 0 as a value read from a datastore is just bad practice, though. Not sure how we can change it. I suppose we could have a new HFile version, have 0 indicate an error, and pick only a non-zero (and non-255) value.
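The proposal in the comment, using only non-zero (and non-255) version markers so that an all-zeros or all-ones byte read back from a zeroed or failing sector is recognized as media corruption rather than a format mismatch, could be sketched like this. This is hypothetical illustration code, not the actual metadata-header reader:

```java
public class HeaderVersionCheck {
    static final int SUPPORTED_VERSION = 1;

    // Hypothetical validation: 0x00 (all bits clear, what a zeroed or
    // failed sector typically reads back as) and 0xFF (all bits set) are
    // never valid version markers, so seeing one means corrupt media.
    // Any other unexpected value is a genuine version mismatch.
    static String classify(int versionByte) {
        if (versionByte == 0x00 || versionByte == 0xFF) {
            return "corrupt";
        }
        if (versionByte == SUPPORTED_VERSION) {
            return "ok";
        }
        return "unsupported";
    }

    public static void main(String[] args) {
        System.out.println(classify(0x00)); // corrupt
        System.out.println(classify(0x01)); // ok
        System.out.println(classify(0x02)); // unsupported
        System.out.println(classify(0xFF)); // corrupt
    }
}
```

With a check like this, the IOException in the report above ("invalid metadata header version 0") would surface as a disk-corruption error instead of a version-handling error, which is a much better hint for the operator.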