[jira] [Commented] (HBASE-21607) HBase Region server read fails in case of datanode disk error.

2018-12-14 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721916#comment-16721916
 ] 

Lars Hofhansl commented on HBASE-21607:
---

I guess the question I have is: why is there a NULL-checksum option in the first 
place? We turned on HBase checksums, and there's an option to configure a no-op 
checksum... That seems at least somewhat pointless. :)
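To illustrate the point, here is a minimal sketch (hypothetical code, not HBase's actual implementation; the class, the `verify` method, and the two-value enum are illustrative) of why a no-op checksum type undermines the feature: with a NULL checksum, verification always passes, so a corrupted block, such as the zero-filled read in the log below, is accepted silently.

```java
import java.util.zip.CRC32;

class ChecksumSketch {
    enum ChecksumType { NULL, CRC32 }

    // Returns true if data matches the expected checksum.
    // With ChecksumType.NULL, verification is a no-op and always passes,
    // so corrupted bytes (e.g. an all-zero block from a failing disk)
    // are silently accepted.
    static boolean verify(ChecksumType type, byte[] data, long expected) {
        switch (type) {
            case NULL:
                return true; // no-op: corruption goes undetected
            case CRC32: {
                CRC32 crc = new CRC32();
                crc.update(data, 0, data.length);
                return crc.getValue() == expected;
            }
            default:
                throw new IllegalStateException("unknown checksum type");
        }
    }

    public static void main(String[] args) {
        byte[] good = "block data".getBytes();
        CRC32 crc = new CRC32();
        crc.update(good, 0, good.length);
        long checksum = crc.getValue();

        // Simulate a zero-filled read from a failing disk.
        byte[] corrupted = new byte[good.length];

        System.out.println(verify(ChecksumType.CRC32, corrupted, checksum)); // false
        System.out.println(verify(ChecksumType.NULL, corrupted, checksum));  // true
    }
}
```

A real CRC catches the all-zero block immediately; the NULL type turns the check into dead weight.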

> HBase Region server read fails in case of datanode disk error.
> --
>
> Key: HBASE-21607
> URL: https://issues.apache.org/jira/browse/HBASE-21607
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Rushabh S Shah
>Priority: Major
>
> HBase region server reads failed with the following error.
> {noformat}
> 2018-11-30 16:49:18,760 WARN [,queue=12,port=60020] hdfs.BlockReaderFactory - 
> BlockReaderFactory(fileName=, 
> block=BP-1618467445--1516873463430:blk_1090719164_16982933): error 
> creating ShortCircuitReplica.
> java.io.IOException: invalid metadata header version 0. Can only handle 
> version 1.
> at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitReplica.&lt;init&gt;(ShortCircuitReplica.java:129)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:558)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490)
> at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782)
> at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1145)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:1087)
> at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1444)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1407)
> at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:89)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock.positionalReadWithExtra(HFileBlock.java:834)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1530)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1781)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1624)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:455)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.seekTo(HFileReaderV2.java:1263)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:297)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:189)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:372)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.&lt;init&gt;(StoreScanner.java:220)
> at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2164)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5916)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.&lt;init&gt;(HRegion.java:5890)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2739)
> at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2719)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7197)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7156)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7149)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2250)
> at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35068)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2373)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
>  
> 2018-11-30 16:49:18,760 WARN [,queue=12,port=60020] 
> shortcircuit.ShortCircuitCache - ShortCircuitCache(0x16fd768f): failed to 
> load 1090719164_BP-1618467445--1516873463430
> 2018-11-30 16:49:18,761 DEBUG [,queue=12,port=60020] ipc.RpcServer - 
> RpcServer.FifoWFPBQ.default.handler=246,queue=12,port=60020: callId: 46940 
> service: ClientService methodName: Get size: 443 connection: :48798 
> deadline: 1543596678759
> 2018-11-30 16:49:18,761 DEBUG 

[jira] [Commented] (HBASE-21607) HBase Region server read fails in case of datanode disk error.

2018-12-14 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721914#comment-16721914
 ] 

Lars Hofhansl commented on HBASE-21607:
---

This is of course a rare case... Treating a 0 read from the datastore as a valid 
value is just bad practice, though. Not sure how we can change it.
I suppose we could introduce a new HFile version, have 0 indicate an error, and 
pick only non-zero (and non-255) values.
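The version-byte scheme suggested above can be sketched as follows (hypothetical code, not a proposed patch; the class name, version constant, and messages are illustrative). Reserving 0x00 and 0xFF means the two most common disk-failure patterns, zero-filled reads and erased media reading back all ones, can never masquerade as a valid header version:

```java
class VersionByteSketch {
    // Hypothetical current version; echoes the "Can only handle version 1"
    // message in the log below.
    static final int CURRENT_VERSION = 1;

    // Reads the header version byte, distinguishing likely disk corruption
    // (reserved values 0x00 and 0xFF) from a genuine format mismatch.
    static int readVersion(byte[] header) {
        int v = header[0] & 0xFF;
        if (v == 0x00 || v == 0xFF) {
            // 0x00: zero-filled read from a failing disk; 0xFF: erased/blank media.
            throw new IllegalStateException(
                "version byte " + v + " signals corruption, not a format mismatch");
        }
        if (v != CURRENT_VERSION) {
            throw new IllegalStateException("unsupported header version " + v);
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(readVersion(new byte[] {1})); // prints 1
        try {
            readVersion(new byte[] {0}); // the failure mode from this issue
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this split, a zero-filled block surfaces as a corruption error that the reader could route to a fallback replica, instead of the misleading "invalid metadata header version 0" seen here.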

