[
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461290#comment-13461290
]
binlijin commented on HBASE-6868:
---------------------------------
[~lhofhansl]
I checked the current implementation. hbase.regionserver.checksum.verify is
enabled by default, so reading an HFile uses the noChecksumFs in HFileSystem,
while reading an HLog uses the fs in HFileSystem. The two read paths use
different FileSystem objects:
fs in HFileSystem // filesystem object that has checksum verification turned
on.
noChecksumFs in HFileSystem // filesystem object that has checksum verification
turned off.
(1) dfs.client.read.shortcircuit = false, short circuit read turned off.
The DataNode reads the file data and sends it to the DFSClient (the
HRegionServer is a DFSClient).
HFile: the DataNode reads the block file and the meta file. The DFSClient does
not checksum the data; the HRegionServer (HFile) checksums the HFile data.
HLog: the DataNode reads the block file and the meta file. The DFSClient
checksums the data; the HRegionServer does not checksum the HLog data.
(2) dfs.client.read.shortcircuit = true,
dfs.client.read.shortcircuit.skip.checksum = false, short circuit read turned on.
If the block is local, the DFSClient reads the file data directly (the
HRegionServer is a DFSClient).
HFile: the DFSClient reads the block file and the meta file. The DFSClient does
not checksum the data; the HRegionServer (HFile) checksums the HFile data.
HLog: the DFSClient reads the block file and the meta file. The DFSClient
checksums the data; the HRegionServer does not checksum the HLog data.
(3) dfs.client.read.shortcircuit = true,
dfs.client.read.shortcircuit.skip.checksum = true, short circuit read turned on.
If the block is local, the DFSClient reads the file data directly (the
HRegionServer is a DFSClient).
HFile: the DFSClient reads the block file only. The DFSClient does not checksum
the data; the HRegionServer (HFile) checksums the HFile data.
HLog: the DFSClient reads the block file and the meta file. The DFSClient
checksums the data; the HRegionServer does not checksum the HLog data.
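The three cases above can be summarized as a small decision table. The sketch below is a toy model in Python, not HBase or HDFS code: the names ReadPath and read_path are hypothetical, and it assumes hbase.regionserver.checksum.verify is on and the block is local whenever short circuit read is enabled. It only restates the behaviour described in this comment.

```python
from typing import NamedTuple


class ReadPath(NamedTuple):
    reader: str            # process that reads the block file from disk
    reads_meta_file: bool  # whether the HDFS checksum (meta) file is also read
    verifier: str          # layer that verifies the checksum


def read_path(file_kind: str, short_circuit: bool, skip_checksum: bool) -> ReadPath:
    """Toy model of cases (1)-(3); hypothetical helper, not an HBase API."""
    if file_kind not in ("HFile", "HLog"):
        raise ValueError(f"unknown file kind: {file_kind}")

    # With short circuit read the DFSClient reads local blocks itself;
    # otherwise the DataNode reads them and streams them to the DFSClient.
    reader = "DFSClient" if short_circuit else "DataNode"

    if file_kind == "HFile":
        # HFile reads go through noChecksumFs: HDFS-level verification is off
        # and HBase verifies the checksums stored inside the HFile.
        # The meta file is skipped only in case (3).
        reads_meta = not (short_circuit and skip_checksum)
        return ReadPath(reader, reads_meta, "HRegionServer")

    # HLog reads go through the checksumming fs: the meta file is always read
    # and the DFSClient verifies the data.
    return ReadPath(reader, True, "DFSClient")


for case, (sc, skip) in enumerate([(False, False), (True, False), (True, True)], 1):
    for kind in ("HFile", "HLog"):
        print(case, kind, read_path(kind, sc, skip))
```

In this model the HLog row is identical in cases (2) and (3), which is the point of the comparison: skip.checksum only changes the HFile read path.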
If I am wrong, please correct me.
> Skip checksum is broke; are we double-checksumming by default?
> --------------------------------------------------------------
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
> Issue Type: Bug
> Components: HFile, wal
> Affects Versions: 0.94.0, 0.94.1
> Reporter: LiuLei
> Priority: Blocker
> Fix For: 0.94.3, 0.96.0
>
>
> The HFile contains checksums in order to reduce IOPS, so when HBase reads an
> HFile it doesn't need to read the checksum from the HDFS meta file. But the
> HLog file of HBase doesn't contain checksums, so when HBase reads the HLog it
> must read the checksum from the HDFS meta file. We could add setSkipChecksum
> per file to HDFS, or we could write checksums into the WAL if this skip
> checksum facility is enabled.