[
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204107#comment-13204107
]
Phabricator commented on HBASE-5074:
------------------------------------
todd has commented on the revision "[jira] [HBASE-5074] Support checksums in
HBase block cache".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HConstants.java:598 typo: verification.
Also, I'm still not sure what true/false means here; it would be better to
clarify that either here or in src/main/resources/hbase-default.xml if you
anticipate users ever changing this.
If I set it to false, does that mean I get no checksumming at all, or HDFS
checksumming as before? Please update the comment.
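If the intent is that false means "fall back to HDFS checksumming", something
along these lines would make it explicit (sketch of the wording only; I'm
assuming the constant and property names the patch uses):

  /**
   * If true, the regionserver verifies hfile block checksums itself and
   * skips HDFS-level checksum verification for those reads. If false,
   * HBase relies on HDFS checksumming as before; data is verified either
   * way.
   */
  public static final String HBASE_CHECKSUM_VERIFICATION =
      "hbase.regionserver.checksum.verify";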
src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:41-43 I
think this API would be cleaner with the following changes:
- rather than using the constant HFileBlock.HEADER_SIZE below, make the API:

  appendChecksums(ChecksumByteArrayOutputStream baos,
                  int dataOffset, int dataLen,
                  ChecksumType checksumType,
                  int bytesPerChecksum)

where it would checksum the data between dataOffset and dataOffset + dataLen,
and append the checksums to the baos.
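A rough sketch of what I mean (CRC32 here just stands in for dispatching on
checksumType, and I'm assuming ChecksumByteArrayOutputStream exposes its
backing array via something like a getBuffer() accessor):

  static void appendChecksums(ChecksumByteArrayOutputStream baos,
                              int dataOffset, int dataLen,
                              ChecksumType checksumType,
                              int bytesPerChecksum) throws java.io.IOException {
    byte[] buf = baos.getBuffer();  // hypothetical accessor to the bytes written so far
    java.io.DataOutputStream out = new java.io.DataOutputStream(baos);
    for (int off = dataOffset; off < dataOffset + dataLen; off += bytesPerChecksum) {
      int chunkLen = Math.min(bytesPerChecksum, dataOffset + dataLen - off);
      java.util.zip.CRC32 crc = new java.util.zip.CRC32();
      crc.update(buf, off, chunkLen);
      out.writeInt((int) crc.getValue());  // one 4-byte checksum per chunk
    }
    out.flush();
  }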
src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:73 same
here, I think it's better to take the offset as a parameter instead of
assuming HFileBlock.HEADER_SIZE.
src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:84 if this
is performance critical, use DataOutputBuffer, presized to the right size, and
then return its underlying buffer directly to avoid a copy and realloc.
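i.e. something along these lines (sketch; the method name and the numChecksums
parameter are just illustrative, with the caller computing the count from the
data length and bytesPerChecksum):

  static byte[] newChecksumBuffer(int numChecksums) throws java.io.IOException {
    // presized, so the buffer never reallocates while checksums are written
    org.apache.hadoop.io.DataOutputBuffer buf =
        new org.apache.hadoop.io.DataOutputBuffer(numChecksums * 4);
    // ... writeInt() each 4-byte checksum into buf here ...
    return buf.getData();  // backing array directly -- no toByteArray() copy
  }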
src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:123 seems
strange that this is inconsistent with the above -- if the block doesn't have a
checksum, why is that handled differently than if the block is from a prior
version which doesn't have a checksum?
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:100 typo:
re-enable
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:79-80 should
clarify which part of the data is checksummed.
As I read the code, only the non-header data (ie the "user data") is
checksummed. Is this correct?
It seems to me like this is potentially dangerous -- eg a flipped bit in an
hfile block header might cause the "compressedDataSize" field to be read as 2GB
or something, in which case the faulty allocation could cause the server to
OOME. I think we need a checksum on the hfile block header as well.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:824 rename to
doCompressionAndChecksumming, and update javadoc
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:815 I was a
bit confused by this at first - I think it would be nice to add a comment here
saying:
// set the header for the uncompressed bytes (for cache-on-write)
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:852 this weird
difference between the compressed and uncompressed cases could be improved, I
think:
Why not make uncompressedBytesWithHeader leave free space for the checksums at
the end of the array, and have it generate the checksums directly into that
space?
Or change generateChecksums to take another array as an argument, rather than
having it append to the same 'baos'?
It's currently quite confusing that "onDiskChecksum" ends up empty in the
compressed case, even though we _did_ write a checksum lumped in with the
onDiskBytesWithHeader.
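For the second option, a possible shape (sketch; the parameter names are just
illustrative and CRC32 stands in for dispatching on the configured checksum
type):

  // Writes one checksum per bytesPerChecksum-sized chunk of
  // data[dataOffset, dataOffset + dataLen) into checksumsOut starting at
  // checksumsOffset, so the compressed and uncompressed paths can both pass
  // an explicit destination instead of sometimes appending to 'baos'.
  static void generateChecksums(byte[] data, int dataOffset, int dataLen,
                                byte[] checksumsOut, int checksumsOffset,
                                int bytesPerChecksum) {
    int outPos = checksumsOffset;
    for (int off = dataOffset; off < dataOffset + dataLen; off += bytesPerChecksum) {
      int chunkLen = Math.min(bytesPerChecksum, dataOffset + dataLen - off);
      java.util.zip.CRC32 crc = new java.util.zip.CRC32();
      crc.update(data, off, chunkLen);
      int v = (int) crc.getValue();
      checksumsOut[outPos++] = (byte) (v >>> 24);
      checksumsOut[outPos++] = (byte) (v >>> 16);
      checksumsOut[outPos++] = (byte) (v >>> 8);
      checksumsOut[outPos++] = (byte) v;
    }
  }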
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1375-1379
Similar to the above comment about the block headers, I think we need to do our
own checksumming on the hfile metadata itself -- what about a corruption in the
file header? Alternatively, we could always use the checksummed stream when
loading the file-wide header, which is probably much simpler.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1545 confused
by this - if we don't have an HFileSystem, then wouldn't we assume that the
checksumming is done by the underlying dfs, and not use hbase checksums?
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1580 s/it
never changes/because it is marked final/
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1588-1590 this
isn't thread-safe: multiple threads might decrement and skip past -1, causing
checksum verification to never get re-enabled.
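An atomic countdown would avoid the race, e.g. (sketch; the field and method
names are just illustrative):

  // Count of remaining reads to serve without HBase checksums after a
  // verification failure. getAndDecrement() is atomic, and testing <= 0
  // rather than == -1 means no thread can skip past the point where
  // verification gets re-enabled.
  private final java.util.concurrent.atomic.AtomicInteger checksumOffCount =
      new java.util.concurrent.atomic.AtomicInteger(0);

  void onChecksumFailure(int numIOsToSkipVerification) {
    checksumOffCount.set(numIOsToSkipVerification);
  }

  boolean useHBaseChecksumForNextRead() {
    return checksumOffCount.getAndDecrement() <= 0;
  }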
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1599 add
comment here // checksum verification failed
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1620-1623 msg
should include file path
src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:53 typo: delegate
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3620 Given we
have rsServices.getFileSystem, why do we need to also pass this in?
REVISION DETAIL
https://reviews.facebook.net/D1521
> support checksums in HBase block cache
> --------------------------------------
>
> Key: HBASE-5074
> URL: https://issues.apache.org/jira/browse/HBASE-5074
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch,
> D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch,
> D1521.5.patch, D1521.5.patch
>
>
> The current implementation of HDFS stores the data in one block file and the
> metadata (checksum) in another block file. This means that every read into the
> HBase block cache actually consumes two disk iops, one to the datafile and
> one to the checksum file. This is a major problem for scaling HBase, because
> HBase is usually bottlenecked on the number of random disk iops that the
> storage-hardware offers.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira