[
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201679#comment-13201679
]
Phabricator commented on HBASE-5074:
------------------------------------
mbautin has commented on the revision "[jira] [HBASE-5074] Support checksums in
HBase block cache".
Some more comments. I am still concerned about the amount of copy-and-paste
code in the backwards-compatibility checking. Is there a way to minimize it?
I also mentioned this in the inline comments below, but it would probably make
sense to add more "canned" files in the no-checksum format, generated by the
old writer and read back with the new reader, the same way HFile v1
compatibility is ensured. I don't mind keeping the old writer code around in
the unit test, but I think it is best to strip as much code from that legacy
writer as possible (the versatile API, toString, and so on) and leave only the
parts necessary to generate the files for testing.
INLINE COMMENTS
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:164
Long line
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:83
Can this be made private if it is not accessed outside of this class?
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:78
Use ALL_CAPS for constants
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:76
There seems to be a lot of copy-and-paste from the old HFileBlock code here.
Is there a way to reduce that?
I think we also need to create some canned old-format HFiles (using the old
code) and read them with the new reader code as part of the test.
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:365
Make this class final.
Also, it would make sense to strip this class down as much as possible to
maintain the bare minimum of code required to test compatibility (if you have
not done that already).
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:800
Do we ever use this function?
src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java:188
Is 0 the minor version with no checksums? If so, please replace it with a
constant for readability.
src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java:356
Is 0 the minor version with no checksums? If so, please replace it with a
constant for readability.
src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java:300
Is 0 the minor version with no checksums? If so, please replace it with a
constant for readability.
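The readability fix requested in the three comments above can be sketched as follows. The constant name `MINOR_VERSION_NO_CHECKSUM` and the helper are assumptions for illustration; the actual name and location chosen in the patch may differ.

```java
public class MinorVersions {

    // Hypothetical named constant replacing the bare literal 0 that the
    // three call sites above pass for "minor version without checksums".
    public static final int MINOR_VERSION_NO_CHECKSUM = 0;

    // Example helper: version-dependent behavior reads much more clearly
    // against a named constant than against a magic number.
    static boolean supportsChecksums(int minorVersion) {
        return minorVersion > MINOR_VERSION_NO_CHECKSUM;
    }
}
```

Callers such as CreateRandomStoreFile, HFileReadWriteTest, and TestCompoundBloomFilter would then pass `MinorVersions.MINOR_VERSION_NO_CHECKSUM` instead of `0`.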
REVISION DETAIL
https://reviews.facebook.net/D1521
> support checksums in HBase block cache
> --------------------------------------
>
> Key: HBASE-5074
> URL: https://issues.apache.org/jira/browse/HBASE-5074
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: D1521.1.patch, D1521.2.patch, D1521.3.patch
>
>
> The current implementation of HDFS stores the data in one block file and the
> metadata (checksums) in another block file. This means that every read into the
> HBase block cache actually consumes two disk iops: one to the data file and
> one to the checksum file. This is a major problem for scaling HBase, because
> HBase is usually bottlenecked on the number of random disk iops that the
> storage hardware offers.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira