[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201678#comment-13201678 ]

Phabricator commented on HBASE-5074:
------------------------------------

mbautin has commented on the revision "[jira] [HBASE-5074] Support checksums in 
HBase block cache".

  Some more comments. I am still concerned about the amount of copied-and-pasted 
code in the backwards-compatibility checking. Is there a way to minimize it?
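
  One possibility, sketched below with illustrative names (this is not the 
actual patch API), is to funnel every minor-version comparison through a single 
shared helper, so the reader, the writer, and the tests stop duplicating it:

    final class ChecksumCompat {
      /** Minor version of HFile blocks written before HBase-level checksums. */
      static final int MINOR_VERSION_NO_CHECKSUM = 0;

      /**
       * Single source of truth for the format decision, replacing copies of
       * the same minor-version comparison at every call site.
       */
      static boolean usesChecksums(int minorVersion) {
        return minorVersion > MINOR_VERSION_NO_CHECKSUM;
      }

      private ChecksumCompat() { }  // static helper holder, not instantiable
    }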

  I also mentioned this in the comments below, but it would probably make sense 
to add more "canned" files in the no-checksum format, generated by the old 
writer and read with the new reader, the same way HFile v1 compatibility is 
ensured. I don't mind keeping the old writer code around in the unit test, but 
I think it is best to remove as much code from that legacy writer as possible 
(the versatile API, toString, etc.) and leave only the parts necessary to 
generate the file for testing.
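
  To make the canned-file idea concrete, here is a self-contained sketch: a 
stripped-down legacy writer produces no-checksum bytes once, and the test 
asserts that the version-aware reader still decodes them. The block layout, 
class, and method names are invented for illustration; this is not the real 
HFileBlock format or the HBASE-5074 code.

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    public final class CannedNoChecksumBlockTest {

      static final int MINOR_VERSION_NO_CHECKSUM = 0;

      // Bare-minimum legacy writer: [minorVersion][dataLen][data], and
      // nothing else -- no versatile API, no toString.
      static byte[] legacyWrite(byte[] data) {
        return ByteBuffer.allocate(8 + data.length)
            .putInt(MINOR_VERSION_NO_CHECKSUM)
            .putInt(data.length)
            .put(data)
            .array();
      }

      // Version-aware reader: falls back to the checksum-free layout when
      // it sees minor version 0, as the patched reader is expected to.
      static byte[] read(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        int minorVersion = buf.getInt();
        byte[] data = new byte[buf.getInt()];
        buf.get(data);
        if (minorVersion > MINOR_VERSION_NO_CHECKSUM) {
          buf.getLong(); // checksum trailer; the real reader verifies it
        }
        return data;
      }

      public static void main(String[] args) {
        // In the real test the canned bytes would be a committed fixture
        // file generated once by the legacy writer, like the HFile v1 data.
        byte[] payload = "canned block contents".getBytes(StandardCharsets.UTF_8);
        byte[] canned = legacyWrite(payload);
        if (!Arrays.equals(payload, read(canned))) {
          throw new AssertionError("new reader failed on the old format");
        }
        System.out.println("old-format block read OK");
      }
    }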

INLINE COMMENTS
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:164
 Long line
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:83
 Can this be made private if it is not accessed outside of this class?
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:78
 Use ALL_CAPS for constants
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:76
 There seems to be a lot of copy-and-paste from the old HFileBlock code here. 
Is there a way to reduce that?

  I think we also need to create some canned old-format HFiles (using the old 
code) and read them with the new reader code as part of the test.
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:365
 Make this class final.

  Also, it would make sense to strip this class down to the bare minimum of 
code required to test compatibility (if you have not done that already).
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:800
 Do we ever use this function?
  
src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java:188
 Is 0 the minor version with no checksums? If so, please replace it with a 
constant for readability.
  
src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java:356
 Is 0 the minor version with no checksums? If so, please replace it with a 
constant for readability.
  
src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java:300
 Is 0 the minor version with no checksums? If so, please replace it with a 
constant for readability (a sketch of such a constant follows these comments).
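
  As a sketch of the constant suggested in the three comments above (the class 
and constant names are illustrative, and createWriter() is a hypothetical 
stand-in for the test helpers being flagged):

    /** Holder for block-format constants. */
    public final class HFileBlockVersions {
      /** Minor version of HFile blocks written without HBase-level checksums. */
      public static final int MINOR_VERSION_NO_CHECKSUM = 0;

      private HFileBlockVersions() { }
    }

    // A flagged call site then documents itself:
    //
    //   before:  createWriter(fs, path, blockSize, 0);
    //   after:   createWriter(fs, path, blockSize,
    //                         HFileBlockVersions.MINOR_VERSION_NO_CHECKSUM);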

REVISION DETAIL
  https://reviews.facebook.net/D1521

> support checksums in HBase block cache
> --------------------------------------
>
>                 Key: HBASE-5074
>                 URL: https://issues.apache.org/jira/browse/HBASE-5074
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: D1521.1.patch, D1521.2.patch, D1521.3.patch
>
>
> The current implementation of HDFS stores the data in one block file and the 
> metadata (checksum) in another block file. This means that every read into the 
> HBase block cache actually consumes two disk iops: one to the data file and 
> one to the checksum file. This is a major problem for scaling HBase, because 
> HBase is usually bottlenecked on the number of random disk iops the storage 
> hardware offers.
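
As a toy illustration of the read path described in the quoted report: if the 
checksum travels inside the block itself, one positioned read returns data and 
checksum together, instead of one iop against the data file plus one against 
the separate HDFS checksum file. The layout below is invented for the demo and 
is not the actual HBASE-5074 format.

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.CRC32;

    public final class InlineChecksumSketch {

      // Writer side: append a CRC32 of the payload to the block.
      static byte[] sealBlock(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + 8)
            .put(payload)
            .putLong(crc.getValue())
            .array();
      }

      // Reader side: the single buffer already holds everything needed to
      // verify integrity -- no second file access required.
      static boolean verifyBlock(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        byte[] payload = new byte[block.length - 8];
        buf.get(payload);
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue() == buf.getLong();
      }

      public static void main(String[] args) {
        byte[] block = sealBlock(
            "data and checksum in one read".getBytes(StandardCharsets.UTF_8));
        System.out.println("verified: " + verifyBlock(block));
      }
    }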

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
