[ 
https://issues.apache.org/jira/browse/HBASE-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249383#comment-13249383
 ] 

Matt Corgan commented on HBASE-5720:
------------------------------------

Sounds fine to me. .94 is the important fix, while trunk has some time for more 
thorough testing, and things will probably get moved around even more by other 
features. 

Eventually it would be nice to isolate the code that deals with the fragile 
byte[]s inside HFile blocks into separate src directories or projects. Then 
have that code implement interfaces similar to DataBlockEncoder.java. This 
would: 
* make it more testable, like a normal in-memory data structure without having 
to set up heavyweight testing environments 
* separate the encoding concerns from IO concerns. Once the checksum has been 
verified, encoders/decoders should not even know what an IOException is 
* strongly discourage people from modifying anything in the codec packages 
without knowing what they're getting into 
* ensure the main project code only references the interfaces and not any codec 
internals (see if main project compiles without codecs in classpath) 
* make it easier for contributors to develop and profile the codecs without 
having to become experts in all aspects of HBase 
* help to simplify the main project. Imagine if the gzip or snappy internals 
were sprinkled throughout the regionserver code. Yikes.
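
As a rough illustration of the kind of isolation described above, here is a 
minimal sketch. BlockCodec and RunLengthCodec are hypothetical names invented 
for this example; they are not HBase classes and do not mirror the actual 
DataBlockEncoder interface. The point is only that a codec which maps byte[] 
to byte[] and declares no IOException can be tested like any in-memory data 
structure.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical codec interface: byte[] in, byte[] out, no IOException. */
interface BlockCodec {
    byte[] encode(byte[] rawBlock);
    byte[] decode(byte[] encodedBlock);
}

/** Toy run-length codec, used only for illustration; not an HBase encoder. */
final class RunLengthCodec implements BlockCodec {
    @Override
    public byte[] encode(byte[] raw) {
        // Worst case: every byte is its own run, so two output bytes per input byte.
        ByteBuffer out = ByteBuffer.allocate(raw.length * 2);
        int i = 0;
        while (i < raw.length) {
            byte value = raw[i];
            int run = 1;
            // Cap runs at 255 so the length fits in one unsigned byte.
            while (i + run < raw.length && raw[i + run] == value && run < 255) {
                run++;
            }
            out.put((byte) run).put(value);
            i += run;
        }
        return Arrays.copyOf(out.array(), out.position());
    }

    @Override
    public byte[] decode(byte[] encoded) {
        if (encoded.length % 2 != 0) {
            // A malformed block is a data error, not an IO error.
            throw new IllegalArgumentException("encoded block must be (length, value) pairs");
        }
        int total = 0;
        for (int i = 0; i < encoded.length; i += 2) {
            total += encoded[i] & 0xFF;
        }
        byte[] result = new byte[total];
        int pos = 0;
        for (int i = 0; i < encoded.length; i += 2) {
            int run = encoded[i] & 0xFF;
            Arrays.fill(result, pos, pos + run, encoded[i + 1]);
            pos += run;
        }
        return result;
    }
}
```

A round trip such as codec.decode(codec.encode(raw)) can then be checked in a 
plain unit test, with no cluster, filesystem, or regionserver involved.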
                
> HFileDataBlockEncoderImpl uses wrong header size when reading HFiles with no 
> checksums
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-5720
>                 URL: https://issues.apache.org/jira/browse/HBASE-5720
>             Project: HBase
>          Issue Type: Bug
>          Components: io, regionserver
>    Affects Versions: 0.94.0
>            Reporter: Matt Corgan
>            Priority: Blocker
>             Fix For: 0.94.0
>
>         Attachments: 5720-trunk-v2.txt, 5720-trunk.txt, 5720v4.txt, 
> 5720v4.txt, 5720v4.txt, HBASE-5720-v1.patch, HBASE-5720-v2.patch, 
> HBASE-5720-v3.patch
>
>
> When reading a .92 HFile without checksums, encoding it, and storing in the 
> block cache, the HFileDataBlockEncoderImpl always allocates a dummy header 
> appropriate for checksums even though there are none.  This corrupts the 
> byte[].
> Attaching a patch that allocates a DUMMY_HEADER_NO_CHECKSUM in that case 
> which I think is the desired behavior.
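
The idea behind the fix can be sketched roughly as follows. This is not the 
attached patch: DummyHeaders, wrapWithDummyHeader, and headerSize are 
hypothetical names, and the two header sizes are assumed values chosen for 
illustration. It only shows why reserving the checksum-sized header for a 
block that has no checksums leaves the payload at an offset the reader does 
not expect.

```java
/** Hypothetical helper; names and sizes are illustrative, not the HBase source. */
final class DummyHeaders {
    // Assumed sizes for illustration only.
    static final int HEADER_SIZE_NO_CHECKSUM = 24;
    static final int HEADER_SIZE_WITH_CHECKSUM = 33;

    /** Header size a reader would expect for a block of the given kind. */
    static int headerSize(boolean hasChecksums) {
        return hasChecksums ? HEADER_SIZE_WITH_CHECKSUM : HEADER_SIZE_NO_CHECKSUM;
    }

    /** Reserves a zeroed dummy header of the matching size in front of the payload. */
    static byte[] wrapWithDummyHeader(byte[] payload, boolean hasChecksums) {
        int header = headerSize(hasChecksums);
        byte[] block = new byte[header + payload.length];
        System.arraycopy(payload, 0, block, header, payload.length);
        return block;
    }
}
```

If the writer always uses the checksum-sized header while the reader skips 
only the no-checksum size, the reader starts in the zeroed padding instead of 
at the first payload byte.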

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
