[ https://issues.apache.org/jira/browse/HBASE-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249383#comment-13249383 ]
Matt Corgan commented on HBASE-5720:
------------------------------------
Sounds fine to me. .94 is the important fix, while trunk has some time for more
thorough testing, and things will probably get moved around even more by other
features.
Eventually it would be nice to isolate the code that deals with the fragile
byte[]s inside HFile blocks into separate src directories or projects, and then
have that code implement interfaces similar to DataBlockEncoder.java. This
would:
* make it more testable, like a normal in-memory data structure without having
to set up heavyweight testing environments
* separate the encoding concerns from IO concerns. After the checksum happens,
encoders/decoders should not even know what an IOException is
* strongly discourage people from modifying anything in the codec packages
without knowing what they're getting into
* ensure the main project code only references the interfaces and not any codec
internals (see if main project compiles without codecs in classpath)
* make it easier for contributors to develop and profile the codecs without
having to become experts in all aspects of hbase
* help to simplify the main project. Imagine if the gzip or snappy internals
were sprinkled throughout the regionserver code. Yikes.
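
As a rough illustration of the separation described in the list above, here is a
minimal sketch of a codec interface that only sees in-memory buffers and declares
no IOException. The names BlockCodec and RunLengthCodec are hypothetical and are
not taken from HBase; the run-length scheme is only a toy stand-in for a real
data block encoder.

```java
import java.nio.ByteBuffer;

// Hypothetical interface (not an HBase API): pure buffer-to-buffer
// transformation, with no IOException anywhere in the signatures.
interface BlockCodec {
    ByteBuffer encode(ByteBuffer rawBlock);
    ByteBuffer decode(ByteBuffer encodedBlock);
}

// Toy run-length codec standing in for a real encoder. Each run is stored
// as a (count, value) pair, with count limited to 255.
final class RunLengthCodec implements BlockCodec {
    @Override
    public ByteBuffer encode(ByteBuffer rawBlock) {
        ByteBuffer in = rawBlock.duplicate(); // leave the caller's position alone
        // Worst case is two output bytes per input byte.
        ByteBuffer out = ByteBuffer.allocate(Math.max(2, in.remaining() * 2));
        while (in.hasRemaining()) {
            byte value = in.get();
            int run = 1;
            while (in.hasRemaining() && run < 255 && in.get(in.position()) == value) {
                in.get();
                run++;
            }
            out.put((byte) run).put(value);
        }
        out.flip();
        return out;
    }

    @Override
    public ByteBuffer decode(ByteBuffer encodedBlock) {
        ByteBuffer in = encodedBlock.duplicate();
        // First pass: sum the run lengths to size the output buffer.
        int size = 0;
        for (int i = in.position(); i < in.limit(); i += 2) {
            size += in.get(i) & 0xFF;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        // Second pass: expand each (count, value) pair.
        while (in.hasRemaining()) {
            int run = in.get() & 0xFF;
            byte value = in.get();
            for (int i = 0; i < run; i++) {
                out.put(value);
            }
        }
        out.flip();
        return out;
    }
}
```

Because nothing here touches files, streams, or the block cache, a round trip
can be checked in a plain unit test with a handful of bytes, which is the
"normal in-memory data structure" style of testing the first bullet asks for.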
> HFileDataBlockEncoderImpl uses wrong header size when reading HFiles with no
> checksums
> --------------------------------------------------------------------------------------
>
> Key: HBASE-5720
> URL: https://issues.apache.org/jira/browse/HBASE-5720
> Project: HBase
> Issue Type: Bug
> Components: io, regionserver
> Affects Versions: 0.94.0
> Reporter: Matt Corgan
> Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5720-trunk-v2.txt, 5720-trunk.txt, 5720v4.txt,
> 5720v4.txt, 5720v4.txt, HBASE-5720-v1.patch, HBASE-5720-v2.patch,
> HBASE-5720-v3.patch
>
>
> When reading a .92 HFile without checksums, encoding it, and storing in the
> block cache, the HFileDataBlockEncoderImpl always allocates a dummy header
> appropriate for checksums even though there are none. This corrupts the
> byte[].
> Attaching a patch that allocates a DUMMY_HEADER_NO_CHECKSUM in that case
> which I think is the desired behavior.
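
A minimal sketch of the idea in the description above, not the actual patch:
the dummy header reserved in front of the encoded payload has to match the
header size of the file being read. The class HeaderSizeSketch, its methods,
and the sizes 24 and 33 are assumptions made for illustration; they are not
quoted from the HBase source.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not HBase code.
final class HeaderSizeSketch {
    // Assumed sizes, for illustration only.
    static final int HEADER_SIZE_NO_CHECKSUM = 24;
    static final int HEADER_SIZE_WITH_CHECKSUM = 33;

    static int headerSize(boolean hasChecksums) {
        return hasChecksums ? HEADER_SIZE_WITH_CHECKSUM : HEADER_SIZE_NO_CHECKSUM;
    }

    // Reserve a zero-filled dummy header of the right size, then append the payload.
    static ByteBuffer withDummyHeader(byte[] encodedPayload, boolean hasChecksums) {
        int header = headerSize(hasChecksums);
        ByteBuffer block = ByteBuffer.allocate(header + encodedPayload.length);
        block.position(header);
        block.put(encodedPayload);
        block.rewind();
        return block;
    }

    // Skip the header the reader believes is present and return the rest.
    static byte[] payload(ByteBuffer block, boolean hasChecksums) {
        ByteBuffer view = block.duplicate();
        view.position(headerSize(hasChecksums));
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }
}
```

With these assumed sizes, a block written with the no-checksum header but read
with the checksum header size loses the first 9 payload bytes, which is the
kind of writer/reader disagreement that corrupts the byte[].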