[
https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224820#comment-13224820
]
Phabricator commented on HBASE-5521:
------------------------------------
mbautin has requested changes to the revision "HBASE-5521 [jira] Move
compression/decompression to an encoder specific encoding context".
Yongqiang: thanks for the revised version. I have some more comments
(inline). Also, could you please add some more unit tests specifically for the
encoding/decoding context stuff?
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:328
Rename this to internalEncodeKeyValues ("Encoding" -> "Encode").
Also, this should probably be at least protected, because a public method
called "internalFoo" is confusing.
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:336
Please replace this with just
if (encodingCtx.getClass() != HFileBlockDefaultEncodingContext.class)
This works because class objects are unique in Java within the same class
loader.
http://stackoverflow.com/questions/3738919/does-java-guarantee-that-object-getclass-object-getclass
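The suggested exact-class comparison can be sketched as below. EncodingContext, DefaultEncodingContext and CustomEncodingContext are hypothetical stand-ins for the real HFileBlockEncodingContext hierarchy, used only to illustrate the check:

```java
// Hypothetical stand-ins for HFileBlockEncodingContext and
// HFileBlockDefaultEncodingContext, just to illustrate the check.
class EncodingContext {}
class DefaultEncodingContext extends EncodingContext {}
class CustomEncodingContext extends DefaultEncodingContext {}

public class ExactClassCheck {
    // Within one class loader each class has exactly one Class object,
    // so != on getClass() is an exact-type test: unlike instanceof, it
    // also rejects subclasses of DefaultEncodingContext.
    static void checkContext(EncodingContext ctx) {
        if (ctx.getClass() != DefaultEncodingContext.class) {
            throw new IllegalArgumentException(
                "Unsupported encoding context type: "
                + ctx.getClass().getName());
        }
    }

    public static void main(String[] args) {
        checkContext(new DefaultEncodingContext());    // accepted
        try {
            checkContext(new CustomEncodingContext()); // rejected: subclass
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is stricter than instanceof, which is the point of the suggestion: a subclass may override behavior the default code path does not expect.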
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:348
What if encoding type is NONE? Would ENCODED_DATA still be valid here?
src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java:36
-> internalEncodeKeyValues
src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:28-37
Update the Javadoc to reflect the fact that encoder also deals with block
compression and describe the relation between these two steps.
src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java:109
wrong parameter name
src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:50
Fill out the description for the encoding parameter
src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:56
Get rid of this variable and use the constant directly
src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:346
-> internalEncodeKeyValues
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java:25
Readability:
one reader's -> a reader's
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java:38
need to done -> need to be done
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java:42-44
It is not clear which parameter is the input and which is the output. Please
fill in the descriptions.
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java:39
Make this final
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java:34
Not sure what is meant by "as a whole bytes buffer". Could you please re-word
this?
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java:59-60
Do these need to be package-private? Can they be made private?
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:26
the write's whole lifetime -> the writer's whole lifetime (I assume that is
what you meant)
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:25
one writer's encoder -> a writer's encoder
The "one writer's encoder" wording hints that an encoding context may
potentially be shared with other writers. I am assuming that is not the case.
If it is, could you please clarify the scope of sharing of an encoding context?
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:54
It is not clear what "prepare an encoding" actually means. Could you please
provide more detail here?
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:67
any action that need -> any action that needs
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:68-69
NONE could also be considered a "valid compression algorithm" by some. Please
remove the ambiguity.
src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java:78
-> internalEncodeKeyValues
src/main/java/org/apache/hadoop/hbase/io/hfile/Compression.java:332-333
Add descriptions. It looks like there is an assumption that the dest byte array
has enough space allocated to store the decompressed block.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:538
Add javadoc clarifying in what cases we use this context.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:687
Moving compression into the encoding framework introduces the need for separate
logic for compressing data blocks and all other blocks. Is it possible to
reduce code duplication between compression for data and non-data blocks?
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:795
Code style: add a space after "if".
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1397
to it -> in it.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1421
Make this line less than 80 characters.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java:94
create a encoder -> Create an encoder
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:44
Can this be made private and final?
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:171
The function used to return its result, but now it does not. Please document
the function's output in the javadoc.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:193
Add javadoc describing the input and the output.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:235
Add space between if and (
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:237
Add space between if and (
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:249
Add space between if and (
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:251
Add space between if and (
src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java:53
Where is the output now? Add javadoc.
src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java:65
This seems to assume that the payload starts at the beginning of the buffer's
backing array, and discards the buffer's arrayOffset() and limit(). Is that
intended? Why pass a ByteBuffer then?
Alternatively, should compressAfterEncoding take a starting offset and a
length along with the byte array?
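The pitfall described above can be shown with a sliced buffer. payload() is a hypothetical helper illustrating how compressAfterEncoding could honor the buffer's bookkeeping instead of reading the backing array from index 0:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BufferOffsetDemo {
    // Hypothetical helper: extract exactly the bytes a ByteBuffer covers,
    // honoring arrayOffset(), position() and limit() rather than assuming
    // the payload starts at the beginning of the backing array.
    static byte[] payload(ByteBuffer buf) {
        int start = buf.arrayOffset() + buf.position();
        return Arrays.copyOfRange(buf.array(), start, start + buf.remaining());
    }

    public static void main(String[] args) {
        byte[] backing = {0, 1, 2, 3, 4, 5};
        // A slice whose payload does NOT start at backing[0]:
        // after slice(), arrayOffset() is 2 and remaining() is 3.
        ByteBuffer slice = ByteBuffer.wrap(backing, 2, 3).slice();
        System.out.println(Arrays.toString(payload(slice))); // [2, 3, 4]
    }
}
```

Code that calls buf.array() and starts at index 0 would return {0, 1, 2} here, silently reading the wrong bytes, which is exactly why either the offsets should be respected or an explicit (array, offset, length) triple should be passed.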
src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java:100-102
Fix long line
src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java:119-123
Remove commented-out code
src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java:100
Does "Baos" stand for "Byte array output stream"? This function returns a
ByteArrayInputStream.
Please rename this function to remove the confusion. Also, please avoid using
"Baos" in function names.
Also please add javadoc.
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java:385
Please make this an assertion with a readable error message. No one will notice
"mismatch.." in test output or understand what it means.
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java:447
Add space between if and (
REVISION DETAIL
https://reviews.facebook.net/D2097
BRANCH
svn
> Move compression/decompression to an encoder specific encoding context
> ----------------------------------------------------------------------
>
> Key: HBASE-5521
> URL: https://issues.apache.org/jira/browse/HBASE-5521
> Project: HBase
> Issue Type: Improvement
> Reporter: He Yongqiang
> Assignee: He Yongqiang
> Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch,
> HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch, HBASE-5521.D2097.4.patch
>
>
> As part of working on HBASE-5313, we want to add a new columnar
> encoder/decoder. It makes sense to move compression to be part of
> encoder/decoder:
> 1) a scanner for a columnar encoded block can do lazy decompression of a
> specific part of a key value object
> 2) it avoids an extra byte-array copy from the encoder to the hblock-writer.
> If there is no encoder specified for a writer, the HBlock.Writer will use a
> default compression-context to do something very similar to today's code.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira