[ 
https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224820#comment-13224820
 ] 

Phabricator commented on HBASE-5521:
------------------------------------

mbautin has requested changes to the revision "HBASE-5521 [jira] Move 
compression/decompression to an encoder specific encoding context".

  Yongqiang: thanks for the revised version. I have some more comments 
(inline). Also, could you please add some more unit tests specifically for the 
encoding/decoding context stuff?

INLINE COMMENTS
  
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:328
 Rename this to internalEncodeKeyValues ("Encoding" -> "Encode").

  Also, this should probably be at least protected, because a public method 
called "internalFoo" is confusing.
  
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:336
 Please replace this with just

    if (encodingCtx.getClass() != HFileBlockDefaultEncodingContext.class)

  This works because class objects are unique in Java within the same class 
loader.

  
http://stackoverflow.com/questions/3738919/does-java-guarantee-that-object-getclass-object-getclass
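  To illustrate (a minimal standalone sketch with hypothetical stand-in classes, not the actual HBase types): within one class loader each class has exactly one Class object, so reference comparison is a reliable exact-type check, unlike instanceof, which also matches subclasses.

```java
public class ClassIdentityDemo {
    // Hypothetical stand-ins for HFileBlockEncodingContext and its default impl.
    static class EncodingContext {}
    static class DefaultEncodingContext extends EncodingContext {}

    public static void main(String[] args) {
        EncodingContext ctx = new DefaultEncodingContext();
        // Exact-type check: true only for DefaultEncodingContext itself,
        // because Class objects are unique per class loader.
        boolean exact = ctx.getClass() == DefaultEncodingContext.class;
        // instanceof would also be true for any subclass of DefaultEncodingContext.
        boolean instance = ctx instanceof DefaultEncodingContext;
        System.out.println(exact + " " + instance);  // prints "true true"
    }
}
```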
  
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:348
 What if encoding type is NONE? Would ENCODED_DATA still be valid here?
  
src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java:36
 -> internalEncodeKeyValues
src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:28-37
 Update the Javadoc to reflect the fact that the encoder now also deals with
block compression, and describe the relation between these two steps.

src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java:109
 Wrong parameter name.

src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:50
 Fill out the description for the encoding parameter.

src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:56
 Get rid of this variable and use the constant directly.
  
src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:346 
-> internalEncodeKeyValues
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java:25
 Readability:

  one reader's -> a reader's
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java:38
 need to done -> need to be done
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java:42-44
 It is not clear what parameter is input and what is output. Please fill in the 
descriptions.
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java:39
 Make this final
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java:34
 Not sure what is meant by "as a whole bytes buffer". Could you please re-word 
this?
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java:59-60
 Do these need to be package-private? Can they be made private?
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:26
 the write's whole lifetime -> the writer's whole lifetime (I assume that is 
what you meant)
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:25
 one writer's encoder -> a writer's encoder

  The "one writer's encoder" wording hints that an encoding context may 
potentially be shared with other writers. I am assuming that is not the case. 
If it is, could you please clarify the scope of sharing of an encoding context?
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:54
 It is not clear what "prepare an encoding" actually means. Could you please 
provide more detail here?
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:67
 any action that need -> any action that needs
  
src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java:68-69
 NONE could also be considered a "valid compression algorithm" by some. Please 
remove the ambiguity.
  
src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java:78
 -> internalEncodeKeyValues

src/main/java/org/apache/hadoop/hbase/io/hfile/Compression.java:332-333
 Add descriptions. It looks like there is an assumption that the dest byte
array has enough space allocated to store the decompressed block.
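  A minimal standalone sketch of that implicit contract (illustrative names, not the actual Compression.java API), using java.util.zip for concreteness: the caller pre-sizes dest from the block's known uncompressed size, and the helper fails loudly if the assumption is violated.

```java
public class DecompressDemo {
    /**
     * Illustrative decompressor. The caller must allocate dest with room for
     * the whole uncompressed block, as the review comment suspects the real
     * code assumes; destOffset and uncompressedSize make that contract explicit.
     */
    static void decompress(byte[] src, int srcLen,
                           byte[] dest, int destOffset, int uncompressedSize) {
        java.util.zip.Inflater inflater = new java.util.zip.Inflater();
        try {
            inflater.setInput(src, 0, srcLen);
            int n = inflater.inflate(dest, destOffset, uncompressedSize);
            if (n != uncompressedSize) {
                throw new RuntimeException("short decompression: got " + n
                    + " bytes, expected " + uncompressedSize);
            }
        } catch (java.util.zip.DataFormatException e) {
            throw new RuntimeException("corrupt compressed block", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] block = "key1value1key2value2"
            .getBytes(java.nio.charset.StandardCharsets.UTF_8);

        // Compress the block so we have something to decompress.
        java.util.zip.Deflater deflater = new java.util.zip.Deflater();
        deflater.setInput(block);
        deflater.finish();
        byte[] compressed = new byte[64];
        int clen = deflater.deflate(compressed);
        deflater.end();

        byte[] dest = new byte[block.length];  // pre-sized by the caller
        decompress(compressed, clen, dest, 0, block.length);
        System.out.println(new String(dest, java.nio.charset.StandardCharsets.UTF_8));
    }
}
```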
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:538
 Add javadoc clarifying in what cases we use this context.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:687
 Moving compression into the encoding framework introduces the need for
separate logic for compressing data blocks and all other blocks. Is it
possible to reduce code duplication between compression for data and non-data
blocks?

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:795
 Code style: add a space after "if".

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1397
 to it -> in it

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1421
 Make this line less than 80 characters.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java:94
 create a encoder -> Create an encoder
  
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:44
 Can this be made private and final?
  
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:171
 The function used to return its result, but now it does not. Please document
the function's output in the javadoc.
  
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:193
 Add javadoc describing the input and the output.
  
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:235
 Add space between if and (
  
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:237
 Add space between if and (
  
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:249
 Add space between if and (
  
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:251
 Add space between if and (
src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java:53
 Where is the output now? Add javadoc.

src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java:65
 This seems to assume that the payload starts at the beginning of the buffer's
backing array, and discards the buffer's arrayOffset() and limit(). Is that
intended? Why pass a ByteBuffer then?

  Alternatively, should compressAfterEncoding take a starting offset and a 
length along with the byte array?
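  A small standalone illustration of the pitfall (not HBase code): for a sliced buffer the payload starts at arrayOffset(), not index 0, and runs for limit() bytes, so indexing array() directly from 0 reads the wrong bytes.

```java
import java.nio.ByteBuffer;

public class ByteBufferOffsetDemo {
    public static void main(String[] args) {
        byte[] backing = new byte[16];
        for (int i = 0; i < backing.length; i++) backing[i] = (byte) i;

        // A slice whose payload does NOT start at index 0 of the backing array:
        // position 0, limit 8, arrayOffset 4.
        ByteBuffer buf = ByteBuffer.wrap(backing, 4, 8).slice();

        // WRONG (what the comment suspects the code does): ignores the offset.
        byte firstWrong = buf.array()[0];                  // reads 0

        // RIGHT: honor arrayOffset() and limit().
        byte firstRight = buf.array()[buf.arrayOffset()];  // reads 4
        int length = buf.limit();                          // 8

        System.out.println(firstWrong + " " + firstRight + " " + length);
        // prints "0 4 8"
    }
}
```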
  
src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java:100-102
 Fix long line
  
src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java:119-123
 Remove commented-out code
  
src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java:100
 Does "Baos" stand for "Byte array output stream"? This function returns a 
ByteArrayInputStream.

  Please rename this function to remove the confusion. Also, please avoid using 
"Baos" in function names.

  Also please add javadoc.
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java:385
 Please make this an assertion with a readable error message. No one will
notice "mismatch.." in test output or understand what it means.
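  Something in this spirit (a hypothetical standalone helper, not the actual test code): fail loudly with a message that names the block and both values, instead of printing "mismatch.." and moving on.

```java
public class ReadableAssertDemo {
    // Hypothetical check: the failure message identifies what was compared
    // and shows both the expected and the actual value.
    static void assertBytesEqual(String context, int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError(context + ": expected " + expected
                + " uncompressed bytes but got " + actual);
        }
    }

    public static void main(String[] args) {
        assertBytesEqual("block #3 (GZ, encoding PREFIX)", 4096, 4096); // passes
        try {
            assertBytesEqual("block #4 (GZ, encoding DIFF)", 4096, 512);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
            // prints "block #4 (GZ, encoding DIFF): expected 4096 uncompressed
            // bytes but got 512"
        }
    }
}
```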
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java:447
 Add space between if and (

REVISION DETAIL
  https://reviews.facebook.net/D2097

BRANCH
  svn

                
> Move compression/decompression to an encoder specific encoding context
> ----------------------------------------------------------------------
>
>                 Key: HBASE-5521
>                 URL: https://issues.apache.org/jira/browse/HBASE-5521
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, 
> HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch, HBASE-5521.D2097.4.patch
>
>
> As part of working on HBASE-5313, we want to add a new columnar 
> encoder/decoder. It makes sense to move compression to be part of 
> encoder/decoder:
> 1) a scanner for a columnar encoded block can do lazy decompression to a 
> specific part of a key value object
> 2) avoid an extra bytes copy from encoder to hblock-writer. 
> If there is no encoder specified for a writer, the HBlock.Writer will use a 
> default compression-context to do something very similar to today's code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
