[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524687#comment-16524687
 ] 

Kuan-Po Tseng commented on HBASE-18201:
---------------------------------------

[~chia7712] Patch 003 contains the following changes:

Bugs(4)
- The ROW_INDEX_V1 encoder throws an error; it goes wrong in class 
EncodedDataBlock at
{code:java}
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
{code}
The cause is that ROW_INDEX_V1 writes _onDiskDataSize_ to _out_ (a 
DataOutputStream), whereas the other encoders write _onDiskDataSize_ directly 
into _baosBytes_ (a byte array). Since _onDiskDataSize_ is necessary in the 
following steps, we need to flush _out_ again after _endBlockEncoding_ so that 
_onDiskDataSize_ is actually written.

- DataBlockEncodingTool _checkStatistics_ could leave currentKV null. Fixed.

- DataBlockEncodingTool _checkStatistics_ was missing the MemstoreTS.

- _compressedStream.reset()_ should happen before 
_compressingStream.resetState()_, since with GZ _resetState()_ writes the 
header to the output stream. If _compressedStream.reset()_ is called after 
_compressingStream.resetState()_, the header is lost.
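
For the ROW_INDEX_V1 item above, the underlying effect is that a byte array 
copied out of the buffer before a late write does not contain that write. The 
following is only a minimal sketch using plain java.io, not the actual 
EncodedDataBlock code; the class name, method name, and written values are 
hypothetical stand-ins.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; this is not HBase code.
final class LateWriteSketch {

    // Returns {length of the early copy, length of the copy taken after the late write}.
    static int[] snapshotSizes() {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(baos);

            out.writeInt(0xCAFE);                   // stands in for the encoded cells
            out.flush();
            byte[] early = baos.toByteArray();      // copy taken before the late write

            out.writeInt(4);                        // stands in for a size field written via the stream
            out.flush();                            // flush again so the late write reaches the buffer
            byte[] refreshed = baos.toByteArray();  // fresh copy now includes the late write

            return new int[] { early.length, refreshed.length };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = snapshotSizes();
        System.out.println("early=" + sizes[0] + " refreshed=" + sizes[1]);
    }
}
```

The early copy stays at its original length, so anything that reads the size 
field from it would not find it; only the copy taken after the second flush 
has the extra bytes.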

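For the GZ ordering item above, the same effect can be reproduced with the 
JDK's own java.util.zip.GZIPOutputStream, whose constructor writes the GZIP 
header to the underlying stream right away. This is a sketch under that 
assumption, with the constructor standing in for _resetState()_; it does not 
use the Hadoop/HBase compression classes, and the class and method names are 
hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; this is not HBase code.
final class GzipHeaderOrderSketch {

    // Compresses a small payload; the flag selects the order of the two resets.
    static byte[] compress(boolean resetBufferFirst) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            buffer.write(new byte[] { 1, 2, 3 });   // stale bytes from a previous round

            if (resetBufferFirst) {
                buffer.reset();                     // clear the buffer first ...
            }
            // ... then start the GZIP stream, which writes its header immediately.
            GZIPOutputStream gz = new GZIPOutputStream(buffer);
            if (!resetBufferFirst) {
                buffer.reset();                     // wrong order: this wipes the header
            }

            gz.write("hello".getBytes(StandardCharsets.UTF_8));
            gz.finish();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if the bytes start with the GZIP magic number 0x1f 0x8b.
    static boolean hasGzipMagic(byte[] b) {
        return b.length >= 2 && (b[0] & 0xff) == 0x1f && (b[1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) {
        System.out.println("buffer reset first: " + hasGzipMagic(compress(true)));
        System.out.println("buffer reset last:  " + hasGzipMagic(compress(false)));
    }
}
```

Only the variant that clears the buffer before the header is written produces 
output that still begins with the GZIP magic bytes.
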
Tests(1)
- Exercise the whole DataBlockEncodingTool with the GZ compression algorithm 
to make sure it actually runs, rather than merely compiles.

Docs(1)
- Document how to use DataBlockEncodingTool: which options are necessary, what 
every option does in detail, and the output the tool produces.

Others(1)
- Rename the option OPT_ENCODING_ALGORITHM to OPT_COMPRESSION_ALGORITHM.

> add UT and docs for DataBlockEncodingTool
> -----------------------------------------
>
>                 Key: HBASE-18201
>                 URL: https://issues.apache.org/jira/browse/HBASE-18201
>             Project: HBase
>          Issue Type: Task
>          Components: tooling
>            Reporter: Chia-Ping Tsai
>            Assignee: Kuan-Po Tseng
>            Priority: Minor
>              Labels: beginner
>         Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There is no example, documents, or tests for DataBlockEncodingTool. We should 
> have it friendly if any use case exists. Otherwise, we should just get rid of 
> it because DataBlockEncodingTool presumes that the implementation of cell 
> returned from DataBlockEncoder is KeyValue. The presume may obstruct the 
> cleanup of KeyValue references in the code base of read/write path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)