[ https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527455#comment-16527455 ]

Kuan-Po Tseng commented on HBASE-18201:
---------------------------------------

{quote}BufferedDataBlockEncoder should work as well? Those encoders don't write onDiskDataSize.{quote}
BufferedDataBlockEncoder#endBlockEncoding does not write onDiskDataSize, but it still writes an integer called unencodedDataSizeWritten:
{code:java}
  @Override
  public void endBlockEncoding(HFileBlockEncodingContext encodingCtx, DataOutputStream out,
      byte[] uncompressedBytesWithHeader) throws IOException {
    BufferedDataBlockEncodingState state =
        (BufferedDataBlockEncodingState) encodingCtx.getEncodingState();
    // Write the unencodedDataSizeWritten (with header size)
    Bytes.putInt(uncompressedBytesWithHeader,
        HConstants.HFILEBLOCK_HEADER_SIZE + DataBlockEncoding.ID_SIZE,
        state.unencodedDataSizeWritten);
    postEncoding(encodingCtx);
  }
{code}
{quote}That's what i meant, but i'm not sure if it should.{quote}
Yeah, DataBlockEncodingTool#checkStatistics iteratively calls KeyValue#getBuffer and writes the result to uncompressedOutputStream. KeyValue#getBuffer returns the backing array, which should contain all elements of the KV.
{code:java}
uncompressedOutputStream.write(currentKV.getBuffer(), currentKV.getOffset(), currentKV.getLength());
{code}

> add UT and docs for DataBlockEncodingTool
> -----------------------------------------
>
>                 Key: HBASE-18201
>                 URL: https://issues.apache.org/jira/browse/HBASE-18201
>             Project: HBase
>          Issue Type: Sub-task
>          Components: tooling
>            Reporter: Chia-Ping Tsai
>            Assignee: Kuan-Po Tseng
>            Priority: Minor
>              Labels: beginner
>         Attachments: HBASE-18201.master.001.patch,
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch,
> HBASE-18201.master.003.patch
>
>
> There is no example, documents, or tests for DataBlockEncodingTool. We should
> have it friendly if any use case exists. Otherwise, we should just get rid of
> it because DataBlockEncodingTool presumes that the implementation of cell
> returned from DataBlockEncoder is KeyValue.
> The presumption may obstruct the cleanup of KeyValue references in the
> code base of the read/write path.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)