[ https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527156#comment-16527156 ]
Reid Chan commented on HBASE-18201:
-----------------------------------

{quote}
Encoder ROW_INDEX_V1 throws an error; things go wrong in class EncodedDataBlock at
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
The problem is that ROW_INDEX_V1 writes onDiskDataSize to out (DataOutputStream), while the others write onDiskDataSize into baosBytes (byte array) directly. Since onDiskDataSize is necessary in the next steps, we need to flush out again after endBlockEncoding to write onDiskDataSize.
{quote}
I think adjusting the call order as follows should work. There is no need to add another if branch, which is kind of confusing.
{code}
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
baos.flush();
baosBytes = baos.toByteArray();
{code}

bq. boolean useTag = (prevKV.getTagsLength() > 0);
Could we set {{useTag = currentKV.getTagsLength() > 0}} in the while loop above? Once it is true, the remaining cells need not be checked.

{code}
HStoreFile hsf = new HStoreFile(fs, path, conf, cacheConf, BloomType.NONE, true);
StoreFileReader reader = hsf.getReader();
boolean useTag = reader.getHFileReader().getFileContext().isIncludesTags();
{code}
It seems heavy to create an HStoreFile instance just to use its {{isIncludesTags}} method.

A few style problems: spacing around '=', '{', '}', '(', ')'.

> add UT and docs for DataBlockEncodingTool
> -----------------------------------------
>
>                 Key: HBASE-18201
>                 URL: https://issues.apache.org/jira/browse/HBASE-18201
>             Project: HBase
>          Issue Type: Sub-task
>          Components: tooling
>            Reporter: Chia-Ping Tsai
>            Assignee: Kuan-Po Tseng
>            Priority: Minor
>              Labels: beginner
>         Attachments: HBASE-18201.master.001.patch,
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch,
> HBASE-18201.master.003.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We
> should make it user-friendly if any use case exists. Otherwise, we should just
> get rid of it, because DataBlockEncodingTool presumes that the cell
> implementation returned from DataBlockEncoder is KeyValue. That presumption may
> obstruct the cleanup of KeyValue references in the read/write path code base.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
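
For illustration, the call-order point raised in the comment can be reproduced with plain java.io streams. This is only a minimal sketch, not HBase code: the class {{SnapshotOrderDemo}} and its simplified {{endBlockEncoding}} are hypothetical stand-ins, and the sketch assumes only that the encoder appends a size field to the stream, as the comment describes for ROW_INDEX_V1.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SnapshotOrderDemo {

    // Hypothetical stand-in for an encoder that finishes a block by writing
    // a trailing size field to the stream (not the real HBase signature).
    static void endBlockEncoding(DataOutputStream out, int onDiskDataSize) throws IOException {
        out.writeInt(onDiskDataSize);
    }

    // Takes the byte[] snapshot BEFORE endBlockEncoding: toByteArray() returns
    // a copy, so the trailing size field never reaches the snapshot.
    static byte[] snapshotBefore() {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(baos);
            out.writeLong(42L); // 8 bytes standing in for encoded data
            out.flush();
            byte[] baosBytes = baos.toByteArray();
            endBlockEncoding(out, 8);
            return baosBytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Takes the snapshot AFTER endBlockEncoding, mirroring the suggested order:
    // end the encoding, flush, then call toByteArray().
    static byte[] snapshotAfter() {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(baos);
            out.writeLong(42L);
            endBlockEncoding(out, 8);
            baos.flush();
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("snapshot before endBlockEncoding: " + snapshotBefore().length + " bytes");
        System.out.println("snapshot after endBlockEncoding:  " + snapshotAfter().length + " bytes");
    }
}
```

In the sketch the early snapshot is missing the four bytes of the size field, while the late snapshot contains them. That is the reason for refreshing {{baosBytes}} after {{endBlockEncoding}} rather than adding a separate branch per encoder.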