[ 
https://issues.apache.org/jira/browse/HBASE-11729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098543#comment-14098543
 ] 

Sean Busbey commented on HBASE-11729:
-------------------------------------

bq. Also FYI: during compaction, if all the files being compacted have a max 
tags length of 0, we even skip the 2-byte tags length part of every KV, 
saving that overhead. Absence of the MAX_TAGS_LENGTH file info entry means the 
Cell format is the same as in HFile V2.

Okay, to make sure I understand, let me try restating things.

1) If an HFile has a version number of 3.0+, I need to read the File Info Block 
before I can read any Data Blocks.

2) If the File Info Block contains an entry for MAX_TAGS_LENGTH, then KeyValues 
will be serialized as KeyValues + Tags (even if that length is 0). Otherwise, 
they'll be serialized the same as in earlier versions.

3) A File Info Block should only contain a TAGS_COMPRESSED entry if it has a 
MAX_TAGS_LENGTH entry.
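
To make the restatement concrete, here is a rough Java sketch of the decision 
a reader would make under points 1-3. It is not the actual HBase reader code: 
the class, enum, and method names are hypothetical, the File Info Block is 
modelled as a plain Map, and the key strings are simply the names used in this 
comment, which may not match the real on-disk keys.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of points 1-3 above; not taken from the HBase source.
final class HFileCellLayoutSketch {
    // Assumption: placeholder key names as written in this comment.
    static final String MAX_TAGS_LENGTH = "MAX_TAGS_LENGTH";
    static final String TAGS_COMPRESSED = "TAGS_COMPRESSED";

    enum CellLayout {
        V2_STYLE,   // KeyValue only, no tags length field
        WITH_TAGS   // KeyValue followed by a tags length and tags (length may be 0)
    }

    // Points 1 and 2: for version 3+ files the File Info Block decides the
    // layout, so it has to be consulted before any Data Block is decoded.
    static CellLayout cellLayout(int majorVersion, Map<String, byte[]> fileInfo) {
        if (majorVersion < 3) {
            return CellLayout.V2_STYLE;
        }
        return fileInfo.containsKey(MAX_TAGS_LENGTH)
                ? CellLayout.WITH_TAGS
                : CellLayout.V2_STYLE;
    }

    // Point 3: TAGS_COMPRESSED is only expected alongside MAX_TAGS_LENGTH.
    static boolean isConsistent(Map<String, byte[]> fileInfo) {
        return !fileInfo.containsKey(TAGS_COMPRESSED)
                || fileInfo.containsKey(MAX_TAGS_LENGTH);
    }

    private HFileCellLayoutSketch() {
    }
}
```

In this sketch, a version 3 file whose File Info Block has no MAX_TAGS_LENGTH 
entry comes back as V2_STYLE, which is the compaction case described in the 
quote above.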

Is that all correct?

> Document HFile v3
> -----------------
>
>                 Key: HBASE-11729
>                 URL: https://issues.apache.org/jira/browse/HBASE-11729
>             Project: HBase
>          Issue Type: Task
>          Components: documentation, HFile
>    Affects Versions: 0.98.0
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>            Priority: Trivial
>              Labels: beginner
>         Attachments: HBASE-11729.patch, HBASE-11729.pdf
>
>
> 0.98 added HFile v3. There are a couple of mentions of it in the book, in 
> the sections on cell tags, but there isn't an actual overview or design 
> explanation like there is for [HFile 
> v2|http://hbase.apache.org/book/hfilev2.html].



--
This message was sent by Atlassian JIRA
(v6.2#6252)
