[
https://issues.apache.org/jira/browse/HBASE-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694470#comment-13694470
]
ramkrishna.s.vasudevan commented on HBASE-8496:
-----------------------------------------------
bq.avg_tag_len = 0 would indicate that there is no tag present. Why do we need
two flags (tagpresent and avg_tag_len) ?
When we don't use the KeyValueCodec approach, a memstore flush takes every KV
in turn and writes it into blocks. A single flush can contain just one KV with
a tag and all the others without, so I cannot decide whether tags are present
until I have completed an HFileBlock.
During a flush I would therefore always write the taglength part of each KV,
even though that block may not contain any tags. To decode the block on read,
I need an indicator that it was written with the taglength (the tagpresent
flag). avg_tag_len then tells me whether I have to read the 4-byte INT at all,
or can just skip those 4 bytes and reposition the buffer.
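The read-side decision described above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (TagLengthReader, readTagLength) are hypothetical and are not part of the patch, and the two block-level values are simply passed in as parameters.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the read path; not actual HBase code. */
final class TagLengthReader {
    /** Size of the per-KV tag length field (a 4-byte INT, as described above). */
    static final int TAG_LENGTH_SIZE = 4;

    /**
     * Returns the tag length of the KV at the buffer's current position and
     * leaves the buffer positioned after the tag length field, if there is one.
     */
    static int readTagLength(ByteBuffer buf, boolean tagPresent, int avgTagLen) {
        if (!tagPresent) {
            // Block was written without the taglength field: nothing to consume.
            return 0;
        }
        if (avgTagLen == 0) {
            // Field exists but no KV in this block has tags:
            // skip the 4 bytes instead of decoding them.
            buf.position(buf.position() + TAG_LENGTH_SIZE);
            return 0;
        }
        // At least one KV in the block carries tags: decode the INT.
        return buf.getInt();
    }
}
```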
bq.Later compaction is mentioned where tagpresent is changed to false. But we
should be able to achieve this at the time of flush, right ?
Flush would always write the taglength, even if tags are not present. When the
same HFileBlock is read for compaction, I would use the above logic and avoid
writing even the taglength part in the compacted file, so in the compacted
file the HFileBlock would have tagpresent=false and avg_tag_len=0. Please note
that at the HFileBlock level this would be two individual bytes, where 1
indicates true and 0 indicates false.
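As a rough sketch of the write side: flush always emits the taglength field, while compaction can drop it when the source block had no tags. All names here (TagLengthWriter, writeEntry, compactionWritesTagLength) are hypothetical, and kvBytes merely stands in for the serialized key/value portion.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the write path; not actual HBase code. */
final class TagLengthWriter {
    /** Flush cannot know in advance whether tags will appear, so it always writes the field. */
    static boolean flushWritesTagLength() {
        return true;
    }

    /** Compaction has the source block's flags, so it can omit the field when no tags exist. */
    static boolean compactionWritesTagLength(boolean tagPresent, int avgTagLen) {
        return tagPresent && avgTagLen > 0;
    }

    /** Serializes one entry, optionally followed by a 4-byte tag length and the tag bytes. */
    static byte[] writeEntry(byte[] kvBytes, byte[] tags, boolean writeTagLength) {
        if (!writeTagLength && tags.length > 0) {
            throw new IllegalArgumentException("tags would be lost without a taglength field");
        }
        int size = kvBytes.length + (writeTagLength ? 4 + tags.length : 0);
        ByteBuffer out = ByteBuffer.allocate(size);
        out.put(kvBytes);
        if (writeTagLength) {
            out.putInt(tags.length);
            out.put(tags);
        }
        return out.array();
    }
}
```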
bq.In the above sample, I would expect decodeTag() to return more than one Tag.
Yes Ted, you are right. In the example that I attached there was only one
tag. Ideally we would use it like this:
{code}
Iterator<byte[]> tagIterator = CellUtil.getTagIterator(tagArray);
List<Tag> tagList = new ArrayList<Tag>();
while (tagIterator.hasNext()) {
  // Decode each serialized tag and collect it.
  byte[] tag = tagIterator.next();
  Tag t = KeyValueUtil.decodeTag(tag);
  tagList.add(t);
}
{code}
bq.I think it would be better if Tag.Type.Visibility is passed to decodeTag()
so that only visibility Tag is returned.
We can have one such method so that decodeTag would only return a KV if the
specified type of tag is present.
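A minimal sketch of what such type-based filtering could look like. Everything here is an assumption for illustration: the Tag stand-in class, the VISIBILITY type code and the tagsOfType helper are hypothetical and do not reflect the final API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of filtering decoded tags by type; not actual HBase code. */
final class TagFilterSketch {
    /** Stand-in for the Tag class under discussion: a type byte plus a value. */
    static final class Tag {
        final byte type;
        final byte[] value;

        Tag(byte type, byte[] value) {
            this.type = type;
            this.value = value;
        }
    }

    /** Hypothetical type code for visibility tags. */
    static final byte VISIBILITY = 1;

    /** Returns only the tags whose type matches the requested one. */
    static List<Tag> tagsOfType(List<Tag> allTags, byte wantedType) {
        List<Tag> matching = new ArrayList<Tag>();
        for (Tag t : allTags) {
            if (t.type == wantedType) {
                matching.add(t);
            }
        }
        return matching;
    }
}
```

An empty result would then mean that no tag of the requested type is present.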
bq.Would all Tags in the KeyValue be returned to filterKeyValue() ?
In terms of visibility/ACL tags, if a user is authorised to read that KV, then
returning the KV with its tags should be fine, I feel. We can discuss this.
I am working on the performance reports on using KeyValueCodec. Will share more
info on that soon.
Thanks for your reviews.
> Implement tags and the internals of how a tag should look like
> --------------------------------------------------------------
>
> Key: HBASE-8496
> URL: https://issues.apache.org/jira/browse/HBASE-8496
> Project: HBase
> Issue Type: New Feature
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.98.0
>
> Attachments: Tag design.pdf
>
>
> The intent of this JIRA comes from HBASE-7897.
> This would help us to decide on the structure and format of how the tags
> should look like.