Can you share some more details about it? A graph/chart/table showing the specific difference will be helpful.
Thanks,
Jimmy

On Thu, Jul 18, 2013 at 10:23 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> I have been following comments on HBASE-8496.
>
> I think introducing cell tagging through HFile v3 is acceptable.
>
> Looking forward to seeing your implementation.
>
> Cheers
>
> On Thu, Jul 18, 2013 at 10:14 AM, ramkrishna vasudevan <
> ramkrishna.s.vasude...@gmail.com> wrote:
>
> > For the past couple of months, we have been working through various
> > prototypes for supporting inline storage of tags in cells as persisted
> > on disk. Our goals are to support optional use of tags with minimal
> > changes to core code while also avoiding performance impacts to users
> > who do not use tags.
> >
> > For background, refer to the comments in
> >
> > https://issues.apache.org/jira/browse/HBASE-8496?focusedCommentId=13708228&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13708228
> >
> > and
> >
> > https://issues.apache.org/jira/browse/HBASE-8496?focusedCommentId=13710653&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13710653
> >
> > We have iterated on a couple of prototypes that implement tag awareness
> > first in DataBlockEncoders and later as a new type of Codec for Cells.
> > This point is discussed in the above comments in HBASE-8496.
> >
> > We think that tag awareness in Cell Codecs is the right way, but there
> > are some shortcomings with the current interfaces internal to HFile that
> > need to be addressed in order to avoid any performance impacts for those
> > who do not want to use inline tags, and that may involve a drastic
> > amount of code change.
> >
> > We can avoid several problems with HFile V2 internals, and backwards
> > compatibility concerns, and allow for working tags support with no
> > performance impact and low risk to all HBase users who do not want tag
> > support, while still allowing for inline tags capabilities in a shipping
> > version of HBase, by introducing this in a new V3 version for HFile.
> >
> > The new V3 version for HFile differs from earlier versions by supporting
> > inline tag storage. This version does not change the HFileBlock format;
> > it just serializes and deserializes the Tag information that would be
> > persisted in the HFile. Having HFile V3 would also help to keep Tags
> > optional such that the existing cases where there are no tags are
> > totally unaffected. Also we ensure that we keep the changes outside of
> > the V3 reader and writer minimal. Compatibility would not be a problem
> > with future versions when we go with Cell Codecs. Which Codecs were used
> > for writing the file will be persisted in the HFile header. Now for
> > files that are either V2 or V3 we will instantiate two default codecs
> > that know how to deal with serializations with and without tags.
> >
> > There have been earlier thoughts on an HFile V3, e.g.:
> >
> > https://issues.apache.org/jira/browse/HBASE-8496?focusedCommentId=13710653&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13710653
> >
> > We have been working on this and will have a clean patch with a good
> > amount of testing in time for 0.96.
> >
> > Although our focus is on performance-neutral persistence of inline cell
> > tags in 0.96 to enable a couple of security coprocessor users,
> > introducing an HFile V3 provides design freedom for some other features
> > and problems too that can be developed through the 0.96 cycle into 0.98.
> >
> > Pls voice your opinion on this so that we can make this clear and maybe
> > define the scope of the patch.
> > Also feel free to comment on HBASE-8496 with your thoughts and ideas.
> >
> > Regards
> >
> > Ram
> >
>
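
For readers of this thread, the idea in the quoted proposal of "two default codecs" chosen by file version can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the byte layout, the class names `NoTagsCodec` and `TagsCodec`, and the helper `codec_for_version` are hypothetical and are not taken from the HBASE-8496 patch or from the real HFile format.

```python
import struct


class NoTagsCodec:
    """Hypothetical V2-style codec: key length, value length, key, value."""

    def encode(self, key: bytes, value: bytes, tags: bytes = b"") -> bytes:
        if tags:
            raise ValueError("this codec cannot persist tags")
        return struct.pack(">II", len(key), len(value)) + key + value

    def decode(self, buf: bytes):
        key_len, value_len = struct.unpack_from(">II", buf, 0)
        pos = 8
        key = buf[pos:pos + key_len]
        pos += key_len
        value = buf[pos:pos + value_len]
        return key, value, b""


class TagsCodec(NoTagsCodec):
    """Hypothetical V3-style codec: same cell, plus a 2-byte tag length and tags."""

    def encode(self, key: bytes, value: bytes, tags: bytes = b"") -> bytes:
        # Reuse the tag-free layout, then append the optional tag section.
        base = super().encode(key, value)
        return base + struct.pack(">H", len(tags)) + tags

    def decode(self, buf: bytes):
        key, value, _ = super().decode(buf)
        pos = 8 + len(key) + len(value)
        (tags_len,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        return key, value, buf[pos:pos + tags_len]


def codec_for_version(version: int):
    """Pick a default codec from the file's major version (assumed rule)."""
    if version == 2:
        return NoTagsCodec()
    if version == 3:
        return TagsCodec()
    raise ValueError(f"unsupported HFile version: {version}")
```

In this sketch a version 2 file never sees tag bytes, so its reader and writer path is untouched, while a version 3 file round-trips tags, e.g. `codec_for_version(3).decode(codec_for_version(3).encode(b"row", b"val", b"acl"))`.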