[
https://issues.apache.org/jira/browse/HBASE-15554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293863#comment-15293863
]
ramkrishna.s.vasudevan commented on HBASE-15554:
------------------------------------------------
Thanks for the comments.
bq.The HasKey goes too far
I can try pushing this under CellUtils.
But treating the key as one consecutive entity helps everywhere we build an
index. Every index in the system (root index, bloom index) is built from the
key part. Knowing that the key is consecutive lets us avoid the multiple
copies, or part-by-part copies, that we do today.
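To illustrate the copy-avoidance point, here is a minimal Java sketch. It is not HBase code: the class names, the `HasKey` method names, and the simplified key layout (row + family + qualifier only) are all assumptions made for the example. It contrasts a cell whose key parts must be copied into a fresh array with a cell whose key is already one consecutive range that an index or bloom writer can read in place.

```java
import java.nio.charset.StandardCharsets;

// All names here are hypothetical and the key layout is simplified
// (row + family + qualifier only); this is not HBase's actual API.
class ContiguousKeySketch {

    /** Hypothetical view of a key stored as one consecutive byte range. */
    interface HasKey {
        byte[] getKeyArray();
        int getKeyOffset();
        int getKeyLength();
    }

    /** Cell whose key parts live in separate arrays. */
    static final class ScatteredCell {
        final byte[] row, family, qualifier;

        ScatteredCell(byte[] row, byte[] family, byte[] qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }

        /** Building a key means allocating and copying part by part. */
        byte[] copyKey() {
            byte[] key = new byte[row.length + family.length + qualifier.length];
            System.arraycopy(row, 0, key, 0, row.length);
            System.arraycopy(family, 0, key, row.length, family.length);
            System.arraycopy(qualifier, 0, key, row.length + family.length,
                qualifier.length);
            return key;
        }
    }

    /** Cell backed by one array in which the key bytes are consecutive. */
    static final class ContiguousCell implements HasKey {
        private final byte[] buf; // key bytes followed by value bytes
        private final int keyOffset;
        private final int keyLength;

        ContiguousCell(byte[] buf, int keyOffset, int keyLength) {
            this.buf = buf;
            this.keyOffset = keyOffset;
            this.keyLength = keyLength;
        }

        @Override public byte[] getKeyArray() { return buf; } // no copy
        @Override public int getKeyOffset() { return keyOffset; }
        @Override public int getKeyLength() { return keyLength; }
    }

    /** Stand-in for a bloom or index writer: reads a byte range in place. */
    static int hashRange(byte[] buf, int offset, int length) {
        int h = 1;
        for (int i = offset; i < offset + length; i++) {
            h = 31 * h + buf[i];
        }
        return h;
    }

    public static void main(String[] args) {
        ScatteredCell scattered = new ScatteredCell(
            "r1".getBytes(StandardCharsets.UTF_8),
            "cf".getBytes(StandardCharsets.UTF_8),
            "q".getBytes(StandardCharsets.UTF_8));
        byte[] copied = scattered.copyKey(); // one extra allocation per cell

        byte[] backing = "r1cfqVALUE".getBytes(StandardCharsets.UTF_8);
        ContiguousCell contiguous = new ContiguousCell(backing, 0, 5);

        int viaCopy = hashRange(copied, 0, copied.length);
        int inPlace = hashRange(contiguous.getKeyArray(),
            contiguous.getKeyOffset(), contiguous.getKeyLength());
        System.out.println(viaCopy == inPlace); // prints true
    }
}
```

Both paths hash the same key bytes, but only the scattered cell has to allocate a temporary array to get there.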
bq.How many Interfaces do we have currently around Cell and KeyValue? It might
be worth listing them?
Sure, we can. In fact, if we accept HasKey we can avoid the KeyOnlyKV and
BufferedKeyonlyKV concepts.
bq.Up to this we had Cell and we had Cell with empty value. Key is new concept,
or rather, it is an old one in that we always just kept Key in indices and
blooms... and you are trying to formalize it now?
Yes. We are only trying to make sure that, where possible, keys can be
identified directly. In fact, in another issue [~anoop.hbase] was also saying
that getKeyBuffer cannot be deprecated and that it is better to keep it. So
HasKey just makes this more formal.
bq.Does a Cell have a Key?
No. A Cell need not have a key. In the current patch, HasKey is attributed
only to the Cell types. I don't think Cells need to have a key. Let Cell keep
the notion that row, families, quals, tags and values are all independent.
bq.KeyValue has a Key (makes sense). If anything the Interface should be called
Key? So, we have getKeyArray and offset and length. Do the latter work if a
byte [] or BB? In same way as ServerCell?
Yes, we can call it Key. No problem. And yes, the Key interface will also have
both byte[] and BB based APIs. We still need to discuss how to handle
getKeyBB being called on a byte[] cell and getKeyArray being called on a BB
cell. But remember this Key interface is going to be private and will not be
exposed to the user or in clients.
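As a starting point for that discussion, here is a hypothetical Java sketch of one possible way to handle the mismatch. Nothing here has been decided, and none of it is in the patch. The `Key` interface shape and the names `getKeyByteBuffer`, `getKeyPosition`, `ArrayKey` and `BufferKey` are assumptions for illustration only. In this option a byte[] backed key wraps its array (no copy), while a BB backed key returns its backing array when it has one and otherwise falls back to a copy.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch only: one option for serving both access styles.
// None of these names are taken from the HBASE-15554 patch.
class KeyInterfaceSketch {

    interface Key {
        byte[] getKeyArray();
        int getKeyOffset();
        int getKeyLength();
        ByteBuffer getKeyByteBuffer();
        int getKeyPosition();
    }

    /** Key backed by an on-heap byte[]. */
    static final class ArrayKey implements Key {
        private final byte[] buf;
        private final int offset;
        private final int length;

        ArrayKey(byte[] buf, int offset, int length) {
            this.buf = buf;
            this.offset = offset;
            this.length = length;
        }

        @Override public byte[] getKeyArray() { return buf; }
        @Override public int getKeyOffset() { return offset; }
        @Override public int getKeyLength() { return length; }
        // ByteBuffer.wrap shares the array, so no bytes are copied.
        @Override public ByteBuffer getKeyByteBuffer() { return ByteBuffer.wrap(buf); }
        @Override public int getKeyPosition() { return offset; }
    }

    /** Key backed by a ByteBuffer that may be off-heap. */
    static final class BufferKey implements Key {
        private final ByteBuffer buf;
        private final int position;
        private final int length;

        BufferKey(ByteBuffer buf, int position, int length) {
            this.buf = buf;
            this.position = position;
            this.length = length;
        }

        @Override public byte[] getKeyArray() {
            if (buf.hasArray()) {
                return buf.array(); // heap buffer: expose backing array
            }
            // Direct or read-only buffer: fall back to a copy of the key bytes.
            byte[] copy = new byte[length];
            ByteBuffer dup = buf.duplicate(); // leave the caller's position alone
            dup.position(position);
            dup.get(copy, 0, length);
            return copy;
        }

        @Override public int getKeyOffset() {
            // The copy made above starts at index 0.
            return buf.hasArray() ? buf.arrayOffset() + position : 0;
        }

        @Override public int getKeyLength() { return length; }
        @Override public ByteBuffer getKeyByteBuffer() { return buf; }
        @Override public int getKeyPosition() { return position; }
    }

    /** Reads the key through the array-style API, whatever the backing is. */
    static byte[] keyBytes(Key key) {
        byte[] array = key.getKeyArray();
        int offset = key.getKeyOffset();
        return Arrays.copyOfRange(array, offset, offset + key.getKeyLength());
    }
}
```

The other obvious option is to throw UnsupportedOperationException on the mismatched call, which keeps accidental copies out of hot paths such as compaction but pushes a type check onto every caller.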
bq.You can ask it for family and qualifier pieces? And timestamps? You use the
Cell APIs to do this against a Key?
But how do we construct the index that we have now without copying them every
time? Can you say more about what you have in mind here? Maybe I am not
getting your bigger idea.
bq.Maybe there are some type refactorings we could do that could get rid of a
bunch of Interfaces?
We could surely look into that. In fact, last time we took up that task but
could not find much. But we can revisit it once again.
Thanks once again for the comments.
> StoreFile$Writer.appendGeneralBloomFilter generates extra KV
> ------------------------------------------------------------
>
> Key: HBASE-15554
> URL: https://issues.apache.org/jira/browse/HBASE-15554
> Project: HBase
> Issue Type: Sub-task
> Components: Performance
> Reporter: Vladimir Rodionov
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: HBASE-15554.patch, HBASE-15554_3.patch,
> HBASE-15554_4.patch
>
>
> Accounts for 10% memory allocation in compaction thread when BloomFilterType
> is ROWCOL.