[
https://issues.apache.org/jira/browse/HBASE-15554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412234#comment-15412234
]
Anoop Sam John commented on HBASE-15554:
----------------------------------------
Sorry if I was not clear. I do not mean that the patch still has the
duplication. What I meant by an Iterator based HashKey is that it would be the
single structure we use with Hash, rather than byte[]/BB/Cell. But if the algo
demands an offset based byte getter, I am fine with that.
bq. one is that whatever the cell format is, we should finally assume the back
end is KV format key only. Because the offset and length that we pass to the
hash algo assume that it is contiguous
Why do we need to pass an offset to the hash() function? We need to pass a
HashKey. Internally, the HashKey impl has to know which byte to return when
its getters are called. Yes, if you do not have the iterator model you will
have a get(int) which returns a byte. So the Hash functions have to call get()
with relative offsets, e.g. get(0), get(1) etc., not the current way of
offset+1, offset+2. When the impl gets these calls, it has to convert them
into absolute offsets. That is not so simple in the ROW_COL case. There, based
on the incoming offset, you also have to map it to the area of the Cell it
belongs to. That is what I was trying to say. When get(0) or get(1) is called,
those fall in the rkLen part. get(2) up to (but not including) get(<rkLen>+2)
belong to the rk bytes. So we will have to deal with some sort of math, and
you really do not have to assume that the Cell is in KV serialization.
Whichever bytes of the Cell were used in the past, continue to use those.
Am I making it clear now? It would be good if we can remove any sort of KV
assumption from the code path. I think it remains only in this Bloom area.
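
To make the offset math concrete, here is a rough sketch of what I have in
mind. It is not taken from any of the attached patches: the names
RowColKeySketch, HashKey, RowColHashKey and toyHash are made up for
illustration, and the layout <rkLen (2 bytes)> <rk> <cfLen = 0 (1 byte)> <cq>
is my assumption of how the ROW_COL key bytes look today (timestamp and type
are left out to keep it short). The point is only that the hash side calls
get(0), get(1), ... and the key maps each relative position to the right
area, without copying anything into one contiguous array.

```java
/** Hypothetical sketch; none of these names come from the HBase patch. */
final class RowColKeySketch {

  /** Hash functions read bytes through relative positions only. */
  interface HashKey {
    byte get(int pos);

    int length();
  }

  /**
   * Presents row and qualifier as if they were laid out as
   * rkLen (2 bytes), rk, cfLen = 0 (1 byte), cq
   * without copying them into a single array.
   */
  static final class RowColHashKey implements HashKey {
    private final byte[] row;
    private final byte[] qualifier;

    RowColHashKey(byte[] row, byte[] qualifier) {
      if (row.length > Short.MAX_VALUE) {
        throw new IllegalArgumentException("row too long: " + row.length);
      }
      this.row = row;
      this.qualifier = qualifier;
    }

    @Override
    public int length() {
      return 2 + row.length + 1 + qualifier.length;
    }

    @Override
    public byte get(int pos) {
      if (pos < 0 || pos >= length()) {
        throw new IndexOutOfBoundsException("pos=" + pos);
      }
      if (pos == 0) {
        return (byte) (row.length >> 8); // high byte of rkLen
      }
      if (pos == 1) {
        return (byte) row.length; // low byte of rkLen
      }
      if (pos < 2 + row.length) {
        return row[pos - 2]; // rk bytes
      }
      if (pos == 2 + row.length) {
        return 0; // family length, assumed to be written as 0
      }
      return qualifier[pos - 2 - row.length - 1]; // cq bytes
    }
  }

  /** Toy hash standing in for the real algorithm; it never sees an absolute offset. */
  static int toyHash(HashKey key) {
    int h = 17;
    for (int i = 0; i < key.length(); i++) {
      h = 31 * h + key.get(i);
    }
    return h;
  }
}
```

With something like this the Hash impls take only a HashKey, and a ROW or
byte[] based key can be just another impl whose get(int) adds its own base
offset.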
> StoreFile$Writer.appendGeneralBloomFilter generates extra KV
> ------------------------------------------------------------
>
> Key: HBASE-15554
> URL: https://issues.apache.org/jira/browse/HBASE-15554
> Project: HBase
> Issue Type: Sub-task
> Components: Performance
> Reporter: Vladimir Rodionov
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: HBASE-15554.patch, HBASE-15554_10.patch,
> HBASE-15554_3.patch, HBASE-15554_4.patch, HBASE-15554_6.patch,
> HBASE-15554_7.patch, HBASE-15554_9.patch
>
>
> Accounts for 10% memory allocation in compaction thread when BloomFilterType
> is ROWCOL.