[
https://issues.apache.org/jira/browse/HBASE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15763176#comment-15763176
]
Anoop Sam John commented on HBASE-16421:
----------------------------------------
Sorry again for not being clear.
I don't mean that the way Cell POJOs get created will be different.
What I mean is this:
On the read path, when reading block data from the off-heap cache or fetching
it from HDFS, creating Cell objects is a must, and that has been the case from
day one. The off-heap read path work changed nothing in that respect, so there
is no possible perf impact because of it.
But writing to the memstore and reading from it is a different case. Until
now, Cell objects stay as they are for as long as they are in the memstore
(active segment or immutable segment). So when a read happens (either a user
read or a scan for in-memory compaction / disk flush), there is no need to
create new Cell objects. They are already on the Java heap and we just return
them. But once we have CellChunkMap, the flatten step will remove these Cell
objects and keep an index instead (chunk id + offset + length). So when there
is a read over the segment, we will end up re-creating all these Cell POJOs on
the heap. There won't be any data copy, just POJO creation. So you can see the
difference this makes compared with what we have today and have had all along.
I am not saying this will kill perf, but it is surely added pressure in the
form of more garbage. At the same time, CellChunkMap surely helps reduce heap
pressure, as it eliminates the need to keep Cells (as long as no read
happens). So there are positive and negative sides. What I am saying is that
we just need to make sure the positive side outweighs the negative so that
overall we get a benefit. I guess we will; I just want to make sure. So before
splitting the code into patches and spending time on writing, reviewing, and
reworking them, a small PoC tested with more data and load (the way your
recent tests did) can help get these perf numbers. You might already have most
of the CellChunkMap code. Anything you need from us, we can provide as a PoC
patch too. I am saying this just to make sure no effort goes to waste.
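To make the trade-off concrete, here is a minimal sketch in Java of a flat
index that keeps only (chunk id + offset + length) per cell and builds a small
view object on every read. All names here (CellChunkSketch, CellView,
FlatIndex, build, scan) are hypothetical and are not HBase classes; the layout
of three ints per entry is an assumption for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these types are HBase classes.
class CellChunkSketch {

    // Stand-in for a Cell POJO: a small object pointing into a chunk.
    static final class CellView {
        final ByteBuffer chunk;
        final int offset;
        final int length;

        CellView(ByteBuffer chunk, int offset, int length) {
            this.chunk = chunk;
            this.offset = offset;
            this.length = length;
        }

        // Decodes the bytes for display only; creating the view copies nothing.
        String value() {
            byte[] out = new byte[length];
            ByteBuffer dup = chunk.duplicate();
            dup.position(offset);
            dup.get(out, 0, length);
            return new String(out, StandardCharsets.UTF_8);
        }
    }

    // Flat index: three ints per cell (chunk id, offset, length), no Cell objects.
    static final class FlatIndex {
        final List<ByteBuffer> chunks;
        final int[] entries;
        int count = 0;
        long viewsCreated = 0; // counts the short-lived objects made by reads

        FlatIndex(List<ByteBuffer> chunks, int capacity) {
            this.chunks = chunks;
            this.entries = new int[3 * capacity];
        }

        void add(int chunkId, int offset, int length) {
            int base = 3 * count++;
            entries[base] = chunkId;
            entries[base + 1] = offset;
            entries[base + 2] = length;
        }

        int size() {
            return count;
        }

        // Every read allocates a fresh view object: the extra garbage in question.
        CellView get(int i) {
            int base = 3 * i;
            viewsCreated++;
            return new CellView(chunks.get(entries[base]),
                                entries[base + 1], entries[base + 2]);
        }
    }

    // Writes the values into a single chunk and indexes them.
    static FlatIndex build(String... values) {
        ByteBuffer chunk = ByteBuffer.allocate(1024);
        FlatIndex index = new FlatIndex(Collections.singletonList(chunk), values.length);
        for (String v : values) {
            byte[] bytes = v.getBytes(StandardCharsets.UTF_8);
            int offset = chunk.position();
            chunk.put(bytes);
            index.add(0, offset, bytes.length);
        }
        return index;
    }

    // A scan over the segment: one new view object per cell per scan.
    static List<String> scan(FlatIndex index) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < index.size(); i++) {
            out.add(index.get(i).value());
        }
        return out;
    }
}
```

In this sketch the index itself holds no per-cell objects between reads, but
each scan allocates one view per cell, so scanning the same segment twice
doubles viewsCreated. That is the kind of number a PoC would need to weigh
against the heap saved while no read happens.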
Again, I am NOT saying CellChunkMap will hurt perf, and I am NOT against doing
it. I just want to make sure we know the perf numbers before doing the actual
patches.
Do you get me now? :-)
> Introducing the CellChunkMap as a new additional index variant in the MemStore
> ------------------------------------------------------------------------------
>
> Key: HBASE-16421
> URL: https://issues.apache.org/jira/browse/HBASE-16421
> Project: HBase
> Issue Type: Umbrella
> Reporter: Anastasia Braginsky
> Attachments: CellChunkMapRevived.pdf,
> IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Follow up for HBASE-14921. This is going to be the umbrella JIRA to include
> all the parts of integration of the CellChunkMap to the MemStore.