[
https://issues.apache.org/jira/browse/HBASE-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15585971#comment-15585971
]
ramkrishna.s.vasudevan commented on HBASE-16608:
------------------------------------------------
[~ebortnik]
What we are trying to do actually helps both your use case and the default use
case. Your use case needs in-memory flushes and compaction so that all
duplicates are removed. With the existing trunk code as committed, that can
happen quite easily: when the input has a lot of duplicates, you can reduce
the size of the segments and combine them into one. So when the pipeline is
flushed you get one small segment, which means less IO, and it holds the
minimum data that you need. That should be fine. While compacting and creating
the new segment you could also move it to an array-based map or a chunk-based
map (flattening). That solves the GC issues and serves your use case at the
same time.
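
To make the first case concrete, here is a minimal sketch of compaction plus
flattening, and not the actual trunk code. CompactionSketch, Cell and compact
are hypothetical names, a cell is reduced to a key, a timestamp and a value,
and a "duplicate" is assumed to mean an older version of the same key.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model; none of these are HBase classes.
final class CompactionSketch {

    // A cell reduced to a key, a timestamp and a value.
    static final class Cell {
        final String key;
        final long ts;
        final String value;

        Cell(String key, long ts, String value) {
            this.key = key;
            this.ts = ts;
            this.value = value;
        }
    }

    // Merge all pipeline segments, keep only the newest version of each key,
    // and return the survivors as one flat array sorted by key.
    static Cell[] compact(List<List<Cell>> pipeline) {
        Map<String, Cell> newest = new TreeMap<>();
        for (List<Cell> segment : pipeline) {
            for (Cell c : segment) {
                // On a key collision keep the cell with the larger timestamp.
                newest.merge(c.key, c, (a, b) -> a.ts >= b.ts ? a : b);
            }
        }
        // One array instead of many per-entry skip list nodes (flattening).
        return newest.values().toArray(new Cell[0]);
    }
}
```

The more duplicates the input has, the smaller the returned array is compared
to the sum of the segments, which is where the IO saving at flush time comes
from.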
Now suppose there is a toggle or flag for the user. For default use cases,
where there is no need for compaction, we can create a list of segments (the
pipeline) and let the flush write out all the segments of the pipeline. Every
time we move the active segment to the pipeline, we can ensure that we flatten
the pipeline to either array-based or chunk-based segments. This reduces GC
overhead, and at the same time the amount of data flushed is the same as with
the default memstore.
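
A rough single-threaded sketch of this no-compaction path, under the same
caveats: PipelineSketch and its methods are hypothetical names and not HBase
APIs, and each cell is just a "key=value" string. It only shows that
flattening on every move changes the representation and not the amount of
data that gets flushed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical, simplified model; none of these are HBase classes.
final class PipelineSketch {
    // Active segment: a skip list that accepts writes.
    private ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
    // Immutable, flattened segments, newest first.
    private final Deque<String[]> pipeline = new ArrayDeque<>();

    void put(String key, String value) {
        active.put(key, value);
    }

    // Move the active segment into the pipeline as a sorted flat array.
    void pushActiveToPipeline() {
        List<String> cells = new ArrayList<>();
        for (Map.Entry<String, String> e : active.entrySet()) {
            cells.add(e.getKey() + "=" + e.getValue());
        }
        pipeline.addFirst(cells.toArray(new String[0]));
        active = new ConcurrentSkipListMap<>();
    }

    // Flush every segment of the pipeline, oldest first. Nothing is merged
    // or dropped, so the output matches what was written.
    List<String> flush() {
        List<String> out = new ArrayList<>();
        Iterator<String[]> oldestFirst = pipeline.descendingIterator();
        while (oldestFirst.hasNext()) {
            for (String cell : oldestFirst.next()) {
                out.add(cell);
            }
        }
        pipeline.clear();
        return out;
    }
}
```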
Yes, we have done some benchmarks with the pipeline memstore and published
them here:
https://docs.google.com/document/d/1fj5P8JeutQ-Uadb29ChDscMuMaJqaMNRI86C4k5S1rQ/edit
No problem, we can have a discussion to hear your thoughts on this.
Actually, if we keep things simple then others in the community will also have
more confidence in this, and with 2.0 coming up it will be easy to have a
fully functional feature that users can use efficiently as per their
requirements.
> Introducing the ability to merge ImmutableSegments without copy-compaction or
> SQM usage
> ---------------------------------------------------------------------------------------
>
> Key: HBASE-16608
> URL: https://issues.apache.org/jira/browse/HBASE-16608
> Project: HBase
> Issue Type: Sub-task
> Reporter: Anastasia Braginsky
> Assignee: Anastasia Braginsky
> Attachments: HBASE-16417-V02.patch, HBASE-16417-V04.patch,
> HBASE-16417-V06.patch, HBASE-16417-V07.patch, HBASE-16417-V08.patch,
> HBASE-16417-V10.patch, HBASE-16608-V01.patch, HBASE-16608-V03.patch,
> HBASE-16608-V04.patch, HBASE-16608-V08.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)