ramkrishna.s.vasudevan commented on HBASE-16608:

What we are trying to do actually helps both your use case and the default use 
case. Your use case needs in-memory flushes and compaction so that all 
duplicates are removed. With the existing trunk code that was committed, this 
happens quite easily when the input has a lot of duplicates: you can reduce 
the size of the segments and combine them into one. So when the pipeline is 
flushed you get one small segment, hence less IO, and that is the minimum data 
you will need. That should be fine. While compacting and creating a new 
segment you could also move it to an array-based map or a chunk-based map 
(flattening). That will solve the GC issues and serve your use case at the 
same time.
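
To make the duplicate-removal idea concrete, here is a minimal sketch, not the 
trunk code. It assumes each segment is a sorted map from row key to value and 
that only the newest value per key must survive. The class and method names 
are hypothetical, and real cells carry more than a key and a value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only; these are not HBase classes.
class InMemoryCompactionSketch {

    /**
     * Merges the segments of a pipeline into one segment.
     * The list must be ordered newest segment first, so the first value
     * seen for a key is the newest and older duplicates are dropped.
     */
    static NavigableMap<String, String> compact(
            List<NavigableMap<String, String>> pipelineNewestFirst) {
        NavigableMap<String, String> merged = new TreeMap<>();
        for (NavigableMap<String, String> segment : pipelineNewestFirst) {
            for (Map.Entry<String, String> cell : segment.entrySet()) {
                // Keep the value already present (it came from a newer segment).
                merged.putIfAbsent(cell.getKey(), cell.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> older = new TreeMap<>();
        older.put("row1", "v1");
        older.put("row2", "v1");
        NavigableMap<String, String> newer = new TreeMap<>();
        newer.put("row1", "v2");

        // Three cells go in, two come out: the stale row1 is removed.
        System.out.println(compact(Arrays.asList(newer, older)));
    }
}
```

With many duplicates in the input, the merged segment is much smaller than the 
sum of the pipeline segments, which is where the IO saving at flush time comes 
from.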
Now suppose there is a toggle or flag for the user. In the default use case, 
where there is no need for compaction, we can create a list of segments 
(the pipeline) and let the flush write out all the segments of the pipeline. 
Every time we move the active segment to the pipeline we can ensure that we 
flatten the pipeline, either to the array-based or the chunk-based form. This 
reduces the GC overhead, and at the same time the amount of data flushed is 
the same as with the default memstore. 
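
As a rough sketch of what flattening means here (an illustration under 
assumptions, not the actual implementation): once a segment is immutable, its 
skip-list entries can be copied into sorted arrays and looked up by binary 
search, so the per-entry skip-list node objects can be collected. The class 
below is hypothetical and uses plain strings instead of cells.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration only; this is not an HBase class.
final class FlatSegmentSketch {
    private final String[] keys;   // sorted row keys
    private final String[] values; // values[i] belongs to keys[i]

    /** Copies an immutable skip-list segment into two parallel arrays. */
    FlatSegmentSketch(ConcurrentSkipListMap<String, String> segment) {
        // Snapshot the entries once; iteration order is already sorted by key.
        List<Map.Entry<String, String>> entries = new ArrayList<>(segment.entrySet());
        keys = new String[entries.size()];
        values = new String[entries.size()];
        for (int i = 0; i < entries.size(); i++) {
            keys[i] = entries.get(i).getKey();
            values[i] = entries.get(i).getValue();
        }
    }

    /** Binary search over the sorted keys; returns null when the key is absent. */
    String get(String key) {
        int idx = Arrays.binarySearch(keys, key);
        return idx >= 0 ? values[idx] : null;
    }

    int size() {
        return keys.length;
    }
}
```

Note that flattening alone keeps every entry, so the flushed data volume is 
unchanged. Only the in-memory index becomes more compact.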
Yes, with pipeline memstore we have done some benchmarks and we have published 
them over here

No problem, we can have a discussion to hear your thoughts on this. 
If we keep things simple, others in the community will also gain 
confidence in this, and with 2.0 coming up it will be easy to have a fully 
functional feature that users can apply efficiently according to their 
requirements.

> Introducing the ability to merge ImmutableSegments without copy-compaction or 
> SQM usage
> ---------------------------------------------------------------------------------------
>                 Key: HBASE-16608
>                 URL: https://issues.apache.org/jira/browse/HBASE-16608
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>            Assignee: Anastasia Braginsky
>         Attachments: HBASE-16417-V02.patch, HBASE-16417-V04.patch, 
> HBASE-16417-V06.patch, HBASE-16417-V07.patch, HBASE-16417-V08.patch, 
> HBASE-16417-V10.patch, HBASE-16608-V01.patch, HBASE-16608-V03.patch, 
> HBASE-16608-V04.patch, HBASE-16608-V08.patch

This message was sent by Atlassian JIRA
