[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516156#comment-16516156
 ] 

Eshcar Hillel commented on HBASE-20542:
---------------------------------------

Patch is attached.
To reduce internal fragmentation the size of the active segment is set to be 
the size of one MSLAB chunk (by default 2MB).
An add operation is supplemented with pre-update and post-update procedures.
The pre-update procedure atomically increases the size of the segment if this 
increment does not exceed the segment size threshold, and then continues with 
the normal path of updating the memstore.
If the increment would exceed the segment size threshold, the size is not 
increased and instead:
(1) the segment is flushed into the compaction pipeline,
(2) a new active segment is created, 
(3) an IMC task is scheduled in the background,
(4) the operation re-runs the pre-update procedure, this time with the new 
active segment.
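
To make the pre-update path concrete, here is a minimal sketch of the idea, 
not the code in the patch. All names (ActiveSegmentSketch, Segment, 
tryReserve, preUpdate, scheduledCompactions) are hypothetical, the pipeline 
is a plain list, and scheduling the IMC task is reduced to bumping a counter. 
The sketch also assumes a single cell never exceeds the threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative only: names and structure are assumptions, not the patch. */
class ActiveSegmentSketch {
    // Threshold set to one MSLAB chunk (2MB by default, per the comment above).
    static final long THRESHOLD = 2L * 1024 * 1024;

    static final class Segment {
        final AtomicLong size = new AtomicLong();

        /** Atomically add cellSize unless the result would exceed THRESHOLD. */
        boolean tryReserve(long cellSize) {
            while (true) {
                long cur = size.get();
                if (cur + cellSize > THRESHOLD) {
                    return false; // size is left untouched
                }
                if (size.compareAndSet(cur, cur + cellSize)) {
                    return true;
                }
            }
        }
    }

    final AtomicReference<Segment> active = new AtomicReference<>(new Segment());
    final List<Segment> pipeline = new ArrayList<>(); // guarded by synchronized (pipeline)
    int scheduledCompactions;                         // stand-in for scheduling an IMC task

    /** Returns the segment in which cellSize bytes were reserved. */
    Segment preUpdate(long cellSize) {
        if (cellSize > THRESHOLD) {
            // Simplification of this sketch: oversized cells are not handled.
            throw new IllegalArgumentException("cell larger than segment threshold");
        }
        while (true) {
            Segment seg = active.get();
            if (seg.tryReserve(cellSize)) {
                return seg; // continue with the normal update path
            }
            // Only the thread that wins the swap pushes the full segment.
            if (active.compareAndSet(seg, new Segment())) {
                synchronized (pipeline) {
                    pipeline.add(seg);       // (1) flush into the pipeline
                    scheduledCompactions++;  // (3) schedule IMC (stubbed)
                }
            }
            // (4) retry against whichever segment is active now
        }
    }
}
```

Since the reservation is made before the write, a segment that reaches the 
pipeline is full but never overflowed, which is the property the chunk-sized 
threshold relies on.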

This change calls for an additional optimization.
The IMC no longer needs to acquire the region-level updates lock. Instead we 
use a segment-level read-write lock to synchronize IMC with concurrent update 
operations. This is better since with the new solution IMC needs to wait 
only for those few operations that have already updated the size of the 
segment in the pre-update procedure but are still updating the segment skip 
list, and does not need to wait for operations of other stores. Moreover, 
update operations do not wait for the in-memory flush to complete as before.
To synchronize, update operations take the read lock of the segment they are 
updating in the pre-update procedure, and release it in the post-update 
procedure. The IMC thread takes the write lock of each segment it is compacting. 
This ensures all updates that started before the in-memory flush have completed.
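
The locking protocol can be sketched as follows. This is an illustration 
under assumptions, not the patch itself: LockedSegmentSketch and its methods 
are hypothetical names, cells are plain strings, and tryCompactNow exists only 
to show when the write lock is obtainable.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative only: names are assumptions, not those used in the patch. */
class LockedSegmentSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final ConcurrentSkipListMap<String, String> cells =
        new ConcurrentSkipListMap<>();

    /** Pre-update: many updaters may hold the read lock at the same time. */
    void preUpdate() {
        lock.readLock().lock();
    }

    /** Post-update: the skip list write is done, release the read lock. */
    void postUpdate() {
        lock.readLock().unlock();
    }

    void add(String row, String value) {
        preUpdate();
        try {
            cells.put(row, value);
        } finally {
            postUpdate();
        }
    }

    /** IMC side: blocks until every in-flight update has released its read lock. */
    int compact() {
        lock.writeLock().lock();
        try {
            return cells.size(); // stand-in for flatten/merge/compact work
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Non-blocking probe: true if no update is currently in flight. */
    boolean tryCompactNow() {
        if (!lock.writeLock().tryLock()) {
            return false;
        }
        lock.writeLock().unlock();
        return true;
    }
}
```

Updaters share the read lock, so they do not block one another. Only the 
compacting thread takes the write lock, and it waits just for updates still 
in flight on that one segment.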

I will also upload the patch to RB.
Feel free to ask questions and comment.


> Better heap utilization for IMC with MSLABs
> -------------------------------------------
>
>                 Key: HBASE-20542
>                 URL: https://issues.apache.org/jira/browse/HBASE-20542
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>            Priority: Major
>         Attachments: HBASE-20542.branch-2.001.patch
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it writes the new value (instead of checking the size after the 
> write is completed), and if it will, the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
