[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516186#comment-16516186
 ] 

Eshcar Hillel commented on HBASE-20542:
---------------------------------------

Attaching the YCSB scripts used for benchmarking.
There are two sets of runs (see [^run.sh]).
The first is a write-only zipfian workload [^workloadx] with a 10-region 
pre-split, followed by a read-only zipfian workload [^workloady] that reads 
only one column.
The second is the standard uniform load (a), followed by mixed read-write 
[^workloada] and read-only [^workloadc] reading all columns.
The following compares the average throughput and the lift relative to no-IMC:

||comp || index || workloadx || workloady || load || workloada || workloadc ||
| NONE | - | 49,369 | 17,682 | 11,010 | 10,468 | 7,779 |
| BASIC | CAM | 57,965 | 17,132 | 11,854 | 10,318 | 7,552 |
| | | +17.41% | -3.11% | +7.67% | -1.44% | -2.91% |
| BASIC | CCM | 52,296 | 16,644 | 12,140 | 9,705 | 7,465 |
| | | +6% | -6% | +10% | -7% | -4% |
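
For readers who want to reproduce the lift rows, here is a minimal sketch of the arithmetic, assuming lift is computed as (candidate - baseline) / baseline against the NONE row. The class and method names are hypothetical and are not part of the attached scripts. Values recomputed from the rounded throughputs in the table may differ in the last digit from the reported percentages, which were presumably derived from unrounded averages.

```java
import java.util.Locale;

// Hypothetical helper, not part of the attached YCSB scripts.
final class Lift {
    // Relative change of candidate throughput over baseline, in percent.
    static double liftPercent(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    // Signed, two-decimal rendering, e.g. "+17.41%".
    static String format(double pct) {
        return String.format(Locale.ROOT, "%+.2f%%", pct);
    }

    public static void main(String[] args) {
        // workloadx: NONE vs BASIC/CAM
        System.out.println(format(liftPercent(49_369, 57_965))); // +17.41%
        // workloady: NONE vs BASIC/CAM
        System.out.println(format(liftPercent(17_682, 17_132))); // -3.11%
    }
}
```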



> Better heap utilization for IMC with MSLABs
> -------------------------------------------
>
>                 Key: HBASE-20542
>                 URL: https://issues.apache.org/jira/browse/HBASE-20542
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>            Priority: Major
>         Attachments: HBASE-20542.branch-2.001.patch, run.sh, workloada, 
> workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check whether it will cause an overflow in the active 
> segment *before* it writes the new value (instead of checking the size after 
> the write is completed). If it will, the active segment is atomically 
> swapped with a new empty segment and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on, the IMC daemon runs its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues must be handled with care. We next elaborate 
> on them.
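
The check-before-write idea in the description above can be illustrated with a minimal sketch. This is not the code in HBASE-20542.branch-2.001.patch: every class and method name below is hypothetical, segments only track a byte count instead of holding cells, and the subtle concurrency issues the description refers to (for example, a writer that reserved space in a segment just before it was swapped out) are not handled.

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; not the HBase implementation.
final class PreCheckMemStore {
    // A segment that tracks only its occupied size.
    static final class Segment {
        final long capacity;
        final AtomicLong size = new AtomicLong();

        Segment(long capacity) { this.capacity = capacity; }

        // Reserve space only if the write fits, so size never exceeds capacity.
        boolean tryReserve(long bytes) {
            while (true) {
                long cur = size.get();
                if (cur + bytes > capacity) return false;   // would overflow
                if (size.compareAndSet(cur, cur + bytes)) return true;
            }
        }
    }

    private final long capacity;
    private final AtomicReference<Segment> active;
    private final ConcurrentLinkedDeque<Segment> pipeline = new ConcurrentLinkedDeque<>();

    PreCheckMemStore(long segmentCapacity) {
        this.capacity = segmentCapacity;
        this.active = new AtomicReference<>(new Segment(segmentCapacity));
    }

    // Check for overflow *before* writing; swap the active segment if needed.
    void add(long cellBytes) {
        if (cellBytes > capacity) {
            throw new IllegalArgumentException("cell larger than a segment");
        }
        while (true) {
            Segment seg = active.get();
            if (seg.tryReserve(cellBytes)) {
                return; // the real write into the segment would happen here
            }
            // Only the thread that wins the CAS pushes the full segment.
            if (active.compareAndSet(seg, new Segment(capacity))) {
                pipeline.addFirst(seg); // full, yet not overflowed
            }
            // Retry against whichever segment is active now.
        }
    }

    long activeSize() { return active.get().size.get(); }

    int pipelineDepth() { return pipeline.size(); }

    long largestPipelineSegment() {
        return pipeline.stream().mapToLong(s -> s.size.get()).max().orElse(0L);
    }

    public static void main(String[] args) {
        PreCheckMemStore store = new PreCheckMemStore(100);
        store.add(60);
        store.add(60); // does not fit: first segment is pushed at size 60
        System.out.println(store.pipelineDepth() + " segment(s) in pipeline, active size "
            + store.activeSize());
    }
}
```

In this sketch a segment in the pipeline can never be larger than its capacity, which is the property the pre-check is meant to provide; a check performed after the write would let the last write push the segment past its limit.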



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
