[ https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516186#comment-16516186 ]
Eshcar Hillel commented on HBASE-20542:
---------------------------------------

Attaching the YCSB scripts used for benchmarking. There are two sets of runs [^run.sh]. The first is a write-only zipfian workload [^workloadx] with a 10-region pre-split, followed by a read-only zipfian workload [^workloady] reading only one column. The second is the standard uniform load (a), then mixed read-write [^workloada], then read-only [^workloadc] reading all columns.

This is a comparison of the average throughput and the lift vs. no-IMC:

||comp||index||workloadx||workloady||load||workloada||workloadc||
|NONE|-|49,369|17,682|11,010|10,468|7,779|
|BASIC|CAM|57,965|17,132|11,854|10,318|7,552|
| | |+17.41%|-3.11%|+7.67%|-1.44%|-2.91%|
|BASIC|CCM|52,296|16,644|12,140|9,705|7,465|
| | |+6%|-6%|+10%|-7%|-4%|

> Better heap utilization for IMC with MSLABs
> -------------------------------------------
>
>                 Key: HBASE-20542
>                 URL: https://issues.apache.org/jira/browse/HBASE-20542
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>            Priority: Major
>         Attachments: HBASE-20542.branch-2.001.patch, run.sh, workloada,
> workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized that in-memory compaction combined with
> MSLABs may suffer from heap under-utilization due to internal fragmentation.
> This jira presents a solution to circumvent this problem. The main idea is to
> have each update operation check whether it will overflow the active segment
> *before* it writes the new value (instead of checking the size after the
> write is completed). If it will, the active segment is atomically swapped
> with a new empty segment and is pushed (full-yet-not-overflowed) to the
> compaction pipeline. Later on, the IMC daemon runs its compaction operation
> (flatten index/merge indices/data compaction) in the background.
> Some subtle concurrency issues should be handled with care. We next elaborate
> on them.
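
The check-before-write idea from the issue description can be illustrated with a minimal Java sketch. It is not the code in the attached patch: the class `PreCheckedMemStoreSketch`, its nested `Segment`, and all method names are hypothetical, and segments here only count bytes instead of storing cells. It also leaves out the subtle concurrency issues the description mentions, such as a writer that has reserved space but is still copying data when its segment is pushed to the pipeline.

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names are actual HBase classes.
final class PreCheckedMemStoreSketch {

    /** Stand-in for a segment with a fixed byte budget; it only counts bytes. */
    static final class Segment {
        final long capacity;
        final AtomicLong used = new AtomicLong();

        Segment(long capacity) {
            this.capacity = capacity;
        }

        /** Reserves space only if the write fits, so the segment never overflows. */
        boolean tryReserve(long bytes) {
            while (true) {
                long current = used.get();
                if (current + bytes > capacity) {
                    return false; // would overflow: reject BEFORE writing
                }
                if (used.compareAndSet(current, current + bytes)) {
                    return true;
                }
            }
        }
    }

    private final long segmentCapacity;
    private final AtomicReference<Segment> active;
    // Newest segment first; a background compaction thread would consume these.
    private final ConcurrentLinkedDeque<Segment> pipeline = new ConcurrentLinkedDeque<>();

    PreCheckedMemStoreSketch(long segmentCapacity) {
        this.segmentCapacity = segmentCapacity;
        this.active = new AtomicReference<>(new Segment(segmentCapacity));
    }

    /** Accounts for a write of the given size, swapping the active segment if needed. */
    void add(long bytes) {
        if (bytes > segmentCapacity) {
            throw new IllegalArgumentException("write larger than a whole segment");
        }
        while (true) {
            Segment segment = active.get();
            if (segment.tryReserve(bytes)) {
                return; // the write fits in the active segment
            }
            // The write does not fit: atomically swap in a fresh segment.
            // Only the thread that wins the CAS pushes the old segment.
            if (active.compareAndSet(segment, new Segment(segmentCapacity))) {
                pipeline.addFirst(segment);
            }
            // Winner or loser, retry against whatever segment is active now.
        }
    }

    long activeUsed() {
        return active.get().used.get();
    }

    int pipelineSize() {
        return pipeline.size();
    }

    /** Bytes used by the most recently pushed segment, or -1 if the pipeline is empty. */
    long newestPipelineUsed() {
        Segment newest = pipeline.peekFirst();
        return newest == null ? -1 : newest.used.get();
    }

    public static void main(String[] args) {
        PreCheckedMemStoreSketch store = new PreCheckedMemStoreSketch(100);
        store.add(60);
        store.add(30);
        store.add(20); // does not fit in the remaining 10 bytes, triggers a swap
        System.out.println("pipeline segments: " + store.pipelineSize());
        System.out.println("active bytes used: " + store.activeUsed());
    }
}
```

In this sketch a segment that reaches the pipeline is never over its budget, because space is reserved with a compare-and-set before anything is written. A size check performed after the write could let the last write overshoot the segment's capacity.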
-- This message was sent by Atlassian JIRA (v7.6.3#76005)