[
https://issues.apache.org/jira/browse/HBASE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anoop Sam John updated HBASE-16162:
-----------------------------------
Summary: Compacting Memstore : unnecessary push of active segments to
pipeline (was: Unnecessary push active segments to pipeline)
> Compacting Memstore : unnecessary push of active segments to pipeline
> ---------------------------------------------------------------------
>
> Key: HBASE-16162
> URL: https://issues.apache.org/jira/browse/HBASE-16162
> Project: HBase
> Issue Type: Sub-task
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Priority: Critical
>
> We have a flow like this:
> {code}
> protected void checkActiveSize() {
>   if (shouldFlushInMemory()) {
>     InMemoryFlushRunnable runnable = new InMemoryFlushRunnable();
>     getPool().execute(runnable);
>   }
> }
> private boolean shouldFlushInMemory() {
>   if (getActive().getSize() > inmemoryFlushSize) {
>     // size above flush threshold
>     return (allowCompaction.get() && !inMemoryFlushInProgress.get());
>   }
>   return false;
> }
> void flushInMemory() throws IOException {
>   // Phase I: Update the pipeline
>   getRegionServices().blockUpdates();
>   try {
>     MutableSegment active = getActive();
>     pushActiveToPipeline(active);
>   } finally {
>     getRegionServices().unblockUpdates();
>   }
>   // Phase II: Compact the pipeline
>   try {
>     if (allowCompaction.get() &&
>         inMemoryFlushInProgress.compareAndSet(false, true)) {
>       // setting the inMemoryFlushInProgress flag again for the case this method is invoked
>       // directly (only in tests) in the common path setting from true to true is idempotent
>       // Speculative compaction execution, may be interrupted if flush is forced while
>       // compaction is in progress
>       compactor.startCompaction();
>     }
> {code}
> So every cell write triggers checkActiveSize(). When we are at the border
> of the in-memory flush threshold, many threads writing to this memstore can
> pass the shouldFlushInMemory() check, because the AtomicBoolean
> inMemoryFlushInProgress is still false. It is set to true only some time
> later, once the new thread has started running and has pushed the active
> segment to the pipeline.
> The in-memory flush code that runs in the new thread has no size check. It
> just takes the active segment and pushes it to the pipeline. Yes, we don't
> allow any new writes to the memstore at this time. But before that write
> lock on the region is taken, other handler threads might also have added
> entries to this thread pool. When the first flush finishes, it releases the
> lock on the region, and handler threads waiting to write to the memstore
> might get the lock and add some data. Now the second in-memory flush thread
> may get a chance, acquire the lock, and simply take the current active
> segment and flush it in memory! This produces very small segments in the
> pipeline.
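
The sequence described above can be modelled with a small, deterministic sketch. This is not HBase code: the class InMemoryFlushSketch, the threshold of 100, and the 5-byte follow-up write are all hypothetical, and the interleaving is replayed on a single thread instead of with a real pool. It compares the read-only flag check from the snippet with one possible alternative, in which the check itself claims the flag through compareAndSet. The alternative is shown only to illustrate the difference and is not necessarily the change adopted for this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical single-threaded model of the interleaving described in the issue.
class InMemoryFlushSketch {
    static final long FLUSH_SIZE = 100; // assumed in-memory flush threshold

    /** Returns the sizes of the segments pushed to the (modelled) pipeline. */
    static List<Long> simulate(boolean casInCheck) {
        AtomicBoolean inProgress = new AtomicBoolean(false);
        long active = 101; // active segment is just over the threshold
        int queued = 0;

        // Three handlers run the check before any flush thread has started.
        for (int i = 0; i < 3; i++) {
            boolean overThreshold = active > FLUSH_SIZE;
            boolean dispatch = casInCheck
                ? overThreshold && inProgress.compareAndSet(false, true) // one winner
                : overThreshold && !inProgress.get();                    // everyone passes
            if (dispatch) {
                queued++;
            }
        }

        // Queued flushes run one at a time with no size check of their own.
        // Between two flushes, a handler gets the lock and writes 5 bytes.
        List<Long> pipeline = new ArrayList<>();
        for (int i = 0; i < queued; i++) {
            pipeline.add(active); // push whatever the active segment holds
            active = 5;           // data written after updates are unblocked
        }
        return pipeline;
    }

    public static void main(String[] args) {
        System.out.println(simulate(false)); // [101, 5, 5]
        System.out.println(simulate(true));  // [101]
    }
}
```

With the read-only check, all three handlers queue a flush and the second and third flushes push 5-byte segments. With the flag claimed inside the check, only one flush is queued. Re-checking the active segment size inside the flush itself, under the lock, would be another way to avoid the small segments.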