[
https://issues.apache.org/jira/browse/HBASE-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-2752:
-------------------------
Attachment: 2752.txt
Notes on the patch for 0.20.5RC4:
Adds a DelayQueue to hold regions waiting to be flushed.
Each queue entry is a small data structure that holds the Region and its time
of construction, so we can tell how long it has been sitting in the queue.
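
To illustrate the queue entry described above, here is a minimal sketch. It is
not the code in 2752.txt: the class name FlushQueueEntry, its methods, and the
use of a String in place of the real Region object are all assumptions made for
the example. Only java.util.concurrent.DelayQueue and Delayed are real APIs.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical queue entry: a region name plus the time the entry was built. */
class FlushQueueEntry implements Delayed {
    final String regionName;        // stand-in for the real Region object
    private final long createTime;  // when this entry was constructed
    private long whenToExpire;      // when the queue may hand the entry out

    FlushQueueEntry(String regionName) {
        this.regionName = regionName;
        this.createTime = System.currentTimeMillis();
        this.whenToExpire = this.createTime; // available immediately
    }

    /** How long this entry has existed, i.e. how long the flush has been pending. */
    long ageMillis() {
        return System.currentTimeMillis() - createTime;
    }

    /**
     * Push the entry into the future. Call this only after the entry has been
     * taken out of the queue and before it is added back.
     */
    FlushQueueEntry requeue(long delayMillis) {
        this.whenToExpire = System.currentTimeMillis() + delayMillis;
        return this;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(whenToExpire - System.currentTimeMillis(),
                TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                other.getDelay(TimeUnit.MILLISECONDS));
    }
}
```

A flusher thread would poll a DelayQueue<FlushQueueEntry> and use ageMillis()
to decide whether an entry has waited long enough to be flushed regardless of
store file count.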
The flushRegion method was refactored to remove crud. A lot of the crud was old
comments talking of 'compactions running inline with flush', behaviors
long since left behind. flushRegion was also split into two methods: one
that first checks whether we should delay, and another that holds
everything else. The former is called whenever we flush normally. The latter
is called directly when an emergency flush is required.
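
The split can be sketched roughly as follows. This is an illustration only,
not the patch itself: the class FlushSketch, the method signatures, and the
store file threshold are assumptions. The 90-second limit is the figure
mentioned in the issue description.

```java
/** Hypothetical sketch of the two-method split; names and limits are illustrative. */
class FlushSketch {
    // The 90-second wait mentioned in the issue description.
    static final long MAX_WAIT_MILLIS = 90_000;
    // Assumed store file threshold, chosen for the example.
    static final int BLOCKING_STORE_FILES = 7;

    /**
     * Normal path: check first whether the flush should be delayed.
     * Returns false if the caller should requeue the region, true if it flushed.
     */
    static boolean flushRegion(String region, int storeFiles, long queuedMillis) {
        boolean tooManyFiles = storeFiles > BLOCKING_STORE_FILES;
        if (tooManyFiles && queuedMillis < MAX_WAIT_MILLIS) {
            return false; // delay: too many store files and not yet waited too long
        }
        return flushRegion(region, false);
    }

    /** Everything else. An emergency flush calls this directly and never delays. */
    static boolean flushRegion(String region, boolean emergency) {
        // The actual flush work would happen here.
        return true;
    }
}
```

With this shape, a region that has been queued for 90 seconds or more is
flushed even if it still has too many store files, so it cannot be retried
forever.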
Removed checkStoreFileCount. It was only being used in the emergency flush
case, but its use there was incorrect (if it's an emergency flush, we don't
want to wait just because there are too many store files).
Please review.
> Don't retry forever when waiting on too many store files
> --------------------------------------------------------
>
> Key: HBASE-2752
> URL: https://issues.apache.org/jira/browse/HBASE-2752
> Project: HBase
> Issue Type: Improvement
> Reporter: Jean-Daniel Cryans
> Assignee: stack
> Priority: Critical
> Fix For: 0.20.5, 0.21.0
>
> Attachments: 2752.txt
>
>
> HBASE-2087 introduced a way to not block all flushes when one region has too
> many store files. Unfortunately, that undid the behavior that if we waited
> for longer than 90 secs then we would still flush the region... which
> means that when a region blocks inserts because its memstore is too big it's
> actually holding off writes for a very long time, occupying handlers, etc.
> We need to add more smarts in MemStoreFlusher so that we detect when a region
> was held up for too long.