[
https://issues.apache.org/jira/browse/HBASE-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880426#action_12880426
]
stack commented on HBASE-2752:
------------------------------
Thanks J-D for the review. I added your first suggestion. For the second, I
kept the count; I think it'll be of use when we have a JSP page that dumps the
current state of the flush queue.
I've been running it up on cluster. I see some of these during a big upload:
{code}
2010-06-18 18:02:17,864 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Waited 90495ms on a
compaction to clean up 'too many store files'; waited long enough... proceeding
with flush
{code}
...so it looks like we got the 0.20.3 behavior back, where we'll go ahead and
flush regardless once we've waited N ms (I left the interval at the 0.20.3
value of 90 seconds, which seems a bit long, but...).
I'm going to commit and roll an RC.
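The behavior described above amounts to a bounded wait: if a compaction hasn't brought the store file count back down within the blocking interval, proceed with the flush anyway. A minimal sketch of that decision (hypothetical names, not the actual HBase patch; the 90-second interval is the one mentioned above):

```java
// Hypothetical sketch of the bounded-wait flush decision; not the real
// MemStoreFlusher code. Names and constants are illustrative.
public class FlushWaitSketch {
    // 0.20.3-era blocking interval mentioned in the comment above.
    static final long BLOCKING_WAIT_MS = 90 * 1000L;

    /**
     * Decide whether to proceed with a flush now.
     *
     * @param startMs            when we first started waiting on a compaction
     * @param nowMs              current time
     * @param tooManyStoreFiles  whether the region still has too many store files
     * @return true if we should flush now (either the store file count is fine,
     *         or we've waited long enough and will flush regardless)
     */
    static boolean shouldProceed(long startMs, long nowMs, boolean tooManyStoreFiles) {
        if (!tooManyStoreFiles) {
            return true; // nothing blocking the flush
        }
        // Waited long enough... proceeding with flush regardless.
        return (nowMs - startMs) >= BLOCKING_WAIT_MS;
    }

    public static void main(String[] args) {
        // Still too many files, only 10s elapsed: keep waiting.
        System.out.println(shouldProceed(0, 10_000, true));
        // Waited 90,495 ms (as in the log line above): proceed with flush.
        System.out.println(shouldProceed(0, 90_495, true));
    }
}
```

The point of the bound is exactly what the issue below describes: without it, a region whose memstore is over the limit can hold off writes indefinitely while waiting on a compaction, tying up handlers.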
> Don't retry forever when waiting on too many store files
> --------------------------------------------------------
>
> Key: HBASE-2752
> URL: https://issues.apache.org/jira/browse/HBASE-2752
> Project: HBase
> Issue Type: Improvement
> Reporter: Jean-Daniel Cryans
> Assignee: stack
> Priority: Critical
> Fix For: 0.20.5, 0.21.0
>
> Attachments: 2752.txt
>
>
> HBASE-2087 introduced a way to not block all flushes when one region has too
> many store files. Unfortunately, that undid the behavior where, if we had
> waited longer than 90 secs, we would still flush the region... which means
> that when a region blocks inserts because its memstore is too big, it's
> actually holding off writes for a very long time, occupying handlers, etc.
> We need to add more smarts to MemStoreFlusher so that we detect when a region
> has been held up for too long.