[ 
https://issues.apache.org/jira/browse/HBASE-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208708#comment-14208708
 ] 

Lars Hofhansl commented on HBASE-12457:
---------------------------------------

We've seen this on two machines now. The wait on the other machine was also 
close to 20m (18m to be precise).

Last question: Is a 30s wait before we interrupt good enough? The 
compactions should cancel themselves (in our case we find that, unless they 
hang in the described way, they cancel themselves within 8s). We could 
wait a minute instead. Not sure.
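
To make the question concrete, here is a minimal sketch of the "ask for a cancel, wait a grace period, then interrupt" pattern. It is not HBase code and not the attached patch: the class GracefulInterrupt, the method stopWithGrace, and the cancelRequested flag are hypothetical names used only for illustration, and the grace period is a plain parameter, so 30s versus 60s is just a different argument.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration only; none of these names exist in HBase. */
final class GracefulInterrupt {

  /**
   * Requests a cooperative cancel, waits up to graceMillis for the worker to
   * exit, and interrupts it only if it is still alive after that.
   * graceMillis must be greater than 0, because Thread.join(0) waits forever.
   *
   * @return true if the worker exited on its own, false if it was interrupted
   */
  static boolean stopWithGrace(Thread worker, AtomicBoolean cancelRequested,
                               long graceMillis) {
    cancelRequested.set(true); // a well-behaved worker polls this flag
    boolean callerInterrupted = false;
    try {
      worker.join(graceMillis); // grace period for the cooperative cancel
    } catch (InterruptedException e) {
      callerInterrupted = true; // remember it and restore the flag below
    }
    boolean exitedOnItsOwn = !worker.isAlive();
    if (!exitedOnItsOwn) {
      worker.interrupt(); // last resort for a worker that appears stuck
    }
    if (callerInterrupted) {
      Thread.currentThread().interrupt();
    }
    return exitedOnItsOwn;
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicBoolean cancel = new AtomicBoolean(false);
    // Simulates a worker that never checks the flag.
    Thread stuck = new Thread(() -> {
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException e) {
        // The interrupt is the only thing that gets this worker out.
      }
    });
    stuck.start();
    boolean clean = stopWithGrace(stuck, cancel, 200);
    stuck.join();
    System.out.println("exited on its own: " + clean);
  }
}
```

With a worker that polls the flag, the call returns true well before the grace period ends, so a longer grace period only costs time in the stuck case.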


> Regions in transition for a long time when CLOSE interleaves with a slow 
> compaction
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-12457
>                 URL: https://issues.apache.org/jira/browse/HBASE-12457
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.7
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 2.0.0, 0.98.8, 0.99.2
>
>         Attachments: 12457-combined-0.98-v2.txt, 12457-combined-0.98.txt, 
> 12457-minifix.txt, 12457.interrupt-v2.txt, 12457.interrupt.txt
>
>
> Under heavy load we have observed regions remaining in transition for 20 
> minutes when the master requests a close while a slow compaction is running.
> The pattern is always something like this:
> # RS starts a compaction
> # HM requests the region to be closed on this RS
> # Compaction is not aborted for another 20 minutes
> # The region is in transition and not usable.
> In every case I tracked down so far the time between the requested CLOSE and 
> abort of the compaction is almost exactly 20 minutes, which is suspicious.
> Of course part of the issue is having compactions that take over 20 minutes, 
> but maybe we can do better here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
