[ 
https://issues.apache.org/jira/browse/HBASE-25720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377934#comment-17377934
 ] 

Xiaolin Ha commented on HBASE-25720:
------------------------------------

Hi [~stack], we always noticed this problem only after the RS had killed itself, 
so, sorry, there is no jstack yet, and we have no further ideas about why the WAL 
sync gets stuck. We have written a monitoring script for this problem, though. I'll 
attach the jstack files once we get them and will dig further into this. Thanks.

> Sync WAL stuck when prepare flush cache will prevent flush cache and cause OOM
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-25720
>                 URL: https://issues.apache.org/jira/browse/HBASE-25720
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 1.4.13
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>         Attachments: prepare-flush-cache-stuck.png
>
>
> We call HRegion#doSyncOfUnflushedWALChanges when preparing to flush the cache. 
> But this WAL sync may get stuck and abort the cache flush. 
> !prepare-flush-cache-stuck.png|width=519,height=246!
> If we do not become aware of this problem in time, the RS will be OOM-killed.
> I think we should force the RS to abort when the sync is stuck during preparation, 
> as is done when committing snapshots.
>  
>  
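
The proposal in the description (abort the RS when the WAL sync is stuck while preparing the flush) could look roughly like the sketch below. This is only an illustration under assumptions, not the actual HBase code or the patch for this issue: WalSync, Abortable, syncOrAbort and the timeoutMs parameter are hypothetical names invented here, and running the sync on a separate thread is just one possible way to bound the wait.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PrepareFlushSyncSketch {
    /** Hypothetical stand-in for the WAL sync done while preparing a flush. */
    public interface WalSync { void sync() throws Exception; }

    /** Hypothetical stand-in for whatever aborts the region server. */
    public interface Abortable { void abort(String why, Throwable cause); }

    /**
     * Runs the sync on a worker thread and waits at most timeoutMs (assumed parameter).
     * Returns true if the sync finished in time. If the wait times out, calls
     * server.abort(...) and returns false.
     */
    public static boolean syncOrAbort(WalSync wal, Abortable server, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "prepare-flush-sync");
            t.setDaemon(true); // a stuck sync must not keep the JVM alive
            return t;
        });
        // Block lambda returning a value, so it is submitted as a Callable.
        Future<?> pending = executor.submit(() -> { wal.sync(); return null; });
        try {
            pending.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The sync is considered stuck: abort instead of waiting until OOM.
            pending.cancel(true);
            server.abort("WAL sync stuck while preparing flush", e);
            return false;
        } catch (ExecutionException e) {
            // A sync that fails (rather than hangs) is left to the caller here.
            throw new IllegalStateException("WAL sync failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With this shape, the prepare step would call something like syncOrAbort(wal, server, timeoutMs) and give up on the flush when it returns false. What timeout to use, and whether a failed sync should also abort, are open choices that the sketch does not settle.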



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
