[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

Duo Zhang (JIRA) Wed, 23 Jan 2019 22:43:17 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750782#comment-16750782
 ]


Duo Zhang commented on HBASE-21564:
-----------------------------------

We can confirm that 2.0+ is impacted, but I'm not sure whether the 1.x 
branches are safe... The LogRoller part has always been a pain and needs to 
be refactored...

Personally I would like to make it multi-threaded. For now, if there is a 
bad WAL which makes us retry for a long time, the rolling of all other WALs 
(if any, for example when we use multi WAL) will be blocked...
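
To illustrate the multi-threaded idea, here is a minimal sketch in which each WAL is rolled on its own worker thread, so one WAL that keeps retrying cannot hold up the others. It is not HBase's LogRoller: the class ParallelRollSketch, the Rollable interface and the rollAll method are all hypothetical names made up for this example, and the timeout policy is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ParallelRollSketch {
    /** Hypothetical stand-in for a WAL that can be rolled. */
    public interface Rollable {
        void roll() throws Exception;
    }

    /**
     * Rolls every WAL on its own thread and waits up to timeoutMs overall.
     * Returns, per WAL name, whether its roll finished in time.
     */
    public static Map<String, Boolean> rollAll(Map<String, Rollable> wals, long timeoutMs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, wals.size()));
        Map<String, Future<?>> futures = new LinkedHashMap<>();
        for (Map.Entry<String, Rollable> e : wals.entrySet()) {
            Rollable wal = e.getValue();
            // Submitted as a Callable so roll() may throw a checked exception.
            futures.put(e.getKey(), pool.submit(() -> { wal.roll(); return null; }));
        }
        Map<String, Boolean> results = new LinkedHashMap<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        for (Map.Entry<String, Future<?>> e : futures.entrySet()) {
            long remaining = Math.max(0L, deadline - System.nanoTime());
            try {
                e.getValue().get(remaining, TimeUnit.NANOSECONDS);
                results.put(e.getKey(), true);
            } catch (ExecutionException | TimeoutException ex) {
                // A failed or stuck roll is abandoned here; a real roller would retry.
                e.getValue().cancel(true);
                results.put(e.getKey(), false);
            }
        }
        pool.shutdownNow();
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Rollable> wals = new LinkedHashMap<>();
        // The "stuck" WAL blocks until interrupted; the "healthy" one returns at once.
        wals.put("stuck", () -> new CountDownLatch(1).await());
        wals.put("healthy", () -> { });
        System.out.println(rollAll(wals, 500));
    }
}
```

In this sketch the healthy WAL is rolled even though the stuck one was submitted first, which is the behaviour a single roller thread cannot provide. Whether one thread per WAL or a shared pool is the better fit is a design choice left open here.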

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-21564
>                 URL: https://issues.apache.org/jira/browse/HBASE-21564
>             Project: HBase
>          Issue Type: Bug
>          Components: asyncclient, wal
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>         Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch, 
> HBASE-21564.master.004.patch, HBASE-21564.master.005.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.
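
The lost request described above can be reproduced in a few lines. The sketch below is not the HBase code or the attached patch: RollRequestSketch and its methods are hypothetical names, and the "request arriving during the roll" is simulated with a callback instead of a second thread. It contrasts clearing the flag after the roll (the racy pattern) with acknowledging the request atomically before the roll.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class RollRequestSketch {
    // Plays the role of the roller's "roll requested" flag.
    private final AtomicBoolean rollRequested = new AtomicBoolean(false);
    private int rolls = 0;

    /** Called by the WAL when it wants a roll, e.g. after exceeding the size limit. */
    public void requestRoll() {
        rollRequested.set(true);
    }

    public boolean isRollPending() {
        return rollRequested.get();
    }

    public int rollCount() {
        return rolls;
    }

    /** Racy pattern: the flag is reset only after the roll, in finally. */
    public void rollThenClear(Runnable duringRoll) {
        if (!rollRequested.get()) {
            return;
        }
        try {
            rolls++;          // writer replaced here
            duringRoll.run(); // simulates a new request arriving in the window
        } finally {
            rollRequested.set(false); // blindly forgets the new request
        }
    }

    /** Atomic acknowledgment: the flag is consumed before the roll starts. */
    public void acknowledgeThenRoll(Runnable duringRoll) {
        if (!rollRequested.compareAndSet(true, false)) {
            return;
        }
        rolls++;
        duringRoll.run(); // a request arriving now sets the flag again and survives
    }

    public static void main(String[] args) {
        RollRequestSketch racy = new RollRequestSketch();
        racy.requestRoll();
        racy.rollThenClear(racy::requestRoll);
        System.out.println("racy, pending after roll: " + racy.isRollPending());

        RollRequestSketch atomic = new RollRequestSketch();
        atomic.requestRoll();
        atomic.acknowledgeThenRoll(atomic::requestRoll);
        System.out.println("atomic, pending after roll: " + atomic.isRollPending());
    }
}
```

With the racy variant the second request is gone once the roll finishes, while with the atomic variant it is still pending and would be picked up on the roller's next pass.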



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
