[ https://issues.apache.org/jira/browse/HBASE-27230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang reopened HBASE-27230:
-------------------------------

For backporting to other branches.

> RegionServer should be aborted when WAL.sync throws TimeoutIOException
> ----------------------------------------------------------------------
>
>                 Key: HBASE-27230
>                 URL: https://issues.apache.org/jira/browse/HBASE-27230
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 3.0.0-alpha-4
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.0.0-alpha-4
>
>
> As HBASE-27223 said, if {{WAL.sync}} gets a timeout exception, we should
> abort the region server, because WAL sync is designed to either succeed or
> die; there is no 'failure'. This is usually not a big deal because we set a
> very large default value (5 minutes) for {{AbstractFSWAL.WAL_SYNC_TIMEOUT_MS}},
> and the WAL system will usually abort the region server itself if it cannot
> finish the sync within 5 minutes.
> In the PR, the region server is always aborted only for a {{WAL.sync}}
> timeout in {{HRegion#doWALAppend}}. {{WALUtil.writeMarker}} only records
> internal state, so it seems unnecessary to always abort the region server
> when {{WAL.sync}} times out there; the internal state transition determines
> whether the region server is aborted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)