[
https://issues.apache.org/jira/browse/HBASE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Kyle Purtell reassigned HBASE-24099:
-------------------------------------------
Assignee: Andrew Kyle Purtell
> Use a fair ReentrantReadWriteLock for the region lock used to guard closes
> --------------------------------------------------------------------------
>
> Key: HBASE-24099
> URL: https://issues.apache.org/jira/browse/HBASE-24099
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Kyle Purtell
> Assignee: Andrew Kyle Purtell
> Priority: Major
>
> Consider creating the region's ReentrantReadWriteLock with the fair locking
> policy. We have had a couple of production incidents where a regionserver
> stalled in shutdown for a very long time, leading to RIT (FAILED_CLOSE).
> The latest example is a 43-minute shutdown in which ~40 minutes (2465280 ms)
> were spent waiting to acquire the write lock on the region in order to
> finish closing it.
> {quote}
> ...
> Finished memstore flush of ~66.92 MB/70167112, currentsize=0 B/0 for region
> XXXX. in 927ms, sequenceid=6091133815, compaction requested=false at
> 1585175635349 (+60 ms)
> Disabling writes for close at 1585178100629 (+2465280 ms)
> {quote}
> This time was spent between the memstore flush and the task status change
> "Disabling writes for close at...". This is at HRegion.java:1481 in 1.3.6:
> {code}
> 1480: // block waiting for the lock for closing
> 1481: lock.writeLock().lock(); // FindBugs: Complains UL_UNRELEASED_LOCK_EXCEPTION_PATH but seems fine
> {code}
>
> The close lock is operating in non-fair mode. The table in question is under
> constant high query load. When the close request was received there were
> active readers, and more kept arriving after it, producing near-continuous
> read contention. Although the clients received RegionServerStoppingException
> and other error notifications, they kept coming, because the region could not
> be reassigned and region (re-)location kept finding it still hosted on the
> stuck server. The closing thread waiting for the write lock finally stopped
> being starved, by chance, after 40 minutes.
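> A minimal, self-contained sketch (not from the issue) of the contention
> pattern described above: continuous readers standing in for query handlers,
> and a single thread standing in for the close path blocking on the write
> lock. The thread count and sleep durations are arbitrary assumptions, and how
> long the writer is actually postponed will depend on the JVM and scheduling;
> the point is only to illustrate the shape of the contention.
> {code}
> import java.util.concurrent.locks.ReentrantReadWriteLock;
>
> public class NonFairLockContentionSketch {
>   public static void main(String[] args) throws InterruptedException {
>     // false = non-fair, the current behavior; flip to true to compare
>     final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(false);
>
>     // Stand-ins for handler threads serving constant read traffic
>     for (int i = 0; i < 8; i++) {
>       Thread reader = new Thread(() -> {
>         while (!Thread.currentThread().isInterrupted()) {
>           lock.readLock().lock();
>           try {
>             Thread.sleep(1); // hold briefly, then immediately re-acquire
>           } catch (InterruptedException e) {
>             return;
>           } finally {
>             lock.readLock().unlock();
>           }
>         }
>       });
>       reader.setDaemon(true);
>       reader.start();
>     }
>
>     Thread.sleep(100); // let the read traffic ramp up
>
>     // Stand-in for the close path blocking on the region write lock
>     long start = System.currentTimeMillis();
>     lock.writeLock().lock();
>     try {
>       System.out.println("write lock acquired after "
>           + (System.currentTimeMillis() - start) + " ms");
>     } finally {
>       lock.writeLock().unlock();
>     }
>   }
> }
> {code}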
> The ReentrantReadWriteLock javadoc is clear about the possibility of
> starvation when continuously contended: "_When constructed as non-fair (the
> default), the order of entry to the read and write lock is unspecified,
> subject to reentrancy constraints. A nonfair lock that is continuously
> contended may indefinitely postpone one or more reader or writer threads, but
> will normally have higher throughput than a fair lock._"
> We could try changing the acquisition semantics of this lock to fair. This is
> a one-line change at the point where we call the RW lock constructor. Then:
> "_When constructed as fair, threads contend for entry using an approximately
> arrival-order policy. When the currently held lock is released, either the
> longest-waiting single writer thread will be assigned the write lock, or if
> there is a group of reader threads waiting longer than all waiting writer
> threads, that group will be assigned the read lock._"
> This would be better. The close process will still have to wait until all
> readers and writers already queued for acquisition either acquire and release
> or go away, but it won't be starved by future/incoming requests.
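> As a sketch, the change would be at the construction of the region lock,
> shown here against a simplified stand-in rather than the actual HRegion field
> declaration (the real field name and surrounding code may differ):
> {code}
> // Today: default constructor, non-fair acquisition order
> // final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
>
> // Proposed: fair = true, approximately arrival-order acquisition
> final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
> {code}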
> There could be a throughput loss in request handling, though, because this is
> the global reentrant RW lock for the region.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)