On 01/25/2016 02:17 AM, Richard Wareing wrote:
Hello all,
Just gave a talk at SCaLE 14x today and I mentioned our new locks
revocation feature which has had a significant impact on our GFS
cluster reliability. As such I wanted to share the patch with the
community, so here's the bugzilla report:
https://bugzilla.redhat.com/show_bug.cgi?id=1301401
=====
Summary:
Mis-behaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster
instability and eventual complete unavailability due to failures in
releasing entry/inode locks in a timely manner.
Classic symptoms of this are increased brick (and/or gNFSd) memory
usage due to the high number of (lock request) frames piling up in the
processes. The failure-mode results in bricks eventually slowing down
to a crawl due to swapping, or OOMing due to complete memory
exhaustion; during this period the entire cluster can begin to fail.
End-users will experience this as hangs on the filesystem, first in a
specific region of the file-system and ultimately the entire
filesystem as the offending brick begins to turn into a zombie (i.e.
not quite dead, but not quite alive either).
Currently, these situations must be handled by an administrator
detecting & intervening via the "clear-locks" CLI command.
Unfortunately this doesn't scale for large numbers of clusters, and
it depends on the correct (external) detection of the locks piling up
(for which there is little signal other than state dumps).
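For reference, that manual workflow looks roughly like this (volume name,
path and lock range are illustrative; see the CLI help for the full
clear-locks syntax):

  # Dump brick state to spot piled-up blocked lock frames
  gluster volume statedump myvol

  # Manually clear the offending inode locks on the affected path
  gluster volume clear-locks myvol /some/dir kind all inode 0,0-0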
This patch introduces two features to remedy this situation:
1. Monkey-unlocking - This is a feature targeted at developers (only!)
to help track down crashes due to stale locks, and to prove the utility
of the lock revocation feature. It does this by silently dropping 1%
of unlock requests, simulating bugs or mis-behaving clients.
The feature is activated via:
features.locks-monkey-unlocking <on/off>
You'll see a message like
"[<timestamp>] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY
LOCKING (forcing stuck lock)!"
in the logs, indicating a request has been dropped.
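Assuming the usual volume-set syntax, turning it on would look like this
(volume name is illustrative):

  gluster volume set myvol features.locks-monkey-unlocking on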
2. Lock revocation - Once enabled, this feature will revoke a
*contended* lock (i.e. if nobody else asks for the lock, we will not
revoke it) either by the amount of time the lock has been held, how
many other lock requests are waiting on the lock to be freed, or some
combination of both. Clients which are losing their locks will be
notified by receiving EAGAIN (sent back to their callback function).
The feature is activated via these options:
features.locks-revocation-secs <integer; 0 to disable>
features.locks-revocation-clear-all [on/off]
features.locks-revocation-max-blocked <integer>
Recommended settings are: 1800 seconds for a time-based timeout (give
clients the benefit of the doubt). Choosing a max-blocked value requires
some experimentation depending on your workload, but generally values of
hundreds to low thousands work well (it's normal for many tens of locks
to be taken out when files are being written at high throughput).
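As a concrete starting point, using the standard volume-set syntax (volume
name and the max-blocked value are illustrative and should be tuned to
your workload):

  gluster volume set myvol features.locks-revocation-secs 1800
  gluster volume set myvol features.locks-revocation-max-blocked 500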
I really like this feature. One question though: self-heal and rebalance
domain locks are held until the self-heal/rebalance completes, which can
take more than 30 minutes if the files are in TBs. I will try to see
what we can do to handle these without increasing the revocation-secs
too much. Maybe we can come up with per-domain revocation timeouts.
Comments are welcome.
Pranith
=====
The patch supplied will apply cleanly to the v3.7.6 release tag, and
probably to any 3.7.x release & master (the posix locks xlator is rarely
touched).
Richard
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel