Hello,

We use ZFS and NFS storage fairly heavily, and back in 'the Sun days' we used to have trouble with OpenOwner locks leaking, which required periodic reboots of the NFS servers.

At the time, thanks to the Sun engineers, in particular Marcel Telka, the problem was eventually tracked down to:

~~~Quote~~~

It looks like our NFSv4 server does not follow this (from RFC 3530):

    A given client might generate many open_owner4 data structures for a
    given clientid. The client will periodically either dispose of its
    open_owner4s or stop using them for indefinite periods of time. The
    latter situation is why the NFS version 4 protocol does not have an
    explicit operation to exit an open_owner4: such an operation is of no
    use in that situation. Instead, to avoid unbounded memory use, the
    server needs to implement a strategy for disposing of open_owner4s
    that have no current lock, open, or delegation state for any files
    and have not been used recently. The time period used to determine
    when to dispose of open_owner4s is an implementation choice. The time
    period should certainly be no less than the lease time plus any grace
    period the server wishes to implement beyond a lease time. The
    OPEN_CONFIRM operation allows the server to safely dispose of unused
    open_owner4 data structures.

Apparently, unused OpenOwner entries are not disposed after some period of time in case the client is active somehow. They are disposed only for inactive clients. It is visible in rfs4_openowner_expiry().

This is similar to CR 6906432 but it is a completely different scenario. I believe this is a bug, not yet covered by any filed CR, nor fixed.

FYI, I filed this CR:

6976554 Stale OpenOwner entries are not reaped for active clients

~~~Quote~~~

Looking to the future, we are exploring changing our NFS storage OS, and have tried illumos (OpenIndiana). Alas, we appear to be hitting the same trouble again. I suppose the issue was never fixed in OpenSolaris/illumos. What are the chances of this happening?
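
To make the RFC expectation concrete, here is a minimal Python sketch of the reaping rule described in the quoted passage. It is only an illustration, not the illumos code: the OpenOwner class, the reap_open_owners() function, and the 90-second lease and grace values are all hypothetical names and numbers chosen for the example. The point it shows is that each owner is judged by its own state and its own last use, regardless of whether its client is otherwise active.

```python
from dataclasses import dataclass

# Assumed values for illustration only; not read from any real server.
LEASE_SECONDS = 90
GRACE_SECONDS = 90


@dataclass
class OpenOwner:
    """Hypothetical stand-in for a server-side open_owner4 record."""
    name: str
    last_used: float      # time of last use, in seconds
    state_count: int = 0  # current opens + locks + delegations


def reap_open_owners(owners, now, idle_limit=LEASE_SECONDS + GRACE_SECONDS):
    """Return the owners to keep.

    An owner is dropped only if it holds no state AND has been idle for
    longer than idle_limit. Client activity is deliberately not consulted.
    """
    return [o for o in owners
            if o.state_count > 0 or (now - o.last_used) <= idle_limit]


owners = [
    OpenOwner("idle-no-state", last_used=0.0),
    OpenOwner("idle-with-open", last_used=0.0, state_count=1),
    OpenOwner("recent-no-state", last_used=900.0),
]
kept = reap_open_owners(owners, now=1000.0)
print([o.name for o in kept])
```

With these sample values only the first owner is reaped: it has no state and has been idle for 1000 seconds, well past the assumed 180-second limit.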

# uname -a
SunOS nfs02.dw 5.11 oi_151a4 i86pc i386 i86pc Solaris

# echo '::rfs4_db' | mdb -k
rfs4_database=ffffffa6167afb50
  debug_flags=00000000 shutdown: count=0 tables=ffffff2646d4fd60
------------------- Table ------------------- Bkt  -------- Indices ---------
Address          Name         Flags       Cnt  Cnt Pointer          Cnt  Max
ffffff2646d4fd60 DelegStateID 00000000  12057 2047 fffffff7e25f31c0 0002 0002
fffffffeb5ea4630 File         00000000  19922 2047 ffffffefcfb7a140 0001 0001
ffffffefbc0b23f0 Lockowner    00000000   2035 2047 ffffffd58124cc00 0002 0002
ffffffefbcad8088 LockStateID  00000000   1743 2047 fffffffe2347ac40 0002 0002
ffffffffb8e56c88 OpenStateID  00000000   9270 2047 ffffffd5827202c0 0003 0003
ffffffefbb3641b8 OpenOwner    00000000 705410 2047 ffffffff7d187c40 0001 0001
ffffffefc2dd0358 ClntIP       00000000   0000 2047 ffffffd581d2df40 0001 0001
ffffffefbd2370c0 Client       00000000   0007 2047 fffffffedb66ba00 0002 0002

Note in particular the OpenOwner count. Running locktest once from a client raises it from 705957 to 706022:

[email protected]:~# echo '::rfs4_db' | mdb -k | grep OpenOwner
ffffffefbb3641b8 OpenOwner    00000000 705957 2047 ffffffff7d187c40 0001 0001

root@nfs-client# ./locktest

[email protected]:~# echo '::rfs4_db' | mdb -k | grep OpenOwner
ffffffefbb3641b8 OpenOwner    00000000 706022 2047 ffffffff7d187c40 0001 0001

The locktest Perl program can be found here:

http://mail.opensolaris.org/pipermail/nfs-discuss/2010-October/002154.html

Jorgen Lundman

-- 
Jorgen Lundman       | <[email protected]>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500 (cell)
Japan                | +81 (0)3 -3375-1767 (home)
