Hello,

We use ZFS and NFS storage fairly heavily, and back in 'the Sun days' we
used to have trouble with OpenOwner locks leaking, which required periodic
reboots of the NFS servers.

At the time, thanks to the Sun engineers, in particular Marcel Telka, the
problem was eventually tracked down to the following:

~~~Quote~~~
It looks like our NFSv4 server does not follow this (from RFC 3530):

   A given client might generate many open_owner4 data structures for a
   given clientid.  The client will periodically either dispose of its
   open_owner4s or stop using them for indefinite periods of time.  The
   latter situation is why the NFS version 4 protocol does not have an
   explicit operation to exit an open_owner4: such an operation is of no
   use in that situation.  Instead, to avoid unbounded memory use, the
   server needs to implement a strategy for disposing of open_owner4s
   that have no current lock, open, or delegation state for any files
   and have not been used recently.  The time period used to determine
   when to dispose of open_owner4s is an implementation choice.  The
   time period should certainly be no less than the lease time plus any
   grace period the server wishes to implement beyond a lease time.  The
   OPEN_CONFIRM operation allows the server to safely dispose of unused
   open_owner4 data structures.

Apparently, unused OpenOwner entries are not disposed after some period of time
in case the client is active somehow. They are disposed only for inactive
clients. It is visible in rfs4_openowner_expiry(). This is similar to CR
6906432 but it is a completely different scenario. I believe this is a bug, not
yet covered by any filed CR, nor fixed.


FYI, I filed this CR:

6976554 Stale OpenOwner entries are not reaped for active clients

~~~Quote~~~
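
To make the quoted RFC 3530 requirement concrete, here is a minimal sketch of
the disposal rule as I read it: an open-owner can be reaped once it holds no
lock, open, or delegation state and has been idle for longer than the lease
time plus a grace period, whether or not its client is still active. This is
not the illumos code and does not reproduce rfs4_openowner_expiry(); the
class, the function names, and the 90-second values are all assumptions made
for illustration.

```python
from dataclasses import dataclass

LEASE_SECONDS = 90   # assumed lease time, illustration only
GRACE_SECONDS = 90   # assumed extra grace period, illustration only


@dataclass
class OpenOwner:
    """Hypothetical stand-in for a server-side open_owner4 record."""
    name: str
    last_used: float       # time of last use, in seconds
    state_count: int = 0   # current locks + opens + delegations


def is_reapable(owner, now, idle_limit=LEASE_SECONDS + GRACE_SECONDS):
    """True if the owner holds no state and has been idle longer than idle_limit.

    The decision looks only at the owner itself, not at whether the
    client that created it is still active.
    """
    return owner.state_count == 0 and (now - owner.last_used) > idle_limit


def reap(owners, now):
    """Return the owners that survive one reaping pass."""
    return [o for o in owners if not is_reapable(o, now)]


if __name__ == "__main__":
    owners = [
        OpenOwner("idle", last_used=0.0),
        OpenOwner("holding-open", last_used=0.0, state_count=1),
        OpenOwner("recent", last_used=900.0),
    ]
    print([o.name for o in reap(owners, now=1000.0)])
```

With these made-up values, only the stateless owner that has been idle for
longer than 180 seconds is dropped; the other two are kept.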

Looking to the future, we are exploring changing our NFS storage OS, and
have tried illumos (OpenIndiana).

Alas, we appear to be hitting this problem yet again. I suppose the issue was
never fixed in OpenSolaris/illumos. What are the chances of it getting fixed?

# uname -a
SunOS nfs02.dw 5.11 oi_151a4 i86pc i386 i86pc Solaris

# echo '::rfs4_db' | mdb -k
rfs4_database=ffffffa6167afb50
  debug_flags=00000000   shutdown:      count=0 tables=ffffff2646d4fd60
------------------ Table ------------------- Bkt  ------- Indices -------
Address          Name          Flags    Cnt  Cnt  Pointer          Cnt  Max
ffffff2646d4fd60 DelegStateID  00000000 12057 2047 fffffff7e25f31c0 0002 0002
fffffffeb5ea4630 File          00000000 19922 2047 ffffffefcfb7a140 0001 0001
ffffffefbc0b23f0 Lockowner     00000000 2035 2047 ffffffd58124cc00 0002 0002
ffffffefbcad8088 LockStateID   00000000 1743 2047 fffffffe2347ac40 0002 0002
ffffffffb8e56c88 OpenStateID   00000000 9270 2047 ffffffd5827202c0 0003 0003
ffffffefbb3641b8 OpenOwner     00000000 705410 2047 ffffffff7d187c40 0001 0001
ffffffefc2dd0358 ClntIP        00000000 0000 2047 ffffffd581d2df40 0001 0001
ffffffefbd2370c0 Client        00000000 0007 2047 fffffffedb66ba00 0002 0002

In particular, note the OpenOwner table, whose entry count (705410 above)
keeps growing:

[email protected]:~# echo '::rfs4_db' | mdb -k | grep OpenOwner
ffffffefbb3641b8 OpenOwner     00000000 705957 2047 ffffffff7d187c40 0001 0001

root@nfs-client# ./locktest

[email protected]:~# echo '::rfs4_db' | mdb -k | grep OpenOwner
ffffffefbb3641b8 OpenOwner     00000000 706022 2047 ffffffff7d187c40 0001 0001
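
For anyone who wants to watch the leak over time, here is a rough Python
sketch that pulls the entry count for each table out of saved '::rfs4_db'
output and compares two samples. It assumes the eight-column row layout shown
in this mail (address, name, flags, count, bucket count, pointer, index count,
index max); the function name table_counts is my own, and the script only
parses text, it does not run mdb itself.

```python
def table_counts(rfs4_db_output):
    """Map each table name to its entry count (the first 'Cnt' column)."""
    counts = {}
    for line in rfs4_db_output.splitlines():
        fields = line.split()
        # Data rows have exactly 8 whitespace-separated fields and a
        # numeric count in the 4th column; header rows do not.
        if len(fields) == 8 and fields[3].isdigit():
            counts[fields[1]] = int(fields[3])
    return counts


if __name__ == "__main__":
    # The two OpenOwner rows from this mail, before and after ./locktest.
    before = "ffffffefbb3641b8 OpenOwner     00000000 705957 2047 ffffffff7d187c40 0001 0001"
    after = "ffffffefbb3641b8 OpenOwner     00000000 706022 2047 ffffffff7d187c40 0001 0001"
    growth = table_counts(after)["OpenOwner"] - table_counts(before)["OpenOwner"]
    print("OpenOwner grew by", growth)
```

For the two samples above, the difference works out to 65 new OpenOwner
entries between the two runs.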


The locktest Perl program can be found here:
 http://mail.opensolaris.org/pipermail/nfs-discuss/2010-October/002154.html


Jorgen Lundman

-- 
Jorgen Lundman       | <[email protected]>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)


-------------------------------------------
illumos-discuss