Hi Jorgen,

This was filed for illumos as https://www.illumos.org/issues/417 and fixed in
summer 2012. The illumos fix differs from the one I did in Solaris. I haven't
looked at the illumos fix thoroughly, but from a quick scan I think it should
work.
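
For reference, the reaping rule in the RFC 3530 text quoted below can be
sketched roughly as follows. This is only an illustration in Python: the
OpenOwner record, its fields, and the is_reapable()/reap() helpers are
hypothetical names, and it is not the code from the illumos or the Solaris fix.

```python
from dataclasses import dataclass


@dataclass
class OpenOwner:
    """Hypothetical stand-in for a server-side open_owner4 record."""
    last_used: float      # time of last use, in seconds
    open_count: int = 0   # current open state
    lock_count: int = 0   # current lock state
    deleg_count: int = 0  # current delegation state


def is_reapable(owner, now, lease_time, grace=0.0):
    """True if the owner holds no state and has been idle longer than
    the lease time plus any extra grace period."""
    has_state = owner.open_count or owner.lock_count or owner.deleg_count
    idle = now - owner.last_used
    return (not has_state) and idle > lease_time + grace


def reap(owners, now, lease_time, grace=0.0):
    """Return the owners to keep. The decision is made per owner, so it
    does not depend on whether the owning client is otherwise active."""
    return [o for o in owners if not is_reapable(o, now, lease_time, grace)]
```

The point of the sketch is that the check looks only at the open owner itself,
so stale entries go away even when the client keeps its lease alive.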

The fix should be part of OI 151a7. Are you able to reproduce your issue
with 151a7? If so, please file a new bug.
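
One way to check is to compare the OpenOwner count from ::rfs4_db before and
after a locktest run, as in your report. A minimal sketch, assuming the Cnt
column is decimal and the row layout matches the output quoted below;
table_count() and sample_rfs4_db() are hypothetical helper names.

```python
import subprocess


def table_count(rfs4_db_output, table="OpenOwner"):
    """Return the Cnt column for the named table, or None if it is absent."""
    for line in rfs4_db_output.splitlines():
        fields = line.split()
        # Assumed row layout: Address Name Flags Cnt BktCnt Pointer IdxCnt IdxMax
        if len(fields) >= 4 and fields[1] == table:
            return int(fields[3])  # assumption: Cnt is printed in decimal
    return None


def sample_rfs4_db():
    """Run the same command as in the report (needs root on the NFS server)."""
    result = subprocess.run(
        ["mdb", "-k"],
        input="::rfs4_db\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling table_count(sample_rfs4_db()) before and after the test, and again
once the lease time has passed, shows whether the count only ever grows.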

Thanks.

On Wed, Jan 30, 2013 at 04:05:06PM +0900, Jorgen Lundman wrote:
> 
> Hello,
> 
> We use ZFS and NFS storage fairly heavily, and back in 'the Sun days' we
> used to have trouble with OpenOwner locks leaking, requiring periodic
> reboots of the NFS servers.
> 
> At the time, thanks to the Sun engineers, in particular Marcel Telka, the
> problem was eventually tracked down to:
> 
> ~~~Quote~~~
> It looks like our NFSv4 server does not follow this (from RFC 3530):
> 
>    A given client might generate many open_owner4 data structures for a
>    given clientid.  The client will periodically either dispose of its
>    open_owner4s or stop using them for indefinite periods of time.  The
>    latter situation is why the NFS version 4 protocol does not have an
>    explicit operation to exit an open_owner4: such an operation is of no
>    use in that situation.  Instead, to avoid unbounded memory use, the
>    server needs to implement a strategy for disposing of open_owner4s
>    that have no current lock, open, or delegation state for any files
>    and have not been used recently.  The time period used to determine
>    when to dispose of open_owner4s is an implementation choice.  The
>    time period should certainly be no less than the lease time plus any
>    grace period the server wishes to implement beyond a lease time.  The
>    OPEN_CONFIRM operation allows the server to safely dispose of unused
>    open_owner4 data structures.
> 
> Apparently, unused OpenOwner entries are not disposed of after some period
> of time if the client is still active in some way. They are disposed of only
> for inactive clients. This is visible in rfs4_openowner_expiry(). This is
> similar to CR 6906432, but it is a completely different scenario. I believe
> this is a bug not yet covered by any filed CR, nor fixed.
> 
> 
> FYI, I filed this CR:
> 
> 6976554 Stale OpenOwner entries are not reaped for active clients
> 
> ~~~Quote~~~
> 
> Looking to the future, we are exploring changing our NFS Storage OS, and
> have tried illumos (OpenIndiana).
> 
> Alas, we appear to get this trouble yet again. I suppose the issue was
> never fixed in OpenSolaris/illumos. What are the chances of this happening?
> 
> # uname -a
> SunOS nfs02.dw 5.11 oi_151a4 i86pc i386 i86pc Solaris
> 
> echo '::rfs4_db' | mdb -k
> rfs4_database=ffffffa6167afb50
>   debug_flags=00000000   shutdown:      count=0 tables=ffffff2646d4fd60
> ------------------ Table ------------------- Bkt  ------- Indices -------
> Address          Name          Flags    Cnt  Cnt  Pointer          Cnt  Max
> ffffff2646d4fd60 DelegStateID  00000000 12057 2047 fffffff7e25f31c0 0002 0002
> fffffffeb5ea4630 File          00000000 19922 2047 ffffffefcfb7a140 0001 0001
> ffffffefbc0b23f0 Lockowner     00000000 2035 2047 ffffffd58124cc00 0002 0002
> ffffffefbcad8088 LockStateID   00000000 1743 2047 fffffffe2347ac40 0002 0002
> ffffffffb8e56c88 OpenStateID   00000000 9270 2047 ffffffd5827202c0 0003 0003
> ffffffefbb3641b8 OpenOwner     00000000 705410 2047 ffffffff7d187c40 0001 0001
> ffffffefc2dd0358 ClntIP        00000000 0000 2047 ffffffd581d2df40 0001 0001
> ffffffefbd2370c0 Client        00000000 0007 2047 fffffffedb66ba00 0002 0002
> 
> In particular, note the OpenOwner count.
> 
> [email protected]:~# echo '::rfs4_db' | mdb -k | grep OpenOwner
> 
> ffffffefbb3641b8 OpenOwner     00000000 705957 2047 ffffffff7d187c40 0001 0001
> root@nfs-client# ./locktest
> 
> [email protected]:~# echo '::rfs4_db' | mdb -k | grep OpenOwner
> ffffffefbb3641b8 OpenOwner     00000000 706022 2047 ffffffff7d187c40 0001 0001
> 
> 
> The locktest Perl program can be found here:
>  http://mail.opensolaris.org/pipermail/nfs-discuss/2010-October/002154.html
> 
> 
> Jorgen Lundman
> 
> -- 
> Jorgen Lundman       | <[email protected]>
> Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
> Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
> Japan                | +81 (0)3 -3375-1767          (home)
> 
> 
> -------------------------------------------
> illumos-discuss
> Archives: https://www.listbox.com/member/archive/182180/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/182180/23046997-5a38a7d8
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com

-- 
+-------------------------------------------+
| Marcel Telka   e-mail:   [email protected]  |
|                homepage: http://telka.sk/ |
|                jabber:   [email protected] |
+-------------------------------------------+

