Hi Marcel,

I really appreciate your input. We will try to find out what in
connmgr_get() is causing the port leak.

In the meantime, we want to change clnt_cots_do_bindresvport from 1 to 0
with mdb so that new connections start using non-reserved ports:

http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/rpc/clnt_cots.c#508

Will changing this variable dynamically have any side effects?
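
A minimal sketch of the change described above, for reference. It assumes the tunable is a 4-byte int (which is what mdb's /W format writes) and that mdb is run as root on the live kernel; the helper name set_do_bindresvport and the APPLY variable are hypothetical, introduced only for illustration. By default the helper only prints the pipeline it would run, so nothing is written to the kernel unless APPLY=1 is set.

```shell
# Hypothetical helper: print (default) or apply the mdb write for the tunable.
set_do_bindresvport() {
    value="$1"
    case "$value" in
        0|1) ;;
        *) echo "usage: set_do_bindresvport 0|1" >&2; return 2 ;;
    esac
    # /W writes a 4-byte value; assumes the variable is a 32-bit int.
    expr="clnt_cots_do_bindresvport/W ${value}"
    if [ "${APPLY:-0}" = "1" ]; then
        # -k opens the live kernel, -w enables writes.
        echo "$expr" | mdb -kw
    else
        # Dry run: show the command without touching the kernel.
        printf '%s\n' "echo '${expr}' | mdb -kw"
    fi
}
```

The current value can be read back without write access using `echo 'clnt_cots_do_bindresvport/D' | mdb -k`. A change made this way does not persist across a reboot.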

Thanks,
-Youzhong

On Mon, Nov 3, 2014 at 4:50 PM, Marcel Telka <mar...@telka.sk> wrote:

> Hi Youzhong,
>
> On Mon, Nov 03, 2014 at 04:14:55PM -0500, Youzhong Yang via
> illumos-developer wrote:
> > Hello,
> >
> > We are having a very strange issue on one of our servers. The issue is
> > that fcntl locking over NFS returns 'no locks available' immediately.
> >
> > dtrace shows that bindresvport() returns error code 125 (EADDRINUSE):
> >
> > # dtrace -n 'fbt:rpcmod:bindresvport:return /arg1 != 0/ {stack();
> > printf("ret = %d", arg1);}'
> >   9  52692              bindresvport:return
> >               rpcmod`connmgr_get+0x560
> >               rpcmod`connmgr_wrapget+0x63
> >               rpcmod`clnt_cots_kcallit+0x198
> >               rpcmod`rpcbind_getaddr+0x245
> >               klmmod`update_host_rpcbinding+0x4f
> >               klmmod`nlm_host_get_rpc+0x6d
> >               klmmod`nlm_do_lock+0x10d
> >               klmmod`nlm4_lock_4_svc+0x2a
> >               klmmod`nlm_dispatch+0xe6
> >               klmmod`nlm_prog_4+0x34
> >               rpcmod`svc_getreq+0x1c1
> >               rpcmod`svc_run+0x146
> >               rpcmod`svc_do_run+0x8e
> >               nfs`nfssys+0xf1
> >               unix`_sys_sysenter_post_swapgs+0x149
> > ret = 125
> >
> > netstat shows that 501 reserved ports are in BOUND state:
> >
> > # netstat -an | grep BOUND
> >       *.935                *.*                0      0 1049740      0 BOUND
> >       *.801                *.*                0      0 1049740      0 BOUND
> >       *.798                *.*                0      0 1049740      0 BOUND
> >       *.561                *.*                0      0 1049740      0 BOUND
> >       *.613                *.*                0      0 1049740      0 BOUND
> >       ....
> > # netstat -an | grep BOUND | wc -l
> >      501
> >
> > Has anyone seen a similar issue? Is it possible to unbind those reserved
> > ports? Rebooting the server is our last resort.
> >
> > Any advice would be very much appreciated.
>
> I faced a similar issue in connmgr_get().  It is filed as #1616, and the
> problem is that the dead connection is not properly closed (a
> connmgr_cancelconn() call seems to be missing somewhere) so that the
> client could properly reconnect.
> Unfortunately, I had no time to finish the analysis of this bug.
>
>
> HTH
>
> --
> +-------------------------------------------+
> | Marcel Telka   e-mail:   mar...@telka.sk  |
> |                homepage: http://telka.sk/ |
> |                jabber:   mar...@jabber.sk |
> +-------------------------------------------+
>
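
To keep an eye on the leak described in the quoted report, the BOUND count can be narrowed to reserved ports only. The sketch below assumes "reserved" means local ports below 1024 and that the state is the last column of each `netstat -an` line, as in the output quoted above; the function name count_bound_resv is hypothetical.

```shell
# Hypothetical helper: count sockets in BOUND state on reserved ports.
# Reads `netstat -an` output on stdin.
count_bound_resv() {
    awk '$NF == "BOUND" {
        # The local address looks like "*.935" or "10.0.0.1.935";
        # the port is the last dot-separated component.
        n = split($1, parts, "[.]")
        port = parts[n] + 0
        if (port > 0 && port < 1024) count++
    }
    END { print count + 0 }'
}
```

It would be used as `netstat -an | count_bound_resv`; running it periodically shows whether the number of leaked ports is still growing.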


