Peter Memishian wrote:
> > Need a quick code review for:
> >
> > 6762222* lockd burns cpu cycles, nfs pathologically slow
> >
> > As has been discussed, the immediate approach is to enforce flow
> > control in the RPC client side dispatch routines --
> > clnt_dispatch_send() and clnt_clts_dispatch_send(). In case the
> > downstream is flow controlled, the error recovery is straightforward
> > and you rely on the upper (rfscall) layers to retry the call -- a
> > mechanism that already exists currently to take care of other errors.
> >
> > webrev:
> > http://cr.opensolaris.org/~maheshvs/6762222-webrev/
>
> I haven't looked over everything in detail yet, but at least in
> clnt_dispatch_send(), it'd be simpler to check canput() earlier in the
> function, before you've done anything you have to later undo. This is
> OK because canput() is already not atomic with respect to put() and
> it's fine if you go a couple of messages over QFULL.
>
Meem, yes, I thought of that, and I didn't really like having the undo
work done later on. But I reluctantly came down on the side of keeping
canput() and put() as close to each other as possible, out of a mixture
of guilt and paranoia ;) Your comments helped sway me the other way.
(I also think QFULL will be a special case anyway, as in the
lockd -> statd interaction, but that's beside the point.)

I've changed the code now, and the new webrev is at:

http://cr.opensolaris.org/~maheshvs/6762222-webrev2/

Thanks,
Mahesh

> > * External links:
> > CR: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6762222
> > Evaluation for the CR: http://cr.opensolaris.org/~maheshvs/6762222-eval
> >
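
For readers following the thread, the reordering discussed above can be
sketched as a small user-space model. This is only an illustration under
stated assumptions, not the code in the webrev: every name below
(model_queue, model_calltable, send_check_late, send_check_early and the
model_* helpers) is hypothetical, and model_canput()/model_put() merely
imitate the roles that canput() and put() play in the discussion.

```c
/* User-space model only: none of these names are real STREAMS or kRPC APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue with a high-water mark. */
struct model_queue {
    size_t count;   /* messages currently queued */
    size_t hiwat;   /* queue counts as full once count reaches this */
};

/* Hypothetical stand-in for the bookkeeping of calls awaiting a reply. */
struct model_calltable {
    size_t pending;
};

/* Models the role of canput(): true while the queue is below its limit. */
static bool model_canput(const struct model_queue *q)
{
    return q->count < q->hiwat;
}

/* Models the role of put(): enqueues without re-checking the limit. */
static void model_put(struct model_queue *q)
{
    q->count++;
}

static void model_register_call(struct model_calltable *t)
{
    t->pending++;
}

static void model_unregister_call(struct model_calltable *t)
{
    assert(t->pending > 0);
    t->pending--;
}

/* Late check: state is set up first, so the flow-controlled path must undo it. */
static bool send_check_late(struct model_queue *q, struct model_calltable *t)
{
    model_register_call(t);
    if (!model_canput(q)) {
        model_unregister_call(t);   /* the undo work */
        return false;               /* caller is expected to retry */
    }
    model_put(q);
    return true;
}

/* Early check: bail out before touching any state, so nothing needs undoing. */
static bool send_check_early(struct model_queue *q, struct model_calltable *t)
{
    if (!model_canput(q))
        return false;               /* caller is expected to retry */
    model_register_call(t);
    model_put(q);
    return true;
}
```

In this model, both variants refuse the send when the queue is already
at its limit and leave the call table unchanged; the early check simply
gets there without an undo step. As noted in the thread, the check and
the enqueue are not atomic in either version, so a concurrent sender
could still push the queue slightly past the limit.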