> I have written a fix for the issue initially documented in 6696163, for
> which I have now opened 6817942 to get one bug for one issue (BTW:
> could someone please correct my typo in the 6817942 summary: s:>=:>:).
I fixed that up.

> The root cause is documented in 6696163: When choosing a connection for
> an RPC call, the old connmgr_get code used lbolt timestamps which don't
> have enough precision (nowadays) to yield proper load balancing.
>
> I have implemented round robin load balancing: Whenever a connection is
> requested via connmgr_get, it is noted as last used. Upon the next
> connection request, the connection which comes next in the list of
> connections (or the first one if the last one was used last) is returned.
>
> I have left a comment in the code explaining why I chose the particular
> approach and what I think should be done when someone comes to do a
> larger rewrite of the code in question.
>
> I would appreciate reviews of my suggested fix, which is available at
>
> http://cr.opensolaris.org/~nigoroll/rpc_loadbalancing_6817942/

I suspect there are few here who are intimately familiar with that code,
as historically the NFS team has tended to the RPC module; you may want
to try [email protected].

As an aside, unless you're working around a problem (which you should
indeed cite by CR), comments should not reference CRs directly. Also,
I'd strongly urge you to make use of all 80 columns in your fixes --
e.g., there are needless wraps at 1898-1899 and 1998-1999.

-- meem
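
For readers who have not looked at the webrev, the round robin scheme
described in the quoted text can be sketched roughly as follows. This is
not the actual connmgr_get code: struct conn, struct conn_pool and
pool_get_rr() are hypothetical names made up for illustration, the list
is assumed to be a plain NULL-terminated singly linked list, and locking
(which real kernel code would need) is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection entry; not the real kernel structure. */
struct conn {
	int		id;
	struct conn	*next;	/* NULL-terminated singly linked list */
};

/* Hypothetical pool: list head plus the entry handed out last. */
struct conn_pool {
	struct conn	*head;
	struct conn	*last_used;
};

/*
 * Return the entry that follows the one used last, wrapping around to
 * the head when the last entry was used last or nothing was used yet.
 * Returns NULL if the list is empty.
 */
struct conn *
pool_get_rr(struct conn_pool *pool)
{
	struct conn *c;

	assert(pool != NULL);
	if (pool->head == NULL)
		return (NULL);

	c = (pool->last_used != NULL) ? pool->last_used->next : NULL;
	if (c == NULL)
		c = pool->head;	/* wrap around, or first request */

	pool->last_used = c;	/* note this entry as last used */
	return (c);
}
```

Because selection depends only on list position, the sketch does not
rely on timestamp precision at all, which is the point of moving away
from lbolt comparisons.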
