Hi Nils, Have you had a chance to take a look at the proposed changed?
Thanks, -Dai Dai Ngo wrote: > Hi Nils, > > Thank you for your effort to fix these issues. > > I think we can achieve the same result with a simpler change > by removing and re-inserting the entry to the head of the list > once it was selected. The webrev for that change is here: > > http://cr.opensolaris.org/~dain/6817942/ > <http://cr.opensolaris.org/%7Edain/6817942/> > > I tested the change and verified it OK with dtrace: > > #!/usr/sbin/dtrace -s > > fbt:rpcmod:connmgr_get:return > { > @a[((struct cm_xprt *)arg1)]=count() > } > > [BEFORE] > > pnfs-18-21.root[510] ./rpc.d > dtrace: script './rpc.d' matched 2 probes > ^C > > -1091071946176 4 > -1089303351872 4 > -1090980970432 5 > -1089303233856 5 > -1091352342336 4208 > -1091352346816 4225 > -1091277309824 4237 > -1091277310208 21463 > pnfs-18-21.root[511] > > > [AFTER] > > dtrace: script './rpc.d' matched 2 probes > ^C > > -1091071946176 1 > -1090980970432 1 > -1089303351872 2 > -1089303233856 2 > -1091277310208 8539 > -1091352346816 8540 > -1091352342336 8540 > -1091277309824 8540 > pnfs-18-21.root[512] > > Thanks, > -Dai > > > Nils Goroll wrote: >> Hi All, >> >> Peter Memishian has recommended that I ask the NFS community rather >> than the networking community, see >> http://opensolaris.org/jive/thread.jspa?threadID=97367&tstart=0 for >> the original thread. >> >> I have written a fix for the issue initially documented in 6696163, >> for which I >> have now opened 6817942 to get one bug for one issue. >> >> The root cause is documented in 6696163: When choosing a connection >> for an RPC >> call, the old connmgr_get code used lbolt timestamps which don't have >> enough >> precision (nowadays) to yield proper load balancing. >> >> I have implemented round robin load balancing: Whenever a connection is >> requested via connmgr_get, it is noted as last used. Upon the next >> connection >> request, the connection which comes next in the list of connections >> (or the >> first one if the last one was used last) is returned. >> >> I have left a comment in the code explaining why I chose the >> particular approach >> and what I think should be done when someone comes to do a larger >> rewrite of the >> code in question. >> >> I would appreciate reviews of my suggested fix, which is available at >> >> http://cr.opensolaris.org/~nigoroll/rpc_loadbalancing_6817942/ >> >> Thanks, Nils >> >> -- >> >> Tests I conducted: >> >> #### PREPARATIONS >> >>> clnt_max_conns/D >> clnt_max_conns: >> clnt_max_conns: 4 >> >>> rpclog/W b >> rpclog: 0 = 0xb >> >> ifconfig e1000g0 addif 192.168.77.100/24 up deprecated >> share -o rw=localhost:192.168.77.29,root=localhost:192.168.77.29 \ >> /var/tmp/s >> mkdir /tmp/c >> cd ; umount /tmp/c ; \ >> mount -o vers=3,proto=tcp 192.168.77.100:/var/tmp/s /tmp/c ; \ >> cd /tmp/c >> >> #### TEST BALANCING >> >>> rpclog/W 2 >> rpclog: 0 = 0x2 >>> clnt_max_conns/D >> clnt_max_conns: >> clnt_max_conns: 4 >> >> - mount share (see above) >> - ls -als >> >> haggis:/tmp/c# egrep '19:57:.*call going out' /var/adm/messages >> Mar 16 19:58:10 haggis rpcmod: [ID 595603 kern.notice] >> clnt_tli_kinit: COTS selected >> Mar 16 19:58:10 haggis rpcmod: [ID 359612 kern.notice] >> clnt_cots_kcallit, procnum 4 >> Mar 16 19:58:10 haggis rpcmod: [ID 832978 kern.notice] >> clnt_cots_kcallit: >> wait.tv_sec: 60 >> Mar 16 19:58:10 haggis rpcmod: [ID 359371 kern.notice] >> clnt_cots_kcallit: >> wait.tv_usec: 0 >> Mar 16 19:57:22 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f4598c38 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b3c8 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b5f0 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b760 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f4598c38 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b3c8 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b5f0 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b760 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f4598c38 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b3c8 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b5f0 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f459b760 >> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: >> call going >> out on ffffff01f4598c38 >> [...etc...] >> >> #### Test balancing throughput (can't to a real performance test on a >> single >> #### laptop, this is more a test against massive performance regression) >> >> haggis:/tmp/c# iostat -xznM 5 & dd if=1g_1 of=/dev/zero >> bs=$((1024*1024))& >> [2] 100689 >> [3] 100690 >> haggis:/tmp/c# extended device statistics >> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device >> 24.1 7.5 1.1 0.1 0.8 0.3 24.3 8.1 12 20 c5t0d0 >> 0.0 0.0 0.0 0.0 0.0 0.0 0.0 122.5 0 0 >> 192.168.77.100:/var/tmp/s >> extended device statistics >> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device >> 486.4 10.9 60.6 0.1 2.4 28.1 4.7 56.5 77 100 c5t0d0 >> 1907.1 0.0 59.6 0.0 0.4 1.8 0.2 0.9 18 97 >> 192.168.77.100:/var/tmp/s >> extended device statistics >> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device >> 206.5 1.2 25.8 0.0 2.4 11.8 11.7 56.7 40 40 c5t0d0 >> 4199.2 0.0 131.2 0.0 0.5 2.2 0.1 0.5 33 97 >> 192.168.77.100:/var/tmp/s >> 1024+0 records in >> 1024+0 records out >> extended device statistics >> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device >> 0.4 0.0 0.0 0.0 0.0 0.0 0.0 5.4 0 0 c5t0d0 >> 461.0 0.0 14.4 0.0 0.1 0.2 0.1 0.4 4 8 >> 192.168.77.100:/var/tmp/s >> >> #### Is Dooming working OK? >> >>> rpclog/W 8 >> rpclog: 0 = 0x8 >>> clnt_max_conns/W 8 >> clnt_max_conns: 0x4 = 0x8 >> >> * ls -als on share >> >> haggis:/tmp/c# netstat -an| egrep '^192.168.77.100.2049' >> 192.168.77.100.2049 192.168.77.29.1023 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1022 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1021 49152 0 49308 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1020 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1019 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1018 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1017 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1016 49152 0 49304 0 >> ESTABLISHED >> >>> clnt_max_conns/W 4 >> clnt_max_conns: 0x8 = 0x4 >> >> haggis:/tmp/c# ls -als >> total 7235597 >> 3 drwxr-xr-x 2 root root 10 Mar 16 10:47 . >> 8 drwxrwxrwt 8 root sys 720 Mar 16 19:57 .. >> 878969 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_1 >> 794225 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_2 >> 1130659 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_3 >> 865401 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_4 >> 778861 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_5 >> 953995 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_6 >> 969355 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_7 >> 864121 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_8 >> haggis:/tmp/c# Mar 16 20:01:56 haggis rpcmod: [ID 882835 kern.notice] >> connmgr_get: too many conns, dooming entry ffffff01e0ca9240 >> Mar 16 20:01:56 haggis rpcmod: [ID 882835 kern.notice] connmgr_get: >> too many >> conns, dooming entry ffffff0205a14280 >> Mar 16 20:01:56 haggis rpcmod: [ID 882835 kern.notice] connmgr_get: >> too many >> conns, dooming entry ffffff01fed6d700 >> Mar 16 20:01:56 haggis rpcmod: [ID 882835 kern.notice] connmgr_get: >> too many >> conns, dooming entry ffffff01f468d380 >> >> ... waiting a while ... >> >> Mar 16 20:05:12 haggis rpcmod: [ID 772401 kern.notice] >> connmgr_sndrel: sending >> ordrel to queue ffffff020b4a7e50 >> Mar 16 20:05:12 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq >> 0xffffff020b4a7e50: got T_ORDREL_REQ >> Mar 16 20:05:12 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq >> 0xffffff020b6cde40: got T_ORDREL_REQ >> netstat -an| egrep '^192.168.77.100.2049' >> 192.168.77.100.2049 192.168.77.29.1022 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1021 49152 0 49308 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1020 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1019 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1018 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1017 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1016 49152 0 49304 0 >> ESTABLISHED >> haggis:/tmp/c# Mar 16 20:05:46 haggis rpcmod: [ID 772401 kern.notice] >> connmgr_sndrel: sending ordrel to queue ffffff02039093b0 >> Mar 16 20:05:46 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq >> 0xffffff02039093b0: got T_ORDREL_REQ >> Mar 16 20:05:46 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq >> 0xffffff020b6c88f8: got T_ORDREL_REQ >> Mar 16 20:06:02 haggis rpcmod: [ID 772401 kern.notice] >> connmgr_sndrel: sending >> ordrel to queue ffffff0202e440f8 >> Mar 16 20:06:02 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq >> 0xffffff0202e440f8: got T_ORDREL_REQ >> Mar 16 20:06:02 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq >> 0xffffff020b6cd3a0: got T_ORDREL_REQ >> Mar 16 20:06:03 haggis rpcmod: [ID 772401 kern.notice] >> connmgr_sndrel: sending >> ordrel to queue ffffff02030cdba0 >> Mar 16 20:06:03 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq >> 0xffffff02030cdba0: got T_ORDREL_REQ >> Mar 16 20:06:03 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq >> 0xffffff02038798f0: got T_ORDREL_REQ >> netstat -an| egrep '^192.168.77.100.2049' >> 192.168.77.100.2049 192.168.77.29.1019 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1018 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1017 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1016 49152 0 49304 0 >> ESTABLISHED >> >> >> #### Simple test with one connection only >> >>> clnt_max_conns/W 1 >> clnt_max_conns: 0x4 = 0x1 >> >> haggis:/tmp/c# iostat -xznM 5& dd if=1g_3 of=/dev/null bs=$((1024*1024)) >> [2] 100717 >> extended device statistics >> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device >> 17.5 4.6 1.4 0.1 0.4 0.5 16.8 24.7 6 11 c5t0d0 >> 98.8 0.0 3.1 0.0 0.0 0.0 0.1 0.5 0 3 >> 192.168.77.100:/var/tmp/s >> extended device statistics >> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device >> 533.7 0.0 66.7 0.0 24.8 5.6 46.5 10.5 91 98 c5t0d0 >> 1916.6 0.0 59.7 0.0 0.4 2.3 0.2 1.2 21 96 >> 192.168.77.100:/var/tmp/s >> kill %2 >> extended device statistics >> r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device >> 350.4 0.0 43.8 0.0 19.8 4.5 56.5 12.7 73 76 c5t0d0 >> 2958.3 0.0 90.5 0.0 0.3 2.4 0.1 0.8 21 97 >> 192.168.77.100:/var/tmp/s >> 1024+0 records in >> 1024+0 records out >> haggis:/tmp/c# kill %2 >> >> haggis:/tmp/c# ls -als >> total 7235597 >> 3 drwxr-xr-x 2 root root 10 Mar 16 10:47 . >> 8 drwxrwxrwt 9 root sys 1317 Mar 16 20:38 .. >> 878969 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_1 >> 794225 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_2 >> 1130659 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_3 >> 865401 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_4 >> 778861 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_5 >> 953995 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_6 >> 969355 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_7 >> 864121 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_8 >> haggis:/tmp/c# netstat -an| egrep '^192.168.77.100.2049' >> 192.168.77.100.2049 192.168.77.29.1015 49152 0 49304 0 >> ESTABLISHED >> >> #### test with >1 destination address >> >> haggis:/tmp/d# mkdir /tmp/e >> haggis:/tmp/d# mount -o vers=3,proto=tcp 192.168.77.29:/var/tmp/s /tmp/e >> haggis:/tmp/d# netstat -an| egrep '^192.168.77.(29|100).2049' >> 192.168.77.100.2049 192.168.77.29.1015 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1014 49152 0 49308 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1013 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1012 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.29.2049 192.168.77.29.1011 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.29.2049 192.168.77.29.1010 49152 0 49304 0 >> ESTABLISHED >> haggis:/tmp/d# cd /tmp/e >> haggis:/tmp/e# ls -als >> total 7235597 >> 3 drwxr-xr-x 2 root root 10 Mar 16 10:47 . >> 8 drwxrwxrwt 11 root sys 1433 Mar 16 20:41 .. >> 878969 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_1 >> 794225 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_2 >> 1130659 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_3 >> 865401 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_4 >> 778861 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_5 >> 953995 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_6 >> 969355 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_7 >> 864121 -rw------- 1 root root 1073741824 Mar 16 10:49 1g_8 >> haggis:/tmp/e# netstat -an| egrep '^192.168.77.(29|100).2049' >> 192.168.77.100.2049 192.168.77.29.1015 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1014 49152 0 49308 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1013 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.100.2049 192.168.77.29.1012 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.29.2049 192.168.77.29.1011 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.29.2049 192.168.77.29.1010 49152 0 49304 0 >> ESTABLISHED >> 192.168.77.29.2049 192.168.77.29.1009 49152 0 49308 0 >> ESTABLISHED >> 192.168.77.29.2049 192.168.77.29.1008 49152 0 49308 0 >> ESTABLISHED >> _______________________________________________ >> nfs-discuss mailing list >> nfs-discuss at opensolaris.org > >