Hi Nils,

Have you had a chance to take a look at the proposed changed?

Thanks,
-Dai

Dai Ngo wrote:
> Hi Nils,
>
> Thank you for your effort to fix these issues.
>
> I think we can achieve the same result with a simpler change
> by removing and re-inserting the entry to the head of the list
> once it was selected. The webrev for that change is here:
>
> http://cr.opensolaris.org/~dain/6817942/ 
> <http://cr.opensolaris.org/%7Edain/6817942/>
>
> I tested the change and verified it OK with dtrace:
>
> #!/usr/sbin/dtrace -s
>
> fbt:rpcmod:connmgr_get:return
> {
>       @a[((struct cm_xprt *)arg1)]=count()
> }
>
> [BEFORE]
>
> pnfs-18-21.root[510] ./rpc.d
> dtrace: script './rpc.d' matched 2 probes
> ^C
>
>  -1091071946176                4
>  -1089303351872                4
>  -1090980970432                5
>  -1089303233856                5
>  -1091352342336             4208
>  -1091352346816             4225
>  -1091277309824             4237
>  -1091277310208            21463
> pnfs-18-21.root[511]
>
>
> [AFTER]
>
> dtrace: script './rpc.d' matched 2 probes
> ^C
>
>  -1091071946176                1
>  -1090980970432                1
>  -1089303351872                2
>  -1089303233856                2
>  -1091277310208             8539
>  -1091352346816             8540
>  -1091352342336             8540
>  -1091277309824             8540
> pnfs-18-21.root[512]
>
> Thanks,
> -Dai
>
>
> Nils Goroll wrote:
>> Hi All,
>>
>> Peter Memishian has recommended that I ask the NFS community rather 
>> than the networking community, see 
>> http://opensolaris.org/jive/thread.jspa?threadID=97367&tstart=0 for 
>> the original thread.
>>
>> I have written a fix for the issue initially documented in 6696163, 
>> for which I
>> have now opened 6817942 to get one bug for one issue.
>>
>> The root cause is documented in 6696163: When choosing a connection 
>> for an RPC
>> call, the old connmgr_get code used lbolt timestamps which don't have 
>> enough
>> precision (nowadays) to yield proper load balancing.
>>
>> I have implemented round robin load balancing: Whenever a connection is
>> requested via connmgr_get, it is noted as last used. Upon the next 
>> connection
>> request, the connection which comes next in the list of connections 
>> (or the
>> first one if the last one was used last) is returned.
>>
>> I have left a comment in the code explaining why I chose the 
>> particular approach
>> and what I think should be done when someone comes to do a larger 
>> rewrite of the
>> code in question.
>>
>> I would appreciate reviews of my suggested fix, which is available at
>>
>> http://cr.opensolaris.org/~nigoroll/rpc_loadbalancing_6817942/
>>
>> Thanks, Nils
>>
>> -- 
>>
>> Tests I conducted:
>>
>> #### PREPARATIONS
>>
>>> clnt_max_conns/D
>> clnt_max_conns:
>> clnt_max_conns: 4
>>
>>> rpclog/W b
>> rpclog:         0               =       0xb
>>
>> ifconfig e1000g0 addif 192.168.77.100/24 up deprecated
>> share -o rw=localhost:192.168.77.29,root=localhost:192.168.77.29 \
>> /var/tmp/s
>> mkdir /tmp/c
>> cd ; umount /tmp/c ; \
>> mount -o vers=3,proto=tcp 192.168.77.100:/var/tmp/s /tmp/c ; \
>> cd /tmp/c
>>
>> #### TEST BALANCING
>>
>>> rpclog/W 2
>> rpclog:         0               =       0x2
>>> clnt_max_conns/D
>> clnt_max_conns:
>> clnt_max_conns: 4
>>
>> - mount share (see above)
>> - ls -als
>>
>> haggis:/tmp/c# egrep '19:57:.*call going out' /var/adm/messages
>> Mar 16 19:58:10 haggis rpcmod: [ID 595603 kern.notice] 
>> clnt_tli_kinit: COTS selected
>> Mar 16 19:58:10 haggis rpcmod: [ID 359612 kern.notice] 
>> clnt_cots_kcallit, procnum 4
>> Mar 16 19:58:10 haggis rpcmod: [ID 832978 kern.notice] 
>> clnt_cots_kcallit:
>> wait.tv_sec: 60
>> Mar 16 19:58:10 haggis rpcmod: [ID 359371 kern.notice] 
>> clnt_cots_kcallit:
>> wait.tv_usec: 0
>> Mar 16 19:57:22 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f4598c38
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b3c8
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b5f0
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b760
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f4598c38
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b3c8
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b5f0
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b760
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f4598c38
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b3c8
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b5f0
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f459b760
>> Mar 16 19:57:26 haggis rpcmod: [ID 757553 kern.notice] connmgr_get: 
>> call going
>> out on ffffff01f4598c38
>> [...etc...]
>>
>> #### Test balancing throughput (can't to a real performance test on a 
>> single
>> #### laptop, this is more a test against massive performance regression)
>>
>> haggis:/tmp/c# iostat -xznM 5 & dd if=1g_1 of=/dev/zero 
>> bs=$((1024*1024))&
>> [2] 100689
>> [3] 100690
>> haggis:/tmp/c#                     extended device statistics
>>     r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
>>    24.1    7.5    1.1    0.1  0.8  0.3   24.3    8.1  12  20 c5t0d0
>>     0.0    0.0    0.0    0.0  0.0  0.0    0.0  122.5   0   0
>> 192.168.77.100:/var/tmp/s
>>                     extended device statistics
>>     r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
>>   486.4   10.9   60.6    0.1  2.4 28.1    4.7   56.5  77 100 c5t0d0
>>  1907.1    0.0   59.6    0.0  0.4  1.8    0.2    0.9  18  97
>> 192.168.77.100:/var/tmp/s
>>                     extended device statistics
>>     r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
>>   206.5    1.2   25.8    0.0  2.4 11.8   11.7   56.7  40  40 c5t0d0
>>  4199.2    0.0  131.2    0.0  0.5  2.2    0.1    0.5  33  97
>> 192.168.77.100:/var/tmp/s
>> 1024+0 records in
>> 1024+0 records out
>>                     extended device statistics
>>     r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
>>     0.4    0.0    0.0    0.0  0.0  0.0    0.0    5.4   0   0 c5t0d0
>>   461.0    0.0   14.4    0.0  0.1  0.2    0.1    0.4   4   8
>> 192.168.77.100:/var/tmp/s
>>
>> #### Is Dooming working OK?
>>
>>> rpclog/W 8
>> rpclog:         0               =       0x8
>>> clnt_max_conns/W 8
>> clnt_max_conns: 0x4             =       0x8
>>
>> * ls -als on share
>>
>> haggis:/tmp/c# netstat -an| egrep '^192.168.77.100.2049'
>> 192.168.77.100.2049  192.168.77.29.1023   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1022   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1021   49152      0 49308      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1020   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1019   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1018   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1017   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1016   49152      0 49304      0 
>> ESTABLISHED
>>
>>> clnt_max_conns/W 4
>> clnt_max_conns: 0x8             =       0x4
>>
>> haggis:/tmp/c# ls -als
>> total 7235597
>>    3 drwxr-xr-x   2 root     root          10 Mar 16 10:47 .
>>    8 drwxrwxrwt   8 root     sys          720 Mar 16 19:57 ..
>> 878969 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_1
>> 794225 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_2
>> 1130659 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_3
>> 865401 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_4
>> 778861 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_5
>> 953995 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_6
>> 969355 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_7
>> 864121 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_8
>> haggis:/tmp/c# Mar 16 20:01:56 haggis rpcmod: [ID 882835 kern.notice]
>> connmgr_get: too many conns, dooming entry ffffff01e0ca9240
>> Mar 16 20:01:56 haggis rpcmod: [ID 882835 kern.notice] connmgr_get: 
>> too many
>> conns, dooming entry ffffff0205a14280
>> Mar 16 20:01:56 haggis rpcmod: [ID 882835 kern.notice] connmgr_get: 
>> too many
>> conns, dooming entry ffffff01fed6d700
>> Mar 16 20:01:56 haggis rpcmod: [ID 882835 kern.notice] connmgr_get: 
>> too many
>> conns, dooming entry ffffff01f468d380
>>
>> ... waiting a while ...
>>
>> Mar 16 20:05:12 haggis rpcmod: [ID 772401 kern.notice] 
>> connmgr_sndrel: sending
>> ordrel to queue ffffff020b4a7e50
>> Mar 16 20:05:12 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq
>> 0xffffff020b4a7e50: got T_ORDREL_REQ
>> Mar 16 20:05:12 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq
>> 0xffffff020b6cde40: got T_ORDREL_REQ
>> netstat -an| egrep '^192.168.77.100.2049'
>> 192.168.77.100.2049  192.168.77.29.1022   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1021   49152      0 49308      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1020   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1019   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1018   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1017   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1016   49152      0 49304      0 
>> ESTABLISHED
>> haggis:/tmp/c# Mar 16 20:05:46 haggis rpcmod: [ID 772401 kern.notice]
>> connmgr_sndrel: sending ordrel to queue ffffff02039093b0
>> Mar 16 20:05:46 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq
>> 0xffffff02039093b0: got T_ORDREL_REQ
>> Mar 16 20:05:46 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq
>> 0xffffff020b6c88f8: got T_ORDREL_REQ
>> Mar 16 20:06:02 haggis rpcmod: [ID 772401 kern.notice] 
>> connmgr_sndrel: sending
>> ordrel to queue ffffff0202e440f8
>> Mar 16 20:06:02 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq
>> 0xffffff0202e440f8: got T_ORDREL_REQ
>> Mar 16 20:06:02 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq
>> 0xffffff020b6cd3a0: got T_ORDREL_REQ
>> Mar 16 20:06:03 haggis rpcmod: [ID 772401 kern.notice] 
>> connmgr_sndrel: sending
>> ordrel to queue ffffff02030cdba0
>> Mar 16 20:06:03 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq
>> 0xffffff02030cdba0: got T_ORDREL_REQ
>> Mar 16 20:06:03 haggis rpcmod: [ID 776223 kern.notice] mir_wput_other wq
>> 0xffffff02038798f0: got T_ORDREL_REQ
>> netstat -an| egrep '^192.168.77.100.2049'
>> 192.168.77.100.2049  192.168.77.29.1019   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1018   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1017   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1016   49152      0 49304      0 
>> ESTABLISHED
>>
>>
>> #### Simple test with one connection only
>>
>>> clnt_max_conns/W 1
>> clnt_max_conns: 0x4             =       0x1
>>
>> haggis:/tmp/c# iostat -xznM 5& dd if=1g_3 of=/dev/null bs=$((1024*1024))
>> [2] 100717
>>                     extended device statistics
>>     r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
>>    17.5    4.6    1.4    0.1  0.4  0.5   16.8   24.7   6  11 c5t0d0
>>    98.8    0.0    3.1    0.0  0.0  0.0    0.1    0.5   0   3
>> 192.168.77.100:/var/tmp/s
>>                     extended device statistics
>>     r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
>>   533.7    0.0   66.7    0.0 24.8  5.6   46.5   10.5  91  98 c5t0d0
>>  1916.6    0.0   59.7    0.0  0.4  2.3    0.2    1.2  21  96
>> 192.168.77.100:/var/tmp/s
>> kill %2
>>                     extended device statistics
>>     r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
>>   350.4    0.0   43.8    0.0 19.8  4.5   56.5   12.7  73  76 c5t0d0
>>  2958.3    0.0   90.5    0.0  0.3  2.4    0.1    0.8  21  97
>> 192.168.77.100:/var/tmp/s
>> 1024+0 records in
>> 1024+0 records out
>> haggis:/tmp/c# kill %2
>>
>> haggis:/tmp/c# ls -als
>> total 7235597
>>    3 drwxr-xr-x   2 root     root          10 Mar 16 10:47 .
>>    8 drwxrwxrwt   9 root     sys         1317 Mar 16 20:38 ..
>> 878969 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_1
>> 794225 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_2
>> 1130659 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_3
>> 865401 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_4
>> 778861 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_5
>> 953995 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_6
>> 969355 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_7
>> 864121 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_8
>> haggis:/tmp/c# netstat -an| egrep '^192.168.77.100.2049'
>> 192.168.77.100.2049  192.168.77.29.1015   49152      0 49304      0 
>> ESTABLISHED
>>
>> #### test with >1 destination address
>>
>> haggis:/tmp/d# mkdir /tmp/e
>> haggis:/tmp/d# mount -o vers=3,proto=tcp 192.168.77.29:/var/tmp/s /tmp/e
>> haggis:/tmp/d# netstat -an| egrep '^192.168.77.(29|100).2049'
>> 192.168.77.100.2049  192.168.77.29.1015   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1014   49152      0 49308      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1013   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1012   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.29.2049   192.168.77.29.1011   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.29.2049   192.168.77.29.1010   49152      0 49304      0 
>> ESTABLISHED
>> haggis:/tmp/d# cd /tmp/e
>> haggis:/tmp/e# ls -als
>> total 7235597
>>    3 drwxr-xr-x   2 root     root          10 Mar 16 10:47 .
>>    8 drwxrwxrwt  11 root     sys         1433 Mar 16 20:41 ..
>> 878969 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_1
>> 794225 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_2
>> 1130659 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_3
>> 865401 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_4
>> 778861 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_5
>> 953995 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_6
>> 969355 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_7
>> 864121 -rw-------   1 root     root     1073741824 Mar 16 10:49 1g_8
>> haggis:/tmp/e# netstat -an| egrep '^192.168.77.(29|100).2049'
>> 192.168.77.100.2049  192.168.77.29.1015   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1014   49152      0 49308      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1013   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.100.2049  192.168.77.29.1012   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.29.2049   192.168.77.29.1011   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.29.2049   192.168.77.29.1010   49152      0 49304      0 
>> ESTABLISHED
>> 192.168.77.29.2049   192.168.77.29.1009   49152      0 49308      0 
>> ESTABLISHED
>> 192.168.77.29.2049   192.168.77.29.1008   49152      0 49308      0 
>> ESTABLISHED
>> _______________________________________________
>> nfs-discuss mailing list
>> nfs-discuss at opensolaris.org
>
>


Reply via email to