Hi,
I am seeing an SR-IOV VM packet drop problem in my lab tests and am not
sure where to dig in or where to ask the question. I hope this is the
right mailing list; any direction would be appreciated.
Here are my lab details:
Dell server:
PowerEdge R710 + Intel 82599ES 10-Gigabit SFI/SFP+ (dual port) + Ubuntu 16.04
SR-IOV enabled:
10: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
mode DEFAULT group default qlen 1000
link/ether e8:ea:6a:06:1b:1a brd ff:ff:ff:ff:ff:ff
vf 0 MAC 4a:d0:eb:c8:76:ea, spoof checking on, link-state auto
vf 1 MAC 52:54:00:eb:39:4a, spoof checking on, link-state auto
vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
11: enp4s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
mode DEFAULT group default qlen 1000
link/ether e8:ea:6a:06:1b:1b brd ff:ff:ff:ff:ff:ff
vf 0 MAC 6e:8c:99:84:e2:80, spoof checking on, link-state auto
vf 1 MAC 52:54:00:6a:d4:05, spoof checking on, link-state auto
vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
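For reference, the 4 VFs per port on ixgbe can be created either with the
max_vfs module parameter at driver load or via sysfs, e.g.:
# echo 4 > /sys/class/net/enp4s0f0/device/sriov_numvfs
# echo 4 > /sys/class/net/enp4s0f1/device/sriov_numvfs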
Three VMs are provisioned with SR-IOV VFs:
# virsh list --all
Id Name State
----------------------------------------------------
1 bigip-sriov running <==== F5 BIGIP VE
2 sriov-enp4s0f0-vf1 running <==== Ubuntu 14.04 VM, runs iperf3 client
3 sriov-enp4s0f1-vf1 running <==== Ubuntu 14.04 VM, runs iperf3 server
The VF assignment to the guests is:
BIGIP VE: enp4s0f0 vf0, enp4s0f1 vf0
Ubuntu VM iperf3 client: enp4s0f0 vf1
Ubuntu VM iperf3 server: enp4s0f1 vf1
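Each VF is attached to its guest as a PCI hostdev; the guest XML entries are
along these lines (shown in the <interface type='hostdev'> form, and the PCI
address below is just a placeholder, not the exact bus/slot/function from my
box):
<interface type='hostdev' managed='yes'>
  <mac address='52:54:00:eb:39:4a'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x04' slot='0x10' function='0x1'/>
  </source>
</interface>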
The packet path is below; the BIGIP VE simply forwards the packets:
iperf3 client <------> BIGIP VE <------> iperf3 server
Here is the iperf3 output showing retransmissions and sporadic packet drops:
# ./iperf3 -c 10.2.72.66
Connecting to host 10.2.72.66, port 5201
[ 4] local 10.1.72.16 port 44701 connected to 10.2.72.66 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 109 MBytes 912 Mbits/sec 0 1.34 MBytes
[ 4] 1.00-2.00 sec 110 MBytes 924 Mbits/sec 0 1.34 MBytes
[ 4] 2.00-3.00 sec 109 MBytes 915 Mbits/sec 0 1.34 MBytes
[ 4] 3.00-4.00 sec 108 MBytes 906 Mbits/sec 237 959 KBytes <==== Re-transmissions
[ 4] 4.00-5.00 sec 116 MBytes 970 Mbits/sec 0 1.05 MBytes
[ 4] 5.00-6.00 sec 114 MBytes 955 Mbits/sec 0 1.15 MBytes
[ 4] 6.00-7.00 sec 112 MBytes 940 Mbits/sec 0 1.22 MBytes
[ 4] 7.00-8.00 sec 110 MBytes 924 Mbits/sec 0 1.27 MBytes
[ 4] 8.00-9.00 sec 109 MBytes 915 Mbits/sec 0 1.30 MBytes
[ 4] 9.00-10.00 sec 109 MBytes 915 Mbits/sec 0 1.32 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 1.08 GBytes 927 Mbits/sec 237 sender <==== Re-transmissions here
[ 4] 0.00-10.00 sec 1.08 GBytes 927 Mbits/sec receiver
If I move the iperf3 server VM to a separate physical server, there are no
more retransmissions and zero packet drops.
So this leads me to think there is some sporadic resource contention when
running multiple SR-IOV VMs on the same hypervisor, which causes packet
drops under a high-throughput test.
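One thing I plan to check is whether the drops show up in the drop/miss
counters on the PFs, e.g.:
# ethtool -S enp4s0f0 | grep -iE 'drop|miss|err'
# ethtool -S enp4s0f1 | grep -iE 'drop|miss|err'
If there are better counters on the 82599 to watch for VF-related drops,
please point me at them.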
I even tried pinning the vCPUs to physical cores with 'virsh vcpupin' to
avoid context switches, and tried to allocate physical cores to each VM from
the same NUMA node, but I see no difference.
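The pinning was along these lines, matching the affinity shown in the
vcpuinfo output below:
# virsh vcpupin bigip-sriov 0 0
# virsh vcpupin bigip-sriov 1 2
# virsh vcpupin bigip-sriov 2 4
# virsh vcpupin bigip-sriov 3 6
(and similarly for the two Ubuntu guests on CPUs 1,3,5,7 and 9,11,13,15)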
# virsh nodeinfo
CPU model: x86_64
CPU(s): 16
CPU frequency: 2394 MHz
CPU socket(s): 1
Core(s) per socket: 4
Thread(s) per core: 2
NUMA cell(s): 2
Memory size: 74287756 KiB
# virsh capabilities
<topology>
<cells num='2'>
<cell id='0'>
<memory unit='KiB'>41285380</memory>
<pages unit='KiB' size='4'>9797057</pages>
<pages unit='KiB' size='2048'>1024</pages>
<pages unit='KiB' size='1048576'>0</pages>
<distances>
<sibling id='0' value='10'/>
<sibling id='1' value='20'/>
</distances>
<cpus num='8'>
<cpu id='0' socket_id='1' core_id='0' siblings='0,8'/>
<cpu id='2' socket_id='1' core_id='1' siblings='2,10'/>
<cpu id='4' socket_id='1' core_id='9' siblings='4,12'/>
<cpu id='6' socket_id='1' core_id='10' siblings='6,14'/>
<cpu id='8' socket_id='1' core_id='0' siblings='0,8'/>
<cpu id='10' socket_id='1' core_id='1' siblings='2,10'/>
<cpu id='12' socket_id='1' core_id='9' siblings='4,12'/>
<cpu id='14' socket_id='1' core_id='10' siblings='6,14'/>
</cpus>
</cell>
<cell id='1'>
<memory unit='KiB'>33002376</memory>
<pages unit='KiB' size='4'>7726306</pages>
<pages unit='KiB' size='2048'>1024</pages>
<pages unit='KiB' size='1048576'>0</pages>
<distances>
<sibling id='0' value='20'/>
<sibling id='1' value='10'/>
</distances>
<cpus num='8'>
<cpu id='1' socket_id='0' core_id='0' siblings='1,9'/>
<cpu id='3' socket_id='0' core_id='1' siblings='3,11'/>
<cpu id='5' socket_id='0' core_id='9' siblings='5,13'/>
<cpu id='7' socket_id='0' core_id='10' siblings='7,15'/>
<cpu id='9' socket_id='0' core_id='0' siblings='1,9'/>
<cpu id='11' socket_id='0' core_id='1' siblings='3,11'/>
<cpu id='13' socket_id='0' core_id='9' siblings='5,13'/>
<cpu id='15' socket_id='0' core_id='10' siblings='7,15'/>
</cpus>
</cell>
</cells>
</topology>
# virsh vcpuinfo bigip-sriov
VCPU: 0
CPU: 0
State: running
CPU time: 249177.5s
CPU Affinity: y---------------
VCPU: 1
CPU: 2
State: running
CPU time: 195544.8s
CPU Affinity: --y-------------
VCPU: 2
CPU: 4
State: running
CPU time: 195515.5s
CPU Affinity: ----y-----------
VCPU: 3
CPU: 6
State: running
CPU time: 194525.0s
CPU Affinity: ------y---------
# virsh vcpuinfo sriov-enp4s0f0-vf1
VCPU: 0
CPU: 1
State: running
CPU time: 4823.9s
CPU Affinity: -y--------------
VCPU: 1
CPU: 3
State: running
CPU time: 4015.0s
CPU Affinity: ---y------------
VCPU: 2
CPU: 5
State: running
CPU time: 3732.3s
CPU Affinity: -----y----------
VCPU: 3
CPU: 7
State: running
CPU time: 3605.6s
CPU Affinity: -------y--------
# virsh vcpuinfo sriov-enp4s0f1-vf1
VCPU: 0
CPU: 9
State: running
CPU time: 2501.8s
CPU Affinity: ---------y------
VCPU: 1
CPU: 11
State: running
CPU time: 2494.1s
CPU Affinity: -----------y----
VCPU: 2
CPU: 13
State: running
CPU time: 3974.2s
CPU Affinity: -------------y--
VCPU: 3
CPU: 15
State: running
CPU time: 2863.9s
CPU Affinity: ---------------y
I found an SR-IOV research doc,
https://pdfs.semanticscholar.org/6678/b17fc8758efea8d32c2d47f9924f8a0cdc6d.pdf,
but it seems old.
Could anyone point me in a direction to look into? Let me know if I can
provide more VM details. I work for F5 Networks and our internal developers
found no problem with the BIGIP VE driver, so I am thinking this is more in
the area of inter-SR-IOV-VM communication and VF <--> PF interaction.