Hi Andrija,
Thanks for the pointers!!! I managed to fix my issues. It would be great to document your experiences on the vxlan/cloudstack docs page.

By slow I meant the same behaviour you've described: pings and small file downloads are fast (a few MB/s), but large file download/transfer speeds fall to a few KB/bytes per second (and sometimes stall). I checked and found that LRO was off by default:

# ethtool -k enp2s0 | grep large
large-receive-offload: off [fixed]

Another issue I found was that with Ubuntu 18.04, netplan does not apply the MTU settings provided in the yaml file:
https://bugs.launchpad.net/ubuntu/+source/nplan/+bug/1724895 (facepalm 😞)

For that issue I used the workaround described in https://djanotes.blogspot.com/2018/01/netplan-setting-mtu-for-bridge-devices.html and, after rebooting my hosts, the nested VMs were able to download files from the public Internet at the ISP/network-provided speeds.

- Rohit
<https://cloudstack.apache.org>

________________________________
From: Andrija Panic <andrija.pa...@gmail.com>
Sent: Monday, November 19, 2018 7:46:43 PM
To: dev
Cc: Wido den Hollander
Subject: Re: VXLAN and KVm experiences

Define slow, please?

- Is the MTU for the parent interface of all vxlan interfaces set to 1550 or more (vxlan interface MTU == 50 bytes less than the parent interface)?

Can you check if LRO is disabled on the physical nics? With LRO issues (I spent 2 days of my life on this back in the day...) ping works fine, but any larger packet goes to almost zero KB/s... (Ubuntu thing btw...)

Cheers

On Mon, 19 Nov 2018 at 14:36, Rohit Yadav <rohit.ya...@shapeblue.com> wrote:

> All,
>
> I need some pointers around vxlan debugging and configuration (sorry for the long email).
>
> I'm working on a concept CI system where the idea is to set up CloudStack with KVM hosts and use vxlan isolation for the guest, mgmt and public networks, and then run CI jobs as CloudStack projects where monkeybox VMs (nested KVM VMs) run in isolated networks and are used to test a CloudStack build/branch/PR.
>
> I've two Ubuntu 18.04.1 based i7 mini PCs running KVM, where there is a single bridge/nic cloudbr0 to carry the public, guest and mgmt networks, all vxlan based. I've set max_igmp_memberships to 200 and, to see the console proxy etc., I used vxlan://untagged for the public IP address range. The gigabit switch between them does not support igmp snooping. Now the problem is that in the nested VMs in an isolated network (the VR's public nic plugs into cloudbr0, and the guest nic plugs into a bridge that has the vxlan endpoint for some VNI), the download speed from the public network is very slow. I've enabled the default udp port for vxlan on both hosts. How do I debug vxlans, what's going wrong? (Do note that I have a single bridge for all those networks, with no vlans.)
>
>
> Regards,
> Rohit Yadav
>
> ________________________________
> From: Simon Weller <swel...@ena.com.INVALID>
> Sent: Wednesday, November 14, 2018 10:55:18 PM
> To: Wido den Hollander; dev@cloudstack.apache.org
> Subject: Re: VXLAN and KVm experiences
>
> Wido,
>
> Here is the original document on the implementation of VXLAN in ACS:
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Linux+native+VXLAN+support+on+KVM+hypervisor
>
> It may shed some light on the reasons for the different multicast groups.
>
> - Si
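For reference, checking (and, if a driver has it enabled, turning off) LRO comes down to ethtool; a minimal sketch, using the same enp2s0 NIC as in my output above:

  # show the current offload state
  ethtool -k enp2s0 | grep large-receive-offload

  # force LRO off if the driver reports it as "on"
  # (not needed when it already shows "off [fixed]")
  ethtool -K enp2s0 lro off

The netplan MTU workaround I mentioned boils down to setting the bridge/parent MTU outside of netplan at boot. The linked post has the exact recipe; a rough equivalent (bridge name, MTU value and hook path are only examples, not a definitive recipe) is a small networkd-dispatcher hook:

  # /etc/networkd-dispatcher/routable.d/50-bridge-mtu  (make it executable)
  #!/bin/sh
  # netplan on 18.04 does not apply the mtu from the yaml (see the launchpad bug above),
  # so set it by hand once the interface comes up
  ip link set dev cloudbr0 mtu 9000

  # verify after a reboot
  ip link show cloudbr0 | grep mtu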
>
> ________________________________
> From: Wido den Hollander <w...@widodh.nl>
> Sent: Tuesday, November 13, 2018 4:40 AM
> To: dev@cloudstack.apache.org; Simon Weller
> Subject: Re: VXLAN and KVm experiences
>
> On 10/23/18 2:34 PM, Simon Weller wrote:
> > Linux native VXLAN uses multicast and each host has to participate in multicast in order to see the VXLAN networks. We haven't tried using PIM across an L3 boundary with ACS, although it will probably work fine.
> >
> > Another option is to use an L3 VTEP, but right now there is no native support for that in CloudStack's VXLAN implementation, although we've thought about proposing it as a feature.
> >
>
> Getting back to this, I see CloudStack does this [0]:
>
> local mcastGrp="239.$(( ($vxlanId >> 16) % 256 )).$(( ($vxlanId >> 8) % 256 )).$(( $vxlanId % 256 ))"
>
> VNI 1000 would use group 239.0.3.232 and VNI 1001 would use 239.0.3.233.
>
> Why are we using a different mcast group for every VNI? As the VNI is encoded in the packet, this should just work with one group, right?
>
> Because this way you need to configure all those groups on your Router(s), as each VNI will use a different Multicast Group.
>
> I'm just looking for the reason why we have these different multicast groups.
>
> I was thinking that we might want to add an option to agent.properties where we allow users to set a fixed Multicast group for all traffic.
>
> Wido
>
> [0]: https://github.com/apache/cloudstack/blob/master/scripts/vm/network/vnet/modifyvxlan.sh#L33
>
> > ________________________________
> > From: Wido den Hollander <w...@widodh.nl>
> > Sent: Tuesday, October 23, 2018 7:17 AM
> > To: dev@cloudstack.apache.org; Simon Weller
> > Subject: Re: VXLAN and KVm experiences
> >
> > On 10/23/18 1:51 PM, Simon Weller wrote:
> >> We've also been using VXLAN on KVM for all of our isolated VPC guest networks for quite a long time now. As Andrija pointed out, make sure you increase the max_igmp_memberships param and also put an IP address on each host's VXLAN interface, in the same subnet for all hosts that will share networking, or multicast won't work.
> >>
> >
> > Thanks! So you are saying that all hypervisors need to be in the same L2 network, or are you routing the multicast?
> >
> > My idea was that each POD would be an isolated Layer 3 domain and that a VNI would span over the different Layer 3 networks.
> >
> > I don't like STP and other Layer 2 loop-prevention systems.
> >
> > Wido
> >
> >>
> >> - Si
> >>
> >> ________________________________
> >> From: Wido den Hollander <w...@widodh.nl>
> >> Sent: Tuesday, October 23, 2018 5:21 AM
> >> To: dev@cloudstack.apache.org
> >> Subject: Re: VXLAN and KVm experiences
> >>
> >> On 10/23/18 11:21 AM, Andrija Panic wrote:
> >>> Hi Wido,
> >>>
> >>> I have "pioneered" this one in production for the last 3 years (and suffered the nasty pain of silent packet drops on kernel 3.X back in the days, because of being unaware of the max_igmp_memberships kernel parameter, so I updated the manual a long time ago).
> >>>
> >>> I never had any issues (besides the above nasty one...) and it works very well.
> >>
> >> That's what I want to hear!
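As a quick illustration of the per-VNI multicast group mapping quoted from modifyvxlan.sh above - a standalone sketch (the helper name is made up for illustration):

  #!/bin/bash
  # same arithmetic as modifyvxlan.sh: derive the 239.x.y.z group from the VNI
  vni_to_group() {
      local vxlanId=$1
      echo "239.$(( (vxlanId >> 16) % 256 )).$(( (vxlanId >> 8) % 256 )).$(( vxlanId % 256 ))"
  }

  vni_to_group 1000   # 239.0.3.232
  vni_to_group 1001   # 239.0.3.233
  vni_to_group 867    # 239.0.3.99

The last one matches the "group 239.0.3.99" visible for vxlan867 in the ip -d link output further down.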
> >>
> >>> To avoid the above issue I described, you should increase max_igmp_memberships (/proc/sys/net/ipv4/igmp_max_memberships) - otherwise with more than 20 vxlan interfaces some of them will stay in a down state and hard-drop traffic (with a proper message in agent.log) on kernel > 4.0 (or drop packets silently, randomly and bitchily on kernel 3.X...) - and also pay attention to the MTU size as well - anyway, everything is in the manual (I updated everything I thought was missing), so please check it.
> >>>
> >>
> >> Yes, the underlying network will all be 9000 bytes MTU.
> >>
> >>> Our example setup:
> >>>
> >>> We have i.e. bond0.950 as the main VLAN which will carry all vxlan "tunnels" - so this is defined as the KVM traffic label. In our case it didn't make sense to use a bridge on top of this bond0.950 (as the traffic label) - you can test it on your own - since this bridge is used only to extract the child bond0.950 interface name; then, based on the vxlan ID, ACS will provision vxlan...@bond0.xxx and join this new vxlan interface to the NEW bridge it creates (and then of course the vNIC goes to this new bridge), so the original bridge (to which bond0.xxx belonged) is not used for anything.
> >>>
> >>
> >> Clear, I indeed thought something like that would happen.
> >>
> >>> Here is a sample from the above for vxlan 867 used for tenant isolation:
> >>>
> >>> root@hostname:~# brctl show brvx-867
> >>>
> >>> bridge name   bridge id           STP enabled   interfaces
> >>> brvx-867      8000.2215cfce99ce   no            vnet6
> >>>                                                 vxlan867
> >>>
> >>> root@hostname:~# ip -d link show vxlan867
> >>>
> >>> 297: vxlan867: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8142 qdisc noqueue master brvx-867 state UNKNOWN mode DEFAULT group default qlen 1000
> >>>     link/ether 22:15:cf:ce:99:ce brd ff:ff:ff:ff:ff:ff promiscuity 1
> >>>     vxlan id 867 group 239.0.3.99 dev bond0.950 port 0 0 ttl 10 ageing 300
> >>>
> >>> root@ix1-c7-2:~# ifconfig bond0.950 | grep MTU
> >>>           UP BROADCAST RUNNING MULTICAST  MTU:8192  Metric:1
> >>>
> >>> So note how the vxlan interface has a 50 bytes smaller MTU than the bond0.950 parent interface (which could affect traffic inside the VM) - so jumbo frames are needed on the parent interface anyway (bond0.950 in the example above, with a minimum of 1550 MTU).
> >>>
> >>
> >> Yes, thanks! We will be using 1500 MTU inside the VMs, so all the networks underneath will be ~9k.
> >>
> >>> Ping me if more details are needed, happy to help.
> >>>
> >>
> >> Awesome! We'll be doing a PoC rather soon. I'll come back with our experiences later.
> >>
> >> Wido
> >>
> >>> Cheers
> >>> Andrija
> >>>
> >>
> On Tue, 23 Oct 2018 at 08:23, Wido den Hollander <w...@widodh.nl> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I just wanted to know if there are people out there using KVM with Advanced Networking and using VXLAN for different networks.
> >>>>
> >>>> Our main goal would be to spawn a VM and, based on the network the NIC is in, attach it to a different VXLAN bridge on the KVM host.
> >>>>
> >>>> It seems to me that this should work, but I just wanted to check and see if people have experience with it.
> >>>>
> >>>> Wido
> >>>>
> >>>
> >>

--
Andrija Panić
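Putting Andrija's notes together, the host-side pieces look roughly like the sketch below, reconstructed from the outputs above (interface names, VNI, group and MTU come from his example; CloudStack's modifyvxlan.sh creates the vxlan device and bridge itself, so the last block is only to illustrate what ends up on the host):

  # allow more than the default 20 multicast group memberships
  # (persist it via /etc/sysctl.d/ in practice)
  sysctl -w net.ipv4.igmp_max_memberships=200

  # jumbo MTU on the parent interface, so the vxlan device (parent MTU - 50)
  # can still carry 1500-byte guest frames
  ip link set dev bond0.950 mtu 8192

  # roughly what the agent provisions for VNI 867:
  ip link add vxlan867 type vxlan id 867 group 239.0.3.99 dev bond0.950 ttl 10
  ip link add brvx-867 type bridge
  ip link set vxlan867 master brvx-867
  ip link set vxlan867 up
  ip link set brvx-867 up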