Hi Andrija,
Thanks for the pointers!!! I managed to fix my issues. It would be great to document your experiences on the vxlan/cloudstack docs page.

By slow I meant the same behaviour you've described: pings and small file downloads are fast (a few MB/s), but large file download/transfer speeds fall to a few KB/bytes per second (and sometimes stall). I checked and found that LRO was off by default:

# ethtool -k enp2s0 | grep large
large-receive-offload: off [fixed]

Another issue I found was that with Ubuntu 18.04, netplan does not apply the MTU settings provided in the yaml file:
https://bugs.launchpad.net/ubuntu/+source/nplan/+bug/1724895 (facepalm 😞)

For that issue I used the workaround described in https://djanotes.blogspot.com/2018/01/netplan-setting-mtu-for-bridge-devices.html and, after rebooting my hosts, the nested VMs were able to download files from the public Internet at the ISP/network-provided speeds.

- Rohit
<https://cloudstack.apache.org>

________________________________
From: Andrija Panic <andrija.pa...@gmail.com>
Sent: Monday, November 19, 2018 7:46:43 PM
To: dev
Cc: Wido den Hollander
Subject: Re: VXLAN and KVm experiences

Define slow, please?

- Is the MTU for the parent interface of all vxlan interfaces set to 1550 or more (vxlan interface MTU == 50 bytes less than the parent interface)?

Can you check if LRO is disabled on the physical nics? With LRO issues (I spent 2 days of my life on this back in the day...) ping works fine, but any larger packet goes to almost zero KB/s... (Ubuntu thing btw...)

Cheers

On Mon, 19 Nov 2018 at 14:36, Rohit Yadav <rohit.ya...@shapeblue.com> wrote:

> All,
>
> I need some pointers around vxlan debugging and configuration (sorry for the long email).
>
> I'm working on a concept CI system where the idea is to set up CloudStack with KVM hosts and use vxlan isolation for the guest, mgmt and public networks, and then run CI jobs as CloudStack projects where monkeybox VMs (nested KVM VMs) run in isolated networks and are used to test a CloudStack build/branch/PR.
>
> I've two Ubuntu 18.04.1 based i7 mini PCs running KVM, where there is a single bridge/nic cloudbr0 to carry the public, guest and mgmt networks, all vxlan based. I've set max_igmp_memberships to 200 and, to see the console proxy etc., I used vxlan://untagged for the public IP address range. The gigabit switch between them does not support igmp snooping. Now the problem is that in the nested VMs in an isolated network (the VR's public nic plugs into cloudbr0, and the guest nic plugs into a bridge that has the vxlan endpoint for some VNI), the download speed from the public network is very slow. I've enabled the default udp port for vxlan on both hosts. How do I debug vxlans, what's going wrong? (Do note that I have a single bridge for all those networks, with no vlans.)
>
>
> Regards,
> Rohit Yadav
>
> ________________________________
> From: Simon Weller <swel...@ena.com.INVALID>
> Sent: Wednesday, November 14, 2018 10:55:18 PM
> To: Wido den Hollander; dev@cloudstack.apache.org
> Subject: Re: VXLAN and KVm experiences
>
> Wido,
>
> Here is the original document on the implementation of VXLAN in ACS:
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Linux+native+VXLAN+support+on+KVM+hypervisor
>
> It may shed some light on the reasons for the different multicast groups.
>
> - Si
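For reference, checking (and, if a driver has it enabled, turning off) LRO comes down to ethtool; a minimal sketch, using the same enp2s0 NIC as in my output above:

  # show the current offload state
  ethtool -k enp2s0 | grep large-receive-offload

  # force LRO off if the driver reports it as "on"
  # (not needed when it already shows "off [fixed]")
  ethtool -K enp2s0 lro off

The netplan MTU workaround I mentioned boils down to setting the bridge/parent MTU outside of netplan at boot. The linked post has the exact recipe; a rough equivalent (bridge name, MTU value and hook path are only examples, not a definitive recipe) is a small networkd-dispatcher hook:

  # /etc/networkd-dispatcher/routable.d/50-bridge-mtu  (make it executable)
  #!/bin/sh
  # netplan on 18.04 does not apply the mtu from the yaml (see the launchpad bug above),
  # so set it by hand once the interface comes up
  ip link set dev cloudbr0 mtu 9000

  # verify after a reboot
  ip link show cloudbr0 | grep mtu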
>
> ________________________________
> From: Wido den Hollander <w...@widodh.nl>
> Sent: Tuesday, November 13, 2018 4:40 AM
> To: dev@cloudstack.apache.org; Simon Weller
> Subject: Re: VXLAN and KVm experiences
>
> On 10/23/18 2:34 PM, Simon Weller wrote:
> > Linux native VXLAN uses multicast and each host has to participate in multicast in order to see the VXLAN networks. We haven't tried using PIM across an L3 boundary with ACS, although it will probably work fine.
> >
> > Another option is to use an L3 VTEP, but right now there is no native support for that in CloudStack's VXLAN implementation, although we've thought about proposing it as a feature.
> >
>
> Getting back to this, I see CloudStack does this [0]:
>
> local mcastGrp="239.$(( ($vxlanId >> 16) % 256 )).$(( ($vxlanId >> 8) % 256 )).$(( $vxlanId % 256 ))"
>
> VNI 1000 would use group 239.0.3.232 and VNI 1001 would use 239.0.3.233.
>
> Why are we using a different mcast group for every VNI? As the VNI is encoded in the packet, this should just work with one group, right?
>
> Because this way you need to configure all those groups on your Router(s), as each VNI will use a different Multicast Group.
>
> I'm just looking for the reason why we have these different multicast groups.
>
> I was thinking that we might want to add an option to agent.properties where we allow users to set a fixed Multicast group for all traffic.
>
> Wido
>
> [0]: https://github.com/apache/cloudstack/blob/master/scripts/vm/network/vnet/modifyvxlan.sh#L33
>
> > ________________________________
> > From: Wido den Hollander <w...@widodh.nl>
> > Sent: Tuesday, October 23, 2018 7:17 AM
> > To: dev@cloudstack.apache.org; Simon Weller
> > Subject: Re: VXLAN and KVm experiences
> >
> > On 10/23/18 1:51 PM, Simon Weller wrote:
> >> We've also been using VXLAN on KVM for all of our isolated VPC guest networks for quite a long time now. As Andrija pointed out, make sure you increase the max_igmp_memberships param and also put an IP address on each host's VXLAN interface, in the same subnet for all hosts that will share networking, or multicast won't work.
> >>
> >
> > Thanks! So you are saying that all hypervisors need to be in the same L2 network, or are you routing the multicast?
> >
> > My idea was that each POD would be an isolated Layer 3 domain and that a VNI would span over the different Layer 3 networks.
> >
> > I don't like STP and other Layer 2 loop-prevention systems.
> >
> > Wido
> >
> >>
> >> - Si
> >>
> >> ________________________________
> >> From: Wido den Hollander <w...@widodh.nl>
> >> Sent: Tuesday, October 23, 2018 5:21 AM
> >> To: dev@cloudstack.apache.org
> >> Subject: Re: VXLAN and KVm experiences
> >>
> >> On 10/23/18 11:21 AM, Andrija Panic wrote:
> >>> Hi Wido,
> >>>
> >>> I have "pioneered" this one in production for the last 3 years (and suffered the nasty pain of silent packet drops on kernel 3.X back in the days, because of being unaware of the max_igmp_memberships kernel parameter, so I updated the manual a long time ago).
> >>>
> >>> I never had any issues (besides the above nasty one...) and it works very well.
> >>
> >> That's what I want to hear!
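As a quick illustration of the per-VNI multicast group mapping quoted from modifyvxlan.sh above - a standalone sketch (the helper name is made up for illustration):

  #!/bin/bash
  # same arithmetic as modifyvxlan.sh: derive the 239.x.y.z group from the VNI
  vni_to_group() {
      local vxlanId=$1
      echo "239.$(( (vxlanId >> 16) % 256 )).$(( (vxlanId >> 8) % 256 )).$(( vxlanId % 256 ))"
  }

  vni_to_group 1000   # 239.0.3.232
  vni_to_group 1001   # 239.0.3.233
  vni_to_group 867    # 239.0.3.99

The last one matches the "group 239.0.3.99" visible for vxlan867 in the ip -d link output further down.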
> >>
> >>> To avoid the above issue I described, you should increase max_igmp_memberships (/proc/sys/net/ipv4/igmp_max_memberships) - otherwise with more than 20 vxlan interfaces some of them will stay in a down state and hard-drop traffic (with a proper message in agent.log) on kernel > 4.0 (or drop packets silently, randomly and bitchily on kernel 3.X...) - and also pay attention to the MTU size as well - anyway, everything is in the manual (I updated everything I thought was missing), so please check it.
> >>>
> >>
> >> Yes, the underlying network will all be 9000 bytes MTU.
> >>
> >>> Our example setup:
> >>>
> >>> We have i.e. bond0.950 as the main VLAN which will carry all vxlan "tunnels" - so this is defined as the KVM traffic label. In our case it didn't make sense to use a bridge on top of this bond0.950 (as the traffic label) - you can test it on your own - since this bridge is used only to extract the child bond0.950 interface name; then, based on the vxlan ID, ACS will provision vxlan...@bond0.xxx and join this new vxlan interface to the NEW bridge it creates (and then of course the vNIC goes to this new bridge), so the original bridge (to which bond0.xxx belonged) is not used for anything.
> >>>
> >>
> >> Clear, I indeed thought something like that would happen.
> >>
> >>> Here is a sample from the above for vxlan 867 used for tenant isolation:
> >>>
> >>> root@hostname:~# brctl show brvx-867
> >>>
> >>> bridge name   bridge id           STP enabled   interfaces
> >>> brvx-867      8000.2215cfce99ce   no            vnet6
> >>>                                                 vxlan867
> >>>
> >>> root@hostname:~# ip -d link show vxlan867
> >>>
> >>> 297: vxlan867: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8142 qdisc noqueue master brvx-867 state UNKNOWN mode DEFAULT group default qlen 1000
> >>>     link/ether 22:15:cf:ce:99:ce brd ff:ff:ff:ff:ff:ff promiscuity 1
> >>>     vxlan id 867 group 239.0.3.99 dev bond0.950 port 0 0 ttl 10 ageing 300
> >>>
> >>> root@ix1-c7-2:~# ifconfig bond0.950 | grep MTU
> >>>           UP BROADCAST RUNNING MULTICAST  MTU:8192  Metric:1
> >>>
> >>> So note how the vxlan interface has a 50 bytes smaller MTU than the bond0.950 parent interface (which could affect traffic inside the VM) - so jumbo frames are needed on the parent interface anyway (bond0.950 in the example above, with a minimum of 1550 MTU).
> >>>
> >>
> >> Yes, thanks! We will be using 1500 MTU inside the VMs, so all the networks underneath will be ~9k.
> >>
> >>> Ping me if more details are needed, happy to help.
> >>>
> >>
> >> Awesome! We'll be doing a PoC rather soon. I'll come back with our experiences later.
> >>
> >> Wido
> >>
> >>> Cheers
> >>> Andrija
> >>>
> >>
> On Tue, 23 Oct 2018 at 08:23, Wido den Hollander <w...@widodh.nl> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I just wanted to know if there are people out there using KVM with Advanced Networking and using VXLAN for different networks.
> >>>>
> >>>> Our main goal would be to spawn a VM and, based on the network the NIC is in, attach it to a different VXLAN bridge on the KVM host.
> >>>>
> >>>> It seems to me that this should work, but I just wanted to check and see if people have experience with it.
> >>>>
> >>>> Wido
> >>>>
> >>>
> >>

--
Andrija Panić
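Putting Andrija's notes together, the host-side pieces look roughly like the sketch below, reconstructed from the outputs above (interface names, VNI, group and MTU come from his example; CloudStack's modifyvxlan.sh creates the vxlan device and bridge itself, so the last block is only to illustrate what ends up on the host):

  # allow more than the default 20 multicast group memberships
  # (persist it via /etc/sysctl.d/ in practice)
  sysctl -w net.ipv4.igmp_max_memberships=200

  # jumbo MTU on the parent interface, so the vxlan device (parent MTU - 50)
  # can still carry 1500-byte guest frames
  ip link set dev bond0.950 mtu 8192

  # roughly what the agent provisions for VNI 867:
  ip link add vxlan867 type vxlan id 867 group 239.0.3.99 dev bond0.950 ttl 10
  ip link add brvx-867 type bridge
  ip link set vxlan867 master brvx-867
  ip link set vxlan867 up
  ip link set brvx-867 up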