I thought heartbeat messages had higher priority than packet-ins, but I may be mistaken. Maybe Jozef or someone else has an idea on this. Also, this may be a case where the split connection handler that was implemented in Carbon helps (lots of connects/disconnects, i.e. connection flapping). Is this also an issue with Carbon?
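To make the prioritization idea concrete, here is a minimal sketch of what "heartbeats before packet-ins" could look like. This is not how openflowplugin handles messages today (that is exactly what I am unsure about); the class name, the priority table, and the message type strings are all hypothetical and only illustrate the technique of a priority queue with FIFO ordering inside each priority level.

```python
import heapq
import itertools

# Hypothetical priority table: a lower number is served first.
# The type names are illustrative labels, not openflowplugin identifiers.
PRIORITY = {"ECHO_REQUEST": 0, "PACKET_IN": 1}
DEFAULT_PRIORITY = 1


class PrioritizedInbox:
    """Queue that hands out heartbeat messages before any other traffic."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker so that messages with the
        # same priority are served in arrival (FIFO) order.
        self._seq = itertools.count()

    def put(self, msg_type, switch_id):
        prio = PRIORITY.get(msg_type, DEFAULT_PRIORITY)
        heapq.heappush(self._heap, (prio, next(self._seq), msg_type, switch_id))

    def get(self):
        _prio, _seq, msg_type, switch_id = heapq.heappop(self._heap)
        return msg_type, switch_id

    def __len__(self):
        return len(self._heap)


def drain(inbox):
    """Return all queued messages in the order they would be processed."""
    processed = []
    while len(inbox):
        processed.append(inbox.get())
    return processed
```

With a queue like this, an echo request that arrives behind a burst of packet-ins is still answered first, so a busy controller would be less likely to miss a heartbeat and trigger further disconnects. Whether the real message pipeline can be ordered this way is an open question.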
On Sun, Jun 11, 2017 at 2:33 AM, Tim Irnich <[email protected]> wrote:

> Luis, I have not looked at the test you link below, but what we have observed is that vSwitch connects and disconnects cause a high load on ODL, and that the processing of heartbeat messages from other, already connected vSwitches can get delayed due to this, causing a cascade effect.
>
> /Tim
>
> *From:* Luis Gomez [mailto:[email protected]]
> *Sent:* Friday, June 09, 2017 17:07
> *To:* Tim Irnich <[email protected]>
> *Cc:* MORTON, ALFRED C (AL) <[email protected]>; Daniel Farrell <[email protected]>; [email protected]; [email protected]; Nikolas Hermanns <[email protected]>
> *Subject:* Re: [openflowplugin-dev] [integration-dev] Many OVS connects/disconnects causing high load, disconnects, failure
>
> I guess we are talking about OpenFlow connections to OVS switches here. If so, what is the high load scenario the controller is in during this test? For a single controller (4 CPUs) we support up to 400 switches loaded with 10K flows in Carbon:
>
> https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-1node-periodic-sw-scalability-daily-only-carbon/plot/Switch%20Scalability/
>
> Soon I will bring a test to see if this number holds in a cluster.
>
> BR/Luis
>
> On Jun 9, 2017, at 5:05 AM, Tim Irnich <[email protected]> wrote:
>
> Al, I think you’re hitting the nail on the head here. We were thinking the same: giving heartbeat messages priority over other message processing should prevent this cascading effect we have seen. Not sure if the current framework allows this though…
>
> Regards, Tim
>
> *From:* MORTON, ALFRED C (AL) [mailto:[email protected]]
> *Sent:* Friday, June 09, 2017 14:02
> *To:* Daniel Farrell <[email protected]>; integration-dev@lists.opendaylight.org; [email protected]
> *Cc:* Tim Irnich <[email protected]>; Nikolas Hermanns <[email protected]>
> *Subject:* RE: [integration-dev] Many OVS connects/disconnects causing high load, disconnects, failure
>
> Hi Daniel,
>
> (I’m not impersonating Luis or Jamo, but measuring lost southbound packets has been one of my pet projects, as you know...)
>
> If we add the Latte golang tool that Nikkos contributed to the OPNFV Cperf project, we should be able to correlate ODL load levels (cbench) with OVS message loss ratios (and the OVS disconnects; I’m assuming that heartbeats have the same priority as PACKETINs for ODL processing, and maybe prioritization is part of the solution...)
>
> Definitely worth discussing further in Beijing next week,
> Al
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Daniel Farrell
> *Sent:* Friday, June 09, 2017 6:51 AM
> *To:* [email protected]; openflowplug[email protected]
> *Cc:* Tim Irnich; Nikolas Hermanns
> *Subject:* [integration-dev] Many OVS connects/disconnects causing high load, disconnects, failure
>
> Hey Integration/Test, openflowplugin,
>
> OPNFV vswitch perf folks are reporting ODL problems caused by lots of OVS disconnects. See the description below.
>
> @Luis, Jamo - What's the most relevant ODL test?
>
> @Others - Can we fix this?
>
> Thanks,
> Daniel Farrell
>
> On Fri, Jun 9, 2017 at 6:19 AM Nikolas Hermanns <[email protected]> wrote:
>
> Hey Daniel,
>
> I hope you will come to the OPNFV summit next week :-D. I would like to discuss with you a possible new addition to vsperf. We have an issue where, through lots of connects and disconnects of OVS, ODL goes into high load, and because of that the heartbeats from OVS do not reach ODL anymore. Then even more switches disconnect, and finally the whole cluster does not have networking anymore.
>
> There are some workarounds for that, but basically we would like to set up a test case testing the number of switches ODL can easily handle. Not sure yet, something like that.
>
> Can we have a small chat about it next week?
>
> Reach out to me:
> +491729607904 (whatsapp + sms)
> [email protected] (sometimes faster + hangouts)
> Or just this mail address.
>
> BR Nikolas
>
> _______________________________________________
> openflowplugin-dev mailing list
> [email protected]
> https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
>
> _______________________________________________
> integration-dev mailing list
> [email protected]
> https://lists.opendaylight.org/mailman/listinfo/integration-dev
