Hey,

Al, Daniel, shall we sit together during the lunch break today?


Just to give an idea of our problem, the scenario looks like this:

A shared OpenStack/ODL controller with 4 CPU cores, plus 3-4 compute nodes, each with 1 OVS dpnode. So we are far away from 400 switches. During this time the CPU load from the OpenStack services is rather low, but the load from ODL is very high. This is the bug we filed with ODL:

https://bugs.opendaylight.org/show_bug.cgi?id=8186

And this is our corresponding jira issue in OPNFV:

https://jira.opnfv.org/browse/SDNVPN-144

BR Nikolas


On 11.06.2017 11:33, Tim Irnich wrote:

Luis, I have not looked at the test you link below but what we have observed is that vSwitch connects and disconnects cause a high load on ODL, and that the processing of heartbeat messages from other, already connected vSwitches can get delayed due to this, causing a cascade effect.
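
To make the delay concrete, here is a minimal Python sketch (not ODL or openflowplugin code) of the prioritization idea raised further down in this thread: heartbeat messages are dequeued before connect handling and PACKET_INs, so they cannot get stuck behind a burst of connects. The class `PriorityInbox`, the function `drain_order`, the priority constants, and the event kinds are all hypothetical names chosen for illustration; whether the real framework allows such a scheme is an open question in the thread.

```python
import heapq
import itertools

# Hypothetical priority levels (assumption, not an ODL setting);
# a lower number is served first.
PRIO_HEARTBEAT = 0  # e.g. OpenFlow echo request/reply
PRIO_OTHER = 1      # e.g. connection setup, PACKET_IN


class PriorityInbox:
    """Toy message queue that serves heartbeats before everything else."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps FIFO order within one priority level.
        self._seq = itertools.count()

    def put(self, kind, payload):
        prio = PRIO_HEARTBEAT if kind == "heartbeat" else PRIO_OTHER
        heapq.heappush(self._heap, (prio, next(self._seq), kind, payload))

    def get(self):
        _prio, _seq, kind, payload = heapq.heappop(self._heap)
        return kind, payload

    def __len__(self):
        return len(self._heap)


def drain_order(events):
    """Return the order in which queued events would be processed."""
    inbox = PriorityInbox()
    for kind, payload in events:
        inbox.put(kind, payload)
    return [inbox.get() for _ in range(len(inbox))]
```

With a plain FIFO queue the heartbeat from an already connected switch would wait behind every queued connect; in this sketch it is processed first, while messages of equal priority keep their arrival order.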

/Tim

*From:* Luis Gomez [mailto:[email protected]]
*Sent:* Friday, June 09, 2017 17:07
*To:* Tim Irnich <[email protected]>
*Cc:* MORTON, ALFRED C (AL) <[email protected]>; Daniel Farrell <[email protected]>; [email protected]; [email protected]; Nikolas Hermanns <[email protected]>
*Subject:* Re: [openflowplugin-dev] [integration-dev] Many OVS connects/disconnects causing high load, disconnects, failure

I guess we are talking about OpenFlow connections to OVS switches here. If so, what is the high-load scenario the controller is in during this test? For a single controller (4 CPUs) we support up to 400 switches loaded with 10K flows in Carbon:

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-1node-periodic-sw-scalability-daily-only-carbon/plot/Switch%20Scalability/

Soon I will bring a test to see if this number holds in a cluster.

BR/Luis

    On Jun 9, 2017, at 5:05 AM, Tim Irnich <[email protected]
    <mailto:[email protected]>> wrote:

    Al, I think you’re hitting the nail on the head here. We were
    thinking the same: giving heartbeat messages priority over other
    message processing should prevent the cascading effect we have
    seen. Not sure if the current framework allows this though…

    Regards, Tim

    *From:*MORTON, ALFRED C (AL) [mailto:[email protected]]
    *Sent:*Friday, June 09, 2017 14:02
    *To:*Daniel Farrell <[email protected]
    <mailto:[email protected]>>;
    [email protected]
    <mailto:[email protected]>;
    [email protected]
    <mailto:[email protected]>
    *Cc:*Tim Irnich <[email protected]
    <mailto:[email protected]>>; Nikolas Hermanns
    <[email protected] <mailto:[email protected]>>
    *Subject:*RE: [integration-dev] Many OVS connects/disconnects
    causing high load, disconnects, failure

    Hi Daniel,

    (I’m not impersonating Luis or Jamo, but measuring lost
    southbound packets has been one of my pet projects, as you know...)

    If we add the Latte golang tool that Nikkos contributed
    to the OPNFV Cperf project, we should be able to
    correlate ODL load levels (cbench) with OVS message loss ratios
    (and the OVS disconnects, I’m assuming that heartbeats
    have the same priority as PACKETINs for ODL processing,
    and maybe prioritization is part of the solution...)

    Definitely worth discussing further in Beijing next week,

    Al

    *From:*[email protected]
    <mailto:[email protected]>[mailto:[email protected]]
    *On Behalf Of*Daniel Farrell
    *Sent:*Friday, June 09, 2017 6:51 AM
    *To:*[email protected]
    <mailto:[email protected]>;[email protected]
    <mailto:[email protected]>
    *Cc:*Tim Irnich; Nikolas Hermanns
    *Subject:*[integration-dev] Many OVS connects/disconnects causing
    high load, disconnects, failure

    Hey Integration/Test, openflowplugin,

    OPNFV vswitch perf folks are reporting ODL problems caused by lots
    of OVS disconnects. See the description below.

    @Luis, Jamo - What's the most relevant ODL test?

    @Others - Can we fix this?

    Thanks,

    Daniel Farrell

    On Fri, Jun 9, 2017 at 6:19 AM Nikolas Hermanns
    <[email protected]
    <mailto:[email protected]>> wrote:

        Hey Daniel,

        I hope you will come to the OPNFV summit next week :-D. I
        would like to discuss a possible new addition to vsperf with
        you. We have an issue where many connects and disconnects of
        OVS drive ODL into high load, and because of that the
        heartbeats from OVS no longer reach ODL. Then even more
        switches disconnect, and finally the whole cluster loses
        networking.

        There are some workarounds for that, but basically we would
        like to set up a test case that checks the number of switches
        ODL can easily handle. Not sure yet, but something like that.

        Can we have a small chat about it next week?

        Reach out to me:

        +491729607904 <tel:+49%20172%209607904>(whatsapp + sms)

        [email protected]
        <mailto:[email protected]>(sometimes faster + hangouts)

        Or just this mail address.

        BR Nikolas

    _______________________________________________
    openflowplugin-dev mailing list
    [email protected]
    <mailto:[email protected]>
    https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev

