At Mon, 18 Jan 2016 12:12:28 +0900,
IWAMOTO Toshihiro wrote:
> I'm sending out this mail to share the finding and discuss how to
> improve with those interested in neutron ovs performance.
> TL;DR: The native of_interface code, which has been merged recently
> and isn't default, seems to consume less CPU time but gives a mixed
> result.  I'm looking into this for improvement.

I went on to look at implementation details of eventlet etc., but the
cause turned out to be fairly simple.  The OVS agent in
of_interface=native mode waits for an OpenFlow connection from
ovs-vswitchd, which can take up to 5 seconds.
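In other words, start-up in native mode includes a phase like this toy
sketch (plain sockets standing in for the agent's actual controller; the
delay and the ephemeral port are made up for illustration):

```python
# Toy model of the native-mode start-up stall: the agent listens and then
# blocks until ovs-vswitchd opens the OpenFlow connection back to it.
# The 0.3 s "reconnect" delay below is invented; the agent itself is
# reported to wait up to 5 seconds.
import socket
import threading
import time

def fake_agent(listen_sock, results):
    # Agent side: block in accept() until the switch connects.
    listen_sock.settimeout(5.0)            # give up after 5 s, like the agent
    t0 = time.monotonic()
    conn, _ = listen_sock.accept()         # start-up stalls right here
    results['wait'] = time.monotonic() - t0
    conn.close()

listen_sock = socket.socket()
listen_sock.bind(('127.0.0.1', 0))         # ephemeral port, not the real 6633
listen_sock.listen(1)
port = listen_sock.getsockname()[1]

results = {}
t = threading.Thread(target=fake_agent, args=(listen_sock, results))
t.start()

time.sleep(0.3)                            # pretend the switch connects late
socket.create_connection(('127.0.0.1', port)).close()
t.join()
print('agent waited %.1f s for the switch' % results['wait'])
```

The measured wait tracks the artificial 0.3 s delay; in the real agent the
same accept is at the mercy of ovs-vswitchd's reconnect timing.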

Please look at the attached graph.
The x-axis is the time since the agent restart, the y-axis is the number
of ports processed (in treat_devices and bind_devices).  Each port is
counted twice; the first slope is treat_devices and the second is
bind_devices.  The native of_interface needs somewhat more time on
start-up, but its bind_devices is about 2x faster.

The data was collected with 160 VMs with the devstack default settings.

> * Introduction
> With an ML2+ovs Neutron configuration, openflow rule modification
> happens often and is a somewhat heavy operation, as it involves an
> exec() of the ovs-ofctl command.
> The native of_interface driver doesn't use the ovs-ofctl command and
> should have less performance impact on the system.  This document
> tries to confirm this hypothesis.
> * Method
> In order to focus on openflow rule operation time and avoid noise from
> other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> restarted and the time it took to reconfigure the flows was measured.
> 1. Use devstack to start a test environment.  As debug logs generate a
>    considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> 2. Apply to enable
>    measurement of flow reconfiguration times.
> 3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
>    flows.  If you have >16G RAM, more could be booted.
> 4. Stop neutron-openvswitch-agent and restart with --run-once arg.
>    Use time, oprofile, and python's cProfile (use --profile arg) to
>    collect data.
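For the cProfile part of step 4, the data can be digested with the
standard cProfile/pstats modules alone; a stand-in illustration (the
reconfigure_flows() workload below is invented, not neutron code):

```python
# Sketch of inspecting a cProfile run the way step 4's --profile data can
# be inspected.  reconfigure_flows() is a made-up stand-in workload.
import cProfile
import io
import pstats

def reconfigure_flows(n_ports):
    # Pretend to rebuild flows for n_ports ports.
    total = 0
    for port in range(n_ports):
        total += sum(hash((port, i)) % 7 for i in range(200))
    return total

profiler = cProfile.Profile()
profiler.enable()
reconfigure_flows(160)
profiler.disable()

# Render the top entries by cumulative time into a string report.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats('cumulative').print_stats(3)
report = buf.getvalue()
summary = next(line for line in report.splitlines() if 'function calls' in line)
print(summary.strip())
```

The same sort_stats('cumulative') view is what I used to look for hot
spots in the agent's profile output.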
> * Results
> Execution time (averages of 3 runs):
>             native     28.3s user 2.9s sys 0.4s
>             ovs-ofctl  25.7s user 2.2s sys 0.3s
> ovs-ofctl runs faster and seems to use less CPU, but the figures above
> don't include the execution time of the ovs-ofctl command itself.

With 160 VMs and debug=false for the OVS agent and the neutron-server,

Execution time (averages and SDs of 10 runs):

            native     56.4+-3.4s  user 8.7+-0.1s   sys 0.82+-0.04s
            ovs-ofctl  55.9+-1.0s  user 6.9+-0.08s  sys 0.67+-0.05s

To exclude the OpenFlow connection wait, the times between the log
outputs "Loaded agent extensions" and "Configuration for devices up
completed" were also compared:

            native     48.2+-0.49s
            ovs-ofctl  53.2+-0.99s
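For reference, the averages and SDs quoted here are plain sample
statistics over the 10 runs; with Python's statistics module the
computation is just (the ten durations below are invented placeholders,
not my measured runs):

```python
# Summarize per-run durations as mean +- sample SD, the way the tables
# above are summarized.  The numbers are invented placeholder data.
import statistics

native_runs = [47.6, 48.9, 48.1, 47.8, 48.5, 48.3, 47.9, 48.6, 48.0, 48.3]
mean = statistics.mean(native_runs)
sd = statistics.stdev(native_runs)     # sample SD (divides by n-1)
print('native     %.1f+-%.2fs' % (mean, sd))   # -> native     48.2+-0.40s
```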

The native of_interface is the clear winner.

> Oprofile data collected by running "operf -s -t" contains this missing
> information.
> With of_interface=native config, "opreport tgid:<pid of ovs agent>" shows:
>    samples|      %|
> ------------------
>     87408 100.000 python2.7
>       CPU_CLK_UNHALT...|
>         samples|      %|
>       ------------------
>           69160 79.1232 python2.7
>            8416  9.6284 vmlinux-3.13.0-24-generic
> and "opreport --merge tgid" doesn't show ovs-ofctl.
> With of_interface=ovs-ofctl, "opreport tgid:<pid of ovs agent>" shows:
>    samples|      %|
> ------------------
>     62771 100.000 python2.7
>         CPU_CLK_UNHALT...|
>           samples|      %|
>         ------------------
>             49418 78.7274 python2.7
>              6483 10.3280 vmlinux-3.13.0-24-generic
> and  "opreport --merge tgid" shows CPU consumption by ovs-ofctl 
>     35774  3.5979 ovs-ofctl
>         CPU_CLK_UNHALT...|
>           samples|      %|
>         ------------------
>             28219 78.8813 vmlinux-3.13.0-24-generic
>              3487  9.7473
>              2301  6.4320 ovs-ofctl
> Comparing 87408 (native python) with 62771+35774, the native
> of_interface uses 0.4s less CPU time overall.
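The 0.4s figure can be sanity-checked from the sample counts alone, if
one assumes operf's default CPU_CLK_UNHALTED sample period of 100000
cycles; the 2.8 GHz clock rate below is my assumption to make the
arithmetic concrete, not a measured value:

```python
# Back-of-the-envelope conversion of oprofile sample counts to CPU seconds.
CYCLES_PER_SAMPLE = 100000    # operf default sample period (assumed)
CLOCK_HZ = 2.8e9              # assumed CPU clock rate

native_samples = 87408
ovs_ofctl_samples = 62771 + 35774   # agent python + ovs-ofctl processes

delta_samples = ovs_ofctl_samples - native_samples
delta_seconds = delta_samples * CYCLES_PER_SAMPLE / CLOCK_HZ
print('native uses about %.1f s less CPU' % delta_seconds)
# -> native uses about 0.4 s less CPU
```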
> * Conclusion and future steps
> The native of_interface uses slightly less CPU time but takes longer
> to complete a flow reconfiguration after an agent restart.
> As an OVS agent accounts for only 1/10th of total CPU usage during a
> flow reconfiguration (data not shown), there may be other areas for
> improvement.
> The cProfile Python module gives finer-grained data, but no apparent
> performance bottleneck was found.  The data show more eventlet context
> switches with the native of_interface, which is due to how the native
> of_interface is written.  I'm looking into improving its CPU usage and
> latency.

Attachment: of_int-comparison.pdf

OpenStack Development Mailing List (not for usage questions)