I'm sending out this mail to share these findings and to discuss possible improvements with those interested in neutron ovs performance.
TL;DR: The native of_interface code, which was merged recently and isn't the default, seems to consume less CPU time but gives a mixed result. I'm looking into this for further improvement.

* Introduction

With an ML2+ovs Neutron configuration, openflow rule modification happens often and is a somewhat heavy operation, as it involves an exec() of the ovs-ofctl command. The native of_interface driver doesn't use the ovs-ofctl command and should therefore have less performance impact on the system. This document tries to confirm this hypothesis.

* Method

In order to focus on openflow rule operation time and avoid noise from other operations (VM boot-up, etc.), neutron-openvswitch-agent was restarted and the time it took to reconfigure the flows was measured.

1. Use devstack to start a test environment. As debug logs generate a considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
2. Apply https://review.openstack.org/#/c/267905/ to enable measurement of flow reconfiguration times.
3. Boot 80 m1.nano instances. In my setup, this generates 404 br-int flows. If you have >16G RAM, more could be booted.
4. Stop neutron-openvswitch-agent and restart it with the --run-once arg. Use time, oprofile, and Python's cProfile (via the --profile arg) to collect data.

* Results

Execution time (average of 3 runs):

  native     28.3s  user 2.9s  sys 0.4s
  ovs-ofctl  25.7s  user 2.2s  sys 0.3s

ovs-ofctl runs faster and seems to use less CPU, but the numbers above don't include the execution time of the ovs-ofctl command itself. The oprofile data collected by running "operf -s -t" contain that information.

With of_interface=native, "opreport tgid:<pid of ovs agent>" shows:

  samples|      %|
------------------
    87408 100.000 python2.7
          CPU_CLK_UNHALT...|
            samples|      %|
          ------------------
            69160 79.1232 python2.7
             8416  9.6284 vmlinux-3.13.0-24-generic

and "opreport --merge tgid" doesn't show ovs-ofctl at all.
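The cost that the agent-only numbers miss is the fork/exec the ovs-ofctl driver pays for every external command invocation. A minimal, hypothetical micro-benchmark of that per-call overhead is sketched below; it uses "true" as a stand-in for ovs-ofctl so it runs on any Linux box without OVS installed, and the loop count is arbitrary:

```python
import subprocess
import time

# Hypothetical micro-benchmark: time N fork/exec round trips of a trivial
# external command.  "true" stands in for ovs-ofctl here, so this measures
# only process-spawn overhead, not any OVS work.
N = 50
start = time.perf_counter()
for _ in range(N):
    subprocess.check_call(["true"])
elapsed = time.perf_counter() - start
per_call_ms = elapsed / N * 1000
print("avg fork/exec overhead per call: %.3f ms" % per_call_ms)
```

Multiplying this per-call overhead by the number of ovs-ofctl invocations in a reconfiguration run gives a rough estimate of the cost that is invisible in the agent's own profile, which is why the oprofile system-wide view below is needed.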
With of_interface=ovs-ofctl, "opreport tgid:<pid of ovs agent>" shows:

  samples|      %|
------------------
    62771 100.000 python2.7
          CPU_CLK_UNHALT...|
            samples|      %|
          ------------------
            49418 78.7274 python2.7
             6483 10.3280 vmlinux-3.13.0-24-generic

and "opreport --merge tgid" shows the CPU consumption of ovs-ofctl:

    35774  3.5979 ovs-ofctl
          CPU_CLK_UNHALT...|
            samples|      %|
          ------------------
            28219 78.8813 vmlinux-3.13.0-24-generic
             3487  9.7473 ld-2.19.so
             2301  6.4320 ovs-ofctl

Comparing 87408 samples (native, python only) with 62771 + 35774 = 98545 samples (ovs-ofctl, python plus the ovs-ofctl command), the native of_interface uses about 0.4s less CPU time overall.

* Conclusion and future steps

The native of_interface uses slightly less CPU time but takes longer to complete a flow reconfiguration after an agent restart. As the OVS agent accounts for only about 1/10th of the total CPU usage during a flow reconfiguration (data not shown), there may be other areas for improvement. The cProfile Python module gives more fine-grained data, but no obvious performance bottleneck was found. The data do show more eventlet context switches with the native of_interface, which is due to how the native driver is written. I'm looking into ways to improve its CPU usage and latency.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev