Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-04-15 Thread Miguel Angel Ajo Pelayo
On Fri, Apr 15, 2016 at 7:32 AM, IWAMOTO Toshihiro
 wrote:
> At Mon, 11 Apr 2016 14:42:59 +0200,
> Miguel Angel Ajo Pelayo wrote:
>>
>> On Mon, Apr 11, 2016 at 11:40 AM, IWAMOTO Toshihiro
>>  wrote:
>> > At Fri, 8 Apr 2016 12:21:21 +0200,
>> > Miguel Angel Ajo Pelayo wrote:
>> >>
>> >> Hi, good that you're looking at this,
>> >>
>> >>
>> >> You could create a lot of ports with this method [1] and a bit of extra
>> >> bash, without the extra expense of instance RAM.
>> >>
>> >>
>> >> [1]
>> >> http://www.ajo.es/post/89207996034/creating-a-network-interface-to-tenant-network-in
>> >>
>> >>
>> >> This effort is going to be still more relevant in the context of
>> >> openvswitch firewall. We still need to make sure it's tested with the
>> >> native interface, and eventually we will need flow bundling (like in
>> >> ovs-ofctl --bundle add-flows) where the whole 
>> >> addition/removal/modification
>> >> is sent to be executed atomically by the switch.
>> >
>> > Bad news is that ovs-firewall isn't currently using the native
>> > of_interface much.  I can add install_xxx methods to
>> > OpenFlowSwitchMixin classes so that ovs-firewall can use the native
>> > interface.
>> > Do you have a plan for implementing flow bundling or using conjunction?
>> >
>>
>> Adding Jakub to the thread,
>>
>> IMO, if the native interface is going to provide us with greater speed
>> for rule manipulation, we should look into it.
>>
>> We don't use bundling or conjunctions yet, but it's part of the plan.
>> Bundling will allow atomicity of operations with rules (switching
>> firewall rules, etc, as we have with iptables-save /
>> iptables-restore), and conjunctions will reduce the number of entries.
>> (No expansion of IP addresses for remote groups, no expansion of
>> security group rules per port, when several ports are on the same
>> security group on the same compute host).
>>
>> Do we have any metric of bare rule manipulation time (ms/rule, for example)?
>
> No bare numbers but from a graph in the other mail I sent last week,
> bind_devices for 160 ports (iirc, that amounts to 800 flows) takes
> 4.5sec with of_interface=native and 8sec with of_interface=ovs-ofctl,
> which means a native add-flow is roughly 4ms faster than an ovs-ofctl
> one (3.5s spread over 800 flows).
>
> As the ovs firewall uses DeferredOVSBridge and has less exec
> overhead, I have no idea how much gain the native of_interface
> brings.
>
>> As a note, we're around 80 rules/port with IPv6 + IPv4 on the default
>> sec group plus a couple of rules.
>
> I booted 120 VMs on one network and the default security group
> generated 62k flows.  It seems using conjunction is the #1 item for
> performance.
>

Ouch, hello again, cartesian product!  Luckily we already know how to
optimize that; now we need to get our hands on it.

@iwamoto, thanks for trying it.
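
For anyone who hasn't played with conjunctive matches yet, a rough
ovs-ofctl sketch of the idea (the table number, addresses, ports and the
NORMAL action are placeholders, not what the ovs firewall driver will
actually install; conjunction needs OVS 2.4+ iirc):

# dimension 1/2: one flow per remote-group member IP
ovs-ofctl add-flow br-int "table=0,priority=10,ip,nw_src=10.0.0.5,actions=conjunction(10,1/2)"
ovs-ofctl add-flow br-int "table=0,priority=10,ip,nw_src=10.0.0.6,actions=conjunction(10,1/2)"
# dimension 2/2: one flow per allowed destination port
ovs-ofctl add-flow br-int "table=0,priority=10,tcp,tp_dst=22,actions=conjunction(10,2/2)"
ovs-ofctl add-flow br-int "table=0,priority=10,tcp,tp_dst=80,actions=conjunction(10,2/2)"
# hit only when one flow from each dimension matches: N+M flows instead of N*M
ovs-ofctl add-flow br-int "table=0,priority=10,conj_id=10,ip,actions=NORMAL"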



>
>
>>
>> >> On Thu, Apr 7, 2016 at 10:00 AM, IWAMOTO Toshihiro 
>> >> wrote:
>> >>
>> >> > At Thu, 07 Apr 2016 16:33:02 +0900,
>> >> > IWAMOTO Toshihiro wrote:
>> >> > >
>> >> > > At Mon, 18 Jan 2016 12:12:28 +0900,
>> >> > > IWAMOTO Toshihiro wrote:
>> >> > > >
>> >> > > > I'm sending out this mail to share the finding and discuss how to
>> >> > > > improve with those interested in neutron ovs performance.
>> >> > > >
>> >> > > > TL;DR: The native of_interface code, which has been merged recently
>> >> > > > and isn't default, seems to consume less CPU time but gives a mixed
>> >> > > > result.  I'm looking into this for improvement.
>> >> > >
>> >> > > I went on to look at implementation details of eventlet etc, but it
>> >> > > turned out to be fairly simple.  The OVS agent in the
>> >> > > of_interface=native mode waits for an openflow connection from
>> >> > > ovs-vswitchd, which can take up to 5 seconds.
>> >> > >
>> >> > > Please look at the attached graph.
>> >> > > The x-axis is time from agent restarts, the y-axis is numbers of ports
>> >> > > processed (in treat_devices and bind_devices).  Each port is counted
>> >> > > twice; the first slope is treat_devices and the second is
>> >> > > bind_devices.  The native of_interface needs some more time on
>> >> > > start-up, but bind_devices is about 2x faster.
>> >> > >
>> >> > > The data was collected with 160 VMs with the devstack default 
>> >> > > settings.
>> >> >
>> >> > And if you wonder how other services are doing meanwhile, here is a
>> >> > bonus chart.
>> >> >
>> >> > The ovs agent was restarted 3 times with of_interface=native, then 3
>> >> > times with of_interface=ovs-ofctl.
>> >> >
>> >> > As the test machine has 16 CPUs, 6.25% CPU usage can mean a single
>> >> > threaded process is CPU bound.
>> >> >
>> >> > Frankly, the OVS agent has less room for improvement than
>> >> > other services.  Also, it might be fun to draw similar charts for
>> >> > other types of workloads.
>> >> >
>> >> >
>> >> > __
>> >> > OpenStack Development Mailing 

Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-04-14 Thread IWAMOTO Toshihiro
At Mon, 11 Apr 2016 14:42:59 +0200,
Miguel Angel Ajo Pelayo wrote:
> 
> On Mon, Apr 11, 2016 at 11:40 AM, IWAMOTO Toshihiro
>  wrote:
> > At Fri, 8 Apr 2016 12:21:21 +0200,
> > Miguel Angel Ajo Pelayo wrote:
> >>
> >> Hi, good that you're looking at this,
> >>
> >>
> >> You could create a lot of ports with this method [1] and a bit of extra
> >> bash, without the extra expense of instance RAM.
> >>
> >>
> >> [1]
> >> http://www.ajo.es/post/89207996034/creating-a-network-interface-to-tenant-network-in
> >>
> >>
> >> This effort is going to be still more relevant in the context of
> >> openvswitch firewall. We still need to make sure it's tested with the
> >> native interface, and eventually we will need flow bundling (like in
> >> ovs-ofctl --bundle add-flows) where the whole addition/removal/modification
> >> is sent to be executed atomically by the switch.
> >
> > Bad news is that ovs-firewall isn't currently using the native
> > of_interface much.  I can add install_xxx methods to
> > OpenFlowSwitchMixin classes so that ovs-firewall can use the native
> > interface.
> > Do you have a plan for implementing flow bundling or using conjunction?
> >
> 
> Adding Jakub to the thread,
> 
> IMO, if the native interface is going to provide us with greater speed
> for rule manipulation, we should look into it.
> 
> We don't use bundling or conjunctions yet, but it's part of the plan.
> Bundling will allow atomicity of operations with rules (switching
> firewall rules, etc, as we have with iptables-save /
> iptables-restore), and conjunctions will reduce the number of entries.
> (No expansion of IP addresses for remote groups, no expansion of
> security group rules per port, when several ports are on the same
> security group on the same compute host).
> 
> Do we have any metric of bare rule manipulation time (ms/rule, for example)?

No bare numbers but from a graph in the other mail I sent last week,
bind_devices for 160 ports (iirc, that amounts to 800 flows) takes
4.5sec with of_interface=native and 8sec with of_interface=ovs-ofctl,
which means a native add-flow is roughly 4ms faster than an ovs-ofctl one
(3.5s spread over 800 flows).

As the ovs firewall uses DeferredOVSBridge and has less exec
overhead, I have no idea how much gain the native of_interface
brings.

> As a note, we're around 80 rules/port with IPv6 + IPv4 on the default
> sec group plus a couple of rules.

I booted 120 VMs on one network and the default security group
generated 62k flows.  It seems using conjunction is the #1 item for
performance.



> 
> >> On Thu, Apr 7, 2016 at 10:00 AM, IWAMOTO Toshihiro 
> >> wrote:
> >>
> >> > At Thu, 07 Apr 2016 16:33:02 +0900,
> >> > IWAMOTO Toshihiro wrote:
> >> > >
> >> > > At Mon, 18 Jan 2016 12:12:28 +0900,
> >> > > IWAMOTO Toshihiro wrote:
> >> > > >
> >> > > > I'm sending out this mail to share the finding and discuss how to
> >> > > > improve with those interested in neutron ovs performance.
> >> > > >
> >> > > > TL;DR: The native of_interface code, which has been merged recently
> >> > > > and isn't default, seems to consume less CPU time but gives a mixed
> >> > > > result.  I'm looking into this for improvement.
> >> > >
> >> > > I went on to look at implementation details of eventlet etc, but it
> >> > > turned out to be fairly simple.  The OVS agent in the
> >> > > of_interface=native mode waits for an openflow connection from
> >> > > ovs-vswitchd, which can take up to 5 seconds.
> >> > >
> >> > > Please look at the attached graph.
> >> > > The x-axis is time from agent restarts, the y-axis is numbers of ports
> >> > > processed (in treat_devices and bind_devices).  Each port is counted
> >> > > twice; the first slope is treat_devices and the second is
> >> > > bind_devices.  The native of_interface needs some more time on
> >> > > start-up, but bind_devices is about 2x faster.
> >> > >
> >> > > The data was collected with 160 VMs with the devstack default settings.
> >> >
> >> > And if you wonder how other services are doing meanwhile, here is a
> >> > bonus chart.
> >> >
> >> > The ovs agent was restarted 3 times with of_interface=native, then 3
> >> > times with of_interface=ovs-ofctl.
> >> >
> >> > As the test machine has 16 CPUs, 6.25% CPU usage can mean a single
> >> > threaded process is CPU bound.
> >> >
> >> > Frankly, the OVS agent has less room for improvement than
> >> > other services.  Also, it might be fun to draw similar charts for
> >> > other types of workloads.
> >> >
> >> >
> >> > __
> >> > OpenStack Development Mailing List (not for usage questions)
> >> > Unsubscribe: 
> >> > openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> >> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >> >
> >> >
> >
> > __
> > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: 

Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-04-11 Thread Miguel Angel Ajo Pelayo
On Mon, Apr 11, 2016 at 11:40 AM, IWAMOTO Toshihiro
 wrote:
> At Fri, 8 Apr 2016 12:21:21 +0200,
> Miguel Angel Ajo Pelayo wrote:
>>
>> Hi, good that you're looking at this,
>>
>>
>> You could create a lot of ports with this method [1] and a bit of extra
>> bash, without the extra expense of instance RAM.
>>
>>
>> [1]
>> http://www.ajo.es/post/89207996034/creating-a-network-interface-to-tenant-network-in
>>
>>
>> This effort is going to be still more relevant in the context of
>> openvswitch firewall. We still need to make sure it's tested with the
>> native interface, and eventually we will need flow bundling (like in
>> ovs-ofctl --bundle add-flows) where the whole addition/removal/modification
>> is sent to be executed atomically by the switch.
>
> Bad news is that ovs-firewall isn't currently using the native
> of_interface much.  I can add install_xxx methods to
> OpenFlowSwitchMixin classes so that ovs-firewall can use the native
> interface.
> Do you have a plan for implementing flow bundling or using conjunction?
>

Adding Jakub to the thread,

IMO, if the native interface is going to provide us with greater speed
for rule manipulation, we should look into it.

We don't use bundling or conjunctions yet, but it's part of the plan.
Bundling will allow atomicity of operations with rules (switching
firewall rules, etc, as we have with iptables-save /
iptables-restore), and conjunctions will reduce the number of entries.
(No expansion of IP addresses for remote groups, no expansion of
security group rules per port, when several ports are on the same
security group on the same compute host).

Do we have any metric of bare rule manipulation time (ms/rule, for example)?

As a note, we're around 80 rules/port with IPv6 + IPv4 on the default
sec group plus a couple of rules.






>> On Thu, Apr 7, 2016 at 10:00 AM, IWAMOTO Toshihiro 
>> wrote:
>>
>> > At Thu, 07 Apr 2016 16:33:02 +0900,
>> > IWAMOTO Toshihiro wrote:
>> > >
>> > > At Mon, 18 Jan 2016 12:12:28 +0900,
>> > > IWAMOTO Toshihiro wrote:
>> > > >
>> > > > I'm sending out this mail to share the finding and discuss how to
>> > > > improve with those interested in neutron ovs performance.
>> > > >
>> > > > TL;DR: The native of_interface code, which has been merged recently
>> > > > and isn't default, seems to consume less CPU time but gives a mixed
>> > > > result.  I'm looking into this for improvement.
>> > >
>> > > I went on to look at implementation details of eventlet etc, but it
>> > > turned out to be fairly simple.  The OVS agent in the
>> > > of_interface=native mode waits for an openflow connection from
>> > > ovs-vswitchd, which can take up to 5 seconds.
>> > >
>> > > Please look at the attached graph.
>> > > The x-axis is time from agent restarts, the y-axis is numbers of ports
>> > > processed (in treat_devices and bind_devices).  Each port is counted
>> > > twice; the first slope is treat_devices and the second is
>> > > bind_devices.  The native of_interface needs some more time on
>> > > start-up, but bind_devices is about 2x faster.
>> > >
>> > > The data was collected with 160 VMs with the devstack default settings.
>> >
>> > And if you wonder how other services are doing meanwhile, here is a
>> > bonus chart.
>> >
>> > The ovs agent was restarted 3 times with of_interface=native, then 3
>> > times with of_interface=ovs-ofctl.
>> >
>> > As the test machine has 16 CPUs, 6.25% CPU usage can mean a single
>> > threaded process is CPU bound.
>> >
>> > Frankly, the OVS agent has less room for improvement than
>> > other services.  Also, it might be fun to draw similar charts for
>> > other types of workloads.
>> >
>> >
>> > __
>> > OpenStack Development Mailing List (not for usage questions)
>> > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >
>> >
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-04-11 Thread IWAMOTO Toshihiro
At Fri, 8 Apr 2016 12:21:21 +0200,
Miguel Angel Ajo Pelayo wrote:
> 
> Hi, good that you're looking at this,
> 
> 
> You could create a lot of ports with this method [1] and a bit of extra
> bash, without the extra expense of instance RAM.
> 
> 
> [1]
> http://www.ajo.es/post/89207996034/creating-a-network-interface-to-tenant-network-in
> 
> 
> This effort is going to be still more relevant in the context of
> openvswitch firewall. We still need to make sure it's tested with the
> native interface, and eventually we will need flow bundling (like in
> ovs-ofctl --bundle add-flows) where the whole addition/removal/modification
> is sent to be executed atomically by the switch.

Bad news is that ovs-firewall isn't currently using the native
of_interface much.  I can add install_xxx methods to
OpenFlowSwitchMixin classes so that ovs-firewall can use the native
interface.
Do you have a plan for implementing flow bundling or using conjunction?

> On Thu, Apr 7, 2016 at 10:00 AM, IWAMOTO Toshihiro 
> wrote:
> 
> > At Thu, 07 Apr 2016 16:33:02 +0900,
> > IWAMOTO Toshihiro wrote:
> > >
> > > At Mon, 18 Jan 2016 12:12:28 +0900,
> > > IWAMOTO Toshihiro wrote:
> > > >
> > > > I'm sending out this mail to share the finding and discuss how to
> > > > improve with those interested in neutron ovs performance.
> > > >
> > > > TL;DR: The native of_interface code, which has been merged recently
> > > > and isn't default, seems to consume less CPU time but gives a mixed
> > > > result.  I'm looking into this for improvement.
> > >
> > > I went on to look at implementation details of eventlet etc, but it
> > > turned out to be fairly simple.  The OVS agent in the
> > > of_interface=native mode waits for an openflow connection from
> > > ovs-vswitchd, which can take up to 5 seconds.
> > >
> > > Please look at the attached graph.
> > > The x-axis is time from agent restarts, the y-axis is numbers of ports
> > > processed (in treat_devices and bind_devices).  Each port is counted
> > > twice; the first slope is treat_devices and the second is
> > > bind_devices.  The native of_interface needs some more time on
> > > start-up, but bind_devices is about 2x faster.
> > >
> > > The data was collected with 160 VMs with the devstack default settings.
> >
> > And if you wonder how other services are doing meanwhile, here is a
> > bonus chart.
> >
> > The ovs agent was restarted 3 times with of_interface=native, then 3
> > times with of_interface=ovs-ofctl.
> >
> > As the test machine has 16 CPUs, 6.25% CPU usage can mean a single
> > threaded process is CPU bound.
> >
> > Frankly, the OVS agent has less room for improvement than
> > other services.  Also, it might be fun to draw similar charts for
> > other types of workloads.
> >
> >
> > __
> > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> >

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-04-08 Thread Miguel Angel Ajo Pelayo
Hi, good that you're looking at this,


You could create a lot of ports with this method [1] and a bit of extra
bash, without the extra expense of instance RAM.


[1]
http://www.ajo.es/post/89207996034/creating-a-network-interface-to-tenant-network-in


This effort is going to be still more relevant in the context of
openvswitch firewall. We still need to make sure it's tested with the
native interface, and eventually we will need flow bundling (like in
ovs-ofctl --bundle add-flows) where the whole addition/removal/modification
is sent to be executed atomically by the switch.
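
To make the bundling point concrete, the usage would be roughly as below
(untested sketch: the flow specs are made up, and --bundle needs a recent
enough ovs-ofctl plus OpenFlow 1.4+ enabled on the bridge):

$ cat flows.txt
add table=0,priority=10,in_port=5,actions=resubmit(,60)
delete table=0,priority=5,in_port=5
$ ovs-ofctl --bundle add-flows br-int flows.txt

With --bundle the whole file is applied by the switch as a single
transaction, so we never expose a half-rewritten ruleset the way separate
add-flow/del-flows invocations can.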






On Thu, Apr 7, 2016 at 10:00 AM, IWAMOTO Toshihiro 
wrote:

> At Thu, 07 Apr 2016 16:33:02 +0900,
> IWAMOTO Toshihiro wrote:
> >
> > At Mon, 18 Jan 2016 12:12:28 +0900,
> > IWAMOTO Toshihiro wrote:
> > >
> > > I'm sending out this mail to share the finding and discuss how to
> > > improve with those interested in neutron ovs performance.
> > >
> > > TL;DR: The native of_interface code, which has been merged recently
> > > and isn't default, seems to consume less CPU time but gives a mixed
> > > result.  I'm looking into this for improvement.
> >
> > I went on to look at implementation details of eventlet etc, but it
> > turned out to be fairly simple.  The OVS agent in the
> > of_interface=native mode waits for an openflow connection from
> > ovs-vswitchd, which can take up to 5 seconds.
> >
> > Please look at the attached graph.
> > The x-axis is time from agent restarts, the y-axis is numbers of ports
> > processed (in treat_devices and bind_devices).  Each port is counted
> > twice; the first slope is treat_devices and the second is
> > bind_devices.  The native of_interface needs some more time on
> > start-up, but bind_devices is about 2x faster.
> >
> > The data was collected with 160 VMs with the devstack default settings.
>
> And if you wonder how other services are doing meanwhile, here is a
> bonus chart.
>
> The ovs agent was restarted 3 times with of_interface=native, then 3
> times with of_interface=ovs-ofctl.
>
> As the test machine has 16 CPUs, 6.25% CPU usage can mean a single
> threaded process is CPU bound.
>
> Frankly, the OVS agent has less room for improvement than
> other services.  Also, it might be fun to draw similar charts for
> other types of workloads.
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-04-07 Thread IWAMOTO Toshihiro
At Thu, 07 Apr 2016 16:33:02 +0900,
IWAMOTO Toshihiro wrote:
> 
> At Mon, 18 Jan 2016 12:12:28 +0900,
> IWAMOTO Toshihiro wrote:
> > 
> > I'm sending out this mail to share the finding and discuss how to
> > improve with those interested in neutron ovs performance.
> > 
> > TL;DR: The native of_interface code, which has been merged recently
> > and isn't default, seems to consume less CPU time but gives a mixed
> > result.  I'm looking into this for improvement.
> 
> I went on to look at implementation details of eventlet etc, but it
> turned out to be fairly simple.  The OVS agent in the
> of_interface=native mode waits for an openflow connection from
> ovs-vswitchd, which can take up to 5 seconds.
> 
> Please look at the attached graph.
> The x-axis is time from agent restarts, the y-axis is numbers of ports
> processed (in treat_devices and bind_devices).  Each port is counted
> twice; the first slope is treat_devices and the second is
> bind_devices.  The native of_interface needs some more time on
> start-up, but bind_devices is about 2x faster.
> 
> The data was collected with 160 VMs with the devstack default settings.

And if you wonder how other services are doing meanwhile, here is a
bonus chart.

The ovs agent was restarted 3 times with of_interface=native, then 3
times with of_interface=ovs-ofctl.

As the test machine has 16 CPUs, 6.25% CPU usage can mean a single
threaded process is CPU bound.

Frankly, the OVS agent has less room for improvement than
other services.  Also, it might be fun to draw similar charts for
other types of workloads.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-04-07 Thread IWAMOTO Toshihiro
At Mon, 18 Jan 2016 12:12:28 +0900,
IWAMOTO Toshihiro wrote:
> 
> I'm sending out this mail to share the finding and discuss how to
> improve with those interested in neutron ovs performance.
> 
> TL;DR: The native of_interface code, which has been merged recently
> and isn't default, seems to consume less CPU time but gives a mixed
> result.  I'm looking into this for improvement.

I went on to look at implementation details of eventlet etc, but it
turned out to be fairly simple.  The OVS agent in the
of_interface=native mode waits for an openflow connection from
ovs-vswitchd, which can take up to 5 seconds.
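
For anyone reproducing this: with of_interface=native the agent sets
itself as the bridge's OpenFlow controller and ryu listens locally, so
whether ovs-vswitchd has connected back can be checked with something
like the below (the address and port are the neutron defaults,
of_listen_address/of_listen_port; adjust to your config):

$ sudo ovs-vsctl get-controller br-int
tcp:127.0.0.1:6633
$ sudo ovs-vsctl show | grep -A1 Controller
        Controller "tcp:127.0.0.1:6633"
            is_connected: true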

Please look at the attached graph.
The x-axis is time from agent restarts, the y-axis is numbers of ports
processed (in treat_devices and bind_devices).  Each port is counted
twice; the first slope is treat_devices and the second is
bind_devices.  The native of_interface needs some more time on
start-up, but bind_devices is about 2x faster.

The data was collected with 160 VMs with the devstack default settings.

> * Introduction
> 
> With an ML2+ovs Neutron configuration, openflow rule modification
> happens often and is somewhat a heavy operation as it involves
> exec() of the ovs-ofctl command.
> 
> The native of_interface driver doesn't use the ovs-ofctl command and
> should have less performance impact on the system.  This document
> tries to confirm this hypothesis.
> 
> 
> * Method
> 
> In order to focus on openflow rule operation time and avoid noise from
> other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> restarted and the time it took to reconfigure the flows was measured.
> 
> 1. Use devstack to start a test environment.  As debug logs generate
>considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> 2. Apply https://review.openstack.org/#/c/267905/ to enable
>measurement of flow reconfiguration times.
> 3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
>flows.  If you have >16G RAM, more could be booted.
> 4. Stop neutron-openvswitch-agent and restart with --run-once arg.
>Use time, oprofile, and python's cProfile (use --profile arg) to
>collect data.
> 
> * Results
> 
> Execution time (averages of 3 runs):
> 
> native 28.3s user 2.9s sys 0.4s
> ovs-ofctl  25.7s user 2.2s sys 0.3s
> 
> ovs-ofctl runs faster and seems to use less CPU, but the above doesn't
> count in execution time of ovs-ofctl.

With 160 VMs and debug=false for the OVS agent and the neutron-server,

Execution time (averages and SDs of 10 runs):

native 56.4+-3.4s  user 8.7+-0.1s   sys 0.82+-0.04s
ovs-ofctl  55.9+-1.0s  user 6.9+-0.08s  sys 0.67+-0.05s

To exclude the openflow connection waits,
times between log outputs of "Loaded agent extensions" and
"Configuration for devices up completed" is also compared:

native 48.2+-0.49s
ovs-ofctl  53.2+-0.99s

The native of_interface is the clear winner.
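
(If you want to redo that comparison, the two timestamps can be pulled
straight out of the agent log; the log file name below is just an
assumption about a devstack setup, adjust to yours:)

$ grep -E "Loaded agent extensions|Configuration for devices up completed" \
      /opt/stack/logs/q-agt.log | awk '{print $1, $2}'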

> Oprofile data collected by running "operf -s -t" contain the
> information.
> 
> With of_interface=native config, "opreport tgid:" shows:
> 
>samples|  %|
> --
> 87408 100.000 python2.7
>   CPU_CLK_UNHALT...|
> samples|  %|
>   --
>   69160 79.1232 python2.7
>8416  9.6284 vmlinux-3.13.0-24-generic
> 
> and "opreport --merge tgid" doesn't show ovs-ofctl.
> 
> With of_interface=ovs-ofctl, "opreport tgid:" shows:
> 
>samples|  %|
> --
> 62771 100.000 python2.7
> CPU_CLK_UNHALT...|
>   samples|  %|
> --
> 49418 78.7274 python2.7
>  6483 10.3280 vmlinux-3.13.0-24-generic
> 
> and  "opreport --merge tgid" shows CPU consumption by ovs-ofctl 
> 
> 35774  3.5979 ovs-ofctl
> CPU_CLK_UNHALT...|
>   samples|  %|
> --
> 28219 78.8813 vmlinux-3.13.0-24-generic
>  3487  9.7473 ld-2.19.so
>  2301  6.4320 ovs-ofctl
> 
> Comparing 87408 (native python) with 62771+35774, the native
> of_interface uses 0.4s less CPU time overall.
> 
> * Conclusion and future steps
> 
> The native of_interface uses slightly less CPU time but takes longer
> time to complete a flow reconfiguration after an agent restart.
> 
> As an OVS agent accounts for only 1/10th of total CPU usage during a
> flow reconfiguration (data not shown), there may be other areas for
> improvement.
> 
> The cProfile Python module gives more fine grained data, but no
> apparent performance bottleneck was found.  The data show more
> eventlet context switches with the native of_interface, which is due
> to how the native of_interface is written.  I'm looking into
> improving CPU usage and latency.



of_int-comparison.pdf
Description: Adobe PDF document
__
OpenStack Development Mailing List (not for 

Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-02-03 Thread IWAMOTO Toshihiro
At Sat, 30 Jan 2016 02:08:55 +,
Wuhongning wrote:
> 
> In our testing, ryu openflow greatly improved the performance: with a
> 500-port vxlan flow table, it went from 15s to 2.5s, 6 times better.

That's quite an impressive number.
What tests did you do?  Could you share some details?

Also, although unlikely, please make sure your measurements aren't
affected by https://bugs.launchpad.net/neutron/+bug/1538368 .


> 
> From: IWAMOTO Toshihiro [iwam...@valinux.co.jp]
> Sent: Monday, January 25, 2016 5:08 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Neutron] OVS flow modification performance
> 
> At Thu, 21 Jan 2016 02:59:16 +,
> Wuhongning wrote:
> >
> > I don't think 400 flows can show the difference.  Have you set up any
> > tunnel peer?
> >
> > In fact we may set the network type as "vxlan", then make a fake MD 
> > simulate sending l2pop fdb add messages, to push tens of thousands of flows
> > into the testing ovs agent.
> 
> I chose this method because I didn't want to write such extra code for
> measurements. ;)
> Of course, I'd love to see data from other test environments and other
> workload than agent restarts.
> 
> Also, we now have https://review.openstack.org/#/c/271939/ and can
> profile neutron-server (and probably others, too).
> I haven't found anything non-trivial so far, though.
> 
> > 
> > From: IWAMOTO Toshihiro [iwam...@valinux.co.jp]
> > Sent: Monday, January 18, 2016 4:37 PM
> > To: OpenStack Development Mailing List (not for usage questions)
> > Subject: Re: [openstack-dev] [Neutron] OVS flow modification performance
> >
> > At Mon, 18 Jan 2016 00:42:32 -0500,
> > Kevin Benton wrote:
> > >
> > > Thanks for doing this. A couple of questions:
> > >
> > > What were your rootwrap settings when running these tests? Did you just
> > > have it calling sudo directly?
> >
> > I used devstack's default, which runs root_helper_daemon.
> >
> > > Also, you mention that this is only ~10% of the time spent during flow
> > > reconfiguration. What other areas are eating up so much time?
> >
> >
> > In another run,
> >
> > $ for f in `cat tgidlist.n2`; do echo -n $f; opreport -n tgid:$f --merge 
> > tid|head -1|tr -d '\n'; (cd bg; opreport -n tgid:$f --merge tid|head 
> > -1);echo; done|sort -nr -k +2
> > 10071   239058 100.000 python2.7    14922 100.000 python2.7
> > 9995     92328 100.000 python2.7    11450 100.000 python2.7
> > 7579     88202 100.000 python2.7    (18596)
> > 11094    51560 100.000 python2.7    47964 100.000 python2.7
> > 7035     49687 100.000 python2.7    40678 100.000 python2.7
> > 11093    49380 100.000 python2.7    36004 100.000 python2.7
> > (legend:
> >  )
> >
> > These processes are neutron-server, nova-api,
> > neutron-openvswitch-agent, nova-conductor, dstat and nova-conductor in
> > descending order.
> >
> > So neutron-server uses about 3x the CPU time of the ovs agent,
> > nova-api's CPU usage is similar to the ovs agent's, and the others
> > probably aren't significant.
> >
> > > Cheers,
> > > Kevin Benton
> > >
> > > On Sun, Jan 17, 2016 at 10:12 PM, IWAMOTO Toshihiro 
> > > <iwam...@valinux.co.jp>
> > > wrote:
> > >
> > > > I'm sending out this mail to share the finding and discuss how to
> > > > improve with those interested in neutron ovs performance.
> > > >
> > > > TL;DR: The native of_interface code, which has been merged recently
> > > > and isn't default, seems to consume less CPU time but gives a mixed
> > > > result.  I'm looking into this for improvement.
> > > >
> > > > * Introduction
> > > >
> > > > With an ML2+ovs Neutron configuration, openflow rule modification
> > > > happens often and is somewhat a heavy operation as it involves
> > > > exec() of the ovs-ofctl command.
> > > >
> > > > The native of_interface driver doesn't use the ovs-ofctl command and
> > > > should have less performance impact on the system.  This document
> > > > tries to confirm this hypothesis.
> > > >
> > > >
> > > > * Method
> > > >
> > > > In order to focus on openflow rule operation time and avoid noise from
> > > > other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> > > > restarted a

Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-01-29 Thread Wuhongning
In our testing, ryu openflow greatly improved the performance: with a
500-port vxlan flow table, it went from 15s to 2.5s, 6 times better.

From: IWAMOTO Toshihiro [iwam...@valinux.co.jp]
Sent: Monday, January 25, 2016 5:08 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron] OVS flow modification performance

At Thu, 21 Jan 2016 02:59:16 +,
Wuhongning wrote:
>
> I don't think 400 flows can show the difference.  Have you set up any
> tunnel peer?
>
> In fact we may set the network type as "vxlan", then make a fake MD simulate 
> sending l2pop fdb add messages, to push tens of thousands of flows into the
> testing ovs agent.

I chose this method because I didn't want to write such extra code for
measurements. ;)
Of course, I'd love to see data from other test environments and other
workload than agent restarts.

Also, we now have https://review.openstack.org/#/c/271939/ and can
profile neutron-server (and probably others, too).
I haven't found anything non-trivial so far, though.

> 
> From: IWAMOTO Toshihiro [iwam...@valinux.co.jp]
> Sent: Monday, January 18, 2016 4:37 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Neutron] OVS flow modification performance
>
> At Mon, 18 Jan 2016 00:42:32 -0500,
> Kevin Benton wrote:
> >
> > Thanks for doing this. A couple of questions:
> >
> > What were your rootwrap settings when running these tests? Did you just
> > have it calling sudo directly?
>
> I used devstack's default, which runs root_helper_daemon.
>
> > Also, you mention that this is only ~10% of the time spent during flow
> > reconfiguration. What other areas are eating up so much time?
>
>
> In another run,
>
> $ for f in `cat tgidlist.n2`; do echo -n $f; opreport -n tgid:$f --merge 
> tid|head -1|tr -d '\n'; (cd bg; opreport -n tgid:$f --merge tid|head 
> -1);echo; done|sort -nr -k +2
> 10071   239058 100.000 python2.7    14922 100.000 python2.7
> 9995     92328 100.000 python2.7    11450 100.000 python2.7
> 7579     88202 100.000 python2.7    (18596)
> 11094    51560 100.000 python2.7    47964 100.000 python2.7
> 7035     49687 100.000 python2.7    40678 100.000 python2.7
> 11093    49380 100.000 python2.7    36004 100.000 python2.7
> (legend:
>  )
>
> These processes are neutron-server, nova-api,
> neutron-openvswitch-agent, nova-conductor, dstat and nova-conductor in
> descending order.
>
> So neutron-server uses about 3x the CPU time of the ovs agent,
> nova-api's CPU usage is similar to the ovs agent's, and the others
> probably aren't significant.
>
> > Cheers,
> > Kevin Benton
> >
> > On Sun, Jan 17, 2016 at 10:12 PM, IWAMOTO Toshihiro <iwam...@valinux.co.jp>
> > wrote:
> >
> > > I'm sending out this mail to share the finding and discuss how to
> > > improve with those interested in neutron ovs performance.
> > >
> > > TL;DR: The native of_interface code, which has been merged recently
> > > and isn't default, seems to consume less CPU time but gives a mixed
> > > result.  I'm looking into this for improvement.
> > >
> > > * Introduction
> > >
> > > With an ML2+ovs Neutron configuration, openflow rule modification
> > > happens often and is somewhat a heavy operation as it involves
> > > exec() of the ovs-ofctl command.
> > >
> > > The native of_interface driver doesn't use the ovs-ofctl command and
> > > should have less performance impact on the system.  This document
> > > tries to confirm this hypothesis.
> > >
> > >
> > > * Method
> > >
> > > In order to focus on openflow rule operation time and avoid noise from
> > > other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> > > restarted and the time it took to reconfigure the flows was measured.
> > >
> > > 1. Use devstack to start a test environment.  As debug logs generate
> > >considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> > > 2. Apply https://review.openstack.org/#/c/267905/ to enable
> > >measurement of flow reconfiguration times.
> > > 3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
> > >flows.  If you have >16G RAM, more could be booted.
> > > 4. Stop neutron-openvswitch-agent and restart with --run-once arg.
> > >Use time, oprofile, and python's cProfile (use --profile arg) to
> > >collect data.
> > >
> > > * Results
>

Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-01-25 Thread IWAMOTO Toshihiro
At Thu, 21 Jan 2016 02:59:16 +,
Wuhongning wrote:
> 
> I don't think 400 flows can show the difference.  Have you set up any
> tunnel peer?
> 
> In fact we may set the network type as "vxlan", then make a fake MD simulate 
> sending l2pop fdb add messages, to push tens of thousands of flows into the
> testing ovs agent.

I chose this method because I didn't want to write such extra code for
measurements. ;)
Of course, I'd love to see data from other test environments and other
workload than agent restarts.

Also, we now have https://review.openstack.org/#/c/271939/ and can
profile neutron-server (and probably others, too).
I haven't found anything non-trivial so far, though.

> 
> From: IWAMOTO Toshihiro [iwam...@valinux.co.jp]
> Sent: Monday, January 18, 2016 4:37 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Neutron] OVS flow modification performance
> 
> At Mon, 18 Jan 2016 00:42:32 -0500,
> Kevin Benton wrote:
> >
> > Thanks for doing this. A couple of questions:
> >
> > What were your rootwrap settings when running these tests? Did you just
> > have it calling sudo directly?
> 
> I used devstack's default, which runs root_helper_daemon.
> 
> > Also, you mention that this is only ~10% of the time spent during flow
> > reconfiguration. What other areas are eating up so much time?
> 
> 
> In another run,
> 
> $ for f in `cat tgidlist.n2`; do echo -n $f; opreport -n tgid:$f --merge 
> tid|head -1|tr -d '\n'; (cd bg; opreport -n tgid:$f --merge tid|head 
> -1);echo; done|sort -nr -k +2
> 10071   239058 100.000 python2.7    14922 100.000 python2.7
> 9995     92328 100.000 python2.7    11450 100.000 python2.7
> 7579     88202 100.000 python2.7    (18596)
> 11094    51560 100.000 python2.7    47964 100.000 python2.7
> 7035     49687 100.000 python2.7    40678 100.000 python2.7
> 11093    49380 100.000 python2.7    36004 100.000 python2.7
> (legend:
>  )
> 
> These processes are neutron-server, nova-api,
> neutron-openvswitch-agent, nova-conductor, dstat and nova-conductor in
> descending order.
> 
> So neutron-server uses about 3x the CPU time of the ovs agent,
> nova-api's CPU usage is similar to the ovs agent's, and the others
> probably aren't significant.
> 
> > Cheers,
> > Kevin Benton
> >
> > On Sun, Jan 17, 2016 at 10:12 PM, IWAMOTO Toshihiro <iwam...@valinux.co.jp>
> > wrote:
> >
> > > I'm sending out this mail to share the finding and discuss how to
> > > improve with those interested in neutron ovs performance.
> > >
> > > TL;DR: The native of_interface code, which has been merged recently
> > > and isn't default, seems to consume less CPU time but gives a mixed
> > > result.  I'm looking into this for improvement.
> > >
> > > * Introduction
> > >
> > > With an ML2+ovs Neutron configuration, openflow rule modification
> > > happens often and is somewhat a heavy operation as it involves
> > > exec() of the ovs-ofctl command.
> > >
> > > The native of_interface driver doesn't use the ovs-ofctl command and
> > > should have less performance impact on the system.  This document
> > > tries to confirm this hypothesis.
> > >
> > >
> > > * Method
> > >
> > > In order to focus on openflow rule operation time and avoid noise from
> > > other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> > > restarted and the time it took to reconfigure the flows was measured.
> > >
> > > 1. Use devstack to start a test environment.  As debug logs generate
> > >considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> > > 2. Apply https://review.openstack.org/#/c/267905/ to enable
> > >measurement of flow reconfiguration times.
> > > 3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
> > >flows.  If you have >16G RAM, more could be booted.
> > > 4. Stop neutron-openvswitch-agent and restart with --run-once arg.
> > >Use time, oprofile, and python's cProfile (use --profile arg) to
> > >collect data.
> > >
> > > * Results
> > >
> > > Execution time (averages of 3 runs):
> > >
> > > native 28.3s user 2.9s sys 0.4s
> > > ovs-ofctl  25.7s user 2.2s sys 0.3s
> > >
> > > ovs-ofctl runs faster and seems to use less CPU, but the above doesn't
> > > count in execution time of ovs-ofctl.
> > >
> > > Oprofile dat

Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-01-20 Thread Wuhongning
I don't think 400 flows can show the difference.  Have you set up any tunnel
peer?

In fact we may set the network type as "vxlan", then make a fake MD simulate 
sending l2pop fdb add messages, to push tens of thousands of flows into the
testing ovs agent.


From: IWAMOTO Toshihiro [iwam...@valinux.co.jp]
Sent: Monday, January 18, 2016 4:37 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron] OVS flow modification performance

At Mon, 18 Jan 2016 00:42:32 -0500,
Kevin Benton wrote:
>
> Thanks for doing this. A couple of questions:
>
> What were your rootwrap settings when running these tests? Did you just
> have it calling sudo directly?

I used devstack's default, which runs root_helper_daemon.

> Also, you mention that this is only ~10% of the time spent during flow
> reconfiguration. What other areas are eating up so much time?


In another run,

$ for f in `cat tgidlist.n2`; do echo -n $f; opreport -n tgid:$f --merge 
tid|head -1|tr -d '\n'; (cd bg; opreport -n tgid:$f --merge tid|head -1);echo; 
done|sort -nr -k +2
10071   239058 100.000 python2.7    14922 100.000 python2.7
9995     92328 100.000 python2.7    11450 100.000 python2.7
7579     88202 100.000 python2.7    (18596)
11094    51560 100.000 python2.7    47964 100.000 python2.7
7035     49687 100.000 python2.7    40678 100.000 python2.7
11093    49380 100.000 python2.7    36004 100.000 python2.7
(legend:
 )

These processes are neutron-server, nova-api,
neutron-openvswitch-agent, nova-conductor, dstat and nova-conductor in
descending order.

So neutron-server uses about 3x the CPU time of the ovs agent,
nova-api's CPU usage is similar to the ovs agent's, and the others
probably aren't significant.

> Cheers,
> Kevin Benton
>
> On Sun, Jan 17, 2016 at 10:12 PM, IWAMOTO Toshihiro <iwam...@valinux.co.jp>
> wrote:
>
> > I'm sending out this mail to share the finding and discuss how to
> > improve with those interested in neutron ovs performance.
> >
> > TL;DR: The native of_interface code, which has been merged recently
> > and isn't default, seems to consume less CPU time but gives a mixed
> > result.  I'm looking into this for improvement.
> >
> > * Introduction
> >
> > With an ML2+ovs Neutron configuration, openflow rule modification
> > happens often and is somewhat a heavy operation as it involves
> > exec() of the ovs-ofctl command.
> >
> > The native of_interface driver doesn't use the ovs-ofctl command and
> > should have less performance impact on the system.  This document
> > tries to confirm this hypothesis.
> >
> >
> > * Method
> >
> > In order to focus on openflow rule operation time and avoid noise from
> > other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> > restarted and the time it took to reconfigure the flows was measured.
> >
> > 1. Use devstack to start a test environment.  As debug logs generate
> >considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> > 2. Apply https://review.openstack.org/#/c/267905/ to enable
> >measurement of flow reconfiguration times.
> > 3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
> >flows.  If you have >16G RAM, more could be booted.
> > 4. Stop neutron-openvswitch-agent and restart with --run-once arg.
> >Use time, oprofile, and python's cProfile (use --profile arg) to
> >collect data.
> >
> > * Results
> >
> > Execution time (averages of 3 runs):
> >
> > native 28.3s user 2.9s sys 0.4s
> > ovs-ofctl  25.7s user 2.2s sys 0.3s
> >
> > ovs-ofctl runs faster and seems to use less CPU, but the above doesn't
> > count in execution time of ovs-ofctl.
> >
> > Oprofile data collected by running "operf -s -t" contain the
> > information.
> >
> > With of_interface=native config, "opreport tgid:" shows:
> >
> >samples|  %|
> > --
> > 87408 100.000 python2.7
> > CPU_CLK_UNHALT...|
> >   samples|  %|
> > --
> > 69160 79.1232 python2.7
> >  8416  9.6284 vmlinux-3.13.0-24-generic
> >
> > and "opreport --merge tgid" doesn't show ovs-ofctl.
> >
> > With of_interface=ovs-ofctl, "opreport tgid:" shows:
> >
> >samples|  %|
> > --
> > 62771 100.000 python2.7
> > CPU_CLK_UNHALT...|
> >   samples|  %|
> > --
> > 49418 78.7274 python2.7

Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-01-18 Thread IWAMOTO Toshihiro
At Mon, 18 Jan 2016 00:42:32 -0500,
Kevin Benton wrote:
> 
> Thanks for doing this. A couple of questions:
> 
> What were your rootwrap settings when running these tests? Did you just
> have it calling sudo directly?

I used devstack's default, which runs root_helper_daemon.

> Also, you mention that this is only ~10% of the time spent during flow
> reconfiguration. What other areas are eating up so much time?


In another run,

$ for f in `cat tgidlist.n2`; do echo -n $f; opreport -n tgid:$f --merge 
tid|head -1|tr -d '\n'; (cd bg; opreport -n tgid:$f --merge tid|head -1);echo; 
done|sort -nr -k +2
10071   239058 100.000 python2.7    14922 100.000 python2.7
9995     92328 100.000 python2.7    11450 100.000 python2.7
7579     88202 100.000 python2.7    (18596)
11094    51560 100.000 python2.7    47964 100.000 python2.7
7035     49687 100.000 python2.7    40678 100.000 python2.7
11093    49380 100.000 python2.7    36004 100.000 python2.7
(legend:
 )

These processes are neutron-server, nova-api,
neutron-openvswitch-agent, nova-conductor, dstat and nova-conductor in
descending order.

So neutron-server uses about 3x the CPU time of the ovs agent,
nova-api's CPU usage is similar to the ovs agent's, and the others
probably aren't significant.

> Cheers,
> Kevin Benton
> 
> On Sun, Jan 17, 2016 at 10:12 PM, IWAMOTO Toshihiro 
> wrote:
> 
> > I'm sending out this mail to share the finding and discuss how to
> > improve with those interested in neutron ovs performance.
> >
> > TL;DR: The native of_interface code, which has been merged recently
> > and isn't default, seems to consume less CPU time but gives a mixed
> > result.  I'm looking into this for improvement.
> >
> > * Introduction
> >
> > With an ML2+ovs Neutron configuration, openflow rule modification
> > happens often and is somewhat a heavy operation as it involves
> > exec() of the ovs-ofctl command.
> >
> > The native of_interface driver doesn't use the ovs-ofctl command and
> > should have less performance impact on the system.  This document
> > tries to confirm this hypothesis.
> >
> >
> > * Method
> >
> > In order to focus on openflow rule operation time and avoid noise from
> > other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> > restarted and the time it took to reconfigure the flows was measured.
> >
> > 1. Use devstack to start a test environment.  As debug logs generate
> >considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> > 2. Apply https://review.openstack.org/#/c/267905/ to enable
> >measurement of flow reconfiguration times.
> > 3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
> >flows.  If you have >16G RAM, more could be booted.
> > 4. Stop neutron-openvswitch-agent and restart with --run-once arg.
> >Use time, oprofile, and python's cProfile (use --profile arg) to
> >collect data.
> >
> > * Results
> >
> > Execution time (averages of 3 runs):
> >
> > native 28.3s user 2.9s sys 0.4s
> > ovs-ofctl  25.7s user 2.2s sys 0.3s
> >
> > ovs-ofctl runs faster and seems to use less CPU, but the above doesn't
> > count in execution time of ovs-ofctl.
> >
> > Oprofile data collected by running "operf -s -t" contain the
> > information.
> >
> > With of_interface=native config, "opreport tgid:" shows:
> >
> >samples|  %|
> > --
> > 87408 100.000 python2.7
> > CPU_CLK_UNHALT...|
> >   samples|  %|
> > --
> > 69160 79.1232 python2.7
> >  8416  9.6284 vmlinux-3.13.0-24-generic
> >
> > and "opreport --merge tgid" doesn't show ovs-ofctl.
> >
> > With of_interface=ovs-ofctl, "opreport tgid:" shows:
> >
> >samples|  %|
> > --
> > 62771 100.000 python2.7
> > CPU_CLK_UNHALT...|
> >   samples|  %|
> > --
> > 49418 78.7274 python2.7
> >  6483 10.3280 vmlinux-3.13.0-24-generic
> >
> > and  "opreport --merge tgid" shows CPU consumption by ovs-ofctl
> >
> > 35774  3.5979 ovs-ofctl
> > CPU_CLK_UNHALT...|
> >   samples|  %|
> > --
> > 28219 78.8813 vmlinux-3.13.0-24-generic
> >  3487  9.7473 ld-2.19.so
> >  2301  6.4320 ovs-ofctl
> >
> > Comparing 87408 (native python) with 62771+35774, the native
> > of_interface uses 0.4s less CPU time overall.
> >
> > * Conclusion and future steps
> >
> > The native of_interface uses slightly less CPU time but takes longer
> > time to complete a flow reconfiguration after an agent restart.
> >
> > As an OVS agent accounts for only 1/10th of total CPU usage during a
> > flow reconfiguration (data not shown), there may be other areas for
> > improvement.
> >
> > The cProfile Python module gives more fine grained data, but no
> > apparent performance bottleneck was found.  The data show more
> > eventlet context 

Re: [openstack-dev] [Neutron] OVS flow modification performance

2016-01-17 Thread Kevin Benton
Thanks for doing this. A couple of questions:

What were your rootwrap settings when running these tests? Did you just
have it calling sudo directly?

Also, you mention that this is only ~10% of the time spent during flow
reconfiguration. What other areas are eating up so much time?

Cheers,
Kevin Benton

On Sun, Jan 17, 2016 at 10:12 PM, IWAMOTO Toshihiro 
wrote:

> I'm sending out this mail to share the finding and discuss how to
> improve with those interested in neutron ovs performance.
>
> TL;DR: The native of_interface code, which has been merged recently
> and isn't default, seems to consume less CPU time but gives a mixed
> result.  I'm looking into this for improvement.
>
> * Introduction
>
> With an ML2+ovs Neutron configuration, openflow rule modification
> happens often and is somewhat a heavy operation as it involves
> exec() of the ovs-ofctl command.
>
> The native of_interface driver doesn't use the ovs-ofctl command and
> should have less performance impact on the system.  This document
> tries to confirm this hypothesis.
>
>
> * Method
>
> In order to focus on openflow rule operation time and avoid noise from
> other operations (VM boot-up, etc.), neutron-openvswitch-agent was
> restarted and the time it took to reconfigure the flows was measured.
>
> 1. Use devstack to start a test environment.  As debug logs generate
>considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
> 2. Apply https://review.openstack.org/#/c/267905/ to enable
>measurement of flow reconfiguration times.
> 3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
>flows.  If you have >16G RAM, more could be booted.
> 4. Stop neutron-openvswitch-agent and restart with --run-once arg.
>Use time, oprofile, and python's cProfile (use --profile arg) to
>collect data.
>
> * Results
>
> Execution time (averages of 3 runs):
>
> native 28.3s user 2.9s sys 0.4s
> ovs-ofctl  25.7s user 2.2s sys 0.3s
>
> ovs-ofctl runs faster and seems to use less CPU, but the above doesn't
> count in execution time of ovs-ofctl.
>
> Oprofile data collected by running "operf -s -t" contain the
> information.
>
> With of_interface=native config, "opreport tgid:" shows:
>
>samples|  %|
> --
> 87408 100.000 python2.7
> CPU_CLK_UNHALT...|
>   samples|  %|
> --
> 69160 79.1232 python2.7
>  8416  9.6284 vmlinux-3.13.0-24-generic
>
> and "opreport --merge tgid" doesn't show ovs-ofctl.
>
> With of_interface=ovs-ofctl, "opreport tgid:" shows:
>
>samples|  %|
> --
> 62771 100.000 python2.7
> CPU_CLK_UNHALT...|
>   samples|  %|
> --
> 49418 78.7274 python2.7
>  6483 10.3280 vmlinux-3.13.0-24-generic
>
> and  "opreport --merge tgid" shows CPU consumption by ovs-ofctl
>
> 35774  3.5979 ovs-ofctl
> CPU_CLK_UNHALT...|
>   samples|  %|
> --
> 28219 78.8813 vmlinux-3.13.0-24-generic
>  3487  9.7473 ld-2.19.so
>  2301  6.4320 ovs-ofctl
>
> Comparing 87408 (native python) with 62771+35774, the native
> of_interface uses 0.4s less CPU time overall.
>
> * Conclusion and future steps
>
> The native of_interface uses slightly less CPU time but takes longer
> time to complete a flow reconfiguration after an agent restart.
>
> As an OVS agent accounts for only 1/10th of total CPU usage during a
> flow reconfiguration (data not shown), there may be other areas for
> improvement.
>
> The cProfile Python module gives more fine grained data, but no
> apparent performance bottleneck was found.  The data show more
> eventlet context switches with the native of_interface, which is due
> to how the native of_interface is written.  I'm looking into
> improving CPU usage and latency.
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Kevin Benton
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Neutron] OVS flow modification performance

2016-01-17 Thread IWAMOTO Toshihiro
I'm sending out this mail to share the finding and discuss how to
improve with those interested in neutron ovs performance.

TL;DR: The native of_interface code, which has been merged recently
and isn't default, seems to consume less CPU time but gives a mixed
result.  I'm looking into this for improvement.

* Introduction

With an ML2+ovs Neutron configuration, openflow rule modification
happens often and is somewhat a heavy operation as it involves
exec() of the ovs-ofctl command.

The native of_interface driver doesn't use the ovs-ofctl command and
should have less performance impact on the system.  This document
tries to confirm this hypothesis.


* Method

In order to focus on openflow rule operation time and avoid noise from
other operations (VM boot-up, etc.), neutron-openvswitch-agent was
restarted and the time it took to reconfigure the flows was measured.

1. Use devstack to start a test environment.  As debug logs generate
   considerable amount of load, ENABLE_DEBUG_LOG_LEVEL was set to false.
2. Apply https://review.openstack.org/#/c/267905/ to enable
   measurement of flow reconfiguration times.
3. Boot 80 m1.nano instances.  In my setup, this generates 404 br-int
   flows.  If you have >16G RAM, more could be booted.
4. Stop neutron-openvswitch-agent and restart with --run-once arg.
   Use time, oprofile, and python's cProfile (use --profile arg) to
   collect data (a rough command sketch follows below).
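
A rough command sketch for step 4 (paths, config file locations and how
you stop the agent are devstack-specific and only illustrative here, not
an exact recipe):

# terminal 1: system-wide oprofile with per-thread samples
$ sudo operf -s -t
# terminal 2: stop the running agent, then re-run it once under time
$ time neutron-openvswitch-agent --config-file /etc/neutron/neutron.conf \
      --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --run-once
# Ctrl-C operf when the run finishes, then:
$ opreport tgid:<agent pid>     # per-process view, as quoted below
$ opreport --merge tgid         # overall view, including ovs-ofctl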

* Results

Execution time (averages of 3 runs):

native 28.3s user 2.9s sys 0.4s
ovs-ofctl  25.7s user 2.2s sys 0.3s

ovs-ofctl runs faster and seems to use less CPU, but the above doesn't
count in execution time of ovs-ofctl.

Oprofile data collected by running "operf -s -t" contain the
information.

With of_interface=native config, "opreport tgid:" shows:

   samples|  %|
--
87408 100.000 python2.7
CPU_CLK_UNHALT...|
  samples|  %|
--
69160 79.1232 python2.7
 8416  9.6284 vmlinux-3.13.0-24-generic

and "opreport --merge tgid" doesn't show ovs-ofctl.

With of_interface=ovs-ofctl, "opreport tgid:" shows:

   samples|  %|
--
62771 100.000 python2.7
CPU_CLK_UNHALT...|
  samples|  %|
--
49418 78.7274 python2.7
 6483 10.3280 vmlinux-3.13.0-24-generic

and  "opreport --merge tgid" shows CPU consumption by ovs-ofctl 

35774  3.5979 ovs-ofctl
CPU_CLK_UNHALT...|
  samples|  %|
--
28219 78.8813 vmlinux-3.13.0-24-generic
 3487  9.7473 ld-2.19.so
 2301  6.4320 ovs-ofctl

Comparing 87408 (native python) with 62771+35774, the native
of_interface uses 0.4s less CPU time overall.

* Conclusion and future steps

The native of_interface uses slightly less CPU time but takes longer
time to complete a flow reconfiguration after an agent restart.

As an OVS agent accounts for only 1/10th of total CPU usage during a
flow reconfiguration (data not shown), there may be other areas for
improvement.

The cProfile Python module gives more fine grained data, but no
apparent performance bottleneck was found.  The data show more
eventlet context switches with the native of_interface, which is due
to how the native of_interface is written.  I'm looking into
improving CPU usage and latency.



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev