Re: [ovs-discuss] Inquiry for DDlog status for ovn-northd

2020-08-26 Thread Dumitru Ceara
On 8/26/20 5:11 PM, Dumitru Ceara wrote:
> On 8/25/20 7:46 PM, Ben Pfaff wrote:
>> On Tue, Aug 25, 2020 at 06:43:51PM +0200, Dumitru Ceara wrote:
>>> On 8/25/20 6:01 PM, Ben Pfaff wrote:
>>>> On Mon, Aug 24, 2020 at 04:28:22PM -0700, Han Zhou wrote:
>>>>> As I remember you were working on the new ovn-northd that utilizes DDlog
>>>>> for incremental processing. Could you share the current status?
>>>>>
>>>>> Now that some more improvements have been made in ovn-controller and OVSDB,
>>>>> the ovn-northd becomes the more obvious bottleneck for OVN use in large
>>>>> scale environments. Since you were not in the OVN meetings for the last
>>>>> couple of weeks, could you share here the status and plan moving forward?
>>>>
>>>> The status is basically that I haven't yet succeeded at getting Red
>>>> Hat's recommended benchmarks running.  I'm told that is important before
>>>> we merge it.  I find them super difficult to set up.  I tried a few
>>>> weeks ago and basically gave up.  Piles and piles of repos all linked
>>>> together in tricky ways, making it really difficult to substitute my own
>>>> branches.  I intend to try again soon, though.  I have a new computer
>>>> that should be arriving soon, which should also allow it to proceed more
>>>> quickly.
>>>
>>> Hi Ben,
>>>
>>> I can try to help with setting up ovn-heater, in theory it should be
>>> enough to export OVS_REPO, OVS_BRANCH, OVN_REPO, OVN_BRANCH, make them
>>> point to your repos and branches and then run "do.sh install" and it
>>> should take care of installing all the dependencies and repos.
>>>
>>> I can also try to run the scale tests on our downstream if that helps.
>>
>> It's probably better if I come up with something locally, because I
>> expect to have to run it multiple times, maybe many times, since I will
>> presumably discover bottlenecks.
>>
>> This time around, I'll speak up when I run into problems.
>>
> 
> Sorry in advance for the long email.
> 
> I went ahead and added a new test scenario to ovn-heater that I think
> might be relevant in the context of ovn-northd incremental processing:
> 
> https://github.com/dceara/ovn-heater#example-run-scenario-3---scale-up-number-of-pods---stress-ovn-northd
> 
> On my test machine:
> Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
> 2 NUMA nodes - 28 cores each.
> 
> I did:
> 
> $ cd
> $ git clone https://github.com/dceara/ovn-heater
> $ cd ovn-heater
> $ cat > physical-deployments/physical-deployment.yml << EOF
> registry-node: localhost
> internal-iface: none
> 
> central-node:
>   name: localhost
> 
> worker-nodes:
>   - localhost
> EOF
> 
> # Install all the required repos and make everything work together using
> # latest OVS and OVN code from github. This generates the
> # ~/ovn-heater/runtime where all the repos are cloned and the test suite
> # is run. This step also generates the container image with OVS/OVN
> # compiled from sources. This step has to be done every time we need
> # to test with a different version of OVS/OVN and can be customized with
> # the OVS/OVN_REPO and OVS/OVN_BRANCH env vars.
> $ ./do.sh install

# Missed a step here:
$ ./do.sh rally-deploy

> 
> # Start the test:
> # This brings up 30 "fake" OVN nodes and then simulates addition of
> # 1000 pods (lsps) and associated policies (port_group/address_set/acl).
> $ ./do.sh browbeat-run \
>     browbeat-scenarios/switch-per-node-30-node-1000-pods.yml debug-dceara-pods
> 
> # This takes quite long, ~1hr on my system.
> # Results are stored at:
> # ls -l ~/ovn-heater/test_results/debug-dceara-pods-20200826-080650/20200826-120718/rally/plugin-workloads/all-rally-run-0.html
> 
> What I noticed was that while the test was running (we can monitor the
> execution by tailing ~/ovn-heater/runtime/browbeat/*.log),
> ovn-northd's CPU usage increased steadily and was above 70-80% after
> ~500 iterations.
> 
> ovn-northd logs:
> 2020-08-26T14:24:25.989Z|02119|poll_loop|INFO|wakeup due to [POLLIN] on fd 12 (192.16.0.1:53642<->192.16.0.1:6642) at lib/stream-ssl.c:832 (97% CPU usage)
> 2020-08-26T14:24:31.985Z|02120|poll_loop|INFO|Dropped 54 log messages in last 5 seconds (most recently, 0 seconds ago) due to excessive rate
> 2020-08-26T14:24:31.985Z|02121|poll_loop|INFO|wakeup due to [POLLIN] on fd 11 (192.16.0.1:56340<->192.16.0.1:6641) at lib/stream-ssl.c:832 (99% CPU usage)
> 
> For troubleshooting/profiling, the easiest way I can think of for
> rerunning the sequence of commands without actually running the whole
> suite is to extract them from the ovn-nbctl daemon logs. We start it on
> node ovn-central-1. I also added a short sleep to avoid NB changes being
> batched before ovn-northd processes them:
> 
> $ docker exec ovn-central-1 grep "Running command" \
>     /var/log/openvswitch/ovn-nbctl.log | \
>     sed -ne 's/.*Running command run\(.*\)/ovn-nbctl\1; sleep 0.01/p' > commands.sh
> 
> # Now we can just run ovn-northd locally:
> $ ovn-ctl start_northd
> # Start an ovn-nbctl daemon locally:
> $ export 

Re: [ovs-discuss] Inquiry for DDlog status for ovn-northd

2020-08-26 Thread Dumitru Ceara
On 8/25/20 7:46 PM, Ben Pfaff wrote:
> On Tue, Aug 25, 2020 at 06:43:51PM +0200, Dumitru Ceara wrote:
>> On 8/25/20 6:01 PM, Ben Pfaff wrote:
>>> On Mon, Aug 24, 2020 at 04:28:22PM -0700, Han Zhou wrote:
>>>> As I remember you were working on the new ovn-northd that utilizes DDlog
>>>> for incremental processing. Could you share the current status?
>>>>
>>>> Now that some more improvements have been made in ovn-controller and OVSDB,
>>>> the ovn-northd becomes the more obvious bottleneck for OVN use in large
>>>> scale environments. Since you were not in the OVN meetings for the last
>>>> couple of weeks, could you share here the status and plan moving forward?
>>>
>>> The status is basically that I haven't yet succeeded at getting Red
>>> Hat's recommended benchmarks running.  I'm told that is important before
>>> we merge it.  I find them super difficult to set up.  I tried a few
>>> weeks ago and basically gave up.  Piles and piles of repos all linked
>>> together in tricky ways, making it really difficult to substitute my own
>>> branches.  I intend to try again soon, though.  I have a new computer
>>> that should be arriving soon, which should also allow it to proceed more
>>> quickly.
>>
>> Hi Ben,
>>
>> I can try to help with setting up ovn-heater, in theory it should be
>> enough to export OVS_REPO, OVS_BRANCH, OVN_REPO, OVN_BRANCH, make them
>> point to your repos and branches and then run "do.sh install" and it
>> should take care of installing all the dependencies and repos.
>>
>> I can also try to run the scale tests on our downstream if that helps.
> 
> It's probably better if I come up with something locally, because I
> expect to have to run it multiple times, maybe many times, since I will
> presumably discover bottlenecks.
> 
> This time around, I'll speak up when I run into problems.
> 

Sorry in advance for the long email.

I went ahead and added a new test scenario to ovn-heater that I think
might be relevant in the context of ovn-northd incremental processing:

https://github.com/dceara/ovn-heater#example-run-scenario-3---scale-up-number-of-pods---stress-ovn-northd

On my test machine:
Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
2 NUMA nodes - 28 cores each.

I did:

$ cd
$ git clone https://github.com/dceara/ovn-heater
$ cd ovn-heater
$ cat > physical-deployments/physical-deployment.yml << EOF
registry-node: localhost
internal-iface: none

central-node:
  name: localhost

worker-nodes:
  - localhost
EOF

# Install all the required repos and make everything work together using
# latest OVS and OVN code from github. This generates the
# ~/ovn-heater/runtime where all the repos are cloned and the test suite
# is run. This step also generates the container image with OVS/OVN
# compiled from sources. This step has to be done every time we need
# to test with a different version of OVS/OVN and can be customized with
# the OVS/OVN_REPO and OVS/OVN_BRANCH env vars.
$ ./do.sh install

# Start the test:
# This brings up 30 "fake" OVN nodes and then simulates addition of
# 1000 pods (lsps) and associated policies (port_group/address_set/acl).
$ ./do.sh browbeat-run \
    browbeat-scenarios/switch-per-node-30-node-1000-pods.yml debug-dceara-pods

# This takes quite long, ~1hr on my system.
# Results are stored at:
# ls -l ~/ovn-heater/test_results/debug-dceara-pods-20200826-080650/20200826-120718/rally/plugin-workloads/all-rally-run-0.html

What I noticed was that while the test was running (we can monitor the
execution by tailing ~/ovn-heater/runtime/browbeat/*.log),
ovn-northd's CPU usage increased steadily and was above 70-80% after
~500 iterations.

ovn-northd logs:
2020-08-26T14:24:25.989Z|02119|poll_loop|INFO|wakeup due to [POLLIN] on fd 12 (192.16.0.1:53642<->192.16.0.1:6642) at lib/stream-ssl.c:832 (97% CPU usage)
2020-08-26T14:24:31.985Z|02120|poll_loop|INFO|Dropped 54 log messages in last 5 seconds (most recently, 0 seconds ago) due to excessive rate
2020-08-26T14:24:31.985Z|02121|poll_loop|INFO|wakeup due to [POLLIN] on fd 11 (192.16.0.1:56340<->192.16.0.1:6641) at lib/stream-ssl.c:832 (99% CPU usage)
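
To follow the CPU figure over a whole run rather than eyeballing single
lines, the percentage can be pulled out of the poll_loop messages. This is
only a rough sketch: the cpu_usage helper name is made up for
illustration, and the input is just the three lines quoted above,
unwrapped. On a real system the ovn-northd log file would be fed in
instead.

```shell
# cpu_usage: hypothetical helper that prints the number from the trailing
# "(NN% CPU usage)" suffix of each poll_loop wakeup line on stdin.
# Lines without that suffix (e.g. the rate-limit notice) print nothing.
cpu_usage() {
    sed -n 's/.*(\([0-9][0-9]*\)% CPU usage)$/\1/p'
}

# Sample input: the log lines from this email, one message per line.
cpu_values=$(cpu_usage <<'EOF'
2020-08-26T14:24:25.989Z|02119|poll_loop|INFO|wakeup due to [POLLIN] on fd 12 (192.16.0.1:53642<->192.16.0.1:6642) at lib/stream-ssl.c:832 (97% CPU usage)
2020-08-26T14:24:31.985Z|02120|poll_loop|INFO|Dropped 54 log messages in last 5 seconds (most recently, 0 seconds ago) due to excessive rate
2020-08-26T14:24:31.985Z|02121|poll_loop|INFO|wakeup due to [POLLIN] on fd 11 (192.16.0.1:56340<->192.16.0.1:6641) at lib/stream-ssl.c:832 (99% CPU usage)
EOF
)

# One percentage per matching line, in log order.
echo "$cpu_values"
```

Because poll_loop messages are rate limited (see the "Dropped 54 log
messages" line), the extracted values are a sample, not a complete trace.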

For troubleshooting/profiling, the easiest way I can think of for
rerunning the sequence of commands without actually running the whole
suite is to extract them from the ovn-nbctl daemon logs. We start it on
node ovn-central-1. I also added a short sleep to avoid NB changes being
batched before ovn-northd processes them:

$ docker exec ovn-central-1 grep "Running command" \
    /var/log/openvswitch/ovn-nbctl.log | \
    sed -ne 's/.*Running command run\(.*\)/ovn-nbctl\1; sleep 0.01/p' > commands.sh

# Now we can just run ovn-northd locally:
$ ovn-ctl start_northd
# Start an ovn-nbctl daemon locally:
$ export OVN_NB_DAEMON=$(ovn-nbctl --detach)
# Replay the commands (commands.sh was created by a redirect, so it is not
# executable; run it through sh):
$ sh ./commands.sh
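
To see what the sed expression produces before replaying a full run, it
can be tried on a single line. The sample log line below is made up for
illustration (the timestamp, module name and the "ls-add sw0" command are
assumptions, and the real ovn-nbctl daemon log may look different); only
the "Running command run" marker and the sed expression come from the
steps above.

```shell
# Hypothetical ovn-nbctl daemon log line; the real format may differ.
sample='2020-08-26T14:00:00.000Z|00001|ovn_nbctl|INFO|Running command run ls-add sw0'

# Same substitution as in the extraction step: keep everything after
# "Running command run", prefix it with "ovn-nbctl", and append a short
# sleep so consecutive NB changes are less likely to be batched.
extract_commands() {
    sed -ne 's/.*Running command run\(.*\)/ovn-nbctl\1; sleep 0.01/p'
}

replay_line=$(printf '%s\n' "$sample" | extract_commands)
echo "$replay_line"
```

For this sample the generated line is "ovn-nbctl ls-add sw0; sleep 0.01".
With OVN_NB_DAEMON exported, each such ovn-nbctl call goes through the
local daemon instead of opening its own database connection.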

Regarding the ddlog compilation I suspect that we need to add support
for it in ovn-fake-multinode which builds and runs the fake node's
images. I can take care of that and add the rust compiler and ddlog
binaries to 

Re: [ovs-discuss] Inquiry for DDlog status for ovn-northd

2020-08-25 Thread Ben Pfaff
On Tue, Aug 25, 2020 at 06:43:51PM +0200, Dumitru Ceara wrote:
> On 8/25/20 6:01 PM, Ben Pfaff wrote:
> > On Mon, Aug 24, 2020 at 04:28:22PM -0700, Han Zhou wrote:
> >> As I remember you were working on the new ovn-northd that utilizes DDlog
> >> for incremental processing. Could you share the current status?
> >>
> >> Now that some more improvements have been made in ovn-controller and OVSDB,
> >> the ovn-northd becomes the more obvious bottleneck for OVN use in large
> >> scale environments. Since you were not in the OVN meetings for the last
> >> couple of weeks, could you share here the status and plan moving forward?
> > 
> > The status is basically that I haven't yet succeeded at getting Red
> > Hat's recommended benchmarks running.  I'm told that is important before
> > we merge it.  I find them super difficult to set up.  I tried a few
> > weeks ago and basically gave up.  Piles and piles of repos all linked
> > together in tricky ways, making it really difficult to substitute my own
> > branches.  I intend to try again soon, though.  I have a new computer
> > that should be arriving soon, which should also allow it to proceed more
> > quickly.
> 
> Hi Ben,
> 
> I can try to help with setting up ovn-heater, in theory it should be
> enough to export OVS_REPO, OVS_BRANCH, OVN_REPO, OVN_BRANCH, make them
> point to your repos and branches and then run "do.sh install" and it
> should take care of installing all the dependencies and repos.
> 
> I can also try to run the scale tests on our downstream if that helps.

It's probably better if I come up with something locally, because I
expect to have to run it multiple times, maybe many times, since I will
presumably discover bottlenecks.

This time around, I'll speak up when I run into problems.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Inquiry for DDlog status for ovn-northd

2020-08-25 Thread Dumitru Ceara
On 8/25/20 6:01 PM, Ben Pfaff wrote:
> On Mon, Aug 24, 2020 at 04:28:22PM -0700, Han Zhou wrote:
>> As I remember you were working on the new ovn-northd that utilizes DDlog
>> for incremental processing. Could you share the current status?
>>
>> Now that some more improvements have been made in ovn-controller and OVSDB,
>> the ovn-northd becomes the more obvious bottleneck for OVN use in large
>> scale environments. Since you were not in the OVN meetings for the last
>> couple of weeks, could you share here the status and plan moving forward?
> 
> The status is basically that I haven't yet succeeded at getting Red
> Hat's recommended benchmarks running.  I'm told that is important before
> we merge it.  I find them super difficult to set up.  I tried a few
> weeks ago and basically gave up.  Piles and piles of repos all linked
> together in tricky ways, making it really difficult to substitute my own
> branches.  I intend to try again soon, though.  I have a new computer
> that should be arriving soon, which should also allow it to proceed more
> quickly.

Hi Ben,

I can try to help with setting up ovn-heater. In theory it should be
enough to export OVS_REPO, OVS_BRANCH, OVN_REPO and OVN_BRANCH so that
they point to your repos and branches, and then run "do.sh install",
which should take care of installing all the dependencies and repos.
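
As a rough sketch of what that could look like: the fork URLs and branch
names below are placeholders, not real repos, and whether do.sh expects
the repo value in exactly this URL form is an assumption. The guard around
do.sh is only there so the snippet can be pasted outside an ovn-heater
checkout without failing.

```shell
# Placeholder fork URLs and branch names; substitute your own.
export OVS_REPO="https://github.com/example-user/ovs.git"
export OVS_BRANCH="my-ovs-branch"
export OVN_REPO="https://github.com/example-user/ovn.git"
export OVN_BRANCH="my-ovn-branch"

# Run the installer only from inside an ovn-heater checkout; otherwise
# just report the settings that would be used.
if [ -x ./do.sh ]; then
    ./do.sh install
else
    echo "do.sh not found; would install OVS ${OVS_BRANCH} and OVN ${OVN_BRANCH}"
fi
```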

I can also try to run the scale tests on our downstream if that helps.

Regards,
Dumitru



Re: [ovs-discuss] Inquiry for DDlog status for ovn-northd

2020-08-25 Thread Ben Pfaff
On Mon, Aug 24, 2020 at 04:28:22PM -0700, Han Zhou wrote:
> As I remember you were working on the new ovn-northd that utilizes DDlog
> for incremental processing. Could you share the current status?
> 
> Now that some more improvements have been made in ovn-controller and OVSDB,
> the ovn-northd becomes the more obvious bottleneck for OVN use in large
> scale environments. Since you were not in the OVN meetings for the last
> couple of weeks, could you share here the status and plan moving forward?

The status is basically that I haven't yet succeeded at getting Red
Hat's recommended benchmarks running.  I'm told that is important before
we merge it.  I find them super difficult to set up.  I tried a few
weeks ago and basically gave up.  Piles and piles of repos all linked
together in tricky ways, making it really difficult to substitute my own
branches.  I intend to try again soon, though.  I have a new computer
that should be arriving soon, which should also allow it to proceed more
quickly.


[ovs-discuss] Inquiry for DDlog status for ovn-northd

2020-08-24 Thread Han Zhou
Hi Ben and Leonid,

As I remember you were working on the new ovn-northd that utilizes DDlog
for incremental processing. Could you share the current status?

Now that some more improvements have been made in ovn-controller and OVSDB,
the ovn-northd becomes the more obvious bottleneck for OVN use in large
scale environments. Since you were not in the OVN meetings for the last
couple of weeks, could you share here the status and plan moving forward?

Thanks,
Han