On Wed, Nov 15, 2023 at 7:32 PM Numan Siddique <[email protected]> wrote:
>
> On Wed, Nov 15, 2023 at 2:59 AM Han Zhou <[email protected]> wrote:
> >
> > On Thu, Oct 26, 2023 at 11:12 AM <[email protected]> wrote:
> > >
> > > From: Numan Siddique <[email protected]>
> > >
> > > This patch series adds incremental processing in the lflow engine
> > > node to handle changes to northd and other engine nodes.
> > > Changes related to load balancers and NAT are mainly handled in
> > > this patch series.
> > >
> > > This patch series can also be found here -
> > > https://github.com/numansiddique/ovn/tree/northd_lflow_ip/v1
> >
> > Thanks Numan for the great improvement!
>
> Hi Han,
>
> Thanks for the review comments.  I understand it is hard to review
> so many patches, especially ones related to I-P.
> I appreciate the time spent on them and the feedback.
>
> >
> > I spent some time over the past few days reviewing the series and am
> > now at patch 10. I need to take a break, so I am sending the comments
> > I have so far.
> >
> > I also did scale testing for northd with
> > https://github.com/hzhou8/ovn-test-script for a 500-chassis, 50 LSP /
> > chassis setup, and noticed that the recompute latency has increased
> > by 20% after the series. I think in general it is expected to have
> > some performance penalty for the recompute case because of the new
> > indexes created for I-P. However, patch 10 "northd: Refactor lflow
> > management into a separate module." alone introduces a 10% latency
> > increase, which necessitates more investigation.
>
> Before sending out the series I did some testing on recomputes with a
> large OVN NB DB and SB DB
> (from a 500-node ovn-heater density-heavy run).
> I'm aware of the increase in recompute time, and I did some more
> testing today as well.
>
> In my testing, I can see that the increase in latency is due to the
> new engine nodes added (mainly lr_lbnat)
> and due to the housekeeping of the lflow references.  I do not see any
> increase due to the new lflow-mgr.c added in patch 10.
>
> I compared patch 9 and patch 10 separately and there is no difference
> in lflow engine node recompute time between patch 9 and 10.
>

My results were different. My test profile simulates the ovn-k8s topology
(central mode, not IC) with 500 nodes, 50 LSPs/node, no LBs, and a small
number of ACLs and PGs:
https://github.com/hzhou8/ovn-test-script/blob/main/args.500ch_50lsp_10000pg

The test results for ovn-northd recompute time are:
main: 1118.3
p9: 1129.5
p10: 1243.4 ==> 10% more than p9
p18: 1357.6

I am not sure if the p10 increase is related to the hash change or
something else.

> Below are the results with the time taken for the mentioned engine
> nodes in msecs for a recompute for some of the individual patches in
> the series.
>
>
> The sample OVN DBs have
>
> --------------------------------
> Resource                 Total
> --------------------------------
> Logical switches          1001
> Logical routers            501
> Router ports              1501
> Switch ports             29501
> Load balancers           35253
> Load balancer groups         1
> SB Logical flows        268349
> SB DB groups               509
> --------------------------------
>
> There is one load balancer group which has all the load balancers and
> it is associated with all the logical switches and routers.
>
> Below is the time taken for each engine node in msec
>
> ----------------------------------------------------------
> Engine nodes     | lb_data | northd | lr_lbnat | lflow |
> ----------------------------------------------------------
> ovn-northd-main  | 358     | 2455   | x        | 2082  |
> ovn-northd-p1    | 373     | 2476   | x        | 2170  |
> ovn-northd-p5    | 367     | 2413   | x        | 2195  |
> ovn-northd-p6    | 354     | 688    | 1815     | 2442  |
> ovn-northd-p7    | 357     | 683    | 1778     | 2806  |
> ovn-northd-p9    | 352     | 682    | 1781     | 2781  |
> ovn-northd-p10   | 365     | 838    | 1707     | 2812  |
> ovn-northd-p13   | 362     | 1066   | 1882     | 2917  |
> ovn-northd-p15   | 359     | 1046   | 1688     | 2907  |
> ovn-northd-p18   | 379     | 1066   | 1729     | 2886  |
> ----------------------------------------------------------
>
> ovn-northd-p1 means ovn-northd with the patch 1 of this series,
> ovn-northd-p2 - patch 2 of this series and so on.
>
> Patch 6 adds a new engine node, lr_lbnat, which is why the northd
> engine node time drops from ~2413 msec to ~688 msec.
>
> With the first 10 patches, the northd engine node's overall time
> increases by ~200 msec compared to "main", and the lflow engine node
> time increases by around 800 msec.
> The point of this data is to show that the overall increase is mainly
> due to bookkeeping and the new engine nodes.  I tried to optimise
> this but couldn't reduce it further.
>
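The "housekeeping of lflow references" mentioned above boils down to tracking which lflows each object contributed, so an incremental handler can delete exactly those flows. A minimal sketch of that pattern (illustrative only; the `sketch_*` names are hypothetical and this is not the lflow-mgr.c implementation):

```c
#include <stdlib.h>

/* Each flow a ref owns is kept on a singly linked list. */
struct sketch_lflow {
    struct sketch_lflow *next; /* next flow owned by the same ref */
};

struct sketch_lflow_ref {
    struct sketch_lflow *head; /* flows contributed by one object */
    size_t n;                  /* count of owned flows */
};

/* Recording ownership costs a little on every flow added, even during
 * a full recompute -- that is the bookkeeping overhead discussed above. */
void
lflow_ref_add(struct sketch_lflow_ref *ref, struct sketch_lflow *lf)
{
    lf->next = ref->head;
    ref->head = lf;
    ref->n++;
}

/* On an incremental change, only this owner's flows are freed,
 * instead of recomputing every flow in the system. */
size_t
lflow_ref_clear(struct sketch_lflow_ref *ref)
{
    size_t cleared = ref->n;
    while (ref->head) {
        struct sketch_lflow *next = ref->head->next;
        free(ref->head);
        ref->head = next;
    }
    ref->n = 0;
    return cleared;
}
```

The per-add cost is what shows up in the recompute numbers; the payoff is the targeted clear on incremental updates.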
>
> IMO this 20% overall increase should be fine for the following reasons:
>    -  The scale tests with ovn-heater show a significant decrease in
> ovn-northd CPU usage and in the overall test duration.
>    -  This means there were very few recomputes triggered.
>    -  With the OCP/K8s kube-burner tests, there were 0 recomputes
> in the northd and lflow engine nodes.
>
>
>
> >
> > >
> > > Prior to this patch series, most of the changes to northd engine
> > > resulted in full recomputation of logical flows.  This series
> > > aims to improve the performance of ovn-northd by adding the I-P
> > > support.
> >
> > I'd like to clarify "most of the changes" a little. I think we should
> > focus on the most impactful changes that happen in a cloud
> > environment. I don't think it is realistic to achieve "most of the
> > changes" in I-P because it is too complex (even with this series we
> > still handle a very small part of the DB schema incrementally), but
> > it may be realistic to handle the most impactful changes, which are
> > the most frequent changes in the cloud, incrementally. Before this
> > series, we could handle regular VIF changes and related address-set
> > and (some) port-group updates incrementally. Those are the most
> > common changes related to pod/VM changes in a cloud. I believe the
> > next goal is for LB changes related to pods/VMs (i.e. the LB
> > backends), which I believe is the main goal of this series. Is my
> > understanding correct?
> >
> That is correct.  LB change handling is very important in my opinion
> when used with ovn-kubernetes because all the k8s services
> are mapped to OVN LBs.
>
>
> > While reviewing the patches, I'd say sometimes I feel a little lost
> > because it is hard to correlate some of the changes with the above
> > goal.
>
> I understand that.  Maybe I've done a bad job in conveying why the
> changes are required for LBs.
>
> Let me try to clarify a bit here.  I'll incorporate and add more
> details in the patch commits and cover letter in the next series.
>
> This series mainly targets handling lflow generation for load
> balancer changes.
> As you know, we use conntrack in OVN for
>    - Load balancers (in both logical switches and routers)
>    - ACLs in logical switches
>    - NAT in logical routers.
>
> And this makes it complicated to decouple logical flow generation for
> LBs from ACLs and NATs.  That is the reason I had to split the northd
> node data related to load balancers, ACLs and NATs into separate nodes.
>
> To give an example, we generate certain logical flows for a logical
> switch by checking "if (od->has_lb_vip && od->has_stateful_acl)" (see
> build_acl_hints()).
> For logical routers, a NAT external IP can be an LB VIP too.  This
> makes it difficult to decouple LB and NAT.  We could fall back to a
> recompute for NAT changes, but we would still need to group logical
> flows related to LB and NAT.  That's why I added the lr_lbnat_data
> engine node in patch 6.
>
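The coupling described above can be illustrated with a small sketch. Only the has_lb_vip/has_stateful_acl check mirrors real code (build_acl_hints()); the struct and function names here are hypothetical:

```c
#include <stdbool.h>

/* Simplified sketch of why LB and ACL state are coupled in logical
 * switch flow generation: some flows are emitted only when a datapath
 * has BOTH an LB VIP and a stateful ACL, so a change to either input
 * can invalidate flows "owned" by the other feature. */
struct sketch_datapath {
    bool has_lb_vip;       /* derived from load-balancer data */
    bool has_stateful_acl; /* derived from ACL data */
};

/* Mirrors the kind of check done in build_acl_hints(): an incremental
 * LB handler cannot decide this alone without also knowing ACL state. */
bool
needs_acl_hint_flows(const struct sketch_datapath *od)
{
    return od->has_lb_vip && od->has_stateful_acl;
}
```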

Thanks for explaining. However, I wonder if these are related to LB backend
updates. I agree that service creation/deletion may also be important to
handle incrementally, but it is less critical (less frequent) than the
backend updates, which are directly related to pod/VM creation/deletion.
The has_xxx and NAT-related handling are more related to service
creation/deletion rather than LB backend updates, right?

Thanks,
Han

> > I believe there is a reason for all the changes, but I am not sure
> > if it is directly related to the goal of LB-backend-related I-P. I
> > understand that there are other aspects of LB that can be
> > incrementally processed. But I'd recommend that if the changes
> > necessary for this goal can be largely reduced, it would help the
> > review, and we might be able to merge them sooner and add more, but
> > less impactful, I-P later, step by step. I guess I will have a
> > clearer picture when I finish the rest of the patches, but it would
> > be helpful if you could add more guidance in this cover letter.
>
> I'll definitely add more details in the cover letter.
>
> Apart from the LB-related I-P, this series adds only two additional
> I-P patches.
> One is for NB_Global changes and the other for SB mac binding changes.
> ovn-kubernetes periodically writes a timestamp to the
> NB_Global.options column for its internal health checks,
> and this results in unnecessary recomputes.  After running the
> kube-burner density-cni tests, I figured out the most common
> changes made in the NB DB and added I-P for them.  With all these
> patches applied, there are no recomputes in either the northd or
> lflow engine nodes
> during the test duration.  Not having the I-P for NB_Global, for
> example, was resulting in significant ovn-northd CPU usage.
>
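The idea behind that NB_Global handler can be sketched as follows. This is a simplified illustration; the option name and the function are assumptions for the example, not the actual schema or northd code:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch of the NB_Global I-P idea: ovn-kubernetes keeps
 * rewriting a timestamp option, and without a handler every such write
 * forces a full recompute.  A change handler can treat the update as a
 * no-op for lflows when only flow-irrelevant keys changed.  The key
 * name below is illustrative. */
bool
nb_global_change_is_noop(const char *changed_option)
{
    /* Options assumed not to affect any generated logical flow. */
    static const char *ignored[] = { "nb_e2e_timestamp", NULL };

    for (int i = 0; ignored[i]; i++) {
        if (!strcmp(changed_option, ignored[i])) {
            return true;  /* handled incrementally: skip recompute */
        }
    }
    return false;         /* unknown change: fall back to recompute */
}
```

With a handler like this, the periodic health-check writes stop triggering recomputes while any other NB_Global change still falls back safely.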
> Thanks again for the reviews.
>
> Numan
>
> >
> > Thanks,
> > Han
> >
> > >  In order to add this support, some of the northd engine
> > > node data (from struct ovn_datapath) is split and moved over to
> > > new engine nodes - mainly related to load balancers, NAT and ACLs.
> > >
> > > Below are the scale testing results done with these patches applied
> > > using ovn-heater.  The test ran the scenario
> > > ocp-500-density-heavy.yml [1].
> > >
> > > With all the lflow I-P patches applied, the results are:
> > >
> > >
> >
> > > --------------------------------------------------------------------------------------------------------------
> > >                       Min (s)   Median (s)  90%ile (s)  99%ile (s)  Max (s)   Mean (s)  Total (s)  Count  Failed
> > > --------------------------------------------------------------------------------------------------------------
> > > Iteration Total       0.136883  1.129016    1.192001    1.204167    1.212728  0.665017  83.127099  125    0
> > > Namespace.add_ports   0.005216  0.005736    0.007034    0.015486    0.018978  0.006211  0.776373   125    0
> > > WorkerNode.bind_port  0.035030  0.046082    0.052469    0.058293    0.060311  0.045973  11.493259  250    0
> > > WorkerNode.ping_port  0.005057  0.006727    1.047692    1.069253    1.071336  0.266896  66.724094  250    0
> > > --------------------------------------------------------------------------------------------------------------
> > >
> > > The results with the present main are:
> > >
> > >
> >
> > > --------------------------------------------------------------------------------------------------------------
> > >                       Min (s)   Median (s)  90%ile (s)  99%ile (s)  Max (s)   Mean (s)  Total (s)   Count  Failed
> > > --------------------------------------------------------------------------------------------------------------
> > > Iteration Total       0.135491  2.223805    3.311270    3.339078    3.345346  1.729172  216.146495  125    0
> > > Namespace.add_ports   0.005380  0.005744    0.006819    0.018773    0.020800  0.006292  0.786532    125    0
> > > WorkerNode.bind_port  0.034179  0.046055    0.053488    0.058801    0.071043  0.046117  11.529311   250    0
> > > WorkerNode.ping_port  0.004956  0.006952    3.086952    3.191743    3.192807  0.791544  197.886026  250    0
> > > --------------------------------------------------------------------------------------------------------------
> > >
> > > [1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml
> > >
> > > v1 -> v2
> > > --------
> > >    * Now also maintaining array indexes for the ls_lbacls, lr_nat and
> > >      lr_lb_nat_data tables (similar to ovn_datapaths->array) to
> > >      make the lookup efficient.  The same ovn_datapath->index
> > >      is reused.
> > >
> > >    * Made some significant changes to 'struct lflow_ref' in
> > >      lflow-mgr.c.
> > >      In v2 we don't use objdep_mgr to maintain the resource-to-lflow
> > >      references.  Instead we maintain the 'struct lflow' pointer.
> > >      With this we don't need to maintain an additional hmap of lflows.
> > >
> > >
> > > Numan Siddique (18):
> > >   northd: Refactor the northd change tracking.
> > >   northd: Track ovn_datapaths in northd engine track data.
> > >   tests: Add a couple of tests in ovn-northd for I-P.
> > >   northd: Move router ports SB PB options sync to sync_to_sb_pb node.
> > >   northd: Add a new engine 'lr-nat' to manage lr NAT data.
> > >   northd: Add a new engine 'lr-lb-nat-data' to manage routers' lb
> > >     data.
> > >   northd:  Generate logical router's LB and NAT flows using
> > >     lr_lbnat_data.
> > >   northd: Don't commit dhcp response flows in the conntrack.
> > >   northd: Add a new node ls_lbacls.
> > >   northd: Refactor lflow management into a separate module.
> > >   northd: Use lflow_ref when adding all logical flows.
> > >   northd:  Move ovn_lb_datapaths from lib to northd module.
> > >   northd: Handle lb changes in lflow engine.
> > >   northd: Add lr_lb_nat_data handler for lflow engine node.
> > >   northd: Add ls_lbacls handler for lflow engine node.
> > >   northd:  Add I-P for NB_Global and SB_Global.
> > >   northd: Add a noop handler for northd SB mac binding.
> > >   northd: Add northd change handler for sync_to_sb_lb node.
> > >
> > >  lib/lb.c                   |   96 -
> > >  lib/lb.h                   |   57 -
> > >  lib/ovn-util.c             |   17 +-
> > >  lib/ovn-util.h             |    2 +-
> > >  lib/stopwatch-names.h      |    3 +
> > >  northd/aging.c             |   21 +-
> > >  northd/automake.mk         |   12 +-
> > >  northd/en-global-config.c  |  588 ++++
> > >  northd/en-global-config.h  |   65 +
> > >  northd/en-lflow.c          |  123 +-
> > >  northd/en-lflow.h          |    8 +
> > >  northd/en-lr-lb-nat-data.c |  685 ++++
> > >  northd/en-lr-lb-nat-data.h |  125 +
> > >  northd/en-lr-nat.c         |  498 +++
> > >  northd/en-lr-nat.h         |  137 +
> > >  northd/en-ls-lb-acls.c     |  530 +++
> > >  northd/en-ls-lb-acls.h     |  111 +
> > >  northd/en-northd.c         |   67 +-
> > >  northd/en-northd.h         |    2 +-
> > >  northd/en-port-group.h     |    3 +
> > >  northd/en-sync-sb.c        |  513 ++-
> > >  northd/inc-proc-northd.c   |   79 +-
> > >  northd/lflow-mgr.c         | 1078 ++++++
> > >  northd/lflow-mgr.h         |  192 ++
> > >  northd/northd.c            | 6485 ++++++++++++++++--------------------
> > >  northd/northd.h            |  551 ++-
> > >  northd/ovn-northd.c        |    4 +
> > >  tests/ovn-northd.at        |  858 ++++-
> > >  28 files changed, 8827 insertions(+), 4083 deletions(-)
> > >  create mode 100644 northd/en-global-config.c
> > >  create mode 100644 northd/en-global-config.h
> > >  create mode 100644 northd/en-lr-lb-nat-data.c
> > >  create mode 100644 northd/en-lr-lb-nat-data.h
> > >  create mode 100644 northd/en-lr-nat.c
> > >  create mode 100644 northd/en-lr-nat.h
> > >  create mode 100644 northd/en-ls-lb-acls.c
> > >  create mode 100644 northd/en-ls-lb-acls.h
> > >  create mode 100644 northd/lflow-mgr.c
> > >  create mode 100644 northd/lflow-mgr.h
> > >
> > > --
> > > 2.41.0
> > >
> > > _______________________________________________
> > > dev mailing list
> > > [email protected]
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
