On Mon, Mar 20, 2023 at 3:37 AM Ilya Maximets <[email protected]> wrote:
>
> This patch set covers removal of expressions which are subsets of other
> wider expressions and aggregation of a few granular expressions into
> wider expressions that cover all of them at once.  This allows to avoid
> flow explosion in case of negative matches and reduce the total number
> of flows required for address sets.  More details are in commit messages.
>
> Version 2:
>   * Became a patch set.
>   * Added tests and missing bitmap.h include.
>   * Code switched to work with bitwise maskable fields only (ORDINAL).
>   * Added a new patch to combine smaller expressions into wider ones.
>   * Added a patch to fix a crash uncovered with expression aggregation.
>
> Ilya Maximets (3):
>   expr: Remove supersets from OR expressions.
>   expr: Avoid crash if all sub-expressions crushed down to 'true'.
>   expr: Combine OR sub-expressions into supersets.
>
>  controller/lflow.c      |   5 +-
>  lib/expr.c              | 188 +++++++++++++------
>  tests/ovn-controller.at | 399 ++++++++++++++++++++--------------------
>  tests/ovn.at            | 210 +++++++++++----------
>  4 files changed, 443 insertions(+), 359 deletions(-)
>
> --
> 2.39.2
>

Thanks Ilya for working on this. The same problem was also reported and
discussed briefly in the past: [0], which may be mentioned in reported-by
as well.

I reviewed and tested this series. It definitely works great for the flow
explosion problem caused by negations in expressions. With 5 subnets in !=
{} form, which would have generated hundreds of thousands of flows without
the patch, now ended up with almost nothing.

However, I also see scale problems introduced by this change for more
normal use cases. The loop that tries to combine expressions to supersets
is O(n^2), n = size of an address set. In large scale environments, it is
easy to have more than 10k IPs in an address set - such as when there is a
network policy allowing access from all pods of a big tenant/application. I
reused my scripts for testing my earlier address set I-P patch [1] to test
the performance with this series. As mentioned in the commit message, the
old result was:

Before: ~400ms
After: 11-12ms

Now with this series, it takes 70 seconds for the same test!
As we can see, even before address-set I-P, it took just 400ms. In a large
scale environment, since pods come and go very frequently, even 400ms for
each change would make ovn-controller too busy, and that's why we came up
with address-set I-P, which made it O(1) and much faster. Now with this
change, for each IP change it would take 70s, almost 200 times slower when
recomputing the lflow, not to mention comparing with address-set I-P.

So, I would suggest that the patch 1 should add some logic to restrict the
handling for combining expressions generated by negation (!=) only, and
keep the current logic unchanged for regular non-negative matches. I think
it is possible to add a field in the expr structure to indicate that
information while parsing the != operator.

For the same reason, I'd rather drop patch 3, because the penalty of
disabling I-P is far beyond the gains of reduced number of flows, unless
there are other ways to efficiently combine expression, and we also need to
prove at least in most cases the logic can end up with small enough number
of flows that recomputing-every-time is nothing to worry about, which I am
not even sure it is going to be the case. On the other hand, maybe it is
better to leave the task of crushing large address sets to a central
component or to CMS, and keep ovn-controller simple.

[0] https://www.mail-archive.com/[email protected]/msg61321.html
[1]
https://github.com/ovn-org/ovn/commit/7c0b538762f68b21bad41ed46f779d9084b7a679

Thanks,
Han
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to