Re: [ovs-dev] [PATCH ovn v2] northd, controller: Commit flows dropped by ACLs to conntrack

2023-04-03 Thread Abhiram Sangana


> On 29 Mar 2023, at 16:21, Dumitru Ceara  wrote:
> 
> On 3/29/23 12:00, Abhiram Sangana wrote:
>>>>> @@ -896,6 +904,26 @@ put_local_common_flows(uint32_t dp_key,
>>>>>pb->header_.uuid.parts[0], , ofpacts_p,
>>>>>>header_.uuid);
>>>>> 
>>>>> +if (zone_ids->drop) {
>>>>> +/* Table 39, Priority 1.
>>>>> + * ===
>>>>> + *
>>>>> + * Clear the logical registers (for consistent behavior with 
>>>>> packets
>>>>> + * that get tunneled) except MFF_LOG_ACL_DROP_ZONE. */
>>>>> +match_init_catchall();
>>>>> +ofpbuf_clear(ofpacts_p);
>>>>> +match_set_metadata(, htonll(dp_key));
>>>>> +for (int i = 0; i < MFF_N_LOG_REGS; i++) {
>>>>> +if ((MFF_REG0 + i) != MFF_LOG_ACL_DROP_ZONE) {
>>>>> +put_load(0, MFF_REG0 + i, 0, 32, ofpacts_p);
>>>>> +}
>>>>> +}
>>> Why do we need this?  Don't we load the CT zone anyway for "to-lport"
>>> ACLs?  Can't we also load the drop zone in the same place?
>> Yes, we are loading the drop zone (reg 9) in the same place as CT zone
>> for “to-lport” ACLs (reg 13) but this happens before table=39 where
>> registers 0 to 9 are cleared.
> 
> Oh, I see now, it's because the drop zone is stored in reg9.  I wanted
> to suggest not allowing reg9 to be used as logical register anymore but
> that's not really easy because it's used in the router pipeline:
> 
> /* Register to store the result of check_pkt_larger action. */
> #define REGBIT_PKT_LARGER"reg9[1]"
> #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[2]"
> #define REGBIT_LOOKUP_NEIGHBOR_IP_RESULT "reg9[3]"
> #define REGBIT_DST_NAT_IP_LOCAL "reg9[4]"
> #define REGBIT_KNOWN_ECMP_NH"reg9[5]"
> #define REGBIT_KNOWN_LB_SESSION "reg9[6]"
> 
> [...]
> #define REG_ORIG_TP_DPORT_ROUTER   "reg9[16..31]"
> 
> Can we reuse one of the router-specific zones registers instead?
> 
> #define MFF_LOG_DNAT_ZONE  MFF_REG11  /* conntrack dnat zone for gateway
> router
>   * (32 bits). */
> #define MFF_LOG_SNAT_ZONE  MFF_REG12  /* conntrack snat zone for gateway
> router
>   * (32 bits). */

Currently, we seem to be allocating SNAT and DNAT zones for
logical switches too and loading registers 11 and 12.

bb83b543-0d9d-409b-832c-4fc235355289 is a logical_switch.

$ ovn-appctl ct-zone-list
bb83b543-0d9d-409b-832c-4fc235355289_dnat 1
sw0p1 3
bb83b543-0d9d-409b-832c-4fc235355289_snat 2
bb83b543-0d9d-409b-832c-4fc235355289_drop 4

 cookie=0x2547c5b0, duration=27.257s, table=0, n_packets=0, n_bytes=0, 
idle_age=27, priority=100,in_port=1 
actions=load:0x3->NXM_NX_REG13[],load:0x1->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x4->NXM_NX_REG9[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)

 cookie=0x2547c5b0, duration=27.258s, table=38, n_packets=0, n_bytes=0, 
idle_age=27, priority=100,reg15=0x1,metadata=0x1 
actions=load:0x3->NXM_NX_REG13[],load:0x1->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x4->NXM_NX_REG9[],resubmit(,39)

Is it ok to reuse these registers?

Thanks,
Abhiram

> Regards,
> Dumitru
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] northd, controller: Commit flows dropped by ACLs to conntrack

2023-03-29 Thread Abhiram Sangana

Hi Dumitru,

Thanks for reviewing the patch.

> On 23 Mar 2023, at 20:59, Dumitru Ceara  wrote:
> 
> On 3/17/23 20:34, Numan Siddique wrote:
>> On Mon, Feb 13, 2023 at 11:36 AM Abhiram Sangana
>>  wrote:
>>> 
>>> This patch adds support to commit connections dropped/rejected by
>>> ACLs to the connection tracking table. Dropped connections are
>>> committed to conntrack only if NB_Global options:ct_commit_acl_drop
>>> is set to true (false by default) and ACL dropping/rejecting the
>>> connection has label configured. The dropped connections are
>>> committed in a separate conntrack zone so that they can be managed
>>> independently and do not interact with the connection tracking state
>>> of allowed connections.
>>> 
>>> This provides a new approach to identify connections dropped by ACLs
>>> besides the existing ACL logging and drop sampling approaches.
>>> 
>>> Each logical switch is assigned a new conntrack zone for committing
>>> dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
>>> A new lflow action "ct_commit_drop" is introduced that commits flows
>>> to connection tracking table in a zone identified by
>>> MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
>>> and non-empty label translates to include "ct_commit_drop" in its
>>> actions instead of simply dropping/rejecting the packet.
>>> 
>>> Signed-off-by: Abhiram Sangana 
>> 
>> Hi Abhiram,
>> 
> 
> Hi Abhiram,
> 
>> Sorry for the delays in the reviews.
>> 
>> Overall the patch looks good to me.
>> 
>> I do have a few comments and suggestions.
>> 
>> Instead of adding a global option, how about an option per ACL (in the
>> ACL options column) ?
>> Users can enable or disable conntrack commit per ACL.
>> 
> 
> +1 for this, finer granularity is probably better.

Sure, will add an option per ACL.

> I only have a few minor comments below.
> 
>> Are you going to use this feature all the time in your deployment ?
>> Or enable it only when debugging an issue ?
>> If you want to enable it during debugging,  is it a bit of a hassle to
>> enable the ct_commit option for the interested ACL ?
>> 
>> 
>> I'm a little bit concerned with allocating a zone id for each logical
>> switch even if there are no ACLs configured with this option.
>> But I think it's not a big deal.
>> 
>> Thanks
>> Numan
>> 
>>> ---
>>> controller/ovn-controller.c  |  14 +++-
>>> controller/physical.c|  32 -
>>> include/ovn/actions.h|   1 +
>>> include/ovn/logical-fields.h |   1 +
>>> lib/actions.c|  65 +
>>> lib/ovn-util.c   |   4 +-
>>> lib/ovn-util.h   |   2 +-
>>> northd/northd.c  |  25 ++-
>>> northd/ovn-northd.8.xml  |  30 +++-
>>> ovn-nb.xml   |  17 +++--
>>> ovn-sb.xml   |  22 ++
>>> tests/ovn-nbctl.at   |  10 ++-
>>> tests/ovn-northd.at  | 133 ---
>>> tests/ovn.at |  90 +++-
>>> utilities/ovn-nbctl.c|   7 --
>>> utilities/ovn-trace.c|   2 +
> 
> Please also add a NEWS entry.

Sure.
> 
>>> 16 files changed, 383 insertions(+), 72 deletions(-)
>>> 
>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>>> index 265740cab..e8f0b7242 100644
>>> --- a/controller/ovn-controller.c
>>> +++ b/controller/ovn-controller.c
>>> @@ -734,8 +734,8 @@ update_ct_zones(const struct shash *binding_lports,
>>> HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
>>> /* XXX Add method to limit zone assignment to logical router
>>>  * datapaths with NAT */
>>> -char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, 
>>> "dnat");
>>> -char *snat = alloc_nat_zone_key(>datapath->header_.uuid, 
>>> "snat");
>>> +char *dnat = alloc_ct_zone_key(>datapath->header_.uuid, 
>>> "dnat");
>>> +char *snat = alloc_ct_zone_key(>datapath->header_.uuid, 
>>> "snat");
>>> sset_add(_users, dnat);
>>> sset_add(_users, snat);
>>> 
>>> @@ -745,6 +745,14 @@ update_ct_zones(const struct shash *bind

Re: [ovs-dev] [PATCH ovn v2] northd, controller: Commit flows dropped by ACLs to conntrack

2023-03-21 Thread Abhiram Sangana


> On 18 Mar 2023, at 01:04, Numan Siddique  wrote:
> 
> On Mon, Feb 13, 2023 at 11:36 AM Abhiram Sangana
>  wrote:
>> 
>> This patch adds support to commit connections dropped/rejected by
>> ACLs to the connection tracking table. Dropped connections are
>> committed to conntrack only if NB_Global options:ct_commit_acl_drop
>> is set to true (false by default) and ACL dropping/rejecting the
>> connection has label configured. The dropped connections are
>> committed in a separate conntrack zone so that they can be managed
>> independently and do not interact with the connection tracking state
>> of allowed connections.
>> 
>> This provides a new approach to identify connections dropped by ACLs
>> besides the existing ACL logging and drop sampling approaches.
>> 
>> Each logical switch is assigned a new conntrack zone for committing
>> dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
>> A new lflow action "ct_commit_drop" is introduced that commits flows
>> to connection tracking table in a zone identified by
>> MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
>> and non-empty label translates to include "ct_commit_drop" in its
>> actions instead of simply dropping/rejecting the packet.
>> 
>> Signed-off-by: Abhiram Sangana 
> 

Hi Numan,

Thanks for reviewing the patch.

> Hi Abhiram,
> 
> Sorry for the delays in the reviews.
> 
> Overall the patch looks good to me.
> 
> I do have a few comments and suggestions.
> 
> Instead of adding a global option, how about an option per ACL (in the
> ACL options column) ?
> Users can enable or disable conntrack commit per ACL.

Sure, will do that.

> 
> Are you going to use this feature all the time in your deployment ?
> Or enable it only when debugging an issue ?
> If you want to enable it during debugging,  is it a bit of a hassle to
> enable the ct_commit option for the interested ACL ?

Yes, we are planning to use this feature all the time in our deployment.
So, it shouldn't be an issue to enable ct_commit option for interested ACLs.

> 
> I'm a little bit concerned with allocating a zone id for each logical
> switch even if there are no ACLs configured with this option.
> But I think it's not a big deal.

Yes, I did try to look into allocating CT zone on an as-needed basis but
had some difficulty implementing it. Will try to explore further in a
subsequent patch.

Thanks,
Abhiram

> 
> Thanks
> Numan

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] northd, controller: Commit flows dropped by ACLs to conntrack

2023-03-03 Thread Abhiram Sangana



> On 13 Feb 2023, at 16:35, Abhiram Sangana  wrote:
> 
> This patch adds support to commit connections dropped/rejected by
> ACLs to the connection tracking table. Dropped connections are
> committed to conntrack only if NB_Global options:ct_commit_acl_drop
> is set to true (false by default) and ACL dropping/rejecting the
> connection has label configured. The dropped connections are
> committed in a separate conntrack zone so that they can be managed
> independently and do not interact with the connection tracking state
> of allowed connections.
> 
> This provides a new approach to identify connections dropped by ACLs
> besides the existing ACL logging and drop sampling approaches.
> 
> Each logical switch is assigned a new conntrack zone for committing
> dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
> A new lflow action "ct_commit_drop" is introduced that commits flows
> to connection tracking table in a zone identified by
> MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
> and non-empty label translates to include "ct_commit_drop" in its
> actions instead of simply dropping/rejecting the packet.
> 
> Signed-off-by: Abhiram Sangana 
> ---
> controller/ovn-controller.c  |  14 +++-
> controller/physical.c|  32 -
> include/ovn/actions.h|   1 +
> include/ovn/logical-fields.h |   1 +
> lib/actions.c|  65 +
> lib/ovn-util.c   |   4 +-
> lib/ovn-util.h   |   2 +-
> northd/northd.c  |  25 ++-
> northd/ovn-northd.8.xml  |  30 +++-
> ovn-nb.xml   |  17 +++--
> ovn-sb.xml   |  22 ++
> tests/ovn-nbctl.at   |  10 ++-
> tests/ovn-northd.at  | 133 ---
> tests/ovn.at |  90 +++-
> utilities/ovn-nbctl.c|   7 --
> utilities/ovn-trace.c|   2 +
> 16 files changed, 383 insertions(+), 72 deletions(-)
> 

Can someone please review this patch?

Thank you,
Abhiram Sangana
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] northd, controller: Commit flows dropped by ACLs to conntrack

2023-02-13 Thread Abhiram Sangana
This patch adds support to commit connections dropped/rejected by
ACLs to the connection tracking table. Dropped connections are
committed to conntrack only if NB_Global options:ct_commit_acl_drop
is set to true (false by default) and ACL dropping/rejecting the
connection has label configured. The dropped connections are
committed in a separate conntrack zone so that they can be managed
independently and do not interact with the connection tracking state
of allowed connections.

This provides a new approach to identify connections dropped by ACLs
besides the existing ACL logging and drop sampling approaches.

Each logical switch is assigned a new conntrack zone for committing
dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
A new lflow action "ct_commit_drop" is introduced that commits flows
to connection tracking table in a zone identified by
MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
and non-empty label translates to include "ct_commit_drop" in its
actions instead of simply dropping/rejecting the packet.

Signed-off-by: Abhiram Sangana 
---
 controller/ovn-controller.c  |  14 +++-
 controller/physical.c|  32 -
 include/ovn/actions.h|   1 +
 include/ovn/logical-fields.h |   1 +
 lib/actions.c|  65 +
 lib/ovn-util.c   |   4 +-
 lib/ovn-util.h   |   2 +-
 northd/northd.c  |  25 ++-
 northd/ovn-northd.8.xml  |  30 +++-
 ovn-nb.xml   |  17 +++--
 ovn-sb.xml   |  22 ++
 tests/ovn-nbctl.at   |  10 ++-
 tests/ovn-northd.at  | 133 ---
 tests/ovn.at |  90 +++-
 utilities/ovn-nbctl.c|   7 --
 utilities/ovn-trace.c|   2 +
 16 files changed, 383 insertions(+), 72 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 265740cab..e8f0b7242 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -734,8 +734,8 @@ update_ct_zones(const struct shash *binding_lports,
 HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
 /* XXX Add method to limit zone assignment to logical router
  * datapaths with NAT */
-char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
-char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat");
+char *dnat = alloc_ct_zone_key(>datapath->header_.uuid, "dnat");
+char *snat = alloc_ct_zone_key(>datapath->header_.uuid, "snat");
 sset_add(_users, dnat);
 sset_add(_users, snat);
 
@@ -745,6 +745,14 @@ update_ct_zones(const struct shash *binding_lports,
 }
 free(dnat);
 free(snat);
+
+/* Zone for committing connections dropped by ACLs with labels. */
+if (ld->is_switch) {
+char *drop = alloc_ct_zone_key(
+>datapath->header_.uuid, "drop");
+sset_add(_users, drop);
+free(drop);
+}
 }
 
 /* Delete zones that do not exist in above sset. */
@@ -2420,7 +2428,7 @@ ct_zones_datapath_binding_handler(struct engine_node 
*node, void *data)
 /* Check if the requested snat zone has changed for the datapath
  * or not.  If so, then fall back to full recompute of
  * ct_zone engine. */
-char *snat_dp_zone_key = alloc_nat_zone_key(>header_.uuid, "snat");
+char *snat_dp_zone_key = alloc_ct_zone_key(>header_.uuid, "snat");
 struct simap_node *simap_node = simap_find(_zones_data->current,
snat_dp_zone_key);
 free(snat_dp_zone_key);
diff --git a/controller/physical.c b/controller/physical.c
index 4dcf44e01..3c573c492 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -60,6 +60,7 @@ struct zone_ids {
 int ct; /* MFF_LOG_CT_ZONE. */
 int dnat;   /* MFF_LOG_DNAT_ZONE. */
 int snat;   /* MFF_LOG_SNAT_ZONE. */
+int drop;   /* MFF_LOG_ACL_DROP_ZONE. */
 };
 
 struct tunnel {
@@ -204,14 +205,18 @@ get_zone_ids(const struct sbrec_port_binding *binding,
 
 const struct uuid *key = >datapath->header_.uuid;
 
-char *dnat = alloc_nat_zone_key(key, "dnat");
+char *dnat = alloc_ct_zone_key(key, "dnat");
 zone_ids.dnat = simap_get(ct_zones, dnat);
 free(dnat);
 
-char *snat = alloc_nat_zone_key(key, "snat");
+char *snat = alloc_ct_zone_key(key, "snat");
 zone_ids.snat = simap_get(ct_zones, snat);
 free(snat);
 
+char *drop = alloc_ct_zone_key(key, "drop");
+zone_ids.drop = simap_get(ct_zones, drop);
+free(drop);
+
 return zone_ids;

Re: [ovs-dev] [PATCH ovn] northd, controller: Commit flows dropped by ACLs to conntrack

2023-02-06 Thread Abhiram Sangana


> On 3 Feb 2023, at 16:08, Mark Michelson  wrote:
> 
> On 1/25/23 05:36, Abhiram Sangana wrote:
>> Hi Mark,
>> I have replied to your comments. Can you please have a look when you get a 
>> chance?
> 
> I had a look at the code itself, and from a purely mechanical perspective, I 
> can't see anything wrong with it. I have some high level comments down below, 
> though.
> 
>> Thanks,
>> Abhiram Sangana
>>> On 17 Jan 2023, at 12:37, Abhiram Sangana  
>>> wrote:
>>> 
>>> Hi Mark,
>>> 
>>> Thanks for reviewing the patch.
>>> 
>>>> On 16 Jan 2023, at 21:34, Mark Michelson  wrote:
>>>> 
>>>> Hello Abhiram,
>>>> 
>>>> I haven't taken a close look at every line of the series, but I have two 
>>>> high-level questions/observations.
>>>> 
>>>> First, what is the benefit of using this compared to ACL logging or drop 
>>>> sampling?
>>> 
>>> I had done some basic testing on ACL logging and exporting Netflow
>>> records for OVN bridge. I noticed a significant load on ovs-vswitchd
>>> when there are a large number of connections created in the bridge
>>> (70-80% CPU usage with 1000 connections per sec for Netflow and
>>> 60-70% CPU with 1000 packets per sec for ACL logging), probably due to
>>> large number of upcalls. I haven't tested specifically with drop
>>> sampling but expected a similar cost if we use 100% sampling. The
>>> alternative is to use metering with ACL logs or sampling part of the
>>> connections before exporting them but we might miss some of the
>>> connections this way.
>>> 
>>> With the conntrack approach, ovs-vswitchd would not be involved and
>>> CMS can read the CT table or receive connection tracking events from
>>> nf_conntrack kernel module via ctnetlink (a single event is generated
>>> for each connection). We should be able to get all/most of the
>>> connections dropped by ACLs. However, this approach is datapath
>>> specific.
>>> 
>>> There was brief discussion regarding the same in the RFC thread -
>>> [ovs-dev,RFC] northd, controller: Commit flows dropped by ACLs in a
>>> separate CT zone -
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_ovn_patch_20221018153342.164530-2D1-2Dsangana.abhiram-40nutanix.com_=DwICAg=s883GpUCOChKOHiocYtGcg=RotwPXnAckhnfEzdWgnpPR0nnZ46Y0RYo-uUGaS4vXY=W3sqrwt4epeEOIKA16LGwz3gbzhOCAyFKDxFAK8KX5lJsEFMpVu09MyJTWxSNIpa=um7MDrMKGeU4Ozd9VvKIJQ5cVpVizh_lH5ILPVlt2zE=
>>> 
>>>> 
>>>> Second, I think that if we are to accept this patch, the behavior needs to 
>>>> be optional, and it needs to be opt-in. In other words, someone needs to 
>>>> explicitly enable the behavior on the logical switch in order to commit 
>>>> dropped connections to CT. If the goal is to use this as a debugging tool, 
>>>> it should only be enabled when attempting to debug. I'm not knowledgeable 
>>>> about the inner workings of CT, but committing every dropped connection to 
>>>> CT sounds like a vector for a DOS exploit to me. Someone else can comment 
>>>> in case I'm completely wrong here.
>>>> 
>>> While the DROP CT zone is created unconditionally for each logical
>>> switch, connections are only committed if the ACL dropping/rejecting
>>> them has label set. So, user can create drop/reject ACLs without
>>> labels (default behavior today) if they don't want to use this
>>> feature.
> 
> IMO, this is not good enough. As a user I would be surprised to find that 
> additional entries are added to conntrack simply because I created a label 
> for my ACL. I think that another option needs to be created that specifically 
> enables committing dropped packets to CT.

Sure - should we add a global option to commit connections dropped
by ACLs or a per logical_switch option?

Also, given that `label` field was not supported for ACLs with action as
“drop” or “reject” before, should we bump the NB DB schema version?
> 
>>> 
>>> Yes, committing dropped connections is a potential DOS vector. I am
>>> planning to send another patch that limits the size of the DROP CT
>>> zone - ovs datapath already has support to limit the size of CT zones.
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openvswitch_ovs_commit_cb2a5486a3a3756ee3868da0050d737c8989770c=DwICAg=s883GpUCOChKOHiocYtGcg=RotwPXnAckhnfEzdWgnpPR0nnZ46Y0RYo-uUGaS4vXY=W3sqrwt4epeEOIKA16LGwz3gbzhOCAyFKDxFAK8KX5lJsEFMpVu09MyJTWxSN

Re: [ovs-dev] [PATCH ovn] northd, controller: Commit flows dropped by ACLs to conntrack

2023-01-25 Thread Abhiram Sangana
Hi Mark,

I have replied to your comments. Can you please have a look when you get a 
chance?

Thanks,
Abhiram Sangana

> On 17 Jan 2023, at 12:37, Abhiram Sangana  wrote:
> 
> Hi Mark,
> 
> Thanks for reviewing the patch.
> 
>> On 16 Jan 2023, at 21:34, Mark Michelson  wrote:
>> 
>> Hello Abhiram,
>> 
>> I haven't taken a close look at every line of the series, but I have two 
>> high-level questions/observations.
>> 
>> First, what is the benefit of using this compared to ACL logging or drop 
>> sampling?
> 
> I had done some basic testing on ACL logging and exporting Netflow
> records for OVN bridge. I noticed a significant load on ovs-vswitchd
> when there are a large number of connections created in the bridge
> (70-80% CPU usage with 1000 connections per sec for Netflow and
> 60-70% CPU with 1000 packets per sec for ACL logging), probably due to
> large number of upcalls. I haven't tested specifically with drop
> sampling but expected a similar cost if we use 100% sampling. The
> alternative is to use metering with ACL logs or sampling part of the
> connections before exporting them but we might miss some of the
> connections this way.
> 
> With the conntrack approach, ovs-vswitchd would not be involved and
> CMS can read the CT table or receive connection tracking events from
> nf_conntrack kernel module via ctnetlink (a single event is generated
> for each connection). We should be able to get all/most of the
> connections dropped by ACLs. However, this approach is datapath
> specific.
> 
> There was brief discussion regarding the same in the RFC thread -
> [ovs-dev,RFC] northd, controller: Commit flows dropped by ACLs in a
> separate CT zone - 
> https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_ovn_patch_20221018153342.164530-2D1-2Dsangana.abhiram-40nutanix.com_=DwICAg=s883GpUCOChKOHiocYtGcg=RotwPXnAckhnfEzdWgnpPR0nnZ46Y0RYo-uUGaS4vXY=W3sqrwt4epeEOIKA16LGwz3gbzhOCAyFKDxFAK8KX5lJsEFMpVu09MyJTWxSNIpa=um7MDrMKGeU4Ozd9VvKIJQ5cVpVizh_lH5ILPVlt2zE=
>  
> 
>> 
>> Second, I think that if we are to accept this patch, the behavior needs to 
>> be optional, and it needs to be opt-in. In other words, someone needs to 
>> explicitly enable the behavior on the logical switch in order to commit 
>> dropped connections to CT. If the goal is to use this as a debugging tool, 
>> it should only be enabled when attempting to debug. I'm not knowledgeable 
>> about the inner workings of CT, but committing every dropped connection to 
>> CT sounds like a vector for a DOS exploit to me. Someone else can comment in 
>> case I'm completely wrong here.
>> 
> While the DROP CT zone is created unconditionally for each logical
> switch, connections are only committed if the ACL dropping/rejecting
> them has label set. So, user can create drop/reject ACLs without
> labels (default behavior today) if they don't want to use this
> feature.
> 
> Yes, committing dropped connections is a potential DOS vector. I am
> planning to send another patch that limits the size of the DROP CT
> zone - ovs datapath already has support to limit the size of CT zones.
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openvswitch_ovs_commit_cb2a5486a3a3756ee3868da0050d737c8989770c=DwICAg=s883GpUCOChKOHiocYtGcg=RotwPXnAckhnfEzdWgnpPR0nnZ46Y0RYo-uUGaS4vXY=W3sqrwt4epeEOIKA16LGwz3gbzhOCAyFKDxFAK8KX5lJsEFMpVu09MyJTWxSNIpa=9jkE_C4UAA8pOWcIzNbymLs0ai6Am90PvwTE1oFjHCc=
>  
> 
>> Thanks,
>> Mark Michelson
> 
> Thanks,
> Abhiram Sangana
> 
>> On 1/13/23 07:44, Abhiram Sangana wrote:
>>> This patch commits connections dropped/rejected by ACLs with label
>>> (introduced in 0e0228be (northd: Add ACL label)) to the connection
>>> tracking table. The dropped connections are committed in a separate
>>> conntrack zone so that they can be managed independently and do not
>>> interact with the connection tracking state of allowed connections.
>>> Each logical switch is assigned a new conntrack zone for committing
>>> dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
>>> A new lflow action "ct_commit_drop" is introduced that commits flows
>>> to connection tracking table in a zone identified by
>>> MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
>>> and non-empty label translates to include "ct_commit_drop" in its
>>> actions instead of simply dropping/rejecting the packet.
>>> This provides a new approach to identify connections dropped by ACLs
>>> besides the existing ACL logging and drop sampling approaches.
>>> Signed-off-by: Abh

Re: [ovs-dev] [PATCH ovn] northd, controller: Commit flows dropped by ACLs to conntrack

2023-01-17 Thread Abhiram Sangana
Hi Mark,

Thanks for reviewing the patch.

> On 16 Jan 2023, at 21:34, Mark Michelson  wrote:
> 
> Hello Abhiram,
> 
> I haven't taken a close look at every line of the series, but I have two 
> high-level questions/observations.
> 
> First, what is the benefit of using this compared to ACL logging or drop 
> sampling?

I had done some basic testing on ACL logging and exporting Netflow
records for OVN bridge. I noticed a significant load on ovs-vswitchd
when there are a large number of connections created in the bridge
(70-80% CPU usage with 1000 connections per sec for Netflow and
60-70% CPU with 1000 packets per sec for ACL logging), probably due to
large number of upcalls. I haven't tested specifically with drop
sampling but expected a similar cost if we use 100% sampling. The
alternative is to use metering with ACL logs or sampling part of the
connections before exporting them but we might miss some of the
connections this way.

With the conntrack approach, ovs-vswitchd would not be involved and
CMS can read the CT table or receive connection tracking events from
nf_conntrack kernel module via ctnetlink (a single event is generated
for each connection). We should be able to get all/most of the
connections dropped by ACLs. However, this approach is datapath
specific.

There was brief discussion regarding the same in the RFC thread -
[ovs-dev,RFC] northd, controller: Commit flows dropped by ACLs in a
separate CT zone - 
http://patchwork.ozlabs.org/project/ovn/patch/20221018153342.164530-1-sangana.abhi...@nutanix.com/

> 
> Second, I think that if we are to accept this patch, the behavior needs to be 
> optional, and it needs to be opt-in. In other words, someone needs to 
> explicitly enable the behavior on the logical switch in order to commit 
> dropped connections to CT. If the goal is to use this as a debugging tool, it 
> should only be enabled when attempting to debug. I'm not knowledgeable about 
> the inner workings of CT, but committing every dropped connection to CT 
> sounds like a vector for a DOS exploit to me. Someone else can comment in 
> case I'm completely wrong here.
> 
While the DROP CT zone is created unconditionally for each logical
switch, connections are only committed if the ACL dropping/rejecting
them has label set. So, user can create drop/reject ACLs without
labels (default behavior today) if they don't want to use this
feature.

Yes, committing dropped connections is a potential DOS vector. I am
planning to send another patch that limits the size of the DROP CT
zone - ovs datapath already has support to limit the size of CT zones.
https://github.com/openvswitch/ovs/commit/cb2a5486a3a3756ee3868da0050d737c8989770c

> Thanks,
> Mark Michelson

Thanks,
Abhiram Sangana

> On 1/13/23 07:44, Abhiram Sangana wrote:
>> This patch commits connections dropped/rejected by ACLs with label
>> (introduced in 0e0228be (northd: Add ACL label)) to the connection
>> tracking table. The dropped connections are committed in a separate
>> conntrack zone so that they can be managed independently and do not
>> interact with the connection tracking state of allowed connections.
>> Each logical switch is assigned a new conntrack zone for committing
>> dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
>> A new lflow action "ct_commit_drop" is introduced that commits flows
>> to connection tracking table in a zone identified by
>> MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
>> and non-empty label translates to include "ct_commit_drop" in its
>> actions instead of simply dropping/rejecting the packet.
>> This provides a new approach to identify connections dropped by ACLs
>> besides the existing ACL logging and drop sampling approaches.
>> Signed-off-by: Abhiram Sangana 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] northd, controller: Commit flows dropped by ACLs to conntrack

2023-01-13 Thread Abhiram Sangana
This patch commits connections dropped/rejected by ACLs with label
(introduced in 0e0228be (northd: Add ACL label)) to the connection
tracking table. The dropped connections are committed in a separate
conntrack zone so that they can be managed independently and do not
interact with the connection tracking state of allowed connections.

Each logical switch is assigned a new conntrack zone for committing
dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
A new lflow action "ct_commit_drop" is introduced that commits flows
to connection tracking table in a zone identified by
MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
and non-empty label translates to include "ct_commit_drop" in its
actions instead of simply dropping/rejecting the packet.

This provides a new approach to identify connections dropped by ACLs
besides the existing ACL logging and drop sampling approaches.

Signed-off-by: Abhiram Sangana 
---
 controller/ovn-controller.c  |  14 +++--
 controller/physical.c|  32 ++-
 include/ovn/actions.h|   1 +
 include/ovn/logical-fields.h |   1 +
 lib/actions.c|  65 ++
 lib/ovn-util.c   |   4 +-
 lib/ovn-util.h   |   2 +-
 northd/northd.c  |  19 ++-
 northd/ovn-northd.8.xml  |  18 ++-
 ovn-nb.xml   |   8 ++-
 ovn-sb.xml   |  22 
 tests/ovn-nbctl.at   |  10 ++--
 tests/ovn-northd.at  | 102 +--
 tests/ovn.at |  90 ++-
 utilities/ovn-nbctl.c|   7 ---
 15 files changed, 323 insertions(+), 72 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 351cf1c5b..c19b56838 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -734,8 +734,8 @@ update_ct_zones(const struct shash *binding_lports,
 HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
 /* XXX Add method to limit zone assignment to logical router
  * datapaths with NAT */
-char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
-char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat");
+char *dnat = alloc_ct_zone_key(>datapath->header_.uuid, "dnat");
+char *snat = alloc_ct_zone_key(>datapath->header_.uuid, "snat");
 sset_add(_users, dnat);
 sset_add(_users, snat);
 
@@ -745,6 +745,14 @@ update_ct_zones(const struct shash *binding_lports,
 }
 free(dnat);
 free(snat);
+
+/* Zone for committing connections dropped by ACLs with labels. */
+if (ld->is_switch) {
+char *drop = alloc_ct_zone_key(
+>datapath->header_.uuid, "drop");
+sset_add(_users, drop);
+free(drop);
+}
 }
 
 /* Delete zones that do not exist in above sset. */
@@ -2415,7 +2423,7 @@ ct_zones_datapath_binding_handler(struct engine_node 
*node, void *data)
 /* Check if the requested snat zone has changed for the datapath
  * or not.  If so, then fall back to full recompute of
  * ct_zone engine. */
-char *snat_dp_zone_key = alloc_nat_zone_key(>header_.uuid, "snat");
+char *snat_dp_zone_key = alloc_ct_zone_key(>header_.uuid, "snat");
 struct simap_node *simap_node = simap_find(_zones_data->current,
snat_dp_zone_key);
 free(snat_dp_zone_key);
diff --git a/controller/physical.c b/controller/physical.c
index 4dcf44e01..3c573c492 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -60,6 +60,7 @@ struct zone_ids {
 int ct; /* MFF_LOG_CT_ZONE. */
 int dnat;   /* MFF_LOG_DNAT_ZONE. */
 int snat;   /* MFF_LOG_SNAT_ZONE. */
+int drop;   /* MFF_LOG_ACL_DROP_ZONE. */
 };
 
 struct tunnel {
@@ -204,14 +205,18 @@ get_zone_ids(const struct sbrec_port_binding *binding,
 
 const struct uuid *key = >datapath->header_.uuid;
 
-char *dnat = alloc_nat_zone_key(key, "dnat");
+char *dnat = alloc_ct_zone_key(key, "dnat");
 zone_ids.dnat = simap_get(ct_zones, dnat);
 free(dnat);
 
-char *snat = alloc_nat_zone_key(key, "snat");
+char *snat = alloc_ct_zone_key(key, "snat");
 zone_ids.snat = simap_get(ct_zones, snat);
 free(snat);
 
+char *drop = alloc_ct_zone_key(key, "drop");
+zone_ids.drop = simap_get(ct_zones, drop);
+free(drop);
+
 return zone_ids;
 }
 
@@ -830,6 +835,9 @@ put_zones_ofpacts(const struct zone_ids *zone_ids, struct 
ofpbuf *ofpacts_p)
 if (zone_ids->snat) {
 put_load(zone_

Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone

2022-11-21 Thread Abhiram Sangana
Hi Adrian,

I apologise for the delay in replying back.

> On 14 Nov 2022, at 13:05, Adrian Moreno  wrote:
> 
> Hi,
> 
> On 10/20/22 15:49, Abhiram Sangana wrote:
>> Hi Dumitru,
>> Thanks for reviewing the patch.
>>> On 19 Oct 2022, at 14:09, Dumitru Ceara  wrote:
>>> 
>>> Hi Abhiram,
>>> 
>>> Thanks for the patch!  I only skimmed the changes so this is not a full
>>> review but more of a discussion starter.
>>> 
>>> On 10/18/22 17:33, Abhiram Sangana wrote:
>>>> To identify connections dropped by ACLs, users can enable logging for ACLs
>>>> but this approach does not scale. ACL logging uses "controller" action
>>>> which causes a significant spike in the CPU usage of ovs-vswitchd (and
>>>> ovn-controller to a lesser extent) even with metering enabled (observed
>>>> 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another
>>>> approach is to use drop sampling (patch by Adrian Moreno currently in
>>>> review) but we might miss specific connections of interest with this
>>>> approach.
>>>> 
>>>> This patch commits connections dropped by ACLs to the connection tracking
>>>> table with a specific ACL label that was introduced in 0e0228be (
>>>> northd: Add ACL label). The dropped connections are committed in a separate
>>>> CT zone so that they can be managed independently.
>>>> 
>>> 
>>> I'm not sure I understand how the CMS can manage this.  How is this
>>> better than sampling?  Committed connections are going to time out at
>>> some point (30 sec by default for udp/icmp with the kernel datapath).
>>> So the CMS will have to continuously monitor the contents of the
>>> conntrack zone?  Aren't we just moving the CPU load somewhere else with
>>> this?  Even so, there's a chance an entry is missed.
>> Linux nf_conntrack module supports sending connection tracking events
>> to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel
>> parameter). So, CMS can parse the stream of new connection events from
>> conntrack and log the packets based on CT label.
> 
> Isn't this datapath-specific? this won't be available for the netdev 
> datapath, right?

Yes, this approach is datapath specific - I haven’t checked how to send 
connection tracking events for netdev datapath.
> 
>> An issue with sampling is that if there are a large number of packets
>> for a particular connection(s), packets of other connections might not
>> get sampled and we miss information about these connections. With the
>> conntrack approach, we get a single event for each connection (until
>> they time out), so there is lesser load on the CMS/collector and lesser
>> likelihood of missing connections.
> 
> I don't see why sampling would inherently miss packets. The current RFC for 
> ACL sampling (different from the generic drop sampling one) allows the user 
> to specify a probability per ACL so 100% of packets can be sampled in 
> connection establishment drops while we sample N% of accepted traffic having 
> a lot of flexibility in the inevitable performance vs accuracy tradeoff.
> 
> I'd like to better understand what are the limitations of the current 
> approach to see if it can be improved in any way.

I haven’t experimented with flow-based IPFIX sampling but I noticed high CPU 
usage of ovs-vswitchd while trying to export Netflow records (which I think is 
similar to 100% sampling) with large number of connections in OVN bridge. I was 
expecting a similar cost with respect to upcalls if we use 100% sampling rate 
for drop ACLs when there are multiple connection establishment packets in the 
bridge.

Thanks,
Abhiram

> Thanks,
> --
> Adrián
> 
>>> 
>>>> Each logical port is assigned a new zone for committing dropped flows.
>>>> The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
>>>> 
>>>> A new lflow action "ct_commit_drop" is introduced that commits flows
>>>> to connection tracking table in a zone identified by
>>>> MFF_LOG_ACL_DROP_ZONE register.
>>>> 
>>>> An ACL with "drop" action and non-empty label is translated to 
>>>> "ct_commit_drop"
>>>> instead of silently dropping the packet.
>>>> 
>>>> Signed-off-by: Abhiram Sangana 
>>>> ---
>>>> controller/ovn-controller.c  | 23 ++---
>>>> controller/physical.c| 32 +--
>>>> include/ovn/actions.h|  1 +
>>>> includ

Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone

2022-11-01 Thread Abhiram Sangana


> On 1 Nov 2022, at 12:35, Dumitru Ceara  wrote:
> 
> On 11/1/22 12:40, Abhiram Sangana wrote:
>> 
>> 
>>> On 21 Oct 2022, at 10:58, Dumitru Ceara  wrote:
>>> 
>>> On 10/20/22 17:34, Abhiram Sangana wrote:
>>>> Hi Dumitru,
>>>> 
>>>> Can you please check if the implementation for the proposal looks ok?
>>>> Will send out v1 with the review comments and tests.
>>>> Also, any ideas how we can selectively create new drop zones for ports.
>>>> Currently, I am assigning a new zone for dropped connections to every
>>>> port even if its parent LS doesn’t have drop ACLs with labels.
>>>> 
>>> 
>>> Within a logical switch do we really need a drop zone per port?  Isn't
>>> it actually enough if we add a "from-lport-drop" and "to-lport-drop"
>>> zone for the whole logical switch?  That should simplify zone allocation.
>>> 
>> Given that we are committing dropped connections to CT table, a DDOS
>> attack can potentially fill up the CT table. We are planning to send
>> another patch that limits the number of entries for a given CT zone.
>> It is easier to manage the size of CT zones when there is a CT zone
>> per port rather than per Logical_Switch.
>> 
> 
> Ok.  To answer your previous question, we could mark the SB.Port_Binding
> with an option if there are ACLs applied to the switch containing it and
> at least one of those needs drop zones.  Even better would be to mark
> the the SB.Datapath_Binding but we don't have options there, we'd have
> to update the schema.
> 
> I'm not sure if that works for you, what do you think?
> 
> Thanks,
> Dumitru

Yes, I think that should work. I will send out a v1 patch with these changes.
Thank you, Dumitru.

Thanks,
Abhiram
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone

2022-11-01 Thread Abhiram Sangana


> On 21 Oct 2022, at 10:58, Dumitru Ceara  wrote:
> 
> On 10/20/22 17:34, Abhiram Sangana wrote:
>> Hi Dumitru,
>> 
>> Can you please check if the implementation for the proposal looks ok?
>> Will send out v1 with the review comments and tests.
>> Also, any ideas how we can selectively create new drop zones for ports.
>> Currently, I am assigning a new zone for dropped connections to every
>> port even if its parent LS doesn’t have drop ACLs with labels.
>> 
> 
> Within a logical switch do we really need a drop zone per port?  Isn't
> it actually enough if we add a "from-lport-drop" and "to-lport-drop"
> zone for the whole logical switch?  That should simplify zone allocation.
> 
Given that we are committing dropped connections to CT table, a DDOS
attack can potentially fill up the CT table. We are planning to send
another patch that limits the number of entries for a given CT zone.
It is easier to manage the size of CT zones when there is a CT zone
per port rather than per Logical_Switch.

>> Thanks,
>> Abhiram Sangana
>> 
>>> On 20 Oct 2022, at 15:18, Dumitru Ceara  wrote:
>>> 
>>> On 10/20/22 15:49, Abhiram Sangana wrote:
>>>> Hi Dumitru,
>>>> 
>>>> Thanks for reviewing the patch.
>>>> 
>>>>> On 19 Oct 2022, at 14:09, Dumitru Ceara  wrote:
>>>>> 
>>>>> Hi Abhiram,
>>>>> 
>>>>> Thanks for the patch!  I only skimmed the changes so this is not a full
>>>>> review but more of a discussion starter.
>>>>> 
>>>>> On 10/18/22 17:33, Abhiram Sangana wrote:
>>>>>> To identify connections dropped by ACLs, users can enable logging for 
>>>>>> ACLs
>>>>>> but this approach does not scale. ACL logging uses "controller" action
>>>>>> which causes a significant spike in the CPU usage of ovs-vswitchd (and
>>>>>> ovn-controller to a lesser extent) even with metering enabled (observed
>>>>>> 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another
>>>>>> approach is to use drop sampling (patch by Adrian Moreno currently in
>>>>>> review) but we might miss specific connections of interest with this
>>>>>> approach.
>>>>>> 
>>>>>> This patch commits connections dropped by ACLs to the connection tracking
>>>>>> table with a specific ACL label that was introduced in 0e0228be (
>>>>>> northd: Add ACL label). The dropped connections are committed in a 
>>>>>> separate
>>>>>> CT zone so that they can be managed independently.
>>>>>> 
>>>>> 
>>>>> I'm not sure I understand how the CMS can manage this.  How is this
>>>>> better than sampling?  Committed connections are going to time out at
>>>>> some point (30 sec by default for udp/icmp with the kernel datapath).
>>>>> So the CMS will have to continuously monitor the contents of the
>>>>> conntrack zone?  Aren't we just moving the CPU load somewhere else with
>>>>> this?  Even so, there's a chance an entry is missed.
>>>> 
>>>> Linux nf_conntrack module supports sending connection tracking events
>>>> to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel
>>>> parameter). So, CMS can parse the stream of new connection events from
>>>> conntrack and log the packets based on CT label.
>>>> 
>>> 
>>> Ack.
>>> 
>>>> An issue with sampling is that if there are a large number of packets
>>>> for a particular connection(s), packets of other connections might not
>>>> get sampled and we miss information about these connections. With the
>>>> conntrack approach, we get a single event for each connection (until
>>>> they time out), so there is lesser load on the CMS/collector and lesser
>>>> likelihood of missing connections.
>>>> 
>>> 
>>> I hadn't considered this advantage, thanks for pointing it out!
>>> 
>>>>> 
>>>>>> Each logical port is assigned a new zone for committing dropped flows.
>>>>>> The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
>>>>>> 
>>>>>> A new lflow action "ct_commit_drop" is introduced that commits flows
>>>>>> to connection tracking table in a zone identified by
>>>>>> MFF_LOG_ACL

Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone

2022-10-20 Thread Abhiram Sangana
Hi Dumitru,

Can you please check if the implementation for the proposal looks ok?
Will send out v1 with the review comments and tests.
Also, any ideas how we can selectively create new drop zones for ports.
Currently, I am assigning a new zone for dropped connections to every
port even if its parent LS doesn’t have drop ACLs with labels.

Thanks,
Abhiram Sangana

> On 20 Oct 2022, at 15:18, Dumitru Ceara  wrote:
> 
> On 10/20/22 15:49, Abhiram Sangana wrote:
>> Hi Dumitru,
>> 
>> Thanks for reviewing the patch.
>> 
>>> On 19 Oct 2022, at 14:09, Dumitru Ceara  wrote:
>>> 
>>> Hi Abhiram,
>>> 
>>> Thanks for the patch!  I only skimmed the changes so this is not a full
>>> review but more of a discussion starter.
>>> 
>>> On 10/18/22 17:33, Abhiram Sangana wrote:
>>>> To identify connections dropped by ACLs, users can enable logging for ACLs
>>>> but this approach does not scale. ACL logging uses "controller" action
>>>> which causes a significant spike in the CPU usage of ovs-vswitchd (and
>>>> ovn-controller to a lesser extent) even with metering enabled (observed
>>>> 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another
>>>> approach is to use drop sampling (patch by Adrian Moreno currently in
>>>> review) but we might miss specific connections of interest with this
>>>> approach.
>>>> 
>>>> This patch commits connections dropped by ACLs to the connection tracking
>>>> table with a specific ACL label that was introduced in 0e0228be (
>>>> northd: Add ACL label). The dropped connections are committed in a separate
>>>> CT zone so that they can be managed independently.
>>>> 
>>> 
>>> I'm not sure I understand how the CMS can manage this.  How is this
>>> better than sampling?  Committed connections are going to time out at
>>> some point (30 sec by default for udp/icmp with the kernel datapath).
>>> So the CMS will have to continuously monitor the contents of the
>>> conntrack zone?  Aren't we just moving the CPU load somewhere else with
>>> this?  Even so, there's a chance an entry is missed.
>> 
>> Linux nf_conntrack module supports sending connection tracking events
>> to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel
>> parameter). So, CMS can parse the stream of new connection events from
>> conntrack and log the packets based on CT label.
>> 
> 
> Ack.
> 
>> An issue with sampling is that if there are a large number of packets
>> for a particular connection(s), packets of other connections might not
>> get sampled and we miss information about these connections. With the
>> conntrack approach, we get a single event for each connection (until
>> they time out), so there is lesser load on the CMS/collector and lesser
>> likelihood of missing connections.
>> 
> 
> I hadn't considered this advantage, thanks for pointing it out!
> 
>>> 
>>>> Each logical port is assigned a new zone for committing dropped flows.
>>>> The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
>>>> 
>>>> A new lflow action "ct_commit_drop" is introduced that commits flows
>>>> to connection tracking table in a zone identified by
>>>> MFF_LOG_ACL_DROP_ZONE register.
>>>> 
>>>> An ACL with "drop" action and non-empty label is translated to 
>>>> "ct_commit_drop"
>>>> instead of silently dropping the packet.
>>>> 
>>>> Signed-off-by: Abhiram Sangana 
>>>> ---
>>>> controller/ovn-controller.c  | 23 ++---
>>>> controller/physical.c| 32 +--
>>>> include/ovn/actions.h|  1 +
>>>> include/ovn/logical-fields.h |  1 +
>>>> lib/actions.c| 50 
>>>> lib/ovn-util.c   |  4 +--
>>>> lib/ovn-util.h   |  2 +-
>>>> northd/northd.c  | 14 --
>>>> northd/ovn-northd.8.xml  | 14 --
>>>> ovn-sb.xml   | 17 
>>>> ovs  |  2 +-
>>> 
>>> This shouldn't need to change the OVS submodule.
>>> 
>> My bad. Will fix this.
>> 
> 
> Cool, thanks, looking forward for v1!
> 
> Regards,
> Dumitru
> 
>> Thanks,
>> Abhiram Sangana
>> 
>>> Thanks,
>>> Dumitru
>&g

Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone

2022-10-20 Thread Abhiram Sangana
Hi Dumitru,

Thanks for reviewing the patch.

> On 19 Oct 2022, at 14:09, Dumitru Ceara  wrote:
> 
> Hi Abhiram,
> 
> Thanks for the patch!  I only skimmed the changes so this is not a full
> review but more of a discussion starter.
> 
> On 10/18/22 17:33, Abhiram Sangana wrote:
>> To identify connections dropped by ACLs, users can enable logging for ACLs
>> but this approach does not scale. ACL logging uses "controller" action
>> which causes a significant spike in the CPU usage of ovs-vswitchd (and
>> ovn-controller to a lesser extent) even with metering enabled (observed
>> 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another
>> approach is to use drop sampling (patch by Adrian Moreno currently in
>> review) but we might miss specific connections of interest with this
>> approach.
>> 
>> This patch commits connections dropped by ACLs to the connection tracking
>> table with a specific ACL label that was introduced in 0e0228be (
>> northd: Add ACL label). The dropped connections are committed in a separate
>> CT zone so that they can be managed independently.
>> 
> 
> I'm not sure I understand how the CMS can manage this.  How is this
> better than sampling?  Committed connections are going to time out at
> some point (30 sec by default for udp/icmp with the kernel datapath).
> So the CMS will have to continuously monitor the contents of the
> conntrack zone?  Aren't we just moving the CPU load somewhere else with
> this?  Even so, there's a chance an entry is missed.

Linux nf_conntrack module supports sending connection tracking events
to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel
parameter). So, CMS can parse the stream of new connection events from
conntrack and log the packets based on CT label.

An issue with sampling is that if there are a large number of packets
for a particular connection(s), packets of other connections might not
get sampled and we miss information about these connections. With the
conntrack approach, we get a single event for each connection (until
they time out), so there is lesser load on the CMS/collector and lesser
likelihood of missing connections.

> 
>> Each logical port is assigned a new zone for committing dropped flows.
>> The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
>> 
>> A new lflow action "ct_commit_drop" is introduced that commits flows
>> to connection tracking table in a zone identified by
>> MFF_LOG_ACL_DROP_ZONE register.
>> 
>> An ACL with "drop" action and non-empty label is translated to 
>> "ct_commit_drop"
>> instead of silently dropping the packet.
>> 
>> Signed-off-by: Abhiram Sangana 
>> ---
>> controller/ovn-controller.c  | 23 ++---
>> controller/physical.c| 32 +--
>> include/ovn/actions.h|  1 +
>> include/ovn/logical-fields.h |  1 +
>> lib/actions.c| 50 
>> lib/ovn-util.c   |  4 +--
>> lib/ovn-util.h       |  2 +-
>> northd/northd.c  | 14 --
>> northd/ovn-northd.8.xml  | 14 --
>> ovn-sb.xml   | 17 
>> ovs  |  2 +-
> 
> This shouldn't need to change the OVS submodule.
> 
My bad. Will fix this.

Thanks,
Abhiram Sangana

> Thanks,
> Dumitru
> 
>> utilities/ovn-nbctl.c|  7 ++---
>> 12 files changed, 151 insertions(+), 16 deletions(-)
>> 
>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>> index c97744d57..1ad20fe55 100644
>> --- a/controller/ovn-controller.c
>> +++ b/controller/ovn-controller.c
>> @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports,
>> unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)];
>> 
>> struct shash_node *shash_node;
>> +const struct binding_lport *lport;
>> SHASH_FOR_EACH (shash_node, binding_lports) {
>> sset_add(_users, shash_node->name);
>> +
>> +/* Zone for committing dropped connections of a vNIC. */
>> +lport = shash_node->data;
>> +char *drop_zone = alloc_ct_zone_key(>pb->header_.uuid, 
>> "drop");
>> +sset_add(_users, drop_zone);
>> +free(drop_zone);
>> }
>> 
>> /* Local patched datapath (gateway routers) need zones assigned. */
>> @@ -670,8 +677,8 @@ update_ct_zones(const struct shash *binding_lports,
>> HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
>> /* XXX Add metho

[ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone

2022-10-18 Thread Abhiram Sangana
To identify connections dropped by ACLs, users can enable logging for ACLs
but this approach does not scale. ACL logging uses "controller" action
which causes a significant spike in the CPU usage of ovs-vswitchd (and
ovn-controller to a lesser extent) even with metering enabled (observed
65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another
approach is to use drop sampling (patch by Adrian Moreno currently in
review) but we might miss specific connections of interest with this
approach.

This patch commits connections dropped by ACLs to the connection tracking
table with a specific ACL label that was introduced in 0e0228be (
northd: Add ACL label). The dropped connections are committed in a separate
CT zone so that they can be managed independently.

Each logical port is assigned a new zone for committing dropped flows.
The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.

A new lflow action "ct_commit_drop" is introduced that commits flows
to connection tracking table in a zone identified by
MFF_LOG_ACL_DROP_ZONE register.

An ACL with "drop" action and non-empty label is translated to "ct_commit_drop"
instead of silently dropping the packet.

Signed-off-by: Abhiram Sangana 
---
 controller/ovn-controller.c  | 23 ++---
 controller/physical.c| 32 +--
 include/ovn/actions.h|  1 +
 include/ovn/logical-fields.h |  1 +
 lib/actions.c| 50 
 lib/ovn-util.c   |  4 +--
 lib/ovn-util.h   |  2 +-
 northd/northd.c  | 14 --
 northd/ovn-northd.8.xml  | 14 --
 ovn-sb.xml   | 17 
 ovs  |  2 +-
 utilities/ovn-nbctl.c|  7 ++---
 12 files changed, 151 insertions(+), 16 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index c97744d57..1ad20fe55 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports,
 unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)];
 
 struct shash_node *shash_node;
+const struct binding_lport *lport;
 SHASH_FOR_EACH (shash_node, binding_lports) {
 sset_add(_users, shash_node->name);
+
+/* Zone for committing dropped connections of a vNIC. */
+lport = shash_node->data;
+char *drop_zone = alloc_ct_zone_key(>pb->header_.uuid, "drop");
+sset_add(_users, drop_zone);
+free(drop_zone);
 }
 
 /* Local patched datapath (gateway routers) need zones assigned. */
@@ -670,8 +677,8 @@ update_ct_zones(const struct shash *binding_lports,
 HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
 /* XXX Add method to limit zone assignment to logical router
  * datapaths with NAT */
-char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
-char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat");
+char *dnat = alloc_ct_zone_key(>datapath->header_.uuid, "dnat");
+char *snat = alloc_ct_zone_key(>datapath->header_.uuid, "snat");
 sset_add(_users, dnat);
 sset_add(_users, snat);
 shash_add(_lds, dnat, ld);
@@ -2090,7 +2097,7 @@ ct_zones_datapath_binding_handler(struct engine_node 
*node, void *data)
 /* Check if the requested snat zone has changed for the datapath
  * or not.  If so, then fall back to full recompute of
  * ct_zone engine. */
-char *snat_dp_zone_key = alloc_nat_zone_key(>header_.uuid, "snat");
+char *snat_dp_zone_key = alloc_ct_zone_key(>header_.uuid, "snat");
 struct simap_node *simap_node = simap_find(_zones_data->current,
snat_dp_zone_key);
 free(snat_dp_zone_key);
@@ -2148,6 +2155,16 @@ ct_zones_runtime_data_handler(struct engine_node *node, 
void *data)
 _zones_data->pending);
 updated = true;
 }
+char *drop_zone = alloc_ct_zone_key(
+_lport->pb->header_.uuid, "drop");
+if (!simap_contains(_zones_data->current, drop_zone)) {
+alloc_id_to_ct_zone(drop_zone,
+_zones_data->current,
+ct_zones_data->bitmap, _start,
+_zones_data->pending);
+updated = true;
+}
+free(drop_zone);
 } else if (t_lport->tracked_type == TRACKED_RESOURCE_REMOVED) {
 struct simap_node *ct_zone =
 simap_find(_zones_data->current,
diff

Re: [ovs-dev] [PATCH ovn v2] northd: Determine gateway port for NAT when not specified

2022-06-09 Thread Abhiram Sangana
Thank you for reviewing the patch, Mark and Han.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: Determine gateway port for NAT when not specified

2022-05-24 Thread Abhiram Sangana
Hi Mark,

Thanks for reviewing the patch. I have made the suggested changes and sent out 
a v2 patch.

Thanks,
Abhiram Sangana

On 19 May 2022, at 21:06, Mark Michelson 
mailto:mmich...@redhat.com>> wrote:

Hi Abhiram,

This is a great idea. I only have a couple of minor comments below.


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] northd: Determine gateway port for NAT when not specified

2022-05-24 Thread Abhiram Sangana
If multiple distributed gateway ports (DGP) are configured on a
logical router, "gateway_port" column needs to be set for NAT entries
of the logical router for the rules to be applied, as part of commit
2d942be (northd: Add support for NAT with multiple DGP).

This patch updates the behavior by inferring the DGP where the NAT
rule needs to be applied based on the "external_ip" column of the
NAT rule when "gateway_port" column is not set.

Signed-off-by: Abhiram Sangana 
---
 lib/ovn-util.c|  56 
 lib/ovn-util.h|   2 +
 northd/northd.c   | 109 --
 northd/ovn-northd.8.xml   |  19 ---
 ovn-nb.xml|  19 ---
 tests/ovn-nbctl.at|  47 
 tests/ovn-northd.at   | 102 ++-
 utilities/ovn-nbctl.8.xml |   5 +-
 utilities/ovn-nbctl.c |  57 +---
 9 files changed, 260 insertions(+), 156 deletions(-)

diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index 81f18d6df..616999eab 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -370,6 +370,62 @@ destroy_lport_addresses(struct lport_addresses *laddrs)
 free(laddrs->ipv6_addrs);
 }
 
+/* Returns a string of the IP address of 'laddrs' that overlaps with 'ip_s'.
+ * If one is not found, returns NULL.
+ *
+ * The caller must not free the returned string. */
+const char *
+find_lport_address(const struct lport_addresses *laddrs, const char *ip_s)
+{
+bool is_ipv4 = strchr(ip_s, '.') ? true : false;
+
+if (is_ipv4) {
+ovs_be32 ip;
+
+if (!ip_parse(ip_s, )) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_WARN_RL(, "bad ip address %s", ip_s);
+return NULL;
+}
+
+for (int i = 0; i < laddrs->n_ipv4_addrs; i++) {
+const struct ipv4_netaddr *na = >ipv4_addrs[i];
+
+if (!((na->network ^ ip) & na->mask)) {
+/* There should be only 1 interface that matches the
+ * supplied IP.  Otherwise, it's a configuration error,
+ * because subnets of a router's interfaces should NOT
+ * overlap. */
+return na->addr_s;
+}
+}
+} else {
+struct in6_addr ip6;
+
+if (!ipv6_parse(ip_s, )) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_WARN_RL(, "bad ipv6 address %s", ip_s);
+return NULL;
+}
+
+for (int i = 0; i < laddrs->n_ipv6_addrs; i++) {
+const struct ipv6_netaddr *na = >ipv6_addrs[i];
+struct in6_addr xor_addr = ipv6_addr_bitxor(>network, );
+struct in6_addr and_addr = ipv6_addr_bitand(_addr, >mask);
+
+if (ipv6_is_zero(_addr)) {
+/* There should be only 1 interface that matches the
+ * supplied IP.  Otherwise, it's a configuration error,
+ * because subnets of a router's interfaces should NOT
+ * overlap. */
+return na->addr_s;
+}
+}
+}
+
+return NULL;
+}
+
 /* Go through 'addresses' and add found IPv4 addresses to 'ipv4_addrs' and
  * IPv6 addresses to 'ipv6_addrs'. */
 void
diff --git a/lib/ovn-util.h b/lib/ovn-util.h
index 024b86b89..b3905ef7b 100644
--- a/lib/ovn-util.h
+++ b/lib/ovn-util.h
@@ -86,6 +86,8 @@ bool extract_lrp_networks__(char *mac, char **networks, 
size_t n_networks,
 
 bool lport_addresses_is_empty(struct lport_addresses *);
 void destroy_lport_addresses(struct lport_addresses *);
+const char *find_lport_address(const struct lport_addresses *laddrs,
+   const char *ip_s);
 
 void split_addresses(const char *addresses, struct svec *ipv4_addrs,
  struct svec *ipv6_addrs);
diff --git a/northd/northd.c b/northd/northd.c
index 51dec36b3..81473d93a 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -1965,6 +1965,23 @@ ipam_add_port_addresses(struct ovn_datapath *od, struct 
ovn_port *op)
 }
 }
 
+static const char *find_lrp_member_ip(const struct ovn_port *op,
+  const char *ip_s);
+
+/* Returns true if the given router port 'op' (assumed to be a distributed
+ * gateway port) is the relevant DGP where the NAT rule of the router needs to
+ * be applied. */
+static bool
+is_nat_gateway_port(const struct nbrec_nat *nat, const struct ovn_port *op)
+{
+if (op->od->n_l3dgw_ports > 1
+&& ((!nat->gateway_port && !find_lrp_member_ip(op, nat->external_ip))
+|| (nat->gateway_port && nat->gateway_port != op->nbrp))) {
+return false;
+}
+return true;
+}
+
 enum dynamic_update_type {
 NONE,/* No change to the address */
 REMOVE,  /* Address is no longer dynamic 

[ovs-dev] [PATCH ovn] northd: Determine gateway port for NAT when not specified

2022-05-05 Thread Abhiram Sangana
If multiple distributed gateway ports (DGP) are configured on a
logical router, "gateway_port" column needs to be set for NAT entries
of the logical router for the rules to be applied, as part of commit
2d942be (northd: Add support for NAT with multiple DGP).

This patch updates the behavior by inferring the DGP where the NAT
rule needs to be applied based on the "external_ip" column of the
NAT rule when "gateway_port" column is not set.

Signed-off-by: Abhiram Sangana 
---
 northd/northd.c   |  49 --
 northd/ovn-northd.8.xml   |  19 ---
 ovn-nb.xml|  19 ---
 tests/ovn-nbctl.at|  47 ++
 tests/ovn-northd.at   | 102 --
 tests/ovn.at  |   2 +-
 utilities/ovn-nbctl.8.xml |   5 +-
 utilities/ovn-nbctl.c |  94 ---
 8 files changed, 228 insertions(+), 109 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index a5297..aa1c2ae05 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -2729,14 +2729,17 @@ join_logical_ports(struct northd_input *input_data,
 }
 }
 
+static const char *find_lrp_member_ip(const struct ovn_port *op,
+  const char *ip_s);
+
 /* Returns an array of strings, each consisting of a MAC address followed
  * by one or more IP addresses, and if the port is a distributed gateway
  * port, followed by 'is_chassis_resident("LPORT_NAME")', where the
  * LPORT_NAME is the name of the L3 redirect port or the name of the
  * logical_port specified in a NAT rule. These strings include the
- * external IP addresses of NAT rules defined on that router which have
- * gateway_port not set or have gateway_port as the router port 'op', and all
- * of the IP addresses used in load balancer VIPs defined on that router.
+ * external IP addresses of NAT rules defined on that router whose
+ * gateway_port is router port 'op', and all of the IP addresses used in
+ * load balancer VIPs defined on that router.
  *
  * The caller must free each of the n returned strings with free(),
  * and must free the returned array when it is no longer needed. */
@@ -2779,7 +2782,9 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 
 /* Not including external IP of NAT rules whose gateway_port is
  * not 'op'. */
-if (nat->gateway_port && nat->gateway_port != op->nbrp) {
+if (op->od->n_l3dgw_ports > 1 &&
+((!nat->gateway_port && !find_lrp_member_ip(op, nat->external_ip))
+ || (nat->gateway_port && nat->gateway_port != op->nbrp))) {
 continue;
 }
 
@@ -3454,8 +3459,12 @@ ovn_port_update_sbrec(struct northd_input *input_data,
 struct ds nat_addr = DS_EMPTY_INITIALIZER;
 ds_put_format(_addr, "%s", nat_addresses);
 if (l3dgw_ports) {
+const struct ovn_port *l3dgw_port = (
+is_l3dgw_port(op->peer)
+? op->peer
+: op->peer->od->l3dgw_ports[0]);
 ds_put_format(_addr, " is_chassis_resident(%s)",
-op->peer->od->l3dgw_ports[0]->cr_port->json_key);
+l3dgw_port->cr_port->json_key);
 }
 nats[0] = xstrdup(ds_cstr(_addr));
 ds_destroy(_addr);
@@ -10502,9 +10511,11 @@ build_lrouter_port_nat_arp_nd_flow(struct ovn_port *op,
 const struct nbrec_nat *nat = nat_entry->nb;
 struct ds match = DS_EMPTY_INITIALIZER;
 
-/* ARP/ND should be sent from distributed gateway port specified in
+/* ARP/ND should be sent from distributed gateway port relevant to
  * the NAT rule. */
-if (nat->gateway_port && nat->gateway_port != op->nbrp) {
+if (op->od->n_l3dgw_ports > 1 &&
+((!nat->gateway_port && !find_lrp_member_ip(op, nat->external_ip)) ||
+ (nat->gateway_port && nat->gateway_port != op->nbrp))) {
 return;
 }
 
@@ -13287,14 +13298,24 @@ lrouter_check_nat_entry(struct ovn_datapath *od, 
const struct nbrec_nat *nat,
 /* Validate gateway_port of NAT rule. */
 *nat_l3dgw_port = NULL;
 if (nat->gateway_port == NULL) {
-if (od->n_l3dgw_ports > 1) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-VLOG_WARN_RL(, "NAT configured on logical router: %s with"
- "multiple distributed gateway ports needs to specify"
- "valid gateway_port.", od->nbr->name);
-return -EINVAL;
-} else if (od-&g

Re: [ovs-dev] [PATCH ovn v5] northd: Add support for NAT with multiple DGP

2022-04-06 Thread Abhiram Sangana
Thanks for reviewing, Mark. Can we merge the patch if it looks good?

Thanks,
Abhiram Sangana

> On 4 Apr 2022, at 21:05, Mark Michelson  wrote:
> 
> Hi Abhriam,
> 
> Thanks for you patience on this. It looks good by me.
> 
> Acked-by: Mark Michelson 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] northd: Add support for NAT with multiple DGP

2022-04-01 Thread Abhiram Sangana
Hi Numan,

Uploaded v5 patch making the column as weak reference and fixing the tests with 
—sync=sb.
V4 patch was already ignoring a NAT rule if gateway_port is not set and LR has 
multiple DGP.
The behaviour is same in v5.

Will send out a patch to determine gateway port for a NAT rule when it is not 
specified soon.

Thanks,
Abhiram

On 31 Mar 2022, at 15:51, Numan Siddique 
mailto:num...@ovn.org>> wrote:

I'm fine if you can
 - make the column as weak reference
 - and if gateway_port is not set, then ignore that NAT entry in ovn-northd

And in the  subsequent patch determine the gateway port if not specified.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5] northd: Add support for NAT with multiple DGP

2022-04-01 Thread Abhiram Sangana
Currently, if multiple distributed gateway ports (DGP) are configured
on a logical router, NAT is disabled as part of commit 15348b7
(northd: Multiple distributed gateway port support.)

This patch adds a new column called "gateway_port" in NAT table that
references a distributed gateway port in the Logical_Router_Port
table. A NAT rule is only applied on matching packets entering or
leaving the DGP configured for the rule, when a router has
multiple DGPs.

If a router has a single DGP, NAT rules are applied at the DGP even if
the "gateway_port" column is not set. It is an error to not set this
column for a NAT rule when the router has multiple DGPs.

This patch also updates the NAT commands in ovn-nbctl to support the
new column.

Signed-off-by: Abhiram Sangana 
---
 NEWS  |   1 +
 northd/northd.c   | 184 +--
 northd/ovn-northd.8.xml   |  27 +++---
 ovn-architecture.7.xml|   6 +-
 ovn-nb.ovsschema  |  10 +-
 ovn-nb.xml|  37 ++-
 tests/ovn-nbctl.at| 197 +-
 tests/ovn-northd.at   | 176 --
 tests/ovn.at  |   2 +-
 utilities/ovn-nbctl.8.xml |  48 ++
 utilities/ovn-nbctl.c | 170 +++-
 11 files changed, 651 insertions(+), 207 deletions(-)

diff --git a/NEWS b/NEWS
index 3e8358723..c881764a6 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,7 @@ Post v22.03.0
 different OVN Interconnection availability zones.
   - Replaced the usage of masked ct_label by ct_mark in most cases to work
 better with hardware-offloading.
+  - Support NAT for logical routers with multiple distributed gateway ports.
 
 OVN v22.03.0 - 11 Mar 2022
 --
diff --git a/northd/northd.c b/northd/northd.c
index 2fb0a93c2..ad0e8b95d 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -617,11 +617,11 @@ struct ovn_datapath {
 
 /* Applies to only logical router datapath.
  * True if logical router is a gateway router. i.e options:chassis is set.
- * If this is true, then 'l3dgw_port' will be ignored. */
+ * If this is true, then 'l3dgw_ports' will be ignored. */
 bool is_gw_router;
 
-/* OVN northd only needs to know about the logical router gateway port for
- * NAT on a distributed router.  The "distributed gateway ports" are
+/* OVN northd only needs to know about logical router gateway ports for
+ * NAT/LB on a distributed router.  The "distributed gateway ports" are
  * populated only when there is a gateway chassis or ha chassis group
  * specified for some of the ports on the logical router. Otherwise this
  * will be NULL. */
@@ -2713,8 +2713,9 @@ join_logical_ports(struct northd_input *input_data,
  * by one or more IP addresses, and if the port is a distributed gateway
  * port, followed by 'is_chassis_resident("LPORT_NAME")', where the
  * LPORT_NAME is the name of the L3 redirect port or the name of the
- * logical_port specified in a NAT rule.  These strings include the
- * external IP addresses of all NAT rules defined on that router, and all
+ * logical_port specified in a NAT rule. These strings include the
+ * external IP addresses of NAT rules defined on that router which have
+ * gateway_port not set or have gateway_port as the router port 'op', and all
  * of the IP addresses used in load balancer VIPs defined on that router.
  *
  * The caller must free each of the n returned strings with free(),
@@ -2727,8 +2728,7 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 struct eth_addr mac;
 if (!op || !op->nbrp || !op->od || !op->od->nbr
 || (!op->od->nbr->n_nat && !op->od->has_lb_vip)
-|| !eth_addr_from_string(op->nbrp->mac, )
-|| op->od->n_l3dgw_ports > 1) {
+|| !eth_addr_from_string(op->nbrp->mac, )) {
 *n = n_nats;
 return NULL;
 }
@@ -2757,6 +2757,12 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 continue;
 }
 
+/* Not including external IP of NAT rules whose gateway_port is
+ * not 'op'. */
+if (nat->gateway_port && nat->gateway_port != op->nbrp) {
+continue;
+}
+
 /* Determine whether this NAT rule satisfies the conditions for
  * distributed NAT processing. */
 if (op->od->n_l3dgw_ports && !strcmp(nat->type, "dnat_and_snat")
@@ -2827,9 +2833,9 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 if (central_ip_address) {
 /* Gratuitous ARP for centralized NAT rules on distributed gateway
  * ports should be restricted to the gateway chassis. */
-if (op->od->n_l3dgw_ports) {
+if (

Re: [ovs-dev] [PATCH ovn v4] northd: Add support for NAT with multiple DGP

2022-03-31 Thread Abhiram Sangana
Hi Numan,

Is it ok if we make the “gateway_port” column optional in a subsequent patch?
Will send out a v5 patch that addresses other comments.

Thanks,
Abhiram Sangana

> On 30 Mar 2022, at 19:32, Numan Siddique  wrote:
> 
> On Wed, Mar 30, 2022 at 1:41 PM Abhiram Sangana
>  wrote:
>> 
>> Hi Numan,
>> 
>> Thanks for reviewing the patch
>> 
>> On 30 Mar 2022, at 16:55, Numan Siddique 
>> mailto:num...@ovn.org>> wrote:
>> 
>> I think the "gateway_port" column can be a weak reference  to
>> "Logical_Router_Port".
>> Otherwise CMS cannot delete the logical router port unless it clears
>> this column or deletes the NAT row.
>> 
>> My bad. Will change this column to be a weak reference to LRP.
> 
> No worries.  I guess it would make sense to have a strong reference if
> we go with the approach to make it mandatory for CMS
> to specify gateway_port.
> 
> But since we can determine the gateway_port ourselves,  this can be
> optional and weak reference.
>> 
>> 
>> IMO, this column should be made optional.  If CMS doesn''t specify the
>> gateway port to use for a NAT entry,
>> ovn-northd can figure out which router port is reachable for the
>> "external_ip" of the NAT.
>> ovn-northd already does this for the Logical_Router_Static_Route's
>> output_port column.
>> 
>> 
>> I'd suggest making this column optional.  What do you think ?
>> 
>> Got it. So, if we make this column optional, is this the behaviour
>> that we want?
>> 
>> If LR has a single DGP and “gateway_port” is not specified, we program
>> the NAT rule with the DGP even if "external_ip" of the rule is not in
>> the LRP networks.
>> 
>> If LR has multiple DGPs and "gateway_port" is not specified, we try
>> to determine which DGP is appropriate based on the LRP networks and
>> "external_ip" of the rule. If no DGP matches, we do not program the
>> NAT rule.
>> 
>> Is the current behaviour of ovn-nbctl ok?
> 
> Yes.  But I'd suggest ovn-northd determine the appropriate DGP and not 
> ovn-nbctl
> if CMS doesn't specify the gateway_port.
> 
> Numan
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] northd: Add support for NAT with multiple DGP

2022-03-30 Thread Abhiram Sangana
Hi Numan,

Thanks for reviewing the patch.

> On 30 Mar 2022, at 16:55, Numan Siddique  wrote:
> 
> 
> I think the "gateway_port" column can be a weak reference  to
> "Logical_Router_Port".
> Otherwise CMS cannot delete the logical router port unless it clears
> this column or deletes the NAT row.

My bad. Will change this column to be a weak reference to LRP.

> IMO, this column should be made optional.  If CMS doesn''t specify the
> gateway port to use for a NAT entry,
> ovn-northd can figure out which router port is reachable for the
> "external_ip" of the NAT.
> ovn-northd already does this for the Logical_Router_Static_Route's
> output_port column.
> 
> Please check this out -
> 
> I'd suggest making this column optional.  What do you think ?
> 

Got it. So, if we make this column optional, is this the behaviour
that we want?

If LR has a single DGP and “gateway_port” is not specified, we program
the NAT rule with the DGP even if "external_ip" of the rule is not in
the LRP networks.

If LR has multiple DGPs and "gateway_port" is not specified, we try
to determine which DGP is appropriate based on the LRP networks and
"external_ip" of the rule. If no DGP matches, we do not program the
NAT rule.

> 
> There can be odd timing issues in CI.  So I'd suggest adding --wait=sb
> or --wait=hv  in the  ovn-nbctl command just before AT_CHECK is
> called.
> 
> There are many places in the test case you can do the same i..e use
> --wait=sb/hv.
> 
> Can you please check and add this in the relevant places ?

Sure, will fix this.

Thanks,
Abhiram Sangana
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] northd: Add support for NAT with multiple DGP

2022-03-30 Thread Abhiram Sangana
Hi Numan,

Thanks for reviewing the patch

On 30 Mar 2022, at 16:55, Numan Siddique 
mailto:num...@ovn.org>> wrote:

I think the "gateway_port" column can be a weak reference  to
"Logical_Router_Port".
Otherwise CMS cannot delete the logical router port unless it clears
this column or deletes the NAT row.

My bad. Will change this column to be a weak reference to LRP.


IMO, this column should be made optional.  If CMS doesn''t specify the
gateway port to use for a NAT entry,
ovn-northd can figure out which router port is reachable for the
"external_ip" of the NAT.
ovn-northd already does this for the Logical_Router_Static_Route's
output_port column.

Please check this out -
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_blob_main_northd_northd.c-23L9327=DwIFaQ=s883GpUCOChKOHiocYtGcg=RotwPXnAckhnfEzdWgnpPR0nnZ46Y0RYo-uUGaS4vXY=JI0ykaKcvbB2_KQ9SVReBzWtqdpVkyfUf4yA4lHeJW7p7AGEj0-UcEijs3jMkRpS=htB40Gb52UEOPhtBUlXqtMHZq12vpII3d-eHclbBqlM=

I'd suggest making this column optional.  What do you think ?

Got it. So, if we make this column optional, is this the behaviour
that we want?

If LR has a single DGP and “gateway_port” is not specified, we program
the NAT rule with the DGP even if "external_ip" of the rule is not in
the LRP networks.

If LR has multiple DGPs and "gateway_port" is not specified, we try
to determine which DGP is appropriate based on the LRP networks and
"external_ip" of the rule. If no DGP matches, we do not program the
NAT rule.

Is the current behaviour of ovn-nbctl ok?


There can be odd timing issues in CI.  So I'd suggest adding --wait=sb
or --wait=hv  in the  ovn-nbctl command just before AT_CHECK is
called.

There are many places in the test case you can do the same i..e use
--wait=sb/hv.

Can you please check and add this in the relevant places ?

Sure, will fix this.

Thanks,
Abhiram Sangana
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v4] northd: Add support for NAT with multiple DGP

2022-03-24 Thread Abhiram Sangana
Currently, if multiple distributed gateway ports (DGP) are configured
on a logical router, NAT is disabled as part of commit 15348b7
(northd: Multiple distributed gateway port support.)

This patch adds a new column called "gateway_port" in NAT table that
references a distributed gateway port in the Logical_Router_Port
table. A NAT rule is only applied on matching packets entering or
leaving the DGP configured for the rule, when a router has
multiple DGPs.

If a router has a single DGP, NAT rules are applied at the DGP even if
the "gateway_port" column is not set. It is an error to not set this
column for a NAT rule when the router has multiple DGPs.

This patch also updates the NAT commands in ovn-nbctl to support the
new column.

Signed-off-by: Abhiram Sangana 
---
 NEWS  |   1 +
 northd/northd.c   | 184 +--
 northd/ovn-northd.8.xml   |  27 +++---
 ovn-architecture.7.xml|   6 +-
 ovn-nb.ovsschema  |  10 +-
 ovn-nb.xml|  37 ++-
 tests/ovn-nbctl.at| 197 +-
 tests/ovn-northd.at   | 172 +++--
 tests/ovn.at  |   2 +-
 utilities/ovn-nbctl.8.xml |  48 ++
 utilities/ovn-nbctl.c | 170 +++-
 11 files changed, 647 insertions(+), 207 deletions(-)

diff --git a/NEWS b/NEWS
index 5ba9c3cf4..458db45a0 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,7 @@ Post v22.03.0
 -
   - Support IGMP and MLD snooping on transit logical switches that connect
 different OVN Interconnection availability zones.
+  - Support NAT for logical routers with multiple distributed gateway ports.
 
 OVN v22.03.0 - XX XXX 
 --
diff --git a/northd/northd.c b/northd/northd.c
index a2cf8d6fc..df890d97d 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -606,11 +606,11 @@ struct ovn_datapath {
 
 /* Applies to only logical router datapath.
  * True if logical router is a gateway router. i.e options:chassis is set.
- * If this is true, then 'l3dgw_port' will be ignored. */
+ * If this is true, then 'l3dgw_ports' will be ignored. */
 bool is_gw_router;
 
-/* OVN northd only needs to know about the logical router gateway port for
- * NAT on a distributed router.  The "distributed gateway ports" are
+/* OVN northd only needs to know about logical router gateway ports for
+ * NAT/LB on a distributed router.  The "distributed gateway ports" are
  * populated only when there is a gateway chassis or ha chassis group
  * specified for some of the ports on the logical router. Otherwise this
  * will be NULL. */
@@ -2702,8 +2702,9 @@ join_logical_ports(struct northd_input *input_data,
  * by one or more IP addresses, and if the port is a distributed gateway
  * port, followed by 'is_chassis_resident("LPORT_NAME")', where the
  * LPORT_NAME is the name of the L3 redirect port or the name of the
- * logical_port specified in a NAT rule.  These strings include the
- * external IP addresses of all NAT rules defined on that router, and all
+ * logical_port specified in a NAT rule. These strings include the
+ * external IP addresses of NAT rules defined on that router which have
+ * gateway_port not set or have gateway_port as the router port 'op', and all
  * of the IP addresses used in load balancer VIPs defined on that router.
  *
  * The caller must free each of the n returned strings with free(),
@@ -2716,8 +2717,7 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 struct eth_addr mac;
 if (!op || !op->nbrp || !op->od || !op->od->nbr
 || (!op->od->nbr->n_nat && !op->od->has_lb_vip)
-|| !eth_addr_from_string(op->nbrp->mac, )
-|| op->od->n_l3dgw_ports > 1) {
+|| !eth_addr_from_string(op->nbrp->mac, )) {
 *n = n_nats;
 return NULL;
 }
@@ -2746,6 +2746,12 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 continue;
 }
 
+/* Not including external IP of NAT rules whose gateway_port is
+ * not 'op'. */
+if (nat->gateway_port && nat->gateway_port != op->nbrp) {
+continue;
+}
+
 /* Determine whether this NAT rule satisfies the conditions for
  * distributed NAT processing. */
 if (op->od->n_l3dgw_ports && !strcmp(nat->type, "dnat_and_snat")
@@ -2816,9 +2822,9 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 if (central_ip_address) {
 /* Gratuitous ARP for centralized NAT rules on distributed gateway
  * ports should be restricted to the gateway chassis. */
-if (op->od->n_l3dgw_ports) {
+if (is_l3dgw_port(op)) {
 ds_put_format(_ad

Re: [ovs-dev] [PATCH ovn v3] northd: Add support for NAT with multiple DGP

2022-03-22 Thread Abhiram Sangana



> On 22 Mar 2022, at 14:31, Mark Michelson  wrote:
> 
> No, that was not my intention. Basically, the following forms would never 
> return an error:
> 
> ovn-nbctl lr-nat-del my_router dnat (type only)
> ovn-nbctl lr-nat-del my_router dnat 172.16.0.1 (type and IP)
> ovn-nbctl lr-nat-del my_router dnat my_router_gateway_port1 (type and gateway 
> port)
> 
> But the following form could return an error
> 
> ovn-nbctl lr-nat-del my_router dnat 172.16.0.1 my_router_gateway_port1 (type, 
> IP, and gateway port specified)

My bad - Got it.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] northd: Add support for NAT with multiple DGP

2022-03-22 Thread Abhiram Sangana


> On 21 Mar 2022, at 19:49, Mark Michelson  wrote:
> 
> I'm a bit surprised that with this change, the hierarchy for specificity is 
> TYPE < IP < GATEWAY_PORT . This makes it sound as though the primary use case 
> would be to use the same IP for multiple NAT rules across different gateway 
> ports. Wouldn't it be just as likely that the same gateway port would be used 
> for multiple NAT rules all with different IPs?
Yes, that makes sense.

> 
>   ovn-nbctl lr-nat-del [TYPE [IP] [GATEWAY_PORT]]
> 
> I think you should leave it the way you have it since it is easiest to 
> explain (deleting multiple rules == never raise an error, deleting a specific 
> rule == raise an error if it doesn't exist). Otherwise, the nuances are 
> difficult to explain and difficult to maintain.

This syntax looks good. So, with this syntax, we never expect to match a single 
NAT rule and hence, we would not need —if_exists, right? I will retain the arg 
but we might never hit that case.

Thanks,
Abhiram Sangana
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] northd: Add support for NAT with multiple DGP

2022-03-15 Thread Abhiram Sangana
Hey Mark,

Thanks for reviewing the patch. Regarding `ovn-nbctl lr-nat-del`, I
have updated the command to have the following structure:
`lr-nat-del ROUTER [TYPE [IP [GATEWAY_PORT]]]`. Most of the
earlier checks are not enforced unless `GATEWAY_PORT` is also
passed.

With the new patch, a NAT rule is no longer uniquely identified by
`TYPE` and `IP`. The user also needs to pass `GATEWAY_PORT` to
uniquely identify a NAT rule. So, I have updated the behavior of
`lr-nat-del` to not raise an error when `IP` is passed but no NAT
rules match, similar to its behavior when `TYPE` is passed and no
NAT rules match. Also, with the new patch, if `IP` is passed without
passing `GATEWAY_PORT`, all NAT rules (possibly None) with the
`TYPE` and `IP` values are deleted irrespective of their
`GATEWAY_PORT` value.

Should we retain the previous behavior of `lr-nat-del` when LR has a
single DGP - raise an error if no NAT rules match when `IP` is passed?

On 14 Mar 2022, at 20:55, Mark Michelson 
mailto:mmich...@redhat.com>> wrote:

Hi Abhiram,

I had a look through the code and I'm happy with how it looks. I also did a 
quick check through the testsuite and it all seems good. All that being said,

I nearly acked this, but I have one question down below in the test code. It 
may indicate some issue in `ovn-nbctl lr-nat-del` so I'd like to get that 
cleared up first.


-AT_CHECK([ovn-nbctl lr-nat-del lr0 dnat_and_snat 30.0.0.3], [1], [],
-[ovn-nbctl: no matching NAT with the type (dnat_and_snat) and external_ip 
(30.0.0.3)
-])
-AT_CHECK([ovn-nbctl lr-nat-del lr0 dnat 30.0.0.2], [1], [],
-[ovn-nbctl: no matching NAT with the type (dnat) and external_ip (30.0.0.2)
-])
-AT_CHECK([ovn-nbctl lr-nat-del lr0 snat 192.168.10.0/24], [1], [],
-[ovn-nbctl: no matching NAT with the type (snat) and logical_ip 
(192.168.10.0/24)
-])
-AT_CHECK([ovn-nbctl --if-exists lr-nat-del lr0 snat 192.168.10.0/24])
+AT_CHECK([ovn-nbctl lr-nat-del lr0 dnat_and_snat 30.0.0.3])

I don't understand the above change. First, I don't understand why many of the 
lr-nat-del checks were removed. Second, I don't understand why the lr-nat-del 
of 30.0.0.3 doesn't return an error. There is no NAT with that address, so 
shouldn't that return an error?

   AT_CHECK([ovn-nbctl lr-nat-del lr0 dnat_and_snat 30.0.0.1])
 AT_CHECK([ovn-nbctl lr-nat-del lr0 dnat_and_snat fd01::1])


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v3] northd: Add support for NAT with multiple DGP

2022-03-07 Thread Abhiram Sangana
Currently, if multiple distributed gateway ports (DGP) are configured
on a logical router, NAT is disabled as part of commit 15348b7
(northd: Multiple distributed gateway port support.)

This patch adds a new column called "gateway_port" in NAT table that
references a distributed gateway port in the Logical_Router_Port
table. A NAT rule is only applied on matching packets entering or
leaving the DGP configured for the rule, when a router has
multiple DGPs.

If a router has a single DGP, NAT rules are applied at the DGP even if
the "gateway_port" column is not set. It is an error to not set this
column for a NAT rule when the router has multiple DGPs.

This patch also updates the NAT commands in ovn-nbctl to support the
new column.

Signed-off-by: Abhiram Sangana 
---
 NEWS  |   1 +
 northd/northd.c   | 182 +++---
 northd/ovn-northd.8.xml   |  27 +++---
 ovn-architecture.7.xml|   6 +-
 ovn-nb.ovsschema  |  10 ++-
 ovn-nb.xml|  37 +++-
 tests/ovn-nbctl.at| 172 +--
 tests/ovn-northd.at   | 166 +-
 tests/ovn.at  |   2 +-
 utilities/ovn-nbctl.8.xml |  32 ---
 utilities/ovn-nbctl.c | 151 ++-
 11 files changed, 592 insertions(+), 194 deletions(-)

diff --git a/NEWS b/NEWS
index 9648f6cb2..764b2a03e 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
 Post v22.03.0
 -
+  - Support NAT with multiple distributed gateway ports on a logical router.
 
 OVN v22.03.0 - XX XXX 
 --
diff --git a/northd/northd.c b/northd/northd.c
index 294a59bd7..351f41134 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -604,11 +604,11 @@ struct ovn_datapath {
 
 /* Applies to only logical router datapath.
  * True if logical router is a gateway router. i.e options:chassis is set.
- * If this is true, then 'l3dgw_port' will be ignored. */
+ * If this is true, then 'l3dgw_ports' will be ignored. */
 bool is_gw_router;
 
-/* OVN northd only needs to know about the logical router gateway port for
- * NAT on a distributed router.  The "distributed gateway ports" are
+/* OVN northd only needs to know about logical router gateway ports for
+ * NAT/LB on a distributed router.  The "distributed gateway ports" are
  * populated only when there is a gateway chassis or ha chassis group
  * specified for some of the ports on the logical router. Otherwise this
  * will be NULL. */
@@ -761,16 +761,6 @@ init_nat_entries(struct ovn_datapath *od)
 return;
 }
 
-if (od->n_l3dgw_ports > 1) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-VLOG_WARN_RL(, "NAT is configured on logical router %s, which has %"
- PRIuSIZE" distributed gateway ports. NAT is not supported"
- " yet when there is more than one distributed gateway "
- "port on the router.",
- od->nbr->name, od->n_l3dgw_ports);
-return;
-}
-
 od->nat_entries = xmalloc(od->nbr->n_nat * sizeof *od->nat_entries);
 
 for (size_t i = 0; i < od->nbr->n_nat; i++) {
@@ -2704,8 +2694,9 @@ join_logical_ports(struct northd_input *input_data,
  * by one or more IP addresses, and if the port is a distributed gateway
  * port, followed by 'is_chassis_resident("LPORT_NAME")', where the
  * LPORT_NAME is the name of the L3 redirect port or the name of the
- * logical_port specified in a NAT rule.  These strings include the
- * external IP addresses of all NAT rules defined on that router, and all
+ * logical_port specified in a NAT rule. These strings include the
+ * external IP addresses of NAT rules defined on that router which have
+ * gateway_port not set or have gateway_port as the router port 'op', and all
  * of the IP addresses used in load balancer VIPs defined on that router.
  *
  * The caller must free each of the n returned strings with free(),
@@ -2718,8 +2709,7 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 struct eth_addr mac;
 if (!op || !op->nbrp || !op->od || !op->od->nbr
 || (!op->od->nbr->n_nat && !op->od->has_lb_vip)
-|| !eth_addr_from_string(op->nbrp->mac, )
-|| op->od->n_l3dgw_ports > 1) {
+|| !eth_addr_from_string(op->nbrp->mac, )) {
 *n = n_nats;
 return NULL;
 }
@@ -2748,6 +2738,12 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only,
 continue;
 }
 
+/* Not including external IP of NAT rules whose gateway_port is
+ * not 'op'. */
+if (nat->gateway_port && nat->gateway_port != op->nbrp) {
+ 

Re: [ovs-dev] [PATCH ovn v2] northd: Add support for NAT with multiple DGP

2022-01-07 Thread Abhiram Sangana
Hi Mark,

Thanks for looking into this patch.

> On 5 Jan 2022, at 22:01, Mark Michelson  wrote:
> 
> Hi,
> 
> I haven't done a full review of this patch, but I have noticed a problem 
> pretty early on when I started looking.
> 
> The new ip_in_lrp_networks() function that is added here is intended to 
> determine which l3dgw port to use for a particular NAT external address. The 
> problem is that there are configurations, especially from OpenStack, where 
> this will cause problems.
> 
> In current OpenStack configurations, they might set up a router with a 
> gateway port that serves the 10.0.0.0/8 network. But they might set up a DNAT 
> with external address 192.168.0.1 on that router. They expect that traffic 
> sent to 192.168.0.1 will be NATted correctly on that router, even though the 
> router port network does not include that IP address.

Yeah, this makes sense. I was thinking about this case for Load balancer rules 
but it’s true for DNAT in general.

> If you run `make check-system-userspace` with your series, you'll find that 
> the "Floating IP outside router subnet IPv4" test fails.
> 
> How could this be fixed?
> 
> One idea is to configure the external port explicitly either on the NAT 
> itself or on the router. This way, even if the IP address is outside the 
> subnet of the DGP, you can still know which DGP is the "correct" one.
> 
> Another idea would be to differentiate behavior between SNAT and DNAT. For 
> DNAT, you could treat all DGPs as equals, so it doesn't matter on which DGP 
> the traffic is received, it will get translated properly. For SNAT or 
> DNAT-and-SNAT, you'd probably need to explicitly specify which DGP to use, 
> though.

I was thinking of the second approach for Load balancer rules. I could do the 
same for DNAT rules. The first approach gives more control but expects the user 
to configure correctly. Which approach would you suggest?

Thanks,
Abhiram
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] northd: Add support for NAT with multiple DGP

2022-01-05 Thread Abhiram Sangana
Currently, if multiple distributed gateway ports (DGP) are configured
on a logical router, NAT is disabled as part of commit 15348b7b
(northd: Multiple distributed gateway port support.)

This patch updates the behavior by selectively applying NAT rules at DGPs.
A NAT rule is applied on matching packets entering or leaving a specific
DGP only if the external_ip of the rule belongs to the same subnet as the
DGP.

This patch also updates ovn-nbctl to accept multiple NAT rules of type
`snat` with the same logical_ip but different external_ip for a logical
router.

Signed-off-by: Abhiram Sangana 
---
 NEWS  |   1 +
 northd/northd.c   | 210 +-
 northd/ovn-northd.8.xml   |  27 +++--
 ovn-architecture.7.xml|   6 +-
 ovn-nb.xml|   4 +-
 tests/ovn-nbctl.at|  40 +++-
 tests/ovn-northd.at   | 200 +---
 utilities/ovn-nbctl.8.xml |  18 +++-
 utilities/ovn-nbctl.c | 157 ++--
 9 files changed, 547 insertions(+), 116 deletions(-)

diff --git a/NEWS b/NEWS
index 53f9718b1..a27f89150 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
 Post v21.12.0
 -
+  - Support NAT with multiple distributed gateway ports on a logical router.
 
 OVN v21.12.0 - xx xxx 
 --
diff --git a/northd/northd.c b/northd/northd.c
index c714227b2..7f766158d 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -604,11 +604,11 @@ struct ovn_datapath {
 
 /* Applies to only logical router datapath.
  * True if logical router is a gateway router. i.e options:chassis is set.
- * If this is true, then 'l3dgw_port' will be ignored. */
+ * If this is true, then 'l3dgw_ports' will be ignored. */
 bool is_gw_router;
 
-/* OVN northd only needs to know about the logical router gateway port for
- * NAT on a distributed router.  The "distributed gateway ports" are
+/* OVN northd only needs to know about logical router gateway ports for
+ * NAT/LB on a distributed router.  The "distributed gateway ports" are
  * populated only when there is a gateway chassis or ha chassis group
  * specified for some of the ports on the logical router. Otherwise this
  * will be NULL. */
@@ -761,16 +761,6 @@ init_nat_entries(struct ovn_datapath *od)
 return;
 }
 
-if (od->n_l3dgw_ports > 1) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-VLOG_WARN_RL(, "NAT is configured on logical router %s, which has %"
- PRIuSIZE" distributed gateway ports. NAT is not supported"
- " yet when there is more than one distributed gateway "
- "port on the router.",
- od->nbr->name, od->n_l3dgw_ports);
-return;
-}
-
 od->nat_entries = xmalloc(od->nbr->n_nat * sizeof *od->nat_entries);
 
 for (size_t i = 0; i < od->nbr->n_nat; i++) {
@@ -1631,6 +1621,49 @@ is_cr_port(const struct ovn_port *op)
 return op->l3dgw_port;
 }
 
+/* Checks if the IP address (assumed valid) represented by string 'address' is
+ * in one of the networks of the logical router port 'op'. */
+static bool
+ip_in_lrp_networks(const struct ovn_port *op, const char *address) {
+ovs_be32 ip, mask;
+struct in6_addr ipv6, mask_v6;
+
+char *error = ip_parse_masked(address, , );
+bool is_v6 = false;
+
+if (error || mask != OVS_BE32_MAX) {
+free(error);
+ipv6_parse_masked(address, , _v6);
+is_v6 = true;
+}
+
+struct lport_addresses lrp_networks;
+extract_lrp_networks(op->nbrp, _networks);
+
+bool ip_in_net = false;
+if (is_v6) {
+for (int i = 0; i < lrp_networks.n_ipv6_addrs; i++) {
+struct ipv6_netaddr *lrp6_addr = &(lrp_networks.ipv6_addrs[i]);
+struct in6_addr ip6_mask = ipv6_addr_bitand(_addr->mask,
+);
+
+if (ipv6_addr_equals(_mask, &(lrp6_addr->network))) {
+ip_in_net = true;
+}
+}
+} else {
+for (int i = 0; i < lrp_networks.n_ipv4_addrs; i++) {
+struct ipv4_netaddr *lrp4_addr = &(lrp_networks.ipv4_addrs[i]);
+
+if ((ip & lrp4_addr->mask) == lrp4_addr->network) {
+ip_in_net = true;
+}
+}
+}
+destroy_lport_addresses(_networks);
+return ip_in_net;
+}
+
 static void
 destroy_routable_addresses(struct ovn_port_routable_addresses *ra)
 {
@@ -2705,8 +2738,9 @@ join_logical_ports(struct northd_input *input_data,
  * port, followed by 'is_chassis_resident("LPORT_NAME")', where the
  * LPORT_NAME is the name of the L3 redirect port or the name of the
  * logical_port specified in a NAT rule.  These strings include the
- 

[ovs-dev] [PATCH v2] northd: Add support for NAT with multiple DGP

2022-01-04 Thread Abhiram Sangana
Currently, if multiple distributed gateway ports (DGP) are configured
on a logical router, NAT is disabled as part of commit 15348b7b
(northd: Multiple distributed gateway port support.)

This patch updates the behavior by selectively applying NAT rules at DGPs.
A NAT rule is applied on matching packets entering or leaving a specific
DGP only if the external_ip of the rule belongs to the same subnet as the
DGP.

This patch also updates ovn-nbctl to accept multiple NAT rules of type
`snat` with the same logical_ip but different external_ip for a logical
router.

Signed-off-by: Abhiram Sangana 
---
 NEWS  |   1 +
 northd/northd.c   | 210 +-
 northd/ovn-northd.8.xml   |  27 +++--
 ovn-architecture.7.xml|   6 +-
 ovn-nb.xml|   4 +-
 tests/ovn-nbctl.at|  40 +++-
 tests/ovn-northd.at   | 200 +---
 utilities/ovn-nbctl.8.xml |  18 +++-
 utilities/ovn-nbctl.c | 157 ++--
 9 files changed, 547 insertions(+), 116 deletions(-)

diff --git a/NEWS b/NEWS
index 53f9718b1..a27f89150 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
 Post v21.12.0
 -
+  - Support NAT with multiple distributed gateway ports on a logical router.
 
 OVN v21.12.0 - xx xxx 
 --
diff --git a/northd/northd.c b/northd/northd.c
index c714227b2..7f766158d 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -604,11 +604,11 @@ struct ovn_datapath {
 
 /* Applies to only logical router datapath.
  * True if logical router is a gateway router. i.e options:chassis is set.
- * If this is true, then 'l3dgw_port' will be ignored. */
+ * If this is true, then 'l3dgw_ports' will be ignored. */
 bool is_gw_router;
 
-/* OVN northd only needs to know about the logical router gateway port for
- * NAT on a distributed router.  The "distributed gateway ports" are
+/* OVN northd only needs to know about logical router gateway ports for
+ * NAT/LB on a distributed router.  The "distributed gateway ports" are
  * populated only when there is a gateway chassis or ha chassis group
  * specified for some of the ports on the logical router. Otherwise this
  * will be NULL. */
@@ -761,16 +761,6 @@ init_nat_entries(struct ovn_datapath *od)
 return;
 }
 
-if (od->n_l3dgw_ports > 1) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-VLOG_WARN_RL(, "NAT is configured on logical router %s, which has %"
- PRIuSIZE" distributed gateway ports. NAT is not supported"
- " yet when there is more than one distributed gateway "
- "port on the router.",
- od->nbr->name, od->n_l3dgw_ports);
-return;
-}
-
 od->nat_entries = xmalloc(od->nbr->n_nat * sizeof *od->nat_entries);
 
 for (size_t i = 0; i < od->nbr->n_nat; i++) {
@@ -1631,6 +1621,49 @@ is_cr_port(const struct ovn_port *op)
 return op->l3dgw_port;
 }
 
+/* Checks if the IP address (assumed valid) represented by string 'address' is
+ * in one of the networks of the logical router port 'op'. */
+static bool
+ip_in_lrp_networks(const struct ovn_port *op, const char *address) {
+ovs_be32 ip, mask;
+struct in6_addr ipv6, mask_v6;
+
+char *error = ip_parse_masked(address, , );
+bool is_v6 = false;
+
+if (error || mask != OVS_BE32_MAX) {
+free(error);
+ipv6_parse_masked(address, , _v6);
+is_v6 = true;
+}
+
+struct lport_addresses lrp_networks;
+extract_lrp_networks(op->nbrp, _networks);
+
+bool ip_in_net = false;
+if (is_v6) {
+for (int i = 0; i < lrp_networks.n_ipv6_addrs; i++) {
+struct ipv6_netaddr *lrp6_addr = &(lrp_networks.ipv6_addrs[i]);
+struct in6_addr ip6_mask = ipv6_addr_bitand(_addr->mask,
+);
+
+if (ipv6_addr_equals(_mask, &(lrp6_addr->network))) {
+ip_in_net = true;
+}
+}
+} else {
+for (int i = 0; i < lrp_networks.n_ipv4_addrs; i++) {
+struct ipv4_netaddr *lrp4_addr = &(lrp_networks.ipv4_addrs[i]);
+
+if ((ip & lrp4_addr->mask) == lrp4_addr->network) {
+ip_in_net = true;
+}
+}
+}
+destroy_lport_addresses(_networks);
+return ip_in_net;
+}
+
 static void
 destroy_routable_addresses(struct ovn_port_routable_addresses *ra)
 {
@@ -2705,8 +2738,9 @@ join_logical_ports(struct northd_input *input_data,
  * port, followed by 'is_chassis_resident("LPORT_NAME")', where the
  * LPORT_NAME is the name of the L3 redirect port or the name of the
  * logical_port specified in a NAT rule.  These strings include the
- 

[ovs-dev] [PATCH ovn] northd: Add support for NAT with multiple DGP.

2021-09-20 Thread Abhiram Sangana
Currently, if multiple distributed gateway ports (DGP) are configured
on a logical router, NAT is disabled as part of commit 15348b7b
(northd: Multiple distributed gateway port support.)

This patch updates the behavior by selectively applying NAT rules at DGPs.
A NAT rule is applied on matching packets entering or leaving a specific
DGP only if the external_ip of the rule belongs to the same subnet as the
DGP.

This patch also updates ovn-nbctl to accept multiple NAT rules of type
`snat` with the same logical_ip but different external_ip for a logical
router.

Signed-off-by: Abhiram Sangana 
---
 NEWS|   1 +
 northd/northd.c | 155 +++
 northd/ovn-northd.8.xml |  29 ---
 ovn-architecture.7.xml  |   6 +-
 ovn-nb.xml  |   4 +-
 tests/ovn-nbctl.at  |  40 -
 tests/ovn-northd.at | 175 +---
 utilities/ovn-nbctl.c   | 156 ---
 8 files changed, 473 insertions(+), 93 deletions(-)

diff --git a/NEWS b/NEWS
index 8a21c029e..1fdc65bc0 100644
--- a/NEWS
+++ b/NEWS
@@ -12,6 +12,7 @@ OVN v21.09.0 - xx xxx 
   - Allow static routes without nexthops.
   - Enabled logical dp groups as a default.  CMS should disable it if not
 desired.
+  - Support NAT with multiple distributed gateway ports on a logical router.
 
 OVN v21.06.0 - 18 Jun 2021
 -
diff --git a/northd/northd.c b/northd/northd.c
index d1b87891c..21a72d238 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -593,12 +593,11 @@ struct ovn_datapath {
 
 /* Applies to only logical router datapath.
  * True if logical router is a gateway router. i.e options:chassis is set.
- * If this is true, then 'l3dgw_port' and 'l3redirect_port' will be
- * ignored. */
+ * If this is true, then 'l3dgw_ports' will be ignored. */
 bool is_gw_router;
 
-/* OVN northd only needs to know about the logical router gateway port for
- * NAT on a distributed router.  The "distributed gateway ports" are
+/* OVN northd only needs to know about logical router gateway ports for
+ * NAT/LB on a distributed router.  The "distributed gateway ports" are
  * populated only when there is a gateway chassis or ha chassis group
  * specified for some of the ports on the logical router. Otherwise this
  * will be NULL. */
@@ -747,16 +746,6 @@ init_nat_entries(struct ovn_datapath *od)
 return;
 }
 
-if (od->n_l3dgw_ports > 1) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-VLOG_WARN_RL(, "NAT is configured on logical router %s, which has %"
- PRIuSIZE" distributed gateway ports. NAT is not supported"
- " yet when there is more than one distributed gateway "
- "port on the router.",
- od->nbr->name, od->n_l3dgw_ports);
-return;
-}
-
 od->nat_entries = xmalloc(od->nbr->n_nat * sizeof *od->nat_entries);
 
 for (size_t i = 0; i < od->nbr->n_nat; i++) {
@@ -2681,8 +2670,11 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, 
bool routable_only)
 /* Gratuitous ARP for centralized NAT rules on distributed gateway
  * ports should be restricted to the gateway chassis. */
 if (op->od->n_l3dgw_ports) {
+const struct ovn_port *l3dgw_port = (is_l3dgw_port(op)
+ ? op
+ : op->od->l3dgw_ports[0]);
 ds_put_format(_addresses, " is_chassis_resident(%s)",
-  op->od->l3dgw_ports[0]->cr_port->json_key);
+  l3dgw_port->cr_port->json_key);
 }
 
 addresses[n_nats++] = ds_steal_cstr(_addresses);
@@ -3274,9 +3266,11 @@ ovn_port_update_sbrec(struct northd_context *ctx,
 }
 
 if (op->peer->od->n_l3dgw_ports) {
+const struct ovn_port *l3dgw_port = (
+is_l3dgw_port(op->peer) ? op->peer
+: 
op->peer->od->l3dgw_ports[0]);
 ds_put_format(_info, " is_chassis_resident(%s)",
-  op->peer->od->l3dgw_ports[0]
-  ->cr_port->json_key);
+  l3dgw_port->cr_port->json_key);
 }
 
 n_nats++;
@@ -9966,8 +9960,11 @@ build_lrouter_port_nat_arp_nd_flow(struct ovn_port *op,
  * Also need to avoid generation of multiple ARP responses
  * from different chassis. */
 if (op->od->n_l3dgw_ports) {
+const struct ovn_port *l3dgw_port = (is_l3