Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
On 11/21/22 14:56, Abhiram Sangana wrote: Hi Adrian, I apologise for the delay in replying back. On 14 Nov 2022, at 13:05, Adrian Moreno wrote: Hi, On 10/20/22 15:49, Abhiram Sangana wrote: Hi Dumitru, Thanks for reviewing the patch. On 19 Oct 2022, at 14:09, Dumitru Ceara wrote: Hi Abhiram, Thanks for the patch! I only skimmed the changes so this is not a full review but more of a discussion starter. On 10/18/22 17:33, Abhiram Sangana wrote: To identify connections dropped by ACLs, users can enable logging for ACLs but this approach does not scale. ACL logging uses "controller" action which causes a significant spike in the CPU usage of ovs-vswitchd (and ovn-controller to a lesser extent) even with metering enabled (observed 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another approach is to use drop sampling (patch by Adrian Moreno currently in review) but we might miss specific connections of interest with this approach. This patch commits connections dropped by ACLs to the connection tracking table with a specific ACL label that was introduced in 0e0228be ( northd: Add ACL label). The dropped connections are committed in a separate CT zone so that they can be managed independently. I'm not sure I understand how the CMS can manage this. How is this better than sampling? Committed connections are going to time out at some point (30 sec by default for udp/icmp with the kernel datapath). So the CMS will have to continuously monitor the contents of the conntrack zone? Aren't we just moving the CPU load somewhere else with this? Even so, there's a chance an entry is missed. Linux nf_conntrack module supports sending connection tracking events to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel parameter). So, CMS can parse the stream of new connection events from conntrack and log the packets based on CT label. Isn't this datapath-specific? this won't be available for the netdev datapath, right? Yes, this approach is datapath specific - I haven’t checked how to send connection tracking events for netdev datapath. An issue with sampling is that if there are a large number of packets for a particular connection(s), packets of other connections might not get sampled and we miss information about these connections. With the conntrack approach, we get a single event for each connection (until they time out), so there is lesser load on the CMS/collector and lesser likelihood of missing connections. I don't see why sampling would inherently miss packets. The current RFC for ACL sampling (different from the generic drop sampling one) allows the user to specify a probability per ACL so 100% of packets can be sampled in connection establishment drops while we sample N% of accepted traffic having a lot of flexibility in the inevitable performance vs accuracy tradeoff. I'd like to better understand what are the limitations of the current approach to see if it can be improved in any way. I haven’t experimented with flow-based IPFIX sampling but I noticed high CPU usage of ovs-vswitchd while trying to export Netflow records (which I think is similar to 100% sampling) with large number of connections in OVN bridge. I was expecting a similar cost with respect to upcalls if we use 100% sampling rate for drop ACLs when there are multiple connection establishment packets in the bridge. I have also found that per-bridge sampling at 100% not very usable. Besides, by default both ingress and egress traffic is sampled so that's pretty much generating 2 IPFIX messages per packet. For ACL-rejection sampling, however, it shouldn't be that much. Besides, we can always make use of the flow cache to avoid keeping hold of the handler threads. I plan to run some performance benchmarking of this. Thanks, Abhiram Thanks, -- Adrián Each logical port is assigned a new zone for committing dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. A new lflow action "ct_commit_drop" is introduced that commits flows to connection tracking table in a zone identified by MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" action and non-empty label is translated to "ct_commit_drop" instead of silently dropping the packet. Signed-off-by: Abhiram Sangana --- controller/ovn-controller.c | 23 ++--- controller/physical.c| 32 +-- include/ovn/actions.h| 1 + include/ovn/logical-fields.h | 1 + lib/actions.c| 50 lib/ovn-util.c | 4 +-- lib/ovn-util.h | 2 +- northd/northd.c | 14 -- northd/ovn-northd.8.xml | 14 -- ovn-sb.xml | 17 ovs | 2 +- This shouldn't need to change the OVS submodule. My bad. Will fix this. Thanks, Abhiram Sangana Thanks, Dumitru utilities/ovn-nbctl.c| 7 ++--- 12 files changed, 151
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
Hi Adrian, I apologise for the delay in replying back. > On 14 Nov 2022, at 13:05, Adrian Moreno wrote: > > Hi, > > On 10/20/22 15:49, Abhiram Sangana wrote: >> Hi Dumitru, >> Thanks for reviewing the patch. >>> On 19 Oct 2022, at 14:09, Dumitru Ceara wrote: >>> >>> Hi Abhiram, >>> >>> Thanks for the patch! I only skimmed the changes so this is not a full >>> review but more of a discussion starter. >>> >>> On 10/18/22 17:33, Abhiram Sangana wrote: To identify connections dropped by ACLs, users can enable logging for ACLs but this approach does not scale. ACL logging uses "controller" action which causes a significant spike in the CPU usage of ovs-vswitchd (and ovn-controller to a lesser extent) even with metering enabled (observed 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another approach is to use drop sampling (patch by Adrian Moreno currently in review) but we might miss specific connections of interest with this approach. This patch commits connections dropped by ACLs to the connection tracking table with a specific ACL label that was introduced in 0e0228be ( northd: Add ACL label). The dropped connections are committed in a separate CT zone so that they can be managed independently. >>> >>> I'm not sure I understand how the CMS can manage this. How is this >>> better than sampling? Committed connections are going to time out at >>> some point (30 sec by default for udp/icmp with the kernel datapath). >>> So the CMS will have to continuously monitor the contents of the >>> conntrack zone? Aren't we just moving the CPU load somewhere else with >>> this? Even so, there's a chance an entry is missed. >> Linux nf_conntrack module supports sending connection tracking events >> to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel >> parameter). So, CMS can parse the stream of new connection events from >> conntrack and log the packets based on CT label. > > Isn't this datapath-specific? this won't be available for the netdev > datapath, right? Yes, this approach is datapath specific - I haven’t checked how to send connection tracking events for netdev datapath. > >> An issue with sampling is that if there are a large number of packets >> for a particular connection(s), packets of other connections might not >> get sampled and we miss information about these connections. With the >> conntrack approach, we get a single event for each connection (until >> they time out), so there is lesser load on the CMS/collector and lesser >> likelihood of missing connections. > > I don't see why sampling would inherently miss packets. The current RFC for > ACL sampling (different from the generic drop sampling one) allows the user > to specify a probability per ACL so 100% of packets can be sampled in > connection establishment drops while we sample N% of accepted traffic having > a lot of flexibility in the inevitable performance vs accuracy tradeoff. > > I'd like to better understand what are the limitations of the current > approach to see if it can be improved in any way. I haven’t experimented with flow-based IPFIX sampling but I noticed high CPU usage of ovs-vswitchd while trying to export Netflow records (which I think is similar to 100% sampling) with large number of connections in OVN bridge. I was expecting a similar cost with respect to upcalls if we use 100% sampling rate for drop ACLs when there are multiple connection establishment packets in the bridge. Thanks, Abhiram > Thanks, > -- > Adrián > >>> Each logical port is assigned a new zone for committing dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. A new lflow action "ct_commit_drop" is introduced that commits flows to connection tracking table in a zone identified by MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" action and non-empty label is translated to "ct_commit_drop" instead of silently dropping the packet. Signed-off-by: Abhiram Sangana --- controller/ovn-controller.c | 23 ++--- controller/physical.c| 32 +-- include/ovn/actions.h| 1 + include/ovn/logical-fields.h | 1 + lib/actions.c| 50 lib/ovn-util.c | 4 +-- lib/ovn-util.h | 2 +- northd/northd.c | 14 -- northd/ovn-northd.8.xml | 14 -- ovn-sb.xml | 17 ovs | 2 +- >>> >>> This shouldn't need to change the OVS submodule. >>> >> My bad. Will fix this. >> Thanks, >> Abhiram Sangana >>> Thanks, >>> Dumitru >>> utilities/ovn-nbctl.c| 7 ++--- 12 files changed, 151 insertions(+), 16 deletions(-) diff --git a/controller/ovn-controller.c
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
Hi, On 10/20/22 15:49, Abhiram Sangana wrote: Hi Dumitru, Thanks for reviewing the patch. On 19 Oct 2022, at 14:09, Dumitru Ceara wrote: Hi Abhiram, Thanks for the patch! I only skimmed the changes so this is not a full review but more of a discussion starter. On 10/18/22 17:33, Abhiram Sangana wrote: To identify connections dropped by ACLs, users can enable logging for ACLs but this approach does not scale. ACL logging uses "controller" action which causes a significant spike in the CPU usage of ovs-vswitchd (and ovn-controller to a lesser extent) even with metering enabled (observed 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another approach is to use drop sampling (patch by Adrian Moreno currently in review) but we might miss specific connections of interest with this approach. This patch commits connections dropped by ACLs to the connection tracking table with a specific ACL label that was introduced in 0e0228be ( northd: Add ACL label). The dropped connections are committed in a separate CT zone so that they can be managed independently. I'm not sure I understand how the CMS can manage this. How is this better than sampling? Committed connections are going to time out at some point (30 sec by default for udp/icmp with the kernel datapath). So the CMS will have to continuously monitor the contents of the conntrack zone? Aren't we just moving the CPU load somewhere else with this? Even so, there's a chance an entry is missed. Linux nf_conntrack module supports sending connection tracking events to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel parameter). So, CMS can parse the stream of new connection events from conntrack and log the packets based on CT label. Isn't this datapath-specific? this won't be available for the netdev datapath, right? An issue with sampling is that if there are a large number of packets for a particular connection(s), packets of other connections might not get sampled and we miss information about these connections. With the conntrack approach, we get a single event for each connection (until they time out), so there is lesser load on the CMS/collector and lesser likelihood of missing connections. I don't see why sampling would inherently miss packets. The current RFC for ACL sampling (different from the generic drop sampling one) allows the user to specify a probability per ACL so 100% of packets can be sampled in connection establishment drops while we sample N% of accepted traffic having a lot of flexibility in the inevitable performance vs accuracy tradeoff. I'd like to better understand what are the limitations of the current approach to see if it can be improved in any way. Thanks, -- Adrián Each logical port is assigned a new zone for committing dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. A new lflow action "ct_commit_drop" is introduced that commits flows to connection tracking table in a zone identified by MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" action and non-empty label is translated to "ct_commit_drop" instead of silently dropping the packet. Signed-off-by: Abhiram Sangana --- controller/ovn-controller.c | 23 ++--- controller/physical.c| 32 +-- include/ovn/actions.h| 1 + include/ovn/logical-fields.h | 1 + lib/actions.c| 50 lib/ovn-util.c | 4 +-- lib/ovn-util.h | 2 +- northd/northd.c | 14 -- northd/ovn-northd.8.xml | 14 -- ovn-sb.xml | 17 ovs | 2 +- This shouldn't need to change the OVS submodule. My bad. Will fix this. Thanks, Abhiram Sangana Thanks, Dumitru utilities/ovn-nbctl.c| 7 ++--- 12 files changed, 151 insertions(+), 16 deletions(-) diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index c97744d57..1ad20fe55 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports, unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)]; struct shash_node *shash_node; +const struct binding_lport *lport; SHASH_FOR_EACH (shash_node, binding_lports) { sset_add(_users, shash_node->name); + +/* Zone for committing dropped connections of a vNIC. */ +lport = shash_node->data; +char *drop_zone = alloc_ct_zone_key(>pb->header_.uuid, "drop"); +sset_add(_users, drop_zone); +free(drop_zone); } /* Local patched datapath (gateway routers) need zones assigned. */ @@ -670,8 +677,8 @@ update_ct_zones(const struct shash *binding_lports, HMAP_FOR_EACH (ld, hmap_node, local_datapaths) { /* XXX Add method to limit zone assignment to logical router * datapaths with NAT */ -char *dnat =
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
> On 1 Nov 2022, at 12:35, Dumitru Ceara wrote: > > On 11/1/22 12:40, Abhiram Sangana wrote: >> >> >>> On 21 Oct 2022, at 10:58, Dumitru Ceara wrote: >>> >>> On 10/20/22 17:34, Abhiram Sangana wrote: Hi Dumitru, Can you please check if the implementation for the proposal looks ok? Will send out v1 with the review comments and tests. Also, any ideas how we can selectively create new drop zones for ports. Currently, I am assigning a new zone for dropped connections to every port even if its parent LS doesn’t have drop ACLs with labels. >>> >>> Within a logical switch do we really need a drop zone per port? Isn't >>> it actually enough if we add a "from-lport-drop" and "to-lport-drop" >>> zone for the whole logical switch? That should simplify zone allocation. >>> >> Given that we are committing dropped connections to CT table, a DDOS >> attack can potentially fill up the CT table. We are planning to send >> another patch that limits the number of entries for a given CT zone. >> It is easier to manage the size of CT zones when there is a CT zone >> per port rather than per Logical_Switch. >> > > Ok. To answer your previous question, we could mark the SB.Port_Binding > with an option if there are ACLs applied to the switch containing it and > at least one of those needs drop zones. Even better would be to mark > the the SB.Datapath_Binding but we don't have options there, we'd have > to update the schema. > > I'm not sure if that works for you, what do you think? > > Thanks, > Dumitru Yes, I think that should work. I will send out a v1 patch with these changes. Thank you, Dumitru. Thanks, Abhiram ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
On 11/1/22 12:40, Abhiram Sangana wrote: > > >> On 21 Oct 2022, at 10:58, Dumitru Ceara wrote: >> >> On 10/20/22 17:34, Abhiram Sangana wrote: >>> Hi Dumitru, >>> >>> Can you please check if the implementation for the proposal looks ok? >>> Will send out v1 with the review comments and tests. >>> Also, any ideas how we can selectively create new drop zones for ports. >>> Currently, I am assigning a new zone for dropped connections to every >>> port even if its parent LS doesn’t have drop ACLs with labels. >>> >> >> Within a logical switch do we really need a drop zone per port? Isn't >> it actually enough if we add a "from-lport-drop" and "to-lport-drop" >> zone for the whole logical switch? That should simplify zone allocation. >> > Given that we are committing dropped connections to CT table, a DDOS > attack can potentially fill up the CT table. We are planning to send > another patch that limits the number of entries for a given CT zone. > It is easier to manage the size of CT zones when there is a CT zone > per port rather than per Logical_Switch. > Ok. To answer your previous question, we could mark the SB.Port_Binding with an option if there are ACLs applied to the switch containing it and at least one of those needs drop zones. Even better would be to mark the the SB.Datapath_Binding but we don't have options there, we'd have to update the schema. I'm not sure if that works for you, what do you think? Thanks, Dumitru ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
> On 21 Oct 2022, at 10:58, Dumitru Ceara wrote: > > On 10/20/22 17:34, Abhiram Sangana wrote: >> Hi Dumitru, >> >> Can you please check if the implementation for the proposal looks ok? >> Will send out v1 with the review comments and tests. >> Also, any ideas how we can selectively create new drop zones for ports. >> Currently, I am assigning a new zone for dropped connections to every >> port even if its parent LS doesn’t have drop ACLs with labels. >> > > Within a logical switch do we really need a drop zone per port? Isn't > it actually enough if we add a "from-lport-drop" and "to-lport-drop" > zone for the whole logical switch? That should simplify zone allocation. > Given that we are committing dropped connections to CT table, a DDOS attack can potentially fill up the CT table. We are planning to send another patch that limits the number of entries for a given CT zone. It is easier to manage the size of CT zones when there is a CT zone per port rather than per Logical_Switch. >> Thanks, >> Abhiram Sangana >> >>> On 20 Oct 2022, at 15:18, Dumitru Ceara wrote: >>> >>> On 10/20/22 15:49, Abhiram Sangana wrote: Hi Dumitru, Thanks for reviewing the patch. > On 19 Oct 2022, at 14:09, Dumitru Ceara wrote: > > Hi Abhiram, > > Thanks for the patch! I only skimmed the changes so this is not a full > review but more of a discussion starter. > > On 10/18/22 17:33, Abhiram Sangana wrote: >> To identify connections dropped by ACLs, users can enable logging for >> ACLs >> but this approach does not scale. ACL logging uses "controller" action >> which causes a significant spike in the CPU usage of ovs-vswitchd (and >> ovn-controller to a lesser extent) even with metering enabled (observed >> 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another >> approach is to use drop sampling (patch by Adrian Moreno currently in >> review) but we might miss specific connections of interest with this >> approach. >> >> This patch commits connections dropped by ACLs to the connection tracking >> table with a specific ACL label that was introduced in 0e0228be ( >> northd: Add ACL label). The dropped connections are committed in a >> separate >> CT zone so that they can be managed independently. >> > > I'm not sure I understand how the CMS can manage this. How is this > better than sampling? Committed connections are going to time out at > some point (30 sec by default for udp/icmp with the kernel datapath). > So the CMS will have to continuously monitor the contents of the > conntrack zone? Aren't we just moving the CPU load somewhere else with > this? Even so, there's a chance an entry is missed. Linux nf_conntrack module supports sending connection tracking events to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel parameter). So, CMS can parse the stream of new connection events from conntrack and log the packets based on CT label. >>> >>> Ack. >>> An issue with sampling is that if there are a large number of packets for a particular connection(s), packets of other connections might not get sampled and we miss information about these connections. With the conntrack approach, we get a single event for each connection (until they time out), so there is lesser load on the CMS/collector and lesser likelihood of missing connections. >>> >>> I hadn't considered this advantage, thanks for pointing it out! >>> > >> Each logical port is assigned a new zone for committing dropped flows. >> The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. >> >> A new lflow action "ct_commit_drop" is introduced that commits flows >> to connection tracking table in a zone identified by >> MFF_LOG_ACL_DROP_ZONE register. >> >> An ACL with "drop" action and non-empty label is translated to >> "ct_commit_drop" >> instead of silently dropping the packet. >> >> Signed-off-by: Abhiram Sangana >> --- >> controller/ovn-controller.c | 23 ++--- >> controller/physical.c| 32 +-- >> include/ovn/actions.h| 1 + >> include/ovn/logical-fields.h | 1 + >> lib/actions.c| 50 >> lib/ovn-util.c | 4 +-- >> lib/ovn-util.h | 2 +- >> northd/northd.c | 14 -- >> northd/ovn-northd.8.xml | 14 -- >> ovn-sb.xml | 17 >> ovs | 2 +- > > This shouldn't need to change the OVS submodule. > My bad. Will fix this. >>> >>> Cool, thanks, looking forward for v1! >>> >>> Regards, >>> Dumitru >>> Thanks, Abhiram Sangana > Thanks, > Dumitru
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
On 10/20/22 17:34, Abhiram Sangana wrote: > Hi Dumitru, > > Can you please check if the implementation for the proposal looks ok? > Will send out v1 with the review comments and tests. > Also, any ideas how we can selectively create new drop zones for ports. > Currently, I am assigning a new zone for dropped connections to every > port even if its parent LS doesn’t have drop ACLs with labels. > Within a logical switch do we really need a drop zone per port? Isn't it actually enough if we add a "from-lport-drop" and "to-lport-drop" zone for the whole logical switch? That should simplify zone allocation. > Thanks, > Abhiram Sangana > >> On 20 Oct 2022, at 15:18, Dumitru Ceara wrote: >> >> On 10/20/22 15:49, Abhiram Sangana wrote: >>> Hi Dumitru, >>> >>> Thanks for reviewing the patch. >>> On 19 Oct 2022, at 14:09, Dumitru Ceara wrote: Hi Abhiram, Thanks for the patch! I only skimmed the changes so this is not a full review but more of a discussion starter. On 10/18/22 17:33, Abhiram Sangana wrote: > To identify connections dropped by ACLs, users can enable logging for ACLs > but this approach does not scale. ACL logging uses "controller" action > which causes a significant spike in the CPU usage of ovs-vswitchd (and > ovn-controller to a lesser extent) even with metering enabled (observed > 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another > approach is to use drop sampling (patch by Adrian Moreno currently in > review) but we might miss specific connections of interest with this > approach. > > This patch commits connections dropped by ACLs to the connection tracking > table with a specific ACL label that was introduced in 0e0228be ( > northd: Add ACL label). The dropped connections are committed in a > separate > CT zone so that they can be managed independently. > I'm not sure I understand how the CMS can manage this. How is this better than sampling? Committed connections are going to time out at some point (30 sec by default for udp/icmp with the kernel datapath). So the CMS will have to continuously monitor the contents of the conntrack zone? Aren't we just moving the CPU load somewhere else with this? Even so, there's a chance an entry is missed. >>> >>> Linux nf_conntrack module supports sending connection tracking events >>> to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel >>> parameter). So, CMS can parse the stream of new connection events from >>> conntrack and log the packets based on CT label. >>> >> >> Ack. >> >>> An issue with sampling is that if there are a large number of packets >>> for a particular connection(s), packets of other connections might not >>> get sampled and we miss information about these connections. With the >>> conntrack approach, we get a single event for each connection (until >>> they time out), so there is lesser load on the CMS/collector and lesser >>> likelihood of missing connections. >>> >> >> I hadn't considered this advantage, thanks for pointing it out! >> > Each logical port is assigned a new zone for committing dropped flows. > The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. > > A new lflow action "ct_commit_drop" is introduced that commits flows > to connection tracking table in a zone identified by > MFF_LOG_ACL_DROP_ZONE register. > > An ACL with "drop" action and non-empty label is translated to > "ct_commit_drop" > instead of silently dropping the packet. > > Signed-off-by: Abhiram Sangana > --- > controller/ovn-controller.c | 23 ++--- > controller/physical.c| 32 +-- > include/ovn/actions.h| 1 + > include/ovn/logical-fields.h | 1 + > lib/actions.c| 50 > lib/ovn-util.c | 4 +-- > lib/ovn-util.h | 2 +- > northd/northd.c | 14 -- > northd/ovn-northd.8.xml | 14 -- > ovn-sb.xml | 17 > ovs | 2 +- This shouldn't need to change the OVS submodule. >>> My bad. Will fix this. >>> >> >> Cool, thanks, looking forward for v1! >> >> Regards, >> Dumitru >> >>> Thanks, >>> Abhiram Sangana >>> Thanks, Dumitru > utilities/ovn-nbctl.c| 7 ++--- > 12 files changed, 151 insertions(+), 16 deletions(-) > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index c97744d57..1ad20fe55 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports, >unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)]; > >struct shash_node *shash_node; > +const
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
Hi Dumitru, Can you please check if the implementation for the proposal looks ok? Will send out v1 with the review comments and tests. Also, any ideas how we can selectively create new drop zones for ports. Currently, I am assigning a new zone for dropped connections to every port even if its parent LS doesn’t have drop ACLs with labels. Thanks, Abhiram Sangana > On 20 Oct 2022, at 15:18, Dumitru Ceara wrote: > > On 10/20/22 15:49, Abhiram Sangana wrote: >> Hi Dumitru, >> >> Thanks for reviewing the patch. >> >>> On 19 Oct 2022, at 14:09, Dumitru Ceara wrote: >>> >>> Hi Abhiram, >>> >>> Thanks for the patch! I only skimmed the changes so this is not a full >>> review but more of a discussion starter. >>> >>> On 10/18/22 17:33, Abhiram Sangana wrote: To identify connections dropped by ACLs, users can enable logging for ACLs but this approach does not scale. ACL logging uses "controller" action which causes a significant spike in the CPU usage of ovs-vswitchd (and ovn-controller to a lesser extent) even with metering enabled (observed 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another approach is to use drop sampling (patch by Adrian Moreno currently in review) but we might miss specific connections of interest with this approach. This patch commits connections dropped by ACLs to the connection tracking table with a specific ACL label that was introduced in 0e0228be ( northd: Add ACL label). The dropped connections are committed in a separate CT zone so that they can be managed independently. >>> >>> I'm not sure I understand how the CMS can manage this. How is this >>> better than sampling? Committed connections are going to time out at >>> some point (30 sec by default for udp/icmp with the kernel datapath). >>> So the CMS will have to continuously monitor the contents of the >>> conntrack zone? Aren't we just moving the CPU load somewhere else with >>> this? Even so, there's a chance an entry is missed. >> >> Linux nf_conntrack module supports sending connection tracking events >> to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel >> parameter). So, CMS can parse the stream of new connection events from >> conntrack and log the packets based on CT label. >> > > Ack. > >> An issue with sampling is that if there are a large number of packets >> for a particular connection(s), packets of other connections might not >> get sampled and we miss information about these connections. With the >> conntrack approach, we get a single event for each connection (until >> they time out), so there is lesser load on the CMS/collector and lesser >> likelihood of missing connections. >> > > I hadn't considered this advantage, thanks for pointing it out! > >>> Each logical port is assigned a new zone for committing dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. A new lflow action "ct_commit_drop" is introduced that commits flows to connection tracking table in a zone identified by MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" action and non-empty label is translated to "ct_commit_drop" instead of silently dropping the packet. Signed-off-by: Abhiram Sangana --- controller/ovn-controller.c | 23 ++--- controller/physical.c| 32 +-- include/ovn/actions.h| 1 + include/ovn/logical-fields.h | 1 + lib/actions.c| 50 lib/ovn-util.c | 4 +-- lib/ovn-util.h | 2 +- northd/northd.c | 14 -- northd/ovn-northd.8.xml | 14 -- ovn-sb.xml | 17 ovs | 2 +- >>> >>> This shouldn't need to change the OVS submodule. >>> >> My bad. Will fix this. >> > > Cool, thanks, looking forward for v1! > > Regards, > Dumitru > >> Thanks, >> Abhiram Sangana >> >>> Thanks, >>> Dumitru >>> utilities/ovn-nbctl.c| 7 ++--- 12 files changed, 151 insertions(+), 16 deletions(-) diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index c97744d57..1ad20fe55 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports, unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)]; struct shash_node *shash_node; +const struct binding_lport *lport; SHASH_FOR_EACH (shash_node, binding_lports) { sset_add(_users, shash_node->name); + +/* Zone for committing dropped connections of a vNIC. */ +lport = shash_node->data; +char *drop_zone = alloc_ct_zone_key(>pb->header_.uuid, "drop"); +sset_add(_users,
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
On 10/20/22 15:49, Abhiram Sangana wrote: > Hi Dumitru, > > Thanks for reviewing the patch. > >> On 19 Oct 2022, at 14:09, Dumitru Ceara wrote: >> >> Hi Abhiram, >> >> Thanks for the patch! I only skimmed the changes so this is not a full >> review but more of a discussion starter. >> >> On 10/18/22 17:33, Abhiram Sangana wrote: >>> To identify connections dropped by ACLs, users can enable logging for ACLs >>> but this approach does not scale. ACL logging uses "controller" action >>> which causes a significant spike in the CPU usage of ovs-vswitchd (and >>> ovn-controller to a lesser extent) even with metering enabled (observed >>> 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another >>> approach is to use drop sampling (patch by Adrian Moreno currently in >>> review) but we might miss specific connections of interest with this >>> approach. >>> >>> This patch commits connections dropped by ACLs to the connection tracking >>> table with a specific ACL label that was introduced in 0e0228be ( >>> northd: Add ACL label). The dropped connections are committed in a separate >>> CT zone so that they can be managed independently. >>> >> >> I'm not sure I understand how the CMS can manage this. How is this >> better than sampling? Committed connections are going to time out at >> some point (30 sec by default for udp/icmp with the kernel datapath). >> So the CMS will have to continuously monitor the contents of the >> conntrack zone? Aren't we just moving the CPU load somewhere else with >> this? Even so, there's a chance an entry is missed. > > Linux nf_conntrack module supports sending connection tracking events > to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel > parameter). So, CMS can parse the stream of new connection events from > conntrack and log the packets based on CT label. > Ack. > An issue with sampling is that if there are a large number of packets > for a particular connection(s), packets of other connections might not > get sampled and we miss information about these connections. With the > conntrack approach, we get a single event for each connection (until > they time out), so there is lesser load on the CMS/collector and lesser > likelihood of missing connections. > I hadn't considered this advantage, thanks for pointing it out! >> >>> Each logical port is assigned a new zone for committing dropped flows. >>> The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. >>> >>> A new lflow action "ct_commit_drop" is introduced that commits flows >>> to connection tracking table in a zone identified by >>> MFF_LOG_ACL_DROP_ZONE register. >>> >>> An ACL with "drop" action and non-empty label is translated to >>> "ct_commit_drop" >>> instead of silently dropping the packet. >>> >>> Signed-off-by: Abhiram Sangana >>> --- >>> controller/ovn-controller.c | 23 ++--- >>> controller/physical.c| 32 +-- >>> include/ovn/actions.h| 1 + >>> include/ovn/logical-fields.h | 1 + >>> lib/actions.c| 50 >>> lib/ovn-util.c | 4 +-- >>> lib/ovn-util.h | 2 +- >>> northd/northd.c | 14 -- >>> northd/ovn-northd.8.xml | 14 -- >>> ovn-sb.xml | 17 >>> ovs | 2 +- >> >> This shouldn't need to change the OVS submodule. >> > My bad. Will fix this. > Cool, thanks, looking forward for v1! Regards, Dumitru > Thanks, > Abhiram Sangana > >> Thanks, >> Dumitru >> >>> utilities/ovn-nbctl.c| 7 ++--- >>> 12 files changed, 151 insertions(+), 16 deletions(-) >>> >>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >>> index c97744d57..1ad20fe55 100644 >>> --- a/controller/ovn-controller.c >>> +++ b/controller/ovn-controller.c >>> @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports, >>> unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)]; >>> >>> struct shash_node *shash_node; >>> +const struct binding_lport *lport; >>> SHASH_FOR_EACH (shash_node, binding_lports) { >>> sset_add(_users, shash_node->name); >>> + >>> +/* Zone for committing dropped connections of a vNIC. */ >>> +lport = shash_node->data; >>> +char *drop_zone = alloc_ct_zone_key(>pb->header_.uuid, >>> "drop"); >>> +sset_add(_users, drop_zone); >>> +free(drop_zone); >>> } >>> >>> /* Local patched datapath (gateway routers) need zones assigned. */ >>> @@ -670,8 +677,8 @@ update_ct_zones(const struct shash *binding_lports, >>> HMAP_FOR_EACH (ld, hmap_node, local_datapaths) { >>> /* XXX Add method to limit zone assignment to logical router >>> * datapaths with NAT */ >>> -char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, >>> "dnat"); >>> -char *snat = alloc_nat_zone_key(>datapath->header_.uuid, >>> "snat"); >>> +char
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
Hi Dumitru, Thanks for reviewing the patch. > On 19 Oct 2022, at 14:09, Dumitru Ceara wrote: > > Hi Abhiram, > > Thanks for the patch! I only skimmed the changes so this is not a full > review but more of a discussion starter. > > On 10/18/22 17:33, Abhiram Sangana wrote: >> To identify connections dropped by ACLs, users can enable logging for ACLs >> but this approach does not scale. ACL logging uses "controller" action >> which causes a significant spike in the CPU usage of ovs-vswitchd (and >> ovn-controller to a lesser extent) even with metering enabled (observed >> 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another >> approach is to use drop sampling (patch by Adrian Moreno currently in >> review) but we might miss specific connections of interest with this >> approach. >> >> This patch commits connections dropped by ACLs to the connection tracking >> table with a specific ACL label that was introduced in 0e0228be ( >> northd: Add ACL label). The dropped connections are committed in a separate >> CT zone so that they can be managed independently. >> > > I'm not sure I understand how the CMS can manage this. How is this > better than sampling? Committed connections are going to time out at > some point (30 sec by default for udp/icmp with the kernel datapath). > So the CMS will have to continuously monitor the contents of the > conntrack zone? Aren't we just moving the CPU load somewhere else with > this? Even so, there's a chance an entry is missed. Linux nf_conntrack module supports sending connection tracking events to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel parameter). So, CMS can parse the stream of new connection events from conntrack and log the packets based on CT label. An issue with sampling is that if there are a large number of packets for a particular connection(s), packets of other connections might not get sampled and we miss information about these connections. With the conntrack approach, we get a single event for each connection (until they time out), so there is lesser load on the CMS/collector and lesser likelihood of missing connections. > >> Each logical port is assigned a new zone for committing dropped flows. >> The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. >> >> A new lflow action "ct_commit_drop" is introduced that commits flows >> to connection tracking table in a zone identified by >> MFF_LOG_ACL_DROP_ZONE register. >> >> An ACL with "drop" action and non-empty label is translated to >> "ct_commit_drop" >> instead of silently dropping the packet. >> >> Signed-off-by: Abhiram Sangana >> --- >> controller/ovn-controller.c | 23 ++--- >> controller/physical.c| 32 +-- >> include/ovn/actions.h| 1 + >> include/ovn/logical-fields.h | 1 + >> lib/actions.c| 50 >> lib/ovn-util.c | 4 +-- >> lib/ovn-util.h | 2 +- >> northd/northd.c | 14 -- >> northd/ovn-northd.8.xml | 14 -- >> ovn-sb.xml | 17 >> ovs | 2 +- > > This shouldn't need to change the OVS submodule. > My bad. Will fix this. Thanks, Abhiram Sangana > Thanks, > Dumitru > >> utilities/ovn-nbctl.c| 7 ++--- >> 12 files changed, 151 insertions(+), 16 deletions(-) >> >> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >> index c97744d57..1ad20fe55 100644 >> --- a/controller/ovn-controller.c >> +++ b/controller/ovn-controller.c >> @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports, >> unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)]; >> >> struct shash_node *shash_node; >> +const struct binding_lport *lport; >> SHASH_FOR_EACH (shash_node, binding_lports) { >> sset_add(_users, shash_node->name); >> + >> +/* Zone for committing dropped connections of a vNIC. */ >> +lport = shash_node->data; >> +char *drop_zone = alloc_ct_zone_key(>pb->header_.uuid, >> "drop"); >> +sset_add(_users, drop_zone); >> +free(drop_zone); >> } >> >> /* Local patched datapath (gateway routers) need zones assigned. */ >> @@ -670,8 +677,8 @@ update_ct_zones(const struct shash *binding_lports, >> HMAP_FOR_EACH (ld, hmap_node, local_datapaths) { >> /* XXX Add method to limit zone assignment to logical router >> * datapaths with NAT */ >> -char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, >> "dnat"); >> -char *snat = alloc_nat_zone_key(>datapath->header_.uuid, >> "snat"); >> +char *dnat = alloc_ct_zone_key(>datapath->header_.uuid, "dnat"); >> +char *snat = alloc_ct_zone_key(>datapath->header_.uuid, "snat"); >> sset_add(_users, dnat); >> sset_add(_users, snat); >> shash_add(_lds, dnat, ld); >> @@ -2090,7 +2097,7 @@
Re: [ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
Hi Abhiram, Thanks for the patch! I only skimmed the changes so this is not a full review but more of a discussion starter. On 10/18/22 17:33, Abhiram Sangana wrote: > To identify connections dropped by ACLs, users can enable logging for ACLs > but this approach does not scale. ACL logging uses "controller" action > which causes a significant spike in the CPU usage of ovs-vswitchd (and > ovn-controller to a lesser extent) even with metering enabled (observed > 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another > approach is to use drop sampling (patch by Adrian Moreno currently in > review) but we might miss specific connections of interest with this > approach. > > This patch commits connections dropped by ACLs to the connection tracking > table with a specific ACL label that was introduced in 0e0228be ( > northd: Add ACL label). The dropped connections are committed in a separate > CT zone so that they can be managed independently. > I'm not sure I understand how the CMS can manage this. How is this better than sampling? Committed connections are going to time out at some point (30 sec by default for udp/icmp with the kernel datapath). So the CMS will have to continuously monitor the contents of the conntrack zone? Aren't we just moving the CPU load somewhere else with this? Even so, there's a chance an entry is missed. > Each logical port is assigned a new zone for committing dropped flows. > The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. > > A new lflow action "ct_commit_drop" is introduced that commits flows > to connection tracking table in a zone identified by > MFF_LOG_ACL_DROP_ZONE register. > > An ACL with "drop" action and non-empty label is translated to > "ct_commit_drop" > instead of silently dropping the packet. > > Signed-off-by: Abhiram Sangana > --- > controller/ovn-controller.c | 23 ++--- > controller/physical.c| 32 +-- > include/ovn/actions.h| 1 + > include/ovn/logical-fields.h | 1 + > lib/actions.c| 50 > lib/ovn-util.c | 4 +-- > lib/ovn-util.h | 2 +- > northd/northd.c | 14 -- > northd/ovn-northd.8.xml | 14 -- > ovn-sb.xml | 17 > ovs | 2 +- This shouldn't need to change the OVS submodule. Thanks, Dumitru > utilities/ovn-nbctl.c| 7 ++--- > 12 files changed, 151 insertions(+), 16 deletions(-) > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index c97744d57..1ad20fe55 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports, > unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)]; > > struct shash_node *shash_node; > +const struct binding_lport *lport; > SHASH_FOR_EACH (shash_node, binding_lports) { > sset_add(_users, shash_node->name); > + > +/* Zone for committing dropped connections of a vNIC. */ > +lport = shash_node->data; > +char *drop_zone = alloc_ct_zone_key(>pb->header_.uuid, > "drop"); > +sset_add(_users, drop_zone); > +free(drop_zone); > } > > /* Local patched datapath (gateway routers) need zones assigned. */ > @@ -670,8 +677,8 @@ update_ct_zones(const struct shash *binding_lports, > HMAP_FOR_EACH (ld, hmap_node, local_datapaths) { > /* XXX Add method to limit zone assignment to logical router > * datapaths with NAT */ > -char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat"); > -char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat"); > +char *dnat = alloc_ct_zone_key(>datapath->header_.uuid, "dnat"); > +char *snat = alloc_ct_zone_key(>datapath->header_.uuid, "snat"); > sset_add(_users, dnat); > sset_add(_users, snat); > shash_add(_lds, dnat, ld); > @@ -2090,7 +2097,7 @@ ct_zones_datapath_binding_handler(struct engine_node > *node, void *data) > /* Check if the requested snat zone has changed for the datapath > * or not. If so, then fall back to full recompute of > * ct_zone engine. */ > -char *snat_dp_zone_key = alloc_nat_zone_key(>header_.uuid, > "snat"); > +char *snat_dp_zone_key = alloc_ct_zone_key(>header_.uuid, > "snat"); > struct simap_node *simap_node = simap_find(_zones_data->current, > snat_dp_zone_key); > free(snat_dp_zone_key); > @@ -2148,6 +2155,16 @@ ct_zones_runtime_data_handler(struct engine_node > *node, void *data) > _zones_data->pending); > updated = true; > } > +char *drop_zone = alloc_ct_zone_key( > +
[ovs-dev] [RFC PATCH ovn] northd, controller: Commit flows dropped by ACLs in a separate CT zone
To identify connections dropped by ACLs, users can enable logging for ACLs but this approach does not scale. ACL logging uses "controller" action which causes a significant spike in the CPU usage of ovs-vswitchd (and ovn-controller to a lesser extent) even with metering enabled (observed 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another approach is to use drop sampling (patch by Adrian Moreno currently in review) but we might miss specific connections of interest with this approach. This patch commits connections dropped by ACLs to the connection tracking table with a specific ACL label that was introduced in 0e0228be ( northd: Add ACL label). The dropped connections are committed in a separate CT zone so that they can be managed independently. Each logical port is assigned a new zone for committing dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. A new lflow action "ct_commit_drop" is introduced that commits flows to connection tracking table in a zone identified by MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" action and non-empty label is translated to "ct_commit_drop" instead of silently dropping the packet. Signed-off-by: Abhiram Sangana --- controller/ovn-controller.c | 23 ++--- controller/physical.c| 32 +-- include/ovn/actions.h| 1 + include/ovn/logical-fields.h | 1 + lib/actions.c| 50 lib/ovn-util.c | 4 +-- lib/ovn-util.h | 2 +- northd/northd.c | 14 -- northd/ovn-northd.8.xml | 14 -- ovn-sb.xml | 17 ovs | 2 +- utilities/ovn-nbctl.c| 7 ++--- 12 files changed, 151 insertions(+), 16 deletions(-) diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index c97744d57..1ad20fe55 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports, unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)]; struct shash_node *shash_node; +const struct binding_lport *lport; SHASH_FOR_EACH (shash_node, binding_lports) { sset_add(_users, shash_node->name); + +/* Zone for committing dropped connections of a vNIC. */ +lport = shash_node->data; +char *drop_zone = alloc_ct_zone_key(>pb->header_.uuid, "drop"); +sset_add(_users, drop_zone); +free(drop_zone); } /* Local patched datapath (gateway routers) need zones assigned. */ @@ -670,8 +677,8 @@ update_ct_zones(const struct shash *binding_lports, HMAP_FOR_EACH (ld, hmap_node, local_datapaths) { /* XXX Add method to limit zone assignment to logical router * datapaths with NAT */ -char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat"); -char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat"); +char *dnat = alloc_ct_zone_key(>datapath->header_.uuid, "dnat"); +char *snat = alloc_ct_zone_key(>datapath->header_.uuid, "snat"); sset_add(_users, dnat); sset_add(_users, snat); shash_add(_lds, dnat, ld); @@ -2090,7 +2097,7 @@ ct_zones_datapath_binding_handler(struct engine_node *node, void *data) /* Check if the requested snat zone has changed for the datapath * or not. If so, then fall back to full recompute of * ct_zone engine. */ -char *snat_dp_zone_key = alloc_nat_zone_key(>header_.uuid, "snat"); +char *snat_dp_zone_key = alloc_ct_zone_key(>header_.uuid, "snat"); struct simap_node *simap_node = simap_find(_zones_data->current, snat_dp_zone_key); free(snat_dp_zone_key); @@ -2148,6 +2155,16 @@ ct_zones_runtime_data_handler(struct engine_node *node, void *data) _zones_data->pending); updated = true; } +char *drop_zone = alloc_ct_zone_key( +_lport->pb->header_.uuid, "drop"); +if (!simap_contains(_zones_data->current, drop_zone)) { +alloc_id_to_ct_zone(drop_zone, +_zones_data->current, +ct_zones_data->bitmap, _start, +_zones_data->pending); +updated = true; +} +free(drop_zone); } else if (t_lport->tracked_type == TRACKED_RESOURCE_REMOVED) { struct simap_node *ct_zone = simap_find(_zones_data->current, diff --git a/controller/physical.c b/controller/physical.c index f3c8bddce..fc46669c1 100644 --- a/controller/physical.c +++ b/controller/physical.c @@ -60,6 +60,7 @@ struct zone_ids { int ct; /*