Re: [ovs-dev] [PATCH ovn v4 5/9] northd: Add ACL Sampling.

Dumitru Ceara Wed, 31 Jul 2024 10:38:34 -0700

On 7/31/24 18:39, Ilya Maximets wrote:
> On 7/31/24 18:17, Ilya Maximets wrote:
>> On 7/31/24 11:05, Dumitru Ceara wrote:
>>> From: Adrian Moreno <[email protected]>
>>>
>>> Introduce a new table called Sample where per-flow IPFIX configuration
>>> can be specified.
>>> Also, reference rows from such table from the ACL table to enable the
>>> configuration of ACL sampling. If enabled, northd will add a sample
>>> action to each ACL related logical flow.
>>>
>>> Packets that hit stateful ACLs are sampled in different ways depending
>>> on whether they are initiating a new session or are just forwarded on an
>>> existing (already allowed) session.  Two new columns ("sample_new" and
>>> "sample_est") are added to the ACL table to allow for potentially
>>> different sampling rates for the two cases.
>>
>> Do we actually need to be able to sample each ACL to a different set of
>> collectors?


Yes, please see below.

>> I mean, is it not enough to have a common configuration per sampling type
>> (drop, acl-new, acl-est) and two boolean fields per ACL to opt-in as we do
>> for acl logging?
>> Or a map ACL:sample = key: enum [new,est], value: metadata ?
>>
>>
>> Something like:
>>
>>  NB_Global:
>>    sample_config: map key enum [drop, acl-new, acl-est]
>>                       value ref Sample_Collector_Set
>>
>>  Sample_Collector_Set:
>>    ids: set key integer
>>    probability: integer
>>
>>  ACL:
>>    sample_metadata: map key enum [new, est]
>>                         value integer
>>
>>>
>>> Note: If an ACL has both sampling enabled and a label associated to it
>>> then the label value overrides the observation point ID defined in the
>>> sample configuration.  This is a side effect of the implementation
>>> (observation point IDs are stored in conntrack in the same part of the
>>> ct_label where ACL labels are also stored).  However, the two features
>>> (sampling and ACL labels) serve similar purposes so it's not
>>> expected that they're both enabled together.
>>>
>>> When sampling is enabled on an ACL additional logical flows are created
>>> for that ACL (one for stateless ACLs and 3 for stateful ACLs) in the ACL
>>> action stage of the logical pipeline.  These additional flows match on a
>>> combination of conntrack state values and observation point id values
>>> (either against a logical register or against the stored ct_label state)
>>> in order to determine whether the packets hitting the ACLs must be
>>> sampled or not.  This comes with a slight increase in the number of
>>> logical flows and in the number of OpenFlow rules.  The number of
>>> additional flows _does not_ depend on the original ACL match or action.
>>>
>>> New --sample-new and --sample-est optional arguments are added to the
>>> 'ovn-nbctl acl-add' command to allow configuring these new types of
>>> sampling for ACLs.
>>>
>>> An example workflow of configuring ACL samples is:
>>>   # Create Sampling_App mappings for ACL traffic types:
>>>   ovn-nbctl create Sampling_App name="acl-new-traffic-sampling" \
>>>                                 id="42"
>>>   ovn-nbctl create Sampling_App name="acl-est-traffic-sampling" \
>>>                                 id="43"
>>>   # Create two sample collectors, one that samples all packets (c1)
>>>   # and another one that samples with a probability of 10% (c2):
>>>   c1=$(ovn-nbctl create Sample_Collector name=c1 \
>>>        probability=65535 set_id=1)
>>>   c2=$(ovn-nbctl create Sample_Collector name=c2 \
>>>        probability=6553 set_id=2)
>>>   # Create two sample configurations (for new and for established
>>>   # traffic):
>>>   s1=$(ovn-nbctl create Sample collector="$c1 $c2" metadata=4301)
>>>   s2=$(ovn-nbctl create Sample collector="$c1 $c2" metadata=4302)
>>>   # Create an ingress ACL to allow IP traffic:
>>>   ovn-nbctl --sample-new=$s1 --sample-est=$s2 acl-add ls \
>>>             from-lport 1 "ip" allow-related
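A small aside on the probability values in the example: going by the
comments there (65535 samples everything, 6553 is about 10%), the value
appears to be a 16-bit fraction.  A minimal sketch of the conversion,
assuming that scale; the helper name probability_from_percent is made up
for illustration and is not part of ovn-nbctl:

```shell
# Convert a sampling percentage (0-100) to the 16-bit probability value,
# ASSUMING 65535 means "sample every packet" as in the example above.
# Integer division truncates, so 10% becomes 6553 rather than 6554.
probability_from_percent() {
    echo $(( $1 * 65535 / 100 ))
}

probability_from_percent 100   # prints 65535
probability_from_percent 10    # prints 6553
```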
>>>
>>> The config above will generate IPFIX samples with:
>>> - 8 MSB of observation domain id set to 42 (Sampling_App
>>>   "acl-new-traffic-sampling" config) and observation point id
>>>   set to 4301 (Sample s1) for packets that create a new
>>>   connection
>>> - 8 MSB of observation domain id set to 43 (Sampling_App
>>>   "acl-est-traffic-sampling" config) and observation point id
>>>   set to 4302 (Sample s2) for packets that are part of an already
>>>   existing connection
>>>
>>> Note: in general, all generated IPFIX sample observation domain IDs are
>>> built by ovn-controller in the following way:
>>> The 8 MSB are taken from the sample action's obs_domain_id and the 24
>>> LSB are taken from the Southbound logical datapath tunnel_key (datapath ID).
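To make the bit layout concrete, here is a rough sketch of that
composition in shell arithmetic.  It only illustrates the description
above and is not the ovn-controller code; the function name
obs_domain_id and the tunnel_key value 5 are made up:

```shell
# Compose a 32-bit observation domain ID as described above:
#   bits 31..24 = the sample action's obs_domain_id (Sampling_App id)
#   bits 23..0  = the Southbound logical datapath tunnel_key
obs_domain_id() {
    echo $(( (($1 & 0xFF) << 24) | ($2 & 0xFFFFFF) ))
}

# Sampling_App id 42 on a datapath whose tunnel_key is 5 (hypothetical).
obs_domain_id 42 5    # prints 704643077 (0x2a000005)
```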
>>>
>>> Reported-at: https://issues.redhat.com/browse/FDP-305
>>> Signed-off-by: Adrian Moreno <[email protected]>
>>> Co-authored-by: Ales Musil <[email protected]>
>>> Signed-off-by: Ales Musil <[email protected]>
>>> Co-authored-by: Dumitru Ceara <[email protected]>
>>> Signed-off-by: Dumitru Ceara <[email protected]>
>>> ---
>>> V4:
>>> - added explicit sampling stages
>>> - reduced set_id max supported value
>>
>> I don't get that.  Does it end up in the observation domain/point
>> somehow?  Or in conntrack mark/label?  Sounds strange.  If it's only
>> in logical flow and OpenFlow actions, then it shouldn't matter what
>> the ID is.  Or am I missing something?
> 
> Hmm, I see that they are indeed passed around via registers and
> committed to conntrack...
> 
> But I'm still not sure why we would need to have different collectors
> per ACL.  We can already differentiate by metadata and sending to
> different collectors on this level sounds like a very niche use-case
> that may not even exist.
> 

There actually is a real use case behind this.  Two different external
actors observe how traffic is processed by OVN ACLs:

a. one of them wishes to receive samples of traffic hitting all ACLs
(with a given probability, P1)
b. the other one wishes to receive samples of traffic hitting some of
the ACLs (with a different probability, P2)

The sets of ACLs that have sampling enabled in "a" and "b" above are
usually _not_ disjoint.

To make it less abstract:
- "a" is a k8s network observability application (for the whole cluster
and for all network policies - ACLs - in the cluster)
- "b" is an OVN-Kubernetes module that is used for debugging the traffic
path in a live cluster, e.g., trying to figure out why a certain type of
traffic is allowed while there's no network policy that allows it.

They both need to be able to function simultaneously.
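
As a rough sketch of how that could look with the tables from this patch
(syntax copied from the example in the commit message; the collector
names, set_ids, metadata values, switch name "ls" and ACL matches are
all made up for illustration):

```shell
# Observer "a": cluster-wide observability, probability P1 (here ~10%).
c_a=$(ovn-nbctl create Sample_Collector name=observability \
      probability=6553 set_id=1)
# Observer "b": debugging module, probability P2 (here 100%).
c_b=$(ovn-nbctl create Sample_Collector name=debug \
      probability=65535 set_id=2)

# ACLs only "a" is interested in reference a Sample with one collector.
s_a=$(ovn-nbctl create Sample collector="$c_a" metadata=1001)
# ACLs both are interested in reference a Sample with both collectors.
s_ab=$(ovn-nbctl create Sample collector="$c_a $c_b" metadata=1002)

ovn-nbctl --sample-new=$s_a acl-add ls from-lport 1 "ip4" allow-related
ovn-nbctl --sample-new=$s_ab acl-add ls from-lport 2 "tcp" allow-related
```

With per-ACL Sample references, the second ACL is sampled to both
collectors at their own rates while the first one only goes to "a".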

> If we use the schema that I suggested above, we may not need to
> match on the collector set ids, so they will not take space in the
> registers or conntrack mark/labels.
> 

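That's true, but I don't think we can use the schema you suggest: with a
single collector set and probability per sampling type it can't express
the overlapping "a" and "b" cases above.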

> Best regards, Ilya Maximets.
> 

Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
