On 8/1/24 13:58, Ilya Maximets wrote: > On 7/31/24 19:38, Dumitru Ceara wrote: >> On 7/31/24 18:39, Ilya Maximets wrote: >>> On 7/31/24 18:17, Ilya Maximets wrote: >>>> On 7/31/24 11:05, Dumitru Ceara wrote: >>>>> From: Adrian Moreno <[email protected]> >>>>> >>>>> Introduce a new table called Sample where per-flow IPFIX configuration >>>>> can be specified. >>>>> Also, reference rows from such table from the ACL table to enable the >>>>> configuration of ACL sampling. If enabled, northd will add a sample >>>>> action to each ACL related logical flow. >>>>> >>>>> Packets that hit stateful ACLs are sampled in different ways depending >>>>> whether they are initiating a new session or are just forwarded on an >>>>> existing (already allowed) session. Two new columns ("sample_new" and >>>>> "sample_est") are added to the ACL table to allow for potentially >>>>> different sampling rates for the two cases. >>>> >>>> Do we actually need to be able to sample each ACL to a different set of >>>> collectors? >> >> Yes, please see below. >> >>>> I mean, is it not enough to have a common configuration per sampling type >>>> (drop, >>>> acl-new, acl-est) and two boolean fields per ACL to opt-in as we do for >>>> acl logging? >>>> Or a map ACL:sample = key: enum [new,est], value: metadata ? >>>> >>>> >>>> Something like: >>>> >>>> NB_Global: >>>> sample_config: map key enum [drop, acl-new, acl-est] >>>> value ref Sample_Collector_Set >>>> >>>> Sample_Collector_Set: >>>> ids: set key integer >>>> probability: integer >>>> >>>> ACL: >>>> sample_metadata: map key enum [new, est] >>>> value integer >>>> >>>>> >>>>> Note: If an ACL has both sampling enabled and a label associated to it >>>>> then the label value overrides the observation point ID defined in the >>>>> sample configuration. This is a side effect of the implementation >>>>> (observation point IDs are stored in conntrack in the same part of the >>>>> ct_label where ACL labels are also stored). The two features >>>>> (sampling and ACL labels) serve however similar purposes so it's not >>>>> expected that they're both enabled together. >>>>> >>>>> When sampling is enabled on an ACL additional logical flows are created >>>>> for that ACL (one for stateless ACLs and 3 for stateful ACLs) in the ACL >>>>> action stage of the logical pipeline. These additional flows match on a >>>>> combination of conntrack state values and observation point id values >>>>> (either against a logical register or against the stored ct_label state) >>>>> in order to determine whether the packets hitting the ACLs must be >>>>> sampled or not. This comes with a slight increase in the number of >>>>> logical flows and in the number of OpenFlow rules. The number of >>>>> additional flows _does not_ depend on the original ACL match or action. >>>>> >>>>> New --sample-new and --sample-est optional arguments are added to the >>>>> 'ovn-nbctl acl-add' command to allow configuring these new types of >>>>> sampling for ACLs. >>>>> >>>>> An example workflow of configuring ACL samples is: >>>>> # Create Sampling_App mappings for ACL traffic types: >>>>> ovn-nbctl create Sampling_App name="acl-new-traffic-sampling" \ >>>>> id="42" >>>>> ovn-nbctl create sampling_app name="acl-est-traffic-sampling" \ >>>>> id="43" >>>>> # Create two sample collectors, one that samples all packets (c1) >>>>> # and another one that samples with a probability of 10% (c2): >>>>> c1=$(ovn-nbctl create Sample_Collector name=c1 \ >>>>> probability=65535 set_id=1) >>>>> c2=$(ovn-nbctl create Sample_Collector name=c2 \ >>>>> probability=6553 set_id=2) >>>>> # Create two sample configurations (for new and for established >>>>> # traffic): >>>>> s1=$(ovn-nbctl create sample collector="$c1 $c2" metadata=4301) >>>>> s2=$(ovn-nbctl create sample collector="$c1 $c2" metadata=4302) >>>>> # Create an ingress ACL to allow IP traffic: >>>>> ovn-nbctl --sample-new=$s1 --sample-est=$s2 acl-add ls \ >>>>> from-lport 1 "ip" allow-related >>>>> >>>>> The config above will generate IPFIX samples with: >>>>> - 8 MSB of observation domain id set to 42 (Sampling_App >>>>> "acl-new-traffic-sampling" config) and observation point id >>>>> set to 4301 (Sample s1) for packets that create a new >>>>> connection >>>>> - 8 MSB of observation domain id set to 43 (Sampling_app >>>>> "acl-est-traffic-sampling" config) and observation point id >>>>> set to 4302 (Sample s2) for packets that are part of an already >>>>> existing connection >>>>> >>>>> Note: in general, all generated IPFIX sample observation domain IDs are >>>>> built by ovn-controller in the following way: >>>>> The 8 MSB taken from the sample action's obs_domain_id and the last 24 >>>>> LSB taken from the Southbound logical datapath tunnel_key (datapath ID). >>>>> >>>>> Reported-at: https://issues.redhat.com/browse/FDP-305 >>>>> Signed-off-by: Adrian Moreno <[email protected]> >>>>> Co-authored-by: Ales Musil <[email protected]> >>>>> Signed-off-by: Ales Musil <[email protected]> >>>>> Co-authored-by: Dumitru Ceara <[email protected]> >>>>> Signed-off-by: Dumitru Ceara <[email protected]> >>>>> --- >>>>> V4: >>>>> - added explicit sampling stages >>>>> - reduced set_id max supported value >>>> >>>> I don't get that. Does it end up in the observation domain/point >>>> somehow? Or in conntrack mark/label? Sounds strange. If it's only >>>> in logical flow and OpenFlow actions, then it shouldn't matter what >>>> the ID is. Or am I missing something? >>> >>> Hmm, I see that they are indeed passed around via registers and >>> committed to conntrack... >>> >>> But I'm still not sure why we would need to have different collectors >>> per ACL. We can already differentiate by metadata and sending to >>> different collectors on this level sounds like a very niche use-case >>> that may not even exist. >>> >> >> There actually is a real use case behind this. Two different external >> actors that observe how traffic is processed by OVN ACLs: >> >> a. one of them wishes to receive samples of traffic hitting all ACLs >> (with a given probability, P1) >> b. the other one wishes to receive samples of traffic hitting some of >> the ACLs (with a different probability, P2) >> >> The sets of ACLs that have sampling enabled in "a" and "b" above are >> usually _not_ disjoint. >> >> To make it less abstract: >> - "a" is a k8s network observability application (for the whole cluster >> and for all network policies - ACLs - in the cluster) >> - "b" is an OVN-Kubernetes module that is used for debugging the traffic >> path in a live cluster, e.g., trying to figure out why a certain type of >> traffic is allowed while there's no network policy that allows it. >> >> They both need to be able to function simultaneously. > > Wouldn't "b" also need to listen on all the ACLs to figure out which > one is allowing the traffic? >
I listed that only as an example, "trying to figure out why a certain type of traffic is allowed while there's no network policy that allows it". But the general use case is to allow focused troubleshooting (of network policy for now) on live production setups. As far as I understand, users of "b" would not do "blanket enabling of sampling" because: - it might kill performance (if "a" is not already running) - they're troubleshooting a specific problem: traffic from pod X is allowed. So there's no point to blindly sample everything, we can for example sample only ACLs that corresponding to the namespace pod X lives in. Or even a single specific ACL if there's already a potential suspect network policy. > One reasonable scenario would be for "b" is "I want to know which > traffic is allowed by this one particular ACL", but I'm still not > convinced this is a common use case, because ACL itself says which > traffic it allows and it's usually what you described - we don't It might seem obvious to OVN developers but reading ACLs is not that straight forward for end users. The match potentially includes address sets, port groups, registers, essentially any OVS field. > know what is happening and need to apply blanket sampling on all of > them to figure out which one is a culprit of the issue. > > Even if we want to listen on a particular set of ACLs, I'm not sure > how bad it is if the collector will receive traffic for other ACLs > as well, it can filter those out. > > We can attach the probability to a collector, it will change the > suggested schema design, but we will still not need to match on > those fields. > I'm not sure I follow this part, the schema in the patch already adds the probability to the Sample_Collector. > Also, we're not providing the same flexibility for drop sampling, > even though it is treated as part of the same functionality. > We're not yet, and maybe we should add it in the future for default drop debugging too. However, legit (user-defined) drops due to the default-deny network policy rule are implemented through ACLs. So those are going to be sampled (to multiple collectors if applicable). > And "a" in the real world scenario scenario is likely using local > sampling with psample, so "b" can just tap into psample, filtering > out ACLs it doesn't need, instead of setting a separate collector > for itself. > Except that "a" and "b" might need different probabilities for different ACLs. > And the optimization that uses those registers is only possible when > there is only one collector configured per ACL, which will not be > the case in the described scenario. > While we want to support the case when "b" is running, "b" - the debugging application - will not be continuously running. And we're operating under the premise that it won't enable blanket sampling of all ACLs. So the optimization is still relevant for most ACLs. > > Speaking of matching, don't we also need to match on the value of > probability? Nothing prevents us from creating two collectors with > the same collector set ID but different probability. Meaning the > set id and the observation stage do not actually uniquely identify > the sampling configuration. > Maybe we should add [set-id] as index then. > OTOH, observation point id (a.k.a. Sample->metadata) does uniquely > identify sampling configuration, because it is an indexed field in > the database. Why can't we use it for matching instead and save > some registers? It's a single collector use case anyway, so metadata > matches to exactly one collector. > I'm not sure I follow this part. Are you talking about the sampling-with-registers optimization? If so, that's patch 9/9. >> >>> If we use the schema that I suggested above, we may not need to >>> match on the collector set ids, so they will not take space in the >>> registers or conntrack mark/labels. >>> >> >> That's true, but I don't think we can use the schema you suggest. >> >>> Best regards, Ilya Maximets. >>> >> >> Regards, >> Dumitru >> > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
