On 7/31/24 18:39, Ilya Maximets wrote:
> On 7/31/24 18:17, Ilya Maximets wrote:
>> On 7/31/24 11:05, Dumitru Ceara wrote:
>>> From: Adrian Moreno <[email protected]>
>>>
>>> Introduce a new table called Sample where per-flow IPFIX configuration
>>> can be specified.
>>> Also, reference rows from such table from the ACL table to enable the
>>> configuration of ACL sampling. If enabled, northd will add a sample
>>> action to each ACL related logical flow.
>>>
>>> Packets that hit stateful ACLs are sampled in different ways depending
>>> whether they are initiating a new session or are just forwarded on an
>>> existing (already allowed) session. Two new columns ("sample_new" and
>>> "sample_est") are added to the ACL table to allow for potentially
>>> different sampling rates for the two cases.
>>
>> Do we actually need to be able to sample each ACL to a different set of
>> collectors?

Yes, please see below.

>> I mean, is it not enough to have a common configuration per sampling type
>> (drop, acl-new, acl-est) and two boolean fields per ACL to opt-in as we do
>> for acl logging?
>> Or a map ACL:sample = key: enum [new,est], value: metadata ?
>>
>>
>> Something like:
>>
>> NB_Global:
>>   sample_config: map key enum [drop, acl-new, acl-est]
>>                      value ref Sample_Collector_Set
>>
>> Sample_Collector_Set:
>>   ids: set key integer
>>   probability: integer
>>
>> ACL:
>>   sample_metadata: map key enum [new, est]
>>                        value integer
>>
>>>
>>> Note: If an ACL has both sampling enabled and a label associated to it
>>> then the label value overrides the observation point ID defined in the
>>> sample configuration. This is a side effect of the implementation
>>> (observation point IDs are stored in conntrack in the same part of the
>>> ct_label where ACL labels are also stored). The two features
>>> (sampling and ACL labels) serve however similar purposes so it's not
>>> expected that they're both enabled together.
>>>
>>> When sampling is enabled on an ACL additional logical flows are created
>>> for that ACL (one for stateless ACLs and 3 for stateful ACLs) in the ACL
>>> action stage of the logical pipeline. These additional flows match on a
>>> combination of conntrack state values and observation point id values
>>> (either against a logical register or against the stored ct_label state)
>>> in order to determine whether the packets hitting the ACLs must be
>>> sampled or not. This comes with a slight increase in the number of
>>> logical flows and in the number of OpenFlow rules. The number of
>>> additional flows _does not_ depend on the original ACL match or action.
>>>
>>> New --sample-new and --sample-est optional arguments are added to the
>>> 'ovn-nbctl acl-add' command to allow configuring these new types of
>>> sampling for ACLs.
>>>
>>> An example workflow of configuring ACL samples is:
>>>   # Create Sampling_App mappings for ACL traffic types:
>>>   ovn-nbctl create Sampling_App name="acl-new-traffic-sampling" \
>>>     id="42"
>>>   ovn-nbctl create sampling_app name="acl-est-traffic-sampling" \
>>>     id="43"
>>>   # Create two sample collectors, one that samples all packets (c1)
>>>   # and another one that samples with a probability of 10% (c2):
>>>   c1=$(ovn-nbctl create Sample_Collector name=c1 \
>>>        probability=65535 set_id=1)
>>>   c2=$(ovn-nbctl create Sample_Collector name=c2 \
>>>        probability=6553 set_id=2)
>>>   # Create two sample configurations (for new and for established
>>>   # traffic):
>>>   s1=$(ovn-nbctl create sample collector="$c1 $c2" metadata=4301)
>>>   s2=$(ovn-nbctl create sample collector="$c1 $c2" metadata=4302)
>>>   # Create an ingress ACL to allow IP traffic:
>>>   ovn-nbctl --sample-new=$s1 --sample-est=$s2 acl-add ls \
>>>     from-lport 1 "ip" allow-related
>>>
>>> The config above will generate IPFIX samples with:
>>> - 8 MSB of observation domain id set to 42 (Sampling_App
>>>   "acl-new-traffic-sampling" config) and observation point id
>>>   set to 4301 (Sample s1) for packets that create a new
>>>   connection
>>> - 8 MSB of observation domain id set to 43 (Sampling_app
>>>   "acl-est-traffic-sampling" config) and observation point id
>>>   set to 4302 (Sample s2) for packets that are part of an already
>>>   existing connection
>>>
>>> Note: in general, all generated IPFIX sample observation domain IDs are
>>> built by ovn-controller in the following way:
>>> The 8 MSB taken from the sample action's obs_domain_id and the last 24
>>> LSB taken from the Southbound logical datapath tunnel_key (datapath ID).
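
For reference, the observation domain ID layout described in the note
above can be sketched as follows. This is only an illustration in
Python, not the ovn-controller code: the function names and the
tunnel_key value (5) are made up for the example, and the probability
helper assumes that 65535 means "sample every packet", as in the c1
collector above.

```python
def ipfix_obs_domain_id(app_id: int, tunnel_key: int) -> int:
    """Build a 32-bit observation domain ID: the 8 MSB come from the
    Sampling_App id, the 24 LSB from the datapath tunnel_key."""
    if not 0 <= app_id <= 0xFF:
        raise ValueError("Sampling_App id must fit in 8 bits")
    if not 0 <= tunnel_key <= 0xFFFFFF:
        raise ValueError("tunnel_key must fit in 24 bits")
    return (app_id << 24) | tunnel_key


def probability_to_fraction(probability: int) -> float:
    """Convert a Sample_Collector probability to a fraction of packets,
    assuming 65535 corresponds to 100% (hypothetical helper)."""
    if not 0 <= probability <= 65535:
        raise ValueError("probability must be in the range 0..65535")
    return probability / 65535


if __name__ == "__main__":
    # Hypothetical datapath with tunnel_key 5, "acl-new" app id 42.
    print(hex(ipfix_obs_domain_id(42, 5)))
    # c2 from the example workflow (probability=6553).
    print(round(probability_to_fraction(6553), 4))
```

With these assumptions, samples for new connections on that datapath
would carry observation domain ID 0x2a000005 and observation point ID
4301, and c2 would receive roughly one packet in ten.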
>>>
>>> Reported-at: https://issues.redhat.com/browse/FDP-305
>>> Signed-off-by: Adrian Moreno <[email protected]>
>>> Co-authored-by: Ales Musil <[email protected]>
>>> Signed-off-by: Ales Musil <[email protected]>
>>> Co-authored-by: Dumitru Ceara <[email protected]>
>>> Signed-off-by: Dumitru Ceara <[email protected]>
>>> ---
>>> V4:
>>> - added explicit sampling stages
>>> - reduced set_id max supported value
>>
>> I don't get that. Does it end up in the observation domain/point
>> somehow? Or in conntrack mark/label? Sounds strange. If it's only
>> in logical flow and OpenFlow actions, then it shouldn't matter what
>> the ID is. Or am I missing something?
>
> Hmm, I see that they are indeed passed around via registers and
> committed to conntrack...
>
> But I'm still not sure why we would need to have different collectors
> per ACL. We can already differentiate by metadata and sending to
> different collectors on this level sounds like a very niche use-case
> that may not even exist.
>

There actually is a real use case behind this. Two different external
actors observe how traffic is processed by OVN ACLs:

a. one of them wishes to receive samples of traffic hitting all ACLs
   (with a given probability, P1)
b. the other one wishes to receive samples of traffic hitting some of
   the ACLs (with a different probability, P2)

The sets of ACLs that have sampling enabled in "a" and "b" above are
usually _not_ disjoint.

To make it less abstract:
- "a" is a k8s network observability application (for the whole cluster
  and for all network policies - ACLs - in the cluster)
- "b" is an OVN-Kubernetes module that is used for debugging the traffic
  path in a live cluster, e.g., trying to figure out why a certain type
  of traffic is allowed while there's no network policy that allows it.

They both need to be able to function simultaneously.
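
A rough way to picture the requirement is the small Python model below.
It is just a sketch, not OVN code: the collector names, ACL names,
metadata values and probabilities are all hypothetical. The point it
tries to show is that each ACL references its own list of collectors, so
the sets of ACLs seen by "a" and "b" can overlap without being equal.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Collector:
    name: str
    set_id: int
    probability: int  # 0..65535, where 65535 samples every packet


# Hypothetical collectors for actors "a" and "b" described above.
observability = Collector("observability", set_id=1, probability=6553)
debug = Collector("debug", set_id=2, probability=65535)

# Per-ACL sample configuration: ACL name -> (metadata, collectors).
# "a" samples every ACL; "b" samples only the ACL being debugged.
acl_samples: Dict[str, Tuple[int, List[Collector]]] = {
    "acl-1": (4301, [observability]),
    "acl-2": (4302, [observability, debug]),
    "acl-3": (4303, [observability]),
}


def acls_sampled_by(collector: Collector) -> List[str]:
    """Return the names of the ACLs whose samples reach this collector."""
    return sorted(
        acl for acl, (_, collectors) in acl_samples.items()
        if collector in collectors
    )


if __name__ == "__main__":
    print(acls_sampled_by(observability))
    print(acls_sampled_by(debug))
```

A single global collector set per sampling type could not express
"acl-2" above, where the same ACL feeds two collectors with different
probabilities while its neighbours feed only one.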

> If we use the schema that I suggested above, we may not need to
> match on the collector set ids, so they will not take space in the
> registers or conntrack mark/labels.
>

That's true, but I don't think we can use the schema you suggest.

> Best regards, Ilya Maximets.
>

Regards,
Dumitru
