On 8/6/24 13:47, Dumitru Ceara wrote: > On 8/6/24 13:41, Nadia Pinaeva wrote: >> I don't think so, let's do the math together. >> >> I already have 2 apps (observability and debug) = 2 different >> sample_collector.set_id. Not many more apps are expected in the future, >> but I can easily imagine at least 1 extra, 3 total. >> We already have 2 sampling_apps: acl new and acl est. More apps are >> expected in the future, e.g. to sample on logical port ingress/egress, >> loadbalancers, NATs, etc. >> The amount of different probabilities depends on the number of ovn-k >> features for every sampling_app. For ACLs, we currently have 5 different >> features (I only mentioned 3 of them in my previous example). I expect >> other sampling_apps to have less features, but very possible more than 1. >> >> So, right now with the minimal functionality I need 2 set_ids * 2 >> sampling_apps * 5 ovn-k features = 20 different sample_collectors. >> If we, say, expect 10 new sampling_apps with 2 ovn-k features each (just >> an example, we don't know the exact number of features per future >> sampling_apps), it becomes 2 set_ids * ((acl_new + acl_est) * 5 features >> + 10 new sampling_apps * 2 (hypothetical) ovn-k features) = 2 * (10 + >> 20) = 60 different sample_collectors. >> >> The number will grow if we ever get 3rd application besides >> observability and debug, if we get more than 10 sampling_apps or new >> sampling_apps will be split into more than 2 ovn-k features on average. >> >> So I would say 255 sample_collectors sounds like a safer option for >> future development. > > OK, thanks for summarizing this! I'll revert back to 8 bits (255 > values) for the Sample_Collector.ID in v7. I'll post a new version > later today.
The math here makes me think that something is terribly wrong with the design of this feature. Can't put a finger on what it is though yet. Best regards, Ilya Maximets. > > Regards, > Dumitru > >> >> Kind regards, >> Nadia Pinaeva. >> >> On Tue, Aug 6, 2024 at 1:26 PM Dumitru Ceara <[email protected] >> <mailto:[email protected]>> wrote: >> >> On 8/6/24 11:44, Dumitru Ceara wrote: >> > From: Adrian Moreno <[email protected] <mailto:[email protected]>> >> > >> > Introduce a new table called Sample where per-flow IPFIX configuration >> > can be specified. >> > Also, reference rows from such table from the ACL table to enable the >> > configuration of ACL sampling. If enabled, northd will add a sample >> > action to each ACL related logical flow. >> > >> > Packets that hit stateful ACLs are sampled in different ways depending >> > whether they are initiating a new session or are just forwarded on an >> > existing (already allowed) session. Two new columns ("sample_new" and >> > "sample_est") are added to the ACL table to allow for potentially >> > different sampling rates for the two cases. >> > >> > Note: If an ACL has both sampling enabled and a label associated to it >> > then the label value overrides the observation point ID defined in the >> > sample configuration. This is a side effect of the implementation >> > (observation point IDs are stored in conntrack in the same part of the >> > ct_label where ACL labels are also stored). The two features >> > (sampling and ACL labels) serve however similar purposes so it's not >> > expected that they're both enabled together. >> > >> > When sampling is enabled on an ACL additional logical flows are >> created >> > for that ACL (one for stateless ACLs and 3 for stateful ACLs) in >> the ACL >> > action stage of the logical pipeline. These additional flows >> match on a >> > combination of conntrack state values and observation point id values >> > (either against a logical register or against the stored ct_label >> state) >> > in order to determine whether the packets hitting the ACLs must be >> > sampled or not. This comes with a slight increase in the number of >> > logical flows and in the number of OpenFlow rules. The number of >> > additional flows _does not_ depend on the original ACL match or >> action. >> > >> > New --sample-new and --sample-est optional arguments are added to the >> > 'ovn-nbctl acl-add' command to allow configuring these new types of >> > sampling for ACLs. >> > >> > An example workflow of configuring ACL samples is: >> > # Create Sampling_App mappings for ACL traffic types: >> > ovn-nbctl create Sampling_App name="acl-new-traffic-sampling" \ >> > id="42" >> > ovn-nbctl create sampling_app name="acl-est-traffic-sampling" \ >> > id="43" >> > # Create two sample collectors, one that samples all packets (c1) >> > # and another one that samples with a probability of 10% (c2): >> > c1=$(ovn-nbctl create Sample_Collector name=c1 \ >> > probability=65535 set_id=1) >> > c2=$(ovn-nbctl create Sample_Collector name=c2 \ >> > probability=6553 set_id=2) >> > # Create two sample configurations (for new and for established >> > # traffic) and an ingress ACL to allow IP traffic: >> > ovn-nbctl \ >> > -- --id=@s1 create sample collector="$c1 $c2" metadata=4301 \ >> > -- --id=@s2 create sample collector="$c1 $c2" metadata=4302 \ >> > -- --sample-new=@s1 --sample-est=@s2 acl-add ls \ >> > from-lport 1 "ip" allow-related >> > >> > The config above will generate IPFIX samples with: >> > - 8 MSB of observation domain id set to 42 (Sampling_App >> > "acl-new-traffic-sampling" config) and observation point id >> > set to 4301 (Sample s1) for packets that create a new >> > connection >> > - 8 MSB of observation domain id set to 43 (Sampling_app >> > "acl-est-traffic-sampling" config) and observation point id >> > set to 4302 (Sample s2) for packets that are part of an already >> > existing connection >> > >> > Note: in general, all generated IPFIX sample observation domain >> IDs are >> > built by ovn-controller in the following way: >> > The 8 MSB taken from the sample action's obs_domain_id and the last 24 >> > LSB taken from the Southbound logical datapath tunnel_key >> (datapath ID). >> > >> > Reported-at: https://issues.redhat.com/browse/FDP-305 >> <https://issues.redhat.com/browse/FDP-305> >> > Signed-off-by: Adrian Moreno <[email protected] >> <mailto:[email protected]>> >> > Co-authored-by: Ales Musil <[email protected] >> <mailto:[email protected]>> >> > Signed-off-by: Ales Musil <[email protected] >> <mailto:[email protected]>> >> > Co-authored-by: Dumitru Ceara <[email protected] >> <mailto:[email protected]>> >> > Signed-off-by: Dumitru Ceara <[email protected] >> <mailto:[email protected]>> >> > --- >> > V6: >> > - Address Ilya's comments: >> > - make Sample non-root >> > - Add unique ID to Sample_Collector to be stored in the session >> > conntrack information. Limit the max number of Sample_Collector >> > records to 15 (4 bit). >> >> Just to be sure: Nadia, do you think maximum 15 unique >> NB.Sample_Collectors is too restrictive for OVN-kubernets? >> >> Thanks, >> Dumitru >> >> >> > - Remove the Sample.external_ids: with non-root it doesn't really make >> > sense, there's always going to be a direct mapping from a single ACL >> > object to the sample. >> > - Remove HAVE_TCPDUMP from system tests. >> > - Add fix and test for to-lport ACLs with sampling enabled hit on >> egress >> > towards routers. >> > - Add test with different collectors (multiple probabilities) that >> share >> > the same set_id. >> > - Fixed system test checking nfcapd output files (these can auto >> > rotate). >> > V5: >> > - rebase >> > - address Ilya's comment: >> > - add documentation notes about behavior when mixing ACL labels with >> > ACL sampling. >> > V4: >> > - added explicit sampling stages >> > - reduced set_id max supported value >> > - added support for tiered "pass" ACLs >> > - improved system test + added tiered ACL system test >> > - added Ales as co-author for most of the above >> > - Addressed Mark's comment about better error messages in ovn-nbctl >> > V3: >> > - Addressed Ilya's comment: >> > - Bumped NB schema version. >> > V2: >> > - Addressed Adrian's comments: >> > - fixed up observation domain id comment in commit log. >> > - store the obs_domain_id in the ct_label as an 8 bit value (add a >> > test). >> > - removed redundant check in build_acl_sample_label_action(). >> > - added missing space after ternary ":" operator. >> > - documented limitation for sampling ACLs with action "pass". >> > - documented sample_new behavior for stateless ACLs. >> > - Removed unused OVN_CT_SAMPLE_ID_SET_BIT and OVN_CT_SAMPLE_ID_SET. >> > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
