On 8/6/24 13:47, Dumitru Ceara wrote:
> On 8/6/24 13:41, Nadia Pinaeva wrote:
>> I don't think so, let's do the math together.
>>
>> I already have 2 apps (observability and debug) = 2 different
>> sample_collector.set_id. Not many more apps are expected in the future,
>> but I can easily imagine at least 1 extra, 3 total.
>> We already have 2 sampling_apps: acl new and acl est. More apps are
>> expected in the future, e.g. to sample on logical port ingress/egress,
>> loadbalancers, NATs, etc.
>> The amount of different probabilities depends on the number of ovn-k
>> features for every sampling_app. For ACLs, we currently have 5 different
>> features (I only mentioned 3 of them in my previous example). I expect
>> other sampling_apps to have less features, but very possible more than 1.
>>
>> So, right now with the minimal functionality I need 2 set_ids * 2
>> sampling_apps * 5 ovn-k features = 20 different sample_collectors.
>> If we, say, expect 10 new sampling_apps with 2 ovn-k features each (just
>> an example, we don't know the exact number of features per future
>> sampling_apps), it becomes 2 set_ids * ((acl_new + acl_est) * 5 features
>> + 10 new sampling_apps * 2 (hypothetical) ovn-k features) = 2 * (10 +
>> 20) = 60 different sample_collectors.
>>
>> The number will grow if we ever get 3rd application besides
>> observability and debug, if we get more than 10 sampling_apps or new
>> sampling_apps will be split into more than 2 ovn-k features on average.
>>
>> So I would say 255 sample_collectors sounds like a safer option for
>> future development.
> 
> OK, thanks for summarizing this!  I'll revert back to 8 bits (255
> values) for the Sample_Collector.ID in v7.  I'll post a new version
> later today.

The math here makes me think that something is terribly wrong with the
design of this feature.  Can't put a finger on what it is though yet.

Best regards, Ilya Maximets.

> 
> Regards,
> Dumitru
> 
>>
>> Kind regards,
>> Nadia Pinaeva.
>>
>> On Tue, Aug 6, 2024 at 1:26 PM Dumitru Ceara <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>     On 8/6/24 11:44, Dumitru Ceara wrote:
>>     > From: Adrian Moreno <[email protected] <mailto:[email protected]>>
>>     >
>>     > Introduce a new table called Sample where per-flow IPFIX configuration
>>     > can be specified.
>>     > Also, reference rows from such table from the ACL table to enable the
>>     > configuration of ACL sampling. If enabled, northd will add a sample
>>     > action to each ACL related logical flow.
>>     >
>>     > Packets that hit stateful ACLs are sampled in different ways depending
>>     > whether they are initiating a new session or are just forwarded on an
>>     > existing (already allowed) session.  Two new columns ("sample_new" and
>>     > "sample_est") are added to the ACL table to allow for potentially
>>     > different sampling rates for the two cases.
>>     >
>>     > Note: If an ACL has both sampling enabled and a label associated to it
>>     > then the label value overrides the observation point ID defined in the
>>     > sample configuration.  This is a side effect of the implementation
>>     > (observation point IDs are stored in conntrack in the same part of the
>>     > ct_label where ACL labels are also stored).  The two features
>>     > (sampling and ACL labels) serve however similar purposes so it's not
>>     > expected that they're both enabled together.
>>     >
>>     > When sampling is enabled on an ACL additional logical flows are
>>     created
>>     > for that ACL (one for stateless ACLs and 3 for stateful ACLs) in
>>     the ACL
>>     > action stage of the logical pipeline.  These additional flows
>>     match on a
>>     > combination of conntrack state values and observation point id values
>>     > (either against a logical register or against the stored ct_label
>>     state)
>>     > in order to determine whether the packets hitting the ACLs must be
>>     > sampled or not.  This comes with a slight increase in the number of
>>     > logical flows and in the number of OpenFlow rules.  The number of
>>     > additional flows _does not_ depend on the original ACL match or
>>     action.
>>     >
>>     > New --sample-new and --sample-est optional arguments are added to the
>>     > 'ovn-nbctl acl-add' command to allow configuring these new types of
>>     > sampling for ACLs.
>>     >
>>     > An example workflow of configuring ACL samples is:
>>     >   # Create Sampling_App mappings for ACL traffic types:
>>     >   ovn-nbctl create Sampling_App name="acl-new-traffic-sampling" \
>>     >                                 id="42"
>>     >   ovn-nbctl create sampling_app name="acl-est-traffic-sampling" \
>>     >                               id="43"
>>     >   # Create two sample collectors, one that samples all packets (c1)
>>     >   # and another one that samples with a probability of 10% (c2):
>>     >   c1=$(ovn-nbctl create Sample_Collector name=c1 \
>>     >        probability=65535 set_id=1)
>>     >   c2=$(ovn-nbctl create Sample_Collector name=c2 \
>>     >        probability=6553 set_id=2)
>>     >   # Create two sample configurations (for new and for established
>>     >   # traffic) and an ingress ACL to allow IP traffic:
>>     >   ovn-nbctl \
>>     >     -- --id=@s1 create sample collector="$c1 $c2" metadata=4301 \
>>     >     -- --id=@s2 create sample collector="$c1 $c2" metadata=4302 \
>>     >     -- --sample-new=@s1 --sample-est=@s2 acl-add ls \
>>     >             from-lport 1 "ip" allow-related
>>     >
>>     > The config above will generate IPFIX samples with:
>>     > - 8 MSB of observation domain id set to 42 (Sampling_App
>>     >   "acl-new-traffic-sampling" config) and observation point id
>>     >   set to 4301 (Sample s1) for packets that create a new
>>     >   connection
>>     > - 8 MSB of observation domain id set to 43 (Sampling_app
>>     >   "acl-est-traffic-sampling" config) and observation point id
>>     >   set to 4302 (Sample s2) for packets that are part of an already
>>     >   existing connection
>>     >
>>     > Note: in general, all generated IPFIX sample observation domain
>>     IDs are
>>     > built by ovn-controller in the following way:
>>     > The 8 MSB taken from the sample action's obs_domain_id and the last 24
>>     > LSB taken from the Southbound logical datapath tunnel_key
>>     (datapath ID).
>>     >
>>     > Reported-at: https://issues.redhat.com/browse/FDP-305
>>     <https://issues.redhat.com/browse/FDP-305>
>>     > Signed-off-by: Adrian Moreno <[email protected]
>>     <mailto:[email protected]>>
>>     > Co-authored-by: Ales Musil <[email protected]
>>     <mailto:[email protected]>>
>>     > Signed-off-by: Ales Musil <[email protected]
>>     <mailto:[email protected]>>
>>     > Co-authored-by: Dumitru Ceara <[email protected]
>>     <mailto:[email protected]>>
>>     > Signed-off-by: Dumitru Ceara <[email protected]
>>     <mailto:[email protected]>>
>>     > ---
>>     > V6:
>>     > - Address Ilya's comments:
>>     >   - make Sample non-root
>>     >   - Add unique ID to Sample_Collector to be stored in the session
>>     >     conntrack information.  Limit the max number of Sample_Collector
>>     >     records to 15 (4 bit).
>>
>>     Just to be sure: Nadia, do you think maximum 15 unique
>>     NB.Sample_Collectors is too restrictive for OVN-kubernets?
>>
>>     Thanks,
>>     Dumitru
>>
>>
>>     > - Remove the Sample.external_ids: with non-root it doesn't really make
>>     >   sense, there's always going to be a direct mapping from a single ACL
>>     >   object to the sample.
>>     > - Remove HAVE_TCPDUMP from system tests.
>>     > - Add fix and test for to-lport ACLs with sampling enabled hit on
>>     egress
>>     >   towards routers.
>>     > - Add test with different collectors (multiple probabilities) that
>>     share
>>     >   the same set_id.
>>     > - Fixed system test checking nfcapd output files (these can auto
>>     >   rotate).
>>     > V5:
>>     > - rebase
>>     > - address Ilya's comment:
>>     >   - add documentation notes about behavior when mixing ACL labels with
>>     >     ACL sampling.
>>     > V4:
>>     > - added explicit sampling stages
>>     > - reduced set_id max supported value
>>     > - added support for tiered "pass" ACLs
>>     > - improved system test + added tiered ACL system test
>>     > - added Ales as co-author for most of the above
>>     > - Addressed Mark's comment about better error messages in ovn-nbctl
>>     > V3:
>>     > - Addressed Ilya's comment:
>>     >   - Bumped NB schema version.
>>     > V2:
>>     > - Addressed Adrian's comments:
>>     >   - fixed up observation domain id comment in commit log.
>>     >   - store the obs_domain_id in the ct_label as an 8 bit value (add a
>>     >     test).
>>     >   - removed redundant check in build_acl_sample_label_action().
>>     >   - added missing space after ternary ":" operator.
>>     >   - documented limitation for sampling ACLs with action "pass".
>>     >   - documented sample_new behavior for stateless ACLs.
>>     > - Removed unused OVN_CT_SAMPLE_ID_SET_BIT and OVN_CT_SAMPLE_ID_SET.
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to