On 5/14/24 13:27, Eelco Chaudron wrote:
> 
> 
> On 14 May 2024, at 13:05, Ilya Maximets wrote:
> 
>> On 5/14/24 12:14, Adrian Moreno wrote:
>>>
>>>
>>> On 5/14/24 11:09 AM, Ilya Maximets wrote:
>>>> On 5/14/24 09:39, Adrian Moreno wrote:
>>>>>
>>>>>
>>>>> On 5/10/24 12:45 PM, Adrian Moreno wrote:
>>>>>>
>>>>>>
>>>>>> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
>>>>>>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>>>>>>
>>>>>>>> The new odp sample attributes allow userspace to specify a group_id and
>>>>>>>> user-defined cookie to be passed down to psample.
>>>>>>>>
>>>>>>>> Add support for parsing and formatting such action.
>>>>>>>>
>>>>>>>> Signed-off-by: Adrian Moreno <amore...@redhat.com>
>>>>>>>
>>>>>>> Hi Adrian,
>>>>>>>
>>>>>>> See my comments below inline.
>>>>>>>
>>>>>>> //Eelco
>>>>>>>
>>>>>>>> ---
>>>>>>>>    include/linux/openvswitch.h  |  49 +++++++++---
>>>>>>>>    lib/odp-execute.c            |   3 +
>>>>>>>>    lib/odp-util.c               | 150 
>>>>>>>> ++++++++++++++++++++++++++---------
>>>>>>>>    ofproto/ofproto-dpif-ipfix.c |   2 +
>>>>>>>>    python/ovs/flow/odp.py       |   2 +
>>>>>>>>    tests/odp.at                 |   5 ++
>>>>>>>>    6 files changed, 163 insertions(+), 48 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>>>>>>>> index d9fb991ef..3e748405c 100644
>>>>>>>> --- a/include/linux/openvswitch.h
>>>>>>>> +++ b/include/linux/openvswitch.h
>>>>>>>> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>>>>>>>>    #define OVS_UFID_F_OMIT_MASK     (1 << 1)
>>>>>>>>    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)
>>>>>>>>
>>>>>>>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>>>>>>>>    /**
>>>>>>>>     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
>>>>>>>> action.
>>>>>>>>     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to 
>>>>>>>> sample with
>>>>>>>> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>>>>>>>>     * %UINT32_MAX samples all packets and intermediate values sample 
>>>>>>>> intermediate
>>>>>>>>     * fractions of packets.
>>>>>>>>     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling 
>>>>>>>> event.
>>>>>>>> - * Actions are passed as nested attributes.
>>>>>>>> + * Actions are passed as nested attributes. Optional if
>>>>>>>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
>>>>>>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
>>>>>>>> psample group.
>>>>>>>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
>>>>>>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
>>>>>>>> that, if
>>>>>>>> + * provided, will be copied to the psample cookie.
>>>>>>>
>>>>>>> Should we mention any limit on the actual length of the cookie?
>>>>>>>
>>>>>>> Any reason we explicitly call this psample? From an OVS perspective, 
>>>>>>> this
>>>>>>> could just be
>>>>>>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE
>>>>>>>  and
>>>>>>> _GROUP. Other datapaths, do not have PSAMPLE.
>>>>>>>
>>>>>>
>>>>>> See my response to your comment on the cover-letter for possible name 
>>>>>> alternatives.
>>>>>>
>>>>>>
>>>>>>> Thinking about it more, from an OVS perspective we should probably not 
>>>>>>> even
>>>>>>> send down a COOKIE, but we should send down an obs_domain_id and 
>>>>>>> obs_point_id,
>>>>>>> and then the kernel can move this into a cookie.
>>>>>>>
>>>>>>
>>>>>> I did consider this. However, the opaque cookie fits well with the tc 
>>>>>> API which
>>>>>> does not have two 32-bit values but an opaque 128-bit cookie. So if the 
>>>>>> action
>>>>>> is offloaded OVS needs to "encode" the two 32-bit values into the cookie 
>>>>>> anyways.
>>>>>>
>>>>>>> The collector itself could be called system/local collector, or 
>>>>>>> something like
>>>>>>> that. This way it can use for example psample for kernel and UDST 
>>>>>>> probes for
>>>>>>> usespace. Both can pass the group and cookie (obs_domain_id and 
>>>>>>> obs_point_id).
>>>>>>>
>>>>>>> Also, the presence of any of this should not dictate the psample 
>>>>>>> action, we
>>>>>>> probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>>>>>>>
>>>>>>
>>>>>> It clearer and it might simplify option verification a bit, but isn't a 
>>>>>> bit
>>>>>> redundant? There is no flag for action execution for instance, the 
>>>>>> presence of
>>>>>> the attribute is enough.
>>>>>>
>>>>>> An alternative would be to have nested attributes, e.g:
>>>>>>
>>>>>> enum ovs_sample_attr {
>>>>>>       OVS_SAMPLE_ATTR_UNSPEC,
>>>>>>       OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>>>>       OVS_SAMPLE_ATTR_ACTIONS,     /* Nested OVS_ACTION_ATTR_* 
>>>>>> attributes. */
>>>>>>       OVS_SAMPLE_ATTR_SYSTEM,     /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
>>>>>> attributes. */
>>>>>>
>>>>>>       __OVS_SAMPLE_ATTR_MAX,
>>>>>> };
>>>>>>
>>>>>> enum ovs_sample_system_attr {
>>>>>>       OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
>>>>>>       OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
>>>>>>       OVS_SAMPLE_SYSTEM_ATTR_COOKIE,     /* Nested OVS_ACTION_ATTR_*
>>>>>>
>>>>>>       __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
>>>>>> };
>>>>>>
>>>>>> WDYT?
>>>>>>
>>>>>
>>>>> Another benefit of nested attributes is the easier addition of other
>>>>> sample-related attributes, I already have a candidate: 
>>>>> OVS_SAMPLE_SYSTEM_TRUNC
>>>>>
>>>>> This will be passed to psample (md->trunc_size) who will truncate the 
>>>>> packet
>>>>> size to the specified value. This can be useful for large packets for 
>>>>> which we
>>>>> are only interested in the headers.
>>>>>
>>>>> WDYT?
>>>>>
>>>>> [snip]
>>>>>
>>>>
>>>> Hrm.  This makes me think these should not be attributes of a sample 
>>>> action at all,
>>>> but rather we should have a separate EMIT_SAMPLE action that just calls 
>>>> psample, so
>>>> we can combine it with other actions:
>>>>
>>>>    sample(..., actions(userspace(...),trunc(100),emit_sample(cookie=...)))
>>>>
>>>> or even:
>>>>
>>>>    sample(..., 
>>>> actions(clone(trunc(100),emit_sample(cookie=...)),userspace(...)))
>>>>
>>>> This will also simplify the requires_datapath_assistance() check as we can 
>>>> just
>>>> check the action and not iterate over the list of attributes.  And it will 
>>>> be
>>>> capable to only send this one action to the datapath and not the whole 
>>>> thing.
>>>>
>>>
>>> My suggestion of adding a OVS_SAMPLE_{}_TRUNC was precisely to avoid using 
>>> the
>>> existing trunc implementation.
>>>
>>> Truncation is implemented inside psample so we would not be maintaining two
>>> implementations.
>>
>> We will maintain trunc action and a trunc attribute for sampling, so we will
>> have a duplication here.
>>
>>> psample implementation does not require packet clone (which is
>>> needed for OVS trunc) plus it can be easily optimized out if there is no
>>> listener in the multicast group (I plan to add that optimization in the next
>>> kernel series).
>>
>> 'trunc' doesn't require a clone in case there are no other actions after, 
>> which
>> is the main use case for drop sampling.
>>
>> BTW, having explicit drop actions might have broken some of the assumptions
>> around this optimization, but it should be fixable.
>>
>> And psample itself might need a clone() in case there was a trunc() before
>> sampling.  psample will report skb->len as original packet size, but that
>> is will not be correct as we have a postponed cutlen, so the reported length
>> must be the length after the cut and that can't be done without actually
>> cutting the skb at this point.
>>
>>>
>>> Having said that, I'm not against adding a new action if that simplifies 
>>> things.
>>> I think it might simplify things in ovs-vswitchd and even in the kernel.
>>> However, I have one concern.
>>>
>>> IIUC and based on your examples, your suggestions means we would not have
>>> sampling rate inside emit_sample, right? This might be a problem because 
>>> psample
>>> does have sampling rate which I believe to be crucial to balance performance
>>> impact and accuracy.
>>
>> psample doesn't limit the rate, it just reports a number.  Rate is part of
>> the main sample() action, i.e.
>>
>>   sample(sample=75%,actions(emit_sample(cookie=...)))
>>
>> Having rate also in the 'emit_sample' would be another API duplication.
> 
> I agree, but this is the confusing part :) If you use the psample TC action, 
> it will have the rate-limiting as part of the action.
> 
> So if we offload the above, we will have a nested tc psample action, the 
> outer one with the desired sample rate, and the inner one with a 100% sample 
> rate. Actually, the inner one needs to be handled by sw as it will go to 
> vswitchd (and than back to psample?). I guess this avoids the idea of doing 
> this fast… Maybe if emit_sample() is the only action we can do some 
> optimisation, but it needs to be translatable back and forth from TC to 
> OVS_ATTRs.
> 
> I guess we need the same for OVS_ACTION_ATTR_USERSPACE actions, which by the 
> way Nvidia was going to work on once the sFlow part was accepted.

The actual sample() action can not be offloaded in a generic case.
There is no smaple action in TC that would accept arbitrary action
list.  We can only offload a few corner cases.  Today we offload
none.  With the sFlow series we can offload a single specific
OVS_ACTION_ATTR_USERSPACE that contains USER_ACTION_COOKIE_SFLOW.

And we can't offload more than one sampling method within a single
OVS_ACTION_ATTR_SAMPLE, because in TC they'll have independent
probabilities, which is not a desired behavior.

So, we'll always have special case offloading for each particular
action inside the sample().

> 
>> BTW, "sampling" column is part of IPFIX and sFlow tables and you're not
>> adding a new table for this new smapling method, i.e. it will not be
>> possible to set sampling rate for psample with the current database
>> configuration.
>>
>>> This value must be present in the psample sample or else
>>> users will not know how inaccurate the sample is.
>>
>> Users can check the sampling configuration in the database if we will
>> have one.  And they are configuring it in the first place.  And it will
>> be confusing to see two rates in the same action list.
>>
>> It might be possible to carry the current rate in the OVS_CB(skb) and
>> pass it to the psample.
>>
>> In any case, the rate reported by psample will not be accurate if there
>> are nested sample actions, unless we carry and properly multiply / divide
>> probabilities when entering / exiting the sample action.
>>
>>>
>>>
>>>> Naming is still hard though.  I like the word "emit" because it doesn't 
>>>> have a
>>>> concept of directionality, i.e. we're not sending it to somewhere, we're 
>>>> just
>>>> emitting it to the wild.  That kind of matches with our vague definition 
>>>> of a
>>>> "local" or "system" sampling.
>>>>
>>>
>>> I like the word "emit".
>>>
>>>> Another option might be 'OUTPUT_SAMPLE/SAMPLE_OUTPUT', i.e. output(sample, 
>>>> cookie=...).
>>>>
>>>> The main concern is that we already have truncation implemented, so we can 
>>>> be more
>>>> flexible in a way we generate actions instead of adding new attributes.
>>>>
>>>> Thoughts?
>>>
>>> See my comment above. I am not suggesting we use the current trunc
>>> implementation nor to add a new one, just to use psample's.
>>>
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to