On 5/13/24 20:59, Adrian Moreno wrote:
> 
> 
> On 5/10/24 3:06 PM, Ilya Maximets wrote:
>> On 5/10/24 14:01, Eelco Chaudron wrote:
>>>
>>>
>>> On 10 May 2024, at 12:45, Adrian Moreno wrote:
>>>
>>>> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
>>>>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>>>>
>>>>>> The new odp sample attributes allow userspace to specify a group_id and
>>>>>> user-defined cookie to be passed down to psample.
>>>>>>
>>>>>> Add support for parsing and formatting such action.
>>>>>>
>>>>>> Signed-off-by: Adrian Moreno <[email protected]>
>>>>>
>>>>> Hi Adrian,
>>>>>
>>>>> See my comments below inline.
>>>>>
>>>>> //Eelco
>>>>>
>>>>>> ---
>>>>>>    include/linux/openvswitch.h  |  49 +++++++++---
>>>>>>    lib/odp-execute.c            |   3 +
>>>>>>    lib/odp-util.c               | 150 ++++++++++++++++++++++++++---------
>>>>>>    ofproto/ofproto-dpif-ipfix.c |   2 +
>>>>>>    python/ovs/flow/odp.py       |   2 +
>>>>>>    tests/odp.at                 |   5 ++
>>>>>>    6 files changed, 163 insertions(+), 48 deletions(-)
>>>>>>
>>>>>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>>>>>> index d9fb991ef..3e748405c 100644
>>>>>> --- a/include/linux/openvswitch.h
>>>>>> +++ b/include/linux/openvswitch.h
>>>>>> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>>>>>>    #define OVS_UFID_F_OMIT_MASK     (1 << 1)
>>>>>>    #define OVS_UFID_F_OMIT_ACTIONS  (1 << 2)
>>>>>>
>>>>>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>>>>>>    /**
>>>>>>     * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE 
>>>>>> action.
>>>>>>     * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample 
>>>>>> with
>>>>>> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>>>>>>     * %UINT32_MAX samples all packets and intermediate values sample 
>>>>>> intermediate
>>>>>>     * fractions of packets.
>>>>>>     * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling 
>>>>>> event.
>>>>>> - * Actions are passed as nested attributes.
>>>>>> + * Actions are passed as nested attributes. Optional if
>>>>>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
>>>>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as 
>>>>>> psample group.
>>>>>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
>>>>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie 
>>>>>> that, if
>>>>>> + * provided, will be copied to the psample cookie.
>>>>>
>>>>> Should we mention any limit on the actual length of the cookie?
>>>>>
>>>>> Any reason we explicitly call this psample? From an OVS perspective, this 
>>>>> could just be 
>>>>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE 
>>>>> and _GROUP. Other datapaths, do not have PSAMPLE.
>>>>>
>>>>
>>>> See my response to your comment on the cover-letter for possible name 
>>>> alternatives.
>>>
>>> ACK, we can continue the discussion there.
>>>
>>>>> Thinking about it more, from an OVS perspective we should probably not 
>>>>> even send down a COOKIE, but we should send down an obs_domain_id and 
>>>>> obs_point_id, and then the kernel can move this into a cookie.
>>>>>
>>>>
>>>> I did consider this. However, the opaque cookie fits well with the tc API 
>>>> which does not have two 32-bit values but an opaque 128-bit cookie. So if 
>>>> the action is offloaded OVS needs to "encode" the two 32-bit values into 
>>>> the cookie anyways.
>>>
>>> Which is fine to me, the OVS API should be clear that we have two u32 
>>> entries. The cookie is implementation-specific, and we use the netlink 
>>> format as our API for all DPs.
>>
>>
>> I'm not sure about this.  It is a kernel interface and we can't deliver
>> two separate values from the psample.  So, passing two separate values
>> to the kernel doesn't make a lot of sense.  They are going to be mangled
>> into a single cookie anyway and we'll have to document that somehow.
>> We may as well always mangle the data from the beginning, document once
>> at the top level and save ourselves from documenting a million differences
>> between different implementations.  While USDT could report them separately,
>> I'm not sure there is a ton of value in that.
>>
>>>
>>>>> The collector itself could be called system/local collector, or something 
>>>>> like that. This way it can use for example psample for kernel and UDST 
>>>>> probes for usespace. Both can pass the group and cookie (obs_domain_id 
>>>>> and obs_point_id).
>>>>>
>>>>> Also, the presence of any of this should not dictate the psample action, 
>>>>> we probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>>>>>
>>>>
>>>> It clearer and it might simplify option verification a bit, but isn't a 
>>>> bit redundant? There is no flag for action execution for instance, the 
>>>> presence of the attribute is enough.
>>>>
>>>> An alternative would be to have nested attributes, e.g:
>>>>
>>>> enum ovs_sample_attr {
>>>>    OVS_SAMPLE_ATTR_UNSPEC,
>>>>    OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>>    OVS_SAMPLE_ATTR_ACTIONS,     /* Nested OVS_ACTION_ATTR_* attributes. */
>>>>    OVS_SAMPLE_ATTR_SYSTEM,     /* Nested OVS_SAMPLE_SYSTEM_ATTR_* 
>>>> attributes. */
>>>>    
>>>>    __OVS_SAMPLE_ATTR_MAX,
>>>> };
>>>>
>>>> enum ovs_sample_system_attr {
>>>>    OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN,  /* u32 number */
>>>>    OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
>>>>    OVS_SAMPLE_SYSTEM_ATTR_COOKIE,     /* Nested OVS_ACTION_ATTR_*
>>>>    
>>>>    __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
>>>> };
>>>>
>>>> WDYT?
>>>>
>>>>> So I would envision your delta to look something like this:
>>>>>
>>>>>    enum ovs_sample_attr {
>>>>>           OVS_SAMPLE_ATTR_UNSPEC,
>>>>> - OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>>> - OVS_SAMPLE_ATTR_ACTIONS,     /* Nested OVS_ACTION_ATTR_* attributes. */
>>>>> + OVS_SAMPLE_ATTR_PROBABILITY,    /* u32 number */
>>>>> + OVS_SAMPLE_ATTR_ACTIONS,        /* Nested OVS_ACTION_ATTR_
>>>>> +                                  * attributes.
>>>>> +                                  */
>>>>> + OVS_SAMPLE_ATTR_OBS_DOMAIN,        /* Observation domain id, u32 
>>>>> number. */
>>>>> + OVS_SAMPLE_ATTR_OBS_POINT,         /* Observation point id, u32 number. 
>>>>> */
>>>>> +   OVS_SAMPLE_ATTR_SYSTEM_TARGET, /* Flag to additionally sample to 
>>>>> system target. */
>>>>> + OVS_SAMPLE_ATTR_SYSTEM_GROUP,  /* System target group id, u32 number. */
>>>>>           __OVS_SAMPLE_ATTR_MAX,
>>>
>>> I’ve been thinking about it a bit more (since I wrote this comment), and I 
>>> guess we might not need the extra flag, as we could use the 
>>> OVS_SAMPLE_ATTR_SYSTEM_GROUP as the key for doing a local system copy (or 
>>> whatever name we come up with). The OBS DOMAIN/POINT can be used in the 
>>> future by others.
>>>
>>>>>>     *
>>>>>> - * Executes the specified actions with the given probability on a 
>>>>>> per-packet
>>>>>> - * basis.
>>>>>> + * Either OVS_SAMPLE_ATTR_PSAMPLE_GROUP or OVS_SAMPLE_ATTR_ACTIONS must 
>>>>>> be
>>>>>> + * specified.
>>>>>> + *
>>>>>> + * Executes the specified actions and/or sends the packet to psample
>>>>>> + * with the given probability on a per-packet basis.
>>>>>>     */
>>>>>>    enum ovs_sample_attr {
>>>>>>          OVS_SAMPLE_ATTR_UNSPEC,
>>>>>> -        OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>>>> -        OVS_SAMPLE_ATTR_ACTIONS,     /* Nested OVS_ACTION_ATTR_* 
>>>>>> attributes. */
>>>>>> +        OVS_SAMPLE_ATTR_PROBABILITY,    /* u32 number */
>>>>>> +        OVS_SAMPLE_ATTR_ACTIONS,        /* Nested OVS_ACTION_ATTR_
>>>>>
>>>>> Missing *
>>>>>
>>>>
>>>> Yeah, it's a pitty the "*" makes checkpatch raise a warning.
>>>>
>>>>>> +                                         * attributes.
>>>>>> +                                         */
>>>>>> +        OVS_SAMPLE_ATTR_PSAMPLE_GROUP,  /* u32 number */
>>>>>> +        OVS_SAMPLE_ATTR_PSAMPLE_COOKIE, /* binary */
>>>>>>          __OVS_SAMPLE_ATTR_MAX,
>>>>>>
>>>>>>    #ifdef __KERNEL__
>>>>>> @@ -722,13 +735,27 @@ enum ovs_sample_attr {
>>>>>>    #define OVS_SAMPLE_ATTR_MAX (__OVS_SAMPLE_ATTR_MAX - 1)
>>>>>>
>>>>>>    #ifdef __KERNEL__
>>>>>> +
>>>>>> +/* Definition for flags in struct sample_arg. */
>>>>>> +enum {
>>>>>> +        /* When actions in sample will not change the flows. */
>>>>>> +        OVS_SAMPLE_ARG_FLAG_EXEC = 1 << 0,
>>>>>> +        /* When set, the packet will be sent to psample. */
>>>>>> +        OVS_SAMPLE_ARG_FLAG_PSAMPLE = 1 << 1,
>>>>>
>>>>> Same on psample name here, should it be FLAG_SYSTEM_TARGET instead? See 
>>>>> the above comment about an explicit option.
>>>>>
>>>>>> +};
>>>>>> +
>>>>>>    struct sample_arg {
>>>>>> -        bool exec;                   /* When true, actions in sample 
>>>>>> will not
>>>>>> -                                      * change flow keys. False 
>>>>>> otherwise.
>>>>>> -                                      */
>>>>>> -        u32  probability;            /* Same value as
>>>>>> -                                      * 'OVS_SAMPLE_ATTR_PROBABILITY'.
>>>>>> -                                      */
>>>>>> +        u16 flags;             /* Flags that modify the behavior of the
>>>>>> +                                * action. See SAMPLE_ARG_FLAG_*
>>>>>> +                                */
>>>>>> +        u32  probability;       /* Same value as
>>>>>> +                                 * 'OVS_SAMPLE_ATTR_PROBABILITY'.
>>>>>> +                                 */
>>>>>> +        u32  group_id;          /* Same value as
>>>>>> +                                 * 'OVS_SAMPLE_ATTR_PSAMPLE_GROUP'.
>>>>>> +                                 */
>>>>>> +        u8  cookie_len;         /* Length of psample cookie */
>>>>>> +        char cookie[OVS_PSAMPLE_COOKIE_MAX_SIZE]; /* psample cookie 
>>>>>> data. */
>>>>>
>>>>> Would it make sense for the cookie also to be u8?
>>>>>
>>>>
>>>> Yes.
>>>>
>>>>>>    };
>>>>>>    #endif
>>>>>>
>>>>>> diff --git a/lib/odp-execute.c b/lib/odp-execute.c
>>>>>> index 081e4d432..d8ee93429 100644
>>>>>> --- a/lib/odp-execute.c
>>>>>> +++ b/lib/odp-execute.c
>>>>>> @@ -716,6 +716,9 @@ odp_execute_sample(void *dp, struct dp_packet 
>>>>>> *packet, bool steal,
>>>>>>                subactions = a;
>>>>>>                break;
>>>>>>
>>>>>> +        /* Ignored in userspace datapath. */
>>>>>> +        case OVS_SAMPLE_ATTR_PSAMPLE_GROUP:
>>>>>> +        case OVS_SAMPLE_ATTR_PSAMPLE_COOKIE:
>>
>> Action that has these attributes should be marked as action that requires
>> datapath assistance.  And we should never hit them here.
>>
>> We should have a test case covering a packet sent from a controller or
>> execution with a debug_slow flag.
>>
> 
> Yes, I already have it ready for the next round. It does work.
> 
>> BTW, this code can be executed for kernel datapath as well if the packet
>> goes to userspace after upcall or comes from a controller, so the comment
>> is not correct.
>>
> 
> Yes. How about replacing it with something like:
> /* If the action contains these attributes it must be executed with datapath 
> assistance so it's safe to ignore them here.
> */
> 
> For context, what I plan to add in the next version is something like this:
> 
> diff --git a/lib/odp-execute.c b/lib/odp-execute.c
> index d8ee93429..9a405e62f 100644
> --- a/lib/odp-execute.c
> +++ b/lib/odp-execute.c
> @@ -823,11 +823,22 @@ requires_datapath_assistance(const struct nlattr *a)
>       case OVS_ACTION_ATTR_METER: 
> 
>           return true; 
> 
> 
> +    case OVS_ACTION_ATTR_SAMPLE:
> +    {

'{' on a previous line.

> +        const struct nlattr *nested;
> +        unsigned int left;

An empty line here.

> +        NL_NESTED_FOR_EACH (nested, left, a) {
> +            if (nl_attr_type(nested) == OVS_SAMPLE_ATTR_PSAMPLE_GROUP) {
> +                return true;
> +            }
> +        }
> +        return false;
> +    }
> +

Otherwise, seems reasonable at a first glance, but see my other reply
about attributes vs actions.

>       case OVS_ACTION_ATTR_SET:
>       case OVS_ACTION_ATTR_SET_MASKED:
>       case OVS_ACTION_ATTR_PUSH_VLAN:
>       case OVS_ACTION_ATTR_POP_VLAN:
> -    case OVS_ACTION_ATTR_SAMPLE:
>       case OVS_ACTION_ATTR_HASH:
>       case OVS_ACTION_ATTR_PUSH_MPLS:
>       case OVS_ACTION_ATTR_POP_MPLS:
> 
> 
> 
>>
>>>>>>            case OVS_SAMPLE_ATTR_UNSPEC:
>>>>>>            case __OVS_SAMPLE_ATTR_MAX:
>>>>>>            default:
>>>>>> diff --git a/lib/odp-util.c b/lib/odp-util.c
>>>>>> index 21f34d955..611b5229e 100644
>>>>>> --- a/lib/odp-util.c
>>>>>> +++ b/lib/odp-util.c
>>>>>> @@ -227,12 +227,16 @@ format_odp_sample_action(struct ds *ds, const 
>>>>>> struct nlattr *attr,
>>>>>>    {
>>>>>>        static const struct nl_policy ovs_sample_policy[] = {
>>>>>>            [OVS_SAMPLE_ATTR_PROBABILITY] = { .type = NL_A_U32 },
>>>>>> -        [OVS_SAMPLE_ATTR_ACTIONS] = { .type = NL_A_NESTED }
>>>>>> +        [OVS_SAMPLE_ATTR_ACTIONS] = { .type = NL_A_NESTED,
>>>>>> +                                      .optional = true },
>>>>>> +        [OVS_SAMPLE_ATTR_PSAMPLE_GROUP] = { .type = NL_A_U32,
>>>>>> +                                            .optional = true },
>>>>>> +        [OVS_SAMPLE_ATTR_PSAMPLE_COOKIE] = { .type = NL_A_UNSPEC,
>>>>>> +                                             .optional = true }
>>>>>>        };
>>>>>>        struct nlattr *a[ARRAY_SIZE(ovs_sample_policy)];
>>>>>> +    const struct nlattr *nla;
>>>>>>        double percentage;
>>>>>> -    const struct nlattr *nla_acts;
>>>>>> -    int len;
>>>>>>
>>>>>>        ds_put_cstr(ds, "sample");
>>>>>>
>>>>>> @@ -244,13 +248,33 @@ format_odp_sample_action(struct ds *ds, const 
>>>>>> struct nlattr *attr,
>>>>>>        percentage = (100.0 * 
>>>>>> nl_attr_get_u32(a[OVS_SAMPLE_ATTR_PROBABILITY])) /
>>>>>>                            UINT32_MAX;
>>>>>>
>>>>>> -    ds_put_format(ds, "(sample=%.1f%%,", percentage);
>>>>>> +    ds_put_format(ds, "(sample=%.1f%%", percentage);
>>>>>>
>>>>>> -    ds_put_cstr(ds, "actions(");
>>>>>> -    nla_acts = nl_attr_get(a[OVS_SAMPLE_ATTR_ACTIONS]);
>>>>>> -    len = nl_attr_get_size(a[OVS_SAMPLE_ATTR_ACTIONS]);
>>>>>> -    format_odp_actions(ds, nla_acts, len, portno_names);
>>>>>> -    ds_put_format(ds, "))");
>>>>>> +    nla = a[OVS_SAMPLE_ATTR_PSAMPLE_GROUP];
>>>>>> +    if (nla) {
>>>>>> +        ds_put_format(ds, ",group_id=%d", nl_attr_get_u32(nla));
>>>>>> +
>>>>>> +        nla = a[OVS_SAMPLE_ATTR_PSAMPLE_COOKIE];
>>>>>> +
>>>>>> +        if (nla) {
>>>>>> +            size_t i;
>>>>>> +            const uint8_t *c = nl_attr_get(nla);
>>>>>> +            ds_put_cstr(ds, ",cookie=");
>>>>>> +            for (i = 0; i < nl_attr_get_size(nla); i++) {
>>>>>> +                ds_put_format(ds, "%02x", c[i]);
>>
>>
>> We print userdata in some other actions as dot-separated bytes.
>> maybe do the same here?  Otherwise we should at least add 0x to
>> the front to signify that it is a hex data.
>>
> Sure.
> 
>> Best regards, Ilya Maximets.
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to