On 5/13/24 20:59, Adrian Moreno wrote:
>
>
> On 5/10/24 3:06 PM, Ilya Maximets wrote:
>> On 5/10/24 14:01, Eelco Chaudron wrote:
>>>
>>>
>>> On 10 May 2024, at 12:45, Adrian Moreno wrote:
>>>
>>>> On 5/10/24 12:06 PM, Eelco Chaudron wrote:
>>>>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>>>>
>>>>>> The new odp sample attributes allow userspace to specify a group_id and
>>>>>> user-defined cookie to be passed down to psample.
>>>>>>
>>>>>> Add support for parsing and formatting such action.
>>>>>>
>>>>>> Signed-off-by: Adrian Moreno <[email protected]>
>>>>>
>>>>> Hi Adrian,
>>>>>
>>>>> See my comments below inline.
>>>>>
>>>>> //Eelco
>>>>>
>>>>>> ---
>>>>>> include/linux/openvswitch.h | 49 +++++++++---
>>>>>> lib/odp-execute.c | 3 +
>>>>>> lib/odp-util.c | 150 ++++++++++++++++++++++++++---------
>>>>>> ofproto/ofproto-dpif-ipfix.c | 2 +
>>>>>> python/ovs/flow/odp.py | 2 +
>>>>>> tests/odp.at | 5 ++
>>>>>> 6 files changed, 163 insertions(+), 48 deletions(-)
>>>>>>
>>>>>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>>>>>> index d9fb991ef..3e748405c 100644
>>>>>> --- a/include/linux/openvswitch.h
>>>>>> +++ b/include/linux/openvswitch.h
>>>>>> @@ -696,6 +696,7 @@ enum ovs_flow_attr {
>>>>>> #define OVS_UFID_F_OMIT_MASK (1 << 1)
>>>>>> #define OVS_UFID_F_OMIT_ACTIONS (1 << 2)
>>>>>>
>>>>>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16
>>>>>> /**
>>>>>> * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE
>>>>>> action.
>>>>>> * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample
>>>>>> with
>>>>>> @@ -703,15 +704,27 @@ enum ovs_flow_attr {
>>>>>> * %UINT32_MAX samples all packets and intermediate values sample
>>>>>> intermediate
>>>>>> * fractions of packets.
>>>>>> * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling
>>>>>> event.
>>>>>> - * Actions are passed as nested attributes.
>>>>>> + * Actions are passed as nested attributes. Optional if
>>>>>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set.
>>>>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as
>>>>>> psample group.
>>>>>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set.
>>>>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie
>>>>>> that, if
>>>>>> + * provided, will be copied to the psample cookie.
>>>>>
>>>>> Should we mention any limit on the actual length of the cookie?
>>>>>
>>>>> Any reason we explicitly call this psample? From an OVS perspective, this
>>>>> could just be
>>>>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE
>>>>> and _GROUP. Other datapaths, do not have PSAMPLE.
>>>>>
>>>>
>>>> See my response to your comment on the cover-letter for possible name
>>>> alternatives.
>>>
>>> ACK, we can continue the discussion there.
>>>
>>>>> Thinking about it more, from an OVS perspective we should probably not
>>>>> even send down a COOKIE, but we should send down an obs_domain_id and
>>>>> obs_point_id, and then the kernel can move this into a cookie.
>>>>>
>>>>
>>>> I did consider this. However, the opaque cookie fits well with the tc API
>>>> which does not have two 32-bit values but an opaque 128-bit cookie. So if
>>>> the action is offloaded OVS needs to "encode" the two 32-bit values into
>>>> the cookie anyways.
>>>
>>> Which is fine to me, the OVS API should be clear that we have two u32
>>> entries. The cookie is implementation-specific, and we use the netlink
>>> format as our API for all DPs.
>>
>>
>> I'm not sure about this. It is a kernel interface and we can't deliver
>> two separate values from the psample. So, passing two separate values
>> to the kernel doesn't make a lot of sense. They are going to be mangled
>> into a single cookie anyway and we'll have to document that somehow.
>> We may as well always mangle the data from the beginning, document once
>> at the top level and save ourselves from documenting a million differences
>> between different implementations. While USDT could report them separately,
>> I'm not sure there is a ton of value in that.
>>
>>>
>>>>> The collector itself could be called system/local collector, or something
>>>>> like that. This way it can use for example psample for kernel and UDST
>>>>> probes for usespace. Both can pass the group and cookie (obs_domain_id
>>>>> and obs_point_id).
>>>>>
>>>>> Also, the presence of any of this should not dictate the psample action,
>>>>> we probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag.
>>>>>
>>>>
>>>> It clearer and it might simplify option verification a bit, but isn't a
>>>> bit redundant? There is no flag for action execution for instance, the
>>>> presence of the attribute is enough.
>>>>
>>>> An alternative would be to have nested attributes, e.g:
>>>>
>>>> enum ovs_sample_attr {
>>>> OVS_SAMPLE_ATTR_UNSPEC,
>>>> OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>> OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */
>>>> OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_*
>>>> attributes. */
>>>>
>>>> __OVS_SAMPLE_ATTR_MAX,
>>>> };
>>>>
>>>> enum ovs_sample_system_attr {
>>>> OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN, /* u32 number */
>>>> OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */
>>>> OVS_SAMPLE_SYSTEM_ATTR_COOKIE, /* Nested OVS_ACTION_ATTR_*
>>>>
>>>> __OVS_SSAMPLE_SYSTEM_ATTR_MAX,
>>>> };
>>>>
>>>> WDYT?
>>>>
>>>>> So I would envision your delta to look something like this:
>>>>>
>>>>> enum ovs_sample_attr {
>>>>> OVS_SAMPLE_ATTR_UNSPEC,
>>>>> - OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>>> - OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */
>>>>> + OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>>> + OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_
>>>>> + * attributes.
>>>>> + */
>>>>> + OVS_SAMPLE_ATTR_OBS_DOMAIN, /* Observation domain id, u32
>>>>> number. */
>>>>> + OVS_SAMPLE_ATTR_OBS_POINT, /* Observation point id, u32 number.
>>>>> */
>>>>> + OVS_SAMPLE_ATTR_SYSTEM_TARGET, /* Flag to additionally sample to
>>>>> system target. */
>>>>> + OVS_SAMPLE_ATTR_SYSTEM_GROUP, /* System target group id, u32 number. */
>>>>> __OVS_SAMPLE_ATTR_MAX,
>>>
>>> I’ve been thinking about it a bit more (since I wrote this comment), and I
>>> guess we might not need the extra flag, as we could use the
>>> OVS_SAMPLE_ATTR_SYSTEM_GROUP as the key for doing a local system copy (or
>>> whatever name we come up with). The OBS DOMAIN/POINT can be used in the
>>> future by others.
>>>
>>>>>> *
>>>>>> - * Executes the specified actions with the given probability on a
>>>>>> per-packet
>>>>>> - * basis.
>>>>>> + * Either OVS_SAMPLE_ATTR_PSAMPLE_GROUP or OVS_SAMPLE_ATTR_ACTIONS must
>>>>>> be
>>>>>> + * specified.
>>>>>> + *
>>>>>> + * Executes the specified actions and/or sends the packet to psample
>>>>>> + * with the given probability on a per-packet basis.
>>>>>> */
>>>>>> enum ovs_sample_attr {
>>>>>> OVS_SAMPLE_ATTR_UNSPEC,
>>>>>> - OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>>>> - OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_*
>>>>>> attributes. */
>>>>>> + OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */
>>>>>> + OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_
>>>>>
>>>>> Missing *
>>>>>
>>>>
>>>> Yeah, it's a pitty the "*" makes checkpatch raise a warning.
>>>>
>>>>>> + * attributes.
>>>>>> + */
>>>>>> + OVS_SAMPLE_ATTR_PSAMPLE_GROUP, /* u32 number */
>>>>>> + OVS_SAMPLE_ATTR_PSAMPLE_COOKIE, /* binary */
>>>>>> __OVS_SAMPLE_ATTR_MAX,
>>>>>>
>>>>>> #ifdef __KERNEL__
>>>>>> @@ -722,13 +735,27 @@ enum ovs_sample_attr {
>>>>>> #define OVS_SAMPLE_ATTR_MAX (__OVS_SAMPLE_ATTR_MAX - 1)
>>>>>>
>>>>>> #ifdef __KERNEL__
>>>>>> +
>>>>>> +/* Definition for flags in struct sample_arg. */
>>>>>> +enum {
>>>>>> + /* When actions in sample will not change the flows. */
>>>>>> + OVS_SAMPLE_ARG_FLAG_EXEC = 1 << 0,
>>>>>> + /* When set, the packet will be sent to psample. */
>>>>>> + OVS_SAMPLE_ARG_FLAG_PSAMPLE = 1 << 1,
>>>>>
>>>>> Same on psample name here, should it be FLAG_SYSTEM_TARGET instead? See
>>>>> the above comment about an explicit option.
>>>>>
>>>>>> +};
>>>>>> +
>>>>>> struct sample_arg {
>>>>>> - bool exec; /* When true, actions in sample
>>>>>> will not
>>>>>> - * change flow keys. False
>>>>>> otherwise.
>>>>>> - */
>>>>>> - u32 probability; /* Same value as
>>>>>> - * 'OVS_SAMPLE_ATTR_PROBABILITY'.
>>>>>> - */
>>>>>> + u16 flags; /* Flags that modify the behavior of the
>>>>>> + * action. See SAMPLE_ARG_FLAG_*
>>>>>> + */
>>>>>> + u32 probability; /* Same value as
>>>>>> + * 'OVS_SAMPLE_ATTR_PROBABILITY'.
>>>>>> + */
>>>>>> + u32 group_id; /* Same value as
>>>>>> + * 'OVS_SAMPLE_ATTR_PSAMPLE_GROUP'.
>>>>>> + */
>>>>>> + u8 cookie_len; /* Length of psample cookie */
>>>>>> + char cookie[OVS_PSAMPLE_COOKIE_MAX_SIZE]; /* psample cookie
>>>>>> data. */
>>>>>
>>>>> Would it make sense for the cookie also to be u8?
>>>>>
>>>>
>>>> Yes.
>>>>
>>>>>> };
>>>>>> #endif
>>>>>>
>>>>>> diff --git a/lib/odp-execute.c b/lib/odp-execute.c
>>>>>> index 081e4d432..d8ee93429 100644
>>>>>> --- a/lib/odp-execute.c
>>>>>> +++ b/lib/odp-execute.c
>>>>>> @@ -716,6 +716,9 @@ odp_execute_sample(void *dp, struct dp_packet
>>>>>> *packet, bool steal,
>>>>>> subactions = a;
>>>>>> break;
>>>>>>
>>>>>> + /* Ignored in userspace datapath. */
>>>>>> + case OVS_SAMPLE_ATTR_PSAMPLE_GROUP:
>>>>>> + case OVS_SAMPLE_ATTR_PSAMPLE_COOKIE:
>>
>> Action that has these attributes should be marked as action that requires
>> datapath assistance. And we should never hit them here.
>>
>> We should have a test case covering a packet sent from a controller or
>> execution with a debug_slow flag.
>>
>
> Yes, I already have it ready for the next round. It does work.
>
>> BTW, this code can be executed for kernel datapath as well if the packet
>> goes to userspace after upcall or comes from a controller, so the comment
>> is not correct.
>>
>
> Yes. How about replacing it with something like:
> /* If the action contains these attributes it must be executed with datapath
> assistance so it's safe to ignore them here.
> */
>
> For context, what I plan to add in the next version is something like this:
>
> diff --git a/lib/odp-execute.c b/lib/odp-execute.c
> index d8ee93429..9a405e62f 100644
> --- a/lib/odp-execute.c
> +++ b/lib/odp-execute.c
> @@ -823,11 +823,22 @@ requires_datapath_assistance(const struct nlattr *a)
> case OVS_ACTION_ATTR_METER:
>
> return true;
>
>
> + case OVS_ACTION_ATTR_SAMPLE:
> + {
'{' on a previous line.
> + const struct nlattr *nested;
> + unsigned int left;
An empty line here.
> + NL_NESTED_FOR_EACH (nested, left, a) {
> + if (nl_attr_type(nested) == OVS_SAMPLE_ATTR_PSAMPLE_GROUP) {
> + return true;
> + }
> + }
> + return false;
> + }
> +
Otherwise, seems reasonable at a first glance, but see my other reply
about attributes vs actions.
> case OVS_ACTION_ATTR_SET:
> case OVS_ACTION_ATTR_SET_MASKED:
> case OVS_ACTION_ATTR_PUSH_VLAN:
> case OVS_ACTION_ATTR_POP_VLAN:
> - case OVS_ACTION_ATTR_SAMPLE:
> case OVS_ACTION_ATTR_HASH:
> case OVS_ACTION_ATTR_PUSH_MPLS:
> case OVS_ACTION_ATTR_POP_MPLS:
>
>
>
>>
>>>>>> case OVS_SAMPLE_ATTR_UNSPEC:
>>>>>> case __OVS_SAMPLE_ATTR_MAX:
>>>>>> default:
>>>>>> diff --git a/lib/odp-util.c b/lib/odp-util.c
>>>>>> index 21f34d955..611b5229e 100644
>>>>>> --- a/lib/odp-util.c
>>>>>> +++ b/lib/odp-util.c
>>>>>> @@ -227,12 +227,16 @@ format_odp_sample_action(struct ds *ds, const
>>>>>> struct nlattr *attr,
>>>>>> {
>>>>>> static const struct nl_policy ovs_sample_policy[] = {
>>>>>> [OVS_SAMPLE_ATTR_PROBABILITY] = { .type = NL_A_U32 },
>>>>>> - [OVS_SAMPLE_ATTR_ACTIONS] = { .type = NL_A_NESTED }
>>>>>> + [OVS_SAMPLE_ATTR_ACTIONS] = { .type = NL_A_NESTED,
>>>>>> + .optional = true },
>>>>>> + [OVS_SAMPLE_ATTR_PSAMPLE_GROUP] = { .type = NL_A_U32,
>>>>>> + .optional = true },
>>>>>> + [OVS_SAMPLE_ATTR_PSAMPLE_COOKIE] = { .type = NL_A_UNSPEC,
>>>>>> + .optional = true }
>>>>>> };
>>>>>> struct nlattr *a[ARRAY_SIZE(ovs_sample_policy)];
>>>>>> + const struct nlattr *nla;
>>>>>> double percentage;
>>>>>> - const struct nlattr *nla_acts;
>>>>>> - int len;
>>>>>>
>>>>>> ds_put_cstr(ds, "sample");
>>>>>>
>>>>>> @@ -244,13 +248,33 @@ format_odp_sample_action(struct ds *ds, const
>>>>>> struct nlattr *attr,
>>>>>> percentage = (100.0 *
>>>>>> nl_attr_get_u32(a[OVS_SAMPLE_ATTR_PROBABILITY])) /
>>>>>> UINT32_MAX;
>>>>>>
>>>>>> - ds_put_format(ds, "(sample=%.1f%%,", percentage);
>>>>>> + ds_put_format(ds, "(sample=%.1f%%", percentage);
>>>>>>
>>>>>> - ds_put_cstr(ds, "actions(");
>>>>>> - nla_acts = nl_attr_get(a[OVS_SAMPLE_ATTR_ACTIONS]);
>>>>>> - len = nl_attr_get_size(a[OVS_SAMPLE_ATTR_ACTIONS]);
>>>>>> - format_odp_actions(ds, nla_acts, len, portno_names);
>>>>>> - ds_put_format(ds, "))");
>>>>>> + nla = a[OVS_SAMPLE_ATTR_PSAMPLE_GROUP];
>>>>>> + if (nla) {
>>>>>> + ds_put_format(ds, ",group_id=%d", nl_attr_get_u32(nla));
>>>>>> +
>>>>>> + nla = a[OVS_SAMPLE_ATTR_PSAMPLE_COOKIE];
>>>>>> +
>>>>>> + if (nla) {
>>>>>> + size_t i;
>>>>>> + const uint8_t *c = nl_attr_get(nla);
>>>>>> + ds_put_cstr(ds, ",cookie=");
>>>>>> + for (i = 0; i < nl_attr_get_size(nla); i++) {
>>>>>> + ds_put_format(ds, "%02x", c[i]);
>>
>>
>> We print userdata in some other actions as dot-separated bytes.
>> maybe do the same here? Otherwise we should at least add 0x to
>> the front to signify that it is a hex data.
>>
> Sure.
>
>> Best regards, Ilya Maximets.
>>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev