On 4/28/22 13:49, Ilya Maximets wrote:
> On 4/28/22 11:06, Jianbo Liu wrote:
>> On Wed, 2022-04-27 at 13:37 +0200, Ilya Maximets wrote:
>>> On 4/27/22 04:41, Jianbo Liu wrote:
>>>> On Wed, 2022-04-27 at 01:58 +0200, Ilya Maximets wrote:
>>>>> On 4/26/22 09:14, Simon Horman wrote:
>>>>>> On Tue, Apr 19, 2022 at 01:38:33AM +0000, Jianbo Liu via dev
>>>>>> wrote:
>>>>>>> On Thu, 2022-04-07 at 09:15 +0000, Jianbo Liu wrote:
>>>>>>>> This series is to add support for tc offloading of ovs
>>>>>>>> metering, and
>>>>>>>> enhance OVS to use new kernel feature which offload tc
>>>>>>>> police
>>>>>>>> action to
>>>>>>>> hardware.
>>>>>>>> To do the offloading, new APIs for meter are added in
>>>>>>>> netdev-
>>>>>>>> offload,
>>>>>>>> and OVS meters are mapped to tc police actions with one-to-
>>>>>>>> one
>>>>>>>> relationship for dpif-netlink.
>>>>>>>>
>>>>>>>> Notes:
>>>>>>>> v3
>>>>>>>> - Add netdev-offload APIs for meter.
>>>>>>>> - Move the implementation to lower netdev-offload-tc
>>>>>>>> layer.
>>>>>>>>
>>>>>>>> v2
>>>>>>>> - Move tc police parse call from last patch to 2nd
>>>>>>>> patch.
>>>>>>>> 2nd patch is adding the parse lib func and add the
>>>>>>>> call
>>>>>>>> to it in
>>>>>>>> that patch.
>>>>>>>> Last patch is the put of tc police act.
>>>>>>>> In 2nd patch also add empty switch case for tc police
>>>>>>>> act
>>>>>>>> put as
>>>>>>>> the impl.
>>>>>>>> is done in the last patch when support for put is
>>>>>>>> being
>>>>>>>> added.
>>>>>>
>>>>>> ...
>>>>>>
>>>>>>> Friendly ping for review.
>>>>>>
>>>>>> Hi Jianbo,
>>>>>>
>>>>>> sorry for the delayed response.
>>>>>>
>>>>>> My colleagues at Corigine have exercised this patchset and have
>>>>>> verified
>>>>>> it for our test-cases.
>>>>>>
>>>>>> Also, I have run the upstream test workflows over them, and
>>>>>> they
>>>>>> look good.
>>>>>>
>>>>>> 1. netdev-offload: Add meter offload provider
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220399423&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=8lRjZK35bENpZ3vRbMFBfGWJLXX4p%2BfjR57JEhADucw%3D&reserved=0
>>>>>> 2. tc: Add support parsing tc police action
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220409233&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=iorfBm3gxzS3fILCBF8h18%2FWLSUQXEhc2g7jWMn%2F4ug%3D&reserved=0
>>>>>> 3. netdev-linux: Refactor put police action netlink message
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220516986&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=9%2BKMNZqAe6CM7ysWXwbeFpIyf%2FF44NdSJj6WVN46IdM%3D&reserved=0
>>>>>> 4. netdev-linux: Add functions to manipulate tc police action
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220566774&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=wMPlN5cYZJfZloAZ48bKLpS3SKSMTyZAovWlSPICXxY%3D&reserved=0
>>>>>> 5. netdev-offload-tc: Implement and register meter offload API
>>>>>> for
>>>>>> tc
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220568491&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=MFKxDa7ZDScho536VVERyLS00uF76BgcbCS1q%2BRQoTI%3D&reserved=0
>>>>>> 6. netdev-offload-tc: Cleanup police actions with reserved
>>>>>> indexes
>>>>>> on startup
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220569761&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ef%2BFz7twkOqc9PQC%2FQPkgaqHwreKHpnw4iuPyp%2BdFDQ%3D&reserved=0
>>>>>> 7. netdev-offload-tc: Offloading rules with police actions
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220571039&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=VjzfzOLeU%2FRgHhFBfGwTAUXvcRjIuYsY4%2BNV4l3OEb8%3D&reserved=0
>>>>>> 8. dpif-netlink: Offloading meter to tc police action
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220572904&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=HWC5u8pDX3vPRfeoJnXXBUiMlJDNYeApPFNauX2zVbc%3D&reserved=0
>>>>>>
>>>>>> I have also reviewed the patches, and with my OVS maintainer
>>>>>> hat
>>>>>> on, I'm
>>>>>> happy to apply them, and plan to do so in the coming days
>>>>>> unless
>>>>>> there are
>>>>>> objections.
>>>>>
>>>>> Thanks, Simon, for taking care of this patch set! And sorry for
>>>>> me
>>>>> not being
>>>>> very responsive.
>>>>>
>>>>> I had a quick look over the patch set just now and I think that
>>>>> in
>>>>> this
>>>>> version most of my main concerns were addressed and it looks good
>>>>> in
>>>>> general.
>>>>>
>>>>> One thing I didn't understand is why we need to have 'struct
>>>>> meter_offload_api'
>>>>> as a separate entity? I mean, it never used if a normal offload
>>>>> API
>>>>> is
>>>>> not initialized and it's tied with the particular offload API.
>>>>> Can
>>>>> these
>>>>> callbacks be just part of the normal 'struct netdev_flow_api' ?
>>>>> Should not be hard to merge meter_offload_api and netdev_flow_api
>>>>> into one.
>>>>> That should also reduce the amount of code. Or am I missing
>>>>> something?
>>>>
>>>> Conceptually, meter is used globally, and it does not belong to any
>>>> perticular netdev/flow. All the callbacks in 'struct
>>>> netdev_flow_api'
>>>> have a parameter related to netdev, but meter callbacks don't need.
>>>> The meter_offload_set/get/del can be part of 'struct
>>>> netdev_flow_api',
>>>> but meter_offload_init/destory must be placed to elsewhere, to
>>>> avoid
>>>> being called more than once.
>>>> So, why not add new API for meter? But I'd like to change if you
>>>> insist. What's your decision?
>>>
>>> I agree that meters do not belong to any particular netdev, that's
>>> true. However, netdev_flow_api is also a global entity. All netdevs
>>> are using the same instance of the netdev_fow_api. And most of the
>>> structures in netdev-offload-tc are global (hash maps,
>>> configurations).
>>> 'netdev' argument can just be viewed as a guide for these callbacks,
>>> like meter_id for the meter offload API, not something necessary.
>>
>> Usually, we get netdev_flow_api from netdev->flow_api, but dpif_class-
>>> meter_get/set/del don't have the param of netdev, we must add a helper
>> to get any netdev from dpif. The helper is more like:
>>
>> struct netdev *
>> netdev_get_any(const char *dpif_type)
>> {
>> struct port_to_netdev_data *data;
>> struct netdev *dev = NULL;
>>
>> ovs_rwlock_rdlock(&netdev_hmap_rwlock);
>> HMAP_FOR_EACH (data, portno_node, &port_to_netdev) {
>> if (netdev_get_dpif_type(data->netdev) == dpif_type) {
>> dev = netdev_ref(data->netdev);
>> break;
>> }
>> }
>> ovs_rwlock_unlock(&netdev_hmap_rwlock);
>>
>> return dev;
>> }
>>
>> Do you accept it, or have better solution?
>
> Hmm. Good point, I missed that.
> In fact, a single dpif can have ports with different offload API
> initialized. For example, linux ports may use TC, while DPDK ports
> can use DPDK's rte_flow while being in the same dpif-netdev instance.
>
> So, I think, the correct solution is to add a meter to all flow APIs
> that are initialized for ports within that datapath. Maybe something
> like this (not tested, pseudo-code):
>
> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
> index fb108c0d5..91504ee15 100644
> --- a/lib/netdev-offload.c
> +++ b/lib/netdev-offload.c
> @@ -79,6 +79,9 @@ struct netdev_registered_flow_api {
> /* Number of references: one for the flow_api itself and one for every
> * instance of the netdev that uses it. */
> struct ovs_refcount refcnt;
> +
> + /* Reference counter for dpif types which has a port with this flow API.
> */
> + struct simap used_by_dpif_types;
> };
>
> static struct netdev_registered_flow_api *
> @@ -120,6 +123,7 @@ netdev_register_flow_api_provider(const struct
> netdev_flow_api *new_flow_api)
> hash_string(new_flow_api->type, 0));
> rfa->flow_api = new_flow_api;
> ovs_refcount_init(&rfa->refcnt);
> + simap_init(&rfa->used_by_dpif_types);
> VLOG_DBG("netdev: flow API '%s' registered.", new_flow_api->type);
> }
> ovs_mutex_unlock(&netdev_flow_api_provider_mutex);
> @@ -154,6 +158,7 @@ netdev_unregister_flow_api_provider(const char *type)
> } else {
> cmap_remove(&netdev_flow_apis, &rfa->cmap_node,
> hash_string(rfa->flow_api->type, 0));
> + ovsrcu_postpone(simap_destroy, &rfa->used_by_dpif_types);
> ovsrcu_postpone(free, rfa);
> error = 0;
> }
> @@ -182,6 +187,8 @@ netdev_assign_flow_api(struct netdev *netdev)
> CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
> if (!rfa->flow_api->init_flow_api(netdev)) {
> ovs_refcount_ref(&rfa->refcnt);
> + simap_increase(&rfa->used_by_dpif_types,
> + netdev_get_dpif_type(netdev), 1);
> ovsrcu_set(&netdev->flow_api, rfa->flow_api);
> VLOG_INFO("%s: Assigned flow API '%s'.",
> netdev_get_name(netdev), rfa->flow_api->type);
> @@ -344,6 +351,8 @@ netdev_uninit_flow_api(struct netdev *netdev)
>
> ovsrcu_set(&netdev->flow_api, NULL);
> rfa = netdev_lookup_flow_api(flow_api->type);
> + ovs_assert(simap_increase(&rfa->used_by_dpif_types,
> + netdev_get_dpif_type(netdev), -1) >= 0);
> ovs_refcount_unref(&rfa->refcnt);
> }
>
> @@ -739,6 +748,20 @@ netdev_ports_get_n_flows(const char *dpif_type,
> odp_port_t port_no,
> return ret;
> }
>
> +
> +void
> +netdev_offload_meter_set(const char *dpif_type, ...)
> +{
> + struct netdev_registered_flow_api *rfa;
> +
> + CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
> + if (simap_get(&rfa->used_by_dpif_types, dpif_type) > 0) {
> + rfa->flow_api->meter_set(...);
> + }
> + }
> +}
> +
> +
> odp_port_t
> netdev_ifindex_to_odp_port(int ifindex)
> {
> ---
>
> ?
Thinking more, there is no point in counting netdevs, since they
can be added after meters already created. So, I guess, we just
have to always add meters to all registered flow API providers,
i.e. just:
void
netdev_offload_meter_set(...)
{
struct netdev_registered_flow_api *rfa;
CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
rfa->flow_api->meter_set(...);
}
}
>
>>
>>> (I guess, we could rename netdev_flow_api to just offload_api to make
>>> this more clear, but let's not think about that too much for now to
>>> not overcomplicate things.)
>>>
>>> meter_offload_init/destory APIs might not be even needed.
>>>
>>> meter_tc_init_policer can be called directly from the
>>> netdev_tc_init_flow_api,
>>> if made re-enter-able (if (meter_police_ids) { return; }), or can be
>>> called from under the existing ovsthread_once_start(&once).
>>>
>>> meter_offload_destory in current implementation only frees the id
>>> pool.
>>> This might be not necessary for the following reasons:
>>> - If datapath is destroyed, but the process is still running, it's
>>> likely that we're waiting for the creation of a new datapath that
>>> will re-use the id-pool.
>>> - If we're going to shut down the OVS, it's nice but not necessary
>>> to free all the memory before doing so. Since the id-pool is a
>>> global
>>> variable, that will not be treated as a memory leak by sanitizers.
>>>
>>> What do you think?
>>>
>>
>> It makes sense.
>>
>>>
>>> Forgot to mention in a previous reply that the patch 8/8 of the patch
>>> set should add a NEWS entry for the new feature (the NEWS file is a
>>> bit mangled at the moment, but Alin is going to fix it soon).
>>> Some test in the tests/system-offloads-traffic.at would be also nice
>>> to have, if possible.
>>>
>>>>
>>>> Thanks!
>>>> Jianbo
>>>>
>>>>>
>>>>> Another small nit is that, I guess, lib/tc.c is a better home for
>>>>> most of the
>>>>> tc_police functions from lib/netdev-linux.c. But I agree that
>>>>> they
>>>>> are
>>>>> already mixed up and it's kind of a separate task to split them.
>>>>> So,
>>>>> it's
>>>>> not really a concern for this patch set.
>>>>>
>>>>>>
>>>>>> One area that we, at Corigine, would like to follow-up on is
>>>>>> using
>>>>>> a
>>>>>> revalidation process rather than evicting rules on start-up.
>>>>>
>>>>> If TC police deletion failures can happen, that can become a
>>>>> problem
>>>>> for
>>>>> long-running setups indeed. Not sure what the best solution
>>>>> should
>>>>> be, but
>>>>> I agree that this case has to be handled in some way.
>>>>>
>>>>> Best regards, Ilya Maximets.
>>>>
>>>
>>> _______________________________________________
>>> dev mailing list
>>> [email protected]
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dazl4Ccgfrkb7tkne2e6c%2BZTHIjZ2RJqjchon%2BA9l%2BY%3D&reserved=0
>>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev