On 3/17/22 02:12, Chris Mi wrote:
> This patch set adds offload support for sFlow.
>
> Psample is a genetlink channel for packet sampling. TC action act_sample
> uses psample to send sampled packets to userspace.
>
> When offloading sample action to TC, userspace creates a unique ID to
> map sFlow action and tunnel info and passes this ID to kernel instead
> of the sFlow info. psample will send this ID and sampled packet to
> userspace. Using the ID, userspace can recover the sFlow info and send
> sampled packet to the right sFlow monitoring host.
>
> v2-v1:
> - Fix robot errors.
> v3-v2:
> - Remove Gerrit Change-Id.
> - Add patch #9 to fix older kernels build issue.
> - Add travis test result.
> v4-v3:
> - Fix offload issue when sampling rate is 1.
> v5-v4:
> - Move polling thread from ofproto to netdev-offload-tc.
> v6-v5:
> - Rebase.
> - Add GitHub Actions test result.
> v7-v6:
> - Remove Gerrit Change-Id.
> - Fix "ERROR: Inappropriate spacing around cast"
> v8-v7
> - Address Eelco Chaudron's comment for patch #11.
> v9-v8
> - Remove sflow_len from struct dpif_sflow_attr.
> - Log a debug message for other userspace actions.
> v10-v9
> - Address Eelco Chaudron's comments on v9.
> v11-v10
> - Fix a bracing error.
> v12-v11
> - Add duplicate sample group id check.
> v13-v12
> - Remove the psample poll thread from netdev-offload-tc and reuse
> ofproto handler thread according to Ilya's new desgin.
> - Add dpif-offload-provider layer according to Eli's suggestion.
> v14-v13
> - Fix a robot error.
> v15-v14
> - Address Eelco Chaudron's comments on v14.
> v16-v15
> - Address Eelco Chaudron's comments on v15.
> - Add two test cases.
> v17-v16
> - Address Eelco Chaudron's comments on v16.
> - Move struct dpif_offload_api from struct dpif_class to struct dpif.
> v18-v17
> - Rename dpif_offload_api to dpif_offload_class.
> - Add init and destroy callbacks in dpif_offload_class. They are called
> when registering dpif_offload_class.
> v19-18
> - Fix a bug that psample_sock is destroyed when last bridge is deleted.
> v20-19
> - Move buf_stub to struct dpif_offload_sflow avoid garbage values when
> ofproto proceses the sampled packet.
> v21-20
> - Remove netdev dummy for dpif-offload according to Eelco's comment.
>
> Chris Mi (8):
> compat: Add psample and tc sample action defines for older kernels
> ovs-kmod-ctl: Load kernel module psample
> dpif-offload-provider: Introduce dpif-offload-provider layer
> netdev-offload-tc: Introduce group ID management API
> dpif-offload-netlink: Implement dpif-offload-provider API
> ofproto: Introduce API to process sFlow offload packet
> netdev-offload-tc: Add offload support for sFlow
> system-offloads-traffic.at: Add sFlow offload test cases
>
> NEWS | 1 +
> include/linux/automake.mk | 4 +-
> include/linux/psample.h | 62 ++++
> include/linux/tc_act/tc_sample.h | 25 ++
> lib/automake.mk | 3 +
> lib/dpif-netdev.c | 3 +-
> lib/dpif-netlink.c | 20 +-
> lib/dpif-offload-netlink.c | 227 +++++++++++++++
> lib/dpif-offload-provider.h | 94 +++++++
> lib/dpif-offload.c | 147 ++++++++++
> lib/dpif-provider.h | 10 +-
> lib/dpif.c | 3 +-
> lib/netdev-offload-tc.c | 466 +++++++++++++++++++++++++++++--
> lib/netdev-offload.h | 2 +
> lib/tc.c | 61 +++-
> lib/tc.h | 16 +-
> ofproto/ofproto-dpif-upcall.c | 73 +++++
> tests/system-offloads-traffic.at | 101 +++++++
> utilities/ovs-kmod-ctl.in | 14 +
> 19 files changed, 1299 insertions(+), 33 deletions(-)
> create mode 100644 include/linux/psample.h
> create mode 100644 include/linux/tc_act/tc_sample.h
> create mode 100644 lib/dpif-offload-netlink.c
> create mode 100644 lib/dpif-offload-provider.h
> create mode 100644 lib/dpif-offload.c
>
Hi, Chris, others.
Following up with a summary of what we talked about with Majd during
the conference with some additional technical details.
The main issue with this patch set right now is an dpif-offload-provider
concept. It might make sense on it's own, but as I said before, we
need to take 'all or nothing' approach with it, otherwise we'll stuck
with a half-baked solution scattered among different parts of OVS and
that will be not beneficial from the architecture perspective further
complicating already fairly complex solution.
The full implementation of the dpif-offload-provider concept is clearly
out of scope for the sFlow offload feature, so if we need it, it should
be done as a separate architectural change re-working all offload
implementations including both tc and rte_flow.
What I'm suggesting for the current patch set is to drop the
dpif-offload-provider parts and go back to the simple schema with
receiving upcalls from the regular dpif-provider. This should make the
code simpler and better contained in modules it belongs to.
Implementation suggestions:
1. OVS handler thread:
recv_upcalls()
--> dpif_recv()
--> dpif_netlink_recv()
if (dpif_netlink_recv_cpu_dispatch() == EAGAIN):
--> * netdev_offload_recv()
for each registered offload API:
--> * flow_api->recv()
--> * netdev_offload_tc_recv()
if handler.id == 0:
Receive psample message,
fill the struct dpif_upcall, etc.
--> break on success (one of the APIs
can starve, but we only have one
implementation, so it's not a problem
for today.)
--> upcall_receive()
--> process_upcall()
* new functions to implement.
Similar call chain for the recv_wait() callback.
2. process_offload_sflow() seems to duplicate existing upcall_receive(),
I'm not sure why we need that.
3. lib/id-pool.h or lib/id-fpool.h should probably be used to allocate
sample group IDs instead of looping over 'next_sample_group_id'.
Starving of netdev_offload_recv() can be solved by storing a flag in
the dpif_handler structure and alternating the order of receiving
normal upcalls and offload upcalls based on that flag, switching it
on each call, e.g.:
handler = &dpif->handlers[handler_id];
if (handler->recv_offload_first) {
error = netdev_offload_recv(handler_id, upcall, buf);
if (error == EAGAIN) {
error = dpif_netlink_recv__(dpif, handler_id, upcall, buf);
}
} else {
error = dpif_netlink_recv__(dpif, handler_id, upcall, buf);
if (error == EAGAIN) {
error = netdev_offload_recv(handler_id, upcall, buf);
}
}
handler->recv_offload_first = !handler->recv_offload_first;
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev