On 10 May 2024, at 10:23, Adrian Moreno wrote:

> On 5/10/24 9:14 AM, Eelco Chaudron wrote:
>>
>>
>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote:
>>
>>> This is the userspace counterpart of the work being done in the kernel
>>> [1]. Sending it as RFC to get some early feedback on the overall
>>> solution.
>>>
>>> ** Problem description **
>>> Currently, OVS supports several observability features, such as
>>> per-bridge IPFIX, per-flow IPFIX and sFlow. However, given the
>>> complexity of OVN-generated pipelines, a single sample is typically not
>>> enough to troubleshoot what is OVN/OVS is doing with the packet. We need
>>> highler level metadata alongside the packet sample.
>>>
>>> This can be achieved by the means of per-flow IPFIX sampling and
>>> NXAST_SAMPLE action, through which OVN can add two arbitrary 32-bit
>>> values (observation_domain_id and observation_point_id) that are sent
>>> alongside the packet information in the IPFIX frames.
>>>
>>> However, there is a fundamental limitation of the existing sampling
>>> infrastructure, specially in the kernel datapath: samples are handled by
>>> ovs-vswitchd, forcing the need for an upcall (userspace action). The
>>> fact that samples share the same netlink sockets and handler thread cpu
>>> time with actual packet misse, can easily cause packet drops making this
>>> solution unfit for use in highly loaded production systems.
>>>
>>> Users are left with little option than guessing what sampling rate will
>>> be OK for their traffic pattern and dealing with the lost accuracy.
>>>
>>> ** Feature Description **
>>> In order to solve this situation and enable this feature to be safely
>>> turned on in production, this RFC uses the psample support proposed in
>>> [1] to send NXAST_SAMPLE samples to psample adding the observability
>>> domain and point information as metadata.
>>>
>>> ~~ API ~~
>>> The API is simply a new field called "psample_group" in the
>>> Flow_Sample_Collector_Set (FSCS) table. Configuring this value is
>>> orthogonal to also associating the FSCS entry to an entry in the IPFIX
>>> table.
>>>
>>> Apart from that configuration, the controller needs to add NXAST_SAMPLE
>>> actions that refer the entry's id.
>>>
>>> ~~ HW Offload ~~
>>> psample is already supported by tc using the act_sample action. The
>>> metadata is taken from the actions' cookie.
>>> This series also adds support for offloading the odp sample action,
>>> only when it includes psample information but not nested actions.
>>>
>>> A bit of special care has to be taken in the tc layer to not store the
>>> ufid in the sample's cookie as it's used to carry action-specific data.
>>>
>>> ~~ Datapath support ~~
>>> This new behavior of the datapth sample action is only supported on the
>>> kernel datapath. A more detailed analysis is needed (and planned) to
>>> find a way to also optimize the userspace datapath. However, although
>>> potentially inefficient, there is not that much risk of dropping packets
>>> with the existing IPFIX infrastructure.
>>>
>>> ~~ Testing ~~
>>> The series includes an utility program called "ovs-psample" that listens
>>> to packets multicasted by the psample module and prints them (also
>>> printing the obs_domain and obs_point ids). In addition the kernel
>>> series includes a tracepoint for easy testing and troubleshooting.
>>>
>>> [1] https://patchwork.kernel.org/project/netdevbpf/list/?series=847473
>>
>> Hi Andrew,
>>
>
> Who is Andrew? :-D

Sorry :( Handling too many things in parallel...

>> Thanks for sending out this RFC series. I did a code-only review (no 
>> testing), and the main concern I have (besides some locking) is the direct 
>> mapping to psample. If we decide that for userspace we are going to use USDT 
>> probes, we need another target, and we will duplicate a lot of code. My 
>> proposal would be to have a more general target name like system, or local 
>> (or a better name ;). This will be a system-local dpif target with a 32-bit 
>> group id.
>>
>
> Regarding the name, I agree adding "psample" to the API can be confusing for 
> userspace datapath. Besides, I think it should also be in sync with the odp 
> naming:
>
> "system": My initial thought was it could be confused with the "system@" name 
> of the kernel datapath, but that is not present in the OVSDB level so it 
> could work. SAMPLE_ATTR_{SYSTEM_GROUP,SYSTEM_USER_COOKIE}?
>
> "offload": I did consider this but I discarded it as it could be confused 
> with actual hw/tc offloading of the sample action.
>
> "local": Funny, I initially thought of the opposite, i.e: "external" as the 
> sample collection and potential encoding (which is currently "local" to 
> ovs-vswitchd) is now some "external" process. Still I guess the fact you 
> thought of the opposite term suggests it's not expressive enough.
>
> "raw": As in, samples are not IPFIX any more (unless the external collector 
> decides to export them in IPFIX) but we're exporting the raw packet 
> (alongside some metadata). SAMPLE_ATTR_{RAW_GROUP, RAW_USER_COOKIE}?
>
> "dpif": As in, it depends on the dpif implementation. However, this term kind 
> of internal to OVS (it appears only once in ovs-vswitchd.conf.db(5) without 
> much clarification). Its more human synonym "datapath", or "dp" for short 
> would yield kind of redundant odp attributes: 
> SAMPLE_ATTR_{DP_GROUP,DP_USER_COOKIE} but could still work...
>
> Any other suggestion or preference?

Yes naming is hard, I just thought of ‘localhost’, ‘localEndpoint’, just adding 
another option…

Does anyone else have a cool name?!?

>> I’ll send out my comments to the individual patches shortly.
>>
>> //Eelco
>>
>>> Adrian Moreno (11):
>>>    odp-util: Add support to new psample uAPI.
>>>    ofproto_dpif: Check for psample support.
>>>    ofproto: Add ofproto-dpif-psample.
>>>    vswitchd: Add psample to schema and configure it.
>>>    ofproto_dpif_xlate: Use psample for OFP samples.
>>>    utilities: Add ovs-psample.
>>>    netlink-offload-tc: Rename act_cookie->flow_cookie.
>>>    netdev-offload-tc: Add sample support.
>>>    ofproto-dpif-xlate-cache: Add psample to xcache.
>>>    ofproto-dpif-psample: Add command to show stats.
>>>    tests: Add test for sample offloading.
>>>
>>>   Documentation/automake.mk           |   1 +
>>>   Documentation/conf.py               |   2 +
>>>   Documentation/ref/index.rst         |   1 +
>>>   Documentation/ref/ovs-psample.8.rst |  47 ++++
>>>   include/linux/automake.mk           |   6 +-
>>>   include/linux/openvswitch.h         |  49 ++++-
>>>   include/linux/pkt_cls.h             |   2 +
>>>   include/linux/psample.h             |  64 ++++++
>>>   include/linux/tc_act/tc_sample.h    |  26 +++
>>>   lib/netdev-offload-tc.c             |  75 ++++++-
>>>   lib/odp-execute.c                   |   3 +
>>>   lib/odp-util.c                      | 150 +++++++++----
>>>   lib/tc.c                            |  98 ++++++++-
>>>   lib/tc.h                            |   9 +-
>>>   ofproto/automake.mk                 |   2 +
>>>   ofproto/ofproto-dpif-ipfix.c        |   2 +
>>>   ofproto/ofproto-dpif-psample.c      | 266 ++++++++++++++++++++++
>>>   ofproto/ofproto-dpif-psample.h      |  39 ++++
>>>   ofproto/ofproto-dpif-xlate-cache.c  |  11 +-
>>>   ofproto/ofproto-dpif-xlate-cache.h  |   6 +
>>>   ofproto/ofproto-dpif-xlate.c        | 158 +++++++++----
>>>   ofproto/ofproto-dpif-xlate.h        |   3 +-
>>>   ofproto/ofproto-dpif.c              |  96 +++++++-
>>>   ofproto/ofproto-dpif.h              |   7 +-
>>>   ofproto/ofproto-provider.h          |   9 +
>>>   ofproto/ofproto.c                   |  10 +
>>>   ofproto/ofproto.h                   |   8 +
>>>   python/ovs/flow/odp.py              |   2 +
>>>   rhel/openvswitch-fedora.spec.in     |   2 +
>>>   rhel/openvswitch.spec.in            |   1 +
>>>   tests/odp.at                        |   5 +
>>>   tests/system-common-macros.at       |   4 +
>>>   tests/system-offloads-traffic.at    |  53 +++++
>>>   utilities/automake.mk               |   9 +
>>>   utilities/ovs-psample.c             | 330 ++++++++++++++++++++++++++++
>>>   vswitchd/bridge.c                   |  54 ++++-
>>>   vswitchd/vswitch.ovsschema          |   7 +-
>>>   vswitchd/vswitch.xml                |  32 ++-
>>>   38 files changed, 1533 insertions(+), 116 deletions(-)
>>>   create mode 100644 Documentation/ref/ovs-psample.8.rst
>>>   create mode 100644 include/linux/psample.h
>>>   create mode 100644 include/linux/tc_act/tc_sample.h
>>>   create mode 100644 ofproto/ofproto-dpif-psample.c
>>>   create mode 100644 ofproto/ofproto-dpif-psample.h
>>>   create mode 100644 utilities/ovs-psample.c
>>>
>>> -- 
>>> 2.44.0
>>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to