> -----Original Message-----
> From: Eelco Chaudron <[email protected]>
> Sent: Thursday, June 23, 2022 4:38 PM
> To: Finn, Emma <[email protected]>
> Cc: Stokes, Ian <[email protected]>; Van Haaren, Harry
> <[email protected]>; [email protected]
> Subject: Re: [PATCH v7 02/11] odp-execute: Add function pointers to
> odp-execute
> for different action implementations.
>
> On 14 Jun 2022, at 13:57, Emma Finn wrote:
<snip patch contents>
> > --
> > 2.32.0
>
>
> General discussion item from v6:
Thanks Eelco for the detailed review.
> EC> I have another general architecture question? What if for a specific
> packet (or
> batch of packets) the SIMD implementation is not able to process the
> packet/batch?
> EC> How can we handle this in a uniform way? Can you give this some thought
> and
> add support?
>
> EF> We support 100% of the action (regardless of input packet, we perform same
> action as scalar actions code). Autovalidation ensures the modifications are
> identical.
> EF> As a result there is no value in a "fallback to scalar" backup, as this
> must never
> occur.
>
> What I'm trying to say is, that in certain cases, your SIMD specific code
> might not be
> able to optimize the action handling, and it might want to fall back to the
> scaler
> implementation.
Today, the AVX512 implementation in this patchset will *always* be able to
handle the action.
There is no requirement for scalar fallback.
> If there is no way to handle this, it could lead to a lot of code
> duplication. This might
> not be the case in your current patch set (maybe patch 10 would need it), but
> could
> be in the future. Let's take the following "potential" example:
If in future a patchset is proposed with these requirements, we can refactor
based on
actual requirements of the patchset then. I don't like the idea of adding
building something
now "just in case" a future patchset might want something.
> Someone comes along and wants to offload these actions using OpenCL. As OpenCL
> is designed to do things in parallel, he wants to process an entire batch of
> packets in
> one go, but there is a constraint, all packets need to be at least 128 bytes
> long. If this
> is not the case, the scalar approach would be faster.
>
> So I'm suggesting we need an API like this example:
>
> if (all_in_batch_ge_128(batch) {
> openCL_handle_batch(batch)
> } else {
> odp_execute_scalar_action(batch, action);
> }
Understood, and perhaps this would be good. I recommend we find out if/when a
real
world patchset is proposed so we can discuss implementation details with
confidence.
For 2.18 release, I feel the existing approach is fine because it addresses the
actual needs
of the AVX512 implementation today. None of the details here are user-visible,
we can
refactor if patchsets like the above (e.g. OpenCL or whatever) is posted in a
future release.
<snip more patch suggestions/contents>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev