> Does this still happen after you replaced the RTE_PTR_ADD() with native 
> pointer arithmetic in the checksum function?
> In other words: Is this workaround still necessary?

Yes, unfortunately the workaround is still necessary with native pointer access.
I updated the reproducer which shows this case:
https://gist.github.com/Scottmitch/bf23748b4588e68c9bdb8d124f92f1bd

> This is a showstopper:
> If the workaround is necessary, applications with similar use cases also need 
> to apply the workaround.
> If we cannot somehow enforce that, the series is likely to break some 
> applications, which is unacceptable.

That is a great point. This API isn't internal-only, so this would
effectively be an API-breaking change, which doesn't seem justified.

Given what I've learned through this process (thank you and Stephen for the
valuable feedback), we have a few paths to achieve my goal (getting clang to
optimize __rte_raw_cksum). I've verified that if the RTE_PTR_ADD macros are
changed to use char*, clang optimizes the code (and gcc still does too) [1].
To achieve this we have some options:

A. Modify RTE_PTR_[ADD|SUB] to use pointers
pros:
- [if API can be preserved] provides benefits to all use cases w/out usage
  changes
- no additional API surface to expose
cons:
- more complex macro implementation to preserve API compatibility.

B. Add RTE_CONST_PTR_[ADD|SUB] with const [void*|char*] & use it in
__rte_raw_cksum
pros:
- no risk of impacting existing RTE_PTR_[ADD|SUB] APIs
- simple implementation using pointers from the start
cons:
- API may not support all the use cases RTE_PTR_[ADD|SUB] does (e.g. ptr arg
  passed as a raw integer)
- requires manual opt-in to new API to get any benefit
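
To make the trade-off concrete, here is a rough sketch of the two forms (not
the actual patch). The SKETCH_* names and helper functions are placeholders
made up for illustration. The first macro does integer (uintptr_t) arithmetic,
which is my understanding of the shape RTE_PTR_ADD has today; the second does
char* arithmetic, as option B would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: integer-based form. The cast to uintptr_t means the
 * argument may be either a pointer or a raw integer address. */
#define SKETCH_PTR_ADD_INT(ptr, x) ((void *)((uintptr_t)(ptr) + (x)))

/* Hypothetical name: pointer-based form (option B). The cast to
 * const char * keeps the arithmetic in the pointer domain, so the
 * argument is expected to be a pointer, not a raw integer. */
#define SKETCH_CONST_PTR_ADD(ptr, x) \
	((const void *)((const char *)(ptr) + (x)))

/* Returns 1 when both forms compute the same address for a pointer arg. */
static int sketch_forms_agree(const void *base, size_t off)
{
	return SKETCH_PTR_ADD_INT(base, off) == SKETCH_CONST_PTR_ADD(base, off);
}

/* The integer-based form also accepts a raw integer address; this is the
 * kind of existing usage that option A would have to keep working. */
static void *sketch_add_to_raw(uintptr_t raw, size_t off)
{
	return SKETCH_PTR_ADD_INT(raw, off);
}
```

Both forms give the same address for in-bounds pointer arguments; the
difference is only in what the compiler is allowed to assume, and in whether a
raw integer is accepted as the first argument.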

I have a draft of A that I will submit as a patch, and we can discuss whether
it makes sense or fall back to B (or other approaches).

[1] https://godbolt.org/z/5bc1bTrhe
