On 2022-07-08 15:02, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
>> Sent: Friday, 8 July 2022 14.44
>>
>> On 2022-07-07 23:44, Morten Brørup wrote:
>>>> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
>>>> Sent: Thursday, 7 July 2022 20.35
>>>>
>>>> From: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
>>>>
>>>> __rte_raw_cksum() (used by rte_raw_cksum() among others) accessed
>> its
>>>> data through an uint16_t pointer, which allowed the compiler to
>> assume
>>>> the data was 16-bit aligned. This in turn would, with certain
>>>> architectures and compiler flag combinations, result in code with
>> SIMD
>>>> load or store instructions with restrictions on data alignment.
>>>>
>>>> This patch keeps the old algorithm, but data is read using memcpy()
>>>> instead of direct pointer access, forcing the compiler to always
>>>> generate code that handles unaligned input. The __may_alias__ GCC
>>>> attribute is no longer needed.
>>>>
>>>> The data on which the Internet checksum functions operates are
>> almost
>>>> always 16-bit aligned, but there are exceptions. In particular, the
>>>> PDCP protocol header may (literally) have an odd size.
>>>>
>>>> Performance impact seems to range from none to a very slight
>>>> regression.
>>>>
>>>> Bugzilla ID: 1035
>>>> Cc: sta...@dpdk.org
>>>>
>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
>>>> ---
>>>>    lib/net/rte_ip.h | 19 ++++++++++++-------
>>>>    1 file changed, 12 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
>>>> index b502481670..a9e6251f14 100644
>>>> --- a/lib/net/rte_ip.h
>>>> +++ b/lib/net/rte_ip.h
>>>> @@ -160,18 +160,23 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr
>>>> *ipv4_hdr)
>>>>    static inline uint32_t
>>>>    __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>>>>    {
>>>> -  /* extend strict-aliasing rules */
>>>> -  typedef uint16_t __attribute__((__may_alias__)) u16_p;
>>>> -  const u16_p *u16_buf = (const u16_p *)buf;
>>>> -  const u16_p *end = u16_buf + len / sizeof(*u16_buf);
>>>> +  const void *end;
>>>
>>> I would set "end" here instead, possibly making the pointer const
>> too. And add spaces around '/'.
>>> const void * const end = RTE_PTR_ADD(buf, (len / sizeof(uint16_t)) *
>> sizeof(uint16_t));
>>>
>>
>> I don't think that makes the code more readable.
> 
> It's only a matter of taste... Your code, your decision. :-)
> 
> I think the spaces are required by the coding standard; not sure, though.
> 

If it isn't in the coding standard, it should be. But if you add spaces, 
you have to break the line, to fit into 80 characters. A net loss, IMO.

>>
>>>>
>>>> -  for (; u16_buf != end; ++u16_buf)
>>>> -          sum += *u16_buf;
>>>> +  for (end = RTE_PTR_ADD(buf, (len/sizeof(uint16_t)) *
>>>> sizeof(uint16_t));
>>>> +       buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
>>>> +          uint16_t v;
>>>> +
>>>> +          memcpy(&v, buf, sizeof(uint16_t));
>>>> +          sum += v;
>>>> +  }
>>>>
>>>>            /* if length is odd, keeping it byte order independent */
>>>>            if (unlikely(len % 2)) {
>>>> +          uint8_t last;
>>>>                    uint16_t left = 0;
>>>> -          *(unsigned char *)&left = *(const unsigned char *)end;
>>>> +
>>>> +          memcpy(&last, end, 1);
>>>> +          *(unsigned char *)&left = last;
>>>
>>> Couldn't you just memcpy(&left, end, 1), and omit the temporary
>> variable "last"?
>>>
>>
>> Good point.
>>
>> I don't like how this code is clever vis-à-vis byte order, but then I
>> also don't have a better suggestion.
> 
> The byte ordering cleverness has its roots in RFC 1071.
> 
> Stephen suggested using a union, although in a slightly different context. 
> I'm not sure it will be more readable here, because it will require #ifdef to 
> support byte ordering. Just thought I'd mention it, for your consideration.
> 
> Your patch v2 just reached my inbox, and it looks good. No further response 
> to this email is expected.
> 

Reply via email to