Re: [PATCH 2/2] string: improve strlen performance

Andy Shevchenko Thu, 02 May 2024 08:00:55 -0700

On Thu, May 2, 2024 at 5:14 PM Hsin-Yu.Chen <[email protected]> wrote:
>
> Port `strlen` in gcc,


This is the Linux kernel project. What does this mean?
Also note we refer to the function as strlen() (and no [back]quotes).

> which enhance performance over 10 times

>
> Please refer to these following articles
> 1. [Determine if a word has a byte less than n]
>    (https://graphics.stanford.edu/~seander/bithacks.html#HasLessInWord)
> 2. [Determine if a word has a zero byte]
>    (https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord)

Make these proper Link: tags

Link: URL#1 [1]
Link: URL#2 [2]

...

> -       const char *sc;
> +       const char *char_ptr;

No need to change the var name for this.

> +       const unsigned long *longword_ptr;
> +       unsigned long longword, himagic, lomagic;

Keep it in reversed xmas tree order.

...

> +       /* Handle the first few characters by reading one character at a time.
> +        * Do this until CHAR_PTR is aligned on a longword boundary.
> +        */

/*
 * This is the wrong comment style. You may
 * use this example.
 */

...

> +       for (char_ptr = s; ((unsigned long) char_ptr
> +               & (sizeof(longword) - 1)) != 0;
> +               ++char_ptr)

This is too verbose (too many unneeded symbols) and why pre-increment?
What is special about it?

...

> +       /* All these elucidatory comments refer to 4-byte longwords,
> +        * but the theory applies equally well to 8-byte longwords.
> +        */

Use proper style.

> +       longword_ptr = (unsigned long *) char_ptr;

No space after casting and why do you need it?

...

> +       /* Bits 31, 24, 16, and 8 of this number are zero.
> +        * Call these bits the "holes."
> +        * Note that there is a hole just to the left of
> +        * each byte, with an extra at the end:
> +        * bits:  01111110 11111110 11111110 11111111
> +        * bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
> +        * The 1-bits make sure that carries propagate to the next 0-bit.
> +        * The 0-bits provide holes for carries to fall into.
> +        */

Use proper style.

...

> +               /* 64-bit version of the magic. */
> +               /* Do the shift in two steps to avoid a warning if long has 
> 32 bits.
> +                */

Ditto.

...

> +       if (sizeof(longword) > 8)
> +               abort();

Huh?!

...

> +       /* Instead of the traditional loop which tests each character,
> +        * we will test a longword at a time.  The tricky part is testing
> +        * if *any of the four* bytes in the longword in question are zero.
> +        */

Proper style, please.

...

> +       for (;;) {
> +               longword = *longword_ptr++;
> +               if (((longword - lomagic) & ~longword & himagic) != 0) {
> +
> +                       /* Which of the bytes was the zero?
> +                        * If none of them were, it was a misfire; continue 
> the search.
> +                        */
> +                       const char *cp = (const char *) (longword_ptr - 1);

> +                       if (cp[0] == 0)
> +                               return cp - s;
> +                       else if (cp[1] == 0)
> +                               return cp - s + 1;
> +                       else if (cp[2] == 0)
> +                               return cp - s + 2;
> +                       else if (cp[3] == 0)
> +                               return cp - s + 3;

> +                       if (sizeof(longword) > 4) {

  if (... <= 4)
    continue;

> +                               if (cp[4] == 0)
> +                                       return cp - s + 4;
> +                               else if (cp[5] == 0)
> +                                       return cp - s + 5;
> +                               else if (cp[6] == 0)
> +                                       return cp - s + 6;
> +                               else if (cp[7] == 0)
> +                                       return cp - s + 7;

A lot of redundant 'else':s.

> +                       }
> +               }
> +       }

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH 2/2] string: improve strlen performance

Reply via email to