Hi.

On Wed, Jan 15, 2025 at 07:57, John Naylor <johncnaylo...@gmail.com>
wrote:

> On Wed, Jan 15, 2025 at 2:14 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> > Couple of thoughts:
> >
> > 1. I was actually hoping for a comment on the constant's definition,
> > perhaps along the lines of
> >
> > /*
> >  * The hex expansion of each possible byte value (two chars per value).
> >  */
>
> Works for me. With that, did you mean we then wouldn't need a comment
> in the code?
>
> > 2. Since "src" is defined as "const char *", I'm pretty sure that
> > pickier compilers will complain that
> >
> > +               unsigned char usrc = *((unsigned char *) src);
> >
> > results in casting away const.  Recommend
> >
> > +               unsigned char usrc = *((const unsigned char *) src);
>
> Thanks for the reminder!
>
> > 3.  I really wonder if
> >
> > +               memcpy(dst, &hextbl[2 * usrc], 2);
> >
> > is faster than copying the two bytes manually, along the lines of
> >
> > +               *dst++ = hextbl[2 * usrc];
> > +               *dst++ = hextbl[2 * usrc + 1];
> >
> > Compilers that inline memcpy() may arrive at the same machine code,
> > but why rely on the compiler to make that optimization?  If the
> > compiler fails to do so, an out-of-line memcpy() call will surely
> > be a loser.
>
> See measurements at the end. As for compilers, gcc 3.4.6 and clang
> 3.0.0 can inline the memcpy. The manual copy above only gets combined
> to a single word starting with gcc 12 and clang 15, and latest MSVC
> still can't do it (4A in the godbolt link below). Are there any
> buildfarm animals around that may not inline memcpy for word-sized
> input?
>
> > A variant could be
> >
> > +               const char *hexptr = &hextbl[2 * usrc];
> > +               *dst++ = hexptr[0];
> > +               *dst++ = hexptr[1];
> >
> > but this supposes that the compiler fails to see the common
> > subexpression in the other formulation, which I believe
> > most modern compilers will see.
>
> This combines to a single word starting with clang 5, but does not
> work on gcc 14.2 or gcc trunk (4B below). I have gcc 14.2 handy, and
> on my machine bytewise load/stores are somewhere in the middle:
>
> master    1158.969 ms
> v3         776.791 ms
> variant 4A 775.777 ms
> variant 4B 969.945 ms
>
> https://godbolt.org/z/ajToordKq

Your godbolt example has an important difference in the declaration of
hextbl, which changes the generated assembly:

-static const char hextbl[] =
"000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff"
;
+static const char hextbl[512] =
"000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff"
;
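
To make the difference between the two declarations concrete, here is a
minimal sketch using a hypothetical 2-entry table in place of the full
256-entry hextbl (tbl_unsized, tbl_sized, hex_pair and tables_agree are
names made up for this illustration, not from the patch). The only thing
it demonstrates is the object size: with [] the compiler stores the
terminating NUL, with an explicit size it does not. It does not show by
itself why any particular compiler would emit different code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 2-entry stand-ins for the 256-entry hextbl. */

/* Unsized: the terminating NUL is stored, so the object is 4 + 1 bytes. */
static const char tbl_unsized[] = "0001";

/*
 * Explicit size: exactly 4 bytes, no NUL stored.  This is valid C (but not
 * valid C++), and some compilers may warn about the unterminated string.
 */
static const char tbl_sized[4] = "0001";

/* Copy the two hex digits for "value" from a table laid out like hextbl. */
static void
hex_pair(const char *tbl, unsigned char value, char *dst)
{
	memcpy(dst, &tbl[2 * value], 2);
}

/* Returns 1 if both declarations yield the digits "01" for entry 1. */
static int
tables_agree(void)
{
	char		a[2];
	char		b[2];

	hex_pair(tbl_unsized, 1, a);
	hex_pair(tbl_sized, 1, b);
	return memcmp(a, b, 2) == 0 && a[0] == '0' && a[1] == '1';
}
```

By the same rule, the full table is 513 bytes when declared with [] and
512 bytes when declared with [512]; lookups return the same digits in
both cases.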


best regards,
Ranier Vilela
