Re: [PATCH] Hex-coding optimizations using SVE on ARM.

John Naylor Tue, 14 Jan 2025 22:57:53 -0800

On Tue, Jan 14, 2025 at 11:57 PM Nathan Bossart
<[email protected]> wrote:
>
> On Tue, Jan 14, 2025 at 12:59:04AM -0500, Tom Lane wrote:
> > John Naylor <[email protected]> writes:
> >> We can do about as well simply by changing the nibble lookup to a byte
> >> lookup, which works on every compiler and architecture:
>
> Nice.  I tried enabling auto-vectorization and loop unrolling on top of
> this patch, and the numbers looked the same.  I think we'd need CPU
> intrinsics or an even bigger lookup table to do any better.


Thanks for looking further! Yeah, I like that the table is still only 512 bytes.

> > I didn't attempt to verify your patch, but I do prefer addressing
> > this issue in a machine-independent fashion.  I also like the brevity
> > of the patch (though it could do with some comments perhaps, not that
> > the existing code has any).
>
> +1

Okay, I added a comment. I also agree with Michael that my quick
one-off was a bit hard to read so I've cleaned it up a bit. I plan to
commit the attached by Friday, along with any bikeshedding that
happens by then.

--
John Naylor
Amazon Web Services

From a62aea5fdbfbd215435ddc4c294897caa292b6f7 Mon Sep 17 00:00:00 2001
From: John Naylor <[email protected]>
Date: Wed, 15 Jan 2025 13:28:26 +0700
Subject: [PATCH v3] Speed up hex_encode with bytewise lookup

Previously, hex_encode looked up each nibble of the input
separately. We now use a larger lookup table containing the two-byte
encoding of every possible input byte, resulting in a 1/3 reduction
in encoding time.

Reviewed by Michael Paquier, Tom Lane, and Nathan Bossart

Discussion: https://postgr.es/m/CANWCAZZvXuJMgqMN4u068Yqa19CEjS31tQKZp_qFFFbgYfaXqQ%40mail.gmail.com
---
 src/backend/utils/adt/encode.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index 4a6fcb56cd..7fee154b0d 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -145,7 +145,23 @@ binary_decode(PG_FUNCTION_ARGS)
  * HEX
  */
 
-static const char hextbl[] = "0123456789abcdef";
+static const char hextbl[512] =
+"000102030405060708090a0b0c0d0e0f"
+"101112131415161718191a1b1c1d1e1f"
+"202122232425262728292a2b2c2d2e2f"
+"303132333435363738393a3b3c3d3e3f"
+"404142434445464748494a4b4c4d4e4f"
+"505152535455565758595a5b5c5d5e5f"
+"606162636465666768696a6b6c6d6e6f"
+"707172737475767778797a7b7c7d7e7f"
+"808182838485868788898a8b8c8d8e8f"
+"909192939495969798999a9b9c9d9e9f"
+"a0a1a2a3a4a5a6a7a8a9aaabacadaeaf"
+"b0b1b2b3b4b5b6b7b8b9babbbcbdbebf"
+"c0c1c2c3c4c5c6c7c8c9cacbcccdcecf"
+"d0d1d2d3d4d5d6d7d8d9dadbdcdddedf"
+"e0e1e2e3e4e5e6e7e8e9eaebecedeeef"
+"f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff";
 
 static const int8 hexlookup[128] = {
 	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
@@ -165,9 +181,16 @@ hex_encode(const char *src, size_t len, char *dst)
 
 	while (src < end)
 	{
-		*dst++ = hextbl[(*src >> 4) & 0xF];
-		*dst++ = hextbl[*src & 0xF];
+		unsigned char usrc = *((unsigned char *) src);
+
+		/*
+		 * Each input byte results in two output bytes, so we use the unsigned
+		 * input byte multiplied by two as the lookup key.
+		 */
+		memcpy(dst, &hextbl[2 * usrc], 2);
+
 		src++;
+		dst += 2;
 	}
 	return (uint64) len * 2;
 }
-- 
2.47.1

Re: [PATCH] Hex-coding optimizations using SVE on ARM.

Reply via email to