Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions

Aleksander Alekseev Wed, 18 Feb 2026 06:58:23 -0800

Hi,

> I only rebased v3 and improved the commit messages, but I didn't
> account for Masahiko Sawada's feedback for 0002. Andrey, are you still
> working on this or others can pick it up?
>
> The patch is not on the commitfest, so I'm about to add it.


Here is patch v5 where I accounted for the previous feedback from
Masahiko Sawada and also made some other changes, see below.

> How about the error message like "invalid input length for type uuid"?
> I think "uuid" should be lower case as it indicates PostgreSQL uuid
> data type, and it's better to use %s format instead of directly
> writing "uuid" (see string_to_uuid() for example).

Makes sense. Fixed.

> As for the errdetail message, should we add "bytea" also after "got %d"?

You probably meant "got %d bytes", not "got %d bytea". I believe the
current message is fine, but maybe native speakers will correct us.

> We already have tests for casting bytes to integer data types in
> strings.sql. I suggest moving the casting tests from bytea to uuid
> into therel.

I disagree on the grounds that there are zero tests related to UUID in
strings.sql; uuid.sql is a more appropriate place for these tests IMO.
However if someone seconds the idea we can easily move the tests at
any time.

> For the uuid.sql file, we could add a test to verify that
> a UUID value remains unchanged when it's cast to bytea and back to
> UUID. For example,
>
> SELECT v = v::bytea::uuid as matched FROM gen_random_uuid() v;

Good point. Added.

> base32hex_encode() doesn't seem to add '=' paddings, but is it
> intentional? I don't see any description in RFC 4648 that we can omit
> '=' paddings.

You are right, both base32 and base32hex should add paddings;
substring() can be used if necessary. Fixed.

> I think the patch should add tests not only for uuid data type but
> also for general cases like other encodings.

Yes, and the good place for these tests would be closer to other tests
for encode() and decode() i.e. strings.sql. Fixed.

While working on it I noticed some inconsistencies between base32hex
implementation and our current implementation of base64. As an
example, we don't allow `=` input:

```
=# SELECT decode('=', 'base64');
ERROR:  unexpected "=" while decoding base64 sequence
```

... while base32hex did. I fixed such inconsistencies too.

> In uuid.sql tests, how about adding some tests to check if base32hex
> maintains the sortability of UUIDv7 data?

Agree. Added.

> I think we should update the documentation in the uuid section about
> casting data between bytea and uuid. For references, we have a similar
> description for bytea and integer[1].

Fair point. Fixed.

-- 
Best regards,
Aleksander Alekseev

From 02bb2101b393587294cb3a93bd766091ca1cd32b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <[email protected]>
Date: Tue, 28 Oct 2025 16:33:17 +0000
Subject: [PATCH v5 1/2] Allow explicit casting between bytea and UUID
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This enables using encode() and decode() to convert UUIDs to and from
alternative formats, such as base64.

Author:	Dagfinn Ilmari Mannsåker <[email protected]>
Author: Aleksander Alekseev <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Andrey Borodin <[email protected]>
Reviewed-by: Jelte Fennema-Nio <[email protected]>
Discussion: https://postgr.es/m/CAJ7c6TOramr1UTLcyB128LWMqita1Y7%3Darq3KHaU%3Dqikf5yKOQ%40mail.gmail.com
---
 doc/src/sgml/datatype.sgml         | 11 +++++++++++
 src/backend/utils/adt/bytea.c      | 27 +++++++++++++++++++++++++++
 src/include/catalog/pg_cast.dat    |  6 ++++++
 src/include/catalog/pg_proc.dat    |  7 +++++++
 src/test/regress/expected/uuid.out | 22 ++++++++++++++++++++++
 src/test/regress/sql/uuid.sql      |  5 +++++
 6 files changed, 78 insertions(+)

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 3017c674040..f8264b119ab 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4439,6 +4439,17 @@ a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
     Output is always in the standard form.
    </para>
 
+   <para>
+   It is possible to cast <type>uuid</type> values to and from type
+   <type>bytea</type>. This allows using <literal>encode()</literal>
+   and <literal>decode()</literal> functions for <type>uuid</type>.
+   Some examples:
+<programlisting>
+encode('1ea3d64c-bc40-4cc3-84bb-6b11ee31e5c2'::uuid::bytea, 'base64')
+decode('HqPWTLxATMOEu2sR7jHlwg==', 'base64')::uuid
+</programlisting>
+   </para>
+
    <para>
     See <xref linkend="functions-uuid"/> for how to generate a UUID in
     <productname>PostgreSQL</productname>.
diff --git a/src/backend/utils/adt/bytea.c b/src/backend/utils/adt/bytea.c
index fd7662d41ee..4dc83671aa5 100644
--- a/src/backend/utils/adt/bytea.c
+++ b/src/backend/utils/adt/bytea.c
@@ -28,6 +28,7 @@
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/sortsupport.h"
+#include "utils/uuid.h"
 #include "varatt.h"
 
 /* GUC variable */
@@ -1340,3 +1341,29 @@ int8_bytea(PG_FUNCTION_ARGS)
 {
 	return int8send(fcinfo);
 }
+
+/* Cast bytea -> uuid */
+Datum
+bytea_uuid(PG_FUNCTION_ARGS)
+{
+	bytea	   *v = PG_GETARG_BYTEA_PP(0);
+	int			len = VARSIZE_ANY_EXHDR(v);
+	pg_uuid_t  *uuid;
+
+	if (len != UUID_LEN)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+				 errmsg("invalid input length for type %s", "uuid"),
+				 errdetail("Expected %d bytes, got %d.", UUID_LEN, len)));
+
+	uuid = (pg_uuid_t *) palloc(sizeof(pg_uuid_t));
+	memcpy(uuid->data, VARDATA_ANY(v), UUID_LEN);
+	PG_RETURN_UUID_P(uuid);
+}
+
+/* Cast uuid -> bytea; can just use uuid_send() */
+Datum
+uuid_bytea(PG_FUNCTION_ARGS)
+{
+	return uuid_send(fcinfo);
+}
diff --git a/src/include/catalog/pg_cast.dat b/src/include/catalog/pg_cast.dat
index 9b1cfb1b590..a7b6d812c5a 100644
--- a/src/include/catalog/pg_cast.dat
+++ b/src/include/catalog/pg_cast.dat
@@ -362,6 +362,12 @@
 { castsource => 'bytea', casttarget => 'int8', castfunc => 'int8(bytea)',
   castcontext => 'e', castmethod => 'f' },
 
+# Allow explicit coercions between bytea and uuid type
+{ castsource => 'bytea', casttarget => 'uuid', castfunc => 'uuid(bytea)',
+  castcontext => 'e', castmethod => 'f' },
+{ castsource => 'uuid', casttarget => 'bytea', castfunc => 'bytea(uuid)',
+  castcontext => 'e', castmethod => 'f' },
+
 # Allow explicit coercions between int4 and "char"
 { castsource => 'char', casttarget => 'int4', castfunc => 'int4(char)',
   castcontext => 'e', castmethod => 'f' },
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 83f6501df38..d2b30390671 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1208,6 +1208,13 @@
   proname => 'int8', prorettype => 'int8', proargtypes => 'bytea',
   prosrc => 'bytea_int8' },
 
+{ oid => '9880', descr => 'convert uuid to bytea',
+  proname => 'bytea', prorettype => 'bytea', proargtypes => 'uuid',
+  prosrc => 'uuid_bytea' },
+{ oid => '9881', descr => 'convert bytea to uuid',
+  proname => 'uuid', prorettype => 'uuid', proargtypes => 'bytea',
+  prosrc => 'bytea_uuid' },
+
 { oid => '449', descr => 'hash',
   proname => 'hashint2', prorettype => 'int4', proargtypes => 'int2',
   prosrc => 'hashint2' },
diff --git a/src/test/regress/expected/uuid.out b/src/test/regress/expected/uuid.out
index 95392003b86..d157ef7d0b3 100644
--- a/src/test/regress/expected/uuid.out
+++ b/src/test/regress/expected/uuid.out
@@ -305,5 +305,27 @@ SELECT uuid_extract_timestamp('11111111-1111-1111-1111-111111111111');  -- null
  
 (1 row)
 
+-- casts
+SELECT '5b35380a-7143-4912-9b55-f322699c6770'::uuid::bytea;
+               bytea                
+------------------------------------
+ \x5b35380a714349129b55f322699c6770
+(1 row)
+
+SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
+                 uuid                 
+--------------------------------------
+ 019a2f85-9ced-7225-b99d-9c55044a2563
+(1 row)
+
+SELECT '\x1234567890abcdef'::bytea::uuid; -- error
+ERROR:  invalid input length for type uuid
+DETAIL:  Expected 16 bytes, got 8.
+SELECT v = v::bytea::uuid as matched FROM gen_random_uuid() v;
+ matched 
+---------
+ t
+(1 row)
+
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
diff --git a/src/test/regress/sql/uuid.sql b/src/test/regress/sql/uuid.sql
index 465153a0341..f512f4dea1d 100644
--- a/src/test/regress/sql/uuid.sql
+++ b/src/test/regress/sql/uuid.sql
@@ -146,6 +146,11 @@ SELECT uuid_extract_timestamp('017F22E2-79B0-7CC3-98C4-DC0C0C07398F') = 'Tuesday
 SELECT uuid_extract_timestamp(gen_random_uuid());  -- null
 SELECT uuid_extract_timestamp('11111111-1111-1111-1111-111111111111');  -- null
 
+-- casts
+SELECT '5b35380a-7143-4912-9b55-f322699c6770'::uuid::bytea;
+SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
+SELECT '\x1234567890abcdef'::bytea::uuid; -- error
+SELECT v = v::bytea::uuid as matched FROM gen_random_uuid() v;
 
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
-- 
2.43.0

From 857ec382cf2a220748e11d6bed28c8b006282589 Mon Sep 17 00:00:00 2001
From: Andrey Borodin <[email protected]>
Date: Wed, 29 Oct 2025 15:53:12 +0400
Subject: [PATCH v5 2/2] Add base32hex encoding support to encode() and
 decode()

Implement base32hex encoding/decoding per RFC 4648 Section 7 for
encode() and decode() functions. This encoding uses the extended hex
alphabet (0-9, A-V) which preserves sort order.

The encode() function produces padded output, while decode() accepts
both padded and unpadded input. Decoding is case-insensitive.

Author: Andrey Borodin <[email protected]>
Author: Aleksander Alekseev <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Suggested-by: Sergey Prokhorenko <[email protected]>
Discussion: https://postgr.es/m/CAJ7c6TOramr1UTLcyB128LWMqita1Y7%3Darq3KHaU%3Dqikf5yKOQ%40mail.gmail.com
---
 doc/src/sgml/func/func-binarystring.sgml |  25 ++++
 src/backend/utils/adt/encode.c           | 158 +++++++++++++++++++++++
 src/test/regress/expected/strings.out    | 107 ++++++++++++++-
 src/test/regress/expected/uuid.out       |  16 +++
 src/test/regress/sql/strings.sql         |  29 ++++-
 src/test/regress/sql/uuid.sql            |   9 ++
 6 files changed, 342 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/func/func-binarystring.sgml b/doc/src/sgml/func/func-binarystring.sgml
index b256381e01f..51be0463eec 100644
--- a/doc/src/sgml/func/func-binarystring.sgml
+++ b/doc/src/sgml/func/func-binarystring.sgml
@@ -729,6 +729,7 @@
        <parameter>format</parameter> values are:
        <link linkend="encode-format-base64"><literal>base64</literal></link>,
        <link linkend="encode-format-base64url"><literal>base64url</literal></link>,
+       <link linkend="encode-format-base32hex"><literal>base32hex</literal></link>,
        <link linkend="encode-format-escape"><literal>escape</literal></link>,
        <link linkend="encode-format-hex"><literal>hex</literal></link>.
       </para>
@@ -804,6 +805,30 @@
      </listitem>
     </varlistentry>
 
+    <varlistentry id="encode-format-base32hex">
+     <term>base32hex
+      <indexterm>
+       <primary>base32hex format</primary>
+      </indexterm></term>
+     <listitem>
+      <para>
+       The <literal>base32hex</literal> format is that of
+       <ulink url="https://datatracker.ietf.org/doc/html/rfc4648#section-7";>
+       RFC 4648 Section 7</ulink>.  It uses the extended hex alphabet
+       (0-9, A-V) which preserves sort order when encoding binary data.
+       The <function>encode</function> function produces padded output,
+       while <function>decode</function> accepts both padded and unpadded
+       input. Decoding is case-insensitive and ignores whitespace characters.
+      </para>
+      <para>
+       This format can be used for encoding UUIDs in a compact, sortable format:
+       <literal>substring(encode(uuid_value :: bytea, 'base32hex') from 1 for 26)</literal>
+       produces a 26-character string compared to the standard 36-character
+       UUID representation.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry id="encode-format-escape">
      <term>escape
      <indexterm>
diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index f5f835e944a..6153a5cb5f2 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -825,6 +825,158 @@ esc_dec_len(const char *src, size_t srclen)
 	return len;
 }
 
+/*
+ * BASE32HEX
+ */
+
+static const char base32hex_table[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";
+
+static uint64
+base32hex_enc_len(const char *src, size_t srclen)
+{
+	/* 5 bytes encode to 8 characters, round up to multiple of 8 for padding */
+	return ((uint64) srclen + 4) / 5 * 8;
+}
+
+static uint64
+base32hex_dec_len(const char *src, size_t srclen)
+{
+	/* Decode length is (srclen * 5) / 8, but we may have padding */
+	return ((uint64) srclen * 5) / 8;
+}
+
+static uint64
+base32hex_encode(const char *src, size_t srclen, char *dst)
+{
+	const unsigned char *data = (const unsigned char *) src;
+	uint64		bits_buffer = 0;
+	int			bits_in_buffer = 0;
+	uint64		output_pos = 0;
+	size_t		i;
+
+	for (i = 0; i < srclen; i++)
+	{
+		/* Add 8 bits to the buffer */
+		bits_buffer = (bits_buffer << 8) | data[i];
+		bits_in_buffer += 8;
+
+		/* Extract 5-bit chunks while we have enough bits */
+		while (bits_in_buffer >= 5)
+		{
+			bits_in_buffer -= 5;
+			/* Extract top 5 bits */
+			dst[output_pos++] = base32hex_table[(bits_buffer >> bits_in_buffer) & 0x1F];
+			/* Clear the extracted bits by masking */
+			bits_buffer &= ((1ULL << bits_in_buffer) - 1);
+		}
+	}
+
+	/* Handle remaining bits (if any) */
+	if (bits_in_buffer > 0)
+	{
+		dst[output_pos++] = base32hex_table[(bits_buffer << (5 - bits_in_buffer)) & 0x1F];
+	}
+
+	/* Add padding to make length a multiple of 8 (per RFC 4648) */
+	while (output_pos % 8 != 0)
+	{
+		dst[output_pos++] = '=';
+	}
+
+	return output_pos;
+}
+
+static uint64
+base32hex_decode(const char *src, size_t srclen, char *dst)
+{
+	const unsigned char *data = (const unsigned char *) src;
+	uint64		bits_buffer = 0;
+	int			bits_in_buffer = 0;
+	uint64		output_pos = 0;
+	size_t		i;
+	int			pos = 0;		/* position within 8-character group (0-7) */
+	bool		end = false;	/* have we seen padding? */
+
+	for (i = 0; i < srclen; i++)
+	{
+		unsigned char c = data[i];
+		int			val;
+
+		/* Skip whitespace */
+		if (c == ' ' || c == '\t' || c == '\n' || c == '\r')
+			continue;
+
+		if (c == '=')
+		{
+			/*
+			 * Padding is only valid at positions 2, 4, 5, or 7 within an
+			 * 8-character group (corresponding to 1, 2, 3, or 4 input bytes).
+			 * We only check the position for the first '=' character.
+			 */
+			if (!end)
+			{
+				if (pos != 2 && pos != 4 && pos != 5 && pos != 7)
+					ereport(ERROR,
+							(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+							 errmsg("unexpected \"=\" while decoding %s sequence",
+									"base32hex")));
+				end = true;
+			}
+			pos++;
+			continue;
+		}
+
+		/* No data characters allowed after padding */
+		if (end)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("invalid symbol \"%.*s\" found while decoding %s sequence",
+							pg_mblen((const char *) &c), (const char *) &c,
+							"base32hex")));
+
+		/* Decode base32hex character (0-9, A-V, case-insensitive) */
+		if (c >= '0' && c <= '9')
+			val = c - '0';
+		else if (c >= 'A' && c <= 'V')
+			val = c - 'A' + 10;
+		else if (c >= 'a' && c <= 'v')
+			val = c - 'a' + 10;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("invalid symbol \"%.*s\" found while decoding %s sequence",
+							pg_mblen((const char *) &c), (const char *) &c,
+							"base32hex")));
+
+		/* Add 5 bits to buffer */
+		bits_buffer = (bits_buffer << 5) | val;
+		bits_in_buffer += 5;
+		pos++;
+
+		/* Extract 8-bit bytes when we have enough bits */
+		while (bits_in_buffer >= 8)
+		{
+			bits_in_buffer -= 8;
+			dst[output_pos++] = (unsigned char) (bits_buffer >> bits_in_buffer);
+			/* Clear the extracted bits */
+			bits_buffer &= ((1ULL << bits_in_buffer) - 1);
+		}
+
+		/* Reset position after each complete 8-character group */
+		if (pos == 8)
+			pos = 0;
+	}
+
+	/* Verify no extra bits remain (padding bits should be zero) */
+	if (bits_in_buffer > 0 && (bits_buffer & ((1ULL << bits_in_buffer) - 1)) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid base32hex end sequence"),
+				 errhint("Input data has non-zero padding bits.")));
+
+	return output_pos;
+}
+
 /*
  * Common
  */
@@ -854,6 +1006,12 @@ static const struct
 			pg_base64url_enc_len, pg_base64url_dec_len, pg_base64url_encode, pg_base64url_decode
 		}
 	},
+	{
+		"base32hex",
+		{
+			base32hex_enc_len, base32hex_dec_len, base32hex_encode, base32hex_decode
+		}
+	},
 	{
 		"escape",
 		{
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index f38688b5c37..910757537e7 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2605,9 +2605,114 @@ SELECT decode('00', 'invalid');           -- error
 ERROR:  unrecognized encoding: "invalid"
 HINT:  Valid encodings are "base64", "base64url", "escape", and "hex".
 --
--- base64url encoding/decoding
+-- base32hex encoding/decoding
 --
 SET bytea_output TO hex;
+SELECT encode('', 'base32hex');  -- ''
+ encode 
+--------
+ 
+(1 row)
+
+SELECT encode('\x11', 'base32hex');  -- '24======'
+  encode  
+----------
+ 24======
+(1 row)
+
+SELECT encode('\x1122', 'base32hex');  -- '24H0===='
+  encode  
+----------
+ 24H0====
+(1 row)
+
+SELECT encode('\x112233', 'base32hex');  -- '24H36==='
+  encode  
+----------
+ 24H36===
+(1 row)
+
+SELECT encode('\x11223344', 'base32hex');  -- '24H36H0='
+  encode  
+----------
+ 24H36H0=
+(1 row)
+
+SELECT encode('\x1122334455', 'base32hex');  -- '24H36H2L'
+  encode  
+----------
+ 24H36H2L
+(1 row)
+
+SELECT encode('\x112233445566', 'base32hex');  -- '24H36H2LCO======'
+      encode      
+------------------
+ 24H36H2LCO======
+(1 row)
+
+SELECT decode('', 'base32hex');  -- ''
+ decode 
+--------
+ \x
+(1 row)
+
+SELECT decode('24======', 'base32hex');  -- \x11
+ decode 
+--------
+ \x11
+(1 row)
+
+SELECT decode('24H0====', 'base32hex');  -- \x1122
+ decode 
+--------
+ \x1122
+(1 row)
+
+SELECT decode('24H36===', 'base32hex');  -- \x112233
+  decode  
+----------
+ \x112233
+(1 row)
+
+SELECT decode('24H36H0=', 'base32hex');  -- \x11223344
+   decode   
+------------
+ \x11223344
+(1 row)
+
+SELECT decode('24H36H2L', 'base32hex');  -- \x1122334455
+    decode    
+--------------
+ \x1122334455
+(1 row)
+
+SELECT decode('24H36H2LCO======', 'base32hex');  -- \x112233445566
+     decode     
+----------------
+ \x112233445566
+(1 row)
+
+SELECT decode('24', 'base32hex');  -- OK, padding `=` are optional
+ decode 
+--------
+ \x11
+(1 row)
+
+SELECT decode('24h36h2lco', 'base32hex');  -- OK, the encoding is case-insensitive
+     decode     
+----------------
+ \x112233445566
+(1 row)
+
+SELECT decode('=', 'base32hex');  -- error
+ERROR:  unexpected "=" while decoding base32hex sequence
+SELECT decode('W', 'base32hex');  -- error
+ERROR:  invalid symbol "W" found while decoding base32hex sequence
+SELECT decode('24H36H0=24', 'base32hex'); -- error
+ERROR:  invalid symbol "2" found while decoding base32hex sequence
+--
+-- base64url encoding/decoding
+--
 -- Simple encoding/decoding
 SELECT encode('\x69b73eff', 'base64url');  -- abc-_w
  encode 
diff --git a/src/test/regress/expected/uuid.out b/src/test/regress/expected/uuid.out
index d157ef7d0b3..df25a6aa5c5 100644
--- a/src/test/regress/expected/uuid.out
+++ b/src/test/regress/expected/uuid.out
@@ -327,5 +327,21 @@ SELECT v = v::bytea::uuid as matched FROM gen_random_uuid() v;
  t
 (1 row)
 
+-- make sure base32hex encoding works with UUIDs and preserves ordering
+SELECT orig, encode(orig :: bytea, 'base32hex') AS enc
+FROM unnest(ARRAY[
+    '123e4567-e89b-12d3-a456-426614174000',
+    '00000000-0000-0000-0000-000000000000',
+    '11111111-1111-1111-1111-111111111111',
+    'ffffffff-ffff-ffff-ffff-ffffffffffff'] :: uuid[]
+) AS orig ORDER BY enc;
+                 orig                 |               enc                
+--------------------------------------+----------------------------------
+ 00000000-0000-0000-0000-000000000000 | 00000000000000000000000000======
+ 11111111-1111-1111-1111-111111111111 | 248H248H248H248H248H248H24======
+ 123e4567-e89b-12d3-a456-426614174000 | 28V4APV8JC9D792M89J185Q000======
+ ffffffff-ffff-ffff-ffff-ffffffffffff | VVVVVVVVVVVVVVVVVVVVVVVVVS======
+(4 rows)
+
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index d8a09737668..b5237e85172 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -835,10 +835,37 @@ SELECT encode('\x01'::bytea, 'invalid');  -- error
 SELECT decode('00', 'invalid');           -- error
 
 --
--- base64url encoding/decoding
+-- base32hex encoding/decoding
 --
 SET bytea_output TO hex;
 
+SELECT encode('', 'base32hex');  -- ''
+SELECT encode('\x11', 'base32hex');  -- '24======'
+SELECT encode('\x1122', 'base32hex');  -- '24H0===='
+SELECT encode('\x112233', 'base32hex');  -- '24H36==='
+SELECT encode('\x11223344', 'base32hex');  -- '24H36H0='
+SELECT encode('\x1122334455', 'base32hex');  -- '24H36H2L'
+SELECT encode('\x112233445566', 'base32hex');  -- '24H36H2LCO======'
+
+SELECT decode('', 'base32hex');  -- ''
+SELECT decode('24======', 'base32hex');  -- \x11
+SELECT decode('24H0====', 'base32hex');  -- \x1122
+SELECT decode('24H36===', 'base32hex');  -- \x112233
+SELECT decode('24H36H0=', 'base32hex');  -- \x11223344
+SELECT decode('24H36H2L', 'base32hex');  -- \x1122334455
+SELECT decode('24H36H2LCO======', 'base32hex');  -- \x112233445566
+
+SELECT decode('24', 'base32hex');  -- OK, padding `=` are optional
+SELECT decode('24h36h2lco', 'base32hex');  -- OK, the encoding is case-insensitive
+SELECT decode('=', 'base32hex');  -- error
+SELECT decode('W', 'base32hex');  -- error
+SELECT decode('24H36H0=24', 'base32hex'); -- error
+
+
+--
+-- base64url encoding/decoding
+--
+
 -- Simple encoding/decoding
 SELECT encode('\x69b73eff', 'base64url');  -- abc-_w
 SELECT decode('abc-_w', 'base64url');      -- \x69b73eff
diff --git a/src/test/regress/sql/uuid.sql b/src/test/regress/sql/uuid.sql
index f512f4dea1d..278a5773ada 100644
--- a/src/test/regress/sql/uuid.sql
+++ b/src/test/regress/sql/uuid.sql
@@ -152,5 +152,14 @@ SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
 SELECT '\x1234567890abcdef'::bytea::uuid; -- error
 SELECT v = v::bytea::uuid as matched FROM gen_random_uuid() v;
 
+-- make sure base32hex encoding works with UUIDs and preserves ordering
+SELECT orig, encode(orig :: bytea, 'base32hex') AS enc
+FROM unnest(ARRAY[
+    '123e4567-e89b-12d3-a456-426614174000',
+    '00000000-0000-0000-0000-000000000000',
+    '11111111-1111-1111-1111-111111111111',
+    'ffffffff-ffff-ffff-ffff-ffffffffffff'] :: uuid[]
+) AS orig ORDER BY enc;
+
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
-- 
2.43.0

Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions

Reply via email to