Hi all,

I would like to report an issue with the pg_trgm extension on 
cross-architecture replication scenarios. When an x86_64 standby server is 
replicating from an aarch64 primary server or vice versa, the gist_trgm_ops 
opclass returns different results on the primary and standby. Masahiko 
previously reported a similar issue affecting the pg_bigm extension [1].

To reproduce, execute the following on the x86_64 primary server:

CREATE EXTENSION pg_trgm;
CREATE TABLE tbl (c text);
CREATE INDEX ON tbl USING gist (c gist_trgm_ops);
INSERT INTO tbl VALUES ('Bóbr');

On the x86_64 primary server:

postgres=> select * from tbl where c like '%Bób%';
  c
------
Bóbr
(1 row)

On the aarch64 replica server:

postgres=> select * from tbl where c like '%Bób%';
c
---
(0 rows)

The root cause looks the same as the pg_bigm issue that Masahiko reported. To 
compare trigrams, pg_trgm uses a numerical comparison of chars [2]. On x86_64 a 
char is signed by default, whereas on aarch64 it is unsigned by default. 
gist_trgm_ops expects the trigram list to be sorted, but due to the different 
signedness of chars, the sort order is broken when replicating the values 
across architectures.

The different sort behaviour can be demonstrated using show_trgm.

On the x86_64 primary server:

postgres=> SELECT show_trgm('Bóbr');
                show_trgm
------------------------------------------
{0x89194c,"  b","br ",0x707c72,0x7f7849}
(1 row)

On the aarch64 replica server:

postgres=> SELECT show_trgm('Bóbr');
                show_trgm
------------------------------------------
{"  b","br ",0x707c72,0x7f7849,0x89194c}
(1 row)

One simple solution for this specific case is to declare the char signedness in 
the CMPPCHAR macro.

--- a/contrib/pg_trgm/trgm.h
+++ b/contrib/pg_trgm/trgm.h
@@ -42,7 +42,7 @@
typedef char trgm[3];
 #define CMPCHAR(a,b) ( ((a)==(b)) ? 0 : ( ((a)<(b)) ? -1 : 1 ) )
-#define CMPPCHAR(a,b,i)  CMPCHAR( *(((const char*)(a))+i), *(((const 
char*)(b))+i) )
+#define CMPPCHAR(a,b,i)  CMPCHAR( *(((unsigned char*)(a))+i), *(((unsigned 
char*)(b))+i) )
#define CMPTRGM(a,b) ( CMPPCHAR(a,b,0) ? CMPPCHAR(a,b,0) : ( CMPPCHAR(a,b,1) ? 
CMPPCHAR(a,b,1) : CMPPCHAR(a,b,2) ) )
 #define CPTRGM(a,b) do {                                                       
    \

Alternatively, Postgres can be compiled with -funsigned-char or -fsigned-char. 
I came across a code comment suggesting that this may not be a good idea in 
general [3].

Given that this has problem has come up before and seems likely to come up 
again, I'm curious what other broad solutions there might be to resolve it? 
Looking forward to any feedback, thanks!

Best,

Adam Guo
Amazon Web Services: https://aws.amazon.com

[1] 
https://osdn.net/projects/pgbigm/lists/archive/hackers/2024-February/000370.html
[2] 
https://github.com/postgres/postgres/blob/480bc6e3ed3a5719cdec076d4943b119890e8171/contrib/pg_trgm/trgm.h#L45
[3] 
https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/cash.c#L114-L123

Reply via email to