On Mon, Jul 1, 2024 at 2:06 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.mu...@gmail.com> writes: > > I don't have any great ideas about what to do about this. > > Cybersquatting system facilities is a messy business, so maybe the > > proposed grotty solution is actually appropriate! We did bring this > > duelling Henry Spencers problem upon ourselves. Longer term, > > pg_regex_t seems to make a lot of sense, except IIUC we want to keep > > this code in sync with TCL so perhaps a configurable prefix could be > > done with macrology? > > Yeah. I'd do pg_regex_t in a minute except that it'd break existing > extensions using our facilities. However, your mention of macrology > stirred an idea: could we have our regex/regex.h intentionally > #include the system regex.h and then do > #define regex_t pg_regex_t > ? If that works, our struct is really pg_regex_t, but we don't have > to change any existing calling code. It might get a bit messy > undef'ing and redef'ing all the other macros in regex/regex.h, but > I think we could make it fly without any changes in other files.
Good idea. Here's an attempt at that. I don't have a Mac with beta SDK 15 yet, but I think this should work?
From 180850212314d0e368ccdd7ed382decec1fca2d8 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Thu, 4 Jul 2024 17:31:20 +1200 Subject: [PATCH v1] Cope with <regex.h> name clashes. macOS 15's SDK pulls in headers related to <regex.h> when we include <xlocale.h>. This causes our own regex_t implementation to clash with the OS's regex_t implementation. Luckily our function names already had pg_ prefixes, but the macros and typenames did not. Include <regex.h> explicitly on all POSIX systems, and fix everything that breaks. Then we can prove that we are capable of fully hiding and replacing the system regex API with our own. 1. Deal with standard-clobbering macros by undefining them all first. POSIX says they are "symbolic constants". If they are macros, this allows us to redefine them. If they are enums or variables, our macros will hide them. 2. Deal with standard-clobbering types by giving our types pg_ prefixes, and then using macros to redirect xxx_t -> pg_xxx_t. After including our "regex/regex.h", the system <regex.h> is hidden, because we've replaced all the standard names. The PostgreSQL source tree and extensions can continue to use standard prefix-less type and macro names, but reach our implementation, if they included our "regex/regex.h" header. Back-patch to all supported branches, so that macOS 15's tool chain can build them. Suggested-by: Tom Lane <t...@sss.pgh.pa.us> Reported-by: Stan Hu <sta...@gmail.com> Discussion: https://postgr.es/m/CAMBWrQnEwEJtgOv7EUNsXmFw2Ub4p5P%2B5QTBEgYwiyjy7rAsEQ%40mail.gmail.com --- src/include/regex/regex.h | 100 ++++++++++++++++++++++++++++++++++---- 1 file changed, 91 insertions(+), 9 deletions(-) diff --git a/src/include/regex/regex.h b/src/include/regex/regex.h index d08113724f6..94297e7939c 100644 --- a/src/include/regex/regex.h +++ b/src/include/regex/regex.h @@ -1,5 +1,5 @@ -#ifndef _REGEX_H_ -#define _REGEX_H_ /* never again */ +#ifndef _PG_REGEX_H_ +#define _PG_REGEX_H_ /* never again */ /* * regular expressions * @@ -32,6 +32,83 @@ * src/include/regex/regex.h */ +/* + * This is an implementation of POSIX regex_t, so it clashes with the + * system-provided <regex.h> header. That header might be unintentionally + * included already, so we force that to happen now on all systems to show that + * we can cope and that we completely replace the system regex interfaces. + * + * Note that we avoided using _REGEX_H_ as an include guard, as that confuses + * matters on BSD-derivatives including macOS that use the same include guard. + */ +#ifndef _WIN32 +#include <regex.h> +#endif + +/* Avoid redefinition errors due to the system header. */ +#undef REG_UBACKREF +#undef REG_ULOOKAROUND +#undef REG_UBOUNDS +#undef REG_UBRACES +#undef REG_UBSALNUM +#undef REG_UPBOTCH +#undef REG_UBBS +#undef REG_UNONPOSIX +#undef REG_UUNSPEC +#undef REG_UUNPORT +#undef REG_ULOCALE +#undef REG_UEMPTYMATCH +#undef REG_UIMPOSSIBLE +#undef REG_USHORTEST +#undef REG_BASIC +#undef REG_EXTENDED +#undef REG_ADVF +#undef REG_ADVANCED +#undef REG_QUOTE +#undef REG_NOSPEC +#undef REG_ICASE +#undef REG_NOSUB +#undef REG_EXPANDED +#undef REG_NLSTOP +#undef REG_NLANCH +#undef REG_NEWLINE +#undef REG_PEND +#undef REG_EXPECT +#undef REG_BOSONLY +#undef REG_DUMP +#undef REG_FAKE +#undef REG_PROGRESS +#undef REG_NOTBOL +#undef REG_NOTEOL +#undef REG_STARTEND +#undef REG_FTRACE +#undef REG_MTRACE +#undef REG_SMALL +#undef REG_OKAY +#undef REG_NOMATCH +#undef REG_BADPAT +#undef REG_ECOLLATE +#undef REG_ECTYPE +#undef REG_EESCAPE +#undef REG_ESUBREG +#undef REG_EBRACK +#undef REG_EPAREN +#undef REG_EBRACE +#undef REG_BADBR +#undef REG_ERANGE +#undef REG_ESPACE +#undef REG_BADRPT +#undef REG_ASSERT +#undef REG_INVARG +#undef REG_MIXED +#undef REG_BADOPT +#undef REG_ETOOBIG +#undef REG_ECOLORS +#undef REG_ATOI +#undef REG_ITOA +#undef REG_PREFIX +#undef REG_EXACT + /* * Add your own defines, if needed, here. */ @@ -45,7 +122,7 @@ * regoff_t has to be large enough to hold either off_t or ssize_t, * and must be signed; it's only a guess that long is suitable. */ -typedef long regoff_t; +typedef long pg_regoff_t; /* * other interface types @@ -79,19 +156,19 @@ typedef struct /* the rest is opaque pointers to hidden innards */ char *re_guts; /* `char *' is more portable than `void *' */ char *re_fns; -} regex_t; +} pg_regex_t; /* result reporting (may acquire more fields later) */ typedef struct { - regoff_t rm_so; /* start of substring */ - regoff_t rm_eo; /* end of substring */ -} regmatch_t; + pg_regoff_t rm_so; /* start of substring */ + pg_regoff_t rm_eo; /* end of substring */ +} pg_regmatch_t; /* supplementary control and reporting */ typedef struct { - regmatch_t rm_extend; /* see REG_EXPECT */ + pg_regmatch_t rm_extend; /* see REG_EXPECT */ } rm_detail_t; @@ -164,6 +241,11 @@ typedef struct #define REG_EXACT (-2) /* identified an exact match */ +/* Redirect the standard typenames to our typenames. */ +#define regoff_t pg_regoff_t +#define regex_t pg_regex_t +#define regmatch_t pg_regmatch_t + /* * the prototypes for exported functions @@ -186,4 +268,4 @@ extern bool RE_compile_and_execute(text *text_re, char *dat, int dat_len, int cflags, Oid collation, int nmatch, regmatch_t *pmatch); -#endif /* _REGEX_H_ */ +#endif /* _PG_REGEX_H_ */ -- 2.45.2