Re: gai_strerror() is not thread-safe on Windows
On Tue, Jan 16, 2024 at 8:52 AM Robert Haas wrote: > On Wed, Dec 6, 2023 at 8:45 PM Kyotaro Horiguchi > wrote: > > > So I think we should just hard-code the error messages in English and > > > move on. However, English is my language so perhaps I should abstain > > > and leave it to others to decide how important that is. > > > > I also think that would be a good way. > > Considering this remark from Kyotaro Horiguchi, I think the > previously-posted patch could be committed. > > Thomas, do you plan to do that, or are there outstanding issues here? Pushed. I went with FreeBSD's error messages (I assume it'd be OK to take glibc's too under fair use but I didn't want to think about that).
Re: gai_strerror() is not thread-safe on Windows
On Tue, 5 Dec 2023 at 00:57, Thomas Munro wrote: > > On second thoughts, I guess it would make more sense to use the exact > messages Windows' own implementation would return instead of whatever > we had in the past (probably cribbed from some other OS or just made > up?). I asked CI to spit those out[1]. Updated patch attached. Will > add to CF. CFBot shows that the patch does not apply anymore as in [1]: === Applying patches on top of PostgreSQL commit ID 376c216138c75e161d39767650ea30536f23b482 === === applying patch ./v2-0001-Fix-gai_strerror-thread-safety-on-Windows.patch patching file configure Hunk #1 succeeded at 16388 (offset 34 lines). patching file configure.ac Hunk #1 succeeded at 1885 (offset 7 lines). patching file src/include/port/win32/sys/socket.h patching file src/port/meson.build patching file src/port/win32gai_strerror.c can't find file to patch at input line 134 Perhaps you used the wrong -p or --strip option? The text leading up to this was: -- |diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm |index 46df01cc8d..c51296bdb6 100644 |--- a/src/tools/msvc/Mkvcbuild.pm |+++ b/src/tools/msvc/Mkvcbuild.pm -- No file to patch. Skipping patch. 1 out of 1 hunk ignored Please have a look and post an updated version. [1] - http://cfbot.cputube.org/patch_46_4682.log Regards, Vignesh
Re: gai_strerror() is not thread-safe on Windows
On Wed, Dec 6, 2023 at 8:45 PM Kyotaro Horiguchi wrote: > > So I think we should just hard-code the error messages in English and > > move on. However, English is my language so perhaps I should abstain > > and leave it to others to decide how important that is. > > I also think that would be a good way. Considering this remark from Kyotaro Horiguchi, I think the previously-posted patch could be committed. Thomas, do you plan to do that, or are there outstanding issues here? -- Robert Haas EDB: http://www.enterprisedb.com
Re: gai_strerror() is not thread-safe on Windows
At Thu, 7 Dec 2023 09:43:37 +1300, Thomas Munro wrote in > On Tue, Dec 5, 2023 at 3:43 PM Kyotaro Horiguchi > wrote: > > Windows' gai_strerror outputs messages that correspond to the language > > environment. Similarly, I think that the messages that the messages > > returned by our version should be translatable. > > Hmm, that is a good point. Wow, POSIX has given us a terrible > interface here, in terms of resource management. Let's see what glibc > does: > > https://github.com/lattera/glibc/blob/master/sysdeps/posix/gai_strerror.c > https://github.com/lattera/glibc/blob/master/sysdeps/posix/gai_strerror-strs.h It is quite a sight for sore eyes... > It doesn't look like it knows about locales at all. And a test > program seems to confirm: .. > setlocale(LC_MESSAGES, "ja_JP.UTF-8"); > printf("%s\n", gai_strerror(EAI_MEMORY)); > > That prints: > > Memory allocation failure > > FreeBSD tries harder, and prints: > > メモリ割り当て失敗 > > We can see that it has a thread-local variable that holds a copy of > that localised string until the next call to gai_strerror() in the > same thread: > > https://github.com/freebsd/freebsd-src/blob/main/lib/libc/net/gai_strerror.c > https://github.com/freebsd/freebsd-src/blob/main/lib/libc/nls/ja_JP.UTF-8.msg > > FreeBSD's message catalogues would provide a read-made source of > translations, bu... hmm, if glibc doesn't bother and the POSIX > interface is unhelpful and Windows' own implementation is so willfully > unusable, I don't really feel inclined to build a whole thread-local > cache thing on our side just to support this mess. I agree, I wouldn't want to do it either. > So I think we should just hard-code the error messages in English and > move on. However, English is my language so perhaps I should abstain > and leave it to others to decide how important that is. I also think that would be a good way. > > These messages may add extra line-end periods to the parent (or > > cotaining) messages when appended. This looks as follows. > > > > (auth.c:517 : errdetail_log() : sub (detail) message) > > > Could not translate client host name "hoge" to IP address: An address > > > incompatible with the requested protocol was used.. > > > > (hba.c:1562 : errmsg() : main message) > > > invalid IP address "192.0.2.1": This is usually a temporary error during > > > hostname resolution and means that the local server did not receive a > > > response from an authoritative server. > > > > When I first saw the first version, I thought it would be better to > > use Windows' own messages, just like you did. However, considering the > > content of the message above, wouldn't it be better to adhere to > > Linux-style messages overall? > > Yeah, I agree that either the glibc or the FreeBSD messages would be > better than those now that I've seen them. They are short and sweet. > > > A slightly subtler point is that the second example seems to have a > > misalignment between the descriptions before and after the colon, but > > do you think it's not something to be concerned about to this extent? > > I didn't understand what you meant here. If it was just a temporary error that couldn't be resolved, it doesn't mean that the IP address is invalid. If such a cause is possible, then probabyly an error message saying "failed to resolve" would be more appropriate. However, I wrote it meaning that there is no need to go to great length to ensure consistency with this message. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: gai_strerror() is not thread-safe on Windows
On Tue, Dec 5, 2023 at 3:43 PM Kyotaro Horiguchi wrote: > At Tue, 5 Dec 2023 08:26:54 +1300, Thomas Munro > wrote in > > On second thoughts, I guess it would make more sense to use the exact > > messages Windows' own implementation would return instead of whatever > > we had in the past (probably cribbed from some other OS or just made > > up?). I asked CI to spit those out[1]. Updated patch attached. Will > > add to CF. > > > > [1] https://cirrus-ci.com/task/5816802207334400?logs=main#L15 > > Windows' gai_strerror outputs messages that correspond to the language > environment. Similarly, I think that the messages that the messages > returned by our version should be translatable. Hmm, that is a good point. Wow, POSIX has given us a terrible interface here, in terms of resource management. Let's see what glibc does: https://github.com/lattera/glibc/blob/master/sysdeps/posix/gai_strerror.c https://github.com/lattera/glibc/blob/master/sysdeps/posix/gai_strerror-strs.h It doesn't look like it knows about locales at all. And a test program seems to confirm: #include #include #include int main() { setlocale(LC_MESSAGES, "ja_JP.UTF-8"); printf("%s\n", gai_strerror(EAI_MEMORY)); } That prints: Memory allocation failure FreeBSD tries harder, and prints: メモリ割り当て失敗 We can see that it has a thread-local variable that holds a copy of that localised string until the next call to gai_strerror() in the same thread: https://github.com/freebsd/freebsd-src/blob/main/lib/libc/net/gai_strerror.c https://github.com/freebsd/freebsd-src/blob/main/lib/libc/nls/ja_JP.UTF-8.msg FreeBSD's message catalogues would provide a read-made source of translations, bu... hmm, if glibc doesn't bother and the POSIX interface is unhelpful and Windows' own implementation is so willfully unusable, I don't really feel inclined to build a whole thread-local cache thing on our side just to support this mess. So I think we should just hard-code the error messages in English and move on. However, English is my language so perhaps I should abstain and leave it to others to decide how important that is. > These messages may add extra line-end periods to the parent (or > cotaining) messages when appended. This looks as follows. > > (auth.c:517 : errdetail_log() : sub (detail) message) > > Could not translate client host name "hoge" to IP address: An address > > incompatible with the requested protocol was used.. > > (hba.c:1562 : errmsg() : main message) > > invalid IP address "192.0.2.1": This is usually a temporary error during > > hostname resolution and means that the local server did not receive a > > response from an authoritative server. > > When I first saw the first version, I thought it would be better to > use Windows' own messages, just like you did. However, considering the > content of the message above, wouldn't it be better to adhere to > Linux-style messages overall? Yeah, I agree that either the glibc or the FreeBSD messages would be better than those now that I've seen them. They are short and sweet. > A slightly subtler point is that the second example seems to have a > misalignment between the descriptions before and after the colon, but > do you think it's not something to be concerned about to this extent? I didn't understand what you meant here.
Re: gai_strerror() is not thread-safe on Windows
At Tue, 5 Dec 2023 08:26:54 +1300, Thomas Munro wrote in > On second thoughts, I guess it would make more sense to use the exact > messages Windows' own implementation would return instead of whatever > we had in the past (probably cribbed from some other OS or just made > up?). I asked CI to spit those out[1]. Updated patch attached. Will > add to CF. > > [1] https://cirrus-ci.com/task/5816802207334400?logs=main#L15 Windows' gai_strerror outputs messages that correspond to the language environment. Similarly, I think that the messages that the messages returned by our version should be translatable. These messages may add extra line-end periods to the parent (or cotaining) messages when appended. This looks as follows. (auth.c:517 : errdetail_log() : sub (detail) message) > Could not translate client host name "hoge" to IP address: An address > incompatible with the requested protocol was used.. (hba.c:1562 : errmsg() : main message) > invalid IP address "192.0.2.1": This is usually a temporary error during > hostname resolution and means that the local server did not receive a > response from an authoritative server. When I first saw the first version, I thought it would be better to use Windows' own messages, just like you did. However, considering the content of the message above, wouldn't it be better to adhere to Linux-style messages overall? A slightly subtler point is that the second example seems to have a misalignment between the descriptions before and after the colon, but do you think it's not something to be concerned about to this extent? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: gai_strerror() is not thread-safe on Windows
On second thoughts, I guess it would make more sense to use the exact messages Windows' own implementation would return instead of whatever we had in the past (probably cribbed from some other OS or just made up?). I asked CI to spit those out[1]. Updated patch attached. Will add to CF. [1] https://cirrus-ci.com/task/5816802207334400?logs=main#L15 From 490102fd52e20e67c4a862c1e889182328e115fa Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 4 Dec 2023 14:52:57 +1300 Subject: [PATCH v2] Fix gai_strerror() thread-safety on Windows. Commit 5579388d removed code that supplied a fallback implementation of getaddrinfo(), which was dead code on modern systems. One tiny piece of the removed code was still doing something useful on Windows, though: that OS's own gai_strerror()/gai_strerrorA() function returns a pointer to a static buffer that it overwrites each time, so it's not thread-safe. In rare circumstances, a multi-threaded client program could get an incorrect or corrupted error message. Restore the replacement function, though now that it's only for Windows we can put it into a win32-specific file, cut it down to the errors that Windows documents and change the messages to match what the system gai_strerror() returns on that OS. Back-patch to 16. --- configure | 6 + configure.ac| 1 + src/include/port/win32/sys/socket.h | 8 +++ src/port/meson.build| 1 + src/port/win32gai_strerror.c| 36 + src/tools/msvc/Mkvcbuild.pm | 1 + 6 files changed, 53 insertions(+) create mode 100644 src/port/win32gai_strerror.c diff --git a/configure b/configure index 217704e9ca..1d4cd5caf3 100755 --- a/configure +++ b/configure @@ -16354,6 +16354,12 @@ esac ;; esac + case " $LIBOBJS " in + *" win32gai_strerror.$ac_objext "* ) ;; + *) LIBOBJS="$LIBOBJS win32gai_strerror.$ac_objext" + ;; +esac + case " $LIBOBJS " in *" win32getrusage.$ac_objext "* ) ;; *) LIBOBJS="$LIBOBJS win32getrusage.$ac_objext" diff --git a/configure.ac b/configure.ac index e49de9e4f0..a7a6b9a0ba 100644 --- a/configure.ac +++ b/configure.ac @@ -1878,6 +1878,7 @@ if test "$PORTNAME" = "win32"; then AC_LIBOBJ(win32env) AC_LIBOBJ(win32error) AC_LIBOBJ(win32fdatasync) + AC_LIBOBJ(win32gai_strerror) AC_LIBOBJ(win32getrusage) AC_LIBOBJ(win32link) AC_LIBOBJ(win32ntdll) diff --git a/src/include/port/win32/sys/socket.h b/src/include/port/win32/sys/socket.h index 0c32c0f7b2..f2b475df5e 100644 --- a/src/include/port/win32/sys/socket.h +++ b/src/include/port/win32/sys/socket.h @@ -23,4 +23,12 @@ #define ERROR PGERROR #endif +/* + * We don't use the Windows gai_strerror[A] function because it is not + * thread-safe. We define our own in src/port/win32gai_strerror.c. + */ +#undef gai_strerror + +extern const char *gai_strerror(int ecode); + #endif /* WIN32_SYS_SOCKET_H */ diff --git a/src/port/meson.build b/src/port/meson.build index 576a48b48c..c559c732e7 100644 --- a/src/port/meson.build +++ b/src/port/meson.build @@ -35,6 +35,7 @@ if host_system == 'windows' 'win32error.c', 'win32fdatasync.c', 'win32fseek.c', +'win32gai_strerror.c', 'win32getrusage.c', 'win32link.c', 'win32ntdll.c', diff --git a/src/port/win32gai_strerror.c b/src/port/win32gai_strerror.c new file mode 100644 index 00..f117831122 --- /dev/null +++ b/src/port/win32gai_strerror.c @@ -0,0 +1,36 @@ +#include + +/* + * Windows has gai_strerrorA(), but it is not thread-safe. + * + * These are the documented error values and the messages returned by the + * system library (observed on Windows Server 2022), but our function returns + * pointers to string constants for thread-safety. + * + * https://learn.microsoft.com/en-us/windows/win32/api/ws2tcpip/nf-ws2tcpip-getaddrinfo + */ +const char * +gai_strerror(int errcode) +{ + switch (errcode) + { + case EAI_AGAIN: + return "This is usually a temporary error during hostname resolution and means that the local server did not receive a response from an authoritative server."; + case EAI_BADFLAGS: + return "An invalid argument was supplied."; + case EAI_FAIL: + return "A non-recoverable error occurred during a database lookup."; + case EAI_FAMILY: + return "An address incompatible with the requested protocol was used."; + case EAI_MEMORY: + return "Not enough memory resources are available to process this command."; + case EAI_NONAME: + return "No such host is known."; + case EAI_SERVICE: + return "The specified class was not found."; + case EAI_SOCKTYPE: + return "The support for the specified socket type does not exist in this address family."; + default: + return "Unknown server error"; + } +} diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm index 46df01cc8d..c51296bdb6 100644 --- a/src/tools/msvc/Mkvcbuild.pm +++ b/src/tools/msvc/Mkvcbuild.pm @@ -113,6 +113,7 @@ sub mkvcbuild
gai_strerror() is not thread-safe on Windows
Hi, Commit 5579388d removed a bunch of dead code, formerly needed for old systems that lacked getaddrinfo() in the early days of IPv6. We already used the system getaddrinfo() via either configure-time tests (Unix) or runtime tests (Windows using attempt-to-find-with-dlsym that always succeeded on modern systems), so no modern system needed the fallback code, except for one small detail: getaddrinfo() has a companion function to spit out human readable error messages, and although Windows has that too, it's not thread safe[1]. libpq shouldn't call it, or else an unlucky multi-threaded program might see an error message messed up by another thread. Here's a patch to put that bit back. It's simpler than before: the original replacement had a bunch of #ifdefs for various historical reasons, but now we can just handle the 8 documented EAI errors on Windows. Noticed while wondering why the list of symbols reported in bug #18219 didn't include gai_strerrorA. That turned out to be because it is static inline in ws2tcpip.h, and its definition set alarm bells ringing. Avoid. [1] https://learn.microsoft.com/en-us/windows/win32/api/ws2tcpip/nf-ws2tcpip-getaddrinfo From 09aedd60882686f85517d8698e03bbcd342338d5 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 4 Dec 2023 14:52:57 +1300 Subject: [PATCH] Fix gai_strerror() thread-safety on Windows. Commit 5579388d removed code that supplied a fallback implementation of getaddrinfo(), which was dead code on modern systems. One tiny piece of the removed code was still doing something useful on Windows, though: that OS's own gai_strerror()/gai_strerrorA() function returns a pointer to a static buffer that it overwrites each time, so it's not thread-safe. In rare circumstances, a multi-threaded client program could get an incorrect or corrupted error message. Restore the replacement function, though now that it's only for Windows we can put it into a win32-specific file and cut it down to the errors that Windows documents. Back-patch to 16. --- configure | 6 + configure.ac| 1 + src/include/port/win32/sys/socket.h | 8 +++ src/port/meson.build| 1 + src/port/win32gai_strerror.c| 34 + src/tools/msvc/Mkvcbuild.pm | 1 + 6 files changed, 51 insertions(+) create mode 100644 src/port/win32gai_strerror.c diff --git a/configure b/configure index 217704e9ca..1d4cd5caf3 100755 --- a/configure +++ b/configure @@ -16354,6 +16354,12 @@ esac ;; esac + case " $LIBOBJS " in + *" win32gai_strerror.$ac_objext "* ) ;; + *) LIBOBJS="$LIBOBJS win32gai_strerror.$ac_objext" + ;; +esac + case " $LIBOBJS " in *" win32getrusage.$ac_objext "* ) ;; *) LIBOBJS="$LIBOBJS win32getrusage.$ac_objext" diff --git a/configure.ac b/configure.ac index e49de9e4f0..a7a6b9a0ba 100644 --- a/configure.ac +++ b/configure.ac @@ -1878,6 +1878,7 @@ if test "$PORTNAME" = "win32"; then AC_LIBOBJ(win32env) AC_LIBOBJ(win32error) AC_LIBOBJ(win32fdatasync) + AC_LIBOBJ(win32gai_strerror) AC_LIBOBJ(win32getrusage) AC_LIBOBJ(win32link) AC_LIBOBJ(win32ntdll) diff --git a/src/include/port/win32/sys/socket.h b/src/include/port/win32/sys/socket.h index 0c32c0f7b2..f2b475df5e 100644 --- a/src/include/port/win32/sys/socket.h +++ b/src/include/port/win32/sys/socket.h @@ -23,4 +23,12 @@ #define ERROR PGERROR #endif +/* + * We don't use the Windows gai_strerror[A] function because it is not + * thread-safe. We define our own in src/port/win32gai_strerror.c. + */ +#undef gai_strerror + +extern const char *gai_strerror(int ecode); + #endif /* WIN32_SYS_SOCKET_H */ diff --git a/src/port/meson.build b/src/port/meson.build index 576a48b48c..c559c732e7 100644 --- a/src/port/meson.build +++ b/src/port/meson.build @@ -35,6 +35,7 @@ if host_system == 'windows' 'win32error.c', 'win32fdatasync.c', 'win32fseek.c', +'win32gai_strerror.c', 'win32getrusage.c', 'win32link.c', 'win32ntdll.c', diff --git a/src/port/win32gai_strerror.c b/src/port/win32gai_strerror.c new file mode 100644 index 00..962e1659c1 --- /dev/null +++ b/src/port/win32gai_strerror.c @@ -0,0 +1,34 @@ +#include + +/* + * Windows has gai_strerrorA(), but it is not thread-safe. + * + * These are the error values documented by: + * + * https://learn.microsoft.com/en-us/windows/win32/api/ws2tcpip/nf-ws2tcpip-getaddrinfo + */ +const char * +gai_strerror(int errcode) +{ + switch (errcode) + { + case EAI_AGAIN: + return "A temporary failure in name resolution occurred"; + case EAI_BADFLAGS: + return "An invalid value was provided for ai_flags"; + case EAI_FAIL: + return "A nonrecoverable failure in name resolution occurred"; + case EAI_FAMILY: + return "Address family not supported"; + case EAI_MEMORY: + return "Not enough memory"; + case EAI_NONAME: + return "Unknown host"; + case EAI_SERVICE: + return "Class type not found"; + case