I just realized that there is a Windows-specific mailing list, so forwarding here.
---------- Forwarded message ---------
From: Costas Argyris <[email protected]>
Date: Sat, 18 Mar 2023 at 16:37
Subject: [PATCH] Use UTF-8 active code page for Windows host.
To: <[email protected]>

Hi

This is a proposed patch to enable UTF-8 support in GNU Make running
on Windows host.

Today, the make process on Windows is using the legacy system code
page because of the "A" functions called in the source code. This
means that any UTF-8 input to make on Windows will break. A few
examples follow:

######################
C:\Users\cargyris\temp>cat utf8Makefile.mk
hello :
	@echo ﹏
	@echo ❎

C:\Users\cargyris\temp>mingw32-make -f utf8Makefile.mk
ï¹
âŽ

C:\Users\cargyris\temp>mingw32-make -f ❎\utf8Makefile.mk
mingw32-make: ?\utf8Makefile.mk: Invalid argument
mingw32-make: *** No rule to make target '?\utf8Makefile.mk'. Stop.

C:\Users\cargyris\temp>cd ❎

C:\Users\cargyris\temp\❎>mingw32-make -f utf8Makefile.mk
mingw32-make: *** INTERNAL: readdir: Invalid argument. Stop.

C:\Users\cargyris\temp\❎>mingw32-make -f ❎\utf8Makefile.mk
mingw32-make: ?\utf8Makefile.mk: Invalid argument
mingw32-make: *** INTERNAL: readdir: Invalid argument. Stop.
######################

Hopefully the Unicode symbols are showing correctly in the email.
I used these:

https://www.compart.com/en/unicode/U+FE4F
https://www.compart.com/en/unicode/U+274E

The attached patch incorporates the UTF-8 manifest into the build
process of GNU Make when hosted on Windows, and forces the built
executable to use UTF-8 as its active code page, solving all problems
shown above because this has a global effect in the process. All
existing "A" calls use the UTF-8 code page now instead of the legacy
one.
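
The garbled echo output and the `?` in the failing path above are what one would expect when UTF-8 bytes pass through a single-byte legacy code page. The following is a minimal sketch of that effect, not taken from the patch. It assumes the legacy code page is Windows-1252 (the message does not say which code page the system uses), and the helper names `misread_utf8` and `squeeze_into_legacy` are hypothetical.

```python
# Assumption: the legacy code page is Windows-1252. The actual ANSI/OEM
# code page depends on the system locale and is not stated in the message.
LEGACY = "cp1252"


def misread_utf8(text: str) -> str:
    """Decode the UTF-8 bytes of `text` as if they were legacy-code-page bytes.

    Bytes that the legacy code page leaves undefined become U+FFFD here.
    """
    return text.encode("utf-8").decode(LEGACY, errors="replace")


def squeeze_into_legacy(text: str) -> bytes:
    """Convert `text` to the legacy code page.

    Characters with no legacy equivalent are replaced by '?'.
    """
    return text.encode(LEGACY, errors="replace")


# The two symbols used in the message: U+FE4F and U+274E.
for ch in ("\ufe4f", "\u274e"):
    utf8_hex = ch.encode("utf-8").hex(" ")
    print(f"U+{ord(ch):04X}  utf-8 bytes: {utf8_hex}  "
          f"misread: {misread_utf8(ch)!r}  legacy: {squeeze_into_legacy(ch)!r}")
```

Under this assumption, each symbol's three UTF-8 bytes are shown as three unrelated legacy characters, and a path component containing the symbol collapses to `?` once it is converted to the legacy code page.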

This is the relevant Microsoft doc:

https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

With the patch, after building make, the above cases now work on
Windows:

######################
C:\Users\cargyris\temp>cat utf8Makefile.mk
hello :
	@echo ﹏
	@echo ❎

C:\Users\cargyris\temp>make -f utf8Makefile.mk
﹏
❎

C:\Users\cargyris\temp>make -f ❎\utf8Makefile.mk
﹏
❎

C:\Users\cargyris\temp>cd ❎

C:\Users\cargyris\temp\❎>make -f utf8Makefile.mk
﹏
❎

C:\Users\cargyris\temp\❎>make -f ❎\utf8Makefile.mk
﹏
❎
######################

This change might also fix other existing issues on Windows having to
do with filenames and paths, but I can't point at something particular
right now.

Would a patch like that be considered?

Thanks,
Costas
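
To try the cases above on another machine, the test tree has to be recreated first. The sketch below is one way to do that and is not part of the patch. The layout (a copy of `utf8Makefile.mk` in the top directory, in `❎`, and in `❎\❎`) is inferred from the commands in the message, and `create_fixture` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path

# Same content as utf8Makefile.mk in the message; recipe lines start with a tab.
MAKEFILE_TEXT = "hello :\n\t@echo \ufe4f\n\t@echo \u274e\n"

# Directory name used in the message (U+274E).
SUBDIR = "\u274e"


def create_fixture(root: Path) -> list:
    """Create the directory tree implied by the commands in the message.

    Assumed layout (inferred, not stated explicitly in the message):
      root/utf8Makefile.mk
      root/<U+274E>/utf8Makefile.mk
      root/<U+274E>/<U+274E>/utf8Makefile.mk
    """
    dirs = [root, root / SUBDIR, root / SUBDIR / SUBDIR]
    created = []
    for d in dirs:
        d.mkdir(parents=True, exist_ok=True)
        target = d / "utf8Makefile.mk"
        # Write raw UTF-8 bytes (no BOM) so no newline or encoding translation occurs.
        target.write_bytes(MAKEFILE_TEXT.encode("utf-8"))
        created.append(target)
    return created


fixture_root = Path(tempfile.mkdtemp(prefix="make-utf8-"))
for path in create_fixture(fixture_root):
    print(path)
```

The four `make -f ...` invocations from the message can then be run from `fixture_root` and from its `❎` subdirectory.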
From f042d4b82111624dfe84bb85758c9b61c76ece5f Mon Sep 17 00:00:00 2001
From: Costas Argyris <[email protected]>
Date: Sat, 18 Mar 2023 14:13:21 +0000
Subject: [PATCH] Use UTF-8 active code page for Windows host.

This allows the make process on Windows to work in UTF-8 in multiple
levels:

1) Accept a Makefile that is encoded in UTF-8 (with or without the
   BOM, since it already gets ignored anyway).
2) Accept a UTF-8 path (input -f argument) to a Makefile (that could
   itself be encoded in UTF-8, as per #1).
3) Launch make from a current directory that has UTF-8 characters.

and any combination of the above, since the entire make process will
use UTF-8.

This is already the case in Linux-based systems, but on Windows this
change is required in order to support Unicode because the "A" APIs
currently used will assume the legacy system code page, destroying any
UTF-8 input.

This change sets the code page to be used by the "A" APIs to the UTF-8
code page, thereby eliminating the need to update all calls of "A"
functions to "W" functions to support Unicode. That is, the source
code can stay the same with the "A" functions, but instead of them
using the legacy code page they will be using the UTF-8 code page.
---
 Makefile.am           | 10 ++++++++++
 configure.ac          |  5 +++++
 src/w32/utf8.manifest |  8 ++++++++
 src/w32/utf8.rc       |  3 +++
 4 files changed, 26 insertions(+)
 create mode 100644 src/w32/utf8.manifest
 create mode 100644 src/w32/utf8.rc

diff --git a/Makefile.am b/Makefile.am
index c29c235..60a4e5b 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -46,6 +46,8 @@ w32_SRCS =	src/w32/pathstuff.c src/w32/w32os.c src/w32/compat/dirent.c \
 		src/w32/subproc/misc.c src/w32/subproc/proc.h \
 		src/w32/subproc/sub_proc.c src/w32/subproc/w32err.c
 
+w32_utf8_SRCS = src/w32/utf8.rc src/w32/utf8.manifest
+
 vms_SRCS =	src/vms_exit.c src/vms_export_symbol.c src/vms_progname.c \
 		src/vmsdir.h src/vmsfunctions.c src/vmsify.c
 
@@ -90,6 +92,14 @@ else
   make_SOURCES += src/posixos.c
 endif
 
+if HAVE_WINDRES
+  UTF8OBJ = src/w32/utf8.$(OBJEXT)
+  make_LDADD += $(UTF8OBJ)
+endif
+
+$(UTF8OBJ) : $(w32_utf8_SRCS)
+	$(WINDRES) $< -o $@
+
 if USE_CUSTOMS
   make_SOURCES += src/remote-cstms.c
 else
diff --git a/configure.ac b/configure.ac
index cd78575..8cbf986 100644
--- a/configure.ac
+++ b/configure.ac
@@ -444,6 +444,7 @@ AC_SUBST([MAKE_HOST])
 
 w32_target_env=no
 AM_CONDITIONAL([WINDOWSENV], [false])
+AM_CONDITIONAL([HAVE_WINDRES], [false])
 
 AS_CASE([$host],
   [*-*-mingw32],
@@ -451,6 +452,10 @@ AS_CASE([$host],
     w32_target_env=yes
     AC_DEFINE([WINDOWS32], [1], [Build for the WINDOWS32 API.])
     AC_DEFINE([HAVE_DOS_PATHS], [1], [Support DOS-style pathnames.])
+    # Windows host tools.
+    # If windres is available, make will use UTF-8.
+    AC_CHECK_TOOL([WINDRES], [windres], [:])
+    AM_CONDITIONAL([HAVE_WINDRES], [test "$WINDRES" != ':'])
   ])
 
 AC_DEFINE_UNQUOTED([PATH_SEPARATOR_CHAR],['$PATH_SEPARATOR'],
diff --git a/src/w32/utf8.manifest b/src/w32/utf8.manifest
new file mode 100644
index 0000000..dab929e
--- /dev/null
+++ b/src/w32/utf8.manifest
@@ -0,0 +1,8 @@
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
+  <application>
+    <windowsSettings>
+      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
+    </windowsSettings>
+  </application>
+</assembly>
diff --git a/src/w32/utf8.rc b/src/w32/utf8.rc
new file mode 100644
index 0000000..62bdbdc
--- /dev/null
+++ b/src/w32/utf8.rc
@@ -0,0 +1,3 @@
+#include <winuser.h>
+
+CREATEPROCESS_MANIFEST_RESOURCE_ID RT_MANIFEST "utf8.manifest"
-- 
2.30.2
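
Since the whole change hinges on the single `activeCodePage` setting in `utf8.manifest`, a quick check that the manifest is well-formed and carries the expected value can be useful before it is compiled with windres. The sketch below is not part of the patch: it embeds the manifest text from the diff above and reads the setting with Python's standard XML parser. The helper name `active_code_page` is hypothetical, and the check says nothing about whether Windows actually applies the setting at run time.

```python
import xml.etree.ElementTree as ET

# Contents of src/w32/utf8.manifest as given in the patch.
MANIFEST = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <application>
    <windowsSettings>
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>
"""

# Namespace of the activeCodePage element, copied from the manifest.
WINDOWS_SETTINGS_NS = "http://schemas.microsoft.com/SMI/2019/WindowsSettings"


def active_code_page(manifest_bytes: bytes):
    """Return the text of the activeCodePage element, or None if it is absent.

    Raises xml.etree.ElementTree.ParseError if the manifest is not well-formed.
    """
    root = ET.fromstring(manifest_bytes)
    node = root.find(f".//{{{WINDOWS_SETTINGS_NS}}}activeCodePage")
    if node is None or node.text is None:
        return None
    return node.text.strip()


print(active_code_page(MANIFEST))
```

To check the file on disk instead of the embedded copy, the bytes read from `src/w32/utf8.manifest` can be passed to the same function.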
