Re: [PATCH] Use UTF-8 active code page for Windows host.
On Sun, 2023-06-18 at 21:33 +0100, Costas Argyris wrote:
> Just checking to see if there is still interest in this feature.

I had that locally but hadn't had time to test it fully.  Pushed now.
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris
> Date: Sun, 18 Jun 2023 21:33:03 +0100
> Cc: bug-make@gnu.org
>
> Just checking to see if there is still interest in this feature.

Nothing's changed, so yes.
Re: [PATCH] Use UTF-8 active code page for Windows host.
Just checking to see if there is still interest in this feature.

On Thu, 18 May 2023 at 17:49, Costas Argyris wrote:
> Please find attached the latest patch with everything done so far, including the inconsistency I mentioned in my previous email.
>
> This now has all 3 build approaches, namely:
>
> 1) configure
> 2) Basic.mk
> 3) build_w32.bat
>
> treating the resource compiler as optional, and building and embedding the UTF-8 resource if it is found.  This applies to all supported compilers.
>
> I think this has everything discussed and agreed so far.
>
> Please let me know what you think.
>
> On Thu, 18 May 2023 at 12:38, Costas Argyris wrote:
>> I think this should be added to README.git.  Without these explanations, the purpose of Basic.mk and its auxiliary files, and of their intended usage, is completely unclear.
>>
>> I believe this was going to Paul.  From my side, these explanations were really helpful.
>>
>> On to the Basic.mk patch, please see the latest one attached in this email.
>>
>> After regenerating Basic.mk from .\bootstrap.bat, I tried running it with both msvc and gcc values for TOOLCHAIN and they both worked fine.
>>
>> I also tried the 'no resource compiler' case by temporarily renaming 'rc.exe' (msvc) and 'windres.exe' (gcc) to something else so they are not found on the Windows path, and the build just went through with no errors and produced non-UTF-8 GNU Make binaries.
>>
>> As you will see, in mk/Windows32.mk I used:
>>
>> ifneq (, $(shell where $(RC) 2>nul))
>>
>> to tell if a program is on the Windows path.  It seems to be working fine in both cases of 'found' and 'not-found', but I am no GNU Make expert so please let me know if this is correct.
>>
>> A little inconsistency is that in build_w32.bat I didn't implement a check for 'rc.exe', because I assumed it's always going to be on the Windows path if the compiler 'cl.exe' is, but in mk/Windows32.mk I did implement the check even for 'rc.exe' - I can add the check in build_w32.bat to be consistent in a next update; it should be easy.
>>
>> Also just checking - the configury approach, when building for a Windows host, can't be used with msvc or tcc, right?  It needs to find gcc targeting Windows, and therefore checking for windres (already implemented) should be sufficient, right?
>>
>> On Thu, 18 May 2023 at 06:31, Eli Zaretskii wrote:
>>> > From: Paul Smith
>>> > Cc: bug-make@gnu.org
>>> > Date: Wed, 17 May 2023 18:04:55 -0400
>>> >
>>> > To remind: the purpose of these is to provide a makefile-based way to _develop_ GNU Make itself, on platforms where we can't run ./configure to get an automake-generated makefile.
>>> >
>>> > If you need to build GNU Make from scratch there's not much benefit from using Basic.mk, because it will just rebuild everything every time, just like the build_w32.bat etc. files.  You don't save anything.
>>> >
>>> > But if you're doing ongoing development (edit/build/test cycle) and you don't want to have to recompile all files every time you change something, and you can't run ./configure, then you can use an already-built GNU Make and these makefiles to shorten your development cycle.
>>>
>>> I think this should be added to README.git.  Without these explanations, the purpose of Basic.mk and its auxiliary files, and of their intended usage, is completely unclear.
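[Editor's note: the `where`-based detection discussed in this message is easier to reason about with a runnable analogue.  The sketch below is not the patch's code; it uses POSIX `command -v` as a stand-in for the Windows `where` command, and the function name `check_tool` is illustrative.  The point it demonstrates is the agreed behavior: if the resource compiler is missing, report it and carry on rather than failing the build.]

```shell
#!/bin/sh
# Sketch of optional-tool detection: "found" means the build would
# compile and embed the UTF-8 resource; "missing" means the build
# proceeds without it, with no error.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

check_tool sh                   # a tool that is always on a POSIX PATH
check_tool no-such-windres-xyz  # stand-in for an absent resource compiler
```

In the makefile version, the same decision is made once at parse time via `ifneq (, $(shell where $(RC) 2>nul))`, which expands to empty when `where` finds nothing.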
Re: [PATCH] Use UTF-8 active code page for Windows host.
Please find attached the latest patch with everything done so far, including the inconsistency I mentioned in my previous email.

This now has all 3 build approaches, namely:

1) configure
2) Basic.mk
3) build_w32.bat

treating the resource compiler as optional, and building and embedding the UTF-8 resource if it is found.  This applies to all supported compilers.

I think this has everything discussed and agreed so far.

Please let me know what you think.

On Thu, 18 May 2023 at 12:38, Costas Argyris wrote:
> I think this should be added to README.git.  Without these explanations, the purpose of Basic.mk and its auxiliary files, and of their intended usage, is completely unclear.
>
> I believe this was going to Paul.  From my side, these explanations were really helpful.
>
> On to the Basic.mk patch, please see the latest one attached in this email.
>
> After regenerating Basic.mk from .\bootstrap.bat, I tried running it with both msvc and gcc values for TOOLCHAIN and they both worked fine.
>
> I also tried the 'no resource compiler' case by temporarily renaming 'rc.exe' (msvc) and 'windres.exe' (gcc) to something else so they are not found on the Windows path, and the build just went through with no errors and produced non-UTF-8 GNU Make binaries.
>
> As you will see, in mk/Windows32.mk I used:
>
> ifneq (, $(shell where $(RC) 2>nul))
>
> to tell if a program is on the Windows path.  It seems to be working fine in both cases of 'found' and 'not-found', but I am no GNU Make expert so please let me know if this is correct.
>
> A little inconsistency is that in build_w32.bat I didn't implement a check for 'rc.exe', because I assumed it's always going to be on the Windows path if the compiler 'cl.exe' is, but in mk/Windows32.mk I did implement the check even for 'rc.exe' - I can add the check in build_w32.bat to be consistent in a next update; it should be easy.
> Also just checking - the configury approach, when building for a Windows host, can't be used with msvc or tcc, right?  It needs to find gcc targeting Windows, and therefore checking for windres (already implemented) should be sufficient, right?
>
> On Thu, 18 May 2023 at 06:31, Eli Zaretskii wrote:
>> > From: Paul Smith
>> > Cc: bug-make@gnu.org
>> > Date: Wed, 17 May 2023 18:04:55 -0400
>> >
>> > To remind: the purpose of these is to provide a makefile-based way to _develop_ GNU Make itself, on platforms where we can't run ./configure to get an automake-generated makefile.
>> >
>> > If you need to build GNU Make from scratch there's not much benefit from using Basic.mk, because it will just rebuild everything every time, just like the build_w32.bat etc. files.  You don't save anything.
>> >
>> > But if you're doing ongoing development (edit/build/test cycle) and you don't want to have to recompile all files every time you change something, and you can't run ./configure, then you can use an already-built GNU Make and these makefiles to shorten your development cycle.
>>
>> I think this should be added to README.git.  Without these explanations, the purpose of Basic.mk and its auxiliary files, and of their intended usage, is completely unclear.

From f7d5cf83e5ceaa0d008f9b6b0e57d05ec541ef9a Mon Sep 17 00:00:00 2001
From: Costas Argyris
Date: Sat, 25 Mar 2023 21:51:41 +
Subject: [PATCH] Add UTF-8 resource when building for Windows host, if a
 resource compiler is available.

As a result, the produced GNU Make Windows executable will use UTF-8 as its ANSI code page, enabling it to work with UTF-8 encoded Makefiles, understand UTF-8 paths passed to it, etc.

These build process changes apply to all 3 ways that GNU Make can be built for Windows:

1) configure
2) Basic.mk
3) build_w32.bat

When building with VS the resource compiler should always be available.  When building with GCC or TCC, it depends on the availability of 'windres'.  If a resource compiler is not available, don't fail the build but just proceed without the UTF-8 resource, effectively ignoring this feature.

The UTF-8 resource only has an effect when GNU Make is running on a minimum target version of Windows Version 1903 (May 2019 Update).  When the built GNU Make is running on an earlier version of Windows, the embedded UTF-8 resource has no effect at all.

Code page information gets added to --version output to tell users what code pages are being used by any combination of GNU Make build (with or without the UTF-8 resource) and Windows version that GNU Make is running on (earlier than 1903 or not).

Signed-off-by: Costas Argyris
---
 .gitignore            |  2 ++
 Basic.mk.template     |  6 --
 Makefile.am           | 11 +++
 README.git            |  2 +-
 build_w32.bat         | 46 ---
 configure.ac          |  5 +
 mk/Windows32.mk       | 17
 src/main.c            |  5 +
 src/w32/utf8.manifest |  8
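[Editor's note: the diffstat above ends at src/w32/utf8.manifest.  For readers unfamiliar with the mechanism, a manifest that opts an executable into the UTF-8 active code page follows Microsoft's documented `activeCodePage` setting, along these lines.  This is the documented pattern, not necessarily the exact file from the patch.]

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <application>
    <windowsSettings>
      <!-- Honored on Windows 10 version 1903 and later; silently ignored
           on older versions, which matches the fallback behavior the
           commit message describes. -->
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>
```

The resource compiler (rc.exe or windres) embeds this manifest into the executable, which is why the build treats the resource compiler as the gating tool for the whole feature.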
Re: [PATCH] Use UTF-8 active code page for Windows host.
I think this should be added to README.git.  Without these explanations, the purpose of Basic.mk and its auxiliary files, and of their intended usage, is completely unclear.

I believe this was going to Paul.  From my side, these explanations were really helpful.

On to the Basic.mk patch, please see the latest one attached in this email.

After regenerating Basic.mk from .\bootstrap.bat, I tried running it with both msvc and gcc values for TOOLCHAIN and they both worked fine.

I also tried the 'no resource compiler' case by temporarily renaming 'rc.exe' (msvc) and 'windres.exe' (gcc) to something else so they are not found on the Windows path, and the build just went through with no errors and produced non-UTF-8 GNU Make binaries.

As you will see, in mk/Windows32.mk I used:

ifneq (, $(shell where $(RC) 2>nul))

to tell if a program is on the Windows path.  It seems to be working fine in both cases of 'found' and 'not-found', but I am no GNU Make expert so please let me know if this is correct.

A little inconsistency is that in build_w32.bat I didn't implement a check for 'rc.exe', because I assumed it's always going to be on the Windows path if the compiler 'cl.exe' is, but in mk/Windows32.mk I did implement the check even for 'rc.exe' - I can add the check in build_w32.bat to be consistent in a next update; it should be easy.

Also just checking - the configury approach, when building for a Windows host, can't be used with msvc or tcc, right?  It needs to find gcc targeting Windows, and therefore checking for windres (already implemented) should be sufficient, right?

On Thu, 18 May 2023 at 06:31, Eli Zaretskii wrote:
> > From: Paul Smith
> > Cc: bug-make@gnu.org
> > Date: Wed, 17 May 2023 18:04:55 -0400
> >
> > To remind: the purpose of these is to provide a makefile-based way to _develop_ GNU Make itself, on platforms where we can't run ./configure to get an automake-generated makefile.
> >
> > If you need to build GNU Make from scratch there's not much benefit from using Basic.mk, because it will just rebuild everything every time, just like the build_w32.bat etc. files.  You don't save anything.
> >
> > But if you're doing ongoing development (edit/build/test cycle) and you don't want to have to recompile all files every time you change something, and you can't run ./configure, then you can use an already-built GNU Make and these makefiles to shorten your development cycle.
>
> I think this should be added to README.git.  Without these explanations, the purpose of Basic.mk and its auxiliary files, and of their intended usage, is completely unclear.

diff --git a/Basic.mk.template b/Basic.mk.template
index e3a83a20..ce273a1f 100644
--- a/Basic.mk.template
+++ b/Basic.mk.template
@@ -59,6 +59,8 @@ BUILT_SOURCES =
 
 OBJECTS = $(patsubst %.c,$(OUTDIR)%.$(OBJEXT),$(prog_SOURCES))
 
+RESOURCE_OBJECTS =
+
 OBJDIRS = $(addsuffix .,$(sort $(dir $(OBJECTS))))
 
 # Use the default value of CC
@@ -99,7 +101,7 @@ RM.cmd = rm -f $1
 # $(call CP.cmd,<from>,<to>)
 CP.cmd = cp $1 $2
 
-CLEANSPACE = $(call RM.cmd,$(OBJECTS) $(PROG) $(BUILT_SOURCES))
+CLEANSPACE = $(call RM.cmd,$(OBJECTS) $(RESOURCE_OBJECTS) $(PROG) $(BUILT_SOURCES))
 
 # Load overrides for the above variables.
 include $(firstword $(wildcard $(SRCDIR)/mk/$(lastword $(subst -, ,$(MAKE_HOST))).mk))
@@ -108,7 +110,7 @@ VPATH = $(SRCDIR)
 
 all: $(PROG)
 
-$(PROG): $(OBJECTS)
+$(PROG): $(OBJECTS) $(RESOURCE_OBJECTS)
 	$(call LINK.cmd,$^)
 
 $(OBJECTS): $(OUTDIR)%.$(OBJEXT): %.c
diff --git a/mk/Windows32.mk b/mk/Windows32.mk
index 57226eb1..6e357ea7 100644
--- a/mk/Windows32.mk
+++ b/mk/Windows32.mk
@@ -30,6 +30,8 @@ P2W = $(subst /,\,$1)
 
 prog_SOURCES += $(loadavg_SOURCES) $(glob_SOURCES) $(w32_SOURCES)
 
+utf8_SOURCES = $(src)w32/utf8.rc $(src)w32/utf8.manifest
+
 BUILT_SOURCES += $(lib)alloca.h $(lib)fnmatch.h $(lib)glob.h
 
 w32_LIBS = kernel32 user32 gdi32 winspool comdlg32 advapi32 shell32 ole32 \
@@ -41,6 +43,7 @@ LDFLAGS =
 # --- Visual Studio
 
 msvc_CC = cl.exe
+msvc_RC = rc.exe
 msvc_LD = link.exe
 
 msvc_CPPFLAGS = /DHAVE_CONFIG_H /DMK_OS_W32=1 /DWIN32 /D_CONSOLE
@@ -54,6 +57,7 @@ msvc_LDFLAGS = /nologo /SUBSYSTEM:console /PDB:$(BASE_PROG).pdb
 msvc_LDLIBS = $(addsuffix .lib,$(w32_LIBS))
 
 msvc_C_SOURCE = /c
+msvc_RC_SOURCE =
 msvc_OUTPUT_OPTION = /Fo$@
 msvc_LINK_OUTPUT = /OUT:$@
@@ -68,6 +72,7 @@ debug_msvc_LDFLAGS = /DEBUG
 # --- GCC
 
 gcc_CC = gcc
+gcc_RC = windres
 gcc_LD = $(gcc_CC)
 
 release_gcc_OUTDIR = ./GccRel/
@@ -79,6 +84,7 @@ gcc_LDFLAGS = -mthreads -gdwarf-2 -g3
 gcc_LDLIBS = $(addprefix -l,$(w32_libs))
 
 gcc_C_SOURCE = -c
+gcc_RC_SOURCE = -i
 gcc_OUTPUT_OPTION = -o $@
 gcc_LINK_OUTPUT = -o $@
@@ -87,6 +93,7 @@ release_gcc_CFLAGS = -O2
 
 # ---
 
+RES_COMPILE.cmd = $(RC) $(OUTPUT_OPTION) $(RC_SOURCE) $1
 LINK.cmd = $(LD) $(extra_LDFLAGS) $(LDFLAGS) $(TARGET_ARCH) $1 $(LDLIBS) $(LINK_OUTPUT)
 CHECK.cmd = cmd /c cd tests \& .\run_make_tests.bat -make ../$(PROG)
@@ -96,9 +103,11 @@ RM.cmd
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Paul Smith
> Cc: bug-make@gnu.org
> Date: Wed, 17 May 2023 18:04:55 -0400
>
> To remind: the purpose of these is to provide a makefile-based way to _develop_ GNU Make itself, on platforms where we can't run ./configure to get an automake-generated makefile.
>
> If you need to build GNU Make from scratch there's not much benefit from using Basic.mk, because it will just rebuild everything every time, just like the build_w32.bat etc. files.  You don't save anything.
>
> But if you're doing ongoing development (edit/build/test cycle) and you don't want to have to recompile all files every time you change something, and you can't run ./configure, then you can use an already-built GNU Make and these makefiles to shorten your development cycle.

I think this should be added to README.git.  Without these explanations, the purpose of Basic.mk and its auxiliary files, and of their intended usage, is completely unclear.
Re: [PATCH] Use UTF-8 active code page for Windows host.
> No; those makefiles can only work with GNU Make.  You can easily tell by looking at them since they use make functions like call, etc. all over the place, plus pattern rules and all sorts of fancy things :).

I see - thanks for pointing that out (I don't have much experience with Makefiles, as you can probably tell).  I guess that makes the task of telling whether 'windres' is on the Windows path easier then, because GNU Make extensions can be used in mk/Windows32.mk to do it.  I'll try and get something working and update the Basic.mk patch with windres being optional, like in configure and build_w32.bat.

> If you don't already have GNU Make, you would use one of the other methods (like build_w32.bat) to create one.  Then after that you can use these makefiles.

Indeed, it makes sense; thanks for the overall explanation.

On Wed, 17 May 2023 at 23:04, Paul Smith wrote:
> On Wed, 2023-05-17 at 22:55 +0100, Costas Argyris wrote:
>> From a quick search there appear to be many ways to do this, but some of them are GNU Make-specific, and I believe these Makefiles (Basic.mk and those included by it) have to work with any Make, not just GNU Make.
>
> No; those makefiles can only work with GNU Make.  You can easily tell by looking at them since they use make functions like call, etc. all over the place, plus pattern rules and all sorts of fancy things :).
>
> If you don't already have GNU Make, you would use one of the other methods (like build_w32.bat) to create one.  Then after that you can use these makefiles.
>
> To remind: the purpose of these is to provide a makefile-based way to _develop_ GNU Make itself, on platforms where we can't run ./configure to get an automake-generated makefile.
>
> If you need to build GNU Make from scratch there's not much benefit from using Basic.mk, because it will just rebuild everything every time, just like the build_w32.bat etc. files.  You don't save anything.
>
> But if you're doing ongoing development (edit/build/test cycle) and you don't want to have to recompile all files every time you change something, and you can't run ./configure, then you can use an already-built GNU Make and these makefiles to shorten your development cycle.
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Wed, 2023-05-17 at 22:55 +0100, Costas Argyris wrote:
> From a quick search there appear to be many ways to do this, but some of them are GNU Make-specific, and I believe these Makefiles (Basic.mk and those included by it) have to work with any Make, not just GNU Make.

No; those makefiles can only work with GNU Make.  You can easily tell by looking at them since they use make functions like call, etc. all over the place, plus pattern rules and all sorts of fancy things :).

If you don't already have GNU Make, you would use one of the other methods (like build_w32.bat) to create one.  Then after that you can use these makefiles.

To remind: the purpose of these is to provide a makefile-based way to _develop_ GNU Make itself, on platforms where we can't run ./configure to get an automake-generated makefile.

If you need to build GNU Make from scratch there's not much benefit from using Basic.mk, because it will just rebuild everything every time, just like the build_w32.bat etc. files.  You don't save anything.

But if you're doing ongoing development (edit/build/test cycle) and you don't want to have to recompile all files every time you change something, and you can't run ./configure, then you can use an already-built GNU Make and these makefiles to shorten your development cycle.
Re: [PATCH] Use UTF-8 active code page for Windows host.
Actually I think windres should be optional here too, even if it is just for being consistent across all build approaches.  If you agree, please let me know what would be a good portable way to check if windres is on the path from within a Makefile (mk/Windows32.mk, I guess).

From a quick search there appear to be many ways to do this, but some of them are GNU Make-specific, and I believe these Makefiles (Basic.mk and those included by it) have to work with any Make, not just GNU Make.

Also note that this can be a Windows-specific way, possibly involving 'where windres' or 'windres --version'.  But I'm not sure how this is best done from within a Makefile that has to be processed on Windows by any Make program.

On Wed, 17 May 2023 at 19:34, Costas Argyris wrote:
> Here is the patch with the Basic.mk.template and mk/Windows32.mk changes.  I tried to keep most of the changes in the Windows-specific file, but something had to happen in the general one as well, as far as I could tell.
>
> I deliberately sent only the changes relevant to the Basic.mk approach here to facilitate review.
>
> The changes are working as far as I can tell, with both
>
> .\gnumake.exe -f Basic.mk TOOLCHAIN=msvc
>
> and
>
> .\gnumake.exe -f Basic.mk TOOLCHAIN=gcc
>
> (and without TOOLCHAIN at all, of course, defaulting to msvc)
>
> First I had to run
>
> .\bootstrap.bat
>
> to re-generate Basic.mk after changing Basic.mk.template and mk/Windows32.mk
>
> These produce UTF-8-supporting gnumake.exe binaries in their corresponding WinRel and GccRel folders.
>
> As you will see, I haven't implemented checking if windres is available on the path for the TOOLCHAIN=gcc case, as I have never come across a case where a gcc distribution for Windows doesn't include windres (in all cases I have seen, it gets distributed with binutils, because it needs the assembler and linker that are not part of gcc, and windres is part of binutils so it's also there).
>
> Tcc is not supported with the Basic.mk approach anyway, so no need to worry about the fact that it doesn't ship with windres.
>
> MSVC always has its own resource compiler available.
>
> If you still need the check for windres for the gcc case, please let me know.
>
> On Wed, 17 May 2023 at 14:10, Paul Smith wrote:
>> On Wed, 2023-05-17 at 12:47 +0100, Costas Argyris wrote:
>>> However, when trying to prepare the new patch I realized that Basic.mk is an untracked file which is listed in .gitignore, so how would you like me to show you these latest changes?
>>
>> The file to be changed is Basic.mk.template
>>
>> Sorry I have been out of touch this week; I likely will not have much time until the weekend.  I'm looking forward to the patch though!
>>
>> For your initial question: I understand the issue, and I'm not sure how it works for me; I'll have to try it again.  It does seem wrong that we replace the running instance of make; that can never work on Windows (it works on most POSIX systems).
Re: [PATCH] Use UTF-8 active code page for Windows host.
Here is the patch with the Basic.mk.template and mk/Windows32.mk changes.  I tried to keep most of the changes in the Windows-specific file, but something had to happen in the general one as well, as far as I could tell.

I deliberately sent only the changes relevant to the Basic.mk approach here to facilitate review.

The changes are working as far as I can tell, with both

.\gnumake.exe -f Basic.mk TOOLCHAIN=msvc

and

.\gnumake.exe -f Basic.mk TOOLCHAIN=gcc

(and without TOOLCHAIN at all, of course, defaulting to msvc)

First I had to run

.\bootstrap.bat

to re-generate Basic.mk after changing Basic.mk.template and mk/Windows32.mk

These produce UTF-8-supporting gnumake.exe binaries in their corresponding WinRel and GccRel folders.

As you will see, I haven't implemented checking if windres is available on the path for the TOOLCHAIN=gcc case, as I have never come across a case where a gcc distribution for Windows doesn't include windres (in all cases I have seen, it gets distributed with binutils, because it needs the assembler and linker that are not part of gcc, and windres is part of binutils so it's also there).

Tcc is not supported with the Basic.mk approach anyway, so no need to worry about the fact that it doesn't ship with windres.

MSVC always has its own resource compiler available.

If you still need the check for windres for the gcc case, please let me know.

On Wed, 17 May 2023 at 14:10, Paul Smith wrote:
> On Wed, 2023-05-17 at 12:47 +0100, Costas Argyris wrote:
>> However, when trying to prepare the new patch I realized that Basic.mk is an untracked file which is listed in .gitignore, so how would you like me to show you these latest changes?
>
> The file to be changed is Basic.mk.template
>
> Sorry I have been out of touch this week; I likely will not have much time until the weekend.  I'm looking forward to the patch though!
>
> For your initial question: I understand the issue, and I'm not sure how it works for me; I'll have to try it again.  It does seem wrong that we replace the running instance of make; that can never work on Windows (it works on most POSIX systems).

Attachment: Basic-mk.patch (binary data)
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Wed, 2023-05-17 at 12:47 +0100, Costas Argyris wrote:
> However, when trying to prepare the new patch I realized that Basic.mk is an untracked file which is listed in .gitignore, so how would you like me to show you these latest changes?

The file to be changed is Basic.mk.template

Sorry I have been out of touch this week; I likely will not have much time until the weekend.  I'm looking forward to the patch though!

For your initial question: I understand the issue, and I'm not sure how it works for me; I'll have to try it again.  It does seem wrong that we replace the running instance of make; that can never work on Windows (it works on most POSIX systems).
Re: [PATCH] Use UTF-8 active code page for Windows host.
I side-stepped this issue by copying .\WinRel\gnumake.exe to the top-level folder and running this instead:

.\gnumake.exe -f Basic.mk TOOLCHAIN={msvc(default),gcc}

which worked.

I implemented the changes in Basic.mk and mk\Windows32.mk (which is being sourced by Basic.mk) and they seem to be working fine: I got UTF-8-working executables in WinRel\gnumake.exe and GccRel\gnumake.exe for the two options of TOOLCHAIN respectively.  Note that there is no "tcc" option here, in contrast to the build_w32.bat file approach.

However, when trying to prepare the new patch I realized that Basic.mk is an untracked file which is listed in .gitignore, so how would you like me to show you these latest changes?

On Tue, 16 May 2023 at 19:05, Costas Argyris wrote:
> Thanks for that info - I tried doing exactly as you said and I'd say I'm almost there, except for the final link step:
>
> When running
>
> .\WinRel\gnumake.exe
>
> (with or without -f Basic.mk, doesn't matter)
>
> it seems that the final executable is attempted to be created in the exact same file that started the build, leading to the error:
>
> LINK : fatal error LNK1104: cannot open file 'WinRel\gnumake.exe'
>
> I tried taking a quick look at how the output directory is decided, but it's not so easy because apparently Basic.mk sources other platform-specific files under the 'mk' folder, mk\Windows32.mk in this case, and it's not so easy to tell what is happening there with the OUTDIR variable.  The variable that appears to be honored is
>
> release_msvc_OUTDIR = ./WinRel/
>
> but that seems to be unused in the file (?).
>
> There is even this comment in mk\Windows32.mk that seems relevant:
>
> # I'm not sure why this builds gnumake rather than make...?
> PROG = $(OUTDIR)gnumake$(EXEEXT)
>
> Before I dig deeper into it, am I missing something obvious?
>
> I could try and change the output dir, but afaict this was supposed to work as-is.
>
> On Mon, 15 May 2023 at 19:14, Paul Smith wrote:
>> On Mon, 2023-05-15 at 17:48 +0100, Costas Argyris wrote:
>>> As I have said before, I wasn't successful in getting the Basic.mk approach to work on Windows, as I was getting various errors all over the place.  They started with CC being undefined, but even after I defined it to 'gcc' this just took me to various link errors, at which point I thought that this approach is not really maintained.  That was in contrast with the other two approaches on Windows host, namely configure and .bat file, both of which worked as expected.
>>
>> I think I tried to say before, but probably failed to be clear, that Basic.mk is used _in conjunction with_ one of the alternatives to running configure.
>>
>> By that I mean you FIRST have to use one of the alternatives to running configure, THEN you can use Basic.mk.  The Basic.mk framework doesn't, in particular, set up config.h etc.
>>
>> So, the following recipe works for me; first:
>>
>> .\build_w32.bat
>>
>> This sets up config.h and copies the Basic.mk file to be Makefile so that it's available for GNU make to use.
>>
>> .\WinRel\gnumake.exe
>>
>> This invokes the just-built GNU Make and uses the Makefile copy of Basic.mk (of course you can use .\WinRel\gnumake.exe -f Basic.mk instead if you prefer).
>>
>> By default, Basic.mk uses Visual Studio as the compiler, and it expects the invoking shell has set up MSVC using vcvarsall or whatever.  If you set TOOLCHAIN=gcc on the make command line it should use GCC.  I admit I haven't tried this one recently.
>>
>>> So, can this feature proceed without changes in Basic.mk?
>>
>> It's fine with me if you want to submit a patch that doesn't provide these updates.  I can add them myself, or not.
Re: [PATCH] Use UTF-8 active code page for Windows host.
Thanks for that info - I tried doing exactly as you said and I'd say I'm almost there, except for the final link step:

When running

.\WinRel\gnumake.exe

(with or without -f Basic.mk, doesn't matter)

it seems that the final executable is attempted to be created in the exact same file that started the build, leading to the error:

LINK : fatal error LNK1104: cannot open file 'WinRel\gnumake.exe'

I tried taking a quick look at how the output directory is decided, but it's not so easy because apparently Basic.mk sources other platform-specific files under the 'mk' folder, mk\Windows32.mk in this case, and it's not so easy to tell what is happening there with the OUTDIR variable.  The variable that appears to be honored is

release_msvc_OUTDIR = ./WinRel/

but that seems to be unused in the file (?).

There is even this comment in mk\Windows32.mk that seems relevant:

# I'm not sure why this builds gnumake rather than make...?
PROG = $(OUTDIR)gnumake$(EXEEXT)

Before I dig deeper into it, am I missing something obvious?

I could try and change the output dir, but afaict this was supposed to work as-is.

On Mon, 15 May 2023 at 19:14, Paul Smith wrote:
> On Mon, 2023-05-15 at 17:48 +0100, Costas Argyris wrote:
>> As I have said before, I wasn't successful in getting the Basic.mk approach to work on Windows, as I was getting various errors all over the place.  They started with CC being undefined, but even after I defined it to 'gcc' this just took me to various link errors, at which point I thought that this approach is not really maintained.  That was in contrast with the other two approaches on Windows host, namely configure and .bat file, both of which worked as expected.
>
> I think I tried to say before, but probably failed to be clear, that Basic.mk is used _in conjunction with_ one of the alternatives to running configure.
>
> By that I mean you FIRST have to use one of the alternatives to running configure, THEN you can use Basic.mk.  The Basic.mk framework doesn't, in particular, set up config.h etc.
>
> So, the following recipe works for me; first:
>
> .\build_w32.bat
>
> This sets up config.h and copies the Basic.mk file to be Makefile so that it's available for GNU make to use.
>
> .\WinRel\gnumake.exe
>
> This invokes the just-built GNU Make and uses the Makefile copy of Basic.mk (of course you can use .\WinRel\gnumake.exe -f Basic.mk instead if you prefer).
>
> By default, Basic.mk uses Visual Studio as the compiler, and it expects the invoking shell has set up MSVC using vcvarsall or whatever.  If you set TOOLCHAIN=gcc on the make command line it should use GCC.  I admit I haven't tried this one recently.
>
>> So, can this feature proceed without changes in Basic.mk?
>
> It's fine with me if you want to submit a patch that doesn't provide these updates.  I can add them myself, or not.
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Mon, 2023-05-15 at 17:48 +0100, Costas Argyris wrote:
> As I have said before, I wasn't successful in getting the Basic.mk approach to work on Windows, as I was getting various errors all over the place.  They started with CC being undefined, but even after I defined it to 'gcc' this just took me to various link errors, at which point I thought that this approach is not really maintained.  That was in contrast with the other two approaches on Windows host, namely configure and .bat file, both of which worked as expected.

I think I tried to say before, but probably failed to be clear, that Basic.mk is used _in conjunction with_ one of the alternatives to running configure.

By that I mean you FIRST have to use one of the alternatives to running configure, THEN you can use Basic.mk.  The Basic.mk framework doesn't, in particular, set up config.h etc.

So, the following recipe works for me; first:

.\build_w32.bat

This sets up config.h and copies the Basic.mk file to be Makefile so that it's available for GNU make to use.

.\WinRel\gnumake.exe

This invokes the just-built GNU Make and uses the Makefile copy of Basic.mk (of course you can use .\WinRel\gnumake.exe -f Basic.mk instead if you prefer).

By default, Basic.mk uses Visual Studio as the compiler, and it expects the invoking shell has set up MSVC using vcvarsall or whatever.  If you set TOOLCHAIN=gcc on the make command line it should use GCC.  I admit I haven't tried this one recently.

> So, can this feature proceed without changes in Basic.mk?

It's fine with me if you want to submit a patch that doesn't provide these updates.  I can add them myself, or not.
Re: [PATCH] Use UTF-8 active code page for Windows host.
> > On Tue, 2023-04-11 at 15:29 +0300, Eli Zaretskii wrote: > > > I agree with the list. As for Basic.mk, we can forget about it > > > from my POV. Paul should make the call, but from my POV that > > > file was unmaintained and therefore unsupported. > > > > Why do we think it's unmaintained / unsupported? > > Because I never use it when building the MinGW port of Make. I see. I never use MinGW, as I don't have it installed. The Basic.mk model is only useful for systems where you can't run configure and generate a Makefile that way. For systems where configure can't be invoked it provides a makefile framework (versus a bat file) that can be used if you already have a previous GNU Make available (or once you build one once with the bat file). As I have said before, I wasn't successful in getting the Basic.mk approach to work on Windows, as I was getting various errors all over the place. They started with CC being undefined, but even after I defined it to 'gcc' this just took me to various link errors, at which point I thought that this approach is not really maintained. That was in contrast with the other two approaches on Windows host, namely configure and .bat file, both of which worked as expected. So my question now is: Is the Basic.mk approach a mandatory prerequisite for the UTF-8 feature? Do the UTF-8 build changes for Windows host have to be extended over there as well, or can we do without it, and say that a UTF-8 build for Windows works only with the configure and .bat file approaches (assuming there is a resource compiler available, of course).
Note that, as agreed, in the latest patch I made the resource compiler (and hence building with UTF-8 manifest) optional, and added this information to --version output. This means that even if Basic.mk doesn't support the UTF-8 feature, the user would still know it by means of --version. That would be the same as if one of the other approaches was used and there was no resource compiler available (gcc or tcc without windres on the path). So, can this feature proceed without changes in Basic.mk? On Sat, 6 May 2023 at 19:54, Paul Smith wrote: > On Sun, 2023-04-30 at 17:27 +0300, Eli Zaretskii wrote: > > > From: Paul Smith > > > From: Paul Smith Date: Sun, 30 Apr 2023 09:55:55 - > > > 0400 > > > > > > On Tue, 2023-04-11 at 15:29 +0300, Eli Zaretskii wrote: > > > > I agree with the list. As for Basic.mk, we can forget about it > > > > from my POV. Paul should make the call, but from my POV that > > > > file was unmaintained and therefore unsupported. > > > > > > Why do we think it's unmaintained / unsupported? > > > > Because I never use it when building the MinGW port of Make. > > I see. I never use MinGW, as I don't have it installed. The Basic.mk > model is only useful for systems where you can't run configure and > generate a Makefile that way. For systems where configure can't be > invoked it provides a makefile framework (versus a bat file) that can > be used if you already have a previous GNU Make available (or once you > build one once with the bat file). >
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Sun, 2023-04-30 at 17:27 +0300, Eli Zaretskii wrote: > > From: Paul Smith > > From: Paul Smith Date: Sun, 30 Apr 2023 09:55:55 - > > 0400 > > > > On Tue, 2023-04-11 at 15:29 +0300, Eli Zaretskii wrote: > > > I agree with the list. As for Basic.mk, we can forget about it > > > from my POV. Paul should make the call, but from my POV that > > > file was unmaintained and therefore unsupported. > > > > Why do we think it's unmaintained / unsupported? > > Because I never use it when building the MinGW port of Make. I see. I never use MinGW, as I don't have it installed. The Basic.mk model is only useful for systems where you can't run configure and generate a Makefile that way. For systems where configure can't be invoked it provides a makefile framework (versus a bat file) that can be used if you already have a previous GNU Make available (or once you build one once with the bat file).
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Paul Smith > Cc: bug-make@gnu.org > Date: Sun, 30 Apr 2023 09:55:55 -0400 > > On Tue, 2023-04-11 at 15:29 +0300, Eli Zaretskii wrote: > > I agree with the list. As for Basic.mk, we can forget about it from > > my POV. Paul should make the call, but from my POV that file was > > unmaintained and therefore unsupported. > > Why do we think it's unmaintained / unsupported? Because I never use it when building the MinGW port of Make.
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Tue, 2023-04-11 at 15:29 +0300, Eli Zaretskii wrote: > I agree with the list. As for Basic.mk, we can forget about it from > my POV. Paul should make the call, but from my POV that file was > unmaintained and therefore unsupported. It worked the last time I tried it. The advantage it has over running build_w32.bat is that, since it uses make, it will only rebuild things that have changed and not everything, which can be helpful if you're doing iterative development.
Re: [PATCH] Use UTF-8 active code page for Windows host.
Hi

This is the latest patch based on the agreed remaining items: > 1) Make build optional with respect to UTF-8: If windres is available, > use it, if not, just build without UTF-8 support (current behavior). > 2) Implement Paul's suggestion above to avoid having an empty target > if HAVE_WINDRES is not set. > 3) Add active code page used in "make --version" output, for Windows. > Potentially also check Windows version. > 4) Can we officially forget about bringing the UTF-8 changes to Basic.mk? > As I have said before, I haven't managed to build using these Makefiles. > Actually, having the code page output by --version would greatly help with > this as well - if one built GNU Make using Basic.mk, they wouldn't get > UTF-8 support but this would still be readable in --version so no surprises. A reminder that we are waiting for Paul's decision on whether we can forget about Basic.mk and the like regarding UTF-8. Note how the info added to --version actually helps with this. For the rest of the items 1-3, I made the changes and tested building using both build_w32.bat (all 3 compilers: VS, GCC and TCC) and the Unix configury method (GCC). The case of 'no windres' was simulated by simply temporarily renaming the 'windres' binary to 'windres1' so that it wasn't found, and Make was indeed built without the UTF-8 resource (and --version correctly reported the local code page instead of UTF-8). I also tried running a UTF-8 Make in a version of Windows earlier than 1903 and the --version output returned the local code page (not UTF-8), which is what Eli was expecting, so no surprises there. Please let me know what you think about this patch. Thanks On Tue, 11 Apr 2023 at 14:56, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Tue, 11 Apr 2023 14:50:53 +0100 > > Cc: bug-make@gnu.org, psm...@gnu.org > > > > > AFAIK, GetACP can never return UTF-8, except if the program was > > > compiled with those resources. 
> > > > In the scenario I am describing, Make was compiled with the resource, > > so GetACP should return UTF-8 on the one hand. On the other hand > > though, since Make is running in Windows version < 1903, it shouldn't > > return UTF-8 because this functionality is not supported in that version. > > That is what will happen, AFAIK. > 0001-Add-UTF-8-resource-when-building-for-Windows-host.patch Description: Binary data
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Tue, 11 Apr 2023 14:50:53 +0100 > Cc: bug-make@gnu.org, psm...@gnu.org > > > AFAIK, GetACP can never return UTF-8, except if the program was > > compiled with those resources. > > In the scenario I am describing, Make was compiled with the resource, > so GetACP should return UTF-8 on the one hand. On the other hand > though, since Make is running in Windows version < 1903, it shouldn't > return UTF-8 because this functionality is not supported in that version. That is what will happen, AFAIK.
Re: [PATCH] Use UTF-8 active code page for Windows host.
AFAIK, GetACP can never return UTF-8, except if the program was compiled with those resources. In the scenario I am describing, Make was compiled with the resource, so GetACP should return UTF-8 on the one hand. On the other hand though, since Make is running in Windows version < 1903, it shouldn't return UTF-8 because this functionality is not supported in that version. But yes, I will make sure to check this in practice. On Tue, 11 Apr 2023 at 14:46, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Tue, 11 Apr 2023 14:42:30 +0100 > > Cc: bug-make@gnu.org, Paul Smith > > > >> I don't think this is needed: if GetACP returns the UTF-8 codepage, it > >> must be that UTF-8 is supported. I'm not aware of any way of > >> affecting GetACP other than by a manifest such as this one (or perhaps > >> making UTF-8 a system-wide default, which is fine by us). > > > > This is the scenario I am concerned about: > > > > Assume Make was built with UTF-8 support, and downloaded by a > > user running Windows < 1903. I am not sure what GetACP would > > return in this case - If it returns the legacy code page, despite the > > fact that the UTF-8 manifest is embedded in Make, then we are good. > > But if GetACP returns UTF-8, because of the manifest that was > > embedded at build time, this will be confusing because --version will > > say UTF-8 but Make will actually work in the legacy encoding because > > of the < 1903 Windows version. > > AFAIK, GetACP can never return UTF-8, except if the program was > compiled with those resources. > > > I haven't tested this though, so it might not even be a real issue, just > > noting it down to check it later when I have the implementation. > > Yes, verifying this would be good, thanks. >
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Tue, 11 Apr 2023 14:42:30 +0100 > Cc: bug-make@gnu.org, Paul Smith > >> I don't think this is needed: if GetACP returns the UTF-8 codepage, it >> must be that UTF-8 is supported. I'm not aware of any way of >> affecting GetACP other than by a manifest such as this one (or perhaps >> making UTF-8 a system-wide default, which is fine by us). > > This is the scenario I am concerned about: > > Assume Make was built with UTF-8 support, and downloaded by a > user running Windows < 1903. I am not sure what GetACP would > return in this case - If it returns the legacy code page, despite the > fact that the UTF-8 manifest is embedded in Make, then we are good. > But if GetACP returns UTF-8, because of the manifest that was > embedded at build time, this will be confusing because --version will > say UTF-8 but Make will actually work in the legacy encoding because > of the < 1903 Windows version. AFAIK, GetACP can never return UTF-8, except if the program was compiled with those resources. > I haven't tested this though, so it might not even be a real issue, just > noting it down to check it later when I have the implementation. Yes, verifying this would be good, thanks.
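For context, the manifest being discussed follows the general shape Microsoft documents for the per-process activeCodePage setting; this is a sketch of such a manifest, not necessarily the exact resource file in the patch:

```xml
<?xml version="1.0" encoding="utf-8"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <application>
    <windowsSettings>
      <!-- Opt this process into the UTF-8 active code page;
           honored on Windows 10 version 1903 and later,
           ignored by earlier versions. -->
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>
```

When this manifest is embedded (via windres or the MSVC resource compiler), GetACP() returns 65001 on supported Windows versions, which is exactly the property the --version report relies on.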
Re: [PATCH] Use UTF-8 active code page for Windows host.
I don't think this is needed: if GetACP returns the UTF-8 codepage, it must be that UTF-8 is supported. I'm not aware of any way of affecting GetACP other than by a manifest such as this one (or perhaps making UTF-8 a system-wide default, which is fine by us). This is the scenario I am concerned about: Assume Make was built with UTF-8 support, and downloaded by a user running Windows < 1903. I am not sure what GetACP would return in this case - If it returns the legacy code page, despite the fact that the UTF-8 manifest is embedded in Make, then we are good. But if GetACP returns UTF-8, because of the manifest that was embedded at build time, this will be confusing because --version will say UTF-8 but Make will actually work in the legacy encoding because of the < 1903 Windows version. I haven't tested this though, so it might not even be a real issue, just noting it down to check it later when I have the implementation. If Paul is also OK with forgetting about Basic.mk for UTF-8 support, sounds like we have a plan. On Tue, 11 Apr 2023 at 13:29, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Tue, 11 Apr 2023 12:01:20 +0100 > > Cc: Eli Zaretskii , bug-make@gnu.org > > > > > Being able to know whether UTF-8 is supported or not is a valid > > > concern. How about adding this information to what "make --version" > > > shows? > > > > I agreed with that suggestion and proposed a plan, but didn't receive > > final confirmation on it. > > > > As far as I can tell, the only scenario where GNU Make is not built > > with UTF-8 is if it gets built with tcc, which doesn't have a resource > > compiler. Both gcc and msvc have resource compilers (gcc through > > binutils which gcc depends on anyway). But since tcc is a supported > > compiler as well, and we don't want to break it for the sake of UTF-8, > > then we must provide users with a way to tell if Make was built with > > UTF-8 support or not. 
> > > > I think outputting this info can be as simple as adding a call to GetACP > > in some appropriate place in the source code. > > Yes, I think so. > > > Note that this is going > > to be a windows-specific call. If you can point me at some candidate > > locations in the source code for adding that call, it would greatly speed > > things up. Otherwise I would just try to find where the --version > output > > is computed and try and add a windows-specific branch somewhere > > there. > > I think Windows-specific code in print_version (in main.c) should be > fine, but perhaps just call a function there, and the function itself > should be in a Windows-specific file, like w32/w32os.c. > > > There is one more complication about this: As we have stated before, > > this work only has a positive effect on Windows Version 1903 or later. > > Earlier versions will still work, but won't get UTF-8. So would it > make > > any sense if we reported UTF-8 in --version for versions of Windows > > earlier than 1903? Perhaps the check should include both Windows > > version and GetACP - thoughts? > > I don't think this is needed: if GetACP returns the UTF-8 codepage, it > must be that UTF-8 is supported. I'm not aware of any way of > affecting GetACP other than by a manifest such as this one (or perhaps > making UTF-8 a system-wide default, which is fine by us). > > > To summarize, I think the list of things to be done is: > > > > 1) Make build optional with respect to UTF-8: If windres is available, > > use it, if not, just build without UTF-8 support (current behavior). > > 2) Implement Paul's suggestion above to avoid having an empty target > > if HAVE_WINDRES is not set. > > 3) Add active code page used in "make --version" output, for Windows. > > Potentially also check Windows version. > > 4) Can we officially forget about bringing the UTF-8 changes to Basic.mk? > > As I have said before, I haven't managed to build using these Makefiles. 
> > Actually, having the code page output by --version would greatly help > with > > this as well - if one built GNU Make using Basic.mk, they wouldn't get > > UTF-8 support but this would still be readable in --version so no > surprises. > > I agree with the list. As for Basic.mk, we can forget about it from > my POV. Paul should make the call, but from my POV that file was > unmaintained and therefore unsupported. > > Thanks. >
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Tue, 11 Apr 2023 12:01:20 +0100 > Cc: Eli Zaretskii , bug-make@gnu.org > > > Being able to know whether UTF-8 is supported or not is a valid > > concern. How about adding this information to what "make --version" > > shows? > > I agreed with that suggestion and proposed a plan, but didn't receive > final confirmation on it. > > As far as I can tell, the only scenario where GNU Make is not built > with UTF-8 is if it gets built with tcc, which doesn't have a resource > compiler. Both gcc and msvc have resource compilers (gcc through > binutils which gcc depends on anyway). But since tcc is a supported > compiler as well, and we don't want to break it for the sake of UTF-8, > then we must provide users with a way to tell if Make was built with > UTF-8 support or not. > > I think outputting this info can be as simple as adding a call to GetACP > in some appropriate place in the source code. Yes, I think so. > Note that this is going > to be a windows-specific call. If you can point me at some candidate > locations in the source code for adding that call, it would greatly speed > things up. Otherwise I would just try to find where the --version output > is computed and try and add a windows-specific branch somewhere > there. I think Windows-specific code in print_version (in main.c) should be fine, but perhaps just call a function there, and the function itself should be in a Windows-specific file, like w32/w32os.c. > There is one more complication about this: As we have stated before, > this work only has a positive effect on Windows Version 1903 or later. > Earlier versions will still work, but won't get UTF-8. So would it make > any sense if we reported UTF-8 in --version for versions of Windows > earlier than 1903? Perhaps the check should include both Windows > version and GetACP - thoughts? I don't think this is needed: if GetACP returns the UTF-8 codepage, it must be that UTF-8 is supported. 
I'm not aware of any way of affecting GetACP other than by a manifest such as this one (or perhaps making UTF-8 a system-wide default, which is fine by us). > To summarize, I think the list of things to be done is: > > 1) Make build optional with respect to UTF-8: If windres is available, > use it, if not, just build without UTF-8 support (current behavior). > 2) Implement Paul's suggestion above to avoid having an empty target > if HAVE_WINDRES is not set. > 3) Add active code page used in "make --version" output, for Windows. > Potentially also check Windows version. > 4) Can we officially forget about bringing the UTF-8 changes to Basic.mk? > As I have said before, I haven't managed to build using these Makefiles. > Actually, having the code page output by --version would greatly help with > this as well - if one built GNU Make using Basic.mk, they wouldn't get > UTF-8 support but this would still be readable in --version so no surprises. I agree with the list. As for Basic.mk, we can forget about it from my POV. Paul should make the call, but from my POV that file was unmaintained and therefore unsupported. Thanks.
Re: [PATCH] Use UTF-8 active code page for Windows host.
Thank you both for your input. I think you should move the assignment of UTF8OBJ out of the if- statement here, and only put the update of make_LDADD into the if- statement. That should be sufficient to have make ignore that target, if there's no WINDRES available. I'll make that change, thanks. I will say that I think this is at the very edge of what we can accept without copyright assignment. If this change gets larger, or if you want to contribute more in the future, we'll need to have paperwork. I think this change will get larger, because of Eli's suggestion: Being able to know whether UTF-8 is supported or not is a valid concern. How about adding this information to what "make --version" shows? I agreed with that suggestion and proposed a plan, but didn't receive final confirmation on it. As far as I can tell, the only scenario where GNU Make is not built with UTF-8 is if it gets built with tcc, which doesn't have a resource compiler. Both gcc and msvc have resource compilers (gcc through binutils which gcc depends on anyway). But since tcc is a supported compiler as well, and we don't want to break it for the sake of UTF-8, then we must provide users with a way to tell if Make was built with UTF-8 support or not. I think outputting this info can be as simple as adding a call to GetACP in some appropriate place in the source code. Note that this is going to be a windows-specific call. If you can point me at some candidate locations in the source code for adding that call, it would greatly speed things up. Otherwise I would just try to find where the --version output is computed and try and add a windows-specific branch somewhere there. There is one more complication about this: As we have stated before, this work only has a positive effect on Windows Version 1903 or later. 
Earlier versions will still work, but won't get UTF-8. So would it make any sense if we reported UTF-8 in --version for versions of Windows earlier than 1903? Perhaps the check should include both Windows version and GetACP - thoughts? If that change is made, which sounds like it will, this will push us over the edge it seems, thereby requiring paperwork. To summarize, I think the list of things to be done is: 1) Make build optional with respect to UTF-8: If windres is available, use it, if not, just build without UTF-8 support (current behavior). 2) Implement Paul's suggestion above to avoid having an empty target if HAVE_WINDRES is not set. 3) Add active code page used in "make --version" output, for Windows. Potentially also check Windows version. 4) Can we officially forget about bringing the UTF-8 changes to Basic.mk? As I have said before, I haven't managed to build using these Makefiles. Actually, having the code page output by --version would greatly help with this as well - if one built GNU Make using Basic.mk, they wouldn't get UTF-8 support but this would still be readable in --version so no surprises. If we all agree on that list, I can make these changes when I find some time and post the updated patch. Once we have the patch approved, we can start on the paperwork before actually committing anything. Does that sound reasonable? Thanks, Costas On Sat, 8 Apr 2023 at 05:32, Paul Smith wrote: > On Fri, 2023-04-07 at 08:29 -0400, Paul Smith wrote: > > > Also, I'm still waiting for Paul to approve the changes to Posix > > > configury part of your patch. > > > > Sorry I will make every effort to get to it today. > > The configure part looks OK to me. 
This change to Makefile.am seems > not quite correct to me however: > > @@ -90,6 +92,14 @@ else >make_SOURCES += src/posixos.c > endif > > +if HAVE_WINDRES > + UTF8OBJ = src/w32/utf8.$(OBJEXT) > + make_LDADD += $(UTF8OBJ) > +endif > + > +$(UTF8OBJ) : $(w32_utf8_SRCS) > + $(WINDRES) $< -o $@ > + > if USE_CUSTOMS >make_SOURCES += src/remote-cstms.c > else > > Here if HAVE_WINDRES is not set, then UTF8OBJ is empty, and you create > a recipe with an empty target. Now, this works in GNU Make (and is > ignored) but some other versions of make may not like it. > > I think you should move the assignment of UTF8OBJ out of the if- > statement here, and only put the update of make_LDADD into the if- > statement. That should be sufficient to have make ignore that target, > if there's no WINDRES available. > > After that if Eli is happy with it he can push it to Git. > > I will say that I think this is at the very edge of what we can accept > without copyright assignment. If this change gets larger, or if you > want to contribute more in the future, we'll need to have paperwork. > > Thanks for your contribution to GNU Make! >
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Fri, 2023-04-07 at 08:29 -0400, Paul Smith wrote: > > Also, I'm still waiting for Paul to approve the changes to Posix > > configury part of your patch. > > Sorry I will make every effort to get to it today. The configure part looks OK to me. This change to Makefile.am seems not quite correct to me however: @@ -90,6 +92,14 @@ else make_SOURCES += src/posixos.c endif +if HAVE_WINDRES + UTF8OBJ = src/w32/utf8.$(OBJEXT) + make_LDADD += $(UTF8OBJ) +endif + +$(UTF8OBJ) : $(w32_utf8_SRCS) + $(WINDRES) $< -o $@ + if USE_CUSTOMS make_SOURCES += src/remote-cstms.c else Here if HAVE_WINDRES is not set, then UTF8OBJ is empty, and you create a recipe with an empty target. Now, this works in GNU Make (and is ignored) but some other versions of make may not like it. I think you should move the assignment of UTF8OBJ out of the if- statement here, and only put the update of make_LDADD into the if- statement. That should be sufficient to have make ignore that target, if there's no WINDRES available. After that if Eli is happy with it he can push it to Git. I will say that I think this is at the very edge of what we can accept without copyright assignment. If this change gets larger, or if you want to contribute more in the future, we'll need to have paperwork. Thanks for your contribution to GNU Make!
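Applying Paul's suggestion, the fragment would look roughly like this (a sketch, not the final patch): with UTF8OBJ defined unconditionally, the rule always names a real target, and only the make_LDADD update stays inside the conditional, so nothing is linked when windres is absent.

```makefile
UTF8OBJ = src/w32/utf8.$(OBJEXT)

if HAVE_WINDRES
  make_LDADD += $(UTF8OBJ)
endif

$(UTF8OBJ) : $(w32_utf8_SRCS)
	$(WINDRES) $< -o $@
```

Without HAVE_WINDRES the rule is simply never needed, so $(WINDRES) is never invoked, and makes that dislike empty targets are no longer a concern.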
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Thu, 2023-04-06 at 22:32 +0300, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Thu, 6 Apr 2023 22:04:48 +0300 > > Cc: bug-make@gnu.org, Paul Smith > > > > Hi and sorry to bother you again. > > > > I haven't received any response so > > I was wondering if there is still > > interest in doing this. I do think we should do it. I think Eli agrees. > It's on my todo, but I'm busy these days. > > Also, I'm still waiting for Paul to approve the changes to Posix > configury part of your patch. Sorry I will make every effort to get to it today. FYI, I have some events happening in my "real life" which means I won't be much available starting tomorrow, for a few weeks. I know I owe some people reviews and some patches are still outstanding that need to be applied. I will try to get to as much of it tonight as possible: if I don't I do apologize in advance and I will get back to it when my time frees up a bit.
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Thu, 6 Apr 2023 22:04:48 +0300 > Cc: bug-make@gnu.org, Paul Smith > > Hi and sorry to bother you again. > > I haven't received any response so > I was wondering if there is still > interest in doing this. It's on my todo, but I'm busy these days. Also, I'm still waiting for Paul to approve the changes to Posix configury part of your patch.
Re: [PATCH] Use UTF-8 active code page for Windows host.
Hi and sorry to bother you again. I haven't received any response so I was wondering if there is still interest in doing this. On Tue, 28 Mar 2023, 17:43 Costas Argyris, wrote: > Being able to know whether UTF-8 is supported or not is a valid > concern. How about adding this information to what "make --version" > shows? > > I think this is a great idea, despite the fact that I expect most > build environments to actually have a resource compiler (either > windres or the msvc one) and therefore build with UTF-8. > > It certainly doesn't hurt to have this information output by Make > itself - I can only see this being useful, even if the answer is > UTF-8 most of the time. > > One comment about having it as part of --version is that it probably > won't be easy to parse the used encoding out of the --version > output, say, if a script wants to query for the encoding and pass a > UTF-8 Makefile conditional on Make actually supporting UTF-8. > > Whether we want to care for this scenario or not is a different > question though. > > If we decide to do this, it should simply be a call to the Windows > GetACP() function - it will return 65001 if Make has been built > with the UTF-8 manifest, or the identifier of the legacy system > encoding otherwise. > > Actually, it will return 65001 (the UTF-8 identifier) in a couple other > scenarios, even if Make hadn't been built with the manifest: > > 1) The user embedded the manifest manually using mt.exe > This is officially documented and can very well happen. > 2) The user has switched Windows to use UTF-8 as the > active code page globally, for the entire OS. This is possible > to do with a checkbox that is marked as "beta" by MS. > > So having Make return the output of GetACP would be useful > because it would capture all these scenarios, including it > having been built with the manifest of course. 
> > This is an example from ninja doing the same thing: > > https://github.com/ninja-build/ninja/pull/1918 > > They did it by adding a new flag, so not part of --version. I like > putting it in --version better because this does sound like the type > of information one would expect to see there, the only problem > being that it may not be easy to parse, unless we add it in a standard > easy-to-parse format, like in a line of its own as: > > <--version output> > ... > Encoding: 65001 > or > Encoding: 1252 > ... > > But it isn't "full UTF-8", as we have established during the > discussions. MS-Windows is not yet ready for that, even in its latest > versions. > > Yes, well, I guess what I really meant was "to the extent that Make > itself can affect things". We can't do anything about Windows > limitations anyway. I think there should be no problems at all > if Make is linked against UCRT, and I hope that we won't hit many > functions in MSVCRT that are not UTF-8-aware, but we couldn't > do anything about these anyway. > > > That is, I am more inclined to make the configure version also error > > out if windres was not found, than to make build_w32.bat optional. > > I'm of the opposite opinion. > > Sure, it shouldn't be hard to change build_w32.bat to make it optional. > There is no precedent of such behavior in that batch file, as far as I can > tell, so I'd have to come up with something like: > > if HAVE_WINDRES > compile resource file > else > just skip resource file with no error, and don't try to link it > end > > Then this is a non-issue: the error will not happen except in some > situations we cannot imagine. > > That's what I think, but I could be wrong since I really can't imagine > all the build environments people might be using, and, as I learned > from this thread, there are quite a few ways to build so it's hard to > anticipate for every possible scenario. 
> > Maybe it is best to just make compilation of the resource file optional, > but, very importantly, with the addition of your suggestion about printing > the encoding used in --version. That way, everyone will manage to > build as they currently do with no surprises about this, and they will > also be able to query Make any time for its encoding. > > I'd like to avoid annoying users with requests to install something > they did well without previously. Some would consider this a > regression. > > Makes sense, see above response. > > On Tue, 28 Mar 2023 at 12:22, Eli Zaretskii wrote: > >> > From: Costas Argyris >> > Date: Mon, 27 Mar 2023 23:04:52 +0100 >> > Cc: bug-make@gnu.org >> > >> > > Should we fail here? Or should we build without UTF-8 support since >> we >> > > don't have a resource compiler? I think that's what the configure >> > > version does, right? >> > >> > You are right, that was an inconsistency on my part, sorry about that. >> > It's true that the configure version is optional on this, whereas >> > build_w32.bat errors out. >> > >> > I think the answer depends on what is going to be the policy regarding >> > Make on Windows and UTF-8. If we want to claim
Re: [PATCH] Use UTF-8 active code page for Windows host.
Being able to know whether UTF-8 is supported or not is a valid concern. How about adding this information to what "make --version" shows? I think this is a great idea, despite the fact that I expect most build environments to actually have a resource compiler (either windres or the msvc one) and therefore build with UTF-8. It certainly doesn't hurt to have this information output by Make itself - I can only see this being useful, even if the answer is UTF-8 most of the time. One comment about having it as part of --version is that it probably won't be easy to parse the used encoding out of the --version output, say, if a script wants to query for the encoding and pass a UTF-8 Makefile conditional on Make actually supporting UTF-8. Whether we want to care for this scenario or not is a different question though. If we decide to do this, it should simply be a call to the Windows GetACP() function - it will return 65001 if Make has been built with the UTF-8 manifest, or the identifier of the legacy system encoding otherwise. Actually, it will return 65001 (the UTF-8 identifier) in a couple other scenarios, even if Make hadn't been built with the manifest: 1) The user embedded the manifest manually using mt.exe This is officially documented and can very well happen. 2) The user has switched Windows to use UTF-8 as the active code page globally, for the entire OS. This is possible to do with a checkbox that is marked as "beta" by MS. So having Make return the output of GetACP would be useful because it would capture all these scenarios, including it having been built with the manifest of course. 
This is an example from ninja doing the same thing: https://github.com/ninja-build/ninja/pull/1918

They did it by adding a new flag, so not part of --version. I like putting it in --version better because this does sound like the type of information one would expect to see there, the only problem being that it may not be easy to parse, unless we add it in a standard easy-to-parse format, like in a line of its own as:

<--version output>
...
Encoding: 65001
or
Encoding: 1252
...

But it isn't "full UTF-8", as we have established during the discussions. MS-Windows is not yet ready for that, even in its latest versions.

Yes, well, I guess what I really meant was "to the extent that Make itself can affect things". We can't do anything about Windows limitations anyway. I think there should be no problems at all if Make is linked against UCRT, and I hope that we won't hit many functions in MSVCRT that are not UTF-8-aware, but we couldn't do anything about these anyway.

> That is, I am more inclined to make the configure version also error
> out if windres was not found, than to make build_w32.bat optional.

I'm of the opposite opinion.

Sure, it shouldn't be hard to change build_w32.bat to make it optional. There is no precedent of such behavior in that batch file, as far as I can tell, so I'd have to come up with something like:

if HAVE_WINDRES
  compile resource file
else
  just skip resource file with no error, and don't try to link it
end

Then this is a non-issue: the error will not happen except in some situations we cannot imagine.

That's what I think, but I could be wrong since I really can't imagine all the build environments people might be using, and, as I learned from this thread, there are quite a few ways to build so it's hard to anticipate for every possible scenario.
Maybe it is best to just make compilation of the resource file optional, but, very importantly, with the addition of your suggestion about printing the encoding used in --version. That way, everyone will manage to build as they currently do with no surprises about this, and they will also be able to query Make any time for its encoding.

I'd like to avoid annoying users with requests to install something they did well without previously. Some would consider this a regression.

Makes sense, see above response.

On Tue, 28 Mar 2023 at 12:22, Eli Zaretskii wrote:
> > From: Costas Argyris
> > Date: Mon, 27 Mar 2023 23:04:52 +0100
> > Cc: bug-make@gnu.org
> >
> > > Should we fail here? Or should we build without UTF-8 support since we
> > > don't have a resource compiler? I think that's what the configure
> > > version does, right?
> >
> > You are right, that was an inconsistency on my part, sorry about that.
> > It's true that the configure version is optional on this, whereas
> > build_w32.bat errors out.
> >
> > I think the answer depends on what is going to be the policy regarding
> > Make on Windows and UTF-8. If we want to claim that Make on Windows
> > has gone UTF-8, matching fully the Unix-based platforms, then it has to
> > be an error if it can't be built as such. My personal opinion is that this
> > is the way forward, because it may be confusing if we end up in a
> > situation where some users have a UTF-8 version of Make and some
> > others don't.
>
> Being able to know whether UTF-8 is
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris
> Date: Mon, 27 Mar 2023 23:04:52 +0100
> Cc: bug-make@gnu.org
>
> > Should we fail here? Or should we build without UTF-8 support since we
> > don't have a resource compiler? I think that's what the configure
> > version does, right?
>
> You are right, that was an inconsistency on my part, sorry about that.
> It's true that the configure version is optional on this, whereas
> build_w32.bat errors out.
>
> I think the answer depends on what is going to be the policy regarding
> Make on Windows and UTF-8. If we want to claim that Make on Windows
> has gone UTF-8, matching fully the Unix-based platforms, then it has to
> be an error if it can't be built as such. My personal opinion is that this
> is the way forward, because it may be confusing if we end up in a
> situation where some users have a UTF-8 version of Make and some
> others don't.

Being able to know whether UTF-8 is supported or not is a valid concern. How about adding this information to what "make --version" shows?

> I think just go full UTF-8 like the other systems.

But it isn't "full UTF-8", as we have established during the discussions. MS-Windows is not yet ready for that, even in its latest versions.

> Of course, users on versions of Windows earlier than the target version
> that supports this feature still won't get UTF-8, but that would be because
> of their version of Windows, not because of the way Make was built.

Right.

> That is, I am more inclined to make the configure version also error
> out if windres was not found, than to make build_w32.bat optional.

I'm of the opposite opinion.
> This is mostly based on the fact that windres is part of binutils which is
> pretty much ubiquitous because gcc itself relies on its tools (most
> notably the assembler and linker). So if someone is building with
> gcc, they will almost certainly already have windres. For building
> with MSVC that's a non-issue because MSVC comes bundled with its
> own resource compiler, so it is always going to be there.

Then this is a non-issue: the error will not happen except in some situations we cannot imagine.

> So I think it is reasonable to expect that there is always going to be a
> resource compiler available. Even if not, say, when building with tcc,
> it is always possible to error out with a message saying to install binutils.

I'd like to avoid annoying users with requests to install something they did well without previously. Some would consider this a regression.
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Paul Smith
> Cc: bug-make@gnu.org
> Date: Mon, 27 Mar 2023 16:35:42 -0400
>
> +:FindWindres
> +:: Show the resource compiler version that we found
> +echo.
> +call %RC% --version
> +if not ERRORLEVEL 1 goto Build
> +echo No %RC% found.
> +exit 1
>
> Should we fail here? Or should we build without UTF-8 support since we
> don't have a resource compiler? I think that's what the configure
> version does, right?

This is up to you and Eli, I can see both sides.

I think building without UTF-8 support is better than a total failure, yes.

Paul, what about the changes to the Posix configury? I don't feel I'm familiar enough with how building GNU Make uses that, so I'd like your approval before I install that part of the patch, lest I somehow break the other systems.
Re: [PATCH] Use UTF-8 active code page for Windows host.
Thanks for looking into this, and good to hear that you are feeling better! I still can't get Basic.mk to work, so not much to say on that...

Should we fail here? Or should we build without UTF-8 support since we don't have a resource compiler? I think that's what the configure version does, right?

You are right, that was an inconsistency on my part, sorry about that. It's true that the configure version is optional on this, whereas build_w32.bat errors out.

I think the answer depends on what is going to be the policy regarding Make on Windows and UTF-8. If we want to claim that Make on Windows has gone UTF-8, matching fully the Unix-based platforms, then it has to be an error if it can't be built as such. My personal opinion is that this is the way forward, because it may be confusing if we end up in a situation where some users have a UTF-8 version of Make and some others don't. I think just go full UTF-8 like the other systems.

Of course, users on versions of Windows earlier than the target version that supports this feature still won't get UTF-8, but that would be because of their version of Windows, not because of the way Make was built.

That is, I am more inclined to make the configure version also error out if windres was not found, than to make build_w32.bat optional. This is mostly based on the fact that windres is part of binutils which is pretty much ubiquitous because gcc itself relies on its tools (most notably the assembler and linker). So if someone is building with gcc, they will almost certainly already have windres. For building with MSVC that's a non-issue because MSVC comes bundled with its own resource compiler, so it is always going to be there.

So I think it is reasonable to expect that there is always going to be a resource compiler available. Even if not, say, when building with tcc, it is always possible to error out with a message saying to install binutils.
For this (and the GCC version) shouldn't we be testing for failure and failing the build if it doesn't work?

This is done by the existing CompileDone subroutine. The code you flagged and the GCC version right under it (and the TCC version for that matter) call CompileDone as their last command, which is an existing function that checks for the presence of the compiled object file and errors out if it wasn't found.

On Mon, 27 Mar 2023 at 21:35, Paul Smith wrote:
> I apologize for being incommunicado: for the last week I've been
> fighting the mother of all head colds. The only good thing I can say
> about it is that it was not COVID.
>
> I'm sort of back on my feet.
>
> On Sat, 2023-03-25 at 23:12 +, Costas Argyris wrote:
> > and then from README.W32 -> Building with (MinGW-)
> > GCC using GNU Make:
> >
> > make -f Basic.mk
> > or
> > make -f Basic.mk TOOLCHAIN=gcc
>
> This makefile environment relies on the value of the MAKE_HOST variable
> being some supported value. Since I never try to build with minGW, I
> have no idea what the MAKE_HOST variable will be set to and that
> probably means it's not working. Definitely the makefiles should
> behave better, rather than just silently failing to find the
> configuration and failing later.
>
> It will work if you use a GNU make built using build_w32.bat (or
> anyway, it does for me). Also, that script will copy Basic.mk to
> Makefile so once you run it you can just run the make executable you
> just built, without needing "-f Basic.mk", if you want.
>
> Regarding the patch:
>
> +:FindWindres
> +:: Show the resource compiler version that we found
> +echo.
> +call %RC% --version
> +if not ERRORLEVEL 1 goto Build
> +echo No %RC% found.
> +exit 1
>
> Should we fail here? Or should we build without UTF-8 support since we
> don't have a resource compiler? I think that's what the configure
> version does, right?

This is up to you and Eli, I can see both sides.
>
> +:: MSVC Resource Compile
> +if "%VERBOSE%" == "Y" echo on
> +call %RC% /fo %OUTDIR%\%1.%O% %1.rc
> +@echo off
> +goto CompileDone
>
> For this (and the GCC version) shouldn't we be testing for failure and
> failing the build if it doesn't work?
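For reference, an explicit exit-status check could be bolted onto the quoted batch step roughly as below. This is a hypothetical sketch only; the existing CompileDone subroutine already errors out when the expected object file is missing, so this would merely fail a little earlier and with a more specific message.

```bat
:: Hypothetical variant: fail fast on the resource compiler's own
:: exit status instead of relying solely on the object-file check.
if "%VERBOSE%" == "Y" echo on
call %RC% /fo %OUTDIR%\%1.%O% %1.rc
@echo off
if ERRORLEVEL 1 echo Resource compilation of %1.rc failed.& exit 1
goto CompileDone
```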
Re: [PATCH] Use UTF-8 active code page for Windows host.
I apologize for being incommunicado: for the last week I've been fighting the mother of all head colds. The only good thing I can say about it is that it was not COVID.

I'm sort of back on my feet.

On Sat, 2023-03-25 at 23:12 +, Costas Argyris wrote:
> and then from README.W32 -> Building with (MinGW-)
> GCC using GNU Make:
>
> make -f Basic.mk
> or
> make -f Basic.mk TOOLCHAIN=gcc

This makefile environment relies on the value of the MAKE_HOST variable being some supported value. Since I never try to build with minGW, I have no idea what the MAKE_HOST variable will be set to and that probably means it's not working. Definitely the makefiles should behave better, rather than just silently failing to find the configuration and failing later.

It will work if you use a GNU make built using build_w32.bat (or anyway, it does for me). Also, that script will copy Basic.mk to Makefile so once you run it you can just run the make executable you just built, without needing "-f Basic.mk", if you want.

Regarding the patch:

+:FindWindres
+:: Show the resource compiler version that we found
+echo.
+call %RC% --version
+if not ERRORLEVEL 1 goto Build
+echo No %RC% found.
+exit 1

Should we fail here? Or should we build without UTF-8 support since we don't have a resource compiler? I think that's what the configure version does, right?

+:: MSVC Resource Compile
+if "%VERBOSE%" == "Y" echo on
+call %RC% /fo %OUTDIR%\%1.%O% %1.rc
+@echo off
+goto CompileDone

For this (and the GCC version) shouldn't we be testing for failure and failing the build if it doesn't work?
Re: [PATCH] Use UTF-8 active code page for Windows host.
Here is the new patch that includes the changes in build_w32.bat required to build GNU Make with UTF-8 on Windows. I have tested it with all 3 supported toolchains:

.\build_w32.bat (uses MSVC)
.\build_w32.bat gcc
.\build_w32.bat tcc

with/without --debug and with/without --x86 for MSVC. In all cases, it produces the expected UTF-8 Make executable. gcc and tcc must find 'windres' from binutils to compile the UTF-8 resource file into an object file.

I have also tested the Autotools approach (on Windows this time, like the above, in contrast to the cross-compile I did initially). From an MSYS2 shell:

./bootstrap (generates configure)
./configure (generates Makefiles)
make MAKE_MAINTAINER_MODE= MAKE_CFLAGS= -j6

and we get a UTF-8 make.exe in the source dir.

I also gave the Basic.mk approach a shot and, despite my best efforts, I couldn't get it to work with either toolchain. There were just so many problems that it seems broken to me, unless I wasn't using it correctly. I followed the steps in README.git -> Building From Git for Windows:

.\bootstrap.bat

and then from README.W32 -> Building with (MinGW-) GCC using GNU Make:

make -f Basic.mk
or
make -f Basic.mk TOOLCHAIN=gcc

but I was hitting some pretty fundamental errors like CC not being defined; I manually defined it and then got tons of link errors, at which point I gave up. If these are functional and we want to support them with UTF-8, I would need some help to get them working first.

For convenience, I have carried the previous changes along so this patch has everything done so far, that is, the changes in both the Autotools approach (tested both in cross & native compile) and the build_w32.bat approach. Also, this time everything was done from the current master branch of the Git repo (disabling the maintainer mode when building allowed me to build from Git by not turning warnings into errors).
On Sat, 25 Mar 2023 at 13:36, Eli Zaretskii wrote:
> > From: Costas Argyris
> > Date: Sat, 25 Mar 2023 13:19:00 +
> > Cc: psm...@gnu.org, bug-make@gnu.org
> >
> > > Also, I'd name the files slightly differently, something like
> > > w32-utf8.*, to make their relation to Windows more evident.
> >
> > Note that in the patch, these files are put under the w32 dir:
> >
> > src/w32/utf8.manifest
> > src/w32/utf8.rc
> >
> > so their relation to Windows is already showing from that.
>
> I'll leave that to Paul.
>
> > 1) In build_w32.bat, there are 3 compilers supported:
> > MSVC, gcc and tcc (Tiny C Compiler). I can see the changes
> > working for MSVC and gcc, but what about tcc?
> >
> > AFAICT, it has no resource compiler to compile the .rc file
> > to an object file. It can link against the object file though,
> > assuming that it was produced by windres in the first place:
> >
> > https://github.com/TinyCC/tinycc/blob/mob/win32/tcc-win32.txt#L92
> >
> > But if one has tcc on its own and no windres, it doesn't
> > seem possible to do it, so we need to decide if we are
> > still going to build but without UTF-8 support, or error
> > out, or try to find windres for compiling the .rc file and
> > still use tcc for the rest (kind of a mixed approach).
>
> Either the TCC build will not support this feature, or installing the
> manifest (at least in some cases) is not such a bad idea after all.
>
> > 2) From README.W32, there is another way to build
> >
> > Basic.mk
> >
> > which sources mk/Windows32.mk when building for Windows.
> >
> > mk/Windows32.mk has a TOOLCHAIN variable that can be
> > either "msvc" or "gcc" (no tcc option here).
> >
> > Should mk/Windows32.mk also be updated for UTF-8 for
> > both toolchains?
>
> I don't know what is the status of these *.mk files and whether we
> want to keep supporting them. I have never used them. Paul?

0001-Use-UTF-8-active-code-page-when-building-for-Windows.patch
Description: Binary data
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris
> Date: Sat, 25 Mar 2023 13:19:00 +
> Cc: psm...@gnu.org, bug-make@gnu.org
>
> > Also, I'd name the files slightly differently, something like
> > w32-utf8.*, to make their relation to Windows more evident.
>
> Note that in the patch, these files are put under the w32 dir:
>
> src/w32/utf8.manifest
> src/w32/utf8.rc
>
> so their relation to Windows is already showing from that.

I'll leave that to Paul.

> 1) In build_w32.bat, there are 3 compilers supported:
> MSVC, gcc and tcc (Tiny C Compiler). I can see the changes
> working for MSVC and gcc, but what about tcc?
>
> AFAICT, it has no resource compiler to compile the .rc file
> to an object file. It can link against the object file though,
> assuming that it was produced by windres in the first place:
>
> https://github.com/TinyCC/tinycc/blob/mob/win32/tcc-win32.txt#L92
>
> But if one has tcc on its own and no windres, it doesn't
> seem possible to do it, so we need to decide if we are
> still going to build but without UTF-8 support, or error
> out, or try to find windres for compiling the .rc file and
> still use tcc for the rest (kind of a mixed approach).

Either the TCC build will not support this feature, or installing the manifest (at least in some cases) is not such a bad idea after all.

> 2) From README.W32, there is another way to build
>
> Basic.mk
>
> which sources mk/Windows32.mk when building for Windows.
>
> mk/Windows32.mk has a TOOLCHAIN variable that can be
> either "msvc" or "gcc" (no tcc option here).
>
> Should mk/Windows32.mk also be updated for UTF-8 for
> both toolchains?

I don't know what is the status of these *.mk files and whether we want to keep supporting them. I have never used them. Paul?
Re: [PATCH] Use UTF-8 active code page for Windows host.
Just some comments on the non-configury questions:

Also, I'd name the files slightly differently, something like w32-utf8.*, to make their relation to Windows more evident.

Note that in the patch, these files are put under the w32 dir:

src/w32/utf8.manifest
src/w32/utf8.rc

so their relation to Windows is already showing from that. Most of the files under src/w32/ don't repeat the 'w32' in their name, but some do, so I can see this going either way.

Finally, would we want to install the manifest file together with the executable, and if so, should its installation name be make.exe.manifest?

I think the answer here is 'no'. The manifest file gets embedded in the executable, so it is literally part of it, so there is no need to have it next to it as a separate file. In fact, that is the strong point of this approach, that the manifest gets embedded at build time so the user doesn't know or see anything about it and Make just works in UTF-8 encoding. Having embedded the manifest in the executable, I am not even sure if it's a good idea to repeat the manifest by having it next to the executable as a separate file (I guess Windows just ends up ignoring it anyway).

On the remaining work: I have started to do some work on bringing the changes over to build_w32.bat. This is much simpler than the configure approach so it shouldn't take long, I do have a couple of general questions though:

1) In build_w32.bat, there are 3 compilers supported: MSVC, gcc and tcc (Tiny C Compiler). I can see the changes working for MSVC and gcc, but what about tcc?
AFAICT, it has no resource compiler to compile the .rc file to an object file. It can link against the object file though, assuming that it was produced by windres in the first place:

https://github.com/TinyCC/tinycc/blob/mob/win32/tcc-win32.txt#L92

But if one has tcc on its own and no windres, it doesn't seem possible to do it, so we need to decide if we are still going to build but without UTF-8 support, or error out, or try to find windres for compiling the .rc file and still use tcc for the rest (kind of a mixed approach).

2) From README.W32, there is another way to build

Basic.mk

which sources mk/Windows32.mk when building for Windows.

mk/Windows32.mk has a TOOLCHAIN variable that can be either "msvc" or "gcc" (no tcc option here).

Should mk/Windows32.mk also be updated for UTF-8 for both toolchains?

On Sat, 25 Mar 2023 at 12:09, Eli Zaretskii wrote:
> > From: Costas Argyris
> > Date: Tue, 21 Mar 2023 15:08:52 +
> > Cc: bug-make@gnu.org, Paul Smith
> >
> > > You can submit diffs against the last released version here as well.
> >
> > In that case, I am simply re-attaching the patch I originally sent in
> > this thread, because that was already developed and built on 4.4.1
> > tarball which is still the latest AFAICT.
> >
> > Just reminding that these changes are in Makefile.am and configure.ac
> > so you would have to build using that approach to actually get a
> > UTF-8 Make on Windows host.
> >
> > The other two files of the patch, utf8.manifest and utf8.rc will be
> > useful for the build_w32.bat approach as well because they will
> > be reused by it (I don't see a reason why not).
>
> OK.
>
> Paul, I'd appreciate your review as well, as I'm less familiar with
> the Posix configury of Make, and could easily miss some subtle issue.
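For readers following along, the UTF-8 manifest that utf8.rc embeds is tiny; Microsoft's documentation on UTF-8 code pages in Windows apps gives it essentially this shape (reproduced here as an illustration - the patch's actual src/w32/utf8.manifest may differ in details):

```
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <application>
    <windowsSettings>
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>
```

The .rc file then only needs a single resource statement referencing the manifest (resource type 24 is RT_MANIFEST; the exact id used by the patch isn't shown in this thread), e.g. `1 24 "utf8.manifest"`.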
> > Looking at the patch now, I have a few minor comments:
>
> > --- a/Makefile.am
> > +++ b/Makefile.am
> > @@ -46,6 +46,8 @@ w32_SRCS = src/w32/pathstuff.c src/w32/w32os.c src/w32/compat/dirent.c \
> > src/w32/subproc/misc.c src/w32/subproc/proc.h \
> > src/w32/subproc/sub_proc.c src/w32/subproc/w32err.c
> >
> > +w32_utf8_SRCS = src/w32/utf8.rc src/w32/utf8.manifest
> > +
> > vms_SRCS = src/vms_exit.c src/vms_export_symbol.c src/vms_progname.c \
> > src/vmsdir.h src/vmsfunctions.c src/vmsify.c
> >
> > @@ -90,6 +92,14 @@ else
> >make_SOURCES += src/posixos.c
> > endif
> >
> > +if HAVE_WINDRES
> > + UTF8OBJ = src/w32/utf8.$(OBJEXT)
> > + make_LDADD += $(UTF8OBJ)
> > +endif
> > +
> > +$(UTF8OBJ) : $(w32_utf8_SRCS)
> > + $(WINDRES) $< -o $@
> > +
> > if USE_CUSTOMS
> >make_SOURCES += src/remote-cstms.c
> > else
> > diff --git a/configure.ac b/configure.ac
> > index cd78575..8cbf986 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -444,6 +444,7 @@ AC_SUBST([MAKE_HOST])
> >
> > w32_target_env=no
> > AM_CONDITIONAL([WINDOWSENV], [false])
> > +AM_CONDITIONAL([HAVE_WINDRES], [false])
> >
> > AS_CASE([$host],
> >[*-*-mingw32],
> > @@ -451,6 +452,10 @@ AS_CASE([$host],
> > w32_target_env=yes
> > AC_DEFINE([WINDOWS32], [1], [Build for the WINDOWS32 API.])
> > AC_DEFINE([HAVE_DOS_PATHS], [1], [Support DOS-style pathnames.])
> > +# Windows host tools.
> > +# If windres is available, make will use UTF-8.
> > +AC_CHECK_TOOL([WINDRES], [windres], [:])
> > +AM_CONDITIONAL([HAVE_WINDRES], [test
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris
> Date: Tue, 21 Mar 2023 15:08:52 +
> Cc: bug-make@gnu.org, Paul Smith
>
> > You can submit diffs against the last released version here as well.
>
> In that case, I am simply re-attaching the patch I originally sent in
> this thread, because that was already developed and built on 4.4.1
> tarball which is still the latest AFAICT.
>
> Just reminding that these changes are in Makefile.am and configure.ac
> so you would have to build using that approach to actually get a
> UTF-8 Make on Windows host.
>
> The other two files of the patch, utf8.manifest and utf8.rc will be
> useful for the build_w32.bat approach as well because they will
> be reused by it (I don't see a reason why not).

OK.

Paul, I'd appreciate your review as well, as I'm less familiar with the Posix configury of Make, and could easily miss some subtle issue.

Looking at the patch now, I have a few minor comments:

> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -46,6 +46,8 @@ w32_SRCS = src/w32/pathstuff.c src/w32/w32os.c src/w32/compat/dirent.c \
> src/w32/subproc/misc.c src/w32/subproc/proc.h \
> src/w32/subproc/sub_proc.c src/w32/subproc/w32err.c
>
> +w32_utf8_SRCS = src/w32/utf8.rc src/w32/utf8.manifest
> +
> vms_SRCS = src/vms_exit.c src/vms_export_symbol.c src/vms_progname.c \
> src/vmsdir.h src/vmsfunctions.c src/vmsify.c
>
> @@ -90,6 +92,14 @@ else
>make_SOURCES += src/posixos.c
> endif
>
> +if HAVE_WINDRES
> + UTF8OBJ = src/w32/utf8.$(OBJEXT)
> + make_LDADD += $(UTF8OBJ)
> +endif
> +
> +$(UTF8OBJ) : $(w32_utf8_SRCS)
> + $(WINDRES) $< -o $@
> +
> if USE_CUSTOMS
>make_SOURCES += src/remote-cstms.c
> else
> diff --git a/configure.ac b/configure.ac
> index cd78575..8cbf986 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -444,6 +444,7 @@ AC_SUBST([MAKE_HOST])
>
> w32_target_env=no
> AM_CONDITIONAL([WINDOWSENV], [false])
> +AM_CONDITIONAL([HAVE_WINDRES], [false])
>
> AS_CASE([$host],
>[*-*-mingw32],
> @@ -451,6 +452,10 @@ AS_CASE([$host],
> w32_target_env=yes
AC_DEFINE([WINDOWS32], [1], [Build for the WINDOWS32 API.])
> AC_DEFINE([HAVE_DOS_PATHS], [1], [Support DOS-style pathnames.])
> +# Windows host tools.
> +# If windres is available, make will use UTF-8.
> +AC_CHECK_TOOL([WINDRES], [windres], [:])
> +AM_CONDITIONAL([HAVE_WINDRES], [test "$WINDRES" != ':'])
>])
>
> AC_DEFINE_UNQUOTED([PATH_SEPARATOR_CHAR],['$PATH_SEPARATOR'],

I think instead of using "if HAVE_WINDRES" it would be better to have UTF8OBJ be computed by configure, leaving it empty on builds that don't target Windows. That's because the AC_CHECK_TOOL test for 'windres' might not be the best future-proof test: users could have that installed for reasons unrelated to the build target.

If my suggestion is accepted, the make_LDADD addition will not be needed; instead make_LDADD will always mention $(UTF8OBJ), and the value will be empty when not building for Windows.

Also, I'd name the files slightly differently, something like w32-utf8.*, to make their relation to Windows more evident.

Finally, would we want to install the manifest file together with the executable, and if so, should its installation name be make.exe.manifest?
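Eli's suggestion of computing UTF8OBJ in configure, rather than gating it behind an Automake conditional, would look roughly like this in configure.ac (a sketch, not the committed change; the file and variable names are carried over from the patch for illustration):

```
# Default: no UTF-8 resource object on non-Windows targets.
UTF8OBJ=
AS_CASE([$host],
  [*-*-mingw32],
   [# Windows host tools: if windres is available, make will use UTF-8.
    AC_CHECK_TOOL([WINDRES], [windres], [:])
    AS_IF([test "$WINDRES" != ':'],
          [UTF8OBJ='src/w32/utf8.$(OBJEXT)'])])
AC_SUBST([UTF8OBJ])
```

Makefile.am would then always list $(UTF8OBJ) in make_LDADD; the substitution is empty except on a mingw32 host with windres available, so the check only fires when the build actually targets Windows.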
Re: [PATCH] Use UTF-8 active code page for Windows host.
Yes I eventually found out about the maintainer mode by searching for where -Werror is coming from, that's how I ended up using the tarball (because there is no maintainer mode there). Didn't know about README.git though - this does have a lot of info indeed, thanks! On Tue, 21 Mar 2023 at 17:24, Paul Smith wrote: > On Tue, 2023-03-21 at 12:52 +, Costas Argyris wrote: > > When trying from git, which was my first attempt, I was getting > > compilation warnings which were turning themselves into errors, > > so I never managed to build. > > > > When I used the sources extracted from the tarball though, this > > simply wasn't the case so I was able to cross-compile just fine. > > FYI when you build from Git, the "maintainer mode" is enabled by > default. This includes extra runtime checks which make "make" slower, > and it also includes extreme compiler warning options, and also adds > the -Werror to turn those into failures (when you're using autotools). > > You can disable this mode, even when compiling from Git, if you need > to. Instructions for building from Git should be available in the > README.git file (this is only part of a Git checkout, it's not > available in the release). > > However as Eli says it's usually not a big deal to apply patches from a > source release, to the Git repository ourselves. >
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Tue, 2023-03-21 at 12:52 +, Costas Argyris wrote: > When trying from git, which was my first attempt, I was getting > compilation warnings which were turning themselves into errors, > so I never managed to build. > > When I used the sources extracted from the tarball though, this > simply wasn't the case so I was able to cross-compile just fine. FYI when you build from Git, the "maintainer mode" is enabled by default. This includes extra runtime checks which make "make" slower, and it also includes extreme compiler warning options, and also adds the -Werror to turn those into failures (when you're using autotools). You can disable this mode, even when compiling from Git, if you need to. Instructions for building from Git should be available in the README.git file (this is only part of a Git checkout, it's not available in the release). However as Eli says it's usually not a big deal to apply patches from a source release, to the Git repository ourselves.
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Tue, 21 Mar 2023 15:08:52 + Costas Argyris wrote: > I am simply re-attaching the patch I originally sent in > this thread, because that was already developed and built on 4.4.1 > tarball which is still the latest AFAICT. Yes, at the time of this writing version 4.4.1 is the latest release of make and that version with tag 4.4.1 is also the latest commit to master on the git repo. regards Henrik
Re: [PATCH] Use UTF-8 active code page for Windows host.
You can submit diffs against the last released version here as well.

In that case, I am simply re-attaching the patch I originally sent in this thread, because that was already developed and built on 4.4.1 tarball which is still the latest AFAICT.

Just reminding that these changes are in Makefile.am and configure.ac so you would have to build using that approach to actually get a UTF-8 Make on Windows host.

The other two files of the patch, utf8.manifest and utf8.rc will be useful for the build_w32.bat approach as well because they will be reused by it (I don't see a reason why not).

I'm not aware of it, but then I don't cross-build Make, and rarely build it from Git anyway.

I think it's more of a 'build from Git' problem than a cross-compile problem. There is a maintMakefile in the root which only exists in the Git repository and not in the tarball:

https://git.savannah.gnu.org/cgit/make.git/tree/maintMakefile

On line 32 it sets -Werror so I think that's what I am facing.

In that case, as I said, my patch was already from the 4.4.1 tarball so it should be OK.

On Tue, 21 Mar 2023 at 13:32, Eli Zaretskii wrote:
> > From: Costas Argyris
> > Date: Tue, 21 Mar 2023 12:52:54 +
> > Cc: bug-make@gnu.org, Paul Smith
> >
> > > If so, could you please post it again, rebased on the current Git
> > > master?
> >
> > There is an issue here: I noticed that when I was trying to build
> > (cross-compile) Make for Windows using a gcc + mingw-w64
> > cross-compiler (using Autotools, not build_w32.bat), there was a
> > big difference depending on whether I was using the Make source
> > code from git or the tarball.
> >
> > When trying from git, which was my first attempt, I was getting
> > compilation warnings which were turning themselves into errors,
> > so I never managed to build.
> >
> > When I used the sources extracted from the tarball though, this
> > simply wasn't the case so I was able to cross-compile just fine.
> >
> > Then the problem was how to track my changes, since I don't
> > have git any more. What I did just to get a patch posted was
> > to simply 'git init' a repository in the extracted sources just so
> > I could use 'git diff' and so forth. That way, I created the patch
> > I originally posted.
> >
> > The problem now is that in order to rebase on the current Git
> > master, I'd have to use Git, so I'll fall back to the original problem
> > of not being able to build because of the warnings being treated
> > as errors.
>
> If this gives you so much trouble, just post the diffs against the
> last release's tarball, and I will take it from there.
>
> > Is this a known issue?
>
> I'm not aware of it, but then I don't cross-build Make, and rarely
> build it from Git anyway.
>
> > Or is it just that not many people are
> > cross-compiling for Windows using gcc and autotools (i.e. not
> > using build_w32.bat)?
>
> Not sure. I think the MSYS2 folks do use the configury, but they use
> it on Windows, not on GNU/Linux (i.e., they build natively, not by
> cross-compiling).
>
> > That being said, I haven't tried to build the Git source code
> > using build_w32.bat, so for all I know maybe that doesn't
> > lead to the warnings/errors I got with autotools (mostly because
> > that would be a native compile, as opposed to the cross-compile
> > I am doing with autotools).
>
> You can submit diffs against the last released version here as well.
>
> > > And would you please consider working on changing build_w32.bat as
> > > well?
> >
> > Absolutely, I haven't forgotten about this. I haven't looked into
> > that file at all though, so I don't know how it configures its Makefiles,
> > how it detects the toolchain etc, so I may need some help to speed
> > things up. But definitely planning to do so in the coming days, or
> > weekend at worst.
>
> Thanks. Please don't hesitate to ask questions.
> From f042d4b82111624dfe84bb85758c9b61c76ece5f Mon Sep 17 00:00:00 2001 From: Costas Argyris Date: Sat, 18 Mar 2023 14:13:21 + Subject: [PATCH] Use UTF-8 active code page for Windows host. This allows the make process on Windows to work in UTF-8 in multiple levels: 1) Accept a Makefile that is encoded in UTF-8 (with or without the BOM, since it already gets ignored anyway). 2) Accept a UTF-8 path (input -f argument) to a Makefile (that could itself be encoded in UTF-8, as per #1). 3) Launch make from a current directory that has UTF-8 characters. and any combination of the above, since the entire make process will use UTF-8. This is already the case in Linux-based systems, but on Windows this change is required in order to support Unicode because the "A" APIs currently used will assume the legacy system code page, destroying any UTF-8 input. This change sets the code page to be used by the "A" APIs to the UTF-8 code page, thereby eliminating the need to update all calls of "A" functions to "W" functions to support Unicode. That is, the source code can stay the same with the "A"
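For context, a manifest that opts a process into the UTF-8 active code page follows Microsoft's documented activeCodePage application-manifest setting; the utf8.manifest file in the patch presumably looks something like this (this is the documented shape, not necessarily the exact file from the patch):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <application>
    <windowsSettings>
      <!-- Force the process's active code page (ACP) to UTF-8 on
           Windows 10 1903+; older Windows versions ignore this setting. -->
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>
```

A companion .rc file then typically embeds it as the application manifest resource (e.g. a line like 1 RT_MANIFEST "utf8.manifest"), which the resource compiler (rc.exe or windres) turns into the object that gets linked into make.exe.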
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Tue, 21 Mar 2023 12:52:54 + > Cc: bug-make@gnu.org, Paul Smith > > > If so, could you please post it again, rebased on the current Git > > master? > > There is an issue here: I noticed that when I was trying to build > (cross-compile) Make for Windows using a gcc + mingw-w64 > cross-compiler (using Autotools, not build_w32.bat), there was a > big difference depending on whether I was using the Make source > code from git or the tarball. > > When trying from git, which was my first attempt, I was getting > compilation warnings which were turning themselves into errors, > so I never managed to build. > > When I used the sources extracted from the tarball though, this > simply wasn't the case so I was able to cross-compile just fine. > > Then the problem was how to track my changes, since I don't > have git any more. What I did just to get a patch posted was > to simply 'git init' a repository in the extracted sources just so > I could use 'git diff' and so forth. That way, I created the patch > I originally posted. > > The problem now is that in order to rebase on the current Git > master, I'd have to use Git, so I'll fall back to the original problem > of not being able to build because of the warnings being treated > as errors. If this gives you so much trouble, just post the diffs against the last release's tarball, and I will take it from there. > Is this a known issue? I'm not aware of it, but then I don't cross-build Make, and rarely build it from Git anyway. > Or is it just that not many people are > cross-compiling for Windows using gcc and autotools (i.e. not > using build_w32.bat)? Not sure. I think the MSYS2 folks do use the configury, but they use it on Windows, not on GNU/Linux (i.e., they build natively, not by cross-compiling). 
> That being said, I haven't tried to build the Git source code > using build_w32.bat, so for all I know maybe that doesn't > lead to the warnings/errors I got with autotools (mostly because > that would be a native compile, as opposed to the cross-compile > I am doing with autotools). You can submit diffs against the last released version here as well. > > And would you please consider working on changing build_w32.bat as > > well? > > Absolutely, I haven't forgotten about this. I haven't looked into > that file at all though, so I don't know how it configures its Makefiles, > how it detects the toolchain etc, so I may need some help to speed > things up. But definitely planning to do so in the coming days, or > weekend at worst. Thanks. Please don't hesitate to ask questions.
Re: [PATCH] Use UTF-8 active code page for Windows host.
If so, could you please post it again, rebased on the current Git master? There is an issue here: I noticed that when I was trying to build (cross-compile) Make for Windows using a gcc + mingw-w64 cross-compiler (using Autotools, not build_w32.bat), there was a big difference depending on whether I was using the Make source code from git or the tarball. When trying from git, which was my first attempt, I was getting compilation warnings which were turning themselves into errors, so I never managed to build. When I used the sources extracted from the tarball though, this simply wasn't the case so I was able to cross-compile just fine. Then the problem was how to track my changes, since I don't have git any more. What I did just to get a patch posted was to simply 'git init' a repository in the extracted sources just so I could use 'git diff' and so forth. That way, I created the patch I originally posted. The problem now is that in order to rebase on the current Git master, I'd have to use Git, so I'll fall back to the original problem of not being able to build because of the warnings being treated as errors. Is this a known issue? Or is it just that not many people are cross-compiling for Windows using gcc and autotools (i.e. not using build_w32.bat)? That being said, I haven't tried to build the Git source code using build_w32.bat, so for all I know maybe that doesn't lead to the warnings/errors I got with autotools (mostly because that would be a native compile, as opposed to the cross-compile I am doing with autotools). And would you please consider working on changing build_w32.bat as well? Absolutely, I haven't forgotten about this. I haven't looked into that file at all though, so I don't know how it configures its Makefiles, how it detects the toolchain etc, so I may need some help to speed things up. But definitely planning to do so in the coming days, or weekend at worst. 
On Tue, 21 Mar 2023 at 03:23, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Mon, 20 Mar 2023 21:47:27 + > > Cc: bug-make@gnu.org, Paul Smith > > > > Any thoughts for next steps then? > > Is the patch ready to be installed? > > If so, could you please post it again, rebased on the current Git > master? > > And would you please consider working on changing build_w32.bat as > well? > > Thanks. >
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Mon, 20 Mar 2023 21:47:27 + > Cc: bug-make@gnu.org, Paul Smith > > Any thoughts for next steps then? Is the patch ready to be installed? If so, could you please post it again, rebased on the current Git master? And would you please consider working on changing build_w32.bat as well? Thanks.
Re: [PATCH] Use UTF-8 active code page for Windows host.
We won't know, not unless and until some user complains and debugging shows this is the cause. But we can warn about this issue in the documentation up front, so that people don't raise their expectations too high. Makes sense. Any thoughts for next steps then? On Mon, 20 Mar 2023 at 16:17, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Mon, 20 Mar 2023 14:58:39 + > > Cc: bug-make@gnu.org, Paul Smith > > > > Still my concern would be: assuming that we actually learn something > > from this test, how would we know: > > > > 1) Which other functions besides stricmp are affected? Maybe > > letter-case is not the only problematic aspect. > > 2) Which of the above (#1) set of functions are actually called from > > Make source code, directly or indirectly? > > 3) Which of the above (#2) set of functions could be called with UTF-8 > > input that would cause them to break? > > We won't know, not unless and until some user complains and debugging > shows this is the cause. But we can warn about this issue in the > documentation up front, so that people don't raise their expectations > too high. >
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Mon, 20 Mar 2023 14:58:39 + > Cc: bug-make@gnu.org, Paul Smith > > Still my concern would be: assuming that we actually learn something > from this test, how would we know: > > 1) Which other functions besides stricmp are affected? Maybe > letter-case is not the only problematic aspect. > 2) Which of the above (#1) set of functions are actually called from > Make source code, directly or indirectly? > 3) Which of the above (#2) set of functions could be called with UTF-8 > input that would cause them to break? We won't know, not unless and until some user complains and debugging shows this is the cause. But we can warn about this issue in the documentation up front, so that people don't raise their expectations too high.
Re: [PATCH] Use UTF-8 active code page for Windows host.
It's not easy, AFAICS in the Make sources. Maybe it will be easier to write a simple test program, prepare a manifest for it, and see if stricmp compares equal strings with different letter-case when characters outside of the locale are involved. I can do that. Still my concern would be: assuming that we actually learn something from this test, how would we know: 1) Which other functions besides stricmp are affected? Maybe letter-case is not the only problematic aspect. 2) Which of the above (#1) set of functions are actually called from Make source code, directly or indirectly? 3) Which of the above (#2) set of functions could be called with UTF-8 input that would cause them to break? On Mon, 20 Mar 2023 at 14:05, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Mon, 20 Mar 2023 13:45:14 + > > Cc: bug-make@gnu.org, Paul Smith > > > > > That's most probably because $(wildcard) calls a Win32 API that is > > > case-insensitive. So the jury is still out on this matter, and I > > > still believe that the below is true: > > > > In that case, are you aware of any Make construct other than $(wildcard) > > that will lead to calling an API of interest? I'd be happy to test it > against > > the patched UTF-8 version of Make that I have built. > > It's not easy, AFAICS in the Make sources. Maybe it will be easier to > write a simple test program, prepare a manifest for it, and see if > stricmp compares equal strings with different letter-case when > characters outside of the locale are involved. > > > In any case, do you see this as a blocking issue for the UTF-8 feature? > > Not a blocking issue, just something that we'd need to document, for > users to be aware of. > > > Is the concern that the UTF-8 feature might break existing things that > > work, or that some things that we might naively expect to work with the > > switch to UTF-8, won't actually work? > > I don't think this will break something that isn't already broken. 
> But it could trigger expectations that cannot be met, and so we should > document these caveats. > > > > This is about UCRT specifically, so I wonder whether MSVCRT will > > > behave the same. > > > > That's true. I wonder how the examples I did so far worked, given > > that (as you found out) my UTF-8-patched Make is linked against > > MSVCRT. Is it just that everything I tried so far is so simple that it > > doesn't even trigger calls to sensitive functions in MSVCRT? > > I think so, yes. > > Also, you didn't try to set the locale to .UTF-8, which is what that > page was describing. > > > Because > > from what I found online, MSVCRT does not support UTF-8, and yet > > somehow things appear to be working, at least on the surface. > > CRT functions are implemented on top of Win32 APIs, and I think the > manifest affects the latter. That's why it works. Functionality that > is implemented completely in CRT, such as setlocale, for example, does > indeed need UCRT to work. Or at least this is my guess. >
Re: [PATCH] Use UTF-8 active code page for Windows host.
Also, you didn't try to set the locale to .UTF-8, which is what that page was describing. That is because I am linking Make against MSVCRT, and that call is only possible when linking against UCRT, AFAICT, so I didn't see a reason to try. CRT functions are implemented on top of Win32 APIs, and I think the manifest affects the latter. That's why it works. Functionality that is implemented completely in CRT, such as setlocale, for example, does indeed need UCRT to work. Or at least this is my guess. Makes sense. From what I have read, the manifest affects the ANSI code page (ACP) which in turn is being used by the -A APIs (the Make source code is calling those implicitly by not #define'ing UNICODE, i.e. a call to CreateProcess is simply a macro that resolves to CreateProcessA because UNICODE is not defined). Therefore, CRT functions that are relying on those -A APIs will get to work in UTF-8 for free, it seems. On Mon, 20 Mar 2023 at 14:05, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Mon, 20 Mar 2023 13:45:14 + > > Cc: bug-make@gnu.org, Paul Smith > > > > > That's most probably because $(wildcard) calls a Win32 API that is > > > case-insensitive. So the jury is still out on this matter, and I > > > still believe that the below is true: > > > > In that case, are you aware of any Make construct other than $(wildcard) > > that will lead to calling an API of interest? I'd be happy to test it > against > > the patched UTF-8 version of Make that I have built. > > It's not easy, AFAICS in the Make sources. Maybe it will be easier to > write a simple test program, prepare a manifest for it, and see if > stricmp compares equal strings with different letter-case when > characters outside of the locale are involved. > > > In any case, do you see this as a blocking issue for the UTF-8 feature? > > Not a blocking issue, just something that we'd need to document, for > users to be aware of. 
> > > Is the concern that the UTF-8 feature might break existing things that > > work, or that some things that we might naively expect to work with the > > switch to UTF-8, won't actually work? > > I don't think this will break something that isn't already broken. > But it could trigger expectations that cannot be met, and so we should > document these caveats. > > > > This is about UCRT specifically, so I wonder whether MSVCRT will > > > behave the same. > > > > That's true. I wonder how the examples I did so far worked, given > > that (as you found out) my UTF-8-patched Make is linked against > > MSVCRT. Is it just that everything I tried so far is so simple that it > > doesn't even trigger calls to sensitive functions in MSVCRT? > > I think so, yes. > > Also, you didn't try to set the locale to .UTF-8, which is what that > page was describing. > > > Because > > from what I found online, MSVCRT does not support UTF-8, and yet > > somehow things appear to be working, at least on the surface. > > CRT functions are implemented on top of Win32 APIs, and I think the > manifest affects the latter. That's why it works. Functionality that > is implemented completely in CRT, such as setlocale, for example, does > indeed need UCRT to work. Or at least this is my guess. >
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Mon, 20 Mar 2023 13:45:14 + > Cc: bug-make@gnu.org, Paul Smith > > > That's most probably because $(wildcard) calls a Win32 API that is > > case-insensitive. So the jury is still out on this matter, and I > > still believe that the below is true: > > In that case, are you aware of any Make construct other than $(wildcard) > that will lead to calling an API of interest? I'd be happy to test it > against > the patched UTF-8 version of Make that I have built. It's not easy, AFAICS in the Make sources. Maybe it will be easier to write a simple test program, prepare a manifest for it, and see if stricmp compares equal strings with different letter-case when characters outside of the locale are involved. > In any case, do you see this as a blocking issue for the UTF-8 feature? Not a blocking issue, just something that we'd need to document, for users to be aware of. > Is the concern that the UTF-8 feature might break existing things that > work, or that some things that we might naively expect to work with the > switch to UTF-8, won't actually work? I don't think this will break something that isn't already broken. But it could trigger expectations that cannot be met, and so we should document these caveats. > > This is about UCRT specifically, so I wonder whether MSVCRT will > > behave the same. > > That's true. I wonder how the examples I did so far worked, given > that (as you found out) my UTF-8-patched Make is linked against > MSVCRT. Is it just that everything I tried so far is so simple that it > doesn't even trigger calls to sensitive functions in MSVCRT? I think so, yes. Also, you didn't try to set the locale to .UTF-8, which is what that page was describing. > Because > from what I found online, MSVCRT does not support UTF-8, and yet > somehow things appear to be working, at least on the surface. CRT functions are implemented on top of Win32 APIs, and I think the manifest affects the latter. That's why it works. 
Functionality that is implemented completely in CRT, such as setlocale, for example, does indeed need UCRT to work. Or at least this is my guess.
Re: [PATCH] Use UTF-8 active code page for Windows host.
That's most probably because $(wildcard) calls a Win32 API that is case-insensitive. So the jury is still out on this matter, and I still believe that the below is true: In that case, are you aware of any Make construct other than $(wildcard) that will lead to calling an API of interest? I'd be happy to test it against the patched UTF-8 version of Make that I have built. In any case, do you see this as a blocking issue for the UTF-8 feature? Is the concern that the UTF-8 feature might break existing things that work, or that some things that we might naively expect to work with the switch to UTF-8, won't actually work? This is about UCRT specifically, so I wonder whether MSVCRT will behave the same. That's true. I wonder how the examples I did so far worked, given that (as you found out) my UTF-8-patched Make is linked against MSVCRT. Is it just that everything I tried so far is so simple that it doesn't even trigger calls to sensitive functions in MSVCRT? Because from what I found online, MSVCRT does not support UTF-8, and yet somehow things appear to be working, at least on the surface. Only on Windows versions that support this. Yes, this whole feature makes sense only on Windows Version 1903 (May 2019 Update) or later anyway (this is Windows 10). Previous versions will simply be unaffected. Make will still run, but will still break when faced with UTF-8 input in any way. Given that the feature will only work on Windows 10, UCRT will also be available, so if linking against UCRT it will be possible to call setlocale(LC_ALL, ".UTF8") and get full UTF-8 support in the C lib as well. If linking against MSVCRT, we are forced to face the restrictions it has anyway. Which brings me back to my question of whether you see this as a potential blocking issue for Make switching to UTF-8 on Windows by embedding the UTF-8 manifest at build time. 
On Mon, 20 Mar 2023 at 11:54, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Sun, 19 Mar 2023 21:25:30 + > > Cc: bug-make@gnu.org, Paul Smith > > > > I create a file src.β first: > > > > touch src.β > > > > and then run the following UTF-8 encoded Makefile: > > > > hello : > > @gcc ©\src.c -o ©\src.exe > > > > ifneq ("$(wildcard src.β)","") > > @echo src.β exists > > else > > @echo src.β does NOT exist > > endif > > > > ifneq ("$(wildcard src.Β)","") > > @echo src.Β exists > > else > > @echo src.Β does NOT exist > > endif > > > > ifneq ("$(wildcard src.βΒ)","") > > @echo src.βΒ exists > > else > > @echo src.βΒ does NOT exist > > endif > > > > and the output of Make is: > > > > C:\Users\cargyris\temp>make -f utf8.mk > > src.β exists > > src.Β exists > > src.βΒ does NOT exist > > > > which shows that it finds the one with the upper case extension as well, > > despite the fact that it exists in the file system as a lower case > extension. > > That's most probably because $(wildcard) calls a Win32 API that is > case-insensitive. So the jury is still out on this matter, and I > still believe that the below is true: > > > My guess would be that only characters within the locale, defined by > > the ANSI codepage, are supported by locale-aware functions in the C > > runtime. That's because this is what happens even if you use "wide" > > Unicode APIs and/or functions like _wcsicmp that accept wchar_t > > characters: they all support only the characters of the current locale > > set by 'setlocale'. I don't expect that to change just because UTF-8 > > is used on the outside: internally, everything is converted to UTF-16, > > i.e. to the Windows flavor of wchar_t. > > > > But this one looks most relevant to your point: > > > > > https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support > > > > > > "Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C > Runtime supports using a UTF-8 code > > page. 
The change means that char strings passed to C runtime functions > can expect strings in the UTF-8 > > encoding. To enable UTF-8 mode, use ".UTF8" as the code page when using > setlocale. For example, > > setlocale(LC_ALL, ".UTF8") will use the current default Windows ANSI > code page (ACP) for the locale and > > UTF-8 for the code page." > > This is about UCRT specifically, so I wonder whether MSVCRT will > behave the same. > > > My point is, with the manifest embedded at build time, ACP will be UTF-8 > > already when the program (Make) runs, so no need to do anything more. > > Only on Windows versions that support this. >
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Sun, 19 Mar 2023 21:25:30 + > Cc: bug-make@gnu.org, Paul Smith > > I create a file src.β first: > > touch src.β > > and then run the following UTF-8 encoded Makefile: > > hello : > @gcc ©\src.c -o ©\src.exe > > ifneq ("$(wildcard src.β)","") > @echo src.β exists > else > @echo src.β does NOT exist > endif > > ifneq ("$(wildcard src.Β)","") > @echo src.Β exists > else > @echo src.Β does NOT exist > endif > > ifneq ("$(wildcard src.βΒ)","") > @echo src.βΒ exists > else > @echo src.βΒ does NOT exist > endif > > and the output of Make is: > > C:\Users\cargyris\temp>make -f utf8.mk > src.β exists > src.Β exists > src.βΒ does NOT exist > > which shows that it finds the one with the upper case extension as well, > despite the fact that it exists in the file system as a lower case extension. That's most probably because $(wildcard) calls a Win32 API that is case-insensitive. So the jury is still out on this matter, and I still believe that the below is true: > My guess would be that only characters within the locale, defined by > the ANSI codepage, are supported by locale-aware functions in the C > runtime. That's because this is what happens even if you use "wide" > Unicode APIs and/or functions like _wcsicmp that accept wchar_t > characters: they all support only the characters of the current locale > set by 'setlocale'. I don't expect that to change just because UTF-8 > is used on the outside: internally, everything is converted to UTF-16, > i.e. to the Windows flavor of wchar_t. > > But this one looks most relevant to your point: > > https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support > > > "Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C Runtime > supports using a UTF-8 code > page. The change means that char strings passed to C runtime functions can > expect strings in the UTF-8 > encoding. 
To enable UTF-8 mode, use ".UTF8" as the code page when using > setlocale. For example, > setlocale(LC_ALL, ".UTF8") will use the current default Windows ANSI code > page (ACP) for the locale and > UTF-8 for the code page." This is about UCRT specifically, so I wonder whether MSVCRT will behave the same. > My point is, with the manifest embedded at build time, ACP will be UTF-8 > already when the program (Make) runs, so no need to do anything more. Only on Windows versions that support this.
Re: [PATCH] Use UTF-8 active code page for Windows host.
That's not a good experiment, IMO: the only non-ASCII character here is U+274E, which has no case variants. And the characters whose letter-case you tried to change are all ASCII, so their case conversions are unaffected by the locale. OK, I think this is a better one, it is using U+03B2 and U+0392 which are the lower and upper case of the same letter (β and Β). I create a file src.β first:

touch src.β

and then run the following UTF-8 encoded Makefile:

hello :
	@gcc ©\src.c -o ©\src.exe

ifneq ("$(wildcard src.β)","")
	@echo src.β exists
else
	@echo src.β does NOT exist
endif

ifneq ("$(wildcard src.Β)","")
	@echo src.Β exists
else
	@echo src.Β does NOT exist
endif

ifneq ("$(wildcard src.βΒ)","")
	@echo src.βΒ exists
else
	@echo src.βΒ does NOT exist
endif

and the output of Make is:

C:\Users\cargyris\temp>make -f utf8.mk
src.β exists
src.Β exists
src.βΒ does NOT exist

which shows that it finds the one with the upper case extension as well, despite the fact that it exists in the file system as a lower case extension. My guess would be that only characters within the locale, defined by the ANSI codepage, are supported by locale-aware functions in the C runtime. That's because this is what happens even if you use "wide" Unicode APIs and/or functions like _wcsicmp that accept wchar_t characters: they all support only the characters of the current locale set by 'setlocale'. I don't expect that to change just because UTF-8 is used on the outside: internally, everything is converted to UTF-16, i.e. to the Windows flavor of wchar_t. When the manifest is used to set the active code page of the process to UTF-8, the current ANSI code page does become UTF-8, so that might explain why the above example is working. As mentioned in: https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170 "Also, the run-time library might obtain and use the value of the operating system code page, which is constant for the duration of the program's execution." 
This seems to be offering some kind of confirmation. But this one looks most relevant to your point: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support "Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C Runtime supports using a UTF-8 code page. The change means that char strings passed to C runtime functions can expect strings in the UTF-8 encoding. To enable UTF-8 mode, use ".UTF8" as the code page when using setlocale. For example, setlocale(LC_ALL, ".UTF8") will use the current default Windows ANSI code page (ACP) for the locale and UTF-8 for the code page." src/main.c:1245 has: setlocale (LC_ALL, ""); so this could be changed to: setlocale (LC_ALL, ".UTF8") conditionally on the Windows version above, but I'm not sure if that is even necessary, given the UTF-8 manifest change. From reading the above doc my understanding is that embedding the UTF-8 manifest has an effect that covers the C runtime as well. For example: "UTF-8 mode is also enabled for functions that have historically translated char strings using the default Windows *ANSI code page (ACP)*. For example, calling _mkdir("") while using a UTF-8 code page will correctly produce a directory with that emoji as the folder name, *instead of requiring the ACP to be changed to UTF-8 before running your program.* Likewise, calling _getcwd() in that folder will return a UTF-8 encoded string. *For compatibility, the ACP is still used if the C locale code page isn't set to UTF-8.*" I have highlighted the important parts in bold. My point is, with the manifest embedded at build time, ACP will be UTF-8 already when the program (Make) runs, so no need to do anything more. This advice is for how to use UTF-8 in the C runtime if you don't have ACP == UTF-8. The Unicode -W APIs are different compared to the -A APIs in that they don't even look at the current ANSI code page, they just use UTF-16. 
On Sun, 19 Mar 2023 at 17:01, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Sun, 19 Mar 2023 16:34:54 + > > Cc: bug-make@gnu.org, psm...@gnu.org > > > > > OK, but how is the make.exe you produced built? > > > > I actually did what you suggested but was somewhat confused with the > > result. Usually I do this with 'ldd', but both msvcrt.dll and > ucrtbase.dll > > show up in 'ldd make.exe' output, and I wasn't sure what to think of it. > > > > However, your approach with objdump gives fewer results and only > > lists msvcrt.dll, not ucrtbase.dll: > > > > C:\Users\cargyris\temp>objdump -p make.exe | grep "DLL Name:" > > DLL Name: ADVAPI32.dll > > DLL Name: KERNEL32.dll > > DLL Name: msvcrt.dll > > DLL Name: USER32.dll > > > > So I guess MSVCRT is enough, i.e. no need for UCRT. > > Yes, thanks. > > > > If you try using in a Makefile file names with non-ASCII > > > characters outside of the current ANSI codepage, does Make succeed to > > > recognize files
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Sun, 19 Mar 2023 16:34:54 + > Cc: bug-make@gnu.org, psm...@gnu.org > > > OK, but how is the make.exe you produced built? > > I actually did what you suggested but was somewhat confused with the > result. Usually I do this with 'ldd', but both msvcrt.dll and ucrtbase.dll > show up in 'ldd make.exe' output, and I wasn't sure what to think of it. > > However, your approach with objdump gives fewer results and only > lists msvcrt.dll, not ucrtbase.dll: > > C:\Users\cargyris\temp>objdump -p make.exe | grep "DLL Name:" > DLL Name: ADVAPI32.dll > DLL Name: KERNEL32.dll > DLL Name: msvcrt.dll > DLL Name: USER32.dll > > So I guess MSVCRT is enough, i.e. no need for UCRT. Yes, thanks. > > If you try using in a Makefile file names with non-ASCII > > characters outside of the current ANSI codepage, does Make succeed to > > recognize files mentioned in the Makefile whose letter-case is > > different from what is seen in the file system? > I think it does, here is the experiment: > > C:\Users\cargyris\temp>ls ❎ > src.c > > There is only src.c in that folder. > > Makefile utf8.mk is UTF-8 encoded and has this content that > checks for the existence of: > > ❎\src.c > ❎\src.C > ❎\src.cs > > where ❎ is outside the ANSI codepage (1252). That's not a good experiment, IMO: the only non-ASCII character here is U+274E, which has no case variants. And the characters whose letter-case you tried to change are all ASCII, so their case conversions are unaffected by the locale. > If I understand this correctly, both src.c and src.C should be found, > but not src.cs (just to show a negative case as well). In addition, I'm not sure Make actually compares file names somewhere, I think it just calls 'stat', and that is of course case-insensitive (because the filesystem is, on the base level). My guess would be that only characters within the locale, defined by the ANSI codepage, are supported by locale-aware functions in the C runtime. 
That's because this is what happens even if you use "wide" Unicode APIs and/or functions like _wcsicmp that accept wchar_t characters: they all support only the characters of the current locale set by 'setlocale'. I don't expect that to change just because UTF-8 is used on the outside: internally, everything is converted to UTF-16, i.e. to the Windows flavor of wchar_t. > > Btw, there's one aspect where Make on MS-Windows will probably fall > > short of modern Posix systems: the display of non-ASCII characters on > > the screen. > > Indeed, some thoughts on that: > > 1) As you know, this is only affecting the visual aspect of the logs, not the > inner workings of Make. This could confuse users because they would > be seeing "errors" on the screen, without there being any real errors. > Perhaps a mention in the doc or release notes could remedy that. > > 2) To some extent (maybe even completely, I don't know) this can be > mitigated with using PowerShell instead of the classic Command Prompt. > This seems to be working in this case at least: This could be just sheer luck: PowerShell uses a font that supports that particular character. The basic problem here is that "Command Prompt" windows don't allow to configure more than one font for displaying characters, and a single font can never support more than a few scripts. If PowerShell doesn't allow more than a single font in its windows, it will suffer from the same problem. > If anything, it could be worth a mention in the doc. Yes, of course.
Re: [PATCH] Use UTF-8 active code page for Windows host.
OK, but how is the make.exe you produced built? I actually did what you suggested but was somewhat confused with the result. Usually I do this with 'ldd', but both msvcrt.dll and ucrtbase.dll show up in 'ldd make.exe' output, and I wasn't sure what to think of it. However, your approach with objdump gives fewer results and only lists msvcrt.dll, not ucrtbase.dll: C:\Users\cargyris\temp>objdump -p make.exe | grep "DLL Name:" DLL Name: ADVAPI32.dll DLL Name: KERNEL32.dll DLL Name: msvcrt.dll DLL Name: USER32.dll So I guess MSVCRT is enough, i.e. no need for UCRT. If you try using in a Makefile file names with non-ASCII characters outside of the current ANSI codepage, does Make succeed to recognize files mentioned in the Makefile whose letter-case is different from what is seen in the file system? I think it does, here is the experiment: C:\Users\cargyris\temp>ls ❎ src.c There is only src.c in that folder. Makefile utf8.mk is UTF-8 encoded and has this content that checks for the existence of: ❎\src.c ❎\src.C ❎\src.cs where ❎ is outside the ANSI codepage (1252). If I understand this correctly, both src.c and src.C should be found, but not src.cs (just to show a negative case as well).

hello :
	@gcc ©\src.c -o ©\src.exe
ifneq ("$(wildcard ❎\src.c)","")
	@echo ❎\src.c exists
else
	@echo ❎\src.c does NOT exist
endif
ifneq ("$(wildcard ❎\src.C)","")
	@echo ❎\src.C exists
else
	@echo ❎\src.C does NOT exist
endif
ifneq ("$(wildcard ❎\src.cs)","")
	@echo ❎\src.cs exists
else
	@echo ❎\src.cs does NOT exist
endif

Here is the result of running the UTF-8-patched Make on it:

C:\Users\cargyris\temp>make.exe -f utf8.mk
❎\src.c exists
❎\src.C exists
❎\src.cs does NOT exist

I don't know if that was a good way to test your point, feel free to suggest a different one if it was not. It seems to be doing the right thing, finding the .C file as well. Indeed. But build_w32.bat is a very simple batch file, so I don't think modifying it will present any difficulty.
Let us know if you need help in that matter. Sure, thanks. Btw, there's one aspect where Make on MS-Windows will probably fall short of modern Posix systems: the display of non-ASCII characters on the screen. Indeed, some thoughts on that: 1) As you know, this is only affecting the visual aspect of the logs, not the inner workings of Make.This could confuse users because they would be seeing "errors" on the screen, without there being any real errors. Perhaps a mention in the doc or release notes could remedy that. 2) To some extent (maybe even completely, I don't know) this can be mitigated with using PowerShell instead of the classic Command Prompt. This seems to be working in this case at least: Command Prompt: C:\Users\cargyris\temp>make.exe -f utf8.mk echo â?Z\src.c exists PowerShell: PS C:\Users\cargyris\temp> make.exe -f utf8.mk echo ❎\src.c exists If anything, it could be worth a mention in the doc. On Sun, 19 Mar 2023 at 14:38, Eli Zaretskii wrote: > > From: Costas Argyris > > Date: Sun, 19 Mar 2023 13:42:52 + > > Cc: bug-make@gnu.org > > > > Does this support require Make to be linked against the UCRT > > run-time library, or does it also work with the older MSVCRT? > > > > I haven't found anything explicitly mentioned about this in the official > > doc: > > > > > https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page > > OK, but how is the make.exe you produced built? is it using UCRT or > MSVCRT when it runs? You can check that by examining the dependencies > of the .exe file with, e.g., the Dependency Walker program > (https://www.dependencywalker.com/) or similar. Or just use objdump > from GNU Binutils: > > objdump -p make.exe | fgrep "DLL Name:" > > and see if this shows MSVCRT.DLL or the UCRT one. > > > Does using UTF-8 as the active page in Make mean that locale-dependent > > C library functions will behave as expected? 
> > > > I think so. Here is the relevant doc I found: > > > > > https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170 > > This is not enough. If locale-dependent C library functions still > support only the characters expressible with the ANSI codepage, then a > program using the UTF-8 active codepage will be unable to process the > non-ASCII characters outside of the ANSI codepage correctly. For > example, downcasing such characters or comparing them in > case-insensitive manner will not work. This is because for this to > work those functions need to have access to tables of character > properties for the entire Unicode range, not just for the current > locale. If you try using in a Makefile file names with non-ASCII > characters outside of the current ANSI codepage, does Make succeed to > recognize files mentioned in the Makefile whose letter-case is > different from what is seen in the file system? > > > Also, since the above experiments seem to suggest that we are not > > dropping
Re: [PATCH] Use UTF-8 active code page for Windows host.
> Date: Sun, 19 Mar 2023 16:38:08 +0200 > From: Eli Zaretskii > Cc: bug-make@gnu.org > > > From: Costas Argyris > > Date: Sun, 19 Mar 2023 13:42:52 + > > Cc: bug-make@gnu.org > > > > Also, since the above experiments seem to suggest that we are not > > dropping existing support for non-ASCII characters in programs > > called by Make, it seems like a clear step forwards in terms of > > Unicode support on Windows. > > I agree. Btw, there's one aspect where Make on MS-Windows will probably fall short of modern Posix systems: the display of non-ASCII characters on the screen. Such as the "Entering directory FOO" and echo of the commands being run by Make. A typical Windows console (a.k.a. "Command Prompt" window) can display non-ASCII characters only from a handful of scripts due to limitations of the fonts used for these windows, and in addition displaying UTF-8 encoded characters in these windows using printf etc. doesn't work well. So users who use such non-ASCII characters in their Makefiles should expect a lot of mojibake on the screen.
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Sun, 2023-03-19 at 16:47 +0200, Eli Zaretskii wrote: > If we add tests for this feature (and I agree it's desirable), we > should generate the files with non-ASCII names for those tests as > part of the test script, not having them ready in the repository and > the tarball. Agreed for sure; plus that's how all the tests work today (create the test files they are going to use and delete them again after) so new tests should follow that precedent.
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Paul Smith > Date: Sun, 19 Mar 2023 10:32:36 -0400 > > It would be nice if there was a regression test or two created that > would show this behavior. If we add tests for this feature (and I agree it's desirable), we should generate the files with non-ASCII names for those tests as part of the test script, not having them ready in the repository and the tarball. That's because unpacking a tarball with non-ASCII characters and/or having them in Git will immediately cause problems on Windows, where the unpacking tools and at least some versions of Git for Windows cannot cope with arbitrary non-ASCII file names. The Texinfo project had quite a few similar problems, and ended up generating the files as part of running the test suite as the only viable solution.
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Paul Smith > Cc: bug-make@gnu.org > Date: Sun, 19 Mar 2023 10:27:16 -0400 > > Other people (like Eli who is the primary maintainer of GNU Make for > Windows) have other environments and do more vigorous testing. But I > don't believe Eli uses autotools on Windows, either. I do use autotools on Windows, just not for building GNU Make. > > Assuming all questions are answered first, would it be OK to work on > > the build_w32.bat changes in a second separate patch, and keep the > > first one focused only on the Unix-like build process? > > Patches can be provided in any order, but until build_w32.bat is > updated there won't be any testing of these features during the > "normal" development process. Presumably (but again, you'll have to > ask them) the MinGW folks etc. will take release candidate builds and > verify them in their own environments, once those become available. > > This is not to discourage you in any way: UTF-8 is assumed by GNU Make > on POSIX systems and getting that to be true on Windows is a big step > in the right direction IMO! Indeed. But build_w32.bat is a very simple batch file, so I don't think modifying it will present any difficulty. Let us know if you need help in that matter.
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Sun, 19 Mar 2023 13:42:52 + > Cc: bug-make@gnu.org > > Does this support require Make to be linked against the UCRT > run-time library, or does it also work with the older MSVCRT? > > I haven't found anything explicitly mentioned about this in the official > doc: > > https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page OK, but how is the make.exe you produced built? is it using UCRT or MSVCRT when it runs? You can check that by examining the dependencies of the .exe file with, e.g., the Dependency Walker program (https://www.dependencywalker.com/) or similar. Or just use objdump from GNU Binutils: objdump -p make.exe | fgrep "DLL Name:" and see if this shows MSVCRT.DLL or the UCRT one. > Does using UTF-8 as the active page in Make mean that locale-dependent > C library functions will behave as expected? > > I think so. Here is the relevant doc I found: > > https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170 This is not enough. If locale-dependent C library functions still support only the characters expressible with the ANSI codepage, then a program using the UTF-8 active codepage will be unable to process the non-ASCII characters outside of the ANSI codepage correctly. For example, downcasing such characters or comparing them in case-insensitive manner will not work. This is because for this to work those functions need to have access to tables of character properties for the entire Unicode range, not just for the current locale. If you try using in a Makefile file names with non-ASCII characters outside of the current ANSI codepage, does Make succeed to recognize files mentioned in the Makefile whose letter-case is different from what is seen in the file system?
> Also, since the above experiments seem to suggest that we are not > dropping existing support for non-ASCII characters in programs > called by Make, it seems like a clear step forwards in terms of > Unicode support on Windows. I agree. > I cross-compiled Make for Windows using gcc (mingw-w64) and the > autoconf + automake + configure + make approach, so it clearly worked > for me, but I didn't imagine that this wasn't the standard way to build for > Windows host. Make is a basic utility used to build others, so we don't require a full suite of build tools for building Make itself. > Does this mean that all builds of Make found in the various build > distributions of the GNU toolchain for Windows (like > mingw32-make.exe in the examples above) were necessarily built using > build_w32.bat? I don't know. I can tell you that the precompiled binaries I make available here: https://sourceforge.net/projects/ezwinports/files/ are produced by running that batch file. > Since build_w32.bat is a Windows-specific batch file, does this rule out > cross-compilation as a canonical way to build Make for Windows? No, it doesn't rule that out. But using cross-compilation is not very important these days, since one can have a fully functional MinGW build environment quite easily. > Assuming all questions are answered first, would it be OK to work on the > build_w32.bat changes in a second separate patch, and keep the first one > focused only on the Unix-like build process? Yes. But my point is that without also changing build_w32.bat the change is incomplete.
Re: [PATCH] Use UTF-8 active code page for Windows host.
On Sun, 2023-03-19 at 13:42 +, Costas Argyris wrote: > I cross-compiled Make for Windows using gcc (mingw-w64) and the > autoconf + automake + configure + make approach, so it clearly worked > for me, but I didn't imagine that this wasn't the standard way to > build for Windows host. There is no one "standard way". The GNU project doesn't provide binaries on any platform, including Windows: we only provide source code. So whatever methods people use to build the software is the "standard way" for them. I don't do Windows: my system did not come with Windows and I don't own a Windows license, so when I test before releasing these days I use the free developer Windows VM provided by Microsoft. It expires regularly so I don't spend time customizing it. Because of that I personally use MSVC (the latest version, which comes pre-installed on the VM) and the build_w32.bat file, and I have Git for Windows POSIX tools on my PATH. I install Strawberry Perl to be able to run the regression tests, and that's it. This is only intended to be a trivial, anti-brown-paper-bag test for coding errors or obvious regressions. Other people (like Eli who is the primary maintainer of GNU Make for Windows) have other environments and do more vigorous testing. But I don't believe Eli uses autotools on Windows, either. > Does this mean that all builds of Make found in the various build > distributions of the GNU toolchain for Windows (like mingw32-make.exe > in the examples above) were necessarily built using build_w32.bat? You will have to ask each of them. They all do their own thing and we must be doing an OK job of keeping things portable, since they rarely come back to us with requests of any kind so we don't really know what they are doing. > Assuming all questions are answered first, would it be OK to work on > the build_w32.bat changes in a second separate patch, and keep the > first one focused only on the Unix-like build process? 
Patches can be provided in any order, but until build_w32.bat is updated there won't be any testing of these features during the "normal" development process. Presumably (but again, you'll have to ask them) the MinGW folks etc. will take release candidate builds and verify them in their own environments, once those become available. This is not to discourage you in any way: UTF-8 is assumed by GNU Make on POSIX systems and getting that to be true on Windows is a big step in the right direction IMO!
Re: [PATCH] Use UTF-8 active code page for Windows host.
Does this support require Make to be linked against the UCRT run-time library, or does it also work with the older MSVCRT? I haven't found anything explicitly mentioned about this in the official doc: https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page Also, it is possible to apply the manifest even post-compilation of the executable, using mt.exe (MS standard workflow) on it, so it shouldn't matter if it is linked against either one because it can be done even after the link phase. Not sure if that's a convincing argument though. If Make is built with MSVC, does it have to be built with some new enough version of Studio to have the necessary run-time support for this feature, or any version will do? I haven't built Make with MSVC at all (patch is focused on building with GNU tools) but again there is no mention of this in the official doc above. It is just another case of using a manifest file, where this time the manifest is used to set the active code page of the process to UTF-8. In fact, the manifest can be embedded into the target executable even post-compilation, using mt.exe, so I don't think a recent version of VS is a requirement to build properly. Does using UTF-8 as the active page in Make mean that locale-dependent C library functions will behave as expected? I think so. Here is the relevant doc I found: https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170 where the interesting bits are those where "operating system" is mentioned, like: "Also, the run-time library might obtain and use the value of the operating system code page, which is constant for the duration of the program's execution." I believe with setting the active code page of the process to UTF-8 we are effectively forcing the process to think that the operating system code page is UTF-8, as far as that process is concerned. Did you try running Make with this manifest on older Windows systems, like Windows 8.1 or 7?
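For reference, the per-process code page is selected by an activeCodePage element in the application manifest, per the Microsoft doc linked above (the file name utf8.manifest is illustrative, not the one used in the patch):

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <application>
    <windowsSettings>
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>
```

Post-link embedding with mt.exe, as mentioned above, would then look something like `mt.exe -manifest utf8.manifest -outputresource:make.exe;#1` (syntax per the MSVC manifest tool; not verified here).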
It is important to make sure this manifest doesn't preclude Make from running on those older systems, even though the UTF-8 feature will then be unavailable. I did not try as I don't have access to such systems, but it seems pretty clear from the doc that this should not be a problem: "You can declare this property and target/run on earlier Windows builds, but you must handle legacy code page detection and conversion as usual. With a minimum target version of Windows Version 1903, the process code page will always be UTF-8 so legacy code page detection and conversion can be avoided." It sounds like it will simply not use UTF-8, meaning that any UTF-8 input would still cause Make to break, but that would happen anyway with such input. Based on the above, it shouldn't change existing behavior in these older systems, and certainly not stop Make from running on them. When Make invokes other programs (which it does quite a lot ;-), and passes command-line arguments to it with non-ASCII characters, what will happen to those non-ASCII characters? I think your expectation is correct. Windows seems to be converting the UTF-8 encoded strings to the current ANSI codepage, therefore allowing non-ASCII characters (that are part of that ANSI codepage) to be propagated to the non-UTF-8 program. Below are some experiments to show this. In what follows, 'mingw32-make' is today's (unpatched) Make for Windows, as found in a typical mingw build distribution. Since it is unpatched, it is using the local ANSI codepage which is windows-1252 on my machine. 'make' is the patched version which uses the UTF-8 codepage. Makefile 'windows-1252-non-ascii.mk' is encoded in 1252 and has content: hello : gcc ©\src.c -o ©\src.exe where the (extended ASCII) Copyright sign has been used (0xA9 in 1252). Makefile 'utf8.mk' has the same content but is encoded in UTF-8, so the Copyright sign is represented as 0xC2 0xA9 (two-byte UTF-8 sequence, confirmed by looking through hex editor).
With the unpatched Make that uses the local codepage: mingw32-make -f windows-1252-non-ascii.mk works fine and produces the .exe under the copyright folder (current behavior). mingw32-make -f utf8.mk breaks because the unpatched make can't understand the UTF-8 file (expected). With the patched Make that uses the UTF-8 codepage: make -f windows-1252-non-ascii.mk breaks because Make expects UTF-8 and we are feeding it with a 1252 file. make -f utf8.mk works fine and produces the .exe under the copyright folder. I believe this last case is the one that answers your question: Make (now working in UTF-8) calls gcc (working in 1252) with some UTF-8 encoded arguments. gcc has no problem doing the compilation and producing the executable under the Copyright folder, which suggests that Windows did indeed convert the UTF-8 arguments into gcc's codepage (1252), and because the Copyright sign does exist in 1252 the conversion was successful, allowing gcc to run. So it doesn't
Re: [PATCH] Use UTF-8 active code page for Windows host.
> From: Costas Argyris > Date: Sat, 18 Mar 2023 16:37:20 + > > This is a proposed patch to enable UTF-8 support in GNU Make running on > Windows host. Thanks. > Today, the make process on Windows is using the legacy system code page > because of the "A" functions > called in the source code. This means that any UTF-8 input to make on > Windows will break. A few > examples follow: Yes, this misfeature of using the system codepage is known, together with the consequences. > The attached patch incorporates the UTF-8 manifest into the build process of > GNU Make when hosted on > Windows, and forces the built executable to use UTF-8 as its active code > page, solving all problems shown > above because this has a global effect in the process. All existing "A" > calls use the UTF-8 code page now > instead of the legacy one. This is the relevant Microsoft doc: > > https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page > > With the patch, after building make, the above cases now work on Windows: > > ## > C:\Users\cargyris\temp>cat utf8Makefile.mk > hello : > @echo ﹏ > @echo ❎ > C:\Users\cargyris\temp>make -f utf8Makefile.mk > ﹏ > ❎ > > C:\Users\cargyris\temp>make -f ❎\utf8Makefile.mk > ﹏ > ❎ > > C:\Users\cargyris\temp>cd ❎ > > C:\Users\cargyris\temp\❎>make -f utf8Makefile.mk > ﹏ > ❎ > > C:\Users\cargyris\temp\❎>make -f ❎\utf8Makefile.mk > ﹏ > ❎ > ## > > This change might also fix other existing issues on Windows having to do with > filenames and paths, but I > can't point at something particular right now. > > Would a patch like that be considered? Yes, of course. However, we need to understand better the conditions under which the UTF-8 support in Make will be activated, and the consequences of activating it. Here are some specific questions, based on initial thinking about this: . Does this support require Make to be linked against the UCRT run-time library, or does it also work with the older MSVCRT?
If Make is built with MSVC, does it have to be built with some new enough version of Studio to have the necessary run-time support for this feature, or any version will do? . Does using UTF-8 as the active page in Make mean that locale-dependent C library functions will behave as expected? For example, what happens with character classification functions such as isalpha and isdigit, and what happens with functions related to letter-case, such as tolower and stricmp -- will they perform correctly with characters in the entire Unicode range? (This might be related to the first question above.) . Did you try running Make with this manifest on older Windows systems, like Windows 8.1 or 7? It is important to make sure this manifest doesn't preclude Make from running on those older systems, even though the UTF-8 feature will then be unavailable. . When Make invokes other programs (which it does quite a lot ;-), and passes command-line arguments to it with non-ASCII characters, what will happen to those non-ASCII characters? I'm guessing that if the program also has such a manifest, it will get the UTF-8 encoded strings verbatim, but what if it doesn't have such a manifest? (The vast majority of the programs Make invokes nowadays don't have such manifests.) Will Windows convert the UTF-8 encoded strings into the system codepage, or will the program get UTF-8 regardless of whether it can or cannot handle them? If the latter, it will become impossible to use non-ASCII strings and file names with such programs even if those non-ASCII characters can be represented using the current system ANSI codepage, because most programs Make invokes on Windows don't support UTF-8. Your examples invoked only the built-in commands of cmd.exe, but what happens if you instead invoke, say, GCC, and pass it a non-ASCII file name, including a file name which cannot be represented in the current ANSI codepage? . 
Even if the answer to the previous question is, as I expect, that Windows will convert UTF-8 encoded strings to the current ANSI codepage, it is important to understand that with the UTF-8 active codepage enabled Make will still be unable to invoke programs with UTF-8 encoded strings if those programs don't have the same UTF-8 active codepage enabled, except if the non-ASCII characters in those strings can be represented by the current ANSI codepage. So this feature will only be complete when the programs invoked by Make are also UTF-8 capable. A specific comment on your patch: > --- a/Makefile.am > +++ b/Makefile.am > @@ -46,6 +46,8 @@ w32_SRCS = src/w32/pathstuff.c src/w32/w32os.c > src/w32/compat/dirent.c \ > src/w32/subproc/misc.c src/w32/subproc/proc.h \ > src/w32/subproc/sub_proc.c src/w32/subproc/w32err.c