Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-20 Thread Eli Zaretskii
> From: Costas Argyris 
> Date: Mon, 20 Mar 2023 21:47:27 +
> Cc: bug-make@gnu.org, Paul Smith 
> 
> Any thoughts for next steps then?

Is the patch ready to be installed?

If so, could you please post it again, rebased on the current Git
master?

And would you please consider working on changing build_w32.bat as
well?

Thanks.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-20 Thread Costas Argyris
> We won't know, not unless and until some user complains and debugging
> shows this is the cause.  But we can warn about this issue in the
> documentation up front, so that people don't raise their expectations
> too high.

Makes sense.

Any thoughts for next steps then?

On Mon, 20 Mar 2023 at 16:17, Eli Zaretskii  wrote:

> > From: Costas Argyris 
> > Date: Mon, 20 Mar 2023 14:58:39 +
> > Cc: bug-make@gnu.org, Paul Smith 
> >
> > Still my concern would be: assuming that we actually learn something
> > from this test, how would we know:
> >
> > 1) Which other functions besides stricmp are affected?  Maybe
> > letter-case is not the only problematic aspect.
> > 2) Which of the above (#1) set of functions are actually called from
> > Make source code, directly or indirectly?
> > 3) Which of the above (#2) set of functions could be called with UTF-8
> > input that would cause them to break?
>
> We won't know, not unless and until some user complains and debugging
> shows this is the cause.  But we can warn about this issue in the
> documentation up front, so that people don't raise their expectations
> too high.
>


[bug #51200] Improvement suggestion: listen to signals to adjust number of jobs

2023-03-20 Thread Henrik Carlqvist
Follow-up Comment #3, bug #51200 (project make):

My latest patch, signal_num_jobs6.patch, is an almost complete rewrite that
works better with recursive calls to make. Jobs are added simply by putting
tokens in the pipe from the signal handler for SIGUSR2. Revoking jobs is
slightly trickier: a separate process is spawned with fork, which tries to
remove a token from the pipe. However, some jobs might be able to start before
that process manages to revoke a token. Making the process busy-wait on the
pipe would greatly increase the probability that it is the first process to
get a token, but I didn't want to do that. Another option might have been to
make the ordinary make processes sleep a few ms each time before they pick up
a token, but I didn't want to do that either. My choice was to live with the
fact that it might take a long time before the number of jobs is decreased.
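
[Editorial sketch] The token mechanism described above can be illustrated
roughly as follows. This is not code from signal_num_jobs6.patch: the names
token_pipe_init, add_token and try_revoke_token are hypothetical, and the
revoke step here uses a non-blocking read instead of the forked helper
process that the patch is said to use. It assumes a POSIX system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical globals: the two ends of the job token pipe.  */
static int token_rd = -1;
static int token_wr = -1;

/* Create the token pipe with a non-blocking read end.
   Returns 0 on success, -1 on failure.  */
static int
token_pipe_init (void)
{
  int fds[2];
  int flags;

  if (pipe (fds) != 0)
    return -1;
  flags = fcntl (fds[0], F_GETFL);
  if (flags == -1 || fcntl (fds[0], F_SETFL, flags | O_NONBLOCK) == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  token_rd = fds[0];
  token_wr = fds[1];
  return 0;
}

/* Add one job slot.  Only write() is called, which is async-signal-safe,
   so a function of this shape could serve as a SIGUSR2 handler.  */
static void
add_token (int sig)
{
  int saved_errno = errno;
  char token = '+';
  ssize_t n;

  (void) sig;
  n = write (token_wr, &token, 1);
  (void) n;                     /* nothing safe to report from a handler */
  errno = saved_errno;
}

/* Try to take one token back without blocking.
   Returns 1 if a token was revoked, 0 if none was available.  */
static int
try_revoke_token (void)
{
  char token;

  assert (token_rd >= 0);
  return read (token_rd, &token, 1) == 1;
}
```

A caller would run token_pipe_init() once at startup, before installing any
handler, which matches the point that the pipe has to exist before a signal
can arrive. The non-blocking read also shows why revoking is the harder
direction: if every token is already held by a running job, there is nothing
to take back yet.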

A slightly intrusive change in this patch is that the job server is always
set up initially. This is because it cannot be set up in a signal-safe way
from the signal handlers; it has to be ready when the signal handlers need it.
This also means that the job server is set up when -j is given without a
value. That was supposed to give an infinite number of jobs, but the job
server has to have a limit. My choice for now was to set that limit to 1 jobs.
In practice 1 jobs are enough to be backwards compatible with the old
behavior of make, which by default becomes a kind of fork bomb on a big source
tree when no value is given to -j. By definition 1 jobs are not an
infinite number of jobs, but in practice I see no big difference. Maybe I
would prefer to limit the number of jobs to a low number by default and let
the people who really know what they are doing specify higher numbers if they
want to, but that would break backwards compatibility.

(file #54516)

___

Additional Item Attachment:

File name: signal_num_jobs6.patch Size:5 KB




___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




[bug #51200] Improvement suggestion: listen to signals to adjust number of jobs

2023-03-20 Thread Henrik Carlqvist
Additional Item Attachment, bug #51200 (project make):

File name: signal_num_jobs6.patch Size:5 KB








Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-20 Thread Eli Zaretskii
> From: Costas Argyris 
> Date: Mon, 20 Mar 2023 14:58:39 +
> Cc: bug-make@gnu.org, Paul Smith 
> 
> Still my concern would be: assuming that we actually learn something
> from this test, how would we know:
> 
> 1) Which other functions besides stricmp are affected?  Maybe
> letter-case is not the only problematic aspect.
> 2) Which of the above (#1) set of functions are actually called from
> Make source code, directly or indirectly?
> 3) Which of the above (#2) set of functions could be called with UTF-8
> input that would cause them to break?

We won't know, not unless and until some user complains and debugging
shows this is the cause.  But we can warn about this issue in the
documentation up front, so that people don't raise their expectations
too high.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-20 Thread Costas Argyris
> It's not easy, AFAICS in the Make sources.  Maybe it will be easier to
> write a simple test program, prepare a manifest for it, and see if
> stricmp compares equal strings with different letter-case when
> characters outside of the locale are involved.

I can do that.

Still my concern would be: assuming that we actually learn something
from this test, how would we know:

1) Which other functions besides stricmp are affected?  Maybe
letter-case is not the only problematic aspect.
2) Which of the above (#1) set of functions are actually called from
Make source code, directly or indirectly?
3) Which of the above (#2) set of functions could be called with UTF-8
input that would cause them to break?

On Mon, 20 Mar 2023 at 14:05, Eli Zaretskii  wrote:

> > From: Costas Argyris 
> > Date: Mon, 20 Mar 2023 13:45:14 +
> > Cc: bug-make@gnu.org, Paul Smith 
> >
> > > That's most probably because $(wildcard) calls a Win32 API that is
> > > case-insensitive.  So the jury is still out on this matter, and I
> > > still believe that the below is true:
> >
> > In that case, are you aware of any Make construct other than $(wildcard)
> > that will lead to calling an API of interest?  I'd be happy to test it
> > against the patched UTF-8 version of Make that I have built.
>
> It's not easy, AFAICS in the Make sources.  Maybe it will be easier to
> write a simple test program, prepare a manifest for it, and see if
> stricmp compares equal strings with different letter-case when
> characters outside of the locale are involved.
>
> > In any case, do you see this as a blocking issue for the UTF-8 feature?
>
> Not a blocking issue, just something that we'd need to document, for
> users to be aware of.
>
> > Is the concern that the UTF-8 feature might break existing things that
> > work, or that some things that we might naively expect to work with the
> > switch to UTF-8, won't actually work?
>
> I don't think this will break something that isn't already broken.
> But it could trigger expectations that cannot be met, and so we should
> document these caveats.
>
> > > This is about UCRT specifically, so I wonder whether MSVCRT will
> > > behave the same.
> >
> > That's true.  I wonder how the examples I did so far worked, given
> > that (as you found out) my UTF-8-patched Make is linked against
> > MSVCRT.  Is it just that everything I tried so far is so simple that it
> > doesn't even trigger calls to sensitive functions in MSVCRT?
>
> I think so, yes.
>
> Also, you didn't try to set the locale to .UTF-8, which is what that
> page was describing.
>
> > Because
> > from what I found online, MSVCRT does not support UTF-8, and yet
> > somehow things appear to be working, at least on the surface.
>
> CRT functions are implemented on top of Win32 APIs, and I think the
> manifest affects the latter.  That's why it works.  Functionality that
> is implemented completely in CRT, such as setlocale, for example, does
> indeed need UCRT to work.  Or at least this is my guess.
>


Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-20 Thread Costas Argyris
> Also, you didn't try to set the locale to .UTF-8, which is what that
> page was describing.

That is because I am linking Make against MSVCRT, and that call
is only possible when linking against UCRT, AFAICT, so I didn't see
a reason to try.

> CRT functions are implemented on top of Win32 APIs, and I think the
> manifest affects the latter.  That's why it works.  Functionality that
> is implemented completely in CRT, such as setlocale, for example, does
> indeed need UCRT to work.  Or at least this is my guess.

Makes sense.  From what I have read, the manifest affects the ANSI
code page (ACP) which in turn is being used by the -A APIs (the Make
source code is calling those implicitly by not #define'ing UNICODE, i.e.
a call to CreateProcess is simply a macro that resolves to CreateProcessA
because UNICODE is not defined).

Therefore, CRT functions that are relying on those -A APIs will get to
work in UTF-8 for free it seems.
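
[Editorial sketch] One way to confirm that an embedded manifest took effect
is to ask Windows for the process ANSI code page at run time. The following
is a minimal sketch, not part of the proposed patch; the helper names
is_utf8_code_page and process_acp_is_utf8 are hypothetical. GetACP and the
CP_UTF8 constant (65001) are the documented Win32 names.

```c
#include <assert.h>
#ifdef _WIN32
# include <windows.h>
#endif

/* Code page identifier that Windows assigns to UTF-8 (CP_UTF8).  */
#define UTF8_CODE_PAGE 65001u

/* Return 1 if CODE_PAGE is the UTF-8 code page, 0 otherwise.  */
static int
is_utf8_code_page (unsigned int code_page)
{
  return code_page == UTF8_CODE_PAGE;
}

/* Return 1 if the process ANSI code page is UTF-8, 0 if it is not,
   and -1 when not built for Windows (there is no ACP to query).  */
static int
process_acp_is_utf8 (void)
{
#ifdef _WIN32
  /* Sanity check that our constant agrees with the SDK headers.  */
  assert (CP_UTF8 == UTF8_CODE_PAGE);
  return is_utf8_code_page (GetACP ());
#else
  return -1;
#endif
}
```

If process_acp_is_utf8() returns 1 in a program built with the manifest and
0 in the same program built without it, that would support the reading that
the -A APIs see UTF-8 because the ACP itself has changed.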

On Mon, 20 Mar 2023 at 14:05, Eli Zaretskii  wrote:

> > From: Costas Argyris 
> > Date: Mon, 20 Mar 2023 13:45:14 +
> > Cc: bug-make@gnu.org, Paul Smith 
> >
> > > That's most probably because $(wildcard) calls a Win32 API that is
> > > case-insensitive.  So the jury is still out on this matter, and I
> > > still believe that the below is true:
> >
> > In that case, are you aware of any Make construct other than $(wildcard)
> > that will lead to calling an API of interest?  I'd be happy to test it
> > against the patched UTF-8 version of Make that I have built.
>
> It's not easy, AFAICS in the Make sources.  Maybe it will be easier to
> write a simple test program, prepare a manifest for it, and see if
> stricmp compares equal strings with different letter-case when
> characters outside of the locale are involved.
>
> > In any case, do you see this as a blocking issue for the UTF-8 feature?
>
> Not a blocking issue, just something that we'd need to document, for
> users to be aware of.
>
> > Is the concern that the UTF-8 feature might break existing things that
> > work, or that some things that we might naively expect to work with the
> > switch to UTF-8, won't actually work?
>
> I don't think this will break something that isn't already broken.
> But it could trigger expectations that cannot be met, and so we should
> document these caveats.
>
> > > This is about UCRT specifically, so I wonder whether MSVCRT will
> > > behave the same.
> >
> > That's true.  I wonder how the examples I did so far worked, given
> > that (as you found out) my UTF-8-patched Make is linked against
> > MSVCRT.  Is it just that everything I tried so far is so simple that it
> > doesn't even trigger calls to sensitive functions in MSVCRT?
>
> I think so, yes.
>
> Also, you didn't try to set the locale to .UTF-8, which is what that
> page was describing.
>
> > Because
> > from what I found online, MSVCRT does not support UTF-8, and yet
> > somehow things appear to be working, at least on the surface.
>
> CRT functions are implemented on top of Win32 APIs, and I think the
> manifest affects the latter.  That's why it works.  Functionality that
> is implemented completely in CRT, such as setlocale, for example, does
> indeed need UCRT to work.  Or at least this is my guess.
>


Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-20 Thread Eli Zaretskii
> From: Costas Argyris 
> Date: Mon, 20 Mar 2023 13:45:14 +
> Cc: bug-make@gnu.org, Paul Smith 
> 
> > That's most probably because $(wildcard) calls a Win32 API that is
> > case-insensitive.  So the jury is still out on this matter, and I
> > still believe that the below is true:
> 
> In that case, are you aware of any Make construct other than $(wildcard)
> that will lead to calling an API of interest?  I'd be happy to test it
> against the patched UTF-8 version of Make that I have built.

It's not easy, AFAICS in the Make sources.  Maybe it will be easier to
write a simple test program, prepare a manifest for it, and see if
stricmp compares equal strings with different letter-case when
characters outside of the locale are involved.
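
[Editorial sketch] A test program of the kind suggested above might look
roughly like this. It is only a sketch under assumptions: the helper names
equal_nocase and probe_greek_case_folding are hypothetical, strcasecmp is
used as a stand-in so the file also compiles outside Windows, and the
manifest itself is not shown. No result is claimed here for the Greek pair;
the point of the probe is to find that out on the target runtime.

```c
#include <assert.h>
#include <string.h>
#ifdef _WIN32
# define compare_nocase _stricmp    /* MSVCRT/UCRT spelling */
#else
# include <strings.h>
# define compare_nocase strcasecmp  /* POSIX stand-in for other hosts */
#endif

/* Return 1 if A and B compare equal ignoring letter-case, as judged by
   the C runtime in the current locale; 0 otherwise.  */
static int
equal_nocase (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  return compare_nocase (a, b) == 0;
}

/* UTF-8 bytes: U+03B2 (small beta) is CE B2, U+0392 (capital beta)
   is CE 92.  */
static const char lower_beta[] = "src.\xCE\xB2";
static const char upper_beta[] = "src.\xCE\x92";

/* Hypothetical probe: 1 if the runtime treats the two spellings as
   equal when ignoring case, 0 if it does not.  */
static int
probe_greek_case_folding (void)
{
  return equal_nocase (lower_beta, upper_beta);
}
```

Calling probe_greek_case_folding() from main and printing the result, once
with and once without the UTF-8 manifest, would separate what the C runtime
does from what the case-insensitive Win32 file APIs do.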

> In any case, do you see this as a blocking issue for the UTF-8 feature?

Not a blocking issue, just something that we'd need to document, for
users to be aware of.

> Is the concern that the UTF-8 feature might break existing things that
> work, or that some things that we might naively expect to work with the
> switch to UTF-8, won't actually work?

I don't think this will break something that isn't already broken.
But it could trigger expectations that cannot be met, and so we should
document these caveats.

> > This is about UCRT specifically, so I wonder whether MSVCRT will
> > behave the same.
> 
> That's true.  I wonder how the examples I did so far worked, given
> that (as you found out) my UTF-8-patched Make is linked against
> MSVCRT.  Is it just that everything I tried so far is so simple that it
> doesn't even trigger calls to sensitive functions in MSVCRT?

I think so, yes.

Also, you didn't try to set the locale to .UTF-8, which is what that
page was describing.

> Because
> from what I found online, MSVCRT does not support UTF-8, and yet
> somehow things appear to be working, at least on the surface.

CRT functions are implemented on top of Win32 APIs, and I think the
manifest affects the latter.  That's why it works.  Functionality that
is implemented completely in CRT, such as setlocale, for example, does
indeed need UCRT to work.  Or at least this is my guess.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-20 Thread Costas Argyris
> That's most probably because $(wildcard) calls a Win32 API that is
> case-insensitive.  So the jury is still out on this matter, and I
> still believe that the below is true:

In that case, are you aware of any Make construct other than $(wildcard)
that will lead to calling an API of interest?  I'd be happy to test it
against the patched UTF-8 version of Make that I have built.

In any case, do you see this as a blocking issue for the UTF-8 feature?

Is the concern that the UTF-8 feature might break existing things that
work, or that some things that we might naively expect to work with the
switch to UTF-8, won't actually work?

> This is about UCRT specifically, so I wonder whether MSVCRT will
> behave the same.

That's true.  I wonder how the examples I did so far worked, given
that (as you found out) my UTF-8-patched Make is linked against
MSVCRT.  Is it just that everything I tried so far is so simple that it
doesn't even trigger calls to sensitive functions in MSVCRT?  Because
from what I found online, MSVCRT does not support UTF-8, and yet
somehow things appear to be working, at least on the surface.

> Only on Windows versions that support this.

Yes, this whole feature makes sense only on Windows Version 1903
(May 2019 Update) or later anyway (this is Windows 10).

Previous versions will simply be unaffected.  Make will still run, but
will still break when faced with UTF-8 input in any way.

Given that the feature will only work on Windows 10, UCRT will also
be available, so if linking against UCRT it will be possible to call
setlocale(LC_ALL, ".UTF8") and get full UTF-8 support in the C lib as well.
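
[Editorial sketch] The setlocale call mentioned above could be wrapped as
follows. This is a sketch, not code from Make: try_locale and
enable_utf8_locale are hypothetical names, and whether ".UTF8" is accepted
depends on the C runtime the program is linked against (per the discussion,
UCRT is expected to accept it and MSVCRT is not).

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Try to switch every locale category to NAME.
   Returns 1 on success, 0 if the C runtime rejects the name.  */
static int
try_locale (const char *name)
{
  assert (name != NULL);
  return setlocale (LC_ALL, name) != NULL;
}

/* Hypothetical startup helper: ask for ".UTF8" (the spelling quoted
   from the Microsoft documentation) and fall back to the "C" locale
   if the runtime does not know it.  Returns 1 if UTF-8 was enabled.  */
static int
enable_utf8_locale (void)
{
  if (try_locale (".UTF8"))
    return 1;
  try_locale ("C");
  return 0;
}
```

Because setlocale returns NULL for a name it does not support, a helper like
this degrades quietly on runtimes without UTF-8 locale support instead of
failing at startup.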

If linking against MSVCRT, we are forced to face the restrictions it has
anyway.  Which brings me back to my question of whether you see this as a
potential blocking issue for Make switching to UTF-8 on Windows by embedding
the UTF-8 manifest at build time.

On Mon, 20 Mar 2023 at 11:54, Eli Zaretskii  wrote:

> > From: Costas Argyris 
> > Date: Sun, 19 Mar 2023 21:25:30 +
> > Cc: bug-make@gnu.org, Paul Smith 
> >
> > I create a file src.β first:
> >
> > touch src.β
> >
> > and then run the following UTF-8 encoded Makefile:
> >
> > hello :
> > @gcc ©\src.c -o ©\src.exe
> >
> > ifneq ("$(wildcard src.β)","")
> > @echo src.β exists
> > else
> > @echo src.β does NOT exist
> > endif
> >
> > ifneq ("$(wildcard src.Β)","")
> > @echo src.Β exists
> > else
> > @echo src.Β does NOT exist
> > endif
> >
> > ifneq ("$(wildcard src.βΒ)","")
> > @echo src.βΒ exists
> > else
> > @echo src.βΒ does NOT exist
> > endif
> >
> > and the output of Make is:
> >
> > C:\Users\cargyris\temp>make -f utf8.mk
> > src.β exists
> > src.Β exists
> > src.βΒ does NOT exist
> >
> > which shows that it finds the one with the upper case extension as well,
> > despite the fact that it exists in the file system as a lower case
> > extension.
>
> That's most probably because $(wildcard) calls a Win32 API that is
> case-insensitive.  So the jury is still out on this matter, and I
> still believe that the below is true:
>
> > My guess would be that only characters within the locale, defined by
> > the ANSI codepage, are supported by locale-aware functions in the C
> > runtime.  That's because this is what happens even if you use "wide"
> > Unicode APIs and/or functions like _wcsicmp that accept wchar_t
> > characters: they all support only the characters of the current locale
> > set by 'setlocale'.  I don't expect that to change just because UTF-8
> > is used on the outside: internally, everything is converted to UTF-16,
> > i.e. to the Windows flavor of wchar_t.
> >
> > But this one looks most relevant to your point:
> >
> >
> > https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support
> >
> > "Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C
> > Runtime supports using a UTF-8 code page. The change means that char
> > strings passed to C runtime functions can expect strings in the UTF-8
> > encoding. To enable UTF-8 mode, use ".UTF8" as the code page when using
> > setlocale. For example, setlocale(LC_ALL, ".UTF8") will use the current
> > default Windows ANSI code page (ACP) for the locale and UTF-8 for the
> > code page."
>
> This is about UCRT specifically, so I wonder whether MSVCRT will
> behave the same.
>
> > My point is, with the manifest embedded at build time, ACP will be UTF-8
> > already when the program (Make) runs, so no need to do anything more.
>
> Only on Windows versions that support this.
>


Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-20 Thread Eli Zaretskii
> From: Costas Argyris 
> Date: Sun, 19 Mar 2023 21:25:30 +
> Cc: bug-make@gnu.org, Paul Smith 
> 
> I create a file src.β first:
> 
> touch src.β
> 
> and then run the following UTF-8 encoded Makefile:
> 
> hello :
> @gcc ©\src.c -o ©\src.exe
> 
> ifneq ("$(wildcard src.β)","")
> @echo src.β exists
> else
> @echo src.β does NOT exist
> endif
> 
> ifneq ("$(wildcard src.Β)","")
> @echo src.Β exists
> else
> @echo src.Β does NOT exist
> endif
> 
> ifneq ("$(wildcard src.βΒ)","")
> @echo src.βΒ exists
> else
> @echo src.βΒ does NOT exist
> endif
> 
> and the output of Make is:
> 
> C:\Users\cargyris\temp>make -f utf8.mk
> src.β exists
> src.Β exists
> src.βΒ does NOT exist
> 
> which shows that it finds the one with the upper case extension as well,
> despite the fact that it exists in the file system as a lower case extension.

That's most probably because $(wildcard) calls a Win32 API that is
case-insensitive.  So the jury is still out on this matter, and I
still believe that the below is true:

> My guess would be that only characters within the locale, defined by
> the ANSI codepage, are supported by locale-aware functions in the C
> runtime.  That's because this is what happens even if you use "wide"
> Unicode APIs and/or functions like _wcsicmp that accept wchar_t
> characters: they all support only the characters of the current locale
> set by 'setlocale'.  I don't expect that to change just because UTF-8
> is used on the outside: internally, everything is converted to UTF-16,
> i.e. to the Windows flavor of wchar_t.
> 
> But this one looks most relevant to your point:
> 
> https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support
> 
> 
> "Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C
> Runtime supports using a UTF-8 code page. The change means that char
> strings passed to C runtime functions can expect strings in the UTF-8
> encoding. To enable UTF-8 mode, use ".UTF8" as the code page when using
> setlocale. For example, setlocale(LC_ALL, ".UTF8") will use the current
> default Windows ANSI code page (ACP) for the locale and UTF-8 for the
> code page."

This is about UCRT specifically, so I wonder whether MSVCRT will
behave the same.

> My point is, with the manifest embedded at build time, ACP will be UTF-8
> already when the program (Make) runs, so no need to do anything more.

Only on Windows versions that support this.