Re: GNU make losing jobserver tokens
On Sat, 30 Apr 2022 17:51:03 -0400
Ken Brown wrote:
> On 4/29/2022 5:10 AM, Takashi Yano wrote:
> > On Thu, 28 Apr 2022 17:32:22 +0200
> > I tried to move sigproc_init() call from dll_crt0_0() to
> > fork::child() for 64bit cygwin, however, that causes hang
> > at cygwin startup.
> >
> > Am I missing something?
>
> I've never looked into the Cygwin startup code, so just ignore me if what I
> say is nonsense.
>
> Currently sigproc_init is called either from dll_crt0_0 or from dll_crt0_1,
> depending on the value of dynamically_loaded. What would happen if you
> always call it from dll_crt0_1, right after
>
>   cygwin_finished_initializing = true;

Thanks for the advice. That causes a hang at Cygwin startup because
fork() fails :(

--
Takashi Yano

--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
Re: GNU make losing jobserver tokens
On 4/29/2022 5:10 AM, Takashi Yano wrote:
> On Thu, 28 Apr 2022 17:32:22 +0200
> I tried to move sigproc_init() call from dll_crt0_0() to
> fork::child() for 64bit cygwin, however, that causes hang
> at cygwin startup.
>
> Am I missing something?

I've never looked into the Cygwin startup code, so just ignore me if what I
say is nonsense.

Currently sigproc_init is called either from dll_crt0_0 or from dll_crt0_1,
depending on the value of dynamically_loaded. What would happen if you
always call it from dll_crt0_1, right after

  cygwin_finished_initializing = true;

Ken
Re: GNU make losing jobserver tokens
On Thu, 28 Apr 2022 17:32:22 +0200
Corinna Vinschen wrote:
> On Apr 29 00:01, Takashi Yano wrote:
> > On Thu, 28 Apr 2022 16:09:24 +0200
> > Corinna Vinschen wrote:
> > > On Apr 28 09:42, Ken Brown wrote:
> > > > On 4/27/2022 10:13 AM, Takashi Yano wrote:
> > > > > On Fri, 1 Apr 2022 17:45:51 +0900
> > > > > Takashi Yano wrote:
> > > > > > [...]
> > > > > > diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
> > > > > > index 62df96652..3824af199 100644
> > > > > > --- a/winsup/cygwin/sigproc.cc
> > > > > > +++ b/winsup/cygwin/sigproc.cc
> > > > > > @@ -1325,6 +1325,10 @@ wait_sig (VOID *)
> > > > > >    _sig_tls = &_my_tls;
> > > > > >    bool sig_held = false;
> > > > > >
> > > > > > +  /* Wait for _main_tls initialization. */
> > > > > > +  while (!cygwin_finished_initializing)
> > > > > > +    Sleep (10);
> > > > > > +
> > > > > >    sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
> > > > > >                    my_readsig, my_sendsig);
> > > > > >
> > > > > > I guess _main_tls may not be initialized correctly until
> > > > > > cygwin_finished_initializing is set.
> > > > > >
> > > > > > Any comments would be appreciated.
> > > >
> > > > This seems reasonable to me.
> >
> > Thanks Ken and Corinna.
> >
> > > Missed that, sorry. I agree this seems reasonable, but wouldn't it be
> > > cleaner if we *start* wait_sig only after cygwin_finished_initializing
> > > is set to true?
> >
> > I also thought so, however, there is a comment in dcrt0.cc
> > as follows. So, there seems to be some reason to start
> > wait_sig before cygwin_finished_initializing is set.
> >
> > /* Initialize signal processing here, early, in the hopes that the creation
> >    of a thread early in the process will cause more predictability in memory
> >    layout for the main thread. */
> > if (!dynamically_loaded)
> >   sigproc_init ();
>
> This is a 32-bit only problem. The 64 bit address space layout is as
> predictable as can be. Maybe the above fix should go into 3.3 and for
> 3.4 we try differently?

I tried to move sigproc_init() call from dll_crt0_0() to
fork::child() for 64bit cygwin, however, that causes hang
at cygwin startup.

Am I missing something?

--
Takashi Yano
Re: GNU make losing jobserver tokens
On Apr 29 00:01, Takashi Yano wrote:
> On Thu, 28 Apr 2022 16:09:24 +0200
> Corinna Vinschen wrote:
> > On Apr 28 09:42, Ken Brown wrote:
> > > On 4/27/2022 10:13 AM, Takashi Yano wrote:
> > > > On Fri, 1 Apr 2022 17:45:51 +0900
> > > > Takashi Yano wrote:
> > > > > [...]
> > > > > diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
> > > > > index 62df96652..3824af199 100644
> > > > > --- a/winsup/cygwin/sigproc.cc
> > > > > +++ b/winsup/cygwin/sigproc.cc
> > > > > @@ -1325,6 +1325,10 @@ wait_sig (VOID *)
> > > > >    _sig_tls = &_my_tls;
> > > > >    bool sig_held = false;
> > > > >
> > > > > +  /* Wait for _main_tls initialization. */
> > > > > +  while (!cygwin_finished_initializing)
> > > > > +    Sleep (10);
> > > > > +
> > > > >    sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
> > > > >                    my_readsig, my_sendsig);
> > > > >
> > > > > I guess _main_tls may not be initialized correctly until
> > > > > cygwin_finished_initializing is set.
> > > > >
> > > > > Any comments would be appreciated.
> > >
> > > This seems reasonable to me.
>
> Thanks Ken and Corinna.
>
> > Missed that, sorry. I agree this seems reasonable, but wouldn't it be
> > cleaner if we *start* wait_sig only after cygwin_finished_initializing
> > is set to true?
>
> I also thought so, however, there is a comment in dcrt0.cc
> as follows. So, there seems to be some reason to start
> wait_sig before cygwin_finished_initialization.
>
> /* Initialize signal processing here, early, in the hopes that the creation
>    of a thread early in the process will cause more predictability in memory
>    layout for the main thread. */
> if (!dynamically_loaded)
>   sigproc_init ();

This is a 32-bit only problem. The 64 bit address space layout is as
predictable as can be. Maybe the above fix should go into 3.3 and for
3.4 we try differently?

Corinna
Re: GNU make losing jobserver tokens
On Thu, 28 Apr 2022 16:09:24 +0200
Corinna Vinschen wrote:
> On Apr 28 09:42, Ken Brown wrote:
> > On 4/27/2022 10:13 AM, Takashi Yano wrote:
> > > On Fri, 1 Apr 2022 17:45:51 +0900
> > > Takashi Yano wrote:
> > > > I have tried to reproduce the issue by building OpenJDK
> > > > from source, however, I could not.
> > > >
> > > > Instead, I encountered another issue.
> > > >
> > > > Building OpenJDK sometimes (rarely) failed with error such as:
> > > >
> > > >       0 [sig] make 5484 sig_send: error sending signal 11, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> > > >  124917 [main] make 5484 sig_send: error sending signal -72, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> > > > common/modules/GensrcModuleInfo.gmk:77: *** open: /home/yano/jdk/build/windows-x86-server-release/make-support/vardeps/make/common/modules/GensrcModuleInfo.gmk/jdk.accessibility/ALL_MODULES.vardeps: No such file or directory. Stop.
> > > > make[2]: *** [make/Main.gmk:141: jdk.accessibility-gensrc-moduleinfo] Error 2
> > > > make[2]: *** Waiting for unfinished jobs
> > > >
> > > > I looked into this new problem and found that wait_sig() thread
> > > > crashes with segfault. It seems that accessing _main_tls causes
> > > > access violation if a signal is sent just after the process is
> > > > started.
> > > >
> > > > static void WINAPI
> > > > wait_sig (VOID *)
> > > > {
> > > >   [...]
> > > >       if (!pack.mask)
> > > >         {
> > > >           tl_entry = cygheap->find_tls (_main_tls);
> > > >           dummy_mask = _main_tls->sigmask; // <--- Segfault here
> > > >           cygheap->unlock_tls (tl_entry);
> > > >           pack.mask = &dummy_mask;
> > > >         }
> > > >
> > > > I also found the following patch resolves the issue.
> > > >
> > > > diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
> > > > index 62df96652..3824af199 100644
> > > > --- a/winsup/cygwin/sigproc.cc
> > > > +++ b/winsup/cygwin/sigproc.cc
> > > > @@ -1325,6 +1325,10 @@ wait_sig (VOID *)
> > > >    _sig_tls = &_my_tls;
> > > >    bool sig_held = false;
> > > >
> > > > +  /* Wait for _main_tls initialization. */
> > > > +  while (!cygwin_finished_initializing)
> > > > +    Sleep (10);
> > > > +
> > > >    sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
> > > >                    my_readsig, my_sendsig);
> > > >
> > > > I guess _main_tls may not be initialized correctly until
> > > > cygwin_finished_initializing is set.
> > > >
> > > > Any comments would be appreciated.
> >
> > This seems reasonable to me.

Thanks Ken and Corinna.

> Missed that, sorry. I agree this seems reasonable, but wouldn't it be
> cleaner if we *start* wait_sig only after cygwin_finished_initializing
> is set to true?

I also thought so, however, there is a comment in dcrt0.cc
as follows. So, there seems to be some reason to start
wait_sig before cygwin_finished_initialization.

/* Initialize signal processing here, early, in the hopes that the creation
   of a thread early in the process will cause more predictability in memory
   layout for the main thread. */
if (!dynamically_loaded)
  sigproc_init ();

--
Takashi Yano
Re: GNU make losing jobserver tokens
On Apr 28 09:42, Ken Brown wrote:
> On 4/27/2022 10:13 AM, Takashi Yano wrote:
> > On Fri, 1 Apr 2022 17:45:51 +0900
> > Takashi Yano wrote:
> > > I have tried to reproduce the issue by building OpenJDK
> > > from source, however, I could not.
> > >
> > > Instead, I encountered another issue.
> > >
> > > Building OpenJDK sometimes (rarely) failed with error such as:
> > >
> > >       0 [sig] make 5484 sig_send: error sending signal 11, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> > >  124917 [main] make 5484 sig_send: error sending signal -72, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> > > common/modules/GensrcModuleInfo.gmk:77: *** open: /home/yano/jdk/build/windows-x86-server-release/make-support/vardeps/make/common/modules/GensrcModuleInfo.gmk/jdk.accessibility/ALL_MODULES.vardeps: No such file or directory. Stop.
> > > make[2]: *** [make/Main.gmk:141: jdk.accessibility-gensrc-moduleinfo] Error 2
> > > make[2]: *** Waiting for unfinished jobs
> > >
> > > I looked into this new problem and found that wait_sig() thread
> > > crashes with segfault. It seems that accessing _main_tls causes
> > > access violation if a signal is sent just after the process is
> > > started.
> > >
> > > static void WINAPI
> > > wait_sig (VOID *)
> > > {
> > >   [...]
> > >       if (!pack.mask)
> > >         {
> > >           tl_entry = cygheap->find_tls (_main_tls);
> > >           dummy_mask = _main_tls->sigmask; // <--- Segfault here
> > >           cygheap->unlock_tls (tl_entry);
> > >           pack.mask = &dummy_mask;
> > >         }
> > >
> > > I also found the following patch resolves the issue.
> > >
> > > diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
> > > index 62df96652..3824af199 100644
> > > --- a/winsup/cygwin/sigproc.cc
> > > +++ b/winsup/cygwin/sigproc.cc
> > > @@ -1325,6 +1325,10 @@ wait_sig (VOID *)
> > >    _sig_tls = &_my_tls;
> > >    bool sig_held = false;
> > >
> > > +  /* Wait for _main_tls initialization. */
> > > +  while (!cygwin_finished_initializing)
> > > +    Sleep (10);
> > > +
> > >    sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
> > >                    my_readsig, my_sendsig);
> > >
> > > I guess _main_tls may not be initialized correctly until
> > > cygwin_finished_initializing is set.
> > >
> > > Any comments would be appreciated.
>
> This seems reasonable to me.

Missed that, sorry. I agree this seems reasonable, but wouldn't it be
cleaner if we *start* wait_sig only after cygwin_finished_initializing
is set to true?

Corinna
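The alternative raised here (start the signal thread only once initialization is complete, instead of making it wait) can be sketched in portable C++. This is only an illustration of the ordering idea, not Cygwin code: `MainState`, `read_mask`, and `run_start_after_init` are hypothetical names, and `std::thread` stands in for whatever thread creation `sigproc_init ()` performs. Whether this ordering is workable inside Cygwin's startup path is not something the sketch addresses.

```cpp
#include <thread>

// Hypothetical stand-in for the state reached through _main_tls.
struct MainState {
  unsigned sigmask;
};

// Hypothetical stand-in for the part of wait_sig() that reads the mask.
unsigned read_mask(const MainState *state) {
  return state->sigmask;
}

// Finish initialization first, then create the thread. Thread creation
// orders the earlier writes before the new thread's reads, so no flag
// and no polling loop are needed.
unsigned run_start_after_init(unsigned mask) {
  MainState state{mask};                              // initialization done here
  unsigned seen = 0;
  std::thread t([&] { seen = read_mask(&state); });   // thread starts afterwards
  t.join();
  return seen;
}
```

The trade-off is that nothing can be delivered to the thread before it exists, so anything that expects it to be running earlier in startup would have to change as well.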
Re: GNU make losing jobserver tokens
On 4/27/2022 10:13 AM, Takashi Yano wrote:
> On Fri, 1 Apr 2022 17:45:51 +0900
> Takashi Yano wrote:
> > I have tried to reproduce the issue by building OpenJDK
> > from source, however, I could not.
> >
> > Instead, I encountered another issue.
> >
> > Building OpenJDK sometimes (rarely) failed with error such as:
> >
> >       0 [sig] make 5484 sig_send: error sending signal 11, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> >  124917 [main] make 5484 sig_send: error sending signal -72, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> > common/modules/GensrcModuleInfo.gmk:77: *** open: /home/yano/jdk/build/windows-x86-server-release/make-support/vardeps/make/common/modules/GensrcModuleInfo.gmk/jdk.accessibility/ALL_MODULES.vardeps: No such file or directory. Stop.
> > make[2]: *** [make/Main.gmk:141: jdk.accessibility-gensrc-moduleinfo] Error 2
> > make[2]: *** Waiting for unfinished jobs
> >
> > I looked into this new problem and found that wait_sig() thread
> > crashes with segfault. It seems that accessing _main_tls causes
> > access violation if a signal is sent just after the process is
> > started.
> >
> > static void WINAPI
> > wait_sig (VOID *)
> > {
> >   [...]
> >       if (!pack.mask)
> >         {
> >           tl_entry = cygheap->find_tls (_main_tls);
> >           dummy_mask = _main_tls->sigmask; // <--- Segfault here
> >           cygheap->unlock_tls (tl_entry);
> >           pack.mask = &dummy_mask;
> >         }
> >
> > I also found the following patch resolves the issue.
> >
> > diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
> > index 62df96652..3824af199 100644
> > --- a/winsup/cygwin/sigproc.cc
> > +++ b/winsup/cygwin/sigproc.cc
> > @@ -1325,6 +1325,10 @@ wait_sig (VOID *)
> >    _sig_tls = &_my_tls;
> >    bool sig_held = false;
> >
> > +  /* Wait for _main_tls initialization. */
> > +  while (!cygwin_finished_initializing)
> > +    Sleep (10);
> > +
> >    sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
> >                    my_readsig, my_sendsig);
> >
> > I guess _main_tls may not be initialized correctly until
> > cygwin_finished_initializing is set.
> >
> > Any comments would be appreciated.

This seems reasonable to me.

Ken
Re: GNU make losing jobserver tokens
On Fri, 1 Apr 2022 17:45:51 +0900
Takashi Yano wrote:
> On Mon, 21 Mar 2022 15:28:17 +0100
> Magnus Ihse Bursie wrote:
> > Hi,
> >
> > I'm working for Oracle on the OpenJDK build team. We're using GNU make
> > to build the JDK on all supported platforms. For Windows, we use Cygwin
> > as our build environment, including the Cygwin version of GNU make.
> >
> > We have had a long-standing issue with make losing jobserver tokens.
> > ("long-standing" here means for years, and years, at least since GNU
> > make 4.0, up to and including the current latest version in Cygwin.)
> >
> > Most runs end with something like:
> >
> > make[2]: INTERNAL: Exiting with 11 jobserver tokens available; should be 12!
> >
> > Since the build still succeeds, and it just affects performance (and
> > typically not that much), we have not spent too much time getting to the
> > bottom of this.
> >
> > Now, however, I've come across a machine where this happens repeatedly,
> > and on a much worse scale:
> >
> > make[2]: INTERNAL: Exiting with 1 jobserver tokens available; should be 24!
> >
> > This effectively turns the highly parallelized builds into
> > single-threaded builds, and is absolutely detrimental for performance.
> > On the flip side, this also makes for the perfect testing environment to
> > really get to the bottom of this issue.
> >
> > I started out by sending a question to bug-m...@gnu.org. The folks over
> > there reported that this was not a known problem with GNU make on
> > Windows in general, and that as far as they knew, the mingw port did not
> > suffer from this problem.
> >
> > Instead, they suggested that it was a Cygwin-specific problem, possibly
> > related to issues with emulating Posix pipes and/or signals in Cygwin.
> >
> > So, my first question is: Is this a known problem in Cygwin GNU make?
> > Are there any workarounds/fixes to get around it?
> >
> > Otherwise: Any suggestions on how to go on and debug this? I am willing
> > to build and test an instrumented debug build of make, but I will need
> > assistance to find my way around the source and spot likely candidates
> > for the source of the problem.
>
> I have tried to reproduce the issue by building OpenJDK
> from source, however, I could not.
>
> Instead, I encountered another issue.
>
> Building OpenJDK sometimes (rarely) failed with error such as:
>
>       0 [sig] make 5484 sig_send: error sending signal 11, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
>  124917 [main] make 5484 sig_send: error sending signal -72, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
> common/modules/GensrcModuleInfo.gmk:77: *** open: /home/yano/jdk/build/windows-x86-server-release/make-support/vardeps/make/common/modules/GensrcModuleInfo.gmk/jdk.accessibility/ALL_MODULES.vardeps: No such file or directory. Stop.
> make[2]: *** [make/Main.gmk:141: jdk.accessibility-gensrc-moduleinfo] Error 2
> make[2]: *** Waiting for unfinished jobs
>
> I looked into this new problem and found that wait_sig() thread
> crashes with segfault. It seems that accessing _main_tls causes
> access violation if a signal is sent just after the process is
> started.
>
> static void WINAPI
> wait_sig (VOID *)
> {
>   [...]
>       if (!pack.mask)
>         {
>           tl_entry = cygheap->find_tls (_main_tls);
>           dummy_mask = _main_tls->sigmask; // <--- Segfault here
>           cygheap->unlock_tls (tl_entry);
>           pack.mask = &dummy_mask;
>         }
>
> I also found the following patch resolves the issue.
>
> diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
> index 62df96652..3824af199 100644
> --- a/winsup/cygwin/sigproc.cc
> +++ b/winsup/cygwin/sigproc.cc
> @@ -1325,6 +1325,10 @@ wait_sig (VOID *)
>    _sig_tls = &_my_tls;
>    bool sig_held = false;
>
> +  /* Wait for _main_tls initialization. */
> +  while (!cygwin_finished_initializing)
> +    Sleep (10);
> +
>    sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
>                    my_readsig, my_sendsig);
>
> I guess _main_tls may not be initialized correctly until
> cygwin_finished_initializing is set.
>
> Any comments would be appreciated.

Ping?

--
Takashi Yano
Re: GNU make losing jobserver tokens
On Mon, 21 Mar 2022 15:28:17 +0100
Magnus Ihse Bursie wrote:
> Hi,
>
> I'm working for Oracle on the OpenJDK build team. We're using GNU make
> to build the JDK on all supported platforms. For Windows, we use Cygwin
> as our build environment, including the Cygwin version of GNU make.
>
> We have had a long-standing issue with make losing jobserver tokens.
> ("long-standing" here means for years, and years, at least since GNU
> make 4.0, up to and including the current latest version in Cygwin.)
>
> Most runs end with something like:
>
> make[2]: INTERNAL: Exiting with 11 jobserver tokens available; should be 12!
>
> Since the build still succeeds, and it just affects performance (and
> typically not that much), we have not spent too much time getting to the
> bottom of this.
>
> Now, however, I've come across a machine where this happens repeatedly,
> and on a much worse scale:
>
> make[2]: INTERNAL: Exiting with 1 jobserver tokens available; should be 24!
>
> This effectively turns the highly parallelized builds into
> single-threaded builds, and is absolutely detrimental for performance.
> On the flip side, this also makes for the perfect testing environment to
> really get to the bottom of this issue.
>
> I started out by sending a question to bug-m...@gnu.org. The folks over
> there reported that this was not a known problem with GNU make on
> Windows in general, and that as far as they knew, the mingw port did not
> suffer from this problem.
>
> Instead, they suggested that it was a Cygwin-specific problem, possibly
> related to issues with emulating Posix pipes and/or signals in Cygwin.
>
> So, my first question is: Is this a known problem in Cygwin GNU make?
> Are there any workarounds/fixes to get around it?
>
> Otherwise: Any suggestions on how to go on and debug this? I am willing
> to build and test an instrumented debug build of make, but I will need
> assistance to find my way around the source and spot likely candidates
> for the source of the problem.

I have tried to reproduce the issue by building OpenJDK
from source, but I could not.

Instead, I encountered another issue.

Building OpenJDK sometimes (rarely) failed with an error such as:

      0 [sig] make 5484 sig_send: error sending signal 11, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
 124917 [main] make 5484 sig_send: error sending signal -72, pid 5484, pipe handle 0x118, nb 0, packsize 176, Win32 error 0
common/modules/GensrcModuleInfo.gmk:77: *** open: /home/yano/jdk/build/windows-x86-server-release/make-support/vardeps/make/common/modules/GensrcModuleInfo.gmk/jdk.accessibility/ALL_MODULES.vardeps: No such file or directory. Stop.
make[2]: *** [make/Main.gmk:141: jdk.accessibility-gensrc-moduleinfo] Error 2
make[2]: *** Waiting for unfinished jobs

I looked into this new problem and found that the wait_sig() thread
crashes with a segfault. It seems that accessing _main_tls causes an
access violation if a signal is sent just after the process is
started.

static void WINAPI
wait_sig (VOID *)
{
  [...]
      if (!pack.mask)
        {
          tl_entry = cygheap->find_tls (_main_tls);
          dummy_mask = _main_tls->sigmask; // <--- Segfault here
          cygheap->unlock_tls (tl_entry);
          pack.mask = &dummy_mask;
        }

I also found that the following patch resolves the issue.

diff --git a/winsup/cygwin/sigproc.cc b/winsup/cygwin/sigproc.cc
index 62df96652..3824af199 100644
--- a/winsup/cygwin/sigproc.cc
+++ b/winsup/cygwin/sigproc.cc
@@ -1325,6 +1325,10 @@ wait_sig (VOID *)
   _sig_tls = &_my_tls;
   bool sig_held = false;
 
+  /* Wait for _main_tls initialization. */
+  while (!cygwin_finished_initializing)
+    Sleep (10);
+
   sigproc_printf ("entering ReadFile loop, my_readsig %p, my_sendsig %p",
                   my_readsig, my_sendsig);

I guess _main_tls may not be initialized correctly until
cygwin_finished_initializing is set.

Any comments would be appreciated.

--
Takashi Yano
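The guard in the patch above can be illustrated outside Cygwin with a small, portable sketch. Nothing here is Cygwin code: `MainState`, `main_state`, `finished_initializing`, `worker`, and `run_with_guard` are hypothetical stand-ins (for `_main_tls`, `cygwin_finished_initializing`, and `wait_sig`), and `std::thread` with `std::atomic` replaces the Win32 calls. The worker is started early, as in the thread's scenario, and polls a flag before it dereferences the shared pointer.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the state that _main_tls points to.
struct MainState {
  unsigned sigmask;
};

static MainState *main_state = nullptr;                 // set late in startup
static std::atomic<bool> finished_initializing{false};  // set last

// Hypothetical stand-in for wait_sig(): it may start before startup is
// complete, so it must not touch main_state until the flag is set.
unsigned worker() {
  // Same idea as the patch: poll until initialization has finished.
  while (!finished_initializing.load(std::memory_order_acquire))
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  return main_state->sigmask;  // pointer was published before the flag
}

// Starts the worker first, then performs the "late" initialization.
unsigned run_with_guard(unsigned mask) {
  finished_initializing.store(false);
  main_state = nullptr;

  unsigned seen = 0;
  std::thread t([&seen] { seen = worker(); });  // worker starts early

  static MainState state;
  state.sigmask = mask;
  main_state = &state;                           // late initialization
  finished_initializing.store(true, std::memory_order_release);

  t.join();
  return seen;
}
```

Without the polling loop, `worker` could read `main_state` while it is still null, which is the analogue of the access violation reported above. A condition variable or event would avoid the 10 ms polling latency; polling is kept here only because it mirrors the patch.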
Re: checking cyg version (was Re: GNU make losing jobserver tokens)
Version number of the "cygwin" Cygwin package:

```
cygcheck -c cygwin
```

Version numbers of all installed Cygwin packages:

```
cygcheck -c
```

Save that information to a file:

```
cygcheck -c > cygwin-package-versions.txt
```

Save more complete information to a file:

```
cygcheck -s -r -v > cygcheck.out
```
Re: GNU make losing jobserver tokens
Hi,

Magnus Ihse Bursie wrote:
> Hi,
>
> I'm working for Oracle on the OpenJDK build team. We're using GNU make
> to build the JDK on all supported platforms. For Windows, we use Cygwin
> as our build environment, including the Cygwin version of GNU make.
>
> We have had a long-standing issue with make losing jobserver tokens.
> ("long-standing" here means for years, and years, at least since GNU
> make 4.0, up to and including the current latest version in Cygwin.)

Parallel builds worked for me on 32-bit Cygwin in the past.

> Most runs end with something like:
>
> make[2]: INTERNAL: Exiting with 11 jobserver tokens available; should be 12!
>
> Since the build still succeeds, and it just affects performance (and
> typically not that much), we have not spent too much time getting to the
> bottom of this.

Now I cannot get it working on 64-bit Cygwin:
https://www.mail-archive.com/cygwin@cygwin.com/msg169861.html
Interesting that your build succeeds.

> [SNIP]
> Instead, they suggested that it was a Cygwin-specific problem, possibly
> related to issues with emulating Posix pipes and/or signals in Cygwin.

There are some issues in my build environment:
- unix domain socket as non administrator
  (https://www.mail-archive.com/cygwin@cygwin.com/msg169832.html).
  If I remember well, those sockets do not work properly in general.
- setup as not admin - package upgrade failed
  (https://www.mail-archive.com/cygwin@cygwin.com/msg169830.html).
The second one looks like an issue with pipes.

> So, my first question is: Is this a known problem in Cygwin GNU make?
> Are there any workarounds/fixes to get around it?

It does not look like an issue in make. It looks like a general issue.
It may be related to Microsoft Windows OS restrictions on user accounts.

> Otherwise: Any suggestions on how to go on and debug this? I am willing
> to build and test an instrumented debug build of make, but I will need
> assistance to find my way around the source and spot likely candidates
> for the source of the problem.

I'm not a regular Cygwin user and I do not have the time or the
environments to test build variations.
Perhaps the build could use a local administrator account, or use fewer
jobs than the number of cores.

> /Magnus

Regards,
Roumen Petrov
Re: checking cyg version (was Re: GNU make losing jobserver tokens)
L A Walsh wrote:
> On 2022/03/21 08:09, Ken Brown wrote:
> > For starters, is your Cygwin installation up to date? Cygwin's internal
> > implementation of pipes was overhauled starting with cygwin-3.3.0.
>
> How does one check the version of cygwin? I've updated cygwin files this
> year, but if I use cygcheck -V, I only see cygwin-3.2, which looks to be
> from last year.
>
> Is that the right way to check the cygwin version?

uname -r

..or the catch-all when I can't remember the -r option:

uname -a

..mark
Re: checking cyg version (was Re: GNU make losing jobserver tokens)
On Tue, Mar 22, 2022 at 12:38:34PM -0700, L A Walsh wrote:
> On 2022/03/21 08:09, Ken Brown wrote:
> >
> > For starters, is your Cygwin installation up to date? Cygwin's internal
> > implementation of pipes was overhauled starting with cygwin-3.3.0.
> How does one check the version of cygwin? I've updated cygwin files this
> year, but if I use cygcheck -V, I only see cygwin-3.2, which looks to be
> from last year.

Unless you're doing something very odd, that implies you're running a
version that's just under a year old, and which won't include the pipe
changes Ken is referring to.

> Is that the right way to check the cygwin version?

Not really: `cygcheck -V` is just reporting the version of cygcheck that
you have installed. That should normally match the version of the core
"cygwin" package, which also includes the cygwin1.dll library that makes
everything else possible.

However "the version of Cygwin" isn't a meaningful concept: a Cygwin
installation is made up of many parts, each of which has its own version
number. There's more detail at
https://cygwin.com/faq/faq.html#faq.what.version
checking cyg version (was Re: GNU make losing jobserver tokens)
On 2022/03/21 08:09, Ken Brown wrote:
> For starters, is your Cygwin installation up to date? Cygwin's internal
> implementation of pipes was overhauled starting with cygwin-3.3.0.

How does one check the version of cygwin? I've updated cygwin files this
year, but if I use cygcheck -V, I only see cygwin-3.2, which looks to be
from last year.

Is that the right way to check the cygwin version?

thanks!
-linda
Re: GNU make losing jobserver tokens in pipes
On 2022-03-22 00:54, Noel Grandin wrote:
> On 2022/03/21 5:09 pm, Ken Brown wrote:
> > On 3/21/2022 10:28 AM, Magnus Ihse Bursie wrote:
> > > We have had a long-standing issue with make losing jobserver tokens.
> > > ("long-standing" here means for years, and years, at least since GNU
> > > make 4.0, up to and including the current latest version in Cygwin.)
>
> It was not that long ago that Linus Torvalds found a bug in the Linux
> kernel pipe implementation which caused GNU make to lose jobserver
> tokens, so possibly researching that bug may shed some light on the
> kinds of things that could be wrong with the Cygwin pipe code.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ddad21d3e99c743a3aa473121dc5561679e26bb
> https://lkml.org/lkml/2019/12/18/1064

Perhaps add "in pipes" to subject to get that maintainer's attention?

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
Re: GNU make losing jobserver tokens
On 2022/03/21 5:09 pm, Ken Brown wrote:
> On 3/21/2022 10:28 AM, Magnus Ihse Bursie wrote:
> > We have had a long-standing issue with make losing jobserver tokens.
> > ("long-standing" here means for years, and years, at least since GNU
> > make 4.0, up to and including the current latest version in Cygwin.)

Hi,

It was not that long ago that Linus Torvalds found a bug in the Linux
kernel pipe implementation which caused GNU make to lose jobserver
tokens, so possibly researching that bug may shed some light on the
kinds of things that could be wrong with the Cygwin pipe code.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ddad21d3e99c743a3aa473121dc5561679e26bb
https://lkml.org/lkml/2019/12/18/1064

Regards,
Noel Grandin
Re: GNU make losing jobserver tokens
On 3/21/2022 10:28 AM, Magnus Ihse Bursie wrote:
> Hi,
>
> I'm working for Oracle on the OpenJDK build team. We're using GNU make
> to build the JDK on all supported platforms. For Windows, we use Cygwin
> as our build environment, including the Cygwin version of GNU make.
>
> We have had a long-standing issue with make losing jobserver tokens.
> ("long-standing" here means for years, and years, at least since GNU
> make 4.0, up to and including the current latest version in Cygwin.)
>
> Most runs end with something like:
>
> make[2]: INTERNAL: Exiting with 11 jobserver tokens available; should be 12!
>
> Since the build still succeeds, and it just affects performance (and
> typically not that much), we have not spent too much time getting to the
> bottom of this.
>
> Now, however, I've come across a machine where this happens repeatedly,
> and on a much worse scale:
>
> make[2]: INTERNAL: Exiting with 1 jobserver tokens available; should be 24!
>
> This effectively turns the highly parallelized builds into
> single-threaded builds, and is absolutely detrimental for performance.
> On the flip side, this also makes for the perfect testing environment to
> really get to the bottom of this issue.
>
> I started out by sending a question to bug-m...@gnu.org. The folks over
> there reported that this was not a known problem with GNU make on
> Windows in general, and that as far as they knew, the mingw port did not
> suffer from this problem.
>
> Instead, they suggested that it was a Cygwin-specific problem, possibly
> related to issues with emulating Posix pipes and/or signals in Cygwin.
>
> So, my first question is: Is this a known problem in Cygwin GNU make?
> Are there any workarounds/fixes to get around it?

No, it's not a known problem.

> Otherwise: Any suggestions on how to go on and debug this? I am willing
> to build and test an instrumented debug build of make, but I will need
> assistance to find my way around the source and spot likely candidates
> for the source of the problem.

For starters, is your Cygwin installation up to date? Cygwin's internal
implementation of pipes was overhauled starting with cygwin-3.3.0.

Ken
GNU make losing jobserver tokens
Hi,

I'm working for Oracle on the OpenJDK build team. We're using GNU make
to build the JDK on all supported platforms. For Windows, we use Cygwin
as our build environment, including the Cygwin version of GNU make.

We have had a long-standing issue with make losing jobserver tokens.
("long-standing" here means for years, and years, at least since GNU
make 4.0, up to and including the current latest version in Cygwin.)

Most runs end with something like:

make[2]: INTERNAL: Exiting with 11 jobserver tokens available; should be 12!

Since the build still succeeds, and it just affects performance (and
typically not that much), we have not spent too much time getting to the
bottom of this.

Now, however, I've come across a machine where this happens repeatedly,
and on a much worse scale:

make[2]: INTERNAL: Exiting with 1 jobserver tokens available; should be 24!

This effectively turns the highly parallelized builds into
single-threaded builds, and is absolutely detrimental for performance.
On the flip side, this also makes for the perfect testing environment to
really get to the bottom of this issue.

I started out by sending a question to bug-m...@gnu.org. The folks over
there reported that this was not a known problem with GNU make on
Windows in general, and that as far as they knew, the mingw port did not
suffer from this problem.

Instead, they suggested that it was a Cygwin-specific problem, possibly
related to issues with emulating POSIX pipes and/or signals in Cygwin.

So, my first question is: Is this a known problem in Cygwin GNU make?
Are there any workarounds/fixes to get around it?

Otherwise: Any suggestions on how to go on and debug this? I am willing
to build and test an instrumented debug build of make, but I will need
assistance to find my way around the source and spot likely candidates
for the source of the problem.

/Magnus
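For readers unfamiliar with the mechanism behind the "jobserver tokens" message: on POSIX-style systems the jobserver is, as far as I understand it, a pipe holding one byte per available job slot. A job reads a byte to claim a slot and writes one back when it finishes, so a token that is never written back shrinks the pool for the rest of the build. The sketch below is a toy model of that idea only. `TokenPool` and `tokens_after_run` are hypothetical names, the non-blocking read is a simplification (make waits for a token), and GNU make's real accounting differs in detail.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Toy token pool: every byte sitting in the pipe is one free job slot.
class TokenPool {
 public:
  explicit TokenPool(int tokens) {
    if (pipe(fds_) != 0) return;
    // Non-blocking read end so acquire() fails instead of waiting when
    // the pool is empty (a simplification for this sketch).
    int flags = fcntl(fds_[0], F_GETFL);
    if (flags == -1 || fcntl(fds_[0], F_SETFL, flags | O_NONBLOCK) == -1) {
      close(fds_[0]);
      close(fds_[1]);
      return;
    }
    ok_ = true;
    for (int i = 0; i < tokens; ++i) release();
  }
  ~TokenPool() {
    if (ok_) {
      close(fds_[0]);
      close(fds_[1]);
    }
  }
  TokenPool(const TokenPool &) = delete;
  TokenPool &operator=(const TokenPool &) = delete;

  // Claim a slot: read one byte from the pipe.
  bool acquire() {
    char c;
    return ok_ && read(fds_[0], &c, 1) == 1;
  }
  // Give a slot back: write one byte to the pipe.
  bool release() {
    const char c = '+';
    return ok_ && write(fds_[1], &c, 1) == 1;
  }
  // Count free slots by draining the pipe and refilling it.
  int available() {
    int n = 0;
    while (acquire()) ++n;
    for (int i = 0; i < n; ++i) release();
    return n;
  }

 private:
  int fds_[2] = {-1, -1};
  bool ok_ = false;
};

// Simulates `jobs` jobs that each claim a token; `lost` of them never
// return it. The result is the number of tokens left at exit.
int tokens_after_run(int total, int jobs, int lost) {
  TokenPool pool(total);
  int held = 0;
  for (int i = 0; i < jobs; ++i)
    if (pool.acquire()) ++held;
  for (int i = 0; i < held - lost; ++i) pool.release();
  return pool.available();
}
```

In this model, `tokens_after_run(12, 12, 1)` leaves 11 tokens, the same shape as the "11 available; should be 12" report, and losing 23 of 24 leaves a single slot, which is why the build degrades to effectively serial execution.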