[bug #64806] "invalid output sync mutex" on windows

2024-02-10 Thread Eli Zaretskii
Follow-up Comment #34, bug#64806 (group make):

That weird problem is with a particular build of the Windows port of GNU Make,
and it is not clear to me on what OS the error was seen.

But yes, it sounds like it's a separate issue.



___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




[bug #64806] "invalid output sync mutex" on windows

2024-02-10 Thread Gergely Pinter
Follow-up Comment #33, bug#64806 (group make):

The original problem statement by Michael Davidsaver (comment #1 - comment
#11) suggested some apparent corruption of a command line argument that
carries the mutex handle number:

> make: *** invalid output sync mutex: �*V8�.  Stop.

I am afraid that what we discovered (closing a handle that is to be inherited
before starting a child) does not explain these strange strings in the error
message, so the two phenomena probably have different root causes.







[bug #64806] "invalid output sync mutex" on windows

2024-02-10 Thread Eli Zaretskii
Follow-up Comment #32, bug#64806 (group make):

I don't think I understand what you mean by "change management aspect", and
why the behavior discussed here doesn't have much to do with the original
problem report.  Please elaborate.







[bug #64806] "invalid output sync mutex" on windows

2024-02-10 Thread Gergely Pinter
Follow-up Comment #31, bug#64806 (group make):

Indeed, probably the smallest change would be just _not_ closing the handle in
_osync_clear()_ [w32os.c] i.e., relying on Windows to auto-close the handle
upon process termination, without any other change in the control flow (this
is the workaround mentioned in my comment #27).

Regarding the change management aspect, I guess we can conclude that the
behavior analyzed in the recent discussion has little to do with the original
problem statement; maybe another bug should be opened for this.






[bug #64806] "invalid output sync mutex" on windows

2024-02-06 Thread Eli Zaretskii
Follow-up Comment #30, bug#64806 (group make):

Thanks for the footwork and detailed information.
This is for Paul to decide, eventually, but I personally would prefer a
simpler way of calling osync_clear on MS-Windows only in the top-level Make. 
AFAIU, this should solve the problem without rocking the boat too much,
because other than this tricky issue, the current implementation was working
well for several releases, whereas changing it to use named mutexes that are
not inherited could easily uncover exciting new problems.

I understand that doing what I propose will make the Posix and MS-Windows
implementation different in one more way, but I don't see much harm in that,
since they are already quite different.  Documenting these aspects clearly
should go a long way towards mitigating this downside.







[bug #64806] "invalid output sync mutex" on windows

2024-02-06 Thread Gergely Pinter
Follow-up Comment #29, bug#64806 (group make):

As proposed by Eli, I have investigated the process of mutex creation and
inheritance; please find below a brief summary of the implementation of key
functions in _posixos.c_ and _w32os.c_, and a proposal for bringing the two
approaches closer to each other and finally eliminating the hang experienced
previously.

According to _os.h_, _osync_setup()_ is a function "called in the parent make
to set up output sync initially":
* In _posixos.c_ this function is implemented by opening a temporary file and
storing its descriptor and name in variables _osync_handle_ and
_osync_tmpfile_ respectively; the handle is configured for _not_ being
inherited.
* In _w32os.c_ this function is implemented by creating an unnamed mutex,
setting its handle to be _inherited_ and storing the handle in _osync_handle_
(since the mutex is unnamed, there is no variable equivalent to
_osync_tmpfile_).

According to _os.h_, _osync_get_mutex()_ is a function that "returns an
allocated buffer containing output sync info to pass to child instances, or
NULL if not needed":
* In _posixos.c_ this function returns a dynamically allocated copy of
_osync_tmpfile_ prefixed with _MUTEX_PREFIX_ ("fnm:"), i.e., a file name
prefixed with a known constant.
* In _w32os.c_ this function returns a dynamically allocated string that holds
_osync_handle_ printed in hexadecimal (note that apparently the _MUTEX_PREFIX_
macro is there in _w32os.c_, but not used).

According to _os.h_, _osync_parse_mutex()_ is a function "called in a child
instance to obtain info on the output sync mutex":
* In _posixos.c_ this function extracts the name of the temporary file
(created by _osync_setup()_ and name prefixed by _osync_get_mutex()_) and
opens it; note that this behavior does not fully conform to the description in
_os.h_, because here we not only "obtain info on" the mutex but actually _gain
access to it_ by opening the file.
* In _w32os.c_ this function extracts the numeric mutex handle (created by
_osync_setup()_ and written into string in hexadecimal by
_osync_get_mutex()_); since here we do not open any object just really extract
information about some inherited resource, this behavior seems to better match
the description in _os.h_.
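
To illustrate the round trip described in the two lists above (handle printed
in hexadecimal by the parent, parsed back by the child), here is a minimal,
portable sketch.  It is not the code from w32os.c: the type sync_handle and
the functions handle_to_string() and string_to_handle() are hypothetical
names, and a pointer-sized value stands in for the Windows HANDLE.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Windows HANDLE: an opaque pointer-sized value. */
typedef void *sync_handle;

/* Parent side: print the handle value in hexadecimal into a new buffer.
   Returns NULL on allocation failure; the caller frees the result.  */
char *
handle_to_string (sync_handle h)
{
  /* Two hex digits per byte, plus "0x" and the terminating NUL.  */
  size_t len = 2 * sizeof (uintptr_t) + 3;
  char *buf = malloc (len);
  if (buf)
    snprintf (buf, len, "0x%llx", (unsigned long long) (uintptr_t) h);
  return buf;
}

/* Child side: recover the numeric value.  Returns 0 on success, -1 if the
   string is not a well-formed hexadecimal number.  Success only means that
   the *number* parsed; it says nothing about which kernel object, if any,
   that number refers to in the child.  */
int
string_to_handle (const char *s, sync_handle *out)
{
  char *end;
  unsigned long long v;

  if (s == NULL || *s == '\0')
    return -1;
  errno = 0;
  v = strtoull (s, &end, 16);
  if (errno != 0 || end == s || *end != '\0' || v > UINTPTR_MAX)
    return -1;
  *out = (sync_handle) (uintptr_t) v;
  return 0;
}
```

In this sketch a string of random bytes is rejected by the parse step, whereas
a stale but well-formed number such as 0x128 passes it; parsing alone cannot
tell whether the number still refers to the mutex.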

According to _os.h_, _osync_clear()_ is a function to "clean up this
instance's output sync facilities":
* In _posixos.c_ this function closes the handle; additionally, if the current
process is the topmost make process, it also deletes the file.  Note that
(unless we are in the topmost make process) this function does not prevent
children from using the temporary file for synchronization, since the file
continues to exist.
* In _w32os.c_ this function closes the handle of the mutex.  Note that
closing the handle prevents children from inheriting it; furthermore, since we
only remembered the handle as a number (the mutex has no name), a child
process calling _osync_parse_mutex()_ will falsely consider the number it
parsed to be an inherited mutex handle.  This is made even worse by the fact
that several pipes are opened upon child process creation, so the numeric
handle will likely be valid but refer to some pipe or file, not the mutex
(this explains the apparent hang: the child process tries to wait for a
"mutex" that is in reality some terminal stream).  This behavior is not a
problem as long as no child process is created after calling _osync_clear()_,
but the hang is triggered if a child process is started after
_osync_clear()_.  Apparently this happens only under quite special
circumstances, namely when a Makefile includes another makefile (component.mk
in my example) that triggers inclusion of further generated makefiles (_.d_
files in my example).
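
The reuse of a handle number can be illustrated with a toy model.  This is
only a sketch under simplifying assumptions: a per-process table whose slot
index plays the role of the handle value, lowest-free-slot allocation, and
every open slot inherited by a child.  Windows does not promise any particular
allocation order; the model merely shows how a number that once denoted the
mutex can come to denote something else.  All names are hypothetical.

```c
/* Kinds of objects a slot can refer to in this toy model.  */
enum obj_kind { OBJ_NONE = 0, OBJ_MUTEX, OBJ_PIPE };

#define TABLE_SIZE 8

/* Toy per-process handle table: the slot index is the "handle number".  */
struct handle_table
{
  enum obj_kind slot[TABLE_SIZE];
};

/* Open an object in the lowest free slot (an assumption of the model).
   Returns the slot number, or -1 if the table is full.  */
int
toy_open (struct handle_table *t, enum obj_kind kind)
{
  for (int i = 0; i < TABLE_SIZE; i++)
    if (t->slot[i] == OBJ_NONE)
      {
        t->slot[i] = kind;
        return i;
      }
  return -1;
}

/* Close a handle: the slot becomes free and may be handed out again.  */
void
toy_close (struct handle_table *t, int h)
{
  if (h >= 0 && h < TABLE_SIZE)
    t->slot[h] = OBJ_NONE;
}

/* What a process that believes H is the mutex would really be waiting on.  */
enum obj_kind
toy_kind (const struct handle_table *t, int h)
{
  return (h >= 0 && h < TABLE_SIZE) ? t->slot[h] : OBJ_NONE;
}

/* Spawn a child: every open slot is inherited at the same number.  */
struct handle_table
toy_spawn (const struct handle_table *parent)
{
  return *parent;
}
```

If a process opens the mutex, closes it, opens a pipe, and then spawns a
child, the child finds a pipe under the number it was told is the mutex.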

Functions _osync_acquire()_ and _osync_release()_ are quite obvious:
* In _posixos.c osync_acquire()_ locks the temporary file, _osync_release()_
unlocks it.
* In _w32os.c osync_acquire()_ waits for the mutex
(_WaitForSingleObject()_), _osync_release()_ releases the mutex
(_ReleaseMutex()_).

Summary:
* _osync_setup()_ creates some object that processes can synchronize on (file
on POSIX, mutex on Windows).
* _osync_get_mutex()_ creates a string representation that can be used for
referring to the synchronization object and can be passed as command line
argument to child processes (prefixed file name on POSIX, handle number on
Windows).
* _osync_parse_mutex()_ extracts the information from the command line
argument and (i) on POSIX it additionally opens the temporary file while (ii)
on Windows the mutex is not opened but expected to be inherited.
* _osync_clear()_ closes the file or mutex; the POSIX implementation
explicitly deletes the temporary file in the topmost make process, while the
Windows implementation relies on the garbage collection upon exiting the last
process that had a handle on the unnamed mutex object.

The key difference between the two approaches is that _osync_parse_mutex()_ on
POSIX only expects that the file used for 

[bug #64806] "invalid output sync mutex" on windows

2024-01-17 Thread Eli Zaretskii
Follow-up Comment #28, bug#64806 (group make):

Thanks, this is important information.

So I think the next step is to understand which call to osync_clear closes the
handle.  Maybe we shouldn't make that call, at least on Windows?

Also, this only happens sometimes, right?  That is, -Otarget sometimes does
work, right?  So it isn't that inheriting mutex handles doesn't work in
general, it's more like sometimes the handle is "taken" after the child
process called osync_clear (which frees the handle for opening any other file
object), and then the handle is no longer usable as a mutex, and a grandchild
process inherits that unusable handle.

So perhaps only the top-level make, the one which calls CreateMutex, should
call CloseHandle on the mutex?  Can you try that?
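
One way to express the idea that only the creating process closes the mutex is
sketched below.  The struct sync_state, its created_here flag, and
sync_clear() are hypothetical and not taken from w32os.c; the close operation
is passed in as a function pointer (standing in for CloseHandle()) so that the
sketch compiles on any platform.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; none of these names come from w32os.c.  */
struct sync_state
{
  void *handle;        /* stands in for the Windows HANDLE */
  bool created_here;   /* true only in the process that created the mutex */
};

/* Stands in for CloseHandle().  */
typedef void (*close_fn) (void *handle);

/* Close the handle only in the process that created it.  An inherited
   handle is left open so that processes started later can still inherit
   it.  Returns true if the handle was actually closed.  */
bool
sync_clear (struct sync_state *s, close_fn do_close)
{
  if (s->handle == NULL || !s->created_here)
    return false;

  do_close (s->handle);
  s->handle = NULL;
  return true;
}

/* Demonstration stub: counts calls instead of closing anything.  */
static int close_calls;

static void
counting_close (void *handle)
{
  (void) handle;
  close_calls++;
}
```

With this shape, a sub-make that calls sync_clear() leaves the inherited
handle alone and relies on process termination to release it, which matches
the workaround of not calling CloseHandle() in the child.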








[bug #64806] "invalid output sync mutex" on windows

2024-01-17 Thread Gergely Pinter
Follow-up Comment #27, bug#64806 (group make):

I went on with analysis of the situation as follows.  When processes are
apparently hung, they are waiting for a handle whose numeric value equals the
handle of the mutex created by the root process (0x128 in the example below). 
At this point Process Explorer indicates that 0x128 in a grandchild process
(18728) that is waiting is not a mutex, but "\Device\ConDrv" (see attached
screenshot).  Surprisingly, if one hits ENTER in the terminal window of "hung"
processes, their execution resumes.  At this point I suspected that when we
think that we are waiting for the mutex in osync_acquire(), in reality we are
waiting for the standard input (which would explain the phenomenon that
hitting ENTER resumes execution).  Added instrumentation that logs in separate
files calls to CreateMutex(), WaitForSingleObject() and CloseHandle() in
w32os.c and CloseHandle() and CreateProcess() in sub_proc.c.  Let the root
process be of PID 2728, its child 7116, grandchild 18728 as shown in the
attached screenshot.  According to the log, the following happens (log entries
are in the form "[PID] FILE:FUNCTION: logged API call", only relevant lines
and only from processes 2728, 7116 and 18728 kept, lines were originally
timestamped and ordered, timestamps are hidden for readability):

[2728]  w32os.c:osync_setup: CreateMutex(): 0x128         <<< 2728 creates mutex with handle 0x128
[...]
[2728]  sub_proc.c:process_begin: CreateProcess: 7116     <<< 2728 creates process 7116 that inherits
                                                          <<< handle 0x128 referring to the mutex
[...]
[7116]  w32os.c:osync_acquire: WaitForSingleObject(0x128) <<< 7116 uses the mutex via handle 0x128
[...]
[7116]  w32os.c:osync_acquire: WaitForSingleObject(0x128) <<< 7116 still uses the mutex via handle 0x128
[...]
[7116]  w32os.c:osync_clear: CloseHandle(0x128)           <<< 7116 closes handle of the mutex
[...]
[7116]  sub_proc.c:process_begin: CreateProcess: 18728    <<< 7116 creates process 18728 that may inherit a handle
                                                          <<< 0x128 but it is not a mutex any more but probably
                                                          <<< some terminal stream, most probably obtained when
                                                          <<< redirecting stdin/out etc.
[...]
[18728] w32os.c:osync_acquire: WaitForSingleObject(0x128) <<< 18728 tries to wait for the "mutex" but in reality
                                                          <<< it is waiting for something else resulting in hang

As indicated in the comments above, if I am not mistaken, the grandchild
process 18728 did not inherit the mutex in handle 0x128 as intended, but 0x128
refers to something else, this seems to be the reason for the hang.

Commenting out the CloseHandle() call in osync_clear() eliminates the hang:
the entire compilation goes fine, even output synchronization is OK.
(Removing CloseHandle() is clearly undesirable practice, but since Windows
should close any open handles when a process terminates, this temporary
workaround probably does not cause a long-term resource leak.)

To sum up: closing the mutex handle before creating a child process that is
expected to use that mutex by inheriting the handle seems to be the root
cause of the phenomenon.  Maybe this is related to inclusion of dependency
files that had to be created by make.


(file #55583)

___

Additional Item Attachment:

File name: mingw32-make-procexp.png   Size:25 KB
   



AGPL NOTICE

These attachments are served by Savane. You can download the corresponding
source code of Savane at
https://git.savannah.nongnu.org/cgit/administration/savane.git/snapshot/savane-d1e0f7a1b19199bb94f733b79bf92733e4fe5029.tar.gz






[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Gergely Pinter
Follow-up Comment #26, bug#64806 (group make):

Killing sub-processes one by one did not change the situation, the rest of
processes stayed hung.






Re: [bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Eli Zaretskii
> Date: Mon, 15 Jan 2024 12:37:42 -0500
> From: Ken Brown 
> 
> This is a long shot, but I had a problem a year ago with parallel make 
> on Cygwin occasionally hanging.  The solution turned out to be to force 
> make's jobserver to use pipes instead of fifos.  If you want to try 
> this, pass make the option '--jobserver-style=pipe'.

Thanks.  But the native Windows port of GNU Make uses neither fifos
nor pipes for the jobserver implementation.  It uses Windows
semaphores.  Moreover, this problem is not with the jobserver, it is
with output-sync feature, which uses Windows mutexes to synchronize
output from separate subprocesses that produce different targets.



Re: [bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Ken Brown
This is a long shot, but I had a problem a year ago with parallel make 
on Cygwin occasionally hanging.  The solution turned out to be to force 
make's jobserver to use pipes instead of fifos.  If you want to try 
this, pass make the option '--jobserver-style=pipe'.


Ken



[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Eli Zaretskii
Follow-up Comment #25, bug#64806 (group make):

The \Device\ConDrv thing is strange, not sure what to make of that. The one
process where the handle is shown as "Mutant" is correct, AFAIU.  Maybe it
means the child processes actually released the mutex, but then why does the
parent keep waiting, and why the children don't exit?

Otherwise, this looks like all of the child processes wait for the mutex
handle, and no process releases it.  What happens if you kill one of the
subprocesses -- do the rest continue being hung, or does the job resume
running?

Another idea is to build Make with additional printf's in osync_acquire and
osync_release to stderr, and see what the output tells us.
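
A possible shape for such instrumentation is sketched below as a small helper
that writes one line per call to stderr.  The helper names are hypothetical;
the line layout "[PID] FILE:FUNCTION: message" is an assumption, and the PID
is passed in by the caller (for example from GetCurrentProcessId() on
Windows) to keep the sketch portable.

```c
#include <stdio.h>

/* Format one trace line as "[PID] FILE:FUNCTION: MESSAGE" into BUF.
   Returns what snprintf returns: the length the full line would have.  */
int
format_trace (char *buf, size_t size, long pid,
              const char *file, const char *func, const char *msg)
{
  return snprintf (buf, size, "[%ld] %s:%s: %s", pid, file, func, msg);
}

/* Emit the line on stderr and flush at once, so that the last lines
   written before a hang are not lost in a buffer.  */
void
trace (long pid, const char *file, const char *func, const char *msg)
{
  char line[256];

  format_trace (line, sizeof line, pid, file, func, msg);
  fprintf (stderr, "%s\n", line);
  fflush (stderr);
}
```

A call such as trace (pid, "w32os.c", "osync_acquire", "waiting") placed just
before the wait, and a matching one after it, would show which process never
returns from the wait.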







[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Gergely Pinter
Follow-up Comment #24, bug#64806 (group make):

I looked for this handle by invoking "handle -a -p mingw", then kept only
those lines of the output where handle 120 was mentioned; this is the
remaining output:

mingw32-make.exe pid: 7496
  120: Mutant
mingw32-make.exe pid: 13060
  120: File  \Device\ConDrv
mingw32-make.exe pid: 22120
  120: File  \Device\ConDrv
mingw32-make.exe pid: 14472
  120: File  \Device\ConDrv
mingw32-make.exe pid: 908
  120: File  \Device\ConDrv
mingw32-make.exe pid: 21044
  120: File  \Device\ConDrv
mingw32-make.exe pid: 5688
  120: File  \Device\ConDrv
mingw32-make.exe pid: 21984
  120: File  \Device\ConDrv
mingw32-make.exe pid: 20800
  120: File  \Device\ConDrv

Although not sure about the interpretation of the output, I would say that the
"root" and the "grandchildren" processes are doing something with handle
0x120.








[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Eli Zaretskii
Follow-up Comment #23, bug#64806 (group make):

Do any other processes of those involved in the hang have this same mutex
handle open?







[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Gergely Pinter
Follow-up Comment #22, bug#64806 (group make):

I am afraid we are now somewhat beyond my Windows competence; here is what I
was able to derive.  Variable osync_handle is (HANDLE) 0x120:

(gdb) attach 14472
(gdb) thread 1
(gdb) bt
#0  0x771f2bcc in ntdll!ZwWaitForSingleObject ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x762fac59 in WaitForSingleObjectEx ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#2  0x762fabb2 in WaitForSingleObject ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#3  0x009070ce in osync_acquire () at src/w32/w32os.c:476
#4  0x008f8148 in output_dump (out=out@entry=0x3a848a8) at src/output.c:280
#5  0x008f28ce in reap_children (block=<optimized out>, block@entry=0,
    err=err@entry=0) at src/job.c:1058
#6  0x008f3aaf in new_job (file=0x3a7c548) at src/job.c:1866
#7  0x008ffe5b in remake_file (file=0x3a7c548) at src/remake.c:1313
#8  update_file_1 (depth=<optimized out>, file=<optimized out>)
    at src/remake.c:905
#9  update_file (file=file@entry=0x3a7c548, depth=<optimized out>)
    at src/remake.c:367
#10 0x009004b7 in check_dep (file=0x3a7c548, depth=depth@entry=1,
    this_mtime=<optimized out>, must_make_ptr=<optimized out>,
    must_make_ptr@entry=0x1bff054) at src/remake.c:1100
#11 0x008ff2ba in update_file_1 (depth=<optimized out>, file=<optimized out>)
    at src/remake.c:633
#12 update_file (file=file@entry=0x3a7c9c8, depth=<optimized out>)
    at src/remake.c:367
#13 0x0090077e in update_goal_chain (goaldeps=<optimized out>)
    at src/remake.c:184
#14 0x009143ce in main (argc=<optimized out>, argv=<optimized out>,
    envp=<optimized out>) at src/main.c:2921

(gdb) f 3
#3  0x009070ce in osync_acquire () at src/w32/w32os.c:476
476   DWORD result = WaitForSingleObject (osync_handle, INFINITE);

(gdb) p osync_handle
$1 = (HANDLE) 0x120


Now the interpretation of this handle is beyond my understanding, since
Process Explorer tells that it is a File handle ("\Device\ConDrv"), see
attached image -- note that I may seriously misunderstand something here. 
Maybe this is related to the original submission by Michael, who in comment #5
suspected a use-after-free situation causing the apparently random strings
printed in their case (just guessing)?

(file #55572)

___

Additional Item Attachment:

File name: mingw32-make-procexp-handle.png Size:4 KB
   









[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Eli Zaretskii
Follow-up Comment #21, bug#64806 (group make):

Can you use ProcessExplorer (or some other SysInternals tool) to find out
which process, if any, holds the handle that is the value of osync_handle, the
handle for which osync_acquire is waiting in the instances that are hung?







[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Gergely Pinter
Follow-up Comment #20, bug#64806 (group make):

Checked for three other make processes: I would say that they are in the same
situation (the processes on the screenshot are still running). 

=Summary=

* 7496 (T1: *process_wait_for_multiple_objects*() at
src/w32/subproc/sub_proc.c:89)
* 3704 (T1: *process_wait_for_multiple_objects*() at
src/w32/subproc/sub_proc.c:89)
* 14472 (T1: *osync_acquire*() at src/w32/w32os.c:476)
* 19576 (T1: *process_wait_for_multiple_objects*() at
src/w32/subproc/sub_proc.c:89)
* 21984 (T1: *osync_acquire*() at src/w32/w32os.c:476)
* 8452 (T1: *process_wait_for_multiple_objects*() at
src/w32/subproc/sub_proc.c:89)
* 5688 (T1: *osync_acquire*() at src/w32/w32os.c:476)
* 17676 (T1: *process_wait_for_multiple_objects*() at
src/w32/subproc/sub_proc.c:89)
* 13060 (T1: *osync_acquire*() at src/w32/w32os.c:476)

=Details=

==7496==

(gdb) attach 7496
(gdb) thread 1
(gdb) bt
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x76304de3 in WaitForMultipleObjectsEx ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#2  0x76304cc8 in WaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#3  0x00907ad5 in process_wait_for_multiple_objects (nCount=9,
    lpHandles=lpHandles@entry=0x1e1ef50, bWaitAll=bWaitAll@entry=0,
    dwMilliseconds=dwMilliseconds@entry=4294967295)
    at src/w32/subproc/sub_proc.c:89
#4  0x00906e58 in jobserver_acquire (timeout=0) at src/w32/w32os.c:370
#5  0x008f3b1b in new_job (file=0x1dfce88) at src/job.c:1882
#6  0x008ffe5b in remake_file (file=0x1dfce88) at src/remake.c:1313
#7  update_file_1 (depth=<optimized out>, file=<optimized out>)
    at src/remake.c:905
#8  update_file (file=file@entry=0x1dfce88, depth=<optimized out>)
    at src/remake.c:367
#9  0x009004b7 in check_dep (file=0x1dfce88, depth=depth@entry=1,
    this_mtime=<optimized out>, must_make_ptr=<optimized out>,
    must_make_ptr@entry=0x1bff244) at src/remake.c:1100
#10 0x008ff2ba in update_file_1 (depth=<optimized out>, file=<optimized out>)
    at src/remake.c:633
#11 update_file (file=file@entry=0x1dfd3c8, depth=<optimized out>)
    at src/remake.c:367
#12 0x0090077e in update_goal_chain (goaldeps=<optimized out>)
    at src/remake.c:184
#13 0x009143ce in main (argc=<optimized out>, argv=<optimized out>,
    envp=<optimized out>) at src/main.c:2921
(gdb) detach


==3704==

(gdb) attach 3704
(gdb) thread 1
(gdb) bt
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x76304de3 in WaitForMultipleObjectsEx ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#2  0x76304cc8 in WaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#3  0x00907ad5 in process_wait_for_multiple_objects (nCount=1,
    lpHandles=0x1bfa940, bWaitAll=0, dwMilliseconds=4294967295)
    at src/w32/subproc/sub_proc.c:89
#4  0x00907c20 in process_wait_for_any_private (block=block@entry=1,
    pdwWaitStatus=pdwWaitStatus@entry=0x0) at src/w32/subproc/sub_proc.c:204
#5  0x00908ad1 in process_wait_for_any (block=block@entry=1,
    pdwWaitStatus=pdwWaitStatus@entry=0x0) at src/w32/subproc/sub_proc.c:307
#6  0x008f1d1f in exec_command (argv=0x1bfea70, envp=0x3ad9d88)
    at src/job.c:2549
#7  0x00914a69 in main (argc=<optimized out>, argv=<optimized out>,
    envp=<optimized out>) at src/main.c:2820
(gdb) detach


==14472==

(gdb) attach 14472
(gdb) thread 1
(gdb) bt
#0  0x771f2bcc in ntdll!ZwWaitForSingleObject ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x762fac59 in WaitForSingleObjectEx ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#2  0x762fabb2 in WaitForSingleObject ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#3  0x009070ce in osync_acquire () at src/w32/w32os.c:476
#4  0x008f8148 in output_dump (out=out@entry=0x3a848a8) at src/output.c:280
#5  0x008f28ce in reap_children (block=<optimized out>, block@entry=0,
    err=err@entry=0) at src/job.c:1058
#6  0x008f3aaf in new_job (file=0x3a7c548) at src/job.c:1866
#7  0x008ffe5b in remake_file (file=0x3a7c548) at src/remake.c:1313
#8  update_file_1 (depth=<optimized out>, file=<optimized out>)
    at src/remake.c:905
#9  update_file (file=file@entry=0x3a7c548, depth=<optimized out>)
    at src/remake.c:367
#10 0x009004b7 in check_dep (file=0x3a7c548, depth=depth@entry=1,
    this_mtime=<optimized out>, must_make_ptr=<optimized out>,
    must_make_ptr@entry=0x1bff054) at src/remake.c:1100
#11 0x008ff2ba in update_file_1 (depth=<optimized out>, file=<optimized out>)
    at src/remake.c:633
#12 update_file (file=file@entry=0x3a7c9c8, depth=<optimized out>)
    at src/remake.c:367
#13 0x0090077e in update_goal_chain (goaldeps=<optimized out>)
    at src/remake.c:184
#14 0x009143ce in main (argc=<optimized out>, argv=<optimized out>,
    envp=<optimized out>) at src/main.c:2921
(gdb) detach


==19576==

(gdb) attach 19576
(gdb) thread 1
(gdb) bt
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x76304de3 in WaitForMultipleObjectsEx ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#2  0x76304cc8 in WaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#3  0x00907ad5 in process_wait_for_multiple_objects (nCount=1,
    lpHandles=0x1bfaf20, bWaitAll=0, dwMilliseconds=4294967295)
    at src/w32/subproc/sub_proc.c:89
#4  0x00907c20 in 

[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Eli Zaretskii
Follow-up Comment #19, bug#64806 (group make):

When the stuff hangs, can you use ProcessExplorer to see which mutexes are
actually used by these Make processes, and how many of them are used?  You can
find the code which manipulates the mutexes in src/w32/w32os.c (look for the
osync_* functions near the end of the file).

Maybe if we know which mutexes are used, that will help us understand what's
going on that causes the hang.







[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Eli Zaretskii
Follow-up Comment #18, bug#64806 (group make):

Very strange.  Looks like the parent process is waiting for the child to
finish, and the child process cannot acquire the mutex and also waits.

Are all of the 8 sub-make's in the same situation?

FWIW, I tried to run the Makefile you sent on my system, using my own MinGW32
build of Make 4.4.1, available here:

 https://sourceforge.net/projects/ezwinports/files/

and I cannot reproduce the hang, although I do see the "Cannot acquire output
lock, disabling output sync" warnings a couple of times during the run.  This
is with GCC 9.2.0 on Windows XP, FTR. Perhaps also relevant: on my system cmp0
to cmp7 jobs run together (in parallel), whereas cmp8 and cmp9 run afterwards
(in parallel with each other, but sequentially after cmp0-cmp7).  Is this
expected, given that I have a Core i7 machine with 4 cores and
hyperthreading?








[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Gergely Pinter
Follow-up Comment #17, bug#64806 (group make):

All 8 processes seem to be hung: see the attached screenshot from Process
Explorer.  Apparently there is a "root" process that has eight children, each
of which has one child.  I have now done a more systematic investigation along
the process creation tree 7496->3704->14472.

Attaching the debugger to the "root" process 7496 and printing call trees in
both threads:

(gdb) attach 7496
(gdb) info threads
  Id   Target Id           Frame
  1    Thread 7496.0x557c  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
* 2    Thread 7496.0x1640  0x771f4f21 in ntdll!DbgBreakPoint ()
   from C:\WINDOWS\SysWOW64\ntdll.dll

(gdb) thread 1
[Switching to thread 1 (Thread 7496.0x557c)]
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
(gdb) bt
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x76304de3 in WaitForMultipleObjectsEx ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#2  0x76304cc8 in WaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#3  0x00907ad5 in process_wait_for_multiple_objects (nCount=9,
    lpHandles=lpHandles@entry=0x1e1ef50, bWaitAll=bWaitAll@entry=0,
    dwMilliseconds=dwMilliseconds@entry=4294967295)
    at src/w32/subproc/sub_proc.c:89
#4  0x00906e58 in jobserver_acquire (timeout=0) at src/w32/w32os.c:370
#5  0x008f3b1b in new_job (file=0x1dfce88) at src/job.c:1882
#6  0x008ffe5b in remake_file (file=0x1dfce88) at src/remake.c:1313
#7  update_file_1 (depth=<optimized out>, file=<optimized out>)
    at src/remake.c:905
#8  update_file (file=file@entry=0x1dfce88, depth=<optimized out>)
    at src/remake.c:367
#9  0x009004b7 in check_dep (file=0x1dfce88, depth=depth@entry=1,
    this_mtime=<optimized out>, must_make_ptr=<optimized out>,
    must_make_ptr@entry=0x1bff244) at src/remake.c:1100
#10 0x008ff2ba in update_file_1 (depth=<optimized out>, file=<optimized out>)
    at src/remake.c:633
#11 update_file (file=file@entry=0x1dfd3c8, depth=<optimized out>)
    at src/remake.c:367
#12 0x0090077e in update_goal_chain (goaldeps=<optimized out>)
    at src/remake.c:184
#13 0x009143ce in main (argc=<optimized out>, argv=<optimized out>,
    envp=<optimized out>) at src/main.c:2921

(gdb) thread 2
[Switching to thread 2 (Thread 7496.0x1640)]
#0  0x771f4f21 in ntdll!DbgBreakPoint () from C:\WINDOWS\SysWOW64\ntdll.dll
(gdb) bt
#0  0x771f4f21 in ntdll!DbgBreakPoint () from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x7722dbc9 in ntdll!DbgUiRemoteBreakin ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#2  0x69d77919 in ?? ()
#3  0x7722db90 in ntdll!DbgUiIssueRemoteBreakin ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#4  0x767bfcc9 in KERNEL32!BaseThreadInitThunk ()
   from C:\WINDOWS\SysWOW64\kernel32.dll
#5  0x771e7c6e in ntdll!RtlGetAppContainerNamedObjectPath ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#6  0x771e7c3e in ntdll!RtlGetAppContainerNamedObjectPath ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#7  0x in ?? ()

(gdb) detach

Attaching the debugger to the "child" process 3704 and printing call trees in
both threads:

(gdb) attach 3704
(gdb) info threads
  Id   Target Id           Frame
  1    Thread 3704.0x50e4  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
* 2    Thread 3704.0x1778  0x771f4f21 in ntdll!DbgBreakPoint ()
   from C:\WINDOWS\SysWOW64\ntdll.dll

(gdb) thread 1
[Switching to thread 1 (Thread 3704.0x50e4)]
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
(gdb) bt
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x76304de3 in WaitForMultipleObjectsEx ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#2  0x76304cc8 in WaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#3  0x00907ad5 in process_wait_for_multiple_objects (nCount=1,
    lpHandles=0x1bfa940, bWaitAll=0, dwMilliseconds=4294967295)
    at src/w32/subproc/sub_proc.c:89
#4  0x00907c20 in process_wait_for_any_private (block=block@entry=1,
    pdwWaitStatus=pdwWaitStatus@entry=0x0) at src/w32/subproc/sub_proc.c:204
#5  0x00908ad1 in process_wait_for_any (block=block@entry=1,
    pdwWaitStatus=pdwWaitStatus@entry=0x0) at src/w32/subproc/sub_proc.c:307
#6  0x008f1d1f in exec_command (argv=0x1bfea70, envp=0x3ad9d88)
    at src/job.c:2549
#7  0x00914a69 in main (argc=<optimized out>, argv=<optimized out>,
    envp=<optimized out>) at src/main.c:2820

(gdb) thread 2
[Switching to thread 2 (Thread 3704.0x1778)]
#0  0x771f4f21 in ntdll!DbgBreakPoint () from C:\WINDOWS\SysWOW64\ntdll.dll
(gdb) bt
#0  0x771f4f21 in ntdll!DbgBreakPoint () from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x7722dbc9 in ntdll!DbgUiRemoteBreakin ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#2  0x58dd79db in ?? ()
#3  0x7722db90 in ntdll!DbgUiIssueRemoteBreakin ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#4  0x767bfcc9 in KERNEL32!BaseThreadInitThunk ()
   from C:\WINDOWS\SysWOW64\kernel32.dll
#5  0x771e7c6e in ntdll!RtlGetAppContainerNamedObjectPath ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#6  0x771e7c3e in ntdll!RtlGetAppContainerNamedObjectPath ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#7  0x in ?? ()


[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Eli Zaretskii
Follow-up Comment #16, bug#64806 (group make):

From the backtrace of the hung process, it looks like we wait forever for a
mutex to be released, which means some child process that holds the mutex
exited.  But the mutex is never released.  Do you see any other related
sub-processes that are hung at that time? Since Make is run with "-j 8", there
could be up to 8 Make sub-processes running for which we are waiting in the
top-level Make.







[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Gergely Pinter
Follow-up Comment #15, bug#64806 (group make):

Having placed a breakpoint on the error() function, it seems that the
"mingw32-make: *** [Makefile:7: cmp3] Error 130" error message comes from this
function call chain:

#0  error (flocp=flocp@entry=0x0, len=len@entry=40,
fmt=fmt@entry=0x60982d "%s[%s: %s] Error %d%s%s") at src/output.c:450
#1  0x005e0373 in child_error (child=child@entry=0x3b5d8d8,
exit_code=<optimized out>, exit_sig=<optimized out>,
coredump=<optimized out>, ignored=<optimized out>, ignored@entry=0)
at src/job.c:584
#2  0x005e2887 in reap_children (block=<optimized out>, block@entry=0,
err=err@entry=0) at src/job.c:990
#3  0x005e3aaf in new_job (file=0x3b4cdc0) at src/job.c:1866
#4  0x005efe5b in remake_file (file=0x3b4cdc0) at src/remake.c:1313
#5  update_file_1 (depth=<optimized out>, file=<optimized out>)
at src/remake.c:905
#6  update_file (file=file@entry=0x3b4cdc0, depth=<optimized out>)
at src/remake.c:367
#7  0x005f04b7 in check_dep (file=0x3b4cdc0, depth=depth@entry=1,
this_mtime=<optimized out>, must_make_ptr=<optimized out>,
must_make_ptr@entry=0x19ff444) at src/remake.c:1100
#8  0x005ef2ba in update_file_1 (depth=<optimized out>, file=<optimized out>)
at src/remake.c:633
#9  update_file (file=file@entry=0x3b4c940, depth=<optimized out>)
at src/remake.c:367
#10 0x005f077e in update_goal_chain (goaldeps=<optimized out>)
at src/remake.c:184
#11 0x006043ce in main (argc=<optimized out>, argv=<optimized out>,
envp=<optimized out>) at src/main.c:2921


However, since this only happens after Ctrl-C-ing the hung process, I guess
this is not the root cause.
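
For what it is worth, one likely reading of the number 130 (an assumption,
not something confirmed in this thread) is the shell convention of reporting
a process that was terminated by signal N as exit status 128 + N; SIGINT,
which Ctrl-C raises, is signal 2.  A minimal sketch in C, where
shell_status_for_signal is a hypothetical helper and not a function in make:

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical helper (not part of GNU Make): the exit status that a
   Bash-like shell conventionally reports for a process terminated by
   signal SIG, namely 128 + SIG.  */
static int
shell_status_for_signal (int sig)
{
  assert (sig > 0);
  return 128 + sig;
}
```

On systems where SIGINT is 2 (this includes Linux and Windows),
shell_status_for_signal (SIGINT) yields 130, which would fit the observation
that the message only shows up after Ctrl-C.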






[bug #64806] "invalid output sync mutex" on windows

2024-01-15 Thread Gergely Pinter
Follow-up Comment #14, bug#64806 (group make):

I can confirm Eli's comment that the relevant difference is apparently using a
mutex for synchronization, not being 32 or 64 bit.  MSys2 comes with several
"environments", the make binary reported hanging was from the "MINGW32"
environment:

$ file /mingw32/bin/mingw32-make.exe
/mingw32/bin/mingw32-make.exe: PE32 executable (console) Intel 80386 (stripped
to external PDB), for MS Windows, 11 sections

This is the make from "MSYS" environment, that runs fine:

$ file /bin/make.exe
/bin/make.exe: PE32+ executable (console) x86-64 (stripped to external PDB),
for MS Windows, 10 sections

According to the comment, I also checked with the "MINGW64" environment:

$ file /mingw64/bin/mingw32-make.exe
/mingw64/bin/mingw32-make.exe: PE32+ executable (console) x86-64 (stripped to
external PDB), for MS Windows, 12 sections

This third binary also hangs, but with this binary I also saw this once (in
which case the compilation went on):

DEP src99.c [cmp3]
DEP src99.c [cmp5]
mingw32-make[1]: warning: Cannot acquire output lock, disabling output sync.
CC  src0.c [cmp0]
CC  src1.c [cmp0]

It may be worth mentioning that the hang occurs exactly at the point where
make has finished creating the dependency files (whose creation was triggered
by an "include" directive) and before the actual compilation starts.  If I
first create the .d files with "-Onone" and then call
"mingw32-make -s -j 8 -Otarget" with the .d files already in place, all 1000
files in the example compile successfully.

I tried to investigate what is going on in the hung processes: (i) re-compiled
mingw32-make from the MSYS2 package with debug symbols, (ii) called
"mingw32-make -s -j 8 -Otarget" as usual, (iii) waited until the processes
hung, (iv) attached gdb to one of the processes and dumped the backtraces of
both threads:


(gdb) info threads
  Id   Target Id           Frame
  1    Thread 4680.0x57ec  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
* 2    Thread 4680.0x16b0  0x771f4f21 in ntdll!DbgBreakPoint ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
(gdb) bt
#0  0x771f4f21 in ntdll!DbgBreakPoint () from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x7722dbc9 in ntdll!DbgUiRemoteBreakin ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#2  0xdb1ac4d4 in ?? ()
#3  0x7722db90 in ntdll!DbgUiIssueRemoteBreakin ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#4  0x767bfcc9 in KERNEL32!BaseThreadInitThunk ()
   from C:\WINDOWS\SysWOW64\kernel32.dll
#5  0x771e7c6e in ntdll!RtlGetAppContainerNamedObjectPath ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#6  0x771e7c3e in ntdll!RtlGetAppContainerNamedObjectPath ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#7  0x in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 4680.0x57ec)]
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
(gdb) bt
#0  0x771f315c in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\ntdll.dll
#1  0x76304de3 in WaitForMultipleObjectsEx ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#2  0x76304cc8 in WaitForMultipleObjects ()
   from C:\WINDOWS\SysWOW64\KernelBase.dll
#3  0x005f7ad5 in process_wait_for_multiple_objects (nCount=8,
lpHandles=0x19faf40, bWaitAll=0, dwMilliseconds=4294967295)
at src/w32/subproc/sub_proc.c:89
#4  0x005f7c20 in process_wait_for_any_private (block=block@entry=1,
pdwWaitStatus=pdwWaitStatus@entry=0x19fefbc)
at src/w32/subproc/sub_proc.c:204
#5  0x005f8ad1 in process_wait_for_any (block=block@entry=1,
pdwWaitStatus=pdwWaitStatus@entry=0x19fefbc)
at src/w32/subproc/sub_proc.c:307
#6  0x005e2659 in reap_children (block=<optimized out>, err=err@entry=0)
at src/job.c:848
#7  0x005f06ce in update_goal_chain (goaldeps=<optimized out>)
at src/remake.c:142
#8  0x006043ce in main (argc=<optimized out>, argv=<optimized out>,
envp=<optimized out>) at src/main.c:2921


I hope this provides some information; unfortunately I am not familiar with
debugging under Windows or the Windows API in general.







[bug #64806] "invalid output sync mutex" on windows

2024-01-14 Thread Eli Zaretskii
Follow-up Comment #13, bug#64806 (group make):

It would be helpful to understand which code causes the "Error 130" message to
be displayed, so as to allow interpreting that error, which might give us some
hint about what's going on.  If 130 is a value obtained from GetLastError,
then its meaning sounds strange for what Make does (something about using a
wrong handle for an open disk partition?).

By the way, I doubt that mingw32-make.exe is a 32-bit program.  All MSYS2
ports are nowadays 64-bit programs, regardless of the fact that the program
says it was built for Windows32.  But that is not important.  The real
difference between make for pc-msys and mingw32-make is that only the latter
uses the mutex for synchronization; the pc-msys port uses a method similar to
the one used on GNU/Linux.
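
To illustrate the kind of non-mutex method meant here: on POSIX systems,
cooperating processes can serialize their output with an advisory lock on a
shared file descriptor.  The sketch below assumes fcntl record locking;
out_set_lock, out_lock and out_unlock are hypothetical names, and this shows
the general technique rather than make's actual code.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: apply lock TYPE (F_WRLCK or F_UNLCK) to the
   whole file behind FD.  F_SETLKW blocks until the lock is granted.
   Returns 0 on success, -1 on failure with errno set by fcntl.  */
static int
out_set_lock (int fd, short type)
{
  struct flock fl;

  assert (fd >= 0);
  memset (&fl, 0, sizeof fl);
  fl.l_type = type;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;                 /* 0 means "through end of file" */
  return fcntl (fd, F_SETLKW, &fl) == -1 ? -1 : 0;
}

/* Take the exclusive lock before writing a block of output.  */
static int
out_lock (int fd)
{
  return out_set_lock (fd, F_WRLCK);
}

/* Drop the lock once the output has been written.  */
static int
out_unlock (int fd)
{
  return out_set_lock (fd, F_UNLCK);
}
```

A process would call out_lock (fd) before writing its collected output and
out_unlock (fd) afterwards, with FD open for writing.  A lock of this kind is
dropped by the kernel when the holding process exits, so no separate handle
has to be passed to or kept alive for the children.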







[bug #64806] "invalid output sync mutex" on windows

2024-01-14 Thread Gergely Pinter
Follow-up Comment #12, bug#64806 (group make):

Hi all,
I've recently encountered some undesired behavior which is likely related to
this open issue; I am attaching a description of the phenomenon along with a
tiny project where it consistently appears.

The environment I used is the MSYS2 distribution (https://www.msys2.org/); the
phenomenon shows up with the MINGW32 version of make (mingw32-make.exe).
Please find attached a project with two makefiles (the top-level Makefile and
the utility component.mk) and 10 “components” (cmp0-cmp9) with 100 tiny
source files each.  These 1000 source files can be compiled in parallel
(apparently the phenomenon described below occurs with a large number of
parallel compilations).  When the “all” goal is requested, make should create
dependency files and then compile the sources (only object files are created;
there is no linking in this example).  Most of the activities can be done in
parallel, e.g., with “make -s -j 8 -Otarget”.

The 4.4.1 version of make is available in the MSYS2 distribution both as
“make” (for the 64-bit MSYS environment) and as “mingw32-make” (for
the 32-bit MINGW32 environment):

$ make --version
GNU Make 4.4.1
Built for x86_64-pc-msys
Copyright (C) 1988-2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ mingw32-make --version
GNU Make 4.4.1
Built for Windows32
Copyright (C) 1988-2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

When running “make -s -j 8 -Otarget” (i.e., using the 64-bit MSYS
version), compilation goes fine.  However, when invoking the same goal with
mingw32-make (“mingw32-make -s -j 8 -Otarget”), the compilation hangs
somewhere around the point where most of the dependency files have been
created:

$ mingw32-make -s -j 8 -Otarget
[...]
DEP src98.c [cmp3]
DEP src98.c [cmp7]
DEP src99.c [cmp3]
DEP src99.c [cmp7]
[hangs here]

At this point 17 mingw32-make.exe processes are running, all at 0% CPU usage,
without any output, apparently in some kind of deadlock.  Hitting Ctrl-C at
this point prints the following output:

mingw32-make[1]: *** Deleting file 'cmp7/src0.o'
mingw32-make[1]: *** Deleting file 'cmp4/src0.o'
mingw32-make[1]: *** Deleting file 'cmp0/src0.o'
mingw32-make[1]: *** Deleting file 'cmp5/src0.o'
mingw32-make[1]: *** Deleting file 'cmp1/src0.o'
mingw32-make[1]: *** Deleting file 'cmp6/src0.o'
mingw32-make[1]: *** Deleting file 'cmp2/src0.o'
mingw32-make[1]: *** Deleting file 'cmp3/src0.o'
mingw32-make: *** [Makefile:7: cmp3] Error 130
mingw32-make: *** [Makefile:7: cmp0] Error 130
mingw32-make: *** [Makefile:7: cmp1] Error 130
mingw32-make: *** [Makefile:7: cmp2] Error 130
mingw32-make: *** [Makefile:7: cmp4] Error 130
mingw32-make: *** [Makefile:7: cmp5] Error 130
mingw32-make: *** [Makefile:7: cmp6] Error 130
mingw32-make: *** [Makefile:7: cmp7] Error 130

The error messages suggest that the mingw32-make processes were already
executing the compilation rules, but apparently weren’t able to make any
progress.  Furthermore, at this point there are several mingw32-make.exe
processes still alive, whose CPU usage jumps to maximum after the Ctrl-C, such
that the PC is practically unresponsive until these processes are killed in
Process Explorer.

The same behavior is experienced when using the “-Oline” output
synchronization directive, but not with “-Onone” (then the compilation
goes fine).  Based on this I would suspect the root cause to be somewhere
around output synchronization, apparently only when using the 32-bit version
of make on Windows.

(file #55565)

___

Additional Item Attachment:

File name: make-output-synchronization-example.tar.gz Size:13 KB
   



AGPL NOTICE

These attachments are served by Savane. You can download the corresponding
source code of Savane at
https://git.savannah.nongnu.org/cgit/administration/savane.git/snapshot/savane-3f5b69a3b837951a0e5c0b7730ee347c798a8844.tar.gz






[bug #64806] "invalid output sync mutex" on windows

2023-11-28 Thread Peter Heesterman
Follow-up Comment #11, bug #64806 (project make):

The utility is built using the MingW tool set.
I would not personally recommend doing so.
 






[bug #64806] "invalid output sync mutex" on windows

2023-11-28 Thread Peter Heesterman
Follow-up Comment #10, bug #64806 (project make):

5.32.1.1-64bit
gmake.exe 14/03/2019 - EPICS base 7.0.7 builds fine with this.

5.38.0.1-64bit
EPICS base contains a file 'print.cpp', which should of course compile to
'print.obj'.
But the make utility puts this in upper case so the output file is
'PRINT.obj'.
This causes a subsequent linker failure, as the file 'print.obj' does not
exist.
This is the most usual and reproducible fault, but there are others.

This fault occurs with:

mingw32-make.exe 04/06/2023 Fails cannot open input file print.obj
make.exe 04/06/2023 Fails cannot open input file print.obj
gmake.exe 04/06/2023 Fails cannot open input file print.obj

I built the GNU make 4.4.1 code from sources using Visual Studio 2022.
This results in 'gnumake.exe'.
EPICS base builds fine with this.







[bug #64806] "invalid output sync mutex" on windows

2023-11-06 Thread Michael Davidsaver
Follow-up Comment #9, bug #64806 (project make):


> Is the brechtsanders build a 32-bit executable or a 64-bit executable?

64-bit as I read it.


bin/make.exe:                PE32 executable (console) Intel 80386 (stripped
to external PDB), for MS Windows, 10 sections
perl/c/bin/mingw32-make.exe: PE32+ executable (console) x86-64 (stripped to
external PDB), for MS Windows, 12 sections
perl/c/bin/make.exe: PE32+ executable (console) x86-64 (stripped to
external PDB), for MS Windows, 12 sections








[bug #64806] "invalid output sync mutex" on windows

2023-11-05 Thread Eli Zaretskii
Follow-up Comment #8, bug #64806 (project make):

Is the brechtsanders build a 32-bit executable or a 64-bit executable?  If
it's a 64-bit executable, maybe the problem only rears its ugly head in a
64-bit build, because the ezwinports stuff is 32-bit.







[bug #64806] "invalid output sync mutex" on windows

2023-11-05 Thread Michael Davidsaver
Follow-up Comment #7, bug #64806 (project make):


> This bug report lacks a reproduction recipe ...

The closest which I think I can come to providing such a recipe is in WINE
where "-Otarget -j" works with the ezwinports build, but not with the
brechtsanders/winlibs_mingw build.

detailed outputs included with:

https://github.com/brechtsanders/winlibs_mingw/issues/174#issuecomment-1793823524

I have asked brechtsanders for more details on this build process and
environment.






[bug #64806] "invalid output sync mutex" on windows

2023-11-05 Thread Michael Davidsaver
Follow-up Comment #6, bug #64806 (project make):


> This bug report lacks a reproduction recipe ...

Yes, I know.  Unfortunately (well, in this instance ;) ) I run Linux
exclusively on my personal systems and only interact with windows via
continuous integration builds.  Troubleshooting with CI is ... tedious.  If I
could reproduce this issue from a local build, I would likely be submitting a
patch instead of this confused investigation into where these two windows
builds originate.  (this whole process reminds me not to take "apt-get source
make" for granted)

> ... it is not clear whether the make.exe binary which exhibits the problem
is the one from ezwinports or something else ...

I observed this issue with the make.exe distributed with the Strawberry Perl
5.38.0.1 installer.  From what I have found, this binary originates as a
"personal build" by Brecht Sanders (see comment #2).  So as far as I can tell,
this binary is _not_ from ezwinports.

> So more information is needed to make any progress with this bug report.

As explained above, there are limits to what additional information I can
provide.

I made a brief test on Linux to rename /usr/bin/make to something longer or
shorter.  Both appear to work, and running with valgrind does not report any
errors.

This makes me wonder whether src/getopt.c is involved?

From what I can tell from src/main.c, it looks like the pointer stored in
argv[0] is being saved prior to calling getopt_long().  Of course, there is a
lot of #ifdef in that file, which makes it difficult for me to follow.






[bug #64806] "invalid output sync mutex" on windows

2023-11-05 Thread Michael Davidsaver
Follow-up Comment #5, bug #64806 (project make):

Quoting @shawnlaffan from
https://github.com/StrawberryPerl/Perl-Dist-Strawberry/issues/148#issuecomment-1783929512

> The root cause of the issue seems to be that the make executables do not
support recursive calls when they have been renamed. Calling as mingw32-make
will work in this case, but not make and I suspect also not gmake. ...

In combination with the variation in error messages which I see, I suspect a
use-after-free of argv[0] or a copy of the same.






[bug #64806] "invalid output sync mutex" on windows

2023-10-28 Thread Eli Zaretskii
Update of bug #64806 (project make):

           Triage Status: None => Need Info

___

Follow-up Comment #4:

This bug report lacks a reproduction recipe, preferably one that doesn't
require downloading huge packages and trying to build them.

In addition, it is not clear whether the make.exe binary which exhibits the
problem is the one from ezwinports or something else, and if it is from
ezwinports, what package exactly was downloaded from ezwinports (there are 2
binary packages of GNU Make there).

So more information is needed to make any progress with this bug report.







[bug #64806] "invalid output sync mutex" on windows

2023-10-28 Thread Michael Davidsaver
Follow-up Comment #3, bug #64806 (project make):

"choco install make" is stated to be a build from  
https://sourceforge.net/projects/ezwinports

https://bitbucket.org/xoviat/chocolatey-packages/src/master/make/4.4.1/tools/VERIFICATION.txt

At this point I am not sure I can provide any additional input.  Both
builds seem to be of unmodified upstream source.  However, they behave
differently.






[bug #64806] "invalid output sync mutex" on windows

2023-10-24 Thread Michael Davidsaver
Follow-up Comment #2, bug #64806 (project make):

Following the trail from the strawberry perl installer leads to:

https://github.com/brechtsanders/winlibs_mingw/issues/174







[bug #64806] "invalid output sync mutex" on windows

2023-10-24 Thread Michael Davidsaver
Follow-up Comment #1, bug #64806 (project make):

Correction.  We intended to use make 4.4.1 via chocolatey.  It appears that
the latest strawberry perl release 5.38.0.1 now installs "make.exe".  Previous
releases only installed "gmake.exe".  This executable also identifies as make
4.4.1, but I am unsure how it is built.







[bug #64806] "invalid output sync mutex" on windows

2023-10-22 Thread Michael Davidsaver
URL:
  

                 Summary: "invalid output sync mutex" on windows
                   Group: make
               Submitter: mdavidsaver
               Submitted: Sun 22 Oct 2023 09:38:22 PM UTC
                Severity: 3 - Normal
              Item Group: Bug
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
       Component Version: 4.4.1
        Operating System: MS Windows
           Fixed Release: None
           Triage Status: None


___

Follow-up Comments:


---
Date: Sun 22 Oct 2023 09:38:22 PM UTC By: Michael Davidsaver 
I have recently found that recursive invocations of make 4.4.1 (from
chocolatey) in the github actions windows-2019 and -2022 images error out
with:


...
  make -C ./configure install
  make: *** invalid output sync mutex: --.  Stop.
  make: *** [configure/RULES_DIRS:85: configure.install] Error 2
...


The exact message differs from run to run.


   make: *** invalid output sync mutex: ���.  Stop.



  make: *** invalid output sync mutex: �*V8�.  Stop.


Our builds were running "make -Otarget ...".  The error message was enough of
a hint that I tried removing the "-Otarget".  This avoids the error.
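
The garbage after the colon suggests that the string a sub-make received as
its mutex argument did not parse as a handle value.  As a rough illustration
only (it assumes the handle is passed as a hexadecimal string, and
parse_handle_arg is a hypothetical name, not make's own function), a strict
parser would reject any such string:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical sketch: parse ARG as a hexadecimal handle value.
   Returns 0 and stores the value in *OUT on success; returns -1 if
   ARG holds no digits, is out of range, or has trailing characters
   that are not hex digits.  Like strtoull itself, it tolerates
   leading whitespace, a sign, and a 0x prefix.  */
static int
parse_handle_arg (const char *arg, unsigned long long *out)
{
  char *end;
  unsigned long long value;

  assert (arg != NULL && out != NULL);
  errno = 0;
  value = strtoull (arg, &end, 16);
  if (errno != 0 || end == arg || *end != '\0')
    return -1;
  *out = value;
  return 0;
}
```

With this sketch, parse_handle_arg ("--", &h) fails, much like the "--" shown
in the first message above, while a clean string such as "1a4" is accepted.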

make with "-Otarget" had been working in this environment for some time, and
as recently as 3 weeks ago.  This continues to do so in the appveyor
environment (today at least).  So this may be connected with an update to the
GHA windows images last week.

for reference: https://github.com/epics-base/ci-scripts/issues/84






