[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-11-05 Thread Paul Moore

Paul Moore  added the comment:

I'm not actually sure what the proposal here is. Are we suggesting that all 
Python's means of terminating a process should use the same exit code?

Note that doing so would be a backward compatibility break, as os.kill() is 
documented as having the behaviour seen here (it's just that SIGTERM isn't a 
particularly meaningful value to use on Windows). subprocess terminate() 
doesn't document the exit code sent on Windows, and maybe should - but 1 seems 
a reasonable value (it's the C EXIT_FAILURE code after all). I don't fully 
understand the issue multiprocessing is trying to solve, but it seems to be 
around signals, which are very different between Windows and Unix anyway.

So, in summary - I'd need to see a specific proposal, but my instinct is that 
this is only an issue if you're trying to cover over the differences between 
Unix and Windows, and this isn't a case where I think that's advisable (the 
current situation is "good enough" if you don't care, and if you do, you have 
the means to do it right, you just need to cater for the platform differences 
yourself, in a way that suits your application.).
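
As a rough illustration of catering for the platform differences in application code, here is a minimal sketch. `describe_exit` is a hypothetical helper, not anything in the standard library, and the wording of the returned strings is an arbitrary choice:

```python
import signal
import sys


def describe_exit(returncode, platform=sys.platform):
    """Describe a child's exit status (hypothetical helper, not stdlib)."""
    if returncode == 0:
        return 'exited normally'
    if platform != 'win32' and returncode < 0:
        # POSIX: subprocess and multiprocessing report -N for death by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = 'signal %d' % -returncode
        return 'killed by %s' % name
    # Windows (or a positive code on POSIX): a forced kill and an ordinary
    # failure cannot be told apart from the exit code alone.
    return 'exited with code %d' % returncode
```

An application that needs to recognise its own forced kills on Windows would still have to pick and check a distinctive exit code itself.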

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-11-05 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

I would like to know what our resident Windows users think about this (Paul, 
Steve, Zach).

Reading the above arguments, I'd be inclined to settle on 15 (that is, the 
non-negative "signal" number).  While it is not consistent with what "taskkill" 
or other APIs do, it makes it clear that the process was terminated in a 
certain way.  Certainly, there is a slight chance that 15 is a legitimate error 
code returned by the process, but that is far less likely than returning 1 as a 
legitimate error code, which I presume is extremely common.

In any case, this can't go in a bugfix release, so marking as 3.7-only.

--
versions:  -Python 3.6, Python 3.8




[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-10-27 Thread Eryk Sun

Eryk Sun  added the comment:

If a multiprocessing Process gets terminated by any means other than its 
terminate() method, it won't get the special TERMINATE (0x10000) exit code 
that allows the object to pretend the exit status is POSIX -SIGTERM. In 
general, the exit code will be 1. IMO, Process.terminate should be consistent 
with the typical exit code of 1 and thus consistent with Popen.terminate. 
However, I'm adding Davin and Antoine to the nosy list in case they disagree, 
before you go to the trouble of creating a PR.

--
nosy: +davin, pitrou
versions: +Python 3.8




[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-10-26 Thread Akos Kiss

Akos Kiss  added the comment:

And I thought that my analysis was thorough... Exit code 1 is the way to go, I 
agree now.

--




[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-10-26 Thread Eryk Sun

Eryk Sun  added the comment:

A C/C++ program returns EXIT_FAILURE for a generic failure. Microsoft defines 
this macro value as 1. Most tools that a user might use to forcibly terminate a 
process don't allow specifying the reason; they just use the generic value of 
1. This includes Task Manager, taskkill.exe /f, the WDK's kill.exe -f, and 
Sysinternals pskill.exe and Process Explorer. subprocess and multiprocessing 
should also use 1 to be consistent.

The system itself doesn't distinguish a forced termination from a normal exit. 
Ultimately every thread and process gets terminated by the system calls 
NtTerminateThread and NtTerminateProcess (or the equivalent Process Manager 
private functions PspTerminateThreadByPointer, PspTerminateProcess, etc). 
Windows API TerminateThread and TerminateProcess are light wrappers around the 
corresponding system calls.

ExitThread and ExitProcess (actually implemented as RtlExitUserThread and 
RtlExitUserProcess in ntdll.dll) are within-process calls that integrate with 
the loader's LdrShutdownThread and LdrShutdownProcess routines. This allows the 
loader to call the entry points for loaded DLLs with DLL_THREAD_DETACH or 
DLL_PROCESS_DETACH, respectively. ExitThread also handles deallocating the 
thread's stack. Beyond that, the bulk of the work is handled by 
NtTerminateThread and NtTerminateProcess. For ExitProcess, NtTerminateProcess 
is actually called twice -- the first time it's called with a NULL process 
handle to kill the other threads in the current process. After 
LdrShutdownProcess returns, NtTerminateProcess is called again to truly 
terminate the process.

> PowerShell and .NET ... `System.Diagnostics.Process.Kill()` ... 
> `TerminateProcess` is called with -1

.NET is in its own (cross-platform) managed-code universe. I don't know why the 
developers decided to make Kill() use -1 (0xFFFFFFFF) as the exit code. I can 
guess that they negated the conventional EXIT_FAILURE value to indicate a 
signal-like kill. I think it's an odd decision, and I'm not inclined to favor 
it over behaviors that predate the existence of .NET. 

Making the ExitCode property a signed integer in .NET is easy to understand, 
and not a cause for concern since it's only a matter of interpretation. Note 
that the return value from wmain() or wWinMain() is a signed integer. Also, the 
two fundamental status result types in Windows -- NTSTATUS [1] and HRESULT [2] 
-- are 32-bit signed integers (warnings and errors are negative). Internally, 
the NT Process object's EPROCESS structure defines ExitStatus as an NTSTATUS 
value. You can see in a kernel debugger that it's a 32-bit signed integer 
(Int4B):

lkd> dt nt!_eprocess ExitStatus
   +0x624 ExitStatus : Int4B

Python also wants the exit code to be a signed value. If we try to exit with an 
unsigned value that exceeds 0x7FFF_FFFF, it instead uses a default code of -1 
(0xFFFF_FFFF). For example:

>>> hex(subprocess.call('python -c "raise SystemExit(0x8000_0000)"'))
'0xffffffff'

Using the corresponding signed integer works fine:

>>> 0x8000_0000 - 2**32
-2147483648
>>> hex(subprocess.call('python -c "raise SystemExit(-2_147_483_648)"'))
'0x80000000'
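
The signed/unsigned reinterpretation discussed above can be sketched in plain Python. The helper names `to_signed32` and `to_unsigned32` are made up for illustration; they only do the two's-complement arithmetic and do not call any Windows API:

```python
def to_signed32(code):
    """Reinterpret an exit code as a signed 32-bit integer."""
    code &= 0xFFFF_FFFF
    return code - 2**32 if code >= 2**31 else code


def to_unsigned32(code):
    """Reinterpret an exit code as an unsigned 32-bit integer."""
    return code & 0xFFFF_FFFF
```

With these, the unsigned value 4294967295 corresponds to -1, and -2147483648 corresponds to 0x8000_0000.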

[1]: https://msdn.microsoft.com/en-us/library/cc231200
[2]: https://msdn.microsoft.com/en-us/library/cc231198


> termination by a signal "terminates the calling program with 
> exit code 3"

MS C raise() defaults to calling exit(3). I don't know why it uses the value 3; 
it's a legacy value from the MS-DOS era. Python doesn't directly expose C 
raise(), so this exit code only occurs in rare circumstances.

Note that SIGINT and SIGBREAK are based on console control events, and in this 
case the default behavior (i.e. SIG_DFL) is not to call exit(3) but rather to 
continue to the next registered console control handler. This is normally the 
Windows default handler (i.e. kernelbase!DefaultHandler), which calls 
ExitProcess with STATUS_CONTROL_C_EXIT. When closing the console itself (i.e. 
CTRL_CLOSE_EVENT), if a control handler in a console client returns TRUE, the 
default handler doesn't get called, but (starting with NT 6.0) the process 
still has to be terminated. In this case the session server, csrss.exe, calls 
NtTerminateProcess with STATUS_CONTROL_C_EXIT.

The exit code also isn't normally 3 for SIGABRT when abort() (i.e. os.abort in 
Python) gets called. In a release build, abort() defaults to using the 
__fastfail intrinsic (i.e. INT 0x29 on x64 systems) with the code 
FAST_FAIL_FATAL_APP_EXIT. This terminates the process with a 
STATUS_STACK_BUFFER_OVERRUN exception. By design, a __fastfail exception cannot 
be handled. An attached debugger only sees it as a second-chance exception. 
(Ideally they should have split this functionality into multiple status codes, 
since a __fastfail isn't necessarily due to a stack buffer overrun.) The 
error-reporting dialog may change the exit status to 255 in this case, but you 
can suppress this dialog 

[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-10-26 Thread Akos Kiss

Akos Kiss  added the comment:

A follow-up: in addition to `taskkill`, I've taken a look at another "official" 
way of killing processes, the `Stop-Process` PowerShell cmdlet 
(https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.management/stop-process?view=powershell-5.1).
Yet again, documentation is scarce on what the exit code of the terminated 
process will be. But the PowerShell and .NET code base is open source, so I've 
dug a bit deeper and found that `Stop-Process` is based on 
`System.Diagnostics.Process.Kill()` 
(https://github.com/PowerShell/PowerShell/blob/master/src/Microsoft.PowerShell.Commands.Management/commands/management/Process.cs#L1240),
while `Process.Kill()` uses the `TerminateProcess` Win32 API 
(https://github.com/dotnet/corefx/blob/master/src/System.Diagnostics.Process/src/System/Diagnostics/Process.Windows.cs#L93).
Interestingly, `TerminateProcess` is called with -1 (this was surprising, to 
me at least, as the exit code is unsigned on Windows AFAIK).

Therefore, I've added two new "kill" implementations to my original code 
experiment (won't repeat the whole code here, just the additions):

```py
def kill_with_taskkill(proc):
    print('kill child with taskkill /F')
    subprocess.run(['taskkill', '/F', '/pid', '%s' % proc.pid], check=True)

def kill_with_stopprocess(proc):
    print('kill child with powershell stop-process')
    subprocess.run(['powershell', 'stop-process', '%s' % proc.pid], check=True)
```

And I got:

```
run subprocess child with subprocess-taskkill
child process started with subprocess-taskkill
kill child with taskkill /F
SUCCESS: The process with PID 4024 has been terminated.
child terminated with 1
run subprocess child with subprocess-stopprocess
child process started with subprocess-stopprocess
kill child with powershell stop-process
child terminated with 4294967295

run multiprocessing child with multiprocessing-taskkill
child process started with multiprocessing-taskkill
kill child with taskkill /F
SUCCESS: The process with PID 5988 has been terminated.
child terminated with 1
run multiprocessing child with multiprocessing-stopprocess
child process started with multiprocessing-stopprocess
kill child with powershell stop-process
child terminated with 4294967295
```

My takeaways from the above are that
1) Windows is not consistent across itself,
2) 1 is not the only "valid" "terminated forcibly" exit code, and
3) negative exit code does not work, even if MS itself tries to use it.

BTW, I really think that killing a process with a code of 1 is questionable, as 
quite some apps return 1 themselves just to signal error (but proper 
termination). This makes it hard to tell applications' own error signaling and 
forced kills apart. But that's a personal opinion.

--




[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-10-26 Thread Akos Kiss

Akos Kiss  added the comment:

`taskkill /F` sets exit code to 1, indeed. (Confirmed by experiment. Cannot 
find this behaviour documented, though.)

On the other hand, MS Docs state 
(https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/signal#remarks)
that termination by a signal "terminates the calling program with exit code 
3". (So, there may be other "valid" exit codes, too.)

--




[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-10-25 Thread Eryk Sun

Change by Eryk Sun :


--
stage:  -> needs patch
versions: +Python 3.6, Python 3.7




[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-10-25 Thread Eryk Sun

Eryk Sun  added the comment:

Setting the exit code to the negative of a C signal value isn't generally 
meaningful in Windows. It seems multiprocessing doesn't have a significant use 
for this, other than getting a formatted exit code in the repr via its 
_exitcode_to_name dict. For example:

p = multiprocessing.Process(target=time.sleep, args=(30,))
p.start()
p.terminate()

>>> p
<Process(Process-1, stopped[SIGTERM])>

This may mislead people into incorrectly thinking that Windows implements POSIX 
signals. Python uses the C runtime's emulation of the basic set of required 
signals. SIGSEGV, SIGFPE, and SIGILL are based on exceptions. SIGINT and 
SIGBREAK are based on console control events. SIGABRT and SIGTERM are for use 
with C `raise`. Additionally, Python implements os.kill via TerminateProcess 
and GenerateConsoleCtrlEvent. (The latter takes process group IDs, so it should 
have been used to implement os.killpg instead. Its use in os.kill is wrong and 
confusing.)

The normal exit code for a forced shutdown is 1, which you can confirm via Task 
Manager or `taskkill /F`. subprocess is correct here. I think multiprocessing 
should follow suit.
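
For reference, a name table of the kind mentioned above can be built from the `signal` module roughly as follows. This is a simplified sketch rather than the actual multiprocessing source; `exitcode_to_name` and `format_status` are illustrative names, and the `stopped[...]` format is an assumption modelled on the Process repr:

```python
import signal

# Map negated signal numbers to their names, e.g. -15 -> 'SIGTERM'.
exitcode_to_name = {}
for name, signum in list(signal.__dict__.items()):
    # Keep SIGxxx constants, skip handlers such as SIG_DFL and SIG_IGN.
    if name[:3] == 'SIG' and '_' not in name:
        exitcode_to_name[-signum] = name


def format_status(exitcode):
    """Format an exit code, using a signal name when one matches."""
    return 'stopped[%s]' % exitcode_to_name.get(exitcode, exitcode)
```

Note that the lookup is purely cosmetic: a code of -15 is shown as SIGTERM whether or not any signal was ever delivered.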

--
nosy: +eryksun




[issue31863] Inconsistent returncode/exitcode for terminated child processes on Windows

2017-10-24 Thread Akos Kiss

New submission from Akos Kiss :

I've been working with various approaches for running and terminating 
subprocesses on Windows, and I obtained surprisingly different results 
depending on the modules and ways of termination used. Here is the script I 
wrote; it uses the `subprocess` and `multiprocessing` modules for starting new 
subprocesses, and process termination is performed either by the modules' own 
`terminate` functions or by `os.kill`.

```py
import multiprocessing
import os
import signal
import subprocess
import sys
import time

def kill_with_os_kill(proc):
    print('kill with os.kill(pid,SIGTERM)')
    os.kill(proc.pid, signal.SIGTERM)

def kill_with_terminate(proc):
    print('kill child with proc.terminate()')
    proc.terminate()

def run_and_kill_subprocess(killfn, procarg):
    print('run subprocess child with %s' % procarg)
    with subprocess.Popen([sys.executable, __file__, procarg]) as proc:
        time.sleep(1)
        killfn(proc)
        proc.wait()
    print('child terminated with %s' % proc.returncode)

def run_and_kill_multiprocessing(killfn, procarg):
    print('run multiprocessing child with %s' % procarg)
    proc = multiprocessing.Process(target=childmain, args=(procarg,))
    proc.start()
    time.sleep(1)
    killfn(proc)
    proc.join()
    print('child terminated with %s' % proc.exitcode)

def childmain(arg):
    print('child process started with %s' % arg)
    while True:
        pass

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print('parent process started')
        run_and_kill_subprocess(kill_with_os_kill, 'subprocess-oskill')
        run_and_kill_subprocess(kill_with_terminate, 'subprocess-terminate')
        run_and_kill_multiprocessing(kill_with_os_kill,
                                     'multiprocessing-oskill')
        run_and_kill_multiprocessing(kill_with_terminate,
                                     'multiprocessing-terminate')
    else:
        childmain(sys.argv[1])
```

On macOS, everything works as expected (and I think that Linux will behave 
alike):

```
$ python3 killtest.py 
parent process started
run subprocess child with subprocess-oskill
child process started with subprocess-oskill
kill with os.kill(pid,SIGTERM)
child terminated with -15
run subprocess child with subprocess-terminate
child process started with subprocess-terminate
kill child with proc.terminate()
child terminated with -15
run multiprocessing child with multiprocessing-oskill
child process started with multiprocessing-oskill
kill with os.kill(pid,SIGTERM)
child terminated with -15
run multiprocessing child with multiprocessing-terminate
child process started with multiprocessing-terminate
kill child with proc.terminate()
child terminated with -15
```

But on Windows, I got:

```
>py -3 killtest.py
parent process started
run subprocess child with subprocess-oskill
child process started with subprocess-oskill
kill with os.kill(pid,SIGTERM)
child terminated with 15
run subprocess child with subprocess-terminate
child process started with subprocess-terminate
kill child with proc.terminate()
child terminated with 1
run multiprocessing child with multiprocessing-oskill
child process started with multiprocessing-oskill
kill with os.kill(pid,SIGTERM)
child terminated with 15
run multiprocessing child with multiprocessing-terminate
child process started with multiprocessing-terminate
kill child with proc.terminate()
child terminated with -15
```

Notes:
- On Windows with `os.kill(pid, sig)`, "sig will cause the process to be 
unconditionally killed by the TerminateProcess API, and the exit code will be 
set to sig." I.e., it is not possible to detect on Windows whether a process 
was terminated by a signal or exited properly, because `kill` does not 
actually raise a signal and no Windows API makes it possible to differentiate 
between proper and forced termination.
- The `multiprocessing` module has a workaround for this by terminating the 
process with a designated exit code (`TERMINATE = 0x10000`) and checking for 
that value afterwards, rewriting it to `-SIGTERM` if found. The related 
documentation is a bit misleading, as `exitcode` is meant to have "negative 
value -N [which] indicates that the child was terminated by signal N" -- 
however, if the process was indeed killed with `SIGTERM` (and not via 
`terminate`), then `exitcode` will be `SIGTERM` and not `-SIGTERM` (see above). 
(The documentation of `terminate` does not clarify the situation much by 
stating that "on Windows TerminateProcess() is used", since it does not mention 
the special exit code -- and well, it's not even a signal after all, so it's 
not obvious whether a negative or a positive exit code is to be expected.)
- The `subprocess` module chooses the quite arbitrary exit code of 1 and 
documents that "negative value -N indicates that the child was terminated by 
signal N" is POSIX only, not mentioning anything about what to expect on 
Windows.
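
The `multiprocessing` workaround described in the notes can be sketched as follows. This is a simplified illustration, not the module's actual code: `normalize_exitcode` is a hypothetical name, and the value 0x10000 for `TERMINATE` is assumed to match the constant the module uses:

```python
import signal

# Designated exit code passed to TerminateProcess by terminate() (assumed value).
TERMINATE = 0x10000


def normalize_exitcode(code):
    """Rewrite the designated exit code to -SIGTERM, pass others through."""
    if code == TERMINATE:
        return -signal.SIGTERM
    return code
```

This shows why the results differ: a child killed via `os.kill(pid, SIGTERM)` exits with 15, which is passed through unchanged, while only the designated code is rewritten to -15.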

Long story short: on Windows, the observable exit code of a forcibly terminated 
child process is quite inconsistent even acros