[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-19 Thread STINNER Victor


STINNER Victor  added the comment:

> Its interesting to read the comment on the IA64 definition for SIGSTKSZ:

Python doesn't support IA64 architecture.

Note: I'm not sure that this architecture is going to survive in the long 
term... More and more systems stopped to support it.


> Well, one argument for the dynamic approach is that existing python
binaries can adjust without needing to be respun for new CPUs.

PR 13649 gets the default thread stack size, it doesn't get SIGSTKSZ. I expect 
a thread stack size to be way larger than SIGSTKSZ.

For on my Fedora 30 (Linux kernel 5.1.19-300.fc30.x86_64, 
glibc-2.29-15.fc30.x86_64) on x86-64, SIGSTKSZ is just 8 KiB (8192 bytes) 
whereas pthread_attr_getstacksize() returns 8 MiB (8388608 bytes).

As I already wrote, using 8 MiB instead of 8 KiB looks like a waste of memory: 
faulthandler is only useful for stack overflow, which is a very unlikely bug.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-15 Thread Peter Edwards


Peter Edwards  added the comment:

On Wed, 14 Aug 2019 at 23:13, STINNER Victor  wrote:

>
> STINNER Victor  added the comment:
>
> About PR 13649, I'm not sure that _PyThread_preferred_stacksize() is still
> relevant, since my change fixed test_faulthandler test_register_chain(). I
> chose my change since it's less invasive: it only impacts faulthandler, and
> it minimalizes the memory usage (especially when faulthandler is not used).
>

Sure - there's no reason for it to exist if you don't want to use it to fix
the issue here.

> Python/thread_pthread.h refactor changes of PR 13649 are interested. Would
> you like to extract them into a new PR which doesn't add
> _PyThread_preferred_stacksize() but just add new PLATFORM_xxx macros?
>

Yes, certainly.

Maybe test_faulthandler will fail tomorrow on a new platform, but I prefer
> to open a discussion once such case happens, rather than guessing how
> faulthandler can crash on an hypothetical platforms.

Well, one argument for the dynamic approach is that existing python
binaries can adjust without needing to be respun for new CPUs. I think
SIGSTKSZ is a vestage from when CPU architectures had consistently sized
register sets across models.  Its interesting to read the comment on the
IA64 definition for SIGSTKSZ:

https://github.com/torvalds/linux/blob/master/arch/ia64/include/uapi/asm/signal.h#L83

> I'm sure that libc developers are well aware of the FPU state size and
> update SIGSTKSZ accordingly.
>

The current value comes from the kernel sources, and has not changed since
at the latest 2005 (with the initial git commit of the kernel), which I
think predates xsave/xrestore by some margin. I don't think its a useful
measure of anything in the real (x86) world today.

> glibc code computing xsave_state_size:
>
>
> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/cpu-features.c;h=4bab1549132fe8a4c203a70b8c7a51c1dc304049;hb=HEAD#l223
>
> --
>
> If tomorrow, it becomes too hard to choose a good default value for
> faulthandler stack size, another workaround would be to make it
> configurable, as Python lets developers choose the thread stack size:
> _thread.stack_size(size).
>
> --
>
> ___
> Python tracker 
> 
> ___
>

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-15 Thread Peter Edwards


Peter Edwards  added the comment:

On Wed, 14 Aug 2019 at 22:32, STINNER Victor  wrote:

>
> We are talking abou the faulthandler_user() function of
> Modules/faulthandler.c. It is implemented in pure C, it doesn't allocate
> memory on the heap, it uses a very small set of functions (write(),
> sigaction(), raise()) and it tries to minimize its usage of the stack
> memory.
>

I was more concerned about what was happening in the chained handler, which
will also run on the restricted stack: I had assumed that was potentially
running arbitrary python code. That's actually probably incorrect, now that
I think about it, but it's harder to infer much about its stack usage
directly in faulthandler.c. I'll take a look (just to satisfy myself, more
than anything)

> It is very different than the traceback module which is implemented in
> pure Python.
>

Right, totally - I had jumped to the conclusion that it would end up
executing in the interpreter via the chain, but, as I say, that's probably
wrong. I'm not sure what guarantees the chained signal handler makes about
its stack usage. (Will educate myself)

> faulthandler is really designed to debug segmentation fault, stack
> overflow, Python hang (like a deadlock), etc.

> --
>
> ___
> Python tracker 
> 
> ___
>

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-15 Thread Peter Edwards


Peter Edwards  added the comment:

On Wed, 14 Aug 2019 at 22:34, STINNER Victor  wrote:

>
> ...I'm not sure that we can fix bpo-37851 in Python 3.7.

 That's totally reasonable, sure.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

The bug has been fixed in 3.7, 3.8 and master (future 3.9) branches. I close 
the issue. Thanks to everyone who was involved to report the bug and help to 
find the root issue! The relationship between faulthandler, the Linux kernel 
version, CPU model, and the FPU state size wasn't obvious at the first look ;-)

If someone wants to cleanup/rework how Python handles thread stack size, please 
open a separated issue. I prefer to restrict this issue to 
test_faulthandler.test_register_chain() (which is now fixed).

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
versions: +Python 3.8, Python 3.9 -Python 3.3, Python 3.4, Python 3.5, Python 
3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

"I can confirm that on the specific hardware I could reproduce this, that
PR14276 and setting the stacksize to SIGSTKSZ*2 passes the test_faulthandler 
test."

Thanks for testing. I merged my PR.

About PR 13649, I'm not sure that _PyThread_preferred_stacksize() is still 
relevant, since my change fixed test_faulthandler test_register_chain(). I 
chose my change since it's less invasive: it only impacts faulthandler, and it 
minimalizes the memory usage (especially when faulthandler is not used).

Python/thread_pthread.h refactor changes of PR 13649 are interested. Would you 
like to extract them into a new PR which doesn't add 
_PyThread_preferred_stacksize() but just add new PLATFORM_xxx macros?

--

Maybe test_faulthandler will fail tomorrow on a new platform, but I prefer to 
open a discussion once such case happens, rather than guessing how faulthandler 
can crash on an hypothetical platforms. I'm sure that libc developers are well 
aware of the FPU state size and update SIGSTKSZ accordingly.

glibc code computing xsave_state_size:

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/cpu-features.c;h=4bab1549132fe8a4c203a70b8c7a51c1dc304049;hb=HEAD#l223

--

If tomorrow, it becomes too hard to choose a good default value for 
faulthandler stack size, another workaround would be to make it configurable, 
as Python lets developers choose the thread stack size: 
_thread.stack_size(size).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread miss-islington


miss-islington  added the comment:


New changeset 1581d9c405f140491791a07dca3f6166bc499ec1 by Miss Islington (bot) 
in branch '3.7':
bpo-21131: Fix faulthandler.register(chain=True) stack (GH-15276)
https://github.com/python/cpython/commit/1581d9c405f140491791a07dca3f6166bc499ec1


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread miss-islington


miss-islington  added the comment:


New changeset b8e682427a80798fec90dce31392beaf616c3e37 by Miss Islington (bot) 
in branch '3.8':
bpo-21131: Fix faulthandler.register(chain=True) stack (GH-15276)
https://github.com/python/cpython/commit/b8e682427a80798fec90dce31392beaf616c3e37


--
nosy: +miss-islington

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread miss-islington


Change by miss-islington :


--
pull_requests: +15016
pull_request: https://github.com/python/cpython/pull/15292

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread miss-islington


Change by miss-islington :


--
pull_requests: +15015
pull_request: https://github.com/python/cpython/pull/15291

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset ac827edc493d3ac3f5b9b0cc353df1d4b418a9aa by Victor Stinner in 
branch 'master':
bpo-21131: Fix faulthandler.register(chain=True) stack (GH-15276)
https://github.com/python/cpython/commit/ac827edc493d3ac3f5b9b0cc353df1d4b418a9aa


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

"I can understand the aversion to the waste when its never used - I can address 
37851 if you like - it seems pretty simple to fix. The pedant in me must point 
out that it's 8M of address space, not memory. The cost on 64-bit (well, with a 
47-bit user address space) is vanishingly small, ..."

Well, many users pay attention to the RSS value and don't care if the memory is 
physically allocated or not.

Moreover, I'm not sure that we can fix bpo-37851 in Python 3.7. In general, the 
policy is to minimize changes in stable Python versions. I'm not sure for 
Python 3.8 neither. I would suggest to only modify Python 3.9, simply to reduce 
the risk of regressions.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

"I do think SIGSTKSZ*2=16k is far too small considering the fault handler could 
be running arbitrary python code,"

We are talking abou the faulthandler_user() function of Modules/faulthandler.c. 
It is implemented in pure C, it doesn't allocate memory on the heap, it uses a 
very small set of functions (write(), sigaction(), raise()) and it tries to 
minimize its usage of the stack memory.

It is very different than the traceback module which is implemented in pure 
Python.

faulthandler is really designed to debug segmentation fault, stack overflow, 
Python hang (like a deadlock), etc.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread Bennet Fauber


Bennet Fauber  added the comment:

I updated the versions affected to include 3.6 and 3.7, both of which are 
affected.

I am a bit concerned that the conversation might get fragmented, so I will put 
in the full URL to the newly created PR at GitHub here.

https://github.com/python/cpython/pull/15276

just to make it easier for anyone else who finds this to see where things have 
gone.

I am now, also, uncertain what the current status of

https://github.com/python/cpython/pull/13649

is now.

It seems that we are now waiting on review from someone from the core 
developers?

--
versions: +Python 3.6, Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread Peter Edwards


Peter Edwards  added the comment:

On Wed, 14 Aug 2019 at 14:46, STINNER Victor  wrote:

>
> STINNER Victor  added the comment:
>
> "On a 64-bit system, consuming 8M of address space is a drop in the ocean."
>
> Let me disagree here. Python always allocates faulthandler stack, even if
> faulthandler is not used. Even when faulthandler is used, I would prefer to
> not waste memory if 8 KiB is good enough.
>

I can understand the aversion to the waste when its never used - I can
address 37851 if you like - it seems pretty simple to fix. The pedant in me
must point out that it's 8M of address space, not memory. The cost on
64-bit (well, with a 47-bit user address space) is vanishingly small,
regardless of the physical memory on the system. On 32-bit, it's 0.2% of
your address space, which I think I'd trade for the safety, but that's your
call, and I appreciate that address space can be a constrained resource on
32-bit systems.

I do think SIGSTKSZ*2=16k is far too small considering the fault handler
could be running arbitrary python code, and we know that there's somewhat
less than half of that available for use by the interpreter.

>
> By the way, I just created bpo-37851 to allocate this stack at the first
> faulthandler usage, instead of always allocating it, even when faulthandler
> is not used.
>
> I wrote PR 15276 to use a stack of SIGSTKSZ*2 bytes. According to
> msg349694, it does fix the crash.
>
> Can someone please double check that PR 15276 fix test_faulthandler on a
> platform where the test crash without this change?
>

I can confirm that on the specific hardware I could reproduce this, that
PR14276 and setting the stacksize to SIGSTKSZ*2 passes the
test_faulthandler test.

>
> --
>
> ___
> Python tracker 
> 
> ___
>

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

"On a 64-bit system, consuming 8M of address space is a drop in the ocean."

Let me disagree here. Python always allocates faulthandler stack, even if 
faulthandler is not used. Even when faulthandler is used, I would prefer to not 
waste memory if 8 KiB is good enough.

By the way, I just created bpo-37851 to allocate this stack at the first 
faulthandler usage, instead of always allocating it, even when faulthandler is 
not used.

I wrote PR 15276 to use a stack of SIGSTKSZ*2 bytes. According to 
msg349694, it does fix the crash.

Can someone please double check that PR 15276 fix test_faulthandler on a 
platform where the test crash without this change?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +14999
pull_request: https://github.com/python/cpython/pull/15276

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread Peter Edwards


Peter Edwards  added the comment:

The patch I originally proposed here (
https://bugs.python.org/file48353/sigaltstack-stacksize.patch ) is a pretty
minimal fix that uses the pthread stack size where available, with a
hard-coded lower bound of 1M. @Victor : if you want a minimal diff, I can
drop the existing PR, submit the above as a new one, and we can progress
from there?
Let me know how you'd like me to proceed.

On Wed, 14 Aug 2019 at 14:26, Bennet Fauber  wrote:

>
> Bennet Fauber  added the comment:
>
> I just tested the proposed change in
>
> Aha, that's interesting: SIGSTKSZ should be enough for 1 signal handler,
> but test_register_chain calls 2 signal handlers using the same stack. Can
> you please try the following patch?
>
> ```
> diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
> index 2331051f79..e7d13f2b2d 100644
> --- a/Modules/faulthandler.c
> +++ b/Modules/faulthandler.c
> . . . .
> -stack.ss_size = SIGSTKSZ;
> +stack.ss_size = SIGSTKSZ * 2;
> ```
>
> and the segfault no longer occurs at the faulthandler test.
>
> Compiling and running the altstack.c using the system installed GCC 4.8.5
> on CentOS 7.6.1810, kernel version 3.10.0-957.10.1.el7.x86_64 running on
> Dell R640 Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz results in this output.
>
> ```
> $ gcc -o altstack altstack.c
> $ ./altstack
> SIGSTKSZ = 8192
> our signal handler
> User defined signal 1
> ```
>
> It does seem to me that relying on a statically set stack size when using
> dynamically loaded libraries is inviting similar problems in the future for
> the reasons that Peter enumerated:  There is no telling now what the
> requirements will be for some new chip family, and one cannot predict now
> what additional (if any) memory requirements might be needed by the linker
> in the future.
>
> But, I think getting _some_ patch accepted and pushed to the main Python
> releases should have some priority, as the current state does seem
> undesirable.
>
> --
>
> ___
> Python tracker 
> 
> ___
>

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread Bennet Fauber


Bennet Fauber  added the comment:

I just tested the proposed change in

Aha, that's interesting: SIGSTKSZ should be enough for 1 signal handler, but 
test_register_chain calls 2 signal handlers using the same stack. Can you 
please try the following patch?

```
diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
index 2331051f79..e7d13f2b2d 100644
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
. . . .
-stack.ss_size = SIGSTKSZ;
+stack.ss_size = SIGSTKSZ * 2;
```

and the segfault no longer occurs at the faulthandler test.

Compiling and running the altstack.c using the system installed GCC 4.8.5 on 
CentOS 7.6.1810, kernel version 3.10.0-957.10.1.el7.x86_64 running on Dell R640 
Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz results in this output.

```
$ gcc -o altstack altstack.c 
$ ./altstack 
SIGSTKSZ = 8192
our signal handler
User defined signal 1
```

It does seem to me that relying on a statically set stack size when using 
dynamically loaded libraries is inviting similar problems in the future for the 
reasons that Peter enumerated:  There is no telling now what the requirements 
will be for some new chip family, and one cannot predict now what additional 
(if any) memory requirements might be needed by the linker in the future.

But, I think getting _some_ patch accepted and pushed to the main Python 
releases should have some priority, as the current state does seem undesirable.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread Peter Edwards


Peter Edwards  added the comment:

Hi Victor, thanks for the comments. Responses inline below.

On Wed, 14 Aug 2019 at 12:25, STINNER Victor  wrote:

>
> STINNER Victor  added the comment:
>
> I dislike PR 13649 because it touch the thread module, only to fix a
> faulthandler unit test.

My original patch (posted in the comments above) was purely in faulthandler
- I went at the threading code on your suggestion:

PyThread_start_new_thread() of thread_pthread.h already contains logic to
> get a "good" stack size. I would prefer to reuse this code."

I have no problem reformulating the code to avoid touching the threads
library - let me redo it as such.

> The relationship between thread stack size and faulthandler is not obvious
> to me. Currently, faulthandler uses SIGSTKSZ, not the thread stack size.
> faulthandler usage of the stack should be quite low: it should need less
> than 1 MiB for example.
>

The point of contention here is really he choice of stack size. SIGSTKSZ is
ridiculously small - it is the bare minimum amount of memory required to
actually handle the signal. The signal handling mechanism eats a huge chunk
of it, and the dynamic linker can also eat several K too. The intent was to
use the default thread stack size as a heuristic for what the platform
considers to be a reasonable size stack for applications. If the
pthread-aware OS is somehow constrained for address space, then I'd expect
it to reflect that in the default stack size. For 32-bit linux, the 8M of
address space is a bit of a chunk, but it's not a huge proportion of the
3-4G you have, and you're not consuming actual memory. On a 64-bit system,
consuming 8M of address space is a drop in the ocean.

>
> "When faulthandler.c uses sigaltstack(2), the stack size is set up with a
> buffer of size SIGSTKSZ. That is, sadly, only 8k."
>
> "A chained signal handler that needs to invoke dynamic linking will
> therefore consume more than the default stack space allocated in
> faulthandler.c, just in machine-state saves alone. So, the failing test is
> failing because its scribbling on random memory before the allocated stack
> space."
>
> Aha, that's interesting: SIGSTKSZ should be enough for 1 signal handler,
> but test_register_chain calls 2 signal handlers using the same stack. Can
> you please try the following patch?
>

It's more complex than that - in dynamically linked applications when you
call functions that still need to be resolved by the dynamic linker, the
resolving thunk in the PLT also ends up saving the register state via
xsavec, so with a chained call, there are up to 3 register states saved on
the stack, each over 2.5k on actual hardware we have now. I'm not convinced
there are not other ways stack space will be consumed during the signal
handler, and I'm not convinced that the amount of memory required per
handler will not go up as new CPUs come out, and I'm not convinced that
SIGSTKSZ will be bumped to reflect that (it certainly hasn't in the past),
so scaling SIGSTKSZ like this, while it'll likely fix the problem on any
machine I can test it on, doesn't seem like a stable solution

> diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
> index 2331051f79..e7d13f2b2d 100644
> --- a/Modules/faulthandler.c
> +++ b/Modules/faulthandler.c
> @@ -1325,7 +1325,7 @@ _PyFaulthandler_Init(int enable)
>   * be able to allocate memory on the stack, even on a stack overflow.
> If it
>   * fails, ignore the error. */
>  stack.ss_flags = 0;
> -stack.ss_size = SIGSTKSZ;
> +stack.ss_size = SIGSTKSZ * 2;
>  stack.ss_sp = PyMem_Malloc(stack.ss_size);
>  if (stack.ss_sp != NULL) {
>  err = sigaltstack(, _stack);
>
> --
>
> ___
> Python tracker 
> 
> ___
>

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

Ah, I found the recent change about XSAVE: it is a fix for CVE-2018-3665 
vulnerability.

"The software mitigation for this is to switch to an "eager" / immediate FPU 
state save and restore, in both kernels and hypervisors."

"On Intel and AMD x86 processors, operating systems and hypervisors often use 
what is referred to as a deferred saving and restoring method of the x86 FPU 
state, as part of performance optimization. This is done in a "lazy" on-demand 
fashion."

"It was found that due to the "lazy" approach, the x86 FPU states or FPU / XMM 
/ AVX512 register content, could leak across process, or even VM boundaries, 
giving attackers possibilities to read private data from other processes, when 
using speculative execution side channel gadgets."

https://www.suse.com/support/kb/doc/?id=7023076

See also: https://en.wikipedia.org/wiki/Lazy_FP_state_restore

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

Ah, the cpuid command tells me "bytes required by XSAVE/XRSTOR area = 1088":

CPU 0:
   vendor_id = "GenuineIntel"
   version information (1/eax):
  processor type  = primary processor (0)
  family  = Intel Pentium Pro/II/III/Celeron/Core/Core 2/Atom, AMD 
Athlon/Duron, Cyrix M2, VIA C3 (6)
  model   = 0xe (14)
  stepping id = 0x3 (3)
  extended family = 0x0 (0)
  extended model  = 0x5 (5)
  (simple synth)  = Intel Core i3-6000 / i5-6000 / i7-6000 / Pentium G4000 
/ Celeron G3900 / Xeon E3-1200 (Skylake), 14nm
   ...
   feature information (1/edx):
  FXSAVE/FXRSTOR = true
  ...
   feature information (1/ecx):
  XSAVE/XSTOR states  = true
  OS-enabled XSAVE/XSTOR  = true
  ...
   XSAVE features (0xd/0):
  bytes required by XSAVE/XRSTOR area = 0x0440 (1088)
  ...
   XSAVE features (0xd/1):
  XSAVEOPT instruction= true
  XSAVEC instruction  = true
  XSAVES/XRSTORS instructions = true
  ...

/proc/cpuinfo also says:

flags   : ... xsave avx ... xsaveopt xsavec xgetbv1 xsaves ...

I recall that the Linux kernel was modified to only save AVX registers if a 
program uses AVX. So the process state size depends on the usage of AVX. But I 
cannot find the related LWN article.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

Attached altstack.c mimicks faulthandler unit test test_register_chain(), 
except that faulthandler_user() uses almost no stack memory. This test should 
check if SIGSTKSZ is big enough to call a second signal handler from a first 
signal handler.

Example of my Fedora 30 with "Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz" model 
name in /proc/cpuinfo and Linux kernel 5.1.18-300.fc30.x86_64:

$ gcc altstack.c -o altstack && ./altstack
SIGSTKSZ = 8192
our signal handler
User defined signal 1

Note: the follow gdb command didn't work for me:

> (gdb) p _rtld_local_ro._dl_x86_cpu_features.xsave_state_size
> $1 = 896

How can I get xsave_state_size for my glibc/kernel/CPU?

--
Added file: https://bugs.python.org/file48542/altstack.c

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-14 Thread STINNER Victor


STINNER Victor  added the comment:

I dislike PR 13649 because it touch the thread module, only to fix a 
faulthandler unit test. The relationship between thread stack size and 
faulthandler is not obvious to me. Currently, faulthandler uses SIGSTKSZ, not 
the thread stack size. faulthandler usage of the stack should be quite low: it 
should need less than 1 MiB for example.

"When faulthandler.c uses sigaltstack(2), the stack size is set up with a 
buffer of size SIGSTKSZ. That is, sadly, only 8k."

"A chained signal handler that needs to invoke dynamic linking will therefore 
consume more than the default stack space allocated in faulthandler.c, just in 
machine-state saves alone. So, the failing test is failing because its 
scribbling on random memory before the allocated stack space."

Aha, that's interesting: SIGSTKSZ should be enough for 1 signal handler, but 
test_register_chain calls 2 signal handlers using the same stack. Can you 
please try the following patch?

diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
index 2331051f79..e7d13f2b2d 100644
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -1325,7 +1325,7 @@ _PyFaulthandler_Init(int enable)
  * be able to allocate memory on the stack, even on a stack overflow. If it
  * fails, ignore the error. */
 stack.ss_flags = 0;
-stack.ss_size = SIGSTKSZ;
+stack.ss_size = SIGSTKSZ * 2;
 stack.ss_sp = PyMem_Malloc(stack.ss_size);
 if (stack.ss_sp != NULL) {
 err = sigaltstack(, _stack);

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-13 Thread Bennet Fauber


Bennet Fauber  added the comment:

Perhaps I should add, that we are able to reproduce this behavior on this 
hardware

Dell R640 Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Dell R740 Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
Dell R440 Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
Dell C6420 Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz

but not on other chips, such as a recent i5, Haswell family, and older.  That 
may also make a difference if someone tries to reproduce this.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-13 Thread Bennet Fauber


Bennet Fauber  added the comment:

One additional note on this.  Thanks to a colleague at USC who pointed out that 
this bug does not seem to get exercised if one does not include 
`--enable-shared` at configuration.

I confirmed this using the distributed Python-3.7.4.tgz file and `configure 
--prefix=/path/to/install`.  The `make test` then runs to completion with no 
errors.  Running `make distclean`, then rerunning with `configure 
--prefix=/path/to/install --enable-shared` will then exercise the bug, and the 
faulthandler test fails with a segfault.

Applying the patch file attached to this report and rebuilding leads to `make 
test` passing all tests, also.

If someone was trying to reproduce and did not configure shared libraries, then 
that would have failed to reproduce.

--
nosy: +justbennet

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-05-29 Thread Peter Edwards


Change by Peter Edwards :


--
pull_requests: +13543
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/13649

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-05-28 Thread Peter Edwards


Peter Edwards  added the comment:

Ok - let me submit a pull request with your suggestions

On Tue, 28 May 2019 at 13:08, STINNER Victor  wrote:

>
> STINNER Victor  added the comment:
>
> +pthread_attr_t attrs;
> +pthread_attr_init();
> +(void)pthread_attr_getstacksize(, _size);
>
> PyThread_start_new_thread() of thread_pthread.h already contains logic to
> get a "good" stack size. I would prefer to reuse this code.
>
> See also _pythread_nt_set_stacksize() of thread_nt.h.
>
> Maybe we need a private function to get the default stack size?
>
> See also PyThread_get_stacksize() and _thread.stack_size().
>
> --
>
> ___
> Python tracker 
> 
> ___
>

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-05-28 Thread STINNER Victor


STINNER Victor  added the comment:

+pthread_attr_t attrs;
+pthread_attr_init();
+(void)pthread_attr_getstacksize(, _size);

PyThread_start_new_thread() of thread_pthread.h already contains logic to get a 
"good" stack size. I would prefer to reuse this code.

See also _pythread_nt_set_stacksize() of thread_nt.h.

Maybe we need a private function to get the default stack size?

See also PyThread_get_stacksize() and _thread.stack_size().

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-05-24 Thread Peter Edwards


Peter Edwards  added the comment:

Hi - we ran into what looks like exactly this issue on an x86_64 sporadically, 
and tracked down the root cause.

When faulthandler.c uses sigaltstack(2), the stack size is set up with a buffer 
of size SIGSTKSZ. That is, sadly, only 8k.

When a signal is raised, before the handler is called, the kernel stores the 
machine state on the user's (possibly "alternate") stack. The size of that 
state is very much variable, depending on the CPU.

When we chain the signal handler in the sigaction variant of the code in 
faulthandler, we raise the signal with the existing handler still on the stack, 
and save a second copy of the CPU state.

Finally, when any part of that signal handler has to invoke a function that 
requires the dynamic linker's intervention to resolve, it will call some form 
of _dl_runtime_resolve* - likely _dl_runtime_resolve_xsave or 
_dl_runtime_resolve_xsavec.

These functions will also have to save machine state. So, how big is the 
machine state? Well, it depends on the CPU. 
On one machine I have access to, /proc/cpuinfo shows "Intel(R) Xeon(R) CPU 
E5-2640 v4", I have:

> (gdb) p _rtld_local_ro._dl_x86_cpu_features.xsave_state_size
> $1 = 896

On another machine, reporting as "Intel(R) Xeon(R) Gold 5118 CPU", I have:

> (gdb) p _rtld_local_ro._dl_x86_cpu_features.xsave_state_size
> $1 = 2560

This means that the required stack space to hold 3 sets of CPU state is over 
7.5k. And, for the signal handlers, it's actually worse: more like 3.25k per 
frame. A chained signal handler that needs to invoke dynamic linking will 
therefore consume more than the default stack space allocated in 
faulthandler.c, just in machine-state saves alone. So, the failing test is 
failing because its scribbling on random memory before the allocated stack 
space.

My guess is that the previous architectures this manifested in have larger 
stack demands for signal handling than x86_64, but clearly newer x86_64 
processors are starting to get tickled by this.

Fix is pretty simple - just allocate more stack space. The attached patch uses 
pthread_attr_getstacksize to find the system's default stack size, and then 
uses that as the default, and also defines an absolute minimum stack size of 
1M. This fixes the issue on our machine with the big xsave state size. (I'm 
sure I'm getting the feature test macros wrong for testing for pthreads 
availability)

Also, I think in the case of a threaded environment, using the altstack might 
not be the best choice - I think multiple threads handling signals that run on 
that stack will wind up stomping on the same memory - is there a strong reason 
to maintain this altstack behaviour?

--
keywords: +patch
nosy: +peadar
Added file: https://bugs.python.org/file48353/sigaltstack-stacksize.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-05-24 Thread markmcclain


Change by markmcclain :


--
nosy: +markmcclain

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2016-10-05 Thread STINNER Victor

STINNER Victor added the comment:

> We're running into this building python 3.4.3 on EL6 ppc64.  The os kernel is 
> 4.7.2-201.fc24.ppc64, but the EL6 chroot kernel-headers are 
> 2.6.32-642.4.2.el6.  Any progress here?

Sorry, but if I'm unable to reproduce the issue, I cannot make progress on 
analyzing the bug. I would need an remote access (SSH) to a computer where the 
bug can be reproduced.

I also would like to know if the issue can be reproduced in C.

faulthandler depends a lot how signals and threads are implemented. I'm not 
completely suprised to see bugs on some specific platforms.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2016-10-04 Thread Orion Poplawski

Orion Poplawski added the comment:

We're running into this building python 3.4.3 on EL6 ppc64.  The os kernel is 
4.7.2-201.fc24.ppc64, but the EL6 chroot kernel-headers are 2.6.32-642.4.2.el6. 
 Any progress here?

--
nosy: +opoplawski

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel = 3.10

2014-04-02 Thread Bohuslav Slavek Kabrda

New submission from Bohuslav Slavek Kabrda:

test_faulthandler.test_register_chain fails on some 64bit architectures (arm8, 
ppc64) with kernel = 3.10:

==
FAIL: test_register_chain (__main__.FaultHandlerTests)
--
Traceback (most recent call last):
  File /builddir/build/BUILD/Python-3.3.2/Lib/test/test_faulthandler.py, line 
588, in test_register_chain
self.check_register(chain=True)
  File /builddir/build/BUILD/Python-3.3.2/Lib/test/test_faulthandler.py, line 
572, in check_register
self.assertEqual(exitcode, 0)
AssertionError: -11 != 0

I created a minimal reproducer (reproduces the issue on 3.3, 3.4 and dev) of 
this segfault (attached). When run with GC assertions turned on, Python fails 
with:

python: /builddir/build/BUILD/Python-3.3.2/Modules/gcmodule.c:332: update_refs: 
Assertion `gc-gc.gc_refs == (-3)\' failed.

We experienced this issue when building Python 3.3 on Fedora's arm8 builder 
[1], it seems that opensuse have experienced the same failure on ppc64 [2] and 
ubuntu has a similar issue in their 64bit arm builds [3].

It seems that this may be caused by a change in kernel, since it's only 
reproducible on kernel = 3.10. A nice explanation of what goes on and how the 
problem can be narrowed down is at the opensuse bug report [4], this is 
basically where I got stuck with this problem, too.


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1045193
[2] https://bugzilla.novell.com/show_bug.cgi?id=831629
[3] https://bugs.launchpad.net/ubuntu/+source/python3.4/+bug/1264354
[4] https://bugzilla.novell.com/show_bug.cgi?id=831629#c15

--
components: Tests
files: test_register_chain_segfault_reproducer.py
messages: 215365
nosy: bkabrda
priority: normal
severity: normal
status: open
title: test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel 
= 3.10
versions: Python 3.3, Python 3.4, Python 3.5
Added file: 
http://bugs.python.org/file34699/test_register_chain_segfault_reproducer.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21131
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel = 3.10

2014-04-02 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@gmail.com:


--
nosy: +haypo

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21131
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel = 3.10

2014-04-02 Thread STINNER Victor

STINNER Victor added the comment:

test_faulthandler.test_register_chain fails on some 64bit architectures (arm8, 
ppc64) with kernel = 3.10

I am a little bit surprised that the bug depends on the kernel version.

Does test_register_chain_segfault_reproducer.py also crash with chain=False?

Can you check if HAVE_SIGACTION is defined in pyconfig.h? Or in Python: import 
sysconfig; print(sysconfig.get_config_var('HAVE_SIGACTION')).

Wit chain=True, faulthandler_register() registers its signal handler with 
SA_NODEFER flag:
---
/* if the signal is received while the kernel is executing a system
   call, try to restart the system call instead of interrupting it and
   return EINTR. */
action.sa_flags = SA_RESTART;
if (chain) {
/* do not prevent the signal from being received from within its
   own signal handler */
action.sa_flags = SA_NODEFER;
}
---

With chain=True, the faulthandler_user() function calls the previous signal 
handler with:
---
#ifdef HAVE_SIGACTION
if (user-chain) {
(void)sigaction(signum, user-previous, NULL);
errno = save_errno;

/* call the previous signal handler */
raise(signum);

save_errno = errno;
(void)faulthandler_register(signum, user-chain, NULL);
errno = save_errno;
}
#else
if (user-chain) {
errno = save_errno;
/* call the previous signal handler */
user-previous(signum);
}
#endif
---

You can try to add #undef HAVE_SIGACTION in faulthandler.c (after #include 
Python.h) to see if the bug can be reproduced without sigaction. The code 
using signal() is very different, especially when chaining signal handlers.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21131
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel = 3.10

2014-04-02 Thread STINNER Victor

STINNER Victor added the comment:

Would it be possible to list the kernel version (full version including the 
minor version) on which the test works or crash? You are talking about 3.10 
without the minor version. It may be a regression in the kernel.

Example of change in Linux kernel 3.10.17:
---
commit 19a420033da02200c424adfa3a7b9eed6e3a6dc2
Author: Christian Ruppert christian.rupp...@abilis.com
Date:   Wed Oct 2 11:13:38 2013 +0200

ARC: Fix signal frame management for SA_SIGINFO

commit 10469350e345599dfef3fa78a7c19fb230e674c1 upstream.

Previously, when a signal was registered with SA_SIGINFO, parameters 2
and 3 of the signal handler were written to registers r1 and r2 before
the register set was saved. This led to corruption of these two
registers after returning from the signal handler (the wrong values were
restored).
With this patch, registers are now saved before any parameters are
passed, thus maintaining the processor state from before signal entry.

Signed-off-by: Christian Ruppert christian.rupp...@abilis.com
Signed-off-by: Vineet Gupta vgu...@synopsys.com
Signed-off-by: Greg Kroah-Hartman gre...@linuxfoundation.org
---

Extract of changes of Linux 3.10.6:
---
  ARM: move signal handlers into a vdso-like page
  ARM: make vectors page inaccessible from userspace
  ARM: fix a cockup in 48be69a02 (ARM: move signal handlers into a 
vdso-like page)
  ARM: fix nommu builds with 48be69a02 (ARM: move signal handlers into a 
vdso-like page)
---
Commit:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=48be69a02

I would like to make sure that the bug is not a kernel bug.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21131
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel = 3.10

2014-04-02 Thread Bohuslav Slavek Kabrda

Bohuslav Slavek Kabrda added the comment:

I'm also surprised that this depends on kernel version, however that's what I 
found out (and the opensuse guys seem to only have reproduced this on kernel = 
3.10, too).

- Full kernel version (uname -r output): 3.13.0-0.rc7.28.sa2.aarch64
- The reproducer *doesn't* crash with chain=False.
- HAVE_SIGACTION:
 import sysconfig; print(sysconfig.get_config_var('HAVE_SIGACTION'))
1
- I'll do rebuild with #undef HAVE_SIGACTION and post my results here as soon 
as it's finished.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21131
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel = 3.10

2014-04-02 Thread Bohuslav Slavek Kabrda

Bohuslav Slavek Kabrda added the comment:

Ok, so with #undef HAVE_SIGACTION both the reproducer and the original test 
(as well as all tests in test_faulthandler) pass fine.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21131
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com