[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-03 Thread Stefan Krah


Stefan Krah  added the comment:


New changeset 852aee69f49c654a03ad1f64d90a78ba8848e2c6 by Stefan Krah in branch 
'3.7':
 bpo-39776: Lock ++interp->tstate_next_unique_id (GH-18746)
https://github.com/python/cpython/commit/852aee69f49c654a03ad1f64d90a78ba8848e2c6


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-03 Thread Stefan Krah


Stefan Krah  added the comment:


New changeset 5a92f42d8723ee865be80f028d402204649da15d by Stefan Krah in branch 
'3.8':
 bpo-39776: Lock ++interp->tstate_next_unique_id. (GH-18746) (#18746) (#18752)
https://github.com/python/cpython/commit/5a92f42d8723ee865be80f028d402204649da15d


--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-02 Thread Stefan Krah


Change by Stefan Krah :


--
pull_requests: +18108
pull_request: https://github.com/python/cpython/pull/18753




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-02 Thread Stefan Krah


Change by Stefan Krah :


--
pull_requests: +18107
pull_request: https://github.com/python/cpython/pull/18752




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-02 Thread Stefan Krah


Stefan Krah  added the comment:


New changeset b3b9ade4a3d3fe00d933bcd8fc5c5c755d1024f9 by Stefan Krah in branch 
'master':
 bpo-39776: Lock ++interp->tstate_next_unique_id. (GH-18746) (#18746)
https://github.com/python/cpython/commit/b3b9ade4a3d3fe00d933bcd8fc5c5c755d1024f9
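
For readers who do not follow the C sources, the idea in the commit title (take a lock around the increment of a shared "next unique id" counter) can be sketched in Python. This is only an analogue under stated assumptions: the class and function names are hypothetical, and it is not the C change itself.

```python
import threading


class ThreadStateIdAllocator:
    """Toy stand-in for an interpreter-wide "next unique id" counter (hypothetical)."""

    def __init__(self):
        self._next_unique_id = 0
        self._lock = threading.Lock()

    def new_id(self):
        # The read-increment-write sequence runs under the lock, so no two
        # callers can be handed the same value.
        with self._lock:
            self._next_unique_id += 1
            return self._next_unique_id


def allocate_ids(num_threads=8, ids_per_thread=1000):
    """Allocate ids from several threads at once and return all of them."""
    allocator = ThreadStateIdAllocator()
    results = [[] for _ in range(num_threads)]

    def worker(slot):
        for _ in range(ids_per_thread):
            results[slot].append(allocator.new_id())

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [ts_id for chunk in results for ts_id in chunk]
```

Calling allocate_ids() and comparing len(ids) with len(set(ids)) is a quick way to check that every id handed out is distinct.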


--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-02 Thread Stefan Krah


Stefan Krah  added the comment:

I think the PR fixes the issue but I have to run longer tests still.

Threads created by PyGILState_Ensure() could have a duplicate tstate->id,
which confused the ContextVar caching machinery.
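
A toy model may help show why a duplicate id is a problem for a cache keyed by that id. Everything below is an illustrative assumption (the class names, the one-entry cache, the string values); it is not the actual ContextVar implementation, only a sketch of the failure mode described above.

```python
class ToyThreadState:
    """Hypothetical thread state: an id plus per-thread storage."""

    def __init__(self, ts_id):
        self.id = ts_id
        self.values = {}  # maps variable -> value for this thread


class ToyCachedVar:
    """Hypothetical variable with a one-entry cache keyed by thread-state id."""

    def __init__(self):
        self._cached_value = None
        self._cached_ts_id = None

    def set(self, tstate, value):
        tstate.values[self] = value
        self._cached_value = value
        self._cached_ts_id = tstate.id

    def get(self, tstate):
        # Fast path: the cache is trusted whenever the ids match.
        if self._cached_ts_id == tstate.id:
            return self._cached_value
        # Slow path: look in the thread's own storage and refresh the cache.
        value = tstate.values.get(self)
        self._cached_value = value
        self._cached_ts_id = tstate.id
        return value


def lookup_from_b(id_a, id_b):
    """Set a value in thread B, then in thread A, then read it back from B."""
    var = ToyCachedVar()
    a, b = ToyThreadState(id_a), ToyThreadState(id_b)
    var.set(b, "context of B")
    var.set(a, "context of A")  # the cache now points at A's value
    return var.get(b)
```

With distinct ids, lookup_from_b(1, 2) misses the cache and returns B's own value. With a duplicate id, lookup_from_b(1, 1) hits the cache and returns A's value to B. In C, such a stale hit could plausibly hand back an object owned by another thread; the model only shows the wrong-owner lookup.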

--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-02 Thread Stefan Krah


Change by Stefan Krah :


--
keywords: +patch
pull_requests: +18101
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/18746




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-01 Thread STINNER Victor


Change by STINNER Victor :


--
nosy: +vstinner




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-03-01 Thread Antoine Pitrou


Change by Antoine Pitrou :


--
priority: normal -> critical
versions: +Python 3.9




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
nosy: +yselivanov




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Stefan Krah


Stefan Krah  added the comment:

> With python 3.7.3 without https://github.com/python/cpython/pull/5278 works 
> just fine.

Thanks, I'm now getting the same results as you. Looking at the smaller
test case, I also agree that it should work (as it did in 3.6).

--
keywords: +3.7regression
resolution: not a bug -> 
stage:  -> needs patch
type:  -> crash




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Evgeny Boytsov


Evgeny Boytsov  added the comment:

I also figured out the source of your crash with my initial example. Since you 
did not use CMake to configure the project, pybind11 did not set up the macros 
required to enable threading support. So there is no issue in pybind11.

--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Evgeny Boytsov


Evgeny Boytsov  added the comment:

I rewrote my example without pybind11 and eliminated the C++ module (I realized 
that time.sleep() also releases the GIL, so we achieve the same effect). The 
results are still the same: with Python 3.7.3 the app crashes with the attached 
ASAN output, while Python 3.7.3 with 
https://github.com/python/cpython/pull/5278 reverted works just fine.

To run main.cpp, add the directory containing crash_test.py to PYTHONPATH.
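
The attached crash_test.py is not reproduced in this thread. As a rough sketch of the kind of Python-side workload described (sleep to release the GIL, then do a decimal division), something like the following could be used. The function names, iteration counts and sleep bounds are assumptions, not the contents of the attachment, and with plain threading.Thread workers it is not expected to reproduce the crash, since the report involves threads created by the embedding C++ program.

```python
import decimal
import random
import threading
import time


def sleep_and_divide(iterations=20, max_sleep=0.001):
    """Alternate short sleeps (which release the GIL) with a decimal division."""
    results = []
    for _ in range(iterations):
        time.sleep(random.uniform(0, max_sleep))  # GIL is released while sleeping
        results.append(decimal.Decimal(1) / decimal.Decimal(7))
        time.sleep(random.uniform(0, max_sleep))
    return results


def run_workers(num_threads=8, iterations=20):
    """Run the workload in several Python threads and collect every quotient."""
    collected = [None] * num_threads

    def worker(slot):
        collected[slot] = sleep_and_divide(iterations)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return collected
```

Each worker should end up with identical quotients, because every new thread starts from the same default decimal context.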

--
Added file: https://bugs.python.org/file48930/threaded_crash.zip




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Stefan Krah

Stefan Krah  added the comment:

Regarding *my* issue, it could be anything, e.g. a missing call to
PyEval_InitThreads() in 3.6:

"Changed in version 3.7: This function is now called by Py_Initialize(), so you 
don’t have to call it yourself anymore."


This is why we need to eliminate pybind11 so we can see what is
actually going on.

--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Evgeny Boytsov


Evgeny Boytsov  added the comment:

Your call stack is very strange. At line 30 of main.cpp the GIL is obviously held:


   // importing module in this thread
   gstate = PyGILState_Ensure();
   py::module crash_test = py::module::import( "crash_test" ); <-- import
   PyGILState_Release( gstate );

I suppose that there is something wrong with your setup, perhaps a wrong 
working directory for the main executable, one that doesn't contain 
crash_test.py.

I've also tried reverting this patch, 
https://github.com/python/cpython/pull/5278, for 3.7. That makes the problem 
disappear: one hour of stable work under ASAN. So I suppose it is the source of 
the bug.

I will try to tweak _testembed.c.

--
resolution:  -> not a bug




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Stefan Krah


Change by Stefan Krah :


--
resolution: not a bug -> 
stage: resolved -> 




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Stefan Krah


Stefan Krah  added the comment:

Note that my pybind11 is from GitHub master, it can also be a pybind11
issue.


It is interesting that you cannot reproduce your original issue with
3.6, so I'm reopening this issue.

I think we need a reproducer without pybind11 though, could you
tweak Programs/_testembed.c (from the CPython sources) to run the
crash script?

--
status: closed -> open




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-28 Thread Stefan Krah


Stefan Krah  added the comment:

This is 3.6.7, compiled --with-pydebug:


$ ./main 
Aborted (core dumped)



(gdb) bt
#0  0x7f9974077428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x7f997407902a in __GI_abort () at abort.c:89
#2  0x0056e2d1 in Py_FatalError (msg=msg@entry=0x62ccf8 "Python memory allocator called without holding the GIL") at Python/pylifecycle.c:1462
#3  0x004c0cec in _PyMem_DebugCheckGIL () at Objects/obmalloc.c:1963
#4  0x004c0d27 in _PyMem_DebugMalloc (ctx=0x8f7220 <_PyMem_Debug+96>, nbytes=75) at Objects/obmalloc.c:1971
#5  0x004c204e in PyObject_Malloc (size=<optimized out>) at Objects/obmalloc.c:479
#6  0x004ec12f in PyUnicode_New (size=10, maxchar=<optimized out>) at Objects/unicodeobject.c:1281
#7  0x005162f4 in _PyUnicodeWriter_PrepareInternal (writer=writer@entry=0x7f9971ca4cf0, length=length@entry=10, maxchar=<optimized out>, maxchar@entry=127) at Objects/unicodeobject.c:13565
#8  0x0051af20 in PyUnicode_DecodeUTF8Stateful (s=0x61d15b "crash_test", size=10, errors=errors@entry=0x0, consumed=consumed@entry=0x0) at Objects/unicodeobject.c:5067
#9  0x0051c6b0 in PyUnicode_FromString (u=<optimized out>) at Objects/unicodeobject.c:2077
#10 0x00563c1c in PyImport_ImportModule (name=<optimized out>) at Python/import.c:1266
#11 0x004531dd in pybind11::module::import (name=0x61d15b "crash_test") at ./pybind11/include/pybind11/pybind11.h:849
#12 0x00446434 in ThreadFunc () at main.cpp:30
#13 0x0046a1b1 in std::_Bind_simple::_M_invoke<>(std::_Index_tuple<>) (this=0x10c28d8) at /usr/include/c++/5/functional:1531
#14 0x0046a10a in std::_Bind_simple::operator()() (this=0x10c28d8) at /usr/include/c++/5/functional:1520
#15 0x0046a09a in std::thread::_Impl >::_M_run() (this=0x10c28c0) at /usr/include/c++/5/thread:115
#16 0x7f99749e3c80 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#17 0x7f99750bb6ba in start_thread (arg=0x7f9971ca5700) at pthread_create.c:333
#18 0x7f997414941d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-27 Thread Evgeny Boytsov


Evgeny Boytsov  added the comment:

I'm unable to reproduce either my issue or yours with Python 3.6. The program 
runs indefinitely, as it is meant to. Can you please give me the C++ traceback 
from the core dump that was created when you ran my program?

--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-27 Thread Evgeny Boytsov


Evgeny Boytsov  added the comment:

Thank you for the feedback. I will try to reproduce the issue with 3.6. By the 
way, did you use gdb with the Python pretty-printers enabled to examine the 
state of the program? I got the same error message when I interrupted execution 
in the debugger and tried to examine the call stacks of threads that were stuck 
in UnlockGILandSleep. The reason is clear: when the debugger builds a call 
stack, some of the pretty-printers execute Python code to give a better 
representation of interpreter objects. That code runs on top of the stack of 
the examined thread. Since this thread explicitly released the GIL before going 
to sleep, these functions hit the assertion about calling the memory allocator 
without holding the GIL. Disabling the pretty-printers makes these error 
messages disappear.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-27 Thread Stefan Krah


Stefan Krah  added the comment:

I built your example with 3.6:

git clone https://github.com/pybind/pybind11
wget https://bugs.python.org/file48923/decimal_crash.zip
unzip decimal_crash.zip

git checkout v3.6.7
./configure --with-pydebug
make

g++ -std=c++11 -pthread -Wno-unused-result -Wsign-compare -g -Og -Wall -Wextra \
    -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -I. \
    -I./Include -I./pybind11/include -c main.cpp

g++ -pthread -Xlinker -export-dynamic -o main main.o libpython3.6dm.a -lpthread \
    -ldl -lutil -lm


cp python python3
PATH=.:$PATH
./main



And I literally get this error (not always, it may take 10 runs or so):

$ ./main
Fatal Python error: Python memory allocator called without holding the GIL

Thread 0x7f1e73fff700 (most recent call first):

Thread 0x7f1e7b836700 (most recent call first):

Thread 0x7f1e7a834700 (most recent call first):

Thread 0x7f1e7b035700 (most recent call first):

Thread 0x7f1e7d039700 (most recent call first):

Thread 0x7f1e7c838700 (most recent call first):

Current thread 0x7f1e7c037700 (most recent call first):

Thread 0x7f1e7e84f740 (most recent call first):
Aborted (core dumped)



So no, I don't think the GIL handling is correct.

--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-27 Thread Evgeny Boytsov


Evgeny Boytsov  added the comment:

Please note that UnlockGILandSleep takes the GIL back before returning. In the 
real production code there is a database query at that point; in this example I 
emulate it with a random sleep. So I don't see any problem here.

--




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-27 Thread Stefan Krah


Change by Stefan Krah :


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-27 Thread Stefan Krah


Stefan Krah  added the comment:

I've briefly looked at the zip archive. Without going much into
the C++ module as a whole, this should not be done:


gil_unlocker.UnlockGILAndSleep()
self.val = decimal.Decimal(1) / decimal.Decimal(7)
gil_unlocker.UnlockGILAndSleep()


If you want C++ threads with a released GIL, you should use libmpdec
directly and not the Python module.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-27 Thread Stefan Krah


Stefan Krah  added the comment:

Before I look at the example code: Can you also reproduce this with
Python 3.6?  The threading code in _decimal was changed to a ContextVar
in 3.7.
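
As background on what that per-thread state means in practice: decimal.getcontext() returns the context of the calling thread, so a precision change made in one thread is not seen by the others. The sketch below illustrates only this documented behaviour; the helper names and the chosen precisions are arbitrary, and it says nothing about how _decimal stores the context internally.

```python
import decimal
import threading


def quotient_at_precision(prec, out):
    """Set the precision in this thread's context and record 1/7 as a string."""
    ctx = decimal.getcontext()  # the context belonging to the calling thread
    ctx.prec = prec
    out[prec] = str(decimal.Decimal(1) / decimal.Decimal(7))


def per_thread_quotients(precisions=(5, 10, 20)):
    """Run one thread per precision and return {precision: quotient string}."""
    out = {}
    threads = [
        threading.Thread(target=quotient_at_precision, args=(p, out))
        for p in precisions
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Each thread reports a quotient with its own number of significant digits, and the precision of the thread that called per_thread_quotients() is left unchanged.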

There's a high chance though that the problem is in the c++ module.

--
nosy: +skrah

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39776] Crash in decimal module in heavy-multithreaded scenario

2020-02-27 Thread Evgeny Boytsov


New submission from Evgeny Boytsov :

Hello everybody!

We are using Python 3.7 on CentOS 7 x64. Python is used as a library to 
create dynamic extensions for our app server.

Some time ago we began to experience crashes in the decimal module in some 
heavily multithreaded scenarios. After some testing and debugging I was able to 
reproduce it without our own code, using only the pybind11 library to simplify 
embedding (in the real app we are using boost.python).


I've built Python 3.8 with clang 7 and address sanitizer enabled and got a 
"use-after-free" error with some additional data.

Please find attached the C++ source file, the Python module and the ASAN 
output. Is it really a bug (most probably a data race), or is there something 
wrong with this embedding scenario?

--
components: Interpreter Core
files: decimal_crash.zip
messages: 362807
nosy: boytsovea
priority: normal
severity: normal
status: open
title: Crash in decimal module in heavy-multithreaded scenario
versions: Python 3.7, Python 3.8
Added file: https://bugs.python.org/file48923/decimal_crash.zip
