[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

I'm sorry, but even if it's a real bug, we don't have enough information nor any scenario to reproduce it, so it's not possible to investigate further. I'm closing the issue.

--
resolution:  -> out of date
status: open -> closed

___
Python tracker <rep...@bugs.python.org>
http://bugs.python.org/issue23187
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

> > Did you see some errors or warnings when running your application with asyncio in debug mode?
>
> No, but I'll try. I doubt the problem is in asyncio itself because it's mostly written in Python. This looks like a garbage collection issue.

Sorry, you're wrong: the proactor event loop heavily uses the _overlapped module, which is implemented in C. A crash in the garbage collector is more likely a bug in asyncio or in your application than a bug in Python itself.

Please enable asyncio debug mode, enable ResourceWarning, enable the DEBUG log level, and check if you get some errors. For example, "GetQueuedCompletionStatus() returned an unexpected event" usually means that you will quickly get a memory corruption... If possible, write all asyncio logs at DEBUG level and attach them to the issue. Using the logging module, it's possible to add a custom handler for asyncio. Example:
---
import asyncio
import logging

logging.basicConfig()
logger = asyncio.log.logger
output = open("asyncio.log", "a")
handler = logging.StreamHandler(output)
handler.setLevel(logging.DEBUG)
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

asyncio.log.logger.debug("plop")
# when done:
output.close()
---

> python34_d.dll!PyList_New(__int64 size=0) Line 159 + 0xc bytes

So the crash occurred again during a call to PyList_New(0) which triggered a garbage collection: _PyObject_GC_Malloc(56). The backtrace looks more reliable this time:

msvcr100d.dll!_wassert(const wchar_t * expr=0x63220618, const wchar_t * filename=0x632205e0, unsigned int lineno=364) Line 153
python34_d.dll!update_refs(_gc_head * containers=0x6338cfc0) Line 364 + 0x2b bytes

This assertion comes with a long comment which explains when it can fail:

/* Python's cyclic gc should never see an incoming refcount
 * of 0: if something decref'ed to 0, it should have been
 * deallocated immediately at that time.
 * Possible cause (if the assert triggers): a tp_dealloc
 * routine left a gc-aware object tracked during its teardown
 * phase, and did something -- or allowed something to happen --
 * that called back into Python. gc can trigger then, and may
 * see the still-tracked dying object. Before this assert
 * was added, such mistakes went on to allow gc to try to
 * delete the object again. In a debug build, that caused
 * a mysterious segfault, when _Py_ForgetReference tried
 * to remove the object from the doubly-linked list of all
 * objects a second time. In a release build, an actual
 * double deallocation occurred, which leads to corruption
 * of the allocator's internal bookkeeping pointers. That's
 * so serious that maybe this should be a release-build
 * check instead of an assert?
 */
assert(_PyGCHead_REFS(gc) != 0);

To come back to asyncio and the proactor event loop: the implementation of the proactor event loop is still young and probably unstable. Would it be possible to test again, with fewer clients (500 clients or less, to not hit the select() limit on Windows), using the selector event loop?

A crash in the garbage collector can be caused by a memory corruption. I fixed a lot of memory corruptions caused by the internal objects used in the proactor event loop.

Can you check if asyncio.windows_events._BaseWaitHandleFuture is used? This class is used by IocpProactor.wait_for_handle(), which is called by _WindowsSubprocessTransport. But you wrote that the server (where the crash occurs) doesn't run subprocesses.

I never tried the proactor event loop with threads. Are you using asyncio code outside the main thread? asyncio is not thread-safe at all :-p Are you using a single event loop or one event loop per thread?

--
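Editor's note: the debug checklist above (asyncio debug mode, ResourceWarning, DEBUG log level) can be combined in one place. A hedged sketch, not taken from the issue itself; the log file name is an assumption:

```python
import asyncio
import logging
import warnings

# Make every ResourceWarning (e.g. unclosed transports/sockets) visible.
warnings.simplefilter("always", ResourceWarning)

# Route the "asyncio" logger to a file at DEBUG level.
logging.basicConfig(level=logging.DEBUG)
handler = logging.FileHandler("asyncio_debug.log")  # assumed file name
handler.setLevel(logging.DEBUG)
logging.getLogger("asyncio").addHandler(handler)

# Enable asyncio debug mode: slow-callback warnings, better tracebacks,
# and extra consistency checks. PYTHONASYNCIODEBUG=1 does the same thing.
loop = asyncio.new_event_loop()
loop.set_debug(True)
asyncio.set_event_loop(loop)
```

Running the application this way for even a short period should surface the "GetQueuedCompletionStatus() returned an unexpected event" style messages Victor mentions, if they occur.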
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

Bugfixes of asyncio 3.4.3: a lot of them are related to the proactor event loop (and most of them led to crashes):
https://code.google.com/p/tulip/source/browse/ChangeLog#66
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

(Since you chose to comment on this closed issue instead of opening a new one, I'm reopening this issue.)

--
resolution: not a bug ->
status: closed -> open
[issue23187] Segmentation fault, possibly asyncio related
Michael Goldish added the comment:

OK, I caught the crash with a debug build of Python 3.4.3. I have a core dump and even the process itself still alive in memory. I can provide any information you need. I can also explain how to debug a core dump with Visual Studio, if necessary. This time the crash was here:

static void
update_refs(PyGC_Head *containers)
{
    PyGC_Head *gc = containers->gc.gc_next;
    for (; gc != containers; gc = gc->gc.gc_next) {
        assert(_PyGCHead_REFS(gc) == GC_REACHABLE);
        _PyGCHead_SET_REFS(gc, Py_REFCNT(FROM_GC(gc)));
        /* Python's cyclic gc should never see an incoming refcount
         * of 0: if something decref'ed to 0, it should have been
         * deallocated immediately at that time.
         * Possible cause (if the assert triggers): a tp_dealloc
         * routine left a gc-aware object tracked during its teardown
         * phase, and did something -- or allowed something to happen --
         * that called back into Python. gc can trigger then, and may
         * see the still-tracked dying object. Before this assert
         * was added, such mistakes went on to allow gc to try to
         * delete the object again. In a debug build, that caused
         * a mysterious segfault, when _Py_ForgetReference tried
         * to remove the object from the doubly-linked list of all
         * objects a second time. In a release build, an actual
         * double deallocation occurred, which leads to corruption
         * of the allocator's internal bookkeeping pointers. That's
         * so serious that maybe this should be a release-build
         * check instead of an assert?
         */
        assert(_PyGCHead_REFS(gc) != 0);   /* <--- crash */
    }
}

In the calling frame I can see that update_refs() is called with a PyGC_Head object whose gc_refs is 0, and according to the paragraph above that shouldn't happen. A screenshot is attached.
Stack trace:
msvcr100d.dll!_NMSG_WRITE(int rterrnum=10) Line 217
msvcr100d.dll!abort() Line 71
msvcr100d.dll!_wassert(const wchar_t * expr=0x63220618, const wchar_t * filename=0x632205e0, unsigned int lineno=364) Line 153
python34_d.dll!update_refs(_gc_head * containers=0x6338cfc0) Line 364 + 0x2b bytes
python34_d.dll!collect(int generation=0, __int64 * n_collected=0x00012beee530, __int64 * n_uncollectable=0x00012beee520, int nofail=0) Line 969
python34_d.dll!collect_with_callback(int generation=0) Line 1140 + 0x16 bytes
python34_d.dll!collect_generations() Line 1163 + 0x9 bytes
python34_d.dll!_PyObject_GC_Malloc(unsigned __int64 basicsize=56) Line 1739
python34_d.dll!_PyObject_GC_New(_typeobject * tp=0x633ce280) Line 1748 + 0xe bytes
python34_d.dll!PyList_New(__int64 size=0) Line 159 + 0xc bytes
python34_d.dll!PyEval_EvalFrameEx(_frame * f=0x00010f387658, int throwflag=3) Line 2347 + 0x8 bytes
python34_d.dll!fast_function(_object * func=0x0003, _object * * * pp_stack=0x000129e69948, int n=737077536, int na=1661137695, int nk=0) Line 4335
python34_d.dll!call_function(_object * * * pp_stack=0x00012beee919, int oparg=131) Line 4263
python34_d.dll!PyEval_EvalFrameEx(_frame * f=0x00011c077528, int throwflag=3) Line 2840
python34_d.dll!fast_function(_object * func=0x0003, _object * * * pp_stack=0x89a53190, int n=737078048, int na=1661137695, int nk=0) Line 4335
python34_d.dll!call_function(_object * * * pp_stack=0x00012beeeb19, int oparg=131) Line 4263
python34_d.dll!PyEval_EvalFrameEx(_frame * f=0x00010369f2a8, int throwflag=0) Line 2840
python34_d.dll!PyEval_EvalCodeEx(_object * _co=0x00010002, _object * globals=0x00010002, _object * locals=0x1fa5e218, _object * * args=0x00730088, int argcount=2, _object * * kws=0x00730080, int kwcount=0, _object * * defs=0x, int defcount=0, _object * kwdefs=0x, _object * closure=0x) Line 3588 + 0xf bytes
python34_d.dll!function_call(_object * func=0x03c10058, _object * arg=0x1fa5e218, _object * kw=0x0001296360c8) Line 638 + 0x8a bytes
python34_d.dll!PyObject_Call(_object * func=0x03c10058, _object * arg=0x1fa5e218, _object * kw=0x0001296360c8) Line 2040 + 0x13 bytes
python34_d.dll!ext_do_call(_object * func=0xc42c25a0, _object * * * pp_stack=0x00012b49, int flags=62980184, int na=1, int nk=0) Line 4561 + 0xe bytes
python34_d.dll!PyEval_EvalFrameEx(_frame * f=0xfa4eab18, int throwflag=1) Line 2880
python34_d.dll!fast_function(_object * func=0x0001, _object * * * pp_stack=0x3b42f398, int n=737079376, int na=1661137695, int nk=0) Line 4335
python34_d.dll!call_function(_object * * * pp_stack=0x00012beef049, int oparg=131) Line 4263
python34_d.dll!PyEval_EvalFrameEx(_frame * f=0x00010f38ec28, int throwflag=0) Line 2840
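Editor's note (not part of the original message): when a crash consistently fires inside the cycle collector, the gc module's debug flags can help narrow the window in which the heap gets corrupted. A hedged sketch:

```python
import gc

# Print collection statistics and report uncollectable objects on every
# collection; noisy, but it records which generation was being collected
# shortly before things go wrong.
gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_UNCOLLECTABLE)

# Force a full collection at a known-safe point in the program; if the GC
# structures are already corrupted, crashing *here* (instead of at some
# random later allocation) narrows down where the corruption happened.
unreachable = gc.collect()
print("unreachable objects:", unreachable)
```

Sprinkling explicit gc.collect() calls around suspect code paths turns a "crash every few days at a random PyList_New" into a crash much closer to the actual cause.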
[issue23187] Segmentation fault, possibly asyncio related
Michael Goldish added the comment:

> Sorry, you're wrong: the proactor event loop heavily uses the _overlapped module which is implemented in C. A crash in the garbage collector is more likely a bug in asyncio/your application, than a bug in Python itself.

I'm aware of that. I assumed the original crash reported in this thread was the same as mine, but I can't be sure of that. And that crash was on Linux, where asyncio is pure Python AFAIK.

> Please enable asyncio debug mode, ResourceWarning, enable DEBUG log level, and check if you get some errors. For example, "GetQueuedCompletionStatus() returned an unexpected event" usually means that you will quickly get a memory corruption...

I've already done that, but admittedly not for very long, because the server ran so slowly that it was unresponsive. I didn't see any error or warning messages, but I did see numerous "Executing <Handle _ProactorReadPipeTransport._loop_reading(<_OverlappedF...events.py:451)>" messages.

> Would it be possible to test again with fewer clients (500 clients or less to not hit the select() limit on Windows) with selector event loop?

I'll see if I can try that, but I guess reproducing the crash can take much longer with 500 clients.

> Can you check if asyncio.windows_events._BaseWaitHandleFuture is used? This class is used by IocpProactor.wait_for_handle() which is called by _WindowsSubprocessTransport. But you wrote that the server (where the crash occurs) doesn't run subprocesses.

It's not used. I put print statements and an exit() in the constructor of _BaseWaitHandleFuture and it wasn't triggered.

> Are you using asyncio code outside the main thread?

No.

> asyncio is not thread-safe at all :-p Are you using a single event loop or one event loop per thread?

A single event loop, and I've made sure that all code running outside the main thread doesn't call any asyncio functions.
--
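Editor's note: since asyncio is not thread-safe, the one sanctioned way for worker threads (e.g. in a ThreadPoolExecutor) to talk back to the single event loop is call_soon_threadsafe. A minimal sketch with a hypothetical worker, not code from the issue:

```python
import asyncio
import threading

loop = asyncio.new_event_loop()
results = []

def worker():
    # Runs outside the main thread: never call loop.call_soon() or other
    # loop/transport methods directly here. Hand callbacks over via the
    # thread-safe entry point, which also wakes the loop up.
    loop.call_soon_threadsafe(results.append, "done")
    loop.call_soon_threadsafe(loop.stop)

threading.Thread(target=worker).start()
loop.run_forever()   # processes the callbacks queued by the worker
loop.close()
print(results)       # ['done']
```

Any direct loop call from a worker thread is a race; asyncio's debug mode (in 3.4.3+) flags some of these violations.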
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

It looks like you are running your app on Windows. Are you using the proactor event loop? I guess yes, since you have more than 1000 clients and select() is limited to 500 sockets.

In Python 3.4.3, I fixed a *lot* of crashes and race conditions in the proactor event loop. There are maybe more race conditions. Are you running subprocesses with asyncio?

On 14 March 2015 at 05:42, Guido van Rossum <rep...@bugs.python.org> wrote:

> > PyObject_GC_Malloc(unsigned __int64 basicsize=4046951880)
>
> That's nearly 4 GB. I somehow doubt your app is actually trying to allocate that much memory

The caller is PyList_New(0). When using a debugger on an optimized build, you should not trust the debugger. For example, variables can be stored in registers, and displaying values in the old frames can just display junk. In gdb, you see <optimized out>. The fast call convention uses registers to pass the first parameters to functions; I'm not sure that debuggers handle this case correctly. Etc.
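Editor's note: testing with the selector event loop instead of the proactor one, as suggested in this thread, just means installing it explicitly before any connections are made. A hedged sketch against the Python 3.4 API:

```python
import asyncio
import selectors

# On Windows, asyncio 3.4's default loop is already the selector-based one;
# an application that opted into ProactorEventLoop can switch back by
# installing a SelectorEventLoop before creating any transports. Passing an
# explicit SelectSelector mirrors the select()-driven setup discussed here
# (with its ~512-socket cap on Windows).
selector = selectors.SelectSelector()
loop = asyncio.SelectorEventLoop(selector)
asyncio.set_event_loop(loop)
```

If the crash disappears under the selector loop with a reduced client count, that points the finger at the proactor/_overlapped code path.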
[issue23187] Segmentation fault, possibly asyncio related
Michael Goldish added the comment:

> That's nearly 4 GB. I somehow doubt your app is actually trying to allocate that much memory

There's no reason for the app to allocate that much memory in a single call. It's using almost 4 GB of memory in total today, but that makes sense given its load today.

> Is there any chance you can run an unoptimized build?

I'll try doing that now. I suppose I should build it myself with Py_DEBUG enabled.
[issue23187] Segmentation fault, possibly asyncio related
Michael Goldish added the comment:

> It looks like you are running your app on Windows. Are you using the proactor event loop?

Yes.

> In Python 3.4.3, I fixed a *lot* of crashes and race conditions in the proactor event loop. There are maybe more race conditions.

I've already tried Python 3.4.3 and got the same crash. The one described in my long message with the stack trace is from 3.4.2, but I got that same stack trace (or a very similar one) with 3.4.3. I have a core dump from that crash as well, so I can provide the exact stack trace if necessary.

> Are you running subprocesses with asyncio?

I can't recall running any subprocesses in the server program. The clients do that very often, but they're not the ones crashing...

> The caller is PyList_New(0).

I suppose this creates an empty list? I managed to get a stack trace from faulthandler somehow, and it points to this line in one of the threads:

d = rg.item_counts_by_params.setdefault(params, {})

This runs in a thread (managed by a ThreadPoolExecutor) that computes stuff and writes it to a file. It's in a loop that runs ~20 times per job. Several lines above there's a similar line that passes an empty list as the 2nd parameter to setdefault(). The other threads are all in threading.wait(), probably doing nothing interesting. One thread is in _run_once(). Unfortunately I don't have any additional information about this particular crash (no dump or stack trace), but it's not unreasonable to assume other crashes occurred during creation of new lists/dicts in a similar manner.
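Editor's note: for readers unfamiliar with the pattern in the crashing line — dict.setdefault returns the existing value for a key, or inserts the given default and returns it. The names below are hypothetical stand-ins for the report's rg.item_counts_by_params:

```python
# Hypothetical stand-in for the nested counts dict from the report.
item_counts_by_params = {}

params = ("size", 42)                        # any hashable key
d = item_counts_by_params.setdefault(params, {})
d["count"] = 1                               # mutates the stored dict in place

# A second lookup with the same key returns the *same* dict object --
# the {} default is only inserted (and kept) on the first call.
assert item_counts_by_params.setdefault(params, {}) is d
print(item_counts_by_params)                 # {('size', 42): {'count': 1}}
```

Note that each setdefault call with a missing key allocates a fresh empty dict/list, which is exactly the kind of small GC-tracked allocation (like PyList_New(0)) that can trigger a collection and expose pre-existing heap corruption.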
[issue23187] Segmentation fault, possibly asyncio related
Michael Goldish added the comment:

I caught another crash just now, this time in update_refs(). A stack trace is attached below. I still think this is the same issue.

static void
update_refs(PyGC_Head *containers)
{
    PyGC_Head *gc = containers->gc.gc_next;
67382D60  mov  rdx,qword ptr [rcx]
    for (; gc != containers; gc = gc->gc.gc_next) {
67382D63  cmp  rdx,rcx
67382D66  je   update_refs+28h (67382D88h)
67382D68  nop  dword ptr [rax+rax]
        assert(_PyGCHead_REFS(gc) == GC_REACHABLE);
        _PyGCHead_SET_REFS(gc, Py_REFCNT(FROM_GC(gc)));
67382D70  and  qword ptr [rdx+10h],1

Here rdx is 0, which means we're writing to 0x10, probably because the member gc_refs is at offset 0x10 of struct PyGC_Head.gc. So I suppose containers->gc.gc_next was NULL.

(In case this is helpful: somehow, possibly due to compiler optimizations, Visual Studio claims that containers->gc.gc_next is 0x34. I'm not sure what to make of this. It also claims that containers->gc.gc_prev->gc.gc_next is 0x3e, and that containers->gc.gc_prev->gc.gc_prev->gc.gc_next is 0x3e, and so on... gc_prev always seems fine and gc_next is always 0x3e, except for the first one which is 0x34. I'm attaching a screenshot to make this clearer.)
Stack trace (Python 3.4.2, 64 bit, Windows):
python34.dll!update_refs(_gc_head * containers=0x676af8e0) Line 345
python34.dll!collect(int generation=-290088656, __int64 * n_collected=0xf166e920, __int64 * n_uncollectable=0x, int nofail=0) Line 969
python34.dll!collect_with_callback(int generation=-290088656) Line 1141
python34.dll!_PyObject_GC_Malloc(unsigned __int64 basicsize=4046951880) Line 1739
python34.dll!_PyObject_GC_New(_typeobject * tp=0x01c624f0) Line 1749
python34.dll!PyList_New(__int64 size=0) Line 159 + 0xc bytes
python34.dll!PyEval_EvalFrameEx(_frame * f=0xbab65b08, int throwflag=-244913096) Line 2346
python34.dll!fast_function(_object * func=0x0003, _object * * * pp_stack=0xf77684e0, int n=102445400, int na=1732453353, int nk=0) Line 4332
python34.dll!call_function(_object * * * pp_stack=0xf166ec29, int oparg=131) Line 4260
python34.dll!PyEval_EvalFrameEx(_frame * f=0x601cbd68, int throwflag=-244912600) Line 2838
python34.dll!fast_function(_object * func=0x0003, _object * * * pp_stack=0xf7768f28, int n=56017240, int na=1732453353, int nk=0) Line 4332
python34.dll!call_function(_object * * * pp_stack=0xf166ee19, int oparg=131) Line 4260
python34.dll!PyEval_EvalFrameEx(_frame * f=0xc65ff8f8, int throwflag=0) Line 2838
python34.dll!PyEval_EvalCodeEx(_object * _co=0x0002, _object * globals=0x0002, _object * locals=0x, _object * * args=0x0358d248, int argcount=2, _object * * kws=0x01c50060, int kwcount=0, _object * * defs=0x, int defcount=0, _object * kwdefs=0x, _object * closure=0x) Line 3585 + 0xa bytes
python34.dll!function_call(_object * func=0x0355f048, _object * arg=0xf12f7688, _object * kw=0x) Line 638 + 0x45 bytes
python34.dll!PyObject_Call(_object * func=0xf12f7688, _object * arg=0xf11c4d08, _object * kw=0xf4e9ba58) Line 2068
python34.dll!ext_do_call(_object * func=0x0355f048, _object * * * pp_stack=0xf166f0d9, int flags=-200649216, int na=1, int nk=0) Line 4558 + 0xe bytes
python34.dll!PyEval_EvalFrameEx(_frame * f=0xf40a5458, int throwflag=-244911400) Line 2879
python34.dll!fast_function(_object * func=0x0001, _object * * * pp_stack=0xf1380f98, int n=45993224, int na=1732453353, int nk=0) Line 4332
python34.dll!call_function(_object * * * pp_stack=0xf166f2c9, int oparg=131) Line 4260
python34.dll!PyEval_EvalFrameEx(_frame * f=0xd6922548, int throwflag=0) Line 2838
python34.dll!PyEval_EvalCodeEx(_object * _co=0x0002, _object * globals=0x0002, _object * locals=0x, _object * * args=0x02334200, int argcount=2, _object * * kws=0x01c50060, int kwcount=0, _object * * defs=0x, int defcount=0, _object * kwdefs=0x, _object * closure=0x) Line 3585 + 0xa bytes
python34.dll!function_call(_object * func=0x02bdcbf8, _object * arg=0xe41ef808, _object * kw=0x) Line 638 + 0x45 bytes
python34.dll!PyObject_Call(_object * func=0xe41ef808, _object * arg=0xef4ad308, _object * kw=0xdeda7148) Line 2068
python34.dll!ext_do_call(_object * func=0x02bdcbf8, _object * * * pp_stack=0xf166f589, int flags=-280305184, int na=0, int nk=0) Line 4558 + 0xe bytes
python34.dll!PyEval_EvalFrameEx(_frame * f=0xef4ae048, int throwflag=-244910200) Line 2879
[issue23187] Segmentation fault, possibly asyncio related
Guido van Rossum added the comment:

So what do you make of this?

> PyObject_GC_Malloc(unsigned __int64 basicsize=4046951880)

That's nearly 4 GB. I somehow doubt your app is actually trying to allocate that much memory -- maybe the type object a few lines below in the stack is overwritten, or maybe the compiler is lying about everything. Is there any chance you can run an unoptimized build?
[issue23187] Segmentation fault, possibly asyncio related
Michael Goldish added the comment:

I don't see a reason to assume the machine was running out of memory. After each crash, the process was kept alive by one of those Windows crash dialogs ("the process terminated unexpectedly" or similar). I could see exactly how much memory was occupied by the process, and it was never a significant percentage of the machine's total memory. It sometimes exceeded 2 GB, but I'm using a 64-bit build, so this shouldn't be a problem.
[issue23187] Segmentation fault, possibly asyncio related
Michael Goldish added the comment:

> Why do you consider that it's the same issue?

Because it's a very similar stack trace, the crash itself is in subtype_dealloc(), it happens once every few days, asyncio is involved, and the issue is relatively new - I couldn't find another bug report with a similar description. I have core dumps if you're interested but am not sure where to upload them.

> You can try to use faulthandler to write the traceback of all Python threads where the crash occurred

I've enabled faulthandler, the program crashed, and nothing was written to the file. For reference, I caused an artificial crash in the same Python program (using ctypes) and the stack traces *were* written. I verified this several times, both with the real crash and the artificial one, and the real crash always left the file empty.

> I guess that Visual Studio can load such minidump files.

I've already looked at one of my dumps in Visual Studio but couldn't make sense of what was going on. The structures involved seemed to contain data that made no sense to me (e.g. some pointers had values like 0x10, which usually makes no sense for a pointer and seems to indicate memory corruption).

> Did you see some errors or warnings when running your application with asyncio in debug mode?

No, but I'll try. I doubt the problem is in asyncio itself because it's mostly written in Python. This looks like a garbage collection issue.

> Can you maybe send me the source code of your application?

I'll ask to see if that's permissible, but either way you will probably have a hard time reproducing the crash - the program is a server which maintains active TCP connections to more than 1000 client instances, writes large text files in background threads, pickles large objects... The frequency of the crash seems to be correlated with how busy the server is. Without a proper setup it'll be difficult to make it busy.
--
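Editor's note: an "artificial crash using ctypes" of the kind described above can be set up like this. A hedged sketch; the function name and the command-line guard are my own, and the guarded call terminates the process when run:

```python
import ctypes
import faulthandler
import sys

# Dump all thread tracebacks to stderr on a fatal signal.
faulthandler.enable(file=sys.stderr, all_threads=True)

def artificial_crash():
    # Read from address 0: a guaranteed access violation / segfault, useful
    # for verifying that faulthandler actually manages to write its
    # traceback on this setup before the process dies.
    ctypes.string_at(0)

if "--crash" in sys.argv:   # guarded so it never runs by accident
    artificial_crash()
```

That the artificial crash produced tracebacks while the real one left the file empty suggests the real crash corrupts state badly enough (or aborts via the CRT) that faulthandler's signal handler never gets a usable stack.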
[issue23187] Segmentation fault, possibly asyncio related
Guido van Rossum added the comment:

Are we all in agreement that it's probably somehow running out of memory?
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

> I'm not sure where to post this as the issue is closed:

Please open a new issue.

> I've encountered the same problem.

Why do you consider that it's the same issue?

> The crash seems to occur at random, with anything between 1 and 20 days separating consecutive crashes.

Ouch. I hate such random bugs.

> I have core dumps if you're interested but am not sure where to upload them.

I don't know how to analyze a Windows core dump. Sorry, but without more information, it will be almost impossible to fix your issue.

You can try to use faulthandler to write the traceback of all Python threads where the crash occurred:
https://docs.python.org/dev/library/faulthandler.html#faulthandler.enable
Write the output into a file. The output is text.

Brian Curtin wrote a similar tool but more specific to Windows:
https://bitbucket.org/briancurtin/minidumper/
"minidumper writes minidumps for post-mortem analysis of crashes in Python or its extensions."
Problem: I don't know how to use such a dump!? I guess that Visual Studio can load such minidump files:
https://msdn.microsoft.com/en-us/library/windows/desktop/ee416349%28v=vs.85%29.aspx#analyzing_a_minidump

Did you see some errors or warnings when running your application with asyncio in debug mode?
https://docs.python.org/dev/library/asyncio-dev.html

Please use at least Python 3.4.3, which embeds a recent version of asyncio; it enables more checks in debug mode.

Can you maybe send me the source code of your application?
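Editor's note: Victor's faulthandler suggestion, spelled out. A hedged sketch; the file name is an assumption, and the key detail is that the file object must stay alive, since faulthandler writes to its raw file descriptor at crash time:

```python
import faulthandler

# Keep a module-level reference to the file object: if it is garbage
# collected, the descriptor faulthandler holds becomes invalid and the
# crash traceback is lost.
crash_log = open("crash_traceback.txt", "w")
faulthandler.enable(file=crash_log, all_threads=True)

# Optional: dump all thread tracebacks on demand, to verify the setup
# works without waiting days for the real crash.
faulthandler.dump_traceback(file=crash_log, all_threads=True)
```

On a fatal signal (SIGSEGV, SIGFPE, SIGABRT, ...), faulthandler writes a plain-text traceback of every Python thread into crash_traceback.txt.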
[issue23187] Segmentation fault, possibly asyncio related
Michael Goldish added the comment:

I'm not sure where to post this as the issue is closed: I've encountered the same problem. I don't have a stack trace to share at the moment, but I know it's almost (or entirely) identical. My context:

- Windows
- Happens with Python 3.4.2 and 3.4.3 (and possibly earlier versions)
- Pure Python (no external binary modules whatsoever)
- Pure asyncio (no Tornado)
- Not using json/ujson
- Plenty of memory available

The crash seems to occur at random, with anything between 1 and 20 days separating consecutive crashes. I have core dumps if you're interested but am not sure where to upload them.

--
nosy: +michaelg
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

Since the issue looks to be related to a third-party module (ujson), I'm closing this issue.

Note: on UNIX, asyncio is fully implemented in Python; I don't see how it could corrupt internal Python structures.

--
resolution:  -> not a bug
status: open -> closed
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

@Ivailo: Any progress on your investigation?
[issue23187] Segmentation fault, possibly asyncio related
Ivailo Karamanolev added the comment:

@Victor It has been running for 5 days now on 3.4.2, using json instead of ujson. I want to give it at least 10 days to crash, and if it's still up, I plan to switch to simplejson to see how that will go, since json is quite slow. I'll keep playing with different modules, and when/if I have something significant that I can reproduce - I'll post here. If the problems are resolved, I'll close the ticket. If that doesn't sound reasonable, please let me know.
[issue23187] Segmentation fault, possibly asyncio related
Changes by Rickard Englund gefle.rick...@gmail.com: -- nosy: +r-englund
[issue23187] Segmentation fault, possibly asyncio related
New submission from Ivailo Karamanolev: I have a program that's running Tornado on top of asyncio and uses mysql-connector-python to monitor a set of HTTP servers and save the status to MySQL. Initially I ran it with the current version of Python in the Ubuntu 14.04 repos, that's 3.4.0, and started getting segmentation faults after keeping it running for a few days. I compiled and installed 3.4.2 and the same thing keeps happening, with the same stack trace (per gdb). Running it with and without gdb attached gives the same result - seg fault after 3-4 days of running. The machine is running on a Xeon with ECC memory and doesn't show signs of illness otherwise, so it's probably not a hardware problem. Also, it's happening consistently when the program runs for 3-4 days, with or without a system reboot.

Program received signal SIGSEGV, Segmentation fault.
0x00471ed7 in subtype_dealloc (self=0x716a3860) at Objects/typeobject.c:1128
1128    _PyObject_GC_UNTRACK(self);

Stack trace:
#0  0x00471ed7 in subtype_dealloc (self=0x716a3860) at Objects/typeobject.c:1128
#1  0x004563f4 in dict_dealloc (mp=0x7fffe95444c8) at Objects/dictobject.c:1383
#2  0x00472140 in subtype_dealloc (self=0x7fffe8e1fcf8) at Objects/typeobject.c:1157
#3  0x0058ad22 in frame_dealloc (f=0x1145258) at Objects/frameobject.c:429
#4  0x00588fd7 in gen_send_ex (gen=gen@entry=0x7fffe8c5cea0, gen=gen@entry=0x7fffe8c5cea0, exc=0, arg=arg@entry=0x0) at Objects/genobject.c:145
#5  _PyGen_Send (gen=gen@entry=0x7fffe8c5cea0, arg=arg@entry=0x87fec0 <_Py_FalseStruct>) at Objects/genobject.c:158
#6  0x004da329 in PyEval_EvalFrameEx (f=f@entry=0x11faaf8, throwflag=throwflag@entry=0) at Python/ceval.c:1906
#7  0x00588e5c in gen_send_ex (gen=gen@entry=0x7fffe8c5ccf0, gen=gen@entry=0x7fffe8c5ccf0, exc=0, arg=arg@entry=0x71bb6420) at Objects/genobject.c:104
#8  _PyGen_Send (gen=gen@entry=0x7fffe8c5ccf0, arg=arg@entry=0x87fec0 <_Py_FalseStruct>) at Objects/genobject.c:158
#9  0x004da329 in PyEval_EvalFrameEx (f=f@entry=0x10daa48, throwflag=throwflag@entry=0) at Python/ceval.c:1906
#10 0x00588e5c in gen_send_ex (gen=gen@entry=0x7fffe8c5cc18, gen=gen@entry=0x7fffe8c5cc18, exc=0, arg=arg@entry=0x102aac8) at Objects/genobject.c:104
#11 _PyGen_Send (gen=gen@entry=0x7fffe8c5cc18, arg=arg@entry=0x87fec0 <_Py_FalseStruct>) at Objects/genobject.c:158
#12 0x004da329 in PyEval_EvalFrameEx (f=f@entry=0x10da7f8, throwflag=throwflag@entry=0) at Python/ceval.c:1906
#13 0x00588e5c in gen_send_ex (gen=0x716b65a0, gen=0x716b65a0, exc=0, arg=<optimized out>) at Objects/genobject.c:104
#14 _PyGen_Send (gen=0x716b65a0, arg=<optimized out>) at Objects/genobject.c:158
#15 0x004d9d0b in call_function (oparg=<optimized out>, pp_stack=0x7fffd290) at Python/ceval.c:4221
#16 PyEval_EvalFrameEx (f=f@entry=0x10da568, throwflag=throwflag@entry=0) at Python/ceval.c:2836
#17 0x004d28d4 in PyEval_EvalCodeEx (_co=0x71e0c030, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=3, kws=0x12e38b8, kwcount=0, defs=0x71e0a8e0, defcount=2, kwdefs=0x0, closure=0x71e10588) at Python/ceval.c:3585
#18 0x004d8e75 in fast_function (nk=<optimized out>, na=3, n=3, pp_stack=0x7fffd490, func=0x71e122f0) at Python/ceval.c:4341
#19 call_function (oparg=<optimized out>, pp_stack=0x7fffd490) at Python/ceval.c:4259
#20 PyEval_EvalFrameEx (f=f@entry=0x12e3708, throwflag=throwflag@entry=0) at Python/ceval.c:2836
#21 0x004d28d4 in PyEval_EvalCodeEx (_co=0x71e0c0c0, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x7fffeae8a720, argcount=2, kws=kws@entry=0x0, kwcount=kwcount@entry=0, defs=defs@entry=0x0, defcount=defcount@entry=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:3585
#22 0x0058c67c in function_call (func=0x71e12378, arg=0x7fffeae8a708, kw=0x0) at Objects/funcobject.c:632
#23 0x0042686a in PyObject_Call (func=func@entry=0x71e12378, arg=arg@entry=0x7fffeae8a708, kw=kw@entry=0x0) at Objects/abstract.c:2067
#24 0x004d48a4 in ext_do_call (nk=<optimized out>, na=1, flags=<optimized out>, pp_stack=0x7fffd760, func=0x71e12378) at Python/ceval.c:4558
#25 PyEval_EvalFrameEx (f=f@entry=0x145ac48, throwflag=throwflag@entry=0) at Python/ceval.c:2876
#26 0x004d9e82 in fast_function (nk=<optimized out>, na=1, n=1, pp_stack=0x7fffd8a0, func=0x71e2d488) at Python/ceval.c:4331
#27 call_function (oparg=<optimized out>, pp_stack=0x7fffd8a0) at Python/ceval.c:4259
#28 PyEval_EvalFrameEx (f=f@entry=0x11aee28, throwflag=throwflag@entry=0) at Python/ceval.c:2836
#29 0x004d9e82 in fast_function (nk=<optimized out>, na=1, n=1, pp_stack=0x7fffd9f0, func=0x71bc1400) at Python/ceval.c:4331
#30 call_function (oparg=<optimized out>, pp_stack=0x7fffd9f0) at Python/ceval.c:4259
#31 PyEval_EvalFrameEx
[issue23187] Segmentation fault, possibly asyncio related
Guido van Rossum added the comment: That's going to be really hard to debug. Is it always the same stack trace? Could it be slowly running out of memory? (The consistent time until it happens points to this.) A segfault in dealloc might point to a refcounting bug. The rarity of the event might mean that it only happens in some rare case (maybe the refcount bug is in an error path that is only triggered by running out of memory somewhere?). Is mysql totally unsuspected?
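[Editor's note: one generic way to gather more data on a crash like this, not mentioned in the thread, is the stdlib `faulthandler` module together with GC debug output; this is a minimal sketch, not the reporter's actual setup:]

```python
import faulthandler
import gc

# Dump the Python-level traceback to stderr if the process receives a
# fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL). Available
# since Python 3.3, so it works on the 3.4.x builds discussed here.
faulthandler.enable()

# Make the cyclic garbage collector print statistics on each collection,
# which helps correlate a crash with a specific collection run.
gc.set_debug(gc.DEBUG_STATS)

print(faulthandler.is_enabled())  # → True
```

With this in place, a segfault during garbage collection at least leaves a Python-level traceback alongside the gdb backtrace.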
[issue23187] Segmentation fault, possibly asyncio related
Ivailo Karamanolev added the comment: Running out of memory is unlikely. There are many other processes running on the machine and none of them show signs of memory issues. In addition, the munin memory usage graphs don't show steady growth, or drops when the process crashes (munin screenshot attached). 3.4.0 and 3.4.2 on all runs showed exactly the same stack trace. AFAIK mysql-connector-python is pure Python, so even if the problem were triggered there, the crash would still be in Python core. I'm not 100% sure about this though. -- Added file: http://bugs.python.org/file37636/Screenshot 2015-01-07 15.57.01.png
[issue23187] Segmentation fault, possibly asyncio related
Ivailo Karamanolev added the comment: What I forgot to mention is that I'm using aiohttp for making lots of HTTP requests. http://bugs.python.org/issue21435 mentions asyncio and aiohttp and is somewhat memory-management related, so I thought I was hitting a different manifestation of it and that upgrading to 3.4.2 would fix it, but it didn't. I don't know enough about CPython to know if they are related.
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

> that's 3.4.0

I suggest you upgrade to Python 3.4.2. Your issue may already be fixed. For example, there was issue #21435 in the garbage collector, which was found with asyncio code. Your crash occurs at _PyObject_GC_UNTRACK(self); which is also related to the garbage collector.
[issue23187] Segmentation fault, possibly asyncio related
Ivailo Karamanolev added the comment: @haypo Maybe I didn't put it prominently enough, but I upgraded to 3.4.2 by downloading the source and installing it. The same crash with the same stack trace happens.
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

> upgrading to 3.4.2 would fix it, but it didn't

Oh sorry, I missed the part where you wrote that you already tested Python 3.4.2. By the way, if you already compile your own Python: try configuring it with ./configure --with-pydebug. That adds many more debug checks of internal consistency.

> Is mysql totally unsuspected?

It looks like mysql-connector-python is written in pure Python. Are you using other external modules which are implemented in C? Would it be possible to share your source code so we can try to reproduce the issue? You may send it to me privately if you prefer.
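[Editor's note: the suggested debug build can be produced roughly as follows; this is a sketch, and the download URL, parallelism level, and application name `myapp.py` are illustrative:]

```shell
# Build a debug interpreter; --with-pydebug enables extra internal
# consistency checks (C assertions, negative-refcount detection, etc.).
wget https://www.python.org/ftp/python/3.4.2/Python-3.4.2.tgz
tar xzf Python-3.4.2.tgz
cd Python-3.4.2
./configure --with-pydebug
make -j4

# Run the application under gdb so a crash leaves a usable backtrace.
gdb --args ./python myapp.py
```

A pydebug interpreter is noticeably slower, but it often turns a silent memory corruption into an early, precise assertion failure.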
[issue23187] Segmentation fault, possibly asyncio related
Ivailo Karamanolev added the comment:

> try to configure Python with ./configure --with-pydebug

I'm installing 3.4.2 with this option right now and I'll reboot and run the script with gdb attached. It might be a few days before I have a result.

> Are you using other external modules which are implemented in C?

The only module that I'm using that's implemented in C is ujson https://pypi.python.org/pypi/ujson . If you think this is relevant, I can easily try changing it to use the built-in json.

> Would it be possible to share your source code?

I can share it privately, yes, but running it on your machine might be too complicated. Sharing the source code will probably only be useful for you to see which libraries I am using and how I'm using asyncio. What I can do is try running it on another machine with another setup. While it is currently running on a physical machine that I set up, I can run it in parallel on an OpenVZ VPS that is installed by the VPS provider, so it should give a completely independent environment. Would that help? If yes, should I also install 3.4.2 from source with --with-pydebug?
[issue23187] Segmentation fault, possibly asyncio related
STINNER Victor added the comment:

> The only module that I'm using that's implemented in C is ujson https://pypi.python.org/pypi/ujson . If you think this is relevant, I can easily try changing it to use the built-in json.

To investigate an issue, we have to reduce the scope. If your program works with json, please use it instead, yes. If the program still crashes with the builtin json, it means that ujson is not the problem. If the program doesn't crash anymore, your issue may come from ujson and the investigation would be completely different...
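[Editor's note: the substitution suggested above can be kept to a one-line change by routing all JSON calls through a single import; the `record` payload below is hypothetical, but the module names are real:]

```python
# Use the stdlib module to rule out the third-party C extension.
# Switching back later is a one-line change: `import ujson as json`.
import json

# A hypothetical monitoring record like the ones the application stores.
record = {"server": "web-1", "status_code": 200, "up": True}

# The stdlib json API (dumps/loads) is call-compatible with ujson here,
# so the rest of the program needs no changes.
payload = json.dumps(record)
assert json.loads(payload) == record
```

If the crash disappears after several days on stdlib json, the C extension becomes the prime suspect; if it persists, ujson is cleared.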