[issue41540] Test test_maxcontext_exact_arith (_decimal) consumes all memory on AIX
Tony Reix added the comment:

Hi Stefan,

In your message https://bugs.python.org/issue41540#msg375462 , you said: "However, instead of freezing the machine, the process gets a proper SIGKILL almost instantly."

That is probably due to a very small Paging Space (PS) on the AIX machine you used for testing. With a very small PS, the OS quickly reaches the point where PS and memory are both full, and it starts killing possible culprits (but often kills innocent processes, like my bash shell). However, with a large PS (the size of the memory, or half of it), the OS takes some time to consume the PS. During that time (many seconds, if not minutes), the OS looks frozen, and a "kill -9 PID" takes many seconds or minutes to take effect.

About -bmaxdata: I have always used it to extend the default memory of a 32bit process, but I have never used it to reduce the possible memory of a 64bit process, since some users may want to use python with hundreds of GigaBytes of memory. And the python executable used for tests is the same one that is delivered to users.

About PSALLOC=early: I confirm that it perfectly fixes the issue. So, we'll use it when testing Python. Our customers should use it, or use ulimit -d . But building the 64bit python executable with -bmaxdata would reduce the possibilities of the python process.

In the future, we'll probably improve the compatibility with Linux so that this (rare) case no longer appears.

BTW, on AIX, we have only 12 test cases failing out of about 32,471 test cases run in 64bit, with probably only 5 remaining serious failures. Both with GCC and XLC. Not bad. Fewer in 32bit. Now studying these few remaining issues and the still skipped tests.

--
Python tracker <https://bugs.python.org/issue41540>

Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
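As a rough illustration of the "ulimit -d" advice in this message, a test driver could cap the data segment of a child process before launching it. This is only a sketch under assumptions: run_with_data_limit and the 4 GiB cap are hypothetical names and values, it relies on Python's standard resource module (Unix only), and it has not been tried on AIX.

```python
import resource
import subprocess
import sys


def run_with_data_limit(argv, max_bytes):
    """Run argv in a child whose soft RLIMIT_DATA is capped at max_bytes."""
    def apply_limit():
        # Runs in the child just before exec(); the hard limit is left untouched.
        _soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
        if hard == resource.RLIM_INFINITY:
            new_soft = max_bytes
        else:
            new_soft = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_DATA, (new_soft, hard))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # The child only reports the soft limit it inherited.
    child = [sys.executable, "-c",
             "import resource; print(resource.getrlimit(resource.RLIMIT_DATA)[0])"]
    result = run_with_data_limit(child, 4 * 1024**3)
    print(result.stdout.strip())
```

Only the child is limited, so the parent (for instance a test runner that also launches the Big Memory tests) keeps whatever limits it already had.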
[issue41540] Test test_maxcontext_exact_arith (_decimal) consumes all memory on AIX
Tony Reix added the comment:

I forgot to say that this behavior was not present in stable version 3.8.5 . Sorry.

On 2 AIX 7.2 machines, testing Python 3.8.5 with:

    + cd /opt/freeware/src/packages/BUILD/Python-3.8.5
    + ulimit -d unlimited
    + ulimit -m unlimited
    + ulimit -s unlimited
    + export LIBPATH=/opt/freeware/src/packages/BUILD/Python-3.8.5/64bit:/usr/lib64:/usr/lib:/opt/lib
    + export PYTHONPATH=/opt/freeware/src/packages/BUILD/Python-3.8.5/64bit/Modules
    + ./python Lib/test/regrtest.py -v test_decimal
    ...

gave:

    507 tests in 227 items.
    507 passed and 0 failed.
    Test passed.

So, this issue with v3.10 (master) looked like a regression to me. However, after hours of debugging the issue, I forgot to mention it in this defect, sorry.

(Previously, I was using limits for -d, -m and -s : max 4GB. However, that turned out to be a problem when running tests with the Python test option -M12Gb, which requires up to, and maybe more than, 12GB of my 16GB memory machine in order to run a large part of the Python Big Memory tests. And thus I removed the limits on these 3 resources, with no problem at all with version 3.8.5 .)
[issue41540] Test test_maxcontext_exact_arith (_decimal) consumes all memory on AIX
Tony Reix added the comment:

Is it a 64bit AIX ?
Yes. AIX has been 64bit only, by default, for ages, but it manages 32bit applications as well as 64bit applications.

The experiments were done with 64bit Python executables on both AIX and Linux.

The AIX machine has 16GB Memory and 16GB Paging Space.
The Linux Fedora32/x86_64 machine has 16GB Memory and 8269820 Paging Space (swapon -s).

Yes, I agree that the behavior of AIX malloc() under "ulimit -d unlimited" is... surprising. And the manual of malloc() does not talk about this.

Anyway, was the test:

    if (size > (size_t)PY_SSIZE_T_MAX)

meant to prevent calling malloc() with such a huge size? If yes, that does not work.
[issue41540] Test test_maxcontext_exact_arith (_decimal) consumes all memory on AIX
Tony Reix added the comment:

Hi Pablo,

I'm only surprised that the maximum size generated in the test is always lower than PY_SSIZE_T_MAX. And this happens both on AIX and on Linux, which both compute the same values.

On AIX, it appears (I've just discovered this) that malloc() does not ALWAYS check that there is enough memory to allocate before starting to claim memory (and thus paging space). This happens when the Data Segment size is unlimited.

On Linux/Fedora, I had no limit either. But it behaves differently: malloc() always checks that the size is correct.
[issue41540] Test test_maxcontext_exact_arith (_decimal) consumes all memory on AIX
Tony Reix added the comment:

Some more explanations.

On AIX, the memory is controlled by the ulimit command. "Global memory" comprises the physical memory and the paging space, associated with the Data Segment.

By default, both Memory and Data Segment are limited:

    # ulimit -a
    data seg size (kbytes, -d) 131072
    max memory size (kbytes, -m) 32768
    ...

However, it is possible to remove the limit, like:

    # ulimit -d unlimited

Now, when the "data seg size" is limited, the malloc() routine checks whether enough memory/paging-space is available, and it immediately returns a NULL pointer if not.

But, when the "data seg size" is unlimited, the malloc() routine first tries to allocate and quickly consumes the paging space, which is much slower than acquiring memory since it consumes disk space. And it nearly hangs the OS. Thus, in that case, it does NOT check whether enough memory or data segment space is available. Bad.

So, this issue appears on AIX only if we have:

    # ulimit -d unlimited

Anyway, the test:

    if (size > (size_t)PY_SSIZE_T_MAX)

in Objects/obmalloc.c: PyMem_RawMalloc() seems weird to me since the max of size is always lower than PY_SSIZE_T_MAX .

--
nosy: -facundobatista, mark.dickinson, pablogsal, rhettinger, skrah
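Since the hang is described as occurring only with an unlimited data segment, a test harness could check that condition before running the huge-allocation test. The sketch below assumes Python's standard resource module; data_segment_limit and risky_on_aix are hypothetical helper names, not part of the CPython test suite.

```python
import resource
import sys


def data_segment_limit():
    """Return the soft RLIMIT_DATA in bytes, or None when it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_DATA)
    return None if soft == resource.RLIM_INFINITY else soft


def risky_on_aix(platform_name, limit):
    """True for the combination described above: AIX with 'ulimit -d unlimited'."""
    return platform_name.startswith("aix") and limit is None


if __name__ == "__main__":
    limit = data_segment_limit()
    print("data seg size:", "unlimited" if limit is None else limit)
    if risky_on_aix(sys.platform, limit):
        print("huge malloc() requests may consume all paging space here")
```

The startswith("aix") check is used because sys.platform has carried a version suffix on older Python releases.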
[issue41540] Test test_maxcontext_exact_arith (_decimal) consumes all memory on AIX
New submission from Tony Reix :

Python master of 2020/08/11

Test test_maxcontext_exact_arith (test.test_decimal.CWhitebox) checks that Python correctly handles a case where an object of size 421052631578947376 is created.

    maxcontext = Context(prec=C.MAX_PREC, Emin=C.MIN_EMIN, Emax=C.MAX_EMAX)

Both on Linux and AIX, we have:

    Context(prec=99, rounding=ROUND_HALF_EVEN, Emin=-99, Emax=99, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])

The test appears in:

    Lib/test/test_decimal.py
    5665 def test_maxcontext_exact_arith(self):

and the issue (on AIX) appears exactly at:

    self.assertEqual(Decimal(4) / 2, 2)

The issue is due to code in Objects/obmalloc.c :

    void *
    PyMem_RawMalloc(size_t size)
    {
        /*
         * Limit ourselves to PY_SSIZE_T_MAX bytes to prevent security holes.
         * Most python internals blindly use a signed Py_ssize_t to track
         * things without checking for overflows or negatives.
         * As size_t is unsigned, checking for size < 0 is not required.
         */
        if (size > (size_t)PY_SSIZE_T_MAX)
            return NULL;
        return _PyMem_Raw.malloc(_PyMem_Raw.ctx, size);
    }

Both on Fedora/x86_64 and AIX, we have:

    size: 421052631578947376
    PY_SSIZE_T_MAX: 9223372036854775807

thus: size < PY_SSIZE_T_MAX and _PyMem_Raw.malloc() is called.

However, on Linux, malloc() returns a NULL pointer in that case, and then Python handles this and correctly runs the test. On AIX, malloc() tries to allocate the requested memory, and the OS gets stuck until the Python process is killed by the OS.

Either size is too small, or PY_SSIZE_T_MAX is not correctly computed:

./Include/pyport.h :

    /* Largest positive value of type Py_ssize_t. */
    #define PY_SSIZE_T_MAX ((Py_ssize_t)(((size_t)-1)>>1))

Anyway, the following code, added in PyMem_RawMalloc() before the call to _PyMem_Raw.malloc() (which in turn calls malloc()):

    if (size == 421052631578947376)
    {
        printf("TONY: 421052631578947376: --> PY_SSIZE_T_MAX: %ld \n", PY_SSIZE_T_MAX);
        return NULL;
    }

does fix the issue on AIX. However, it is simply a way to show where the issue can be fixed. Another solution (fix size < PY_SSIZE_T_MAX) is needed.

--
components: C API
messages: 375302
nosy: T.Rex
priority: normal
severity: normal
status: open
title: Test test_maxcontext_exact_arith (_decimal) consumes all memory on AIX
type: crash
versions: Python 3.10
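The comparison in PyMem_RawMalloc() can be replayed with plain integer arithmetic. The sketch below assumes a 64-bit size_t (as on the machines in this report) and only mirrors the macro from pyport.h; raw_malloc_would_reject is a hypothetical name for the guard, not a CPython function.

```python
SIZE_T_BITS = 64                          # assumption: 64-bit build
size_t_max = (1 << SIZE_T_BITS) - 1       # (size_t)-1
py_ssize_t_max = size_t_max >> 1          # ((size_t)-1) >> 1

requested = 421052631578947376            # size seen in the failing test


def raw_malloc_would_reject(size):
    """Mirror of the guard: if (size > (size_t)PY_SSIZE_T_MAX) return NULL;"""
    return size > py_ssize_t_max


print(py_ssize_t_max)                       # 9223372036854775807
print(raw_malloc_would_reject(requested))   # False: the request reaches malloc()
print(requested // 2**50, "PiB (rounded down)")
```

The guard only catches sizes that would overflow a signed Py_ssize_t. A request of several hundred pebibytes is far below that bound, so whether it fails cleanly depends entirely on the platform's malloc().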
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

I do agree that the example with memchr is not correct.

About your suggestion, I've done it. With 32. And that works fine. All 3 values are passed by value.

    # cat Pb-3.8.5.py
    #!/usr/bin/env python3
    from ctypes import *

    mine = CDLL('./MemchrArgsHack2.so')

    class MemchrArgsHack2(Structure):
        _fields_ = [("s", c_char_p), ("c_n", c_ulong * 2)]

    memchr_args_hack2 = MemchrArgsHack2()
    memchr_args_hack2.s = b"abcdef"
    memchr_args_hack2.c_n[0] = ord('d')
    memchr_args_hack2.c_n[1] = 7

    print( "sizeof(MemchrArgsHack2): ", sizeof(MemchrArgsHack2) )
    print( CFUNCTYPE(c_char_p, MemchrArgsHack2, c_void_p) (('my_memchr', mine)) (memchr_args_hack2, None) )

    # cat MemchrArgsHack2.c
    #include <stdio.h>   /* needed for printf() */
    /* a second #include followed here; its header name is missing from the archive */

    struct MemchrArgsHack2 {
        char *s;
        unsigned long c_n[2];
    };

    extern char *my_memchr(struct MemchrArgsHack2 args) {
        printf("s element : char pointer: %p %s\n", args.s, args.s);
        printf("c_n element 0: a Long: %ld\n", args.c_n[0]);
        printf("c_n element 1: a Long: %ld\n", args.c_n[1]);
        return(args.s +3);
    }

With MAX_STRUCT_SIZE=32:

    TONY Modules/_ctypes/stgdict.c: MAX_STRUCT_SIZE=32
    sizeof(MemchrArgsHack2): 24
    TONY: libffi: src/powerpc/ffi_darwin.c : ffi_prep_cif_machdep()
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size:24 s->type:13 : FFI_TYPE_STRUCT
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() FFI_TYPE_STRUCT Before s->size:24
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() p->size: 8 s->size: 8
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size:16 s->type:13 : FFI_TYPE_STRUCT
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() FFI_TYPE_STRUCT Before s->size:16
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:11 : FFI_TYPE_UINT64
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() p->size: 8 s->size: 8
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:11 : FFI_TYPE_UINT64
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() p->size: 8 s->size:16
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() After ALIGN s->size:16
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() p->size:16 s->size:24
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() After ALIGN s->size:24
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c: ffi_call: FFI_AIX
    TONY: libffi: cif->abi: 1 -(long)cif->bytes : -144 cif->flags : 8 ecif.rvalue : fffd210 fn: 9001000a0227760 FFI_FN(ffi_prep_args) : 9001000a050a108
    s element : char pointer: a154d40 abcdef
    c_n element 0: a Long: 100
    c_n element 1: a Long: 7      <<<< Correct value appears.
    b'def'

With the regular version (16), I have:

    sizeof(MemchrArgsHack2): 24
    TONY: libffi: src/powerpc/ffi_darwin.c : ffi_prep_cif_machdep()
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size:24 s->type:13 : FFI_TYPE_STRUCT
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() FFI_TYPE_STRUCT Before s->size:24
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() p->size: 8 s->size: 8
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() p->size: 8 s->size:16
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() After ALIGN s->size:16
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c: ffi_call: FFI_AIX
    TONY: libffi: cif->abi: 1 -(long)cif->bytes : -144 cif->flags : 8 ecif.rvalue : fffd210 fn: 9001000a0227760 FFI_FN(ffi_prep_args) : 9001000a050a108
    s element : char pointer: a154d40 abcdef
    c_n element 0: a Long: 100
    c_n element 1: a Long: 0      <<< Python pushed nothing for this.
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

After more investigations, we (Damien and I) think that there are several issues in Python 3.8.5 :

1) Documentation.
   a) AFAIK, the only place in the Python ctypes documentation that talks about how arrays in a structure are managed is: https://docs.python.org/3/library/ctypes.html#arrays
   b) The size of the structure in the example given there is much greater than in our case.
   c) The documentation does NOT say that a structure <= 16 bytes and a structure greater than 16 bytes are managed differently. That's a bug in the documentation vs the code.

2) Tests.
   Looking at the tests, there is NO test for our case.

3) There is a bug in Python.
   About the issue here, we see with gdb that Python provides libffi with a description saying that our case is passed as pointers. However, Python does NOT provide libffi with pointers for the array c_n, but with values.

4) libffi obeys the Python directives given in the description, thinking that it deals with 2 pointers, and thus it pushes only 2 values in registers R3 and R4.

= Bug in Python:

- 1) gdb

    (gdb) b ffi_call
    Breakpoint 1 at 0x900016fab80: file ../src/powerpc/ffi_darwin.c, line 919.
    (gdb) run
    Starting program: /home2/freeware/bin/python3 /tmp/Pb_damien2.py
    Thread 2 hit Breakpoint 1, ffi_call (cif=0xfffd108, fn=@0x9001000a0082640: 0x91b0d60 , rvalue=0xfffd1d0, avalue=0xfffd1c0)
    (gdb) p *(ffi_cif *)$r3
    $9 = {abi = FFI_AIX, nargs = 2, arg_types = 0xfffd1b0, rtype = 0xa435cb8, bytes = 144, flags = 8}
    (gdb) x/2xg 0xfffd1b0
    0xfffd1b0: 0x0a43ca48 0x08001000a0002a10
    (gdb) p *(ffi_type *)0x0a43ca48
    $11 = {size = 16, alignment = 8, type = 13, elements = 0xa12eed0}   <= 13==FFI_TYPE_STRUCT size == 16 on AIX!!! == 24 on Linux
    (gdb) p *(ffi_type *)0x08001000a0002a10
    $12 = {size = 8, alignment = 8, type = 14, elements = 0x0}   <= FFI_TYPE_POINTER
    (gdb) x/3xg *(long *)$r6
    0xa436050: 0x0a152200 0x0064
    0xa436060: 0x0007   <= 7 is present in avalue[2]
    (gdb) x/s 0x0a152200
    0xa152200: "abcdef"

- 2) prints in libffi: AIX : aix_adjust_aggregate_sizes()

    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size:24 s->type:13 : FFI_TYPE_STRUCT
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() FFI_TYPE_STRUCT Before s->size:24
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() p->size: 8 s->size: 8
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() p->size: 8 s->size:16
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() After ALIGN s->size:16
    TONY: libffi: src/powerpc/ffi_darwin.c : aix_adjust_aggregate_sizes() s->size: 8 s->type:14 : FFI_TYPE_POINTER
    TONY: libffi: src/powerpc/ffi_darwin.c: ffi_call: FFI_AIX
    TONY: libffi: cif->abi: 1 -(long)cif->bytes : -144 cif->flags : 8 ecif.rvalue : fffd200 fn: 9001000a0227760 FFI_FN(ffi_prep_args) : 9001000a050a108
    s element : char pointer: a153d40 abcdef
    c_n element 0: a Long: 100      0X64 = 100 instead of a pointer
    c_n element 1: a Long: 0

libffi obeys the description given by Python and pushes to R4 only what it thinks is a pointer (100 instead), and nothing in R5.

Summary:
- The Python documentation is incomplete vs the code.
- Python code gives libffi a description about pointers, but Python code provides libffi with values.
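To see on which side of the 16-byte boundary a given structure falls, its size and field layout can be inspected from Python. The sketch below uses only documented ctypes calls; the two structure definitions are the ones from the test scripts in this issue, while THRESHOLD and describe are hypothetical names, and the constant merely restates the 16-byte limit discussed above rather than reading it from ctypes.

```python
from ctypes import Structure, c_char_p, c_uint, c_ulong, sizeof

THRESHOLD = 16  # the 16-byte limit discussed above, restated as a plain constant


class MemchrArgsHack2(Structure):
    # pointer + array of two unsigned longs (24 bytes on a 64-bit LP64 build)
    _fields_ = [("s", c_char_p), ("c_n", c_ulong * 2)]


class MemchrArgsHack2UInt(Structure):
    # same shape with unsigned ints, as in the uint variant of the script
    _fields_ = [("s", c_char_p), ("c_n", c_uint * 2)]


def describe(struct_type):
    """Return (total size, fits within THRESHOLD, [(field, offset, size), ...])."""
    size = sizeof(struct_type)
    fields = []
    for name, _ctype in struct_type._fields_:
        descriptor = getattr(struct_type, name)
        fields.append((name, descriptor.offset, descriptor.size))
    return size, size <= THRESHOLD, fields


for cls in (MemchrArgsHack2, MemchrArgsHack2UInt):
    print(cls.__name__, describe(cls))
```

On a 64-bit LP64 platform the ulong version is larger than the threshold and the uint version is not, which matches the observation that the symptom depends on the length of the data.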
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

Fedora32/x86_64 : Python v3.8.5 : optimized : uint type.

If, instead of using the ulong type, the Pb.py program makes use of uint, the issue is different: see below. This means that the issue depends on the length of the data.

    BUILD=optimized
    TYPE=int
    export LD_LIBRARY_PATH=/opt/freeware/src/packages/BUILD/Python-3.8.5/build/optimized:/usr/lib64:/usr/lib
    export PYTHONPATH=/opt/freeware/src/packages/BUILD/Python-3.8.5/build/optimized/Modules
    ./Pb-3.8.5-int-optimized.py
    b'def'
    None
    None

    # cat ./Pb-3.8.5-int-optimized.py
    #!/opt/freeware/src/packages/BUILD/Python-3.8.5/build/optimized/python
    # #!/opt/freeware/src/packages/BUILD/Python-3.8.5/python
    # #!/usr/bin/env python3
    from ctypes import *

    libc = CDLL('/usr/lib64/libc-2.31.so')

    class MemchrArgsHack(Structure):
        _fields_ = [("s", c_char_p), ("c", c_uint), ("n", c_uint)]

    memchr_args_hack = MemchrArgsHack()
    memchr_args_hack.s = b"abcdef"
    memchr_args_hack.c = ord('d')
    memchr_args_hack.n = 7

    class MemchrArgsHack2(Structure):
        _fields_ = [("s", c_char_p), ("c_n", c_uint * 2)]

    memchr_args_hack2 = MemchrArgsHack2()
    memchr_args_hack2.s = b"abcdef"
    memchr_args_hack2.c_n[0] = ord('d')
    memchr_args_hack2.c_n[1] = 7

    print( CFUNCTYPE(c_char_p, c_char_p, c_uint, c_uint, c_void_p)(('memchr', libc))(b"abcdef", c_uint(ord('d')), c_uint(7), None))
    print( CFUNCTYPE(c_char_p, MemchrArgsHack, c_void_p)(('memchr', libc))(memchr_args_hack, None))
    print( CFUNCTYPE(c_char_p, MemchrArgsHack2, c_void_p)(('memchr', libc))(memchr_args_hack2, None))
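For comparison with the struct-based hack in the script, here is how memchr() would conventionally be called through ctypes, with the real C prototype declared via argtypes and restype. This is a sketch, assuming a platform where ctypes.util.find_library("c") (or the fallback CDLL(None)) resolves the C library; find_byte is a hypothetical helper, and the example does not reproduce or work around the bug discussed in this issue.

```python
import ctypes
import ctypes.util

# find_library may return None; CDLL(None) then opens the running process on POSIX.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

memchr = libc.memchr
memchr.restype = ctypes.c_void_p                    # NULL comes back as None
memchr.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]


def find_byte(buf, byte):
    """Return the offset of byte inside a ctypes buffer, or None if absent."""
    addr = memchr(buf, byte, ctypes.sizeof(buf))
    if addr is None:
        return None
    return addr - ctypes.addressof(buf)


buf = ctypes.create_string_buffer(b"abcdef")        # 7 bytes including the NUL
offset = find_byte(buf, ord("d"))
print(offset, buf.raw[offset:offset + 3])           # 3 b'def'
```

Each argument is passed as its own scalar here, so the call does not depend on how a by-value structure is split across registers.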
[issue38628] Issue with ctypes in AIX
Change by Tony Reix :

--
versions: +Python 3.8 -Python 3.7
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

Fedora32/x86_64 : Python v3.8.5 has been built. The issue is still there, but it differs between debug and optimized mode. Thus, the change done in https://bugs.python.org/issue22273 did not fix this issue.

    ./Pb-3.8.5-debug.py : #!/opt/freeware/src/packages/BUILD/Python-3.8.5/build/debug/python
    ...
    ./Pb-3.8.5-optimized.py : #!/opt/freeware/src/packages/BUILD/Python-3.8.5/build/optimized/python

    BUILD=debug
    export LD_LIBRARY_PATH=/opt/freeware/src/packages/BUILD/Python-3.8.5/build/debug:/usr/lib64:/usr/lib
    export PYTHONPATH=/opt/freeware/src/packages/BUILD/Python-3.8.5/build/debug/Modules
    ./Pb-3.8.5-debug.py
    b'def'
    None
    None

    BUILD=optimized
    export LD_LIBRARY_PATH=/opt/freeware/src/packages/BUILD/Python-3.8.5/build/optimized:/usr/lib64:/usr/lib
    export PYTHONPATH=/opt/freeware/src/packages/BUILD/Python-3.8.5/build/optimized/Modules
    + ./Pb-3.8.5-optimized.py
    b'def'
    Pb-3.8.5.sh: line 6: 103569 Segmentation fault (core dumped) ./Pb-3.8.5-$BUILD.py
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

After adding traces and rebuilding Python and libffi with -O0 -g -gdwarf, it appears that, still in 64bit, the bug is still there, but that ffi_call_AIX is now called instead of ffi_call_DARWIN from the ffi_call() routine of ../src/powerpc/ffi_darwin.c (lines 915...). ???

    # ./Pb.py
    TONY: libffi: src/powerpc/ffi_darwin.c : FFI_AIX
    TONY: libffi: cif->abi: 1 -(long)cif->bytes : -144 cif->flags : 8 ecif.rvalue : fffd1f0 fn: 9001000a0082640 FFI_FN(ffi_prep_args) : 9001000a0483be8
    b'def'
    TONY: libffi: src/powerpc/ffi_darwin.c : FFI_AIX
    TONY: libffi: cif->abi: 1 -(long)cif->bytes : -144 cif->flags : 8 ecif.rvalue : fffd220 fn: 9001000a0082640 FFI_FN(ffi_prep_args) : 9001000a0483be8
    b'def'
    TONY: libffi: src/powerpc/ffi_darwin.c : FFI_AIX
    TONY: libffi: cif->abi: 1 -(long)cif->bytes : -144 cif->flags : 8 ecif.rvalue : fffd220 fn: 9001000a0082640 FFI_FN(ffi_prep_args) : 9001000a0483be8
    None

In 32bit, with the same build environment, different code is run, since the traces are not printed. Thus, 32bit and 64bit are managed very differently.
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

On AIX 7.2, with libffi compiled with -O0 -g, I have:

1) Call to memchr thru memchr_args_hack

    #0 0x091b0d60 in memchr () from /usr/lib/libc.a(shr_64.o)
    #1 0x0900058487a0 in ffi_call_DARWIN () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #2 0x090005847eec in ffi_call (cif=0xfff, fn=0xca90, rvalue=0xfff, avalue=0xca80) at ../src/powerpc/ffi_darwin.c:31
    #3 0x0900058f9900 in ?? ()
    #4 0x0900058ebb6c in ?? ()
    #5 0x09000109fc18 in _PyObject_MakeTpCall () from /opt/freeware/lib64/libpython3.8.so

    r3 0xa3659e0 720575940382841312
    r4 0x64 100
    r5 0x7 7

    (gdb) x/s $r3
    0xa3659e0: "abcdef"

2) Call to memchr thru memchr_args_hack2

    #0 0x091b0d60 in memchr () from /usr/lib/libc.a(shr_64.o)
    #1 0x0900058487a0 in ffi_call_DARWIN () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #2 0x090005847eec in ffi_call (cif=0xfff, fn=0xca90, rvalue=0xfff, avalue=0xca80) at ../src/powerpc/ffi_darwin.c:31
    #3 0x0900058f9900 in ?? ()
    #4 0x0900058ebb6c in ?? ()
    #5 0x09000109fc18 in _PyObject_MakeTpCall () from /opt/freeware/lib64/libpython3.8.so

    r3 0xa3659e0 720575940382841312
    r4 0x64 100
    r5 0x0 0

So, it looks like, when libffi is compiled with -O0 -g rather than -O, ffi_call_DARWIN() is called in 64bit in both cases (memchr_args_hack and memchr_args_hack2). However, as seen previously, that was not the case with libffi built with -O .

Moreover, we have in the source code:

    switch (cif->abi)
      {
      case FFI_AIX:
        ffi_call_AIX(&ecif, -(long)cif->bytes, cif->flags, ecif.rvalue, fn,
                     FFI_FN(ffi_prep_args));
        break;
      case FFI_DARWIN:
        ffi_call_DARWIN(&ecif, -(long)cif->bytes, cif->flags, ecif.rvalue, fn,
                        FFI_FN(ffi_prep_args), cif->rtype);

Why is ffi_call_DARWIN called instead of ffi_call_AIX ? Hummm

Will rebuild libffi and python both with gcc -O0 -g -gdwarf and look at the details.
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

    # pwd
    /opt/freeware/src/packages/BUILD/libffi-3.2.1

    # grep -R ffi_closure_ASM *
    powerpc-ibm-aix7.2.0.0/.libs/libffi.exp: ffi_closure_ASM
    powerpc-ibm-aix7.2.0.0/include/ffitarget.h:void * code_pointer; /* Pointer to ffi_closure_ASM */
    src/powerpc/aix_closure.S:.globl ffi_closure_ASM
    src/powerpc/darwin_closure.S:.globl _ffi_closure_ASM
    src/powerpc/ffi_darwin.c: extern void ffi_closure_ASM (void);
        *((unsigned long *)[2]) = (unsigned long) ffi_closure_ASM; /* function */
    src/powerpc/ffitarget.h: void * code_pointer; /* Pointer to ffi_closure_ASM */

    # grep -R ffi_call_AIX *
    powerpc-ibm-aix7.2.0.0/.libs/libffi.exp: ffi_call_AIX
    src/powerpc/aix.S:.globl ffi_call_AIX
    src/powerpc/ffi_darwin.c: extern void ffi_call_AIX(extended_cif *, long, unsigned, unsigned *,

In 64bit, I see that ffi_darwin.c is compiled and used for building libffi.so.6 . Same in 32bit.

The code of file src/powerpc/ffi_darwin.c seems to be able to handle both FFI_AIX and FFI_DARWIN , dynamically, based on cif->abi . The code looks VERY complex!

The hypothesis is that the 64bit code has a bug vs the 32bit version.
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

AIX: difference between 32bit and 64bit.

After the second print, the stack is:

32bit:

    #0 0xd01407e0 in memchr () from /usr/lib/libc.a(shr.o)
    #1 0xd438f480 in ffi_call_AIX () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #2 0xd438effc in ffi_call () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #3 0xd14979bc in ?? ()
    #4 0xd148995c in ?? ()
    #5 0xd20fd5d8 in _PyObject_MakeTpCall () from /opt/freeware/lib/libpython3.8.so

64bit:

    #0 0x091b0d60 in memchr () from /usr/lib/libc.a(shr_64.o)
    #1 0x090001217f00 in ffi_closure_ASM () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #2 0x090001217aac in ffi_prep_closure_loc () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #3 0x09d30900 in ?? ()
    #4 0x09d22b6c in ?? ()
    #5 0x09ebbc18 in _PyObject_MakeTpCall () from /opt/freeware/lib64/libpython3.8.so

So, the execution does not go through the same ffi routines in 32bit and in 64bit. Bug ?

It would be interesting to do the same with Python3 and libffi built with -O0 -g, maybe.
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

On AIX in 32bit, we have:

    Thread 2 hit Breakpoint 2, 0xd01407e0 in memchr () from /usr/lib/libc.a(shr.o)
    (gdb) where
    #0 0xd01407e0 in memchr () from /usr/lib/libc.a(shr.o)
    #1 0xd438f480 in ffi_call_AIX () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #2 0xd438effc in ffi_call () from /opt/freeware/lib/libffi.a(libffi.so.6)
    (gdb) i register
    r0 0xd01407e0 3490973664
    r1 0x2ff20f80 804392832
    r2 0xf07a3cc0 4034542784
    r3 0xb024c558 2955199832
    r4 0x64 100
    r5 0x7 7
    r6 0x0 0
    ...
    (gdb) x/s 0xb024c558
    0xb024c558: "abcdef"

r5 is OK.
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

On Fedora/PPC64LE, where it is OK, the same debug session with gdb gives:

    (gdb) where
    #0 0x77df03b0 in __memchr_power8 () from /lib64/libc.so.6
    #1 0x7fffea167680 in ?? () from /lib64/libffi.so.6
    #2 0x7fffea166284 in ffi_call () from /lib64/libffi.so.6
    #3 0x7fffea1a7fdc in _ctypes_callproc () from /usr/lib64/python3.8/lib-dynload/_ctypes.cpython-38-ppc64le-linux-gnu.so
    ..
    (gdb) i register
    r0 0x7fffea167614 140737120728596
    r1 0x7fffc490 140737488340112
    r2 0x7fffea187f00 140737120861952
    r3 0x7fffea33a140 140737122640192
    r4 0x6464 25700
    r5 0x7 7
    r6 0x0 0
    r7 0x7fffea33a147 140737122640199
    r8 0x7fffea33a140 140737122640192
    (gdb) x/s 0x7fffea33a140
    0x7fffea33a140: "abcdef"

r3: string
r4: 0x6464 : "d" ??
r5: 7 : length of the string !!!
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

On Fedora/x86_64, in order to get the core, one must do:

    coredumpctl -o /tmp/core dump /usr/bin/python3.8
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

On AIX:

    root@castor4## gdb /opt/freeware/bin/python3
    ...
    (gdb) run -m pdb Pb.py
    ...
    (Pdb) n
    b'def'
    > /home2/freeware/src/packages/BUILD/Python-3.8.5/32bit/Pb.py(35)()
    -> print(
    (Pdb) n
    > /home2/freeware/src/packages/BUILD/Python-3.8.5/32bit/Pb.py(36)()
    -> CFUNCTYPE(c_char_p, MemchrArgsHack2,
    (Pdb)
    Thread 2 received signal SIGINT, Interrupt.
    [Switching to Thread 1]
    0x0916426c in __fd_select () from /usr/lib/libc.a(shr_64.o)
    (gdb) b ffi_call
    Breakpoint 1 at 0x1217918
    (gdb) c
    ...
    (Pdb) n
    Thread 2 hit Breakpoint 1, 0x090001217918 in ffi_call () from /opt/freeware/lib/libffi.a(libffi.so.6)
    (gdb) where
    #0 0x090001217918 in ffi_call () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #1 0x090001217780 in ffi_prep_cif_machdep () from /opt/freeware/lib/libffi.a(libffi.so.6)
    #2 0x090001216fb8 in ffi_prep_cif_var () from /opt/freeware/lib/libffi.a(libffi.so.6)
    ..
    (gdb) b memchr
    Breakpoint 2 at 0x91b0d60
    (gdb) c
    Continuing.
    Thread 2 hit Breakpoint 2, 0x091b0d60 in memchr () from /usr/lib/libc.a(shr_64.o)
    (gdb) i register
    r0 0x91b0d60 648518346343124320
    r1 0xfffc8d0 1152921504606832848
    r2 0x9001000a008e8b8 648535941212334264
    r3 0xa3669e0 720575940382845408
    r4 0x64 100
    r5 0x0 0
    r6 0x9001000a04ee730 648535941216921392
    r7 0x0 0
    ...
    (gdb) x/s $r3
    0xa3669e0: "abcdef"

So:
- the string is passed in r3.
- r4 contains "d" = 0x64 = 100
- but the size 7 is missing

Anyway, it seems that ffi does not pass the pointer, but values. However, the length 7 is missing. It is not in r5, nor anywhere in the other registers.
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

Fedora32/x86_64

    [root@destiny10 tmp]# gdb /usr/bin/python3.8 core
    ...
    Core was generated by `python3 ./Pb.py'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0 0x7f898a02a1d8 in __memchr_sse2 () from /lib64/libc.so.6
    Missing separate debuginfos, use: dnf debuginfo-install python3-3.8.3-2.fc32.x86_64
    (gdb) where
    #0 0x7f898a02a1d8 in __memchr_sse2 () from /lib64/libc.so.6
    #1 0x7f898982caf0 in ffi_call_unix64 () from /lib64/libffi.so.6
    #2 0x7f898982c2ab in ffi_call () from /lib64/libffi.so.6
    #3 0x7f8989851ef1 in _ctypes_callproc.cold () from /usr/lib64/python3.8/lib-dynload/_ctypes.cpython-38-x86_64-linux-gnu.so
    #4 0x7f898985ba2f in PyCFuncPtr_call () from /usr/lib64/python3.8/lib-dynload/_ctypes.cpython-38-x86_64-linux-gnu.so
    #5 0x7f8989d6c7a1 in _PyObject_MakeTpCall () from /lib64/libpython3.8.so.1.0
    #6 0x7f8989d69111 in _PyEval_EvalFrameDefault () from /lib64/libpython3.8.so.1.0
    #7 0x7f8989d62ec4 in _PyEval_EvalCodeWithName () from /lib64/libpython3.8.so.1.0
    #8 0x7f8989dde109 in PyEval_EvalCodeEx () from /lib64/libpython3.8.so.1.0
    #9 0x7f8989dde0cb in PyEval_EvalCode () from /lib64/libpython3.8.so.1.0
    #10 0x7f8989dff028 in run_eval_code_obj () from /lib64/libpython3.8.so.1.0
    #11 0x7f8989dfe763 in run_mod () from /lib64/libpython3.8.so.1.0
    #12 0x7f8989cea81b in PyRun_FileExFlags () from /lib64/libpython3.8.so.1.0
    #13 0x7f8989cea19d in PyRun_SimpleFileExFlags () from /lib64/libpython3.8.so.1.0
    #14 0x7f8989ce153c in Py_RunMain.cold () from /lib64/libpython3.8.so.1.0
    #15 0x7f8989dd1bf9 in Py_BytesMain () from /lib64/libpython3.8.so.1.0
    #16 0x7f8989fb7042 in __libc_start_main () from /lib64/libc.so.6
    #17 0x557a1f3c407e in _start ()
[issue38628] Issue with ctypes in AIX
Tony Reix added the comment:

On Fedora32/PPC64LE (5.7.9-200.fc32.ppc64le), with a little change:

    libc = CDLL('/usr/lib64/libc.so.6')

I get the correct answer:

    b'def'
    b'def'
    b'def'

    # python3 --version
    Python 3.8.3
    libffi : 3.1-24

On Fedora32/x86_64 (5.7.9-200.fc32.x86_64), with a little change:

    libc = CDLL('/usr/lib64/libc-2.31.so')

that crashes:

    b'def'
    Segmentation fault (core dumped)

    # python3 --version
    Python 3.8.3
    libffi : 3.1-24

AIX : libffi-3.2.1

On AIX 7.2, with Python 3.8.5 compiled with XLC v13, in 64bit:

    b'def'
    b'def'
    None

On AIX 7.2, with Python 3.8.5 compiled with GCC 8.4, in 64bit:

    b'def'
    b'def'
    None

On AIX 7.2, with Python 3.8.5 compiled with XLC v13, in 32bit ( libc = CDLL('libc.a(shr.o)') ):

    b'def'
    b'def'
    b'def'

On AIX 7.2, with Python 3.8.5 compiled with GCC 8.4, in 32bit:

    b'def'
    b'def'
    b'def'

Preliminary conclusions:
- this is a 64bit issue on AIX and it is independent of the compiler
- it is worse on Fedora/x86_64
- it works perfectly on Fedora/PPC64LE

What a mess.

--
nosy: +T.Rex
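Because the outcome varies with the operating system, CPU and pointer width, it can help to print those facts next to each test result. A minimal sketch using only the standard library; describe_runtime is a hypothetical helper and was not part of the original test scripts.

```python
import platform
import struct
import sys


def describe_runtime():
    """Collect the facts that distinguish the configurations compared above."""
    return {
        "system": platform.system(),                  # operating system name
        "machine": platform.machine(),                # CPU architecture
        "pointer_bits": struct.calcsize("P") * 8,     # 32 or 64
        "byteorder": sys.byteorder,
        "python": platform.python_version(),
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```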