[issue21233] Add *Calloc functions to CPython memory allocation API
Roundup Robot added the comment: New changeset 6374c2d957a9 by Victor Stinner in branch 'default': Issue #21233: Rename the C structure PyMemAllocator to PyMemAllocatorEx to http://hg.python.org/cpython/rev/6374c2d957a9 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue21233 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> I'm not sure: The usual case with ABI changes is that extensions may segfault if they are *not* recompiled [1].

OK, I renamed the structure PyMemAllocator to PyMemAllocatorEx, so compilation now fails because the name PyMemAllocator is not defined. Modules compiled for Python 3.4 will crash on Python 3.5 if they are not recompiled, but I hope that you recompile your modules when you don't use the stable ABI.

Using PyMemAllocator is now more complex because it depends on the Python version. See for example the patch for pyfailmalloc: https://bitbucket.org/haypo/pyfailmalloc/commits/9db92f423ac5f060d6ff499ee4bb74ebc0cf4761

Using the C preprocessor, it's possible to limit the changes.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> Okay, then let's please call it:
> _PyObject_Calloc(void *ctx, size_t nobjs, size_t objsize)
> _PyObject_Alloc(int use_calloc, void *ctx, size_t nobjs, size_t objsize)

The prototype "void * PyMem_RawCalloc(size_t nelem, size_t elsize);" comes from the POSIX standard: http://pubs.opengroup.org/onlinepubs/009695399/functions/calloc.html

I don't want to change the prototype in Python. Extract of the Python documentation:

.. c:function:: void* PyMem_RawCalloc(size_t nelem, size_t elsize)

   Allocates *nelem* elements each whose size in bytes is *elsize* (...)
[issue21233] Add *Calloc functions to CPython memory allocation API
Roundup Robot added the comment: New changeset dff6b4b61cac by Victor Stinner in branch 'default': Issue #21233: Revert bytearray(int) optimization using calloc() http://hg.python.org/cpython/rev/dff6b4b61cac
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> 2) I'm not happy with the refactoring in bytearray_init(). (...)
>
> 3) Somewhat similarly, I wonder if it was necessary to refactor PyBytes_FromStringAndSize(). (...)

OK, I reverted the change on bytearray(int) and opened issue #21644 to discuss these two optimizations.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment: I reread the issue. I hope that I have now addressed all points. The remaining one, bytearray(int), is now tracked by the new issue #21644.

resolution: -> fixed
status: open -> closed
[issue21233] Add *Calloc functions to CPython memory allocation API
Roundup Robot added the comment: New changeset 358a12f4d4bc by Victor Stinner in branch 'default': Issue #21233: Fix _PyObject_Alloc() when compiled with WITH_VALGRIND defined http://hg.python.org/cpython/rev/358a12f4d4bc
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment:

STINNER Victor rep...@bugs.python.org wrote:
> My final commit includes an addition to What's New in Python 3.5 doc, including a notice in the porting section. It is not enough?

I'm not sure: The usual case with ABI changes is that extensions may segfault if they are *not* recompiled [1]. In that case documenting it in What's New is standard procedure. Here the extension *is* recompiled and still segfaults.

> Even if the API is public, the PyMemAllocator thing is low level. It's not part of the stable ABI. Except failmalloc, I don't know any user. I don't expect a lot of complaints and it's easy to port the code.

Perhaps it's worth asking on python-dev. Nathaniel's suggestion isn't bad either (e.g. name it PyMemAllocatorEx).

[1] I was told on python-dev that many people in fact do not recompile.
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment: I did a post-commit review. A couple of things:

1) I think Victor and I have a different view of the calloc() parameters.

    calloc(size_t nmemb, size_t size)

If a memory region of bytes is allocated, IMO 'nbytes' should be in the place of 'nmemb' and '1' should be in the place of 'size'. That is, allocate nbytes elements of size 1:

    calloc(nbytes, 1)

In the commit the parameters are reversed in many places, which confuses me quite a bit, since it means allocate one element of size nbytes:

    calloc(1, nbytes)

2) I'm not happy with the refactoring in bytearray_init(). I think it would be safer to make focused minimal changes in PyByteArray_Resize() instead. In fact, there is a behavior change which isn't correct:

Before:

    >>> x = bytearray(0)
    >>> m = memoryview(x)
    >>> x.__init__(10)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    BufferError: Existing exports of data: object cannot be re-sized

Now:

    >>> x = bytearray(0)
    >>> m = memoryview(x)
    >>> x.__init__(10)
    >>> x[0]
    0
    >>> m[0]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: index out of bounds

3) Somewhat similarly, I wonder if it was necessary to refactor PyBytes_FromStringAndSize(). I find the new version more difficult to understand.

4) _PyObject_Alloc(): assert(nelem <= PY_SSIZE_T_MAX / elsize) can be called with elsize = 0.
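The invariant behind point 2 is that a bytearray with a live buffer export must refuse to change size. The following is a minimal sketch of how one might check that invariant from Python; the function name is hypothetical, and it uses extend() rather than the __init__() call from the review, on the assumption that any operation that needs a resize goes through the same export check.

```python
def resize_with_export_is_rejected():
    """Return True if growing a bytearray with a live memoryview raises BufferError."""
    buf = bytearray(0)
    view = memoryview(buf)  # exports the buffer; buf must not be resized while view is alive
    try:
        buf.extend(bytes(10))  # growing the array requires a resize, which should be refused
    except BufferError:
        return True
    finally:
        view.release()  # drop the export so buf can be resized again
    return False


if __name__ == "__main__":
    print(resize_with_export_is_rejected())
```

Once the view is released, the same extend() call succeeds, which is why the check has to be made while the export is still active.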
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment: I forgot one thing: 5) If WITH_VALGRIND is defined, nbytes is uninitialized in _PyObject_Alloc().
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment: Another thing:

6) We need some kind of prominent documentation that existing programs need to be changed:

    Python 3.5.0a0 (default:62438d1b11c7+, May 3 2014, 23:35:03)
    [GCC 4.8.2] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import failmalloc
    >>> failmalloc.enable()
    >>> bytes(1)
    Segmentation fault (core dumped)
[issue21233] Add *Calloc functions to CPython memory allocation API
Nathaniel Smith added the comment: A simple solution would be to change the name of the struct, so that non-updated libraries will get a compile error instead of a runtime crash.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> 6) We need some kind of prominent documentation that existing programs need to be changed:

My final commit includes an addition to the What's New in Python 3.5 doc, including a notice in the porting section. Is that not enough?

Even if the API is public, the PyMemAllocator thing is low level. It's not part of the stable ABI. Except failmalloc, I don't know any user. I don't expect a lot of complaints and it's easy to port the code.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> 5) If WITH_VALGRIND is defined, nbytes is uninitialized in _PyObject_Alloc().

Did you see my second commit? Isn't it already fixed?
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment:

>> 5) If WITH_VALGRIND is defined, nbytes is uninitialized in _PyObject_Alloc().
>
> Did you see my second commit? Isn't it already fixed?

I don't think so, I have revision 5d076506b3f5 here.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> allocate nbytes elements of size 1

PyObject_Malloc(100) asks to allocate one object of 100 bytes. For PyMem_Malloc() and PyMem_RawMalloc(), it's more difficult to guess, but IMO it's sane to bet that a single memory block of size bytes is requested.

I consider that char data[100] is one object of 100 bytes, but you call it 100 objects of 1 byte.

I don't think that using nelem or elsize matters in practice.
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment:

STINNER Victor rep...@bugs.python.org wrote:
> PyObject_Malloc(100) asks to allocate one object of 100 bytes.

Okay, then let's please call it:

    _PyObject_Calloc(void *ctx, size_t nobjs, size_t objsize)
    _PyObject_Alloc(int use_calloc, void *ctx, size_t nobjs, size_t objsize)
[issue21233] Add *Calloc functions to CPython memory allocation API
Roundup Robot added the comment: New changeset 5b0fda8f5718 by Victor Stinner in branch 'default': Issue #21233: Add new C functions: PyMem_RawCalloc(), PyMem_Calloc(), http://hg.python.org/cpython/rev/5b0fda8f5718

nosy: +python-dev
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> There is no need to hurry.

I changed my mind :-p It should be easier for numpy to test the development version of Python. Let's wait for buildbots.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

Antoine Pitrou wrote:
> The real use case I envision is with huge powers of two. If I write: x = 2 ** 100

I created the issue #21419 for this idea.
[issue21233] Add *Calloc functions to CPython memory allocation API
Roundup Robot added the comment: New changeset 62438d1b11c7 by Victor Stinner in branch 'default': Issue #21233: Oops, Fix _PyObject_Alloc(): initialize nbytes before going to http://hg.python.org/cpython/rev/62438d1b11c7
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment: @Stefan: Can you please review calloc-6.patch? Charles-François wrote that the patch looks good, but for such a critical operation (memory allocation), I would prefer a second review ;)
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment: Victor, sure, maybe not right away. If you prefer to commit very soon, I promise to do a post commit review.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> If you prefer to commit very soon, I promise to do a post commit review.

There is no need to hurry.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment: Patch version 6:

- I renamed the "int zero" parameter to "int use_calloc" and moved it to the first position to avoid confusion with nelem. For example, _PyObject_Alloc(ctx, 1, nbytes, 0) becomes _PyObject_Alloc(0, ctx, 1, nbytes). It is also more logical to put it in the first position. In bytesobject.c, I left the parameter at the end since its meaning is different (fill bytes with zero or not) IMO.

- I removed my hack (premature optimization) "assert(nelem == 1); ... malloc(elsize);" and replaced it with a less surprising "... malloc(nelem * elsize);"

Stefan & Charles-François: I hope that the patch looks better to you.

Added file: http://bugs.python.org/file35097/calloc-6.patch
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment: LGTM!
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment: It looks like Windows also supports lazy initialization of memory pages initialized to zero.

According to my microbenchmark on Linux and Windows, only bytes(n) and bytearray(n) are really faster with use_calloc.patch. Most changes of use_calloc.patch are maybe useless since all bytes are initialized to zero, but just after that they are replaced with new bytes.

Results of bench_alloc2.py on Windows 7: original vs calloc-4.patch+use_calloc.patch:

Common platform:
Timer: time.perf_counter
Python unicode implementation: PEP 393
Bits: int=32, long=32, long long=64, size_t=32, void*=32
Platform: Windows-7-6.1.7601-SP1
CFLAGS: None
Timer info: namespace(adjustable=False, implementation='QueryPerformanceCounter()', monotonic=True, resolution=1e-08)

Platform of campaign orig:
SCM: hg revision=4b97092aa4bd branch=default date=2014-04-27 18:02 +0100
Date: 2014-04-28 09:35:30
Python version: 3.5.0a0 (default, Apr 28 2014, 09:33:30) [MSC v.1600 32 bit (Intel)]
Timer precision: 4.47 us

Platform of campaign calloc:
SCM: hg revision=4f0aaa8804c6 tag=tip branch=default date=2014-04-28 09:27 +0200
Date: 2014-04-28 09:37:37
Python version: 3.5.0a0 (default:4f0aaa8804c6, Apr 28 2014, 09:37:03) [MSC v.1600 32 bit (Intel)]
Timer precision: 4.44 us

-----------------------+-------------+-------------------
Tests                  | orig        | calloc
-----------------------+-------------+-------------------
object()               | 121 ns (*)  | 109 ns (-10%)
b'A' * 10              | 77 ns (*)   | 79 ns
b'A' * 10**3           | 159 ns (*)  | 168 ns (+5%)
b'A' * 10**6           | 428 us (*)  | 415 us
'A' * 10               | 87 ns (*)   | 89 ns
'A' * 10**3            | 175 ns (*)  | 177 ns
'A' * 10**6            | 429 us (*)  | 454 us (+6%)
'A' * 10**8            | 48.4 ms (*) | 49 ms
(None,) * 10**0        | 49 ns (*)   | 51 ns
(None,) * 10**1        | 115 ns (*)  | 99 ns (-14%)
(None,) * 10**2        | 433 ns (*)  | 422 ns
(None,) * 10**3        | 3.58 us (*) | 3.57 us
(None,) * 10**4        | 34.9 us (*) | 34.9 us
(None,) * 10**5        | 347 us (*)  | 351 us
(None,) * 10**6        | 5.14 ms (*) | 4.85 ms (-6%)
(None,) * 10**7        | 53.2 ms (*) | 50.2 ms (-6%)
(None,) * 10**8        | 563 ms (*)  | 515 ms (-9%)
([None] * 10)[1:-1]    | 217 ns (*)  | 217 ns
([None] * 10**3)[1:-1] | 3.89 us (*) | 3.92 us
([None] * 10**6)[1:-1] | 5.13 ms (*) | 5.17 ms
([None] * 10**8)[1:-1] | 634 ms (*)  | 533 ms (-16%)
bytes(10)              | 193 ns (*)  | 206 ns (+7%)
bytes(10**3)           | 266 ns (*)  | 296 ns (+12%)
bytes(10**6)           | 414 us (*)  | 3.89 us (-99%)
bytes(10**8)           | 44.2 ms (*) | 4.56 us (-100%)
bytearray(10)          | 229 ns (*)  | 243 ns (+6%)
bytearray(10**3)       | 301 ns (*)  | 330 ns (+10%)
bytearray(10**6)       | 421 us (*)  | 3.89 us (-99%)
bytearray(10**8)       | 44.4 ms (*) | 4.56 us (-100%)
-----------------------+-------------+-------------------
Total                  | 1.4 sec (*) | 1.16 sec (-17%)
-----------------------+-------------+-------------------
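For readers who want to repeat this kind of measurement without the bench_alloc2.py script (which is not shown in the thread), here is a rough sketch based on the standard timeit module. The helper name best_time and the repeat/number defaults are assumptions made for illustration; absolute numbers will differ from the table depending on platform, allocator and Python version.

```python
import timeit


def best_time(stmt, repeat=5, number=100):
    """Return the best observed per-call time, in seconds, for the statement."""
    # timeit.repeat returns one total duration per repetition;
    # taking the minimum reduces the influence of scheduling noise.
    runs = timeit.repeat(stmt, repeat=repeat, number=number)
    return min(runs) / number


if __name__ == "__main__":
    # Zero-filled allocations versus a fill with non-zero bytes of the same size.
    for stmt in ("bytes(10**6)", "bytearray(10**6)", "b'A' * 10**6"):
        print("%-18s %10.2f us" % (stmt, best_time(stmt) * 1e6))
```

Comparing a zero-filled allocation with a same-sized non-zero fill is only a proxy for the calloc() effect: it does not show which C allocator call was used underneath.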
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment: Changes on the pickle module don't look like an interesting optimization. It even looks slower.

$ python perf.py -b fastpickle,fastunpickle,pickle,pickle_dict,pickle_list,slowpickle,slowunpickle,unpickle ../default/python.orig ../default/python.calloc
...
Report on Linux selma 3.13.9-200.fc20.x86_64 #1 SMP Fri Apr 4 12:13:05 UTC 2014 x86_64 x86_64
Total CPU cores: 4

### fastpickle ###
Min: 0.364510 -> 0.374144: 1.03x slower
Avg: 0.367882 -> 0.377714: 1.03x slower
Significant (t=-11.54)
Stddev: 0.00493 -> 0.00347: 1.4209x smaller

The following not significant results are hidden, use -v to show them: fastunpickle, pickle_dict, pickle_list.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment: Patch version 5. This patch is ready for a review.

Summary of calloc-5.patch:

- add the following functions:
  * void* PyMem_RawCalloc(size_t nelem, size_t elsize)
  * void* PyMem_Calloc(size_t nelem, size_t elsize)
  * void* PyObject_Calloc(size_t nelem, size_t elsize)
  * PyObject* _PyObject_GC_Calloc(size_t basicsize)
- add a "void* calloc(void *ctx, size_t nelem, size_t elsize)" field to the PyMemAllocator structure
- optimize bytes(n) and bytearray(n) to allocate objects using calloc() instead of malloc()
- update tracemalloc to also trace calloc()
- document new functions and add unit tests for the calloc hook (in _testcapi)

Changes between versions 4 and 5:

- revert all changes of use_calloc.patch except bytes(n) and bytearray(n): they were useless according to benchmarks
- _PyObject_GC_Calloc() now takes a single parameter
- add versionadded and versionchanged fields in the documentation

According to benchmarks, calloc() is only useful for large allocations (1 MB?) if only a part of the memory block is modified (to non-zero bytes) just after the allocation. Untouched memory pages don't use physical memory and don't count as RSS memory pages, but it is possible to read their content (null bytes). Using calloc() instead of malloc()+memset(0) doesn't look to be faster (it may be a little bit slower) if all bytes are set just after the allocation.

I chose to only use one parameter for _PyObject_GC_Calloc() because this function is used to allocate Python objects. The structure of a Python object must start with PyObject_HEAD or PyObject_VAR_HEAD and so the total size of an object cannot be expressed as NELEM * ELEMSIZE.

I have no use case for _PyObject_GC_Calloc(), but it makes sense to use it to allocate a large Python object tracked by the GC and using a single memory block for the Python header + data.

PyObject_Calloc() simply uses memset(0) for small objects (<= 512 bytes). It delegates the allocation to PyMem_RawCalloc(), and so indirectly to calloc(), for larger objects.

Note: use_calloc.patch is no longer needed, I merged the two patches since only bytes(n) and bytearray(n) now use calloc().

Added file: http://bugs.python.org/file35072/calloc-5.patch
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment: Demo of calloc-5.patch on Linux. Thanks to calloc(), bytes(50 * 1024 * 1024) doesn't allocate memory for null bytes and so the RSS memory is unchanged (+148 kB, not +50 MB), but tracemalloc says that 50 MB were allocated.

    $ ./python -X tracemalloc
    Python 3.5.0a0 (default:4b97092aa4bd+, Apr 28 2014, 10:40:53)
    [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os, tracemalloc
    >>> os.system("grep RSS /proc/%s/status" % os.getpid())
    VmRSS: 10736 kB
    0
    >>> before = tracemalloc.get_traced_memory()[0]
    >>> large = bytes(50 * 1024 * 1024)
    >>> import sys
    >>> sys.getsizeof(large) / 1024.
    51200.0478515625
    >>> (tracemalloc.get_traced_memory()[0] - before) / 1024.
    51198.1962890625
    >>> os.system("grep RSS /proc/%s/status" % os.getpid())
    VmRSS: 10884 kB
    0
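The tracemalloc half of this demo can be sketched as a small script. The function name traced_growth is hypothetical, and the RSS side of the demo is left out because reading /proc is Linux-specific; the sketch only shows that tracemalloc accounts for the full requested size regardless of whether the pages were ever touched.

```python
import tracemalloc


def traced_growth(nbytes):
    """Return how many bytes tracemalloc attributes to creating bytes(nbytes)."""
    started_here = not tracemalloc.is_tracing()
    if started_here:
        tracemalloc.start()
    try:
        before = tracemalloc.get_traced_memory()[0]  # index 0 is the current traced size
        large = bytes(nbytes)  # zero-filled object; its pages may be mapped lazily by the OS
        after = tracemalloc.get_traced_memory()[0]
        del large
        return after - before
    finally:
        if started_here:
            tracemalloc.stop()


if __name__ == "__main__":
    size = 50 * 1024 * 1024
    print("requested: %d kB, traced: %.1f kB" % (size // 1024, traced_growth(size) / 1024))
```

The traced figure is slightly larger than the requested size because it includes the bytes object header and a few small temporaries.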
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment: With the latest patch the decimal benchmark with a lot of small allocations is consistently 2% slower. Large factorials (where the operands are initialized to zero for the number-theoretic transform) have the same performance with and without the patch. It would be interesting to see some NumPy benchmarks (Nathaniel?).
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> With the latest patch the decimal benchmark with a lot of small allocations is consistently 2% slower.

Does your benchmark use bytes(int) or bytearray(int)? If not, I guess that your benchmark is not reliable because only these two functions are changed by calloc-5.patch, except if there is a bug in my patch.
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment: Hmm, obmalloc.c changed as well, so already the gcc optimizer can take different paths and produce different results. Also I did set mpd_callocfunc to PyMem_Calloc(). 2% slowdown is far from being a tragic result, so I guess we can ignore that. The bytes() speedup is very nice. Allocations that took one second are practically instant now.
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> Also I did set mpd_callocfunc to PyMem_Calloc(). 2% slowdown is far from being a tragic result, so I guess we can ignore that.

Agreed.

> The bytes() speedup is very nice. Allocations that took one second are practically instant now.

Indeed. Victor, thanks for the great work!
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> Hmm, obmalloc.c changed as well, so already the gcc optimizer can take different paths and produce different results.

If decimal depends on allocator performance, you should maybe try to implement a freelist.

> Also I did set mpd_callocfunc to PyMem_Calloc().

I don't understand. Is the 2% slowdown what you get when you use calloc? Do you have the same speed if you don't use calloc? According to my benchmarks, calloc is slower if some bytes are modified later.

> The bytes() speedup is very nice. Allocations that took one second are practically instant now.

Is it really useful? Who needs a bytes(10**8) object? Faster creation of bytearray(int) may be useful in real applications. I really like bytearray and memoryview to avoid memory copies.
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment: The order of the nelem/elsize matters for readability. Otherwise it is not intuitive what happens after the jump to redirect in _PyObject_Alloc(). Why would you assert that 'nelem' is one?
[issue21233] Add *Calloc functions to CPython memory allocation API
Nathaniel Smith added the comment:

> It would be interesting to see some NumPy benchmarks (Nathaniel?).

What is it you want to see? NumPy already uses calloc; we benchmarked it when we added it and it made a huge difference to various realistic workloads :-). What NumPy gets out of this isn't calloc, it's access to tracemalloc.
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> I read again some remarks about alignment, it was suggested to provide allocators providing an address aligned to a requested alignment. This topic was already discussed in #18835.

The alignment issue is really orthogonal to the calloc one, so IMO this shouldn't be discussed here (and FWIW I don't think we should expose those: alignment only matters either for concurrency or SIMD instructions, and I don't think we should try to standardize this kind of API, it's way too special-purpose (then we'd have to think about huge pages, etc...). Whereas calloc is a simple and immediately useful addition, not only for Numpy but also CPython).
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

2014-04-27 10:30 GMT+02:00 Charles-François Natali rep...@bugs.python.org:
>> I read again some remarks about alignment, it was suggested to provide allocators providing an address aligned to a requested alignment. This topic was already discussed in #18835.
>
> The alignment issue is really orthogonal to the calloc one, so IMO this shouldn't be discussed here (and FWIW I don't think we should expose those: alignment only matters either for concurrency or SIMD instructions, and I don't think we should try to standardize this kind of API, it's way too special-purpose (then we'd have to think about huge pages, etc...). Whereas calloc is a simple and immediately useful addition, not only for Numpy but also CPython).

This issue was opened to be able to use tracemalloc on numpy. I would like to make sure that calloc is enough for numpy. I would prefer to change the malloc API only once.
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> This issue was opened to be able to use tracemalloc on numpy. I would like to make sure that calloc is enough for numpy. I would prefer to change the malloc API only once.

Then please at least rename the issue. Also, I don't see why everything should be done at once: calloc support is a self-contained change, which is useful outside of numpy. Enhanced tracemalloc support for numpy certainly belongs to another issue.

Regarding the *Calloc functions: how about we provide a sane API instead of reproducing the cumbersome C API? I mean, why not expose:

    PyAPI_FUNC(void *) PyMem_Calloc(size_t size);

instead of

    PyAPI_FUNC(void *) PyMem_Calloc(size_t nelem, size_t elsize);

AFAICT, the two arguments are purely historical (the form was used when malloc() didn't guarantee suitable alignment, and it has the advantage of performing an overflow check when doing the multiplication, but in our code we always check for it anyway). See
https://groups.google.com/forum/#!topic/comp.lang.c/jZbiyuYqjB4
http://stackoverflow.com/questions/4083916/two-arguments-to-calloc
and
http://www.eglibc.org/cgi-bin/viewvc.cgi/trunk/libc/malloc/malloc.c?view=markup
to check that calloc(nelem, elsize) is implemented as calloc(nelem * elsize).

I'm also concerned about the change to _PyObject_GC_Malloc(): it now calls calloc() instead of malloc(): it can definitely be slower.
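The overflow check on nelem * elsize that the two-argument form makes possible can be illustrated with a short model. This is a sketch in Python rather than the C code from the patch: checked_total_size is a hypothetical helper, and sys.maxsize stands in for the C-level PY_SSIZE_T_MAX. It also guards the elsize == 0 case, which a bare division-based check would not handle.

```python
import sys

PY_SSIZE_T_MAX = sys.maxsize  # largest value a Py_ssize_t can hold on this build


def checked_total_size(nelem, elsize):
    """Return nelem * elsize, or None if the product would exceed PY_SSIZE_T_MAX."""
    if nelem < 0 or elsize < 0:
        raise ValueError("sizes must be non-negative")
    # Guard the division: with elsize == 0 the product is 0 and cannot overflow.
    if elsize != 0 and nelem > PY_SSIZE_T_MAX // elsize:
        return None
    return nelem * elsize


if __name__ == "__main__":
    print(checked_total_size(3, 4))
    print(checked_total_size(10, 0))
    print(checked_total_size(PY_SSIZE_T_MAX, 2))
```

Dividing the limit by one factor, instead of multiplying and comparing afterwards, is what keeps the check valid in C, where the multiplication itself could wrap around.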
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

Note to numpy devs: it would be great if some of you followed the python-dev mailing list (I know it can be quite volume intensive, but maybe simple filters could help keep the noise down): you guys have definitely both expertise and real-life applications that could be very valuable in helping us design the best possible public/private APIs. It's always great to have downstream experts/end-users!
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

I wrote a short microbenchmark allocating objects using my benchmark.py script.

It looks like the operation (None,) * N is slower with calloc-3.patch, but it's unclear how many times slower it is. I don't understand why only this operation has a different speed. Do you have ideas for other benchmarks?

Using the timeit module:

$ ./python.orig -m timeit '(None,) * 10**5'
1000 loops, best of 3: 357 usec per loop
$ ./python.calloc -m timeit '(None,) * 10**5'
1000 loops, best of 3: 698 usec per loop

But with different parameters, the difference is lower:

$ ./python.orig -m timeit -r 20 -n '1000' '(None,) * 10**5'
1000 loops, best of 20: 362 usec per loop
$ ./python.calloc -m timeit -r 20 -n '1000' '(None,) * 10**5'
1000 loops, best of 20: 392 usec per loop

Results of bench_alloc.py:

Common platform:
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Python unicode implementation: PEP 393
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
Timer: time.perf_counter
SCM: hg revision=462470859e57+ branch=default date=2014-04-26 19:01 -0400
Platform: Linux-3.13.8-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
Bits: int=32, long=64, long long=64, size_t=64, void*=64

Platform of campaign orig:
Timer precision: 42 ns
Date: 2014-04-27 12:27:26
Python version: 3.5.0a0 (default:462470859e57, Apr 27 2014, 11:52:55) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

Platform of campaign calloc:
Timer precision: 45 ns
Date: 2014-04-27 12:29:10
Python version: 3.5.0a0 (default:462470859e57+, Apr 27 2014, 12:04:57) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

-----------------------------------+--------------+---------------
Tests                              | orig         | calloc
-----------------------------------+--------------+---------------
object()                           | 61 ns (*)    | 62 ns
b'A' * 10                          | 55 ns (*)    | 51 ns (-7%)
b'A' * 10**3                       | 99 ns (*)    | 94 ns
b'A' * 10**6                       | 37.5 us (*)  | 36.6 us
'A' * 10                           | 62 ns (*)    | 58 ns (-7%)
'A' * 10**3                        | 107 ns (*)   | 104 ns
'A' * 10**6                        | 37 us (*)    | 36.6 us
'A' * 10**8                        | 16.2 ms (*)  | 16.4 ms
decode 10 null bytes from ASCII    | 253 ns (*)   | 248 ns
decode 10**3 null bytes from ASCII | 359 ns (*)   | 357 ns
decode 10**6 null bytes from ASCII | 78.8 us (*)  | 78.7 us
decode 10**8 null bytes from ASCII | 26.2 ms (*)  | 25.9 ms
(None,) * 10**0                    | 30 ns (*)    | 30 ns
(None,) * 10**1                    | 78 ns (*)    | 77 ns
(None,) * 10**2                    | 427 ns (*)   | 460 ns (+8%)
(None,) * 10**3                    | 3.5 us (*)   | 3.7 us (+6%)
(None,) * 10**4                    | 34.7 us (*)  | 37.2 us (+7%)
(None,) * 10**5                    | 357 us (*)   | 390 us (+9%)
(None,) * 10**6                    | 3.86 ms (*)  | 4.43 ms (+15%)
(None,) * 10**7                    | 50.4 ms (*)  | 50.3 ms
(None,) * 10**8                    | 505 ms (*)   | 504 ms
([None] * 10)[1:-1]                | 121 ns (*)   | 120 ns
([None] * 10**3)[1:-1]             | 3.57 us (*)  | 3.57 us
([None] * 10**6)[1:-1]             | 4.61 ms (*)  | 4.59 ms
([None] * 10**8)[1:-1]             | 585 ms (*)   | 582 ms
-----------------------------------+--------------+---------------
Total                              | 1.19 sec (*) | 1.19 sec
-----------------------------------+--------------+---------------

Added file: http://bugs.python.org/file35052/bench_alloc.py
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

> Regarding the *Calloc functions: how about we provide a sane API instead of reproducing the cumbersome C API?

Isn't the point of reproducing the C API to allow quickly switching from calloc() to PyObject_Calloc()? (besides, it seems the OpenBSD guys like the two-argument form :-))
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment:

Just to add another data point, I don't find the calloc() API cumbersome.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

It looks like calloc-3.patch is wrong: it modifies _PyObject_GC_Malloc() to fill the newly allocated buffer with zeros, but _PyObject_GC_Malloc() is not only called by PyType_GenericAlloc(): it is also used by _PyObject_GC_New() and _PyObject_GC_NewVar(). The patch is maybe a little bit slower because it writes zeros twice.

calloc.patch adds PyObject* _PyObject_GC_Calloc(size_t); and doesn't have this issue.
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment:

Actually, I think we have to match the C-API: For instance, in Modules/_decimal/_decimal.c:5527 the libmpdec allocators are set to the Python allocators. So I'd need to do:

    mpd_callocfunc = PyMem_Calloc;

I suppose that's a common use case.
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> It looks like calloc-3.patch is wrong: it modifies _PyObject_GC_Malloc() to fill the newly allocated buffer with zeros, but _PyObject_GC_Malloc() is not only called by PyType_GenericAlloc(): it is also used by _PyObject_GC_New() and _PyObject_GC_NewVar(). The patch is maybe a little bit slower because it writes zeros twice.

Exactly (sorry, I thought you'd already seen that, otherwise I could have told you!)

> Actually, I think we have to match the C-API: For instance, in Modules/_decimal/_decimal.c:5527 the libmpdec allocators are set to the Python allocators.

Hmm, ok then, I didn't know we were plugging our allocators for external libraries: that's indeed a very good reason to keep the same prototype. But I still find this API cumbersome: calloc is exactly like malloc except for the zeroing, so the prototype could be simpler (a quick look at Victor's patch shows a lot of calloc(1, n), which is a sign something's wrong). Maybe it's just me ;-)

Otherwise, a random thought: by changing PyType_GenericAlloc() from malloc() + memset(0) to calloc(), there could be a subtle side effect: if a given type relies on the 0-setting (which is documented), and doesn't do any other work on the allocated area behind the scenes (think about a mmap-like object), we could lose our capacity to detect MemoryError, and run into segfaults instead. Because if a code creates many such objects which basically just do calloc(), on operating systems with memory overcommitting (such as Linux), the calloc() allocations will pretty much always succeed, but will segfault when the page is first written to in case of low memory.

I don't think such use cases should be common: I would expect most types to use tp_alloc(type, 0) and then use an internal additional pointer for the allocations it needs, or immediately write to the allocated memory area right after allocation, but that's something to keep in mind.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> And http://www.eglibc.org/cgi-bin/viewvc.cgi/trunk/libc/malloc/malloc.c?view=markup to check that calloc(nelem, elsize) is implemented as calloc(nelem * elsize)

__libc_calloc() starts with a check on integer overflow.
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> __libc_calloc() starts with a check on integer overflow.

Yes, see my previous message: AFAICT, the two arguments are purely historical (it was used when malloc() didn't guarantee suitable alignment, and has the advantage of performing overflow check when doing the multiplication, but in our code we always check for it anyway).
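Editorial note: the overflow check mentioned in the last two messages can be illustrated with a small model. This is only a sketch in Python, not glibc's code; the helper name `checked_calloc_size` and the 64-bit `SIZE_MAX` value are assumptions made for illustration. It shows the kind of test a two-argument calloc(nelem, elsize) can perform before multiplying, which a caller of a one-argument form would have to do itself.

```python
# Assumption: size_t is 64 bits wide.
SIZE_MAX = 2**64 - 1


def checked_calloc_size(nelem, elsize, size_max=SIZE_MAX):
    """Return nelem * elsize, or None if the product does not fit in size_t.

    Hypothetical helper: it models the overflow check only, it allocates nothing.
    """
    if nelem < 0 or elsize < 0:
        raise ValueError("sizes must be non-negative")
    # Divide instead of multiplying, as C code must do to avoid wrapping around.
    if nelem != 0 and elsize > size_max // nelem:
        return None
    return nelem * elsize


if __name__ == "__main__":
    print(checked_calloc_size(10, 8))
    print(checked_calloc_size(2**32, 2**32))
```

The division-based test is the usual way to write this in C, where the multiplication itself would silently wrap.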
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

list: items are allocated in a second memory block. PyList_New() uses memset(0) to set all items to NULL.

tuple: header and items are stored in a single structure (PyTupleObject), in a single memory block. PyTuple_New() fills the items with NULL (so it writes null bytes again). Something can be optimized here.

dict: header, keys and values are stored in 3 different memory blocks. It may be interesting to use calloc() to allocate keys and values. Initialization of keys and values to NULL uses a dummy loop. I expect that memset(0) would be faster.

Anyway, I expect that all items of builtin containers (tuple, list, dict, etc.) are set to non-NULL values. So the lazy initialization to zeros may be useless for them. It means that benchmarking builtin containers should not show any speedup. Something else (numpy?) should be used to see an interesting speedup.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> Because if a code creates many such objects which basically just do calloc(), on operating systems with memory overcommitting (such as Linux), the calloc() allocations will pretty much always succeed, but will segfault when the page is first written to in case of low memory.

Overcommit leads to segmentation fault when there is no more memory, but I don't see how calloc() is worse than malloc()+memset(0). It will crash in both cases, no?

In my experience (embedded device with low memory), programs crash because they don't check the result of malloc() (which returns NULL on allocation failure), not because of overcommit.
[issue21233] Add *Calloc functions to CPython memory allocation API
Nathaniel Smith added the comment:

@Charles-François: I think your worries about calloc and overcommit are unjustified. First, calloc and malloc+memset actually behave the same way here -- with a large allocation and overcommit enabled, malloc and calloc will both go ahead and return the large allocation, and then the actual out-of-memory (OOM) event won't occur until the memory is accessed. In the malloc+memset case this access will occur immediately after the malloc, during the memset -- but this is still too late for us to detect the malloc failure.

Second, OOM does not cause segfaults on any system I know. On Linux it wakes up the OOM killer, which shoots some random (possibly guilty) process in the head. The actual program which triggered the OOM is quite likely to escape unscathed.

In practice, the *only* cases where you can get a MemoryError on modern systems are (a) if the user has turned overcommit off, (b) you're on a tiny embedded system that doesn't have overcommit, (c) if you run out of virtual address space. None of these cases are affected by the differences between malloc and calloc.

Regarding the calloc API: it's a wart, but it seems like a pretty unavoidable wart at this point, and the API compatibility argument is strong. I think we should just keep the two argument form and live with it...
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> @Charles-François: I think your worries about calloc and overcommit are unjustified. First, calloc and malloc+memset actually behave the same way here -- with a large allocation and overcommit enabled, malloc and calloc will both go ahead and return the large allocation, and then the actual out-of-memory (OOM) event won't occur until the memory is accessed. In the malloc+memset case this access will occur immediately after the malloc, during the memset -- but this is still too late for us to detect the malloc failure.

Not really: what you describe only holds for a single object. But if you allocate let's say 1000 such objects at once:

- in the malloc + memset case, the committed pages are progressively accessed (i.e. the pages for object N are accessed before the memory is allocated for object N+1), so they will be counted not only as committed, but also as active (for example the RSS will increase gradually): so at some point, even though by default the Linux VM subsystem is really lenient toward overcommitting, you'll likely have malloc/mmap return NULL because of this
- in the calloc() case, all the memory is first committed, but not touched: the kernel will likely happily overcommit all of this. Only when you start progressively accessing the pages will the OOM kick in.

> Second, OOM does not cause segfaults on any system I know. On Linux it wakes up the OOM killer, which shoots some random (possibly guilty) process in the head. The actual program which triggered the OOM is quite likely to escape unscathed.

Ah, did I say segfault? Sorry, I of course meant that the process will get nuked by the OOM killer.

> In practice, the *only* cases where you can get a MemoryError on modern systems are (a) if the user has turned overcommit off, (b) you're on a tiny embedded system that doesn't have overcommit, (c) if you run out of virtual address space. None of these cases are affected by the differences between malloc and calloc.

That's a common misconception: provided that the memory allocated is accessed progressively (see above point), you'll often get ENOMEM, even with overcommitting:

$ /sbin/sysctl -a | grep overcommit
vm.nr_overcommit_hugepages = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50

$ cat /tmp/test.py
l = []

with open('/proc/self/status') as f:
    try:
        for i in range(5000):
            l.append(i)
    except MemoryError:
        for line in f:
            if 'VmPeak' in line:
                print(line)
        raise

$ python /tmp/test.py
VmPeak: 720460 kB

Traceback (most recent call last):
  File "/tmp/test.py", line 7, in <module>
    l.append(i)
MemoryError

I have a 32-bit machine, but the process definitely has more than 720MB of address space ;-)

If your statement were true, this would mean that it's almost impossible to get ENOMEM with overcommitting on a 64-bit machine, which is - fortunately - not true. Just try python -c "[i for i in range(large value)]" on a 64-bit machine, I'll bet you'll get a MemoryError (ENOMEM).
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

> Just try python -c "[i for i in range(large value)]" on a 64-bit machine, I'll bet you'll get a MemoryError (ENOMEM).

Hmm, I get an OOM kill here.
[issue21233] Add *Calloc functions to CPython memory allocation API
Nathaniel Smith added the comment:

On my laptop (x86-64, Linux 3.13, 12 GB RAM):

$ python3 -c "[i for i in range(9)]"
zsh: killed python3 -c "[i for i in range(9)]"
$ dmesg | tail -n 2
[404714.401901] Out of memory: Kill process 10752 (python3) score 687 or sacrifice child
[404714.401903] Killed process 10752 (python3) total-vm:17061508kB, anon-rss:10559004kB, file-rss:52kB

And your test.py produces the same result.

Are you sure you don't have a ulimit set on address space?
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> And your test.py produces the same result. Are you sure you don't have a ulimit set on address space?

Yep, I'm sure:

$ ulimit -v
unlimited

It's probably due to the exponential over-allocation used by the array (to guarantee amortized constant cost).

How about:

python -c "b = bytes('x' * large)"
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

Dammit, read:

python -c 'b"x" * (2**48)'
[issue21233] Add *Calloc functions to CPython memory allocation API
Nathaniel Smith added the comment:

Right, python3 -c 'b"x" * (2 ** 48)' does give an instant MemoryError for me. So I was wrong about it being the VM limit indeed.

The documentation on this is terrible! But, if I'm reading this right: http://lxr.free-electrons.com/source/mm/util.c#L434 the actual rules are:

overcommit mode 1: allocating a VM range always succeeds.

overcommit mode 2: (Slightly simplified) You can allocate total VM ranges up to (swap + RAM * overcommit_ratio), and overcommit_ratio is 50% by default. So that's a bit odd, but whatever. This is still entirely a limit on VM size.

overcommit mode 0 ("guess", the default): when allocating a VM range, the kernel imagines what would happen if you immediately used all those pages. If that would put you OOM, then we fall back to mode 2 rules. If that would *not* put you OOM, then the allocation unconditionally succeeds.

So yeah, touching pages can affect whether a later malloc returns ENOMEM.

I'm not sure any of this actually matters in the Python case though :-). There's still no reason to go touching pages pre-emptively just in case we might write to them later -- all that does is increase the interpreter's memory footprint, which can't help anything. If people are worried about overcommit, then they should turn off overcommit, not try and disable it on a piece-by-piece basis by trying to get individual programs to touch memory before they need it.
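Editorial note: to check which of the three modes described above a given Linux machine uses, something like the following sketch may help. The helper names are hypothetical, the one-line descriptions paraphrase the message above rather than the kernel documentation, and the /proc path only exists on Linux, so the reader function returns None elsewhere.

```python
# Short descriptions paraphrased from the discussion above.
OVERCOMMIT_MODES = {
    0: "heuristic: the kernel guesses whether the request is reasonable (the default)",
    1: "always: allocating a VM range always succeeds",
    2: "never: total allocations are capped at swap + RAM * overcommit_ratio / 100",
}


def describe_overcommit_mode(mode):
    """Return a short description of a vm.overcommit_memory value."""
    return OVERCOMMIT_MODES.get(mode, "unknown mode")


def current_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the current mode as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = current_overcommit_mode()
    print(mode, describe_overcommit_mode(mode))
```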
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

Alright, it bothered me so I wrote a small C testcase (attached), which calls malloc in a loop, and can call memset upon the allocated block right after allocation:

$ gcc -o /tmp/test /tmp/test.c; /tmp/test
malloc() returned NULL after 3050MB
$ gcc -DDO_MEMSET -o /tmp/test /tmp/test.c; /tmp/test
malloc() returned NULL after 2130MB

Without memset, the kernel happily allocates until we reach the 3GB user address space limit. With memset, it bails out way before.

I don't know what this'll give on 64-bit, but I assume one should get comparable result.

I would guess that the reason why the Python list allocation fails is because of the exponential allocation scheme: since memory is allocated in large chunks before being used, the kernel happily overallocates. With a more progressive allocation+usage, it should return ENOMEM at some point.

Anyway, that's probably off-topic!

Added file: http://bugs.python.org/file35059/test.c

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define BLOCK_SIZE (10*1024*1024)

    int main(int argc, char *argv[])
    {
        unsigned long size = 0;
        char *p;

        /* Allocate 10 MB blocks until malloc() fails; blocks are never freed. */
        while ((p = malloc(BLOCK_SIZE)) != NULL) {
    #ifdef DO_MEMSET
            /* Touch every page so the memory is really used, not just reserved. */
            memset(p, 0, BLOCK_SIZE);
    #endif
            size += BLOCK_SIZE;
        }

        /* size is an unsigned long, so the format must be %lu rather than %u. */
        printf("malloc() returned NULL after %luMB\n", (size/(1024*1024)));

        exit(EXIT_SUCCESS);
    }
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> So yeah, touching pages can affect whether a later malloc returns ENOMEM. I'm not sure any of this actually matters in the Python case though :-). There's still no reason to go touching pages pre-emptively just in case we might write to them later -- all that does is increase the interpreter's memory footprint, which can't help anything. If people are worried about overcommit, then they should turn off overcommit, not try and disable it on a piece-by-piece basis by trying to get individual programs to touch memory before they need it.

Absolutely: that's why I'm really in favor of exposing calloc, this could definitely help many workloads.

Victor, did you run any non-trivial benchmark, like pybench & Co? As I said, I'm not expecting any improvement, I just want to make sure there's no hidden regression somewhere (like the one for GC-tracked objects above).
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

> $ gcc -o /tmp/test /tmp/test.c; /tmp/test
> malloc() returned NULL after 3050MB
> $ gcc -DDO_MEMSET -o /tmp/test /tmp/test.c; /tmp/test
> malloc() returned NULL after 2130MB
>
> Without memset, the kernel happily allocates until we reach the 3GB user address space limit. With memset, it bails out way before. I don't know what this'll give on 64-bit, but I assume one should get comparable result.

Both OOM here (3.11.0-20-generic, 64-bit, Ubuntu).
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment:

This is probably offtopic, but I think people who want reliable MemoryErrors can use limits, e.g. via djb's softlimit (daemontools):

$ softlimit -m 1 ./python
Python 3.5.0a0 (default:462470859e57+, Apr 27 2014, 19:34:06)
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> [i for i in range(999)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
MemoryError
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> Both OOM here (3.11.0-20-generic, 64-bit, Ubuntu).

Hm... What's /proc/sys/vm/overcommit_memory ? If it's set to 0, then the kernel will always overcommit.

If you set it to 2, normally you'd definitely get ENOMEM (which is IMO much nicer than getting nuked by the OOM killer, especially because, like in real life, there's often collateral damage ;-)
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

> Hm... What's /proc/sys/vm/overcommit_memory ? If it's set to 0, then the kernel will always overcommit.

I meant 1 (damn, I need sleep).
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

> Hm... What's /proc/sys/vm/overcommit_memory ? If it's set to 0, then the kernel will always overcommit.

Ah, indeed.

> If you set it to 2, normally you'd definitely get ENOMEM

You're right, but with weird results:

$ gcc -o /tmp/test test.c; /tmp/test
malloc() returned NULL after 600MB
$ gcc -DDO_MEMSET -o /tmp/test test.c; /tmp/test
malloc() returned NULL after 600MB

(I'm supposed to have gigabytes free?!)
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

>> Hm... What's /proc/sys/vm/overcommit_memory ? If it's set to 0, then the kernel will always overcommit.
>
> Ah, indeed.

See above, I mistyped: 0 is the default (which is already quite optimistic), 1 is always.

>> If you set it to 2, normally you'd definitely get ENOMEM
>
> You're right, but with weird results:
>
> $ gcc -o /tmp/test test.c; /tmp/test
> malloc() returned NULL after 600MB
> $ gcc -DDO_MEMSET -o /tmp/test test.c; /tmp/test
> malloc() returned NULL after 600MB
>
> (I'm supposed to have gigabytes free?!)

The formula is RAM * vm.overcommit_ratio / 100 + swap

So if you don't have swap, or a low overcommit_ratio, it could explain why it returns so early. Or maybe you have some processes with a lot of mapped-yet-unused memory (chromium is one of those for example).

Anyway, it's really a mess!
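Editorial note: the formula quoted above is easy to evaluate by hand, but a tiny sketch makes the effect of a missing swap partition obvious. The function name and the machine figures below are hypothetical, and the calculation is the simplified formula from the message, so it ignores any further adjustments the kernel may make.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50):
    """Approximate commit limit in kB with vm.overcommit_memory = 2.

    Simplified formula from the thread: RAM * overcommit_ratio / 100 + swap.
    """
    return ram_kb * overcommit_ratio // 100 + swap_kb


if __name__ == "__main__":
    # Hypothetical machine: 1200 MB of RAM, no swap, default ratio of 50.
    print(commit_limit_kb(1200 * 1024, 0) // 1024, "MB")  # prints: 600 MB
```

With no swap and the default ratio, only half of the RAM can be committed, which is one way an allocation loop could stop far earlier than the amount of free memory suggests.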
[issue21233] Add *Calloc functions to CPython memory allocation API
Changes by STINNER Victor victor.stin...@gmail.com:

Added file: http://bugs.python.org/file35064/use_calloc.patch
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

I split my patch into two parts:

- calloc-4.patch: add new Calloc functions including _PyObject_GC_Calloc()
- use_calloc.patch: patch types (bytes, dict, list, set, tuple, etc.) and various modules to use calloc

I reverted my changes on _PyObject_GC_Malloc() and added _PyObject_GC_Calloc(), performance regressions are gone. Creating a large tuple is a little bit (8%) faster. But the real speedup is to build a large bytes string of null bytes:

$ ./python.orig -m timeit 'bytes(50*1024*1024)'
100 loops, best of 3: 5.7 msec per loop
$ ./python.calloc -m timeit 'bytes(50*1024*1024)'
10 loops, best of 3: 4.12 usec per loop

On Linux, no memory is allocated, even if you read the bytes content. RSS is almost unchanged.

Ok, now the real use case where it becomes faster: I implemented the same optimization for bytearray.

$ ./python.orig -m timeit 'bytearray(50*1024*1024)'
100 loops, best of 3: 6.33 msec per loop
$ ./python.calloc -m timeit 'bytearray(50*1024*1024)'
10 loops, best of 3: 4.09 usec per loop

If you overallocate a bytearray and only write a few bytes, the bytes at the end of the bytearray will not be allocated (at least on Linux).

Result of bench_alloc.py comparing original Python to patched Python (calloc-4.patch + use_calloc.patch).

Common platform:
SCM: hg revision=4b97092aa4bd+ tag=tip branch=default date=2014-04-27 18:02 +0100
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
Python unicode implementation: PEP 393
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
Bits: int=32, long=64, long long=64, size_t=64, void*=64
Timer: time.perf_counter
CPU model: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Platform: Linux-3.13.9-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug

Platform of campaign orig:
Timer precision: 42 ns
Date: 2014-04-28 00:27:19
Python version: 3.5.0a0 (default:4b97092aa4bd, Apr 28 2014, 00:24:03) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

Platform of campaign calloc:
Timer precision: 54 ns
Date: 2014-04-28 00:28:35
Python version: 3.5.0a0 (default:4b97092aa4bd+, Apr 28 2014, 00:25:56) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

-----------------------------------+--------------+---------------
Tests                              | orig         | calloc
-----------------------------------+--------------+---------------
object()                           | 61 ns (*)    | 71 ns (+16%)
b'A' * 10                          | 54 ns (*)    | 52 ns
b'A' * 10**3                       | 124 ns (*)   | 110 ns (-12%)
b'A' * 10**6                       | 38.4 us (*)  | 38.5 us
'A' * 10                           | 59 ns (*)    | 62 ns
'A' * 10**3                        | 132 ns (*)   | 107 ns (-19%)
'A' * 10**6                        | 38.5 us (*)  | 38.5 us
'A' * 10**8                        | 10.3 ms (*)  | 10.6 ms
decode 10 null bytes from ASCII    | 264 ns (*)   | 263 ns
decode 10**3 null bytes from ASCII | 403 ns (*)   | 379 ns (-6%)
decode 10**6 null bytes from ASCII | 80.5 us (*)  | 80.5 us
decode 10**8 null bytes from ASCII | 17.7 ms (*)  | 17.3 ms
(None,) * 10**0                    | 29 ns (*)    | 28 ns
(None,) * 10**1                    | 75 ns (*)    | 76 ns
(None,) * 10**2                    | 461 ns (*)   | 460 ns
(None,) * 10**3                    | 3.6 us (*)   | 3.57 us
(None,) * 10**4                    | 35.7 us (*)  | 35.7 us
(None,) * 10**5                    | 364 us (*)   | 365 us
(None,) * 10**6                    | 4.12 ms (*)  | 4.11 ms
(None,) * 10**7                    | 43.5 ms (*)  | 40.3 ms (-7%)
(None,) * 10**8                    | 433 ms (*)   | 400 ms (-8%)
([None] * 10)[1:-1]                | 121 ns (*)   | 134 ns (+11%)
([None] * 10**3)[1:-1]             | 3.62 us (*)  | 3.61 us
([None] * 10**6)[1:-1]             | 4.24 ms (*)  | 4.22 ms
([None] * 10**8)[1:-1]             | 440 ms (*)   | 402 ms (-9%)
-----------------------------------+--------------+---------------
Total                              | 954 ms (*)   | 880 ms (-8%)
-----------------------------------+--------------+---------------

Added file: http://bugs.python.org/file35063/calloc-4.patch
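Editorial note: the bytes(n) and bytearray(n) timings in the message above were taken from the command line. A rough equivalent using the timeit module from a script might look like the sketch below. The helper name `best_seconds` is hypothetical, and the numbers printed depend entirely on the interpreter build and operating system, so no expected output is given.

```python
import timeit


def best_seconds(stmt, repeat=3, number=10):
    """Return the best average time per execution of stmt, in seconds."""
    return min(timeit.repeat(stmt, repeat=repeat, number=number)) / number


if __name__ == "__main__":
    size = 50 * 1024 * 1024  # same 50 MiB size as in the message above
    for stmt in ("bytes(%d)" % size, "bytearray(%d)" % size):
        print("%-24s %.3g usec" % (stmt, best_seconds(stmt) * 1e6))
```

Running the same script under an unpatched and a patched interpreter gives a comparison similar in spirit to the python.orig and python.calloc runs shown above.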
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

bench_alloc2.py: updated benchmark script. I added bytes(n) and bytearray(n) tests and removed the test decoding from ASCII.

Common platform:
Timer: time.perf_counter
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
Platform: Linux-3.13.9-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
SCM: hg revision=4b97092aa4bd+ tag=tip branch=default date=2014-04-27 18:02 +0100
Python unicode implementation: PEP 393
Bits: int=32, long=64, long long=64, size_t=64, void*=64
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
CPU model: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz

Platform of campaign orig:
Date: 2014-04-28 01:11:49
Timer precision: 39 ns
Python version: 3.5.0a0 (default:4b97092aa4bd, Apr 28 2014, 01:02:01) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

Platform of campaign calloc:
Date: 2014-04-28 01:12:29
Timer precision: 44 ns
Python version: 3.5.0a0 (default:4b97092aa4bd+, Apr 28 2014, 01:06:54) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

-----------------------+-------------+---------------
Tests                  | orig        | calloc
-----------------------+-------------+---------------
object()               | 62 ns (*)   | 72 ns (+16%)
b'A' * 10              | 53 ns (*)   | 52 ns
b'A' * 10**3           | 96 ns (*)   | 110 ns (+15%)
b'A' * 10**6           | 38.5 us (*) | 38.6 us
'A' * 10               | 59 ns (*)   | 61 ns
'A' * 10**3            | 105 ns (*)  | 108 ns
'A' * 10**6            | 38.6 us (*) | 38.6 us
'A' * 10**8            | 10.3 ms (*) | 10.4 ms
(None,) * 10**0        | 29 ns (*)   | 29 ns
(None,) * 10**1        | 75 ns (*)   | 76 ns
(None,) * 10**2        | 432 ns (*)  | 461 ns (+7%)
(None,) * 10**3        | 3.58 us (*) | 3.6 us
(None,) * 10**4        | 35.8 us (*) | 35.7 us
(None,) * 10**5        | 365 us (*)  | 365 us
(None,) * 10**6        | 4.1 ms (*)  | 4.13 ms
(None,) * 10**7        | 43.6 ms (*) | 40.3 ms (-8%)
(None,) * 10**8        | 433 ms (*)  | 401 ms (-7%)
([None] * 10)[1:-1]    | 122 ns (*)  | 134 ns (+10%)
([None] * 10**3)[1:-1] | 3.6 us (*)  | 3.62 us
([None] * 10**6)[1:-1] | 4.22 ms (*) | 4.2 ms
([None] * 10**8)[1:-1] | 441 ms (*)  | 402 ms (-9%)
bytes(10)              | 137 ns (*)  | 136 ns
bytes(10**3)           | 181 ns (*)  | 191 ns (+5%)
bytes(10**6)           | 38.7 us (*) | 39.2 us
bytes(10**8)           | 10.3 ms (*) | 4.36 us (-100%)
bytearray(10)          | 138 ns (*)  | 153 ns (+11%)
bytearray(10**3)       | 184 ns (*)  | 211 ns (+14%)
bytearray(10**6)       | 38.7 us (*) | 39.3 us
bytearray(10**8)       | 10.3 ms (*) | 4.32 us (-100%)
-----------------------+-------------+---------------
Total                  | 957 ms (*)  | 862 ms (-10%)
-----------------------+-------------+---------------

Added file: http://bugs.python.org/file35065/bench_alloc2.py
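The bytes(n) rows above compare a zero-filled allocation against b'A' * n, which has to write every byte. A minimal sketch of that kind of measurement follows, using the standard timeit module. It is not the bench_alloc2.py script attached to the issue; the helper name time_alloc and the reduced sizes are assumptions made for illustration, and the absolute timings depend on the platform and Python build.

```python
import timeit


def time_alloc(stmt, number=1000):
    """Return the best per-call time in seconds for evaluating stmt.

    Hypothetical helper for this sketch, not part of bench_alloc2.py.
    """
    # repeat() returns the total time for `number` calls, once per repetition.
    totals = timeit.repeat(stmt, repeat=3, number=number)
    return min(totals) / number


# Sizes are kept small so the sketch runs quickly; the thread goes up to 10**8.
for n in (10, 10**3, 10**6):
    zeroed = time_alloc("bytes(%d)" % n, number=200)
    filled = time_alloc("b'A' * %d" % n, number=200)
    print("n=%-8d bytes(n): %.3g s   b'A' * n: %.3g s" % (n, zeroed, filled))
```

Taking the minimum over several repetitions reduces noise from other processes, which matters for the nanosecond-scale rows.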
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

> Common platform:
> Timer: time.perf_counter
> Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
> Platform: Linux-3.13.9-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
                                                               ^
Are you sure this is a good platform for performance reports? :)
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> Are you sure this is a good platform for performance reports? :)

Don't hesitate to rerun my benchmark on more different platforms?
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> Don't hesitate to rerun my benchmark on more different platforms?

Oops, I wanted to write ";-)" not "?".
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

> Ok, now the real use case where it becomes faster: I implemented the same optimization for bytearray.

The real use case I envision is with huge powers of two. If I write:

    x = 2 ** 100

then all of x's bytes except the highest one will be zeros. If we map those to /dev/zero, it will be a massive saving for programs using huge powers of two.
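The claim about the bytes of 2 ** 100 can be checked directly. The sketch below looks at the value's byte representation via int.to_bytes; this is an assumption-laden illustration only, since CPython stores integers internally as an array of digits rather than plain bytes, so it says nothing about the actual layout in longobject.c.

```python
x = 2 ** 100

# 2**100 needs 101 bits, which rounds up to 13 bytes.
nbytes = (x.bit_length() + 7) // 8

# Little-endian: the most significant byte comes last.
raw = x.to_bytes(nbytes, "little")

# Collect the bytes that are not zero.
nonzero = [b for b in raw if b != 0]
print(nbytes, nonzero)
```

Only the last (highest) byte is non-zero; the twelve bytes before it are all zero.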
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> The real use case I envision is with huge powers of two.

I'm not sure that it's a common use case, but it can be nice to optimize this case if it doesn't make longobject.c more complex. It looks like calloc() becomes interesting for objects larger than 1 MB.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

I read again some remarks about alignment: it was suggested to provide allocators returning an address aligned to a requested alignment. This topic was already discussed in #18835.

If Python doesn't provide such memory allocators, it was suggested to provide a trace function which can be called on the result of a successful allocation to trace it (and a similar function for free). But this is very different from the design of PEP 445 (new malloc API). Basically, it would require rewriting PEP 445.
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

Do you have benchmarks? (I'm not looking for an improvement, just no regression.)
[issue21233] Add *Calloc functions to CPython memory allocation API
Julian Taylor added the comment:

Won't replacing _PyObject_GC_Malloc with a calloc cause Var objects (PyObject_NewVar) to be completely zeroed, which I think they weren't before? Some numeric programs stuff a lot of data into var objects and could care about Python suddenly setting them to zero when they don't need it. An example would be tinyarray.

nosy: +jtaylor
[issue21233] Add *Calloc functions to CPython memory allocation API
Josh Rosenberg added the comment:

Julian: No. See the diff: http://bugs.python.org/review/21233/diff/11644/Objects/typeobject.c

The original GC_Malloc was explicitly memset-ing after confirming that it received a non-NULL pointer from the underlying malloc call; that memset is removed in favor of using the calloc call.
[issue21233] Add *Calloc functions to CPython memory allocation API
Josh Rosenberg added the comment:

Well, to be more specific, PyType_GenericAlloc was originally calling one of two methods that didn't zero the memory (one of which was GC_Malloc), then memset-ing. Just realized you're talking about something else; not sure if you're correct about this now, but I have to get to work, will check later if no one else does.
[issue21233] Add *Calloc functions to CPython memory allocation API
Julian Taylor added the comment:

I just tested it: PyObject_NewVar seems to use RawMalloc, not the GC malloc, so it's probably fine.
[issue21233] Add *Calloc functions to CPython memory allocation API
Charles-François Natali added the comment:

>> So what is the point of _PyObject_GC_Calloc ?
>
> It calls calloc(size) instead of malloc(size), calloc() which can be faster than malloc()+memset(), see:
> https://mail.python.org/pipermail/python-dev/2014-April/133985.html

It will only make a difference if the allocated region is large enough to be allocated by mmap (so not for 90% of objects).
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

>>> So what is the point of _PyObject_GC_Calloc ?
>>
>> It calls calloc(size) instead of malloc(size)
>
> No, the question is why you didn't simply change _PyObject_GC_Malloc (which is a private function).

Oh ok, I didn't understand. I don't like changing the behaviour of functions, but it's maybe fine if the function is private.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

2014-04-16 3:18 GMT-04:00 Charles-François Natali rep...@bugs.python.org:
>> It calls calloc(size) instead of malloc(size), calloc() which can be faster than malloc()+memset(), see:
>> https://mail.python.org/pipermail/python-dev/2014-April/133985.html
>
> It will only make a difference if the allocated region is large enough to be allocated by mmap (so not for 90% of objects).

Even if there are only 10% of cases where it may be faster, I think that it's interesting to use calloc() to allocate Python objects. You may create large Python objects ;-)

I didn't check which objects use (indirectly) _PyObject_GC_Calloc().
[issue21233] Add *Calloc functions to CPython memory allocation API
Stefan Krah added the comment:

I left a Rietveld comment, which probably did not get mailed.
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

On Wed., 2014-04-16 at 08:06 +, STINNER Victor wrote:
> I didn't check which objects use (indirectly) _PyObject_GC_Calloc().

I've checked: lists, tuples, dicts and sets at least seem to use it. Obviously, objects which are not tracked by the GC (such as str and bytes) won't use it.
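The split between GC-tracked containers and untracked objects such as str and bytes can be observed from Python with gc.is_tracked(). The sketch below is only a rough proxy for the point above: it reports whether the collector currently tracks an object, not which C allocation function created it, and the sample objects are chosen for illustration.

```python
import gc

# One container type and the two atomic types named in the message above.
samples = {
    "list": [None] * 10,
    "str": "A" * 10,
    "bytes": b"A" * 10,
}

# Map each type name to whether the garbage collector tracks the sample object.
tracked = {name: gc.is_tracked(obj) for name, obj in samples.items()}
print(tracked)
```

Tuples and dicts are left out on purpose: CPython may stop tracking some of them as an optimization, so their result depends on their contents and on the interpreter version.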
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

Patch version 3: remove _PyObject_GC_Calloc() and modify _PyObject_GC_Malloc() instead, so that it uses calloc() rather than malloc()+memset(0).

Added file: http://bugs.python.org/file34924/calloc-3.patch
[issue21233] Add *Calloc functions to CPython memory allocation API
New submission from Nathaniel Smith:

Numpy would like to switch to using the CPython allocator interface in order to take advantage of the new tracemalloc infrastructure in 3.4. But, numpy relies on the availability of calloc(), and the CPython allocator API does not expose calloc().

https://docs.python.org/3.5/c-api/memory.html#c.PyMemAllocator

So, we should add *Calloc variants. This met general approval on python-dev. Thread here:

https://mail.python.org/pipermail/python-dev/2014-April/133985.html

This would involve adding a new .calloc field to the PyMemAllocator struct, exposed through new API functions PyMem_RawCalloc, PyMem_Calloc, PyObject_Calloc.

[It's not clear that all 3 would really be used, but since we have only one PyMemAllocator struct that they all share, it'd be hard to add support to only one or two of these domains and not the rest. And the higher-level calloc variants might well be used. Numpy array buffers are often small (e.g., holding only a single value), and these small buffers benefit from small-alloc optimizations; meanwhile, large buffers benefit from calloc optimizations. So it might be optimal to use a single allocator that has both.]

We might also have to rename the PyMemAllocator struct to ensure that compiling old code with the new headers doesn't silently leave garbage in the .calloc field and lead to crashes.

components: Interpreter Core
messages: 216281
nosy: njs
priority: normal
severity: normal
status: open
title: Add *Calloc functions to CPython memory allocation API
type: enhancement
versions: Python 3.5
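The motivation above is that memory obtained through the CPython allocator interface shows up in tracemalloc. A minimal sketch of what that looks like from Python follows, using a zero-filled bytearray as a stand-in for an array buffer. Whether the interpreter serves this request through a calloc-style function depends on the Python version, so that part is an assumption; the sketch only shows that the allocation is traced.

```python
import tracemalloc

tracemalloc.start()

# Traced size before the allocation (second value is the peak, unused here).
before, _ = tracemalloc.get_traced_memory()

# Zero-filled buffer of one million bytes, allocated by the interpreter.
buf = bytearray(10**6)

after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Growth in traced memory caused by the buffer (plus small bookkeeping objects).
traced = after - before
print("traced growth: %d bytes for a %d-byte buffer" % (traced, len(buf)))
```

An extension that allocates its buffers with the C library's calloc() directly would not appear in this figure, which is the gap the issue sets out to close.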
[issue21233] Add *Calloc functions to CPython memory allocation API
Changes by Stefan Krah stefan-use...@bytereef.org:

nosy: +skrah
[issue21233] Add *Calloc functions to CPython memory allocation API
Changes by Éric Araujo mer...@netwok.org:

nosy: +haypo
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

Here is a first patch adding the following functions:

    void* PyMem_RawCalloc(size_t n);
    void* PyMem_Calloc(size_t n);
    void* PyObject_Calloc(size_t n);
    PyObject* _PyObject_GC_Calloc(size_t);

It adds the following field after the malloc field of the PyMemAllocator structure:

    void* (*calloc) (void *ctx, size_t size);

It changes the tracemalloc module to trace calloc allocations, adds new tests and documents the new functions.

The patch also contains an important change: PyType_GenericAlloc() uses calloc instead of malloc+memset(0). It may be faster, I didn't check.

keywords: +patch
Added file: http://bugs.python.org/file34897/calloc.patch
[issue21233] Add *Calloc functions to CPython memory allocation API
Changes by STINNER Victor victor.stin...@gmail.com:

nosy: +neologix, pitrou
[issue21233] Add *Calloc functions to CPython memory allocation API
Changes by Josh Rosenberg shadowranger+pyt...@gmail.com:

nosy: +josh.rosenberg
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

So what is the point of _PyObject_GC_Calloc ?
[issue21233] Add *Calloc functions to CPython memory allocation API
Josh Rosenberg added the comment:

General comment on patch: For the flag value that toggles zeroing, perhaps use a different name, e.g. setzero, clearmem, initzero or somesuch instead of calloc? calloc already gets used to refer to both the C standard function and the function pointer structure member; it's mildly confusing to have it *also* refer to a boolean flag.
[issue21233] Add *Calloc functions to CPython memory allocation API
Josh Rosenberg added the comment:

Additional comment on clarity: Might it make sense to make the calloc structure member take both the num and size arguments that the underlying calloc takes? That is, instead of:

    void* (*calloc) (void *ctx, size_t size);

Declare it as:

    void* (*calloc) (void *ctx, size_t num, size_t size);

Beyond potentially allowing more detailed tracing info at some later point (and much like the original calloc, potentially allowing us to verify that the components do not overflow on multiply, instead of assuming every caller must multiply and check for themselves), it also seems like it's a bit more friendly to have the prototype for the structure calloc follow the same pattern as the other members for consistency (Principle of Least Surprise): a context pointer, plus the arguments expected by the equivalent C function.
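The overflow argument above can be made concrete. With two arguments, the allocator itself can check that num * size fits in size_t before allocating; with a single pre-multiplied size, every caller has to do that check. The sketch below models the check in Python rather than C. The function name checked_calloc_size is hypothetical, and SIZE_MAX assumes a 64-bit size_t as on the benchmark platform in this thread.

```python
SIZE_MAX = 2**64 - 1  # assumption: 64-bit size_t


def checked_calloc_size(nelem, elsize):
    """Return nelem * elsize, or None if the product would not fit in size_t.

    Hypothetical model of the check a two-argument calloc can perform.
    """
    # Dividing instead of multiplying avoids overflowing in the check itself,
    # which is how the test has to be written in C.
    if elsize != 0 and nelem > SIZE_MAX // elsize:
        return None  # a C implementation would return NULL here
    return nelem * elsize


print(checked_calloc_size(10, 8))       # small request: fits
print(checked_calloc_size(2**63, 4))    # product exceeds SIZE_MAX
```

In C the unchecked product would silently wrap around to a small value, so the caller would receive a buffer far smaller than requested.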
[issue21233] Add *Calloc functions to CPython memory allocation API
Josh Rosenberg added the comment:

Sorry for breaking it up, but the same comment on consistent prototypes mirroring the C standard lib calloc would apply to all the API functions as well, e.g. PyMem_RawCalloc, PyMem_Calloc, PyObject_Calloc and _PyObject_GC_Calloc, not just the structure function pointer.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

> So what is the point of _PyObject_GC_Calloc ?

It calls calloc(size) instead of malloc(size), calloc() which can be faster than malloc()+memset(), see:
https://mail.python.org/pipermail/python-dev/2014-April/133985.html

_PyObject_GC_Calloc() is used by PyType_GenericAlloc(). If I understand correctly, it is the default allocator to allocate Python objects.
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

In numpy, I found the two following functions:

    /*NUMPY_API
     * Allocates memory for array data.
     */
    void* PyDataMem_NEW(size_t size);

    /*NUMPY_API
     * Allocates zeroed memory for array data.
     */
    void* PyDataMem_NEW_ZEROED(size_t size, size_t elsize);

So it looks like it needs two size_t parameters.

Prototype of the C function calloc():

    void *calloc(size_t nmemb, size_t size);

I agree that it's better to provide the same prototype as calloc().
[issue21233] Add *Calloc functions to CPython memory allocation API
STINNER Victor added the comment:

New patch:

- replace "size_t size" with "size_t nelem, size_t elsize" in the prototype of calloc functions (the parameter names come from the POSIX standard)
- replace "int calloc" with "int zero" in helper functions

Added file: http://bugs.python.org/file34903/calloc-2.patch
[issue21233] Add *Calloc functions to CPython memory allocation API
Antoine Pitrou added the comment:

On 16/04/2014 04:40, STINNER Victor wrote:
> STINNER Victor added the comment:
>
>> So what is the point of _PyObject_GC_Calloc ?
>
> It calls calloc(size) instead of malloc(size)

No, the question is why you didn't simply change _PyObject_GC_Malloc (which is a private function).