[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
+1 for the overall idea.  Some comments:

> Also note that "fork" isn't the only operating system mechanism
> that uses copy-on-write semantics.

Could you elaborate?  mmap, maybe?

Generally speaking, fork is very difficult to use safely.  My company's
web apps load applications and libraries *after* fork, not *before*
fork, for safety.  We changed multiprocessing to use spawn by default
on macOS.  So I don't recommend that most Python users use fork.

If you know how to get the benefit of CoW without fork, I would like to
hear about it.

> Immortal Global Objects
> -----------------------
>
> The following objects will be made immortal:
>
> * singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
> * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
> * all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
>   small ints)
>
> There will likely be others we have not enumerated here.

How about interned strings?  Should the intern dict belong to the
runtime, or to each (sub)interpreter?

If the intern dict belongs to the runtime, all interned strings must be
immortal so they can be shared between subinterpreters.  If the intern
dict belongs to each interpreter, should we register immortalized
strings with all interpreters?

Regards,
--
Inada Naoki

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DQV6ECSUB2VD2EXX6CVCC45RJA6NR2ZZ/
Code of Conduct: http://python.org/psf/codeofconduct/
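[The preference for spawn over fork mentioned above can be made explicit in code rather than relying on the platform default.  A minimal sketch; the worker function and helper names are illustrative, not from the thread:]

```python
# Sketch: request the "spawn" start method explicitly instead of
# relying on the platform default (fork on Linux; spawn on macOS
# since Python 3.8, and on Windows always).
import multiprocessing as mp

def square(n):
    # Under "spawn" this runs in a freshly started interpreter:
    # nothing is inherited from the parent via fork/CoW, which is
    # the safety property discussed above.
    return n * n

def run_pool():
    # Hypothetical helper: build a spawn-based pool regardless of
    # what the global default start method is.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, [1, 2, 3])
```

[Call ``run_pool()`` from a script's ``if __name__ == "__main__":`` block; spawn re-imports the main module in the child, so module-level side effects must be guarded.]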
[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount"
Eddie and I would appreciate your feedback on this proposal to support
treating some objects as "immortal".  The fundamental characteristic of
the approach is that we would provide stronger guarantees about
immutability for some objects.

A few things to note:

* this is essentially an internal-only change: there are no user-facing
  changes (aside from affecting any 3rd party code that directly relies
  on specific refcounts)
* the naive implementation shows a 4% slowdown
* we have a number of strategies that should reduce that penalty
* without immortal objects, the implementation for a per-interpreter
  GIL will require a number of non-trivial workarounds

That last one is particularly meaningful to me, since it means we would
definitely miss the 3.11 feature freeze.  With immortal objects, 3.11
would still be in reach.

-eric

---

PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow, Eddie Elizondo
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History:
Resolution:

Abstract
========

Under this proposal, any object may be marked as immortal.  "Immortal"
means the object will never be cleaned up (at least until runtime
finalization).  Specifically, the `refcount`_ for an immortal object is
set to a sentinel value, and that refcount is never changed by
``Py_INCREF()``, ``Py_DECREF()``, or ``Py_SET_REFCNT()``.  For immortal
containers, the ``PyGC_Head`` is never changed by the garbage
collector.

Avoiding changes to the refcount is an essential part of this proposal.
For what we call "immutable" objects, it makes them truly immutable.
As described further below, this allows us to avoid performance
penalties in scenarios that would otherwise be prohibitive.

This proposal is CPython-specific and, effectively, describes internal
implementation details.
.. _refcount: https://docs.python.org/3.11/c-api/intro.html#reference-counts


Motivation
==========

Without immortal objects, all objects are effectively mutable.  That
includes "immutable" objects like ``None`` and ``str`` instances.  This
is because every object's refcount is frequently modified as the object
is used during execution.  In addition, for containers the runtime may
modify the object's ``PyGC_Head``.  This runtime-internal state
currently prevents full immutability.

This has a concrete impact on active projects in the Python community.
Below we describe several ways in which refcount modification has a
real negative effect on those projects.  None of that would happen for
objects that are truly immutable.

Reducing Cache Invalidation
---------------------------

Every modification of a refcount causes the corresponding cache line to
be invalidated.  This has a number of effects.

For one, the write must be propagated to other cache levels and to main
memory.  This has a small effect on all Python programs.  Immortal
objects would provide slight relief in that regard.

On top of that, multi-core applications pay a price.  If two threads
are interacting with the same object (e.g. ``None``) then they will end
up invalidating each other's caches with each incref and decref.  This
is true even for otherwise immutable objects like ``True``, ``0``, and
``str`` instances.  It is also true even with the GIL, though the
impact is smaller.

Avoiding Data Races
-------------------

Speaking of multi-core: we are considering making the GIL a
per-interpreter lock, which would enable true multi-core parallelism.
Among other things, the GIL currently protects against races between
multiple threads that concurrently incref or decref.  Without a shared
GIL, two running interpreters could not safely share any objects, even
otherwise immutable ones like ``None``.

This means that, to have a per-interpreter GIL, each interpreter must
have its own copy of *every* object, including the singletons and
static types.
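[The point that even "immutable" objects are constantly mutated through their refcount can be observed from Python.  A minimal sketch; it assumes a CPython build without immortal objects (3.11 or earlier), where ``None``'s refcount visibly fluctuates — under this proposal the count would stay pinned:]

```python
import sys

# On pre-immortal CPython, every new reference to None bumps its
# refcount, dirtying the cache line holding the None object.
before = sys.getrefcount(None)
extra_refs = [None] * 1000   # a thousand new references to the singleton
after = sys.getrefcount(None)

# Pre-immortal: `after` is roughly `before + 1000`.
# With a fixed (sentinel) refcount, the difference would be 0.
print(before, after)
```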
We have a viable strategy for that, but it will require a meaningful
amount of extra effort and extra complexity.

The alternative is to ensure that all shared objects are truly
immutable.  There would be no races because there would be no
modification.  This is something that the immortality proposed here
would enable for otherwise immutable objects.  With immortal objects,
support for a per-interpreter GIL becomes much simpler.

Avoiding Copy-on-Write
----------------------

For some applications it makes sense to get the application into a
desired initial state and then fork the process for each worker.  This
can result in a large performance improvement, especially in memory
usage.  Several enterprise Python users (e.g. Instagram, YouTube) have
taken advantage of this.  However, the above refcount semantics
drastically reduce the benefits and have led to some sub-optimal
workarounds.

Also note that "fork" isn't the only operating system mechanism that
uses copy-on-write semantics.

Rationale
=========

The proposed solution
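[A toy Python model of the fixed-refcount idea described in the Abstract.  The names ``IMMORTAL_REFCNT``, ``Obj``, ``incref``, and ``decref`` are hypothetical stand-ins, not CPython's actual C implementation; the point is only that incref/decref become no-ops once the count equals the sentinel, so the object's memory is never written and the object is never freed:]

```python
# Illustrative model of fixed-refcount semantics (not CPython code).
IMMORTAL_REFCNT = 2**32   # stand-in for the sentinel value

class Obj:
    def __init__(self, refcnt=1):
        self.refcnt = refcnt
        self.freed = False

def incref(o):
    if o.refcnt == IMMORTAL_REFCNT:
        return            # immortal: count never modified
    o.refcnt += 1

def decref(o):
    if o.refcnt == IMMORTAL_REFCNT:
        return            # immortal: never modified, never freed
    o.refcnt -= 1
    if o.refcnt == 0:
        o.freed = True    # stand-in for deallocation

mortal = Obj()
immortal = Obj(refcnt=IMMORTAL_REFCNT)

decref(mortal)                                      # drops to 0 -> "freed"
incref(immortal); decref(immortal); decref(immortal)  # all no-ops
```

[This is also why CoW pages stay clean for immortal objects: no write to the refcount field means no page fault and no copy.]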
[Python-Dev] Re: Custom memory profiler and PyTraceMalloc_Track
> However, I still wonder: is there any way to support the
> `PyTraceMalloc_Track` API without being dependent on `tracemalloc`?
> I know there are not many memory tracing tools, but I still feel like
> there should be a generic way of doing this.  A very vague example
> for demonstration:
>
> PyMemHooks_GetAllocators/PyMemHooks_SetAllocators/PyMemHooks_TrackAlloc,
> which are public APIs, with tracemalloc using them internally instead
> of the allocator APIs?

This can be investigated, but it is equivalent to adding tracing
support to the memory allocators and then making tracemalloc a client
of such APIs.  This could be advantageous to tool authors, but it could
also raise the maintenance cost of the allocators, as we would need to
make these APIs generic enough for different tools with different
constraints: whether the GIL must be held to call these APIs, what
information can be passed down (do you want to know the pointer and the
kind of allocator that was used, etc.).  There are even tools that want
to link the allocated blocks to Python objects directly (which is not
possible without a considerable redesign).

I would recommend opening an issue on bugs.python.org where this can be
discussed and (maybe) implemented.  Please feel free to add me to the
issue once it's open :)

Regards from rainy London,
Pablo Galindo Salgado

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NDZCL6OB2B535ZXZYVVCFMV4S6RO32GQ/
[Python-Dev] Re: Custom memory profiler and PyTraceMalloc_Track
> In other words, the allocators are concerned about memory, not
> tracing or anything else that can be done by overriding them.

OK, I now understand that `tracemalloc`'s use of the allocator APIs is
just an implementation detail.  The allocator APIs were used for
tracing, but they were not designed for that, which makes perfect
sense.

However, I still wonder: is there any way to support the
`PyTraceMalloc_Track` API without being dependent on `tracemalloc`?  I
know there are not many memory tracing tools, but I still feel like
there should be a generic way of doing this.  A very vague example for
demonstration:

PyMemHooks_GetAllocators/PyMemHooks_SetAllocators/PyMemHooks_TrackAlloc,
which are public APIs, with tracemalloc using them internally instead
of the allocator APIs?

> Regards from rainy London,

Cheers from a cold Istanbul :)

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BBHAJQGIUIBRTW7ZWUYHSP3JACLCTIHS/
[Python-Dev] Re: Custom memory profiler and PyTraceMalloc_Track
The memory allocators don't have any concept of tracing; they just
allocate.  Tracemalloc is a trampoline-based allocator that also traces
what's going on, but from the point of view of the Python allocator
system it is just another allocator.  There is no concept of "notify
the Python allocator" because the allocator system doesn't have the
concept of external allocation events.  You can override it entirely to
do something extra, but it is designed (as many other allocators are)
to not be aware of the existence of other systems running in parallel.

In other words, the allocators are concerned with memory, not tracing
or anything else that can be done by overriding them.  That's why there
are no "notify allocator" APIs in the Python allocators.

Regards from rainy London,
Pablo Galindo Salgado

On Tue, 15 Feb 2022, 13:57 Sümer Cip, wrote:

> Hi everyone,
>
> I would like to ask a question about an issue that we faced regarding
> profiling memory usage:
>
> We have implemented a custom memory profiler using the
> `PyMem_GetAllocator`/`PyMem_SetAllocator` APIs, like `tracemalloc`.
> Right now, we are facing an issue with numpy: numpy seems to have its
> own memory allocator, and it uses the `PyTraceMalloc_Track` APIs to
> notify tracemalloc about the allocation.  A simple search on GitHub
> reveals there are more projects using this approach:
> https://github.com/search?q=PyTraceMalloc_Track&type=code which is
> fine.  I believe it is common to use custom memory allocators for
> scientific computation, which makes perfect sense.
>
> However, I would have expected at least some form of
> `PyMem_NotifyAllocator` type of API instead of one that is specific
> to `tracemalloc`?  I might be missing some context here.
>
> WDYT?
>
> Best,

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ATULLVYZMDGNOE2NYC53ZUZXBBAHLNSO/
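[Pablo's description of tracemalloc as "just another allocator" that happens to trace can be seen from pure Python: once started, it records every allocation that flows through the hooked allocators.  A small sketch:]

```python
# tracemalloc installs itself as a trampoline allocator (via
# PyMem_SetAllocator under the hood) and records each allocation
# it forwards to the real allocator.
import tracemalloc

tracemalloc.start()

data = [bytes(1000) for _ in range(100)]  # ~100 KB of allocations

snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

stats = snapshot.statistics("lineno")
total = sum(stat.size for stat in stats)
# `total` includes the ~100 KB allocated above (plus interpreter noise).
```

[``PyTraceMalloc_Track`` is the C-level entry point that lets an *external* allocator, like numpy's, inject events into this same recording — which is exactly the coupling the thread is questioning.]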
[Python-Dev] Custom memory profiler and PyTraceMalloc_Track
Hi everyone,

I would like to ask a question about an issue that we faced regarding
profiling memory usage:

We have implemented a custom memory profiler using the
`PyMem_GetAllocator`/`PyMem_SetAllocator` APIs, like `tracemalloc`.
Right now, we are facing an issue with numpy: numpy seems to have its
own memory allocator, and it uses the `PyTraceMalloc_Track` APIs to
notify tracemalloc about the allocation.  A simple search on GitHub
reveals there are more projects using this approach:
https://github.com/search?q=PyTraceMalloc_Track&type=code which is
fine.  I believe it is common to use custom memory allocators for
scientific computation, which makes perfect sense.

However, I would have expected at least some form of
`PyMem_NotifyAllocator` type of API instead of one that is specific to
`tracemalloc`?  I might be missing some context here.

WDYT?

Best,

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BHOIDGRUWPM5WEOB3EIDPOJLDMU4WQ4F/