[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-15 Thread Inada Naoki
+1 for the overall idea.

Some comments:

>
> Also note that "fork" isn't the only operating system mechanism
> that uses copy-on-write semantics.
>

Could you elaborate? mmap, maybe?

Generally speaking, fork is very difficult to use safely.
My company's web apps load applications and libraries *after* fork,
not *before* fork, for safety.
We changed multiprocessing to use spawn by default on macOS.
So I don't recommend that most Python users use fork.

So if you know how to get benefit from CoW without fork, I want to know it.
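One non-fork example I can think of (a minimal, hedged sketch): a private file mapping gives copy-on-write semantics without any fork, since writes land in a private copy of the page while the underlying file stays untouched:

```python
import mmap
import os
import tempfile

# Minimal sketch: copy-on-write without fork, via a private file mapping
# (mmap.ACCESS_COPY, i.e. a MAP_PRIVATE-style mapping).
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello world")
    with open(path, "r+b") as f:
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
        m[0:5] = b"HELLO"            # write goes to a private copy of the page
        private_view = bytes(m[:5])
        m.close()
    with open(path, "rb") as f:
        on_disk = f.read(5)          # the underlying file is untouched
finally:
    os.close(fd)
    os.remove(path)
```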

>
> Immortal Global Objects
> -----------------------
>
> The following objects will be made immortal:
>
> * singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
> * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
> * all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
>   small ints)
>
> There will likely be others we have not enumerated here.
>

How about interned strings?
Should the interned-strings dict belong to the runtime, or to each
(sub)interpreter?

If the dict belongs to the runtime, all interned strings should be
immortal so they can be shared between subinterpreters.
If the dict belongs to each interpreter, should we register
immortalized strings with every interpreter?
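For reference, interning as seen from the Python level, where equal strings collapse to a single object (a small sketch):

```python
import sys

# Two equal strings built at runtime (so the compiler cannot fold them
# into one constant), then interned:
s1 = sys.intern("".join(["python", " ", "dev"]))
s2 = sys.intern("python dev")

# After interning, equal strings are the very same object:
same = s1 is s2
```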

Regards,
-- 
Inada Naoki  
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DQV6ECSUB2VD2EXX6CVCC45RJA6NR2ZZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-15 Thread Eric Snow
Eddie and I would appreciate your feedback on this proposal to support
treating some objects as "immortal".  The fundamental characteristic
of the approach is that we would provide stronger guarantees about
immutability for some objects.

A few things to note:

* this is essentially an internal-only change:  there are no
user-facing changes (aside from affecting any 3rd party code that
directly relies on specific refcounts)
* the naive implementation shows a 4% slowdown
* we have a number of strategies that should reduce that penalty
* without immortal objects, the implementation for per-interpreter GIL
will require a number of non-trivial workarounds

That last one is particularly meaningful to me since it means we would
definitely miss the 3.11 feature freeze.  With immortal objects, 3.11
would still be in reach.

-eric

---

PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow , Eddie Elizondo

Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History:
Resolution:


Abstract
========

Under this proposal, any object may be marked as immortal.
"Immortal" means the object will never be cleaned up (at least until
runtime finalization).  Specifically, the `refcount`_ for an immortal
object is set to a sentinel value, and that refcount is never changed
by ``Py_INCREF()``, ``Py_DECREF()``, or ``Py_SET_REFCNT()``.
For immortal containers, the ``PyGC_Head`` is never
changed by the garbage collector.

Avoiding changes to the refcount is an essential part of this
proposal.  For what we call "immutable" objects, it makes them
truly immutable.  As described further below, this allows us
to avoid performance penalties in scenarios that
would otherwise be prohibitive.

This proposal is CPython-specific and, effectively, describes
internal implementation details.

.. _refcount: https://docs.python.org/3.11/c-api/intro.html#reference-counts


Motivation
==========

Without immortal objects, all objects are effectively mutable.  That
includes "immutable" objects like ``None`` and ``str`` instances.
This is because every object's refcount is frequently modified
as it is used during execution.  In addition, for containers
the runtime may modify the object's ``PyGC_Head``.  This
runtime-internal state currently prevents
full immutability.
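For example, this refcount churn is directly observable in CPython:

```python
import sys

# CPython-specific: watch the refcount of an ordinary object change
# merely because new references to it are created.
x = object()                  # a plain, mortal object
before = sys.getrefcount(x)   # includes the temporary reference made by the call
refs = [x] * 10               # ten new references, ten increfs
after = sys.getrefcount(x)
delta = after - before        # 10
```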

This has a concrete impact on active projects in the Python community.
Below we describe several ways in which refcount modification has
a real negative effect on those projects.  None of that would
happen for objects that are truly immutable.

Reducing Cache Invalidation
---------------------------

Every modification of a refcount causes the corresponding cache
line to be invalidated.  This has a number of effects.

For one, the write must be propagated to other cache levels
and to main memory.  This has a small effect on all Python programs.
Immortal objects would provide slight relief in that regard.

On top of that, multi-core applications pay a price.  If two threads
are interacting with the same object (e.g. ``None``) then they will
end up invalidating each other's caches with each incref and decref.
This is true even for otherwise immutable objects like ``True``,
``0``, and ``str`` instances.  This is also true even with
the GIL, though the impact is smaller.

Avoiding Data Races
-------------------

Speaking of multi-core, we are considering making the GIL
a per-interpreter lock, which would enable true multi-core parallelism.
Among other things, the GIL currently protects against races between
multiple threads that concurrently incref or decref.  Without a shared
GIL, two running interpreters could not safely share any objects,
even otherwise immutable ones like ``None``.

This means that, to have a per-interpreter GIL, each interpreter must
have its own copy of *every* object, including the singletons and
static types.  We have a viable strategy for that but it will
require a meaningful amount of extra effort and extra
complexity.

The alternative is to ensure that all shared objects are truly immutable.
There would be no races because there would be no modification.  This
is something that the immortality proposed here would enable for
otherwise immutable objects.  With immortal objects,
support for a per-interpreter GIL
becomes much simpler.

Avoiding Copy-on-Write
----------------------

For some applications it makes sense to get the application into
a desired initial state and then fork the process for each worker.
This can result in a large performance improvement, especially
in memory usage.  Several enterprise Python users (e.g. Instagram,
YouTube) have taken advantage of this.  However, the above
refcount semantics drastically reduce the benefits and
have led to some sub-optimal workarounds.
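One such workaround, for illustration, is ``gc.freeze()`` (available since Python 3.7): it moves all currently tracked objects into a permanent generation before forking, so the collector never examines (and thus never dirties) them in the children:

```python
import gc

# Sketch of the pre-fork workaround: freeze the GC so tracked objects are
# moved to a permanent generation and never examined again.
gc.freeze()
frozen = gc.get_freeze_count()   # objects now in the permanent generation
# ... a real pre-fork setup would os.fork() worker processes here ...
gc.unfreeze()                    # undo it so this sketch leaves no side effects
```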

Also note that "fork" isn't the only operating system mechanism
that uses copy-on-write semantics.


Rationale
=========

The proposed solution 

[Python-Dev] Re: Custom memory profiler and PyTraceMalloc_Track

2022-02-15 Thread Pablo Galindo Salgado
>
> However, I still wonder: is there any way to support the `PyTraceMalloc_Track`
> API without depending on `tracemalloc`? I know there are not many
> memory-tracing tools, but I still feel like there should be a generic way
> of doing this. A very vague example for demonstration:
>
> PyMemHooks_GetAllocators/PyMemHooks_SetAllocators/PyMemHooks_TrackAlloc,
> which would be public APIs, with tracemalloc using them internally instead
> of the allocator APIs.
>

This can be investigated, but it is equivalent to adding tracing support to
the memory allocators and then making tracemalloc a client of such APIs.

This can be advantageous to tool authors, but it could raise the maintenance
cost of the allocators, as we would need to make these APIs generic enough
for different tools with different constraints: whether the GIL must be held
to call these APIs, what information is passed down (do you want the pointer
and the kind of allocator that was used, etc.). There are
even tools that want to link the allocated blocks to Python objects
directly (which is not possible without a considerable redesign).
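To make the shape concrete, here is a purely hypothetical Python sketch of what such a generic hook registry could look like (the PyMemHooks names come from the quoted proposal and do not exist in CPython):

```python
# Purely hypothetical sketch of a generic allocation-hook registry; the
# PyMemHooks_* names from the thread are not a real CPython API.
class MemHooks:
    def __init__(self):
        self._hooks = []

    def add(self, hook):
        # hook(domain, ptr, size) is called for every tracked allocation
        self._hooks.append(hook)

    def track(self, domain, ptr, size):
        # e.g. an extension allocator would call this instead of
        # the tracemalloc-specific PyTraceMalloc_Track()
        for hook in self._hooks:
            hook(domain, ptr, size)

hooks = MemHooks()
seen = []
hooks.add(lambda domain, ptr, size: seen.append((domain, ptr, size)))
hooks.track("numpy", 0x1000, 64)   # any registered tool would record this
```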

I would recommend opening an issue on bugs.python.org where this can be
discussed and (maybe) implemented. Please feel free to add me to the issue
once it is open :)

Regards from rainy London,
Pablo Galindo Salgado
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NDZCL6OB2B535ZXZYVVCFMV4S6RO32GQ/


[Python-Dev] Re: Custom memory profiler and PyTraceMalloc_Track

2022-02-15 Thread Sümer Cip
> In other words, the allocators are concerned about
> memory, not tracing or anything else that can be done by overriding them.

Ok, I now understand that `tracemalloc`'s use of the allocator APIs is just an
implementation detail. The allocator APIs were used for tracing, but they are
not designed for that, which makes perfect sense.

However, I still wonder: is there any way to support the `PyTraceMalloc_Track`
API without depending on `tracemalloc`? I know there are not many
memory-tracing tools, but I still feel like there should be a generic way of
doing this. A very vague example for demonstration:

PyMemHooks_GetAllocators/PyMemHooks_SetAllocators/PyMemHooks_TrackAlloc, which
would be public APIs, with tracemalloc using them internally instead of the
allocator APIs.

>Regards from rainy London,
Cheers from a cold Istanbul, :)
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BBHAJQGIUIBRTW7ZWUYHSP3JACLCTIHS/


[Python-Dev] Re: Custom memory profiler and PyTraceMalloc_Track

2022-02-15 Thread Pablo Galindo Salgado
The memory allocators don't have any concept of tracing; they just
allocate. Tracemalloc is a trampoline-based allocator that also traces
what's going on, but from the point of view of the Python allocator system
it is just another allocator.

There is no concept of "notify the Python allocator" because the Python
allocator system doesn't have the concept of external allocation events.
You can override it entirely to do something extra, but it is designed (like
many other allocators) not to be aware of the existence of other systems
running in parallel. In other words, the allocators are concerned with
memory, not with tracing or anything else that can be done by overriding them.

That's why there is no "notify the allocator" API in the Python allocators.
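This is visible from pure Python as well; a small sketch using only the documented `tracemalloc` module:

```python
import tracemalloc

# tracemalloc is "just another allocator": start() swaps in its trampoline,
# stop() restores the previous allocator chain.
tracemalloc.start()
data = [bytes(1024) for _ in range(100)]   # allocations the trampoline records
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
```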

Regards from rainy London,
Pablo Galindo Salgado

On Tue, 15 Feb 2022, 13:57 Sümer Cip,  wrote:

> Hi everyone,
>
> I would like to ask a question about an issue that we faced regarding
> profiling memory usage:
>
> We have implemented a custom memory profiler using the
> `PyMem_GetAllocator`/`PyMem_SetAllocator` APIs, like `tracemalloc`. Right now,
> we are facing an issue with numpy: numpy seems to have its own memory
> allocator, and it uses the `PyTraceMalloc_Track` API to notify tracemalloc
> about allocations. A simple search on GitHub reveals there are more
> projects using this approach:
> https://github.com/search?q=PyTraceMalloc_Track&type=code, which is fine.
> I believe it is common to use custom memory allocators for scientific
> computation, which makes perfect sense.
>
> However, I would have expected at least some form of
> `PyMem_NotifyAllocator`-style API instead of one that is specific to
> `tracemalloc`. I might be missing some context here.
>
> WDYT?
>
> Best,
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ATULLVYZMDGNOE2NYC53ZUZXBBAHLNSO/


[Python-Dev] Custom memory profiler and PyTraceMalloc_Track

2022-02-15 Thread Sümer Cip
Hi everyone,

I would like to ask a question about an issue that we faced regarding profiling 
memory usage:

We have implemented a custom memory profiler using the
`PyMem_GetAllocator`/`PyMem_SetAllocator` APIs, like `tracemalloc`. Right now,
we are facing an issue with numpy: numpy seems to have its own memory
allocator, and it uses the `PyTraceMalloc_Track` API to notify tracemalloc
about allocations. A simple search on GitHub reveals there are more projects
using this approach: https://github.com/search?q=PyTraceMalloc_Track&type=code,
which is fine. I believe it is common to use custom memory allocators for
scientific computation, which makes perfect sense.

However, I would have expected at least some form of
`PyMem_NotifyAllocator`-style API instead of one that is specific to
`tracemalloc`. I might be missing some context here.

WDYT?

Best,
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BHOIDGRUWPM5WEOB3EIDPOJLDMU4WQ4F/