[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-28 Thread Eric Snow
On Wed, Feb 23, 2022 at 4:21 PM Antonio Cuni  wrote:
> When refcheck=True (the default), numpy raises an error if you try to resize 
> an array inplace whose refcnt > 2 (although I don't understand why > 2 and 
> not > 1, and the docs aren't very clear about this).
>
> That said, relying on the exact value of the refcnt is very bad for 
> alternative implementations and for HPy, and in particular it is impossible 
> to implement ndarray.resize(refcheck=True) correctly on PyPy. So from this 
> point of view, a wording which explicitly restricts the "legal" usage of the 
> refcnt details would be very welcome.

Thanks for the feedback and example.  It helps.

-eric
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/D23Z3C7CQIIGALDRSU4RDDM7GVUAASGW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-28 Thread Eric Snow
On Wed, Feb 23, 2022 at 9:16 AM Petr Viktorin  wrote:
>>> But tp_dict is also public C-API. How will that be handled?
>>> Perhaps naively, I thought static types' dicts could be treated as
>>> (deeply) immutable, and shared?
>>
>> They are immutable from Python code but not from C (due to tp_dict).
>> Basically, we will document that tp_dict should not be used directly
>> (in the public API) and refer users to a public getter function.  I'll
>> note this in the PEP.
>
> What worries me is that existing users of the API haven't read the new
> documentation. What will happen if users do use it?
> Or worse, add things to it?

We will probably set it to NULL, so the user code would fail or crash.
I suppose we could set it to a dummy object that emits helpful errors.

However, I don't think that is worth it.  We're talking about where
users are directly accessing tp_dict of the builtin static types, not
their own.  That is already something they should definitely not be
doing.

> (Hm, the current docs are already rather confusing -- 3.2 added a note
> that "It is not safe to ... modify tp_dict with the dictionary C-API.",
> but above that it says "extra attributes for the type may be added to
> this dictionary [in some cases]")

Yeah, the docs will have to be clarified.

>> Having thought about it some more, I don't think this PEP should be
>> strictly bound to per-interpreter GIL.  That is certainly my personal
>> motivation.  However, we have a small set of users that would benefit
>> significantly, the change is relatively small and simple, and the risk
>> of breaking users is also small.
>
> Right, with the recent performance improvements it's looking like it
> might stand on its own after all.

Great!

>> Honestly, it might not have needed a PEP in the first place if I
>> had been a bit more clear about the idea earlier.
>
> Maybe it's good to have a PEP to clear that up :)

Yeah, the PEP process has been helpful for that. :)

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/AKFMFZ45UJXED24YRB4NHQ4HT442XVSP/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-28 Thread Eric Snow
Responses inline below.

-eric

On Tue, Feb 22, 2022 at 7:22 PM Inada Naoki  wrote:
> > For a recent example, see
> > https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/.
>
> It is not a proven example, just a hope at the moment. So an option is
> fine for proving the idea.
>
> Although I can not read the code, they said "patching ASLR by patching
> `ob_type` fields;".
> It will cause CoW for most objects, won't it?
>
> So reducing memory writes doesn't directly mean reducing CoW.
> Unless we can stop writing on a page completely, the page will be copied.

Yeah, they would have to address that.

> > CPU cache invalidation exists regardless.  With the current GIL the
> > effect is reduced significantly.
>
> It's an interesting point. We cannot see the benefit from
> pyperformance, because it doesn't use much data and it runs one process
> at a time.
> So pyperformance cannot put enough stress on the last-level
> cache, which is shared by many cores.
>
> We need a multiprocess performance benchmark apart from pyperformance
> to stress the last-level cache from multiple cores.
> It would help not only this PEP but also the optimization of containers
> like dict and set.

+1

> Can the proposed optimizations to eliminate the penalty guarantee that
> no __del__ or weakref is broken,
> and that no memory leak occurs when the Python interpreter is initialized
> and finalized multiple times?
> I haven't confirmed it yet.

They will not break __del__ or weakrefs.  No memory will leak after
finalization.  If any of that happens then it is a bug.

> FWIW, I filed an issue to remove hash cache from bytes objects.
> https://github.com/faster-cpython/ideas/issues/290
>
> Code objects have many bytes objects, (e.g. co_code, co_linetable, etc...)
> Removing it will save some RAM usage and make immortal bytes truly
> immutable, safe to be shared between interpreters.

+1  Thanks!
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QKMPALMWGF5366C6PQRSIIFVNXKF4UAM/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-23 Thread Sebastian Berg
On Thu, 2022-02-24 at 00:21 +0100, Antonio Cuni wrote:
> On Mon, Feb 21, 2022 at 5:18 PM Petr Viktorin 
> wrote:
> 
> > Should we care about hacks/optimizations that rely on having the only
> > reference (or all references), e.g. mutating a tuple if it has refcount
> > 1? Immortal objects shouldn't break them (the special case simply won't
> > apply), but this wording would make them illegal.
> > AFAIK CPython uses this internally, but I don't know how
> > prevalent/useful it is in third-party code.
> 
> FWIW, a real world example of this is numpy.ndarray.resize(...,
> refcheck=True):
> https://numpy.org/doc/stable/reference/generated/numpy.ndarray.resize.html#numpy.ndarray.resize
> https://github.com/numpy/numpy/blob/main/numpy/core/src/multiarray/shape.c#L114
> 
> When refcheck=True (the default), numpy raises an error if you try to
> resize an array inplace whose refcnt > 2 (although I don't understand
> why > 2 and not > 1, and the docs aren't very clear about this).
>
> That said, relying on the exact value of the refcnt is very bad for
> alternative implementations and for HPy, and in particular it is
> impossible to implement ndarray.resize(refcheck=True) correctly on PyPy.
> So from this point of view, a wording which explicitly restricts the
> "legal" usage of the refcnt details would be very welcome.

Yeah, NumPy resizing is a bit of an awkward point, I would be on-board
for just replacing resize for non

NumPy does also have a bit of magic akin to the "string concat" trick
for operations like:

a + b + c

where it will try to do magic and use the knowledge that it can
mutate/reuse the temporary array, effectively doing:

tmp = a + b
tmp += c

(which requires some stack-walking magic in addition to the refcount!)
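
The effect of that rewrite can be sketched without NumPy. The following is only an illustration: `Vec` is a hypothetical toy class, not NumPy's array type, and the allocation counter is an assumption added to make the saved temporary visible. It does not reproduce NumPy's refcount or stack inspection, only the difference between the naive and the rewritten evaluation.

```python
class Vec:
    """Hypothetical toy vector that counts how many Vec objects get allocated."""

    allocations = 0

    def __init__(self, data):
        Vec.allocations += 1
        self.data = list(data)

    def __add__(self, other):
        # Out-of-place addition: always allocates a new Vec.
        return Vec(x + y for x, y in zip(self.data, other.data))

    def __iadd__(self, other):
        # In-place addition: reuses the storage of self.
        for i, y in enumerate(other.data):
            self.data[i] += y
        return self


a, b, c = Vec([1, 2]), Vec([10, 20]), Vec([100, 200])

# Naive evaluation: (a + b) allocates a temporary, then (+ c) allocates again.
Vec.allocations = 0
naive = a + b + c
naive_allocs = Vec.allocations

# Rewritten evaluation: the temporary is mutated in place instead.
Vec.allocations = 0
tmp = a + b
tmp += c
elided_allocs = Vec.allocations

print(naive.data, naive_allocs)   # [111, 222] 2
print(tmp.data, elided_allocs)    # [111, 222] 1
```

Both forms give the same result, but the second reuses the temporary, which is only safe when nothing else holds a reference to it.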

Cheers,

Sebastian


Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/HSCF5XPQMWRX45Y2PVNPVSCDT4GC6PTB/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-23 Thread Antonio Cuni
On Mon, Feb 21, 2022 at 5:18 PM Petr Viktorin  wrote:

> Should we care about hacks/optimizations that rely on having the only
> reference (or all references), e.g. mutating a tuple if it has refcount
> 1? Immortal objects shouldn't break them (the special case simply won't
> apply), but this wording would make them illegal.
> AFAIK CPython uses this internally, but I don't know how
> prevalent/useful it is in third-party code.
>

FWIW, a real world example of this is numpy.ndarray.resize(...,
refcheck=True):
https://numpy.org/doc/stable/reference/generated/numpy.ndarray.resize.html#numpy.ndarray.resize
https://github.com/numpy/numpy/blob/main/numpy/core/src/multiarray/shape.c#L114

When refcheck=True (the default), numpy raises an error if you try to
resize an array inplace whose refcnt > 2 (although I don't understand why >
2 and not > 1, and the docs aren't very clear about this).

That said, relying on the exact value of the refcnt is very bad for
alternative implementations and for HPy, and in particular it is impossible
to implement ndarray.resize(refcheck=True) correctly on PyPy. So from this
point of view, a wording which explicitly restricts the "legal" usage of
the refcnt details would be very welcome.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ACJIER45M6XLKUWT6TCLB6QXVZSB74EH/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-23 Thread Brett Cannon
On Wed, Feb 23, 2022 at 8:19 AM Petr Viktorin  wrote:

> On 23. 02. 22 2:46, Eric Snow wrote:
>
>
[SNIP]


>
> > So it seems like the bar should be pretty low for this one (assuming
> > we get the performance penalty low enough).  If it were some massive
> > or broadly impactful (or even clearly public) change then I suppose
> > you could call the motivation weak.  However, this isn't that sort of
> > PEP.


Yes, but PEPs are not just about complexity, but also impact on users. And
"impact" covers backwards-compatibility, which includes performance
regressions (i.e. making Python slower means it may no longer be viable
for someone with specific performance requirements). So with the initial 4%
performance regression it made sense to write a PEP.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/Z4SXVNRLHFWRPLB4UQZQVW7SKDUJH6GY/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-23 Thread Petr Viktorin

On 23. 02. 22 2:46, Eric Snow wrote:

> Thanks for the responses.  I've replied inline below.

Same here :)

>>> Immortal Global Objects
>>> ---
>>>
>>> All objects that we expect to be shared globally (between interpreters)
>>> will be made immortal.  That includes the following:
>>>
>>> * singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
>>> * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
>>> * all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
>>>   small ints)
>>>
>>> All such objects will be immutable.  In the case of the static types,
>>> they will be effectively immutable.  ``PyTypeObject`` has some mutable
>>> state (``tp_dict`` and ``tp_subclasses``), but we can work around this
>>> by storing that state on ``PyInterpreterState`` instead of on the
>>> respective static type object.  Then the ``__dict__``, etc. getter
>>> will do a lookup on the current interpreter, if appropriate, instead
>>> of using ``tp_dict``.
>>
>> But tp_dict is also public C-API. How will that be handled?
>> Perhaps naively, I thought static types' dicts could be treated as
>> (deeply) immutable, and shared?
>
> They are immutable from Python code but not from C (due to tp_dict).
> Basically, we will document that tp_dict should not be used directly
> (in the public API) and refer users to a public getter function.  I'll
> note this in the PEP.


What worries me is that existing users of the API haven't read the new 
documentation. What will happen if users do use it?

Or worse, add things to it?

(Hm, the current docs are already rather confusing -- 3.2 added a note 
that "It is not safe to ... modify tp_dict with the dictionary C-API.", 
but above that it says "extra attributes for the type may be added to 
this dictionary [in some cases]")



[...]

And from the other thread:

On 17. 02. 22 18:23, Eric Snow wrote:
  > On Thu, Feb 17, 2022 at 3:42 AM Petr Viktorin  wrote:
  >>>> Weren't you planning a PEP on subinterpreter GIL as well? Do you want to
  >>>> submit them together?
  >>>
  >>> I'd have to think about that.  The other PEP I'm writing for
  >>> per-interpreter GIL doesn't require immortal objects.  They just
  >>> simplify a number of things.  That's my motivation for writing this
  >>> PEP, in fact. :)
  >>
  >> Please think about it.
  >> If you removed the benefits for per-interpreter GIL, the motivation
  >> section would be reduced to memory savings for fork/CoW. (And lots of
  >> performance improvements that are great in theory but sum up to a 4% loss.)
  >
  > Sounds good.  Would this involve more than a note at the top of the PEP?

No, a note would work great. If you read the motivation carefully, it's
(IMO) clear that it's rather weak without the other PEP. But that
realization shouldn't come as a surprise to the reader.


> Having thought about it some more, I don't think this PEP should be
> strictly bound to per-interpreter GIL.  That is certainly my personal
> motivation.  However, we have a small set of users that would benefit
> significantly, the change is relatively small and simple, and the risk
> of breaking users is also small.  In fact, we regularly have more
> disruptive changes to internals that do not require a PEP.


Right, with the recent performance improvements it's looking like it 
might stand on its own after all.



> So it seems like the bar should be pretty low for this one (assuming
> we get the performance penalty low enough).  If it were some massive
> or broadly impactful (or even clearly public) change then I suppose
> you could call the motivation weak.  However, this isn't that sort of
> PEP.  Honestly, it might not have needed a PEP in the first place if I
> had been a bit more clear about the idea earlier.


Maybe it's good to have a PEP to clear that up :)
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZTON72YXUUFV5MX5KIEM3DDNAUAZT4M6/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
On Tue, Feb 22, 2022, 20:26 Larry Hastings  wrote:

> Are these optimizations specifically for the PR, or are these
> optimizations we could apply without taking the immortal objects?  Kind of
> like how Sam tried to offset the nogil slowdown by adding optimizations
> that we went ahead and added anyway ;-)
>

Basically all the optimizations require immortal objects.

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7VJVBFBWE3HWTPRVZH3WLSR7EZHZD337/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Larry Hastings

On 2/22/22 6:00 PM, Eric Snow wrote:

> On Sat, Feb 19, 2022 at 12:46 AM Eric Snow  wrote:
>> Performance
>> ---
>>
>> A naive implementation shows `a 4% slowdown`_.
>> Several promising mitigation strategies will be pursued in the effort
>> to bring it closer to performance-neutral.  See the `mitigation`_
>> section below.
>
> FYI, Eddie has been able to get us back to performance-neutral after
> applying several of the mitigation strategies we discussed. :)



Are these optimizations specifically for the PR, or are these 
optimizations we could apply without taking the immortal objects? Kind 
of like how Sam tried to offset the nogil slowdown by adding 
optimizations that we went ahead and added anyway ;-)



//arry/

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7GX4C3DQ23B2K5JXTOYQQPT2ZLJD7CP4/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Inada Naoki
On Wed, Feb 23, 2022 at 10:12 AM Eric Snow  wrote:
>
> Thanks for the feedback.  I've responded inline below.
>
> -eric
>
> On Sat, Feb 19, 2022 at 8:50 PM Inada Naoki  wrote:
> > I hope per-interpreter GIL succeeds at some point, and I know this is
> > needed for per-interpreter GIL.
> >
> > But I am worried that per-interpreter GIL may be too complex to
> > implement and maintain for core developers and extension writers.
> > As you know, immortal doesn't mean shareable between interpreters. It is
> > too difficult to know which objects can be shared, and where the
> > shareable objects are leaked to other interpreters.
> > So I am not sure that per-interpreter GIL is an achievable goal.
>
> I plan on addressing this in the PEP I am working on for
> per-interpreter GIL.  In the meantime, I doubt the issue will impact
> any core devs.
>

It's nice to hear!


> > So I think it's too early to introduce immortal objects in Python
> > 3.11, unless it *improves* performance without per-interpreter GIL.
> > Instead, we can add a configuration option such as
> > `--enable-experimental-immortal`.
>
> I agree that immortal objects aren't quite as appealing in general
> without per-interpreter GIL.  However, there are actual users that
> will benefit from it, assuming we can reduce the performance penalty
> to acceptable levels.  For a recent example, see
> https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/.
>

It is not a proven example, just a hope at the moment. So an option is
fine for proving the idea.

Although I can not read the code, they said "patching ASLR by patching
`ob_type` fields;".
It will cause CoW for most objects, won't it?

So reducing memory writes doesn't directly mean reducing CoW.
Unless we can stop writing on a page completely, the page will be copied.
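
The page-granularity argument can be illustrated with a small model. This is a rough sketch under stated assumptions, not a measurement: the 4 KiB page size, the uniform 64-byte object size and the contiguous layout are all made up for the illustration, and `page_of` is a hypothetical helper.

```python
PAGE_SIZE = 4096     # assumption: 4 KiB memory pages
OBJECT_SIZE = 64     # assumption: every object occupies exactly 64 bytes
NUM_OBJECTS = 1000   # assumption: objects are laid out contiguously


def page_of(index: int) -> int:
    """Return the page number holding the object at the given index."""
    return (index * OBJECT_SIZE) // PAGE_SIZE


all_pages = {page_of(i) for i in range(NUM_OBJECTS)}

# Write to only one object per page (e.g. one refcount or ob_type update).
objects_per_page = PAGE_SIZE // OBJECT_SIZE
written = range(0, NUM_OBJECTS, objects_per_page)
dirty_pages = {page_of(i) for i in written}

print(f"objects written: {len(written)} of {NUM_OBJECTS}")
print(f"pages copied:    {len(dirty_pages)} of {len(all_pages)}")
```

In this model, writing to 16 of 1000 objects is already enough to copy every page, which is the sense in which fewer writes do not automatically mean less copy-on-write.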


> > On Sat, Feb 19, 2022 at 4:52 PM Eric Snow  
> > wrote:
> > >
> > > Reducing CPU Cache Invalidation
> > > ---
> > >
> > > Avoiding Data Races
> > > ---
> > >
> >
> > Both benefits require a per-interpreter GIL.
>
> CPU cache invalidation exists regardless.  With the current GIL the
> effect is reduced significantly.
>

It's an interesting point. We cannot see the benefit from
pyperformance, because it doesn't use much data and it runs one process
at a time.
So pyperformance cannot put enough stress on the last-level
cache, which is shared by many cores.

We need a multiprocess performance benchmark apart from pyperformance
to stress the last-level cache from multiple cores.
It would help not only this PEP but also the optimization of containers
like dict and set.


> >
> > As I wrote before, fork is very difficult to use safely. We cannot
> > recommend it to many users.
> > And I don't think reducing the size of the patch in Instagram or YouTube
> > is a good rationale for this kind of change.
>
> What do you mean by "this kind of change"?  The proposed change is
> relatively small.  It certainly isn't nearly as intrusive as many
> changes we make to internals without a PEP.  If you are talking about
> the performance penalty, we should be able to eliminate it.
>

Can the proposed optimizations to eliminate the penalty guarantee that
no __del__ or weakref is broken,
and that no memory leak occurs when the Python interpreter is initialized
and finalized multiple times?
I haven't confirmed it yet.


> > > Also note that "fork" isn't the only operating system mechanism
> > > that uses copy-on-write semantics.  Anything that uses ``mmap``
> > > relies on copy-on-write, including sharing data from shared objects
> > > files between processes.
> > >
> >
> > It is very difficult to reduce CoW with mmap(MAP_PRIVATE).
> >
> > You may need to write the hash of bytes and unicode objects. You may need to
> > write `tp_type`.
> > Immortal objects can "reduce" the memory write. But "at least one
> > memory write" is enough to trigger the CoW.
>
> Correct.  However, without immortal objects (AKA immutable per-object
> runtime-state) it goes from "very difficult" to "basically
> impossible".
>

A configuration option won't make it impossible.


> > >
> > > Constraints
> > > ---
> > >
> > > * ensure that otherwise immutable objects can be truly immutable
> > > * be careful when immortalizing objects that are not otherwise immutable
> >
> > I am not sure about what this means.
> > For example, unicode objects are not immutable because they have hash,
> > utf8 cache and wchar_t cache. (wchar_t cache will be removed in Python
> > 3.12).
>
> I think you understood it correctly.  In the case of str objects, they
> are close enough since a race on any of those values will not cause a
> different outcome.
>
> I will clarify the point in the PEP.
>

FWIW, I filed an issue to remove hash cache from bytes objects.
https://github.com/faster-cpython/ideas/issues/290

Code objects have many bytes objects, (e.g. co_code, co_linetable, etc...)
Removing it will save some RAM usage and make 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
On Sat, Feb 19, 2022 at 12:46 AM Eric Snow  wrote:
> Performance
> ---
>
> A naive implementation shows `a 4% slowdown`_.
> Several promising mitigation strategies will be pursued in the effort
> to bring it closer to performance-neutral.  See the `mitigation`_
> section below.

FYI, Eddie has been able to get us back to performance-neutral after
applying several of the mitigation strategies we discussed. :)

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZYGZEQSVBS6ODVAHPL3QN4CJ7JN4FYWO/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
On Mon, Feb 21, 2022 at 4:56 PM Terry Reedy  wrote:
> We could say that the only refcounts with any meaning are 0, 1, and > 1.

Yeah, that should work.
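
A minimal sketch of what "only 0, 1, and > 1 have meaning" could look like from Python. The helper names `category` and `refcount_delta_after_alias` are hypothetical and not part of any API. Note that `sys.getrefcount()` itself reports a value that includes temporary references, so the sketch only relies on the change between two calls, not on the absolute number.

```python
import sys


def category(refcount: int) -> str:
    """Collapse an exact refcount into the three meaningful categories."""
    if refcount == 0:
        return "0"
    if refcount == 1:
        return "1"
    return "> 1"


def refcount_delta_after_alias(obj) -> int:
    """Return how much the reported refcount grows when one alias is added."""
    before = sys.getrefcount(obj)
    alias = obj  # one additional strong reference
    after = sys.getrefcount(obj)
    del alias
    return after - before


print(category(0), category(1), category(7))
print(refcount_delta_after_alias([1, 2, 3]))  # 1 for an ordinary (mortal) list
```

For an immortal object the reported count would presumably not move at all, which is one reason code should not depend on exact values beyond these categories.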

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7HZ7VBJQOYHXFV3ZD4V7DCMLBL4Q34WP/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
On Mon, Feb 21, 2022 at 10:56 AM  wrote:
> For what it's worth Cython does this for string concatenation to concatenate 
> in place if possible (this optimization was copied from CPython). It could be 
> disabled relatively easily if it became a problem (it's already CPython only 
> and version checked so it'd just need another upper-bound version check).

That's good to know.

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/OEZS4KGQJET5DL3M2OTB76I4W7F56FJC/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
Thanks for the responses.  I've replied inline below.

-eric

On Mon, Feb 21, 2022 at 9:11 AM Petr Viktorin  wrote:
>
> On 19. 02. 22 8:46, Eric Snow wrote:
> > Thanks to all those that provided feedback.  I've worked to
> > substantially update the PEP in response.  The text is included below.
> > Further feedback is appreciated.
>
> Thank you! This version is much clearer. I like the PEP more and more!

Great!

> I've sent a PR with some typo fixes:
> https://github.com/python/peps/pull/2348

Thank you.

> > Public Refcount Details
> [...]
> > As part of this proposal, we must make sure that users can clearly
> > understand on which parts of the refcount behavior they can rely and
> > which are considered implementation details.  Specifically, they should
> > use the existing public refcount-related API and the only refcount value
> > with any meaning is 0.  All other values are considered "not 0".
>
> Should we care about hacks/optimizations that rely on having the only
> reference (or all references), e.g. mutating a tuple if it has refcount
> 1? Immortal objects shouldn't break them (the special case simply won't
> apply), but this wording would make them illegal.
> AFAIK CPython uses this internally, but I don't know how
> prevalent/useful it is in third-party code.

Good point.  As Terry suggested, we could also let 1 have meaning.

Regardless, any documented restriction would only apply to users of
the public C-API, not to internal code.

> > _Py_IMMORTAL_REFCNT
> > ---
> >
> > We will add two internal constants::
> >
> >  #define _Py_IMMORTAL_BIT (1LL << (8 * sizeof(Py_ssize_t) - 4))
> >  #define _Py_IMMORTAL_REFCNT (_Py_IMMORTAL_BIT + (_Py_IMMORTAL_BIT / 2))
>
> As a nitpick: could you say this in prose?
>
> * ``_Py_IMMORTAL_BIT`` has the third top-most bit set.
> * ``_Py_IMMORTAL_REFCNT`` has the third and fourth top-most bits set.

Sure.
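
As a quick check of the arithmetic in the two macros quoted above, here is a sketch in Python. It assumes a 64-bit build where `sizeof(Py_ssize_t)` is 8; `set_bits` is a hypothetical helper and bits are numbered from 0 (least significant).

```python
SIZEOF_PY_SSIZE_T = 8  # assumption: 64-bit build, sizeof(Py_ssize_t) == 8

# Same expressions as the quoted C macros, using Python integers.
IMMORTAL_BIT = 1 << (8 * SIZEOF_PY_SSIZE_T - 4)
IMMORTAL_REFCNT = IMMORTAL_BIT + IMMORTAL_BIT // 2


def set_bits(value: int) -> list[int]:
    """Return the positions of all set bits, least significant first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


print(set_bits(IMMORTAL_BIT))     # [60]
print(set_bits(IMMORTAL_REFCNT))  # [59, 60]
```

Under that assumption the bit is bit 60 and the refcount has bits 59 and 60 set. That matches the "third" and "third and fourth top-most" wording if the sign bit (bit 63) is not counted.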

> > Immortal Global Objects
> > ---
> >
> > All objects that we expect to be shared globally (between interpreters)
> > will be made immortal.  That includes the following:
> >
> > * singletons (``None``, ``True``, ``False``, ``Ellipsis``, 
> > ``NotImplemented``)
> > * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
> > * all static objects in ``_PyRuntimeState.global_objects`` (e.g. 
> > identifiers,
> >small ints)
> >
> > All such objects will be immutable.  In the case of the static types,
> > they will be effectively immutable.  ``PyTypeObject`` has some mutable
> > state (``tp_dict`` and ``tp_subclasses``), but we can work around this
> > by storing that state on ``PyInterpreterState`` instead of on the
> > respective static type object.  Then the ``__dict__``, etc. getter
> > will do a lookup on the current interpreter, if appropriate, instead
> > of using ``tp_dict``.
>
> But tp_dict is also public C-API. How will that be handled?
> Perhaps naively, I thought static types' dicts could be treated as
> (deeply) immutable, and shared?

They are immutable from Python code but not from C (due to tp_dict).
Basically, we will document that tp_dict should not be used directly
(in the public API) and refer users to a public getter function.  I'll
note this in the PEP.

> Perhaps it would be best to leave it out here and say "The details
> of sharing ``PyTypeObject`` across interpreters are left to another PEP"?
> Even so, I'd love to know the plan.

What else would you like to know?  There isn't much to it.  For each
of the builtin static types we will keep the relevant mutable state on
PyInterpreterState and look it up there in the relevant getters (e.g.
__dict__ and __subclasses__).
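
To make the shape of that idea concrete, here is a toy model in Python. Everything in it is hypothetical: `InterpreterState`, `StaticType` and `get_type_dict` are illustrative names that only mimic the described layout and are not CPython's actual structures or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StaticType:
    """Stand-in for a shared, effectively immutable static type object."""
    name: str


@dataclass
class InterpreterState:
    """Stand-in for per-interpreter storage of a static type's mutable state."""
    type_dicts: dict = field(default_factory=dict)
    type_subclasses: dict = field(default_factory=dict)


def get_type_dict(interp: InterpreterState, static_type: StaticType) -> dict:
    """Getter that looks the dict up on the interpreter, not on the type."""
    return interp.type_dicts.setdefault(static_type.name, {})


LONG_TYPE = StaticType("int")
main_interp, sub_interp = InterpreterState(), InterpreterState()

# A change made through one interpreter is invisible to the other one,
# while the type object itself is never written to.
get_type_dict(main_interp, LONG_TYPE)["patched"] = True
print("patched" in get_type_dict(main_interp, LONG_TYPE))  # True
print("patched" in get_type_dict(sub_interp, LONG_TYPE))   # False
```

The point of the design is that the shared type object stays read-only, so it can be immortal and shared, while each interpreter keeps its own copy of whatever would otherwise be mutated.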

> (And even if these are internals,
> changes to them should be mentioned in What's New, for the sake of
> people who need to maintain old extensions.)

+1

> > Object Cleanup
> > --
> >
> > In order to clean up all immortal objects during runtime finalization,
> > we must keep track of them.
> >
> > For GC objects ("containers") we'll leverage the GC's permanent
> > generation by pushing all immortalized containers there.  During
> > runtime shutdown, the strategy will be to first let the runtime try
> > to do its best effort of deallocating these instances normally.  Most
> > of the module deallocation will now be handled by
> > ``pylifecycle.c:finalize_modules()`` which cleans up the remaining
> > modules as best as we can.  It will change which modules are available
> > during __del__ but that's already defined as undefined behavior by the
> > docs.  Optionally, we could do some topological ordering to guarantee
> > that user modules will be deallocated first before the stdlib modules.
> > Finally, anything leftover (if any) can be found through the permanent
> > generation gc list which we can clear after finalize_modules().
> >
> > For non-container objects, the tracking approach will vary on a
> > case-by-case basis.  In nearly every case, each such object is directly
> > accessible on the 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
Thanks for the feedback.  I've responded inline below.

-eric

On Sat, Feb 19, 2022 at 8:50 PM Inada Naoki  wrote:
> I hope per-interpreter GIL succeeds at some point, and I know this is
> needed for per-interpreter GIL.
>
> But I am worried that per-interpreter GIL may be too complex to
> implement and maintain for core developers and extension writers.
> As you know, immortal doesn't mean shareable between interpreters. It is
> too difficult to know which objects can be shared, and where the
> shareable objects are leaked to other interpreters.
> So I am not sure that per-interpreter GIL is an achievable goal.

I plan on addressing this in the PEP I am working on for
per-interpreter GIL.  In the meantime, I doubt the issue will impact
any core devs.

> So I think it's too early to introduce immortal objects in Python
> 3.11, unless it *improves* performance without per-interpreter GIL.
> Instead, we can add a configuration option such as
> `--enable-experimental-immortal`.

I agree that immortal objects aren't quite as appealing in general
without per-interpreter GIL.  However, there are actual users that
will benefit from it, assuming we can reduce the performance penalty
to acceptable levels.  For a recent example, see
https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/.

> On Sat, Feb 19, 2022 at 4:52 PM Eric Snow  wrote:
> >
> > Reducing CPU Cache Invalidation
> > ---
> >
> > Avoiding Data Races
> > ---
> >
>
> Both benefits require a per-interpreter GIL.

CPU cache invalidation exists regardless.  With the current GIL the
effect is reduced significantly.

Per-interpreter GIL is only one situation where data races matter.
Any attempt to generally eliminate the GIL must deal with races on the
per-object runtime state.

> >
> > Avoiding Copy-on-Write
> > --
> >
> > For some applications it makes sense to get the application into
> > a desired initial state and then fork the process for each worker.
> > This can result in a large performance improvement, especially
> > in memory usage.  Several enterprise Python users (e.g. Instagram,
> > YouTube) have taken advantage of this.  However, the above
> > refcount semantics drastically reduce the benefits and
> > have led to some sub-optimal workarounds.
> >
>
> As I wrote before, fork is very difficult to use safely. We cannot
> recommend it to many users.
> And I don't think reducing the size of a patch at Instagram or YouTube
> is a good rationale for this kind of change.

What do you mean by "this kind of change"?  The proposed change is
relatively small.  It certainly isn't nearly as intrusive as many
changes we make to internals without a PEP.  If you are talking about
the performance penalty, we should be able to eliminate it.

> > Also note that "fork" isn't the only operating system mechanism
> > that uses copy-on-write semantics.  Anything that uses ``mmap``
> > relies on copy-on-write, including sharing data from shared object
> > files between processes.
> >
>
> It is very difficult to reduce CoW with mmap(MAP_PRIVATE).
>
> You may need to write the hash of bytes and unicode objects. You may
> need to write `tp_type`.
> Immortal objects can "reduce" the memory writes. But "at least one
> memory write" is enough to trigger the CoW.

Correct.  However, without immortal objects (AKA immutable per-object
runtime-state) it goes from "very difficult" to "basically
impossible".
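
To make the link between refcounts and copy-on-write concrete, here is a small CPython-specific sketch. It only shows that taking a new reference changes the count kept in the object's header, which is the kind of memory write that dirties a page after a fork. The function name is made up for illustration and is not part of the PEP.

```python
import sys


def refcount_delta_on_new_reference():
    """Return how much an object's refcount changes when one reference is added."""
    obj = object()          # an ordinary (mortal) object
    holder = []             # container that will own the extra reference
    before = sys.getrefcount(obj)
    holder.append(obj)      # obj is not "mutated", yet its header is written
    after = sys.getrefcount(obj)
    return after - before


if __name__ == "__main__":
    print(refcount_delta_on_new_reference())
```

With a fixed refcount, the same `append` would leave the object's header untouched, which is the property the copy-on-write argument depends on.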

> > Accidental Immortality
> > --
> >
> > While it isn't impossible, this accidental scenario is so unlikely
> > that we need not worry.  Even if done deliberately by using
> > ``Py_INCREF()`` in a tight loop and each iteration only took 1 CPU
> > cycle, it would take 2^61 cycles (on a 64-bit processor).  At a fast
> > 5 GHz that would still take nearly 500,000,000 seconds (over 5,000 days)!
> > If that CPU were 32-bit then it is (technically) more possible though
> > still highly unlikely.
> >
>
> Technically, `[obj] * (2**(32-4))` is a 1GB array on 32-bit.

The question is if this matters.  If really necessary, the PEP can
demonstrate that it doesn't matter in practice.

(Also, the magic value on 32-bit would be 2**29.)
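
For readers who want to check the figures in this exchange, the arithmetic can be reproduced with a few lines of Python. This is only a sketch of the numbers quoted above (2^61 increments at 5 GHz, and a list of 2**(32-4) pointers of 4 bytes each); the helper names are hypothetical and the list header is ignored.

```python
CLOCK_HZ = 5_000_000_000          # the "fast 5 GHz" CPU from the PEP text
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_to_reach(refcount, increfs_per_second=CLOCK_HZ):
    """Seconds needed to reach `refcount` at one Py_INCREF() per cycle."""
    return refcount / increfs_per_second


def list_bytes(n_items, pointer_size):
    """Size of the pointer array behind `[obj] * n_items` (header ignored)."""
    return n_items * pointer_size


secs_64 = seconds_to_reach(2**61)
days_64 = secs_64 / SECONDS_PER_DAY
gib_32 = list_bytes(2 ** (32 - 4), pointer_size=4) / 2**30

if __name__ == "__main__":
    print(f"64-bit: {secs_64:,.0f} s, about {days_64:,.0f} days")
    print(f"32-bit: [obj] * 2**28 needs {gib_32:g} GiB of pointers")
```

The 64-bit case lands a little under 500,000,000 seconds, while the 32-bit list fits in 1 GiB of pointers, which is why the 32-bit case is the one under discussion.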

> >
> > Constraints
> > ---
> >
> > * ensure that otherwise immutable objects can be truly immutable
> > * be careful when immortalizing objects that are not otherwise immutable
>
> I am not sure about what this means.
> For example, unicode objects are not immutable because they have a hash,
> a utf8 cache and a wchar_t cache. (The wchar_t cache will be removed in
> Python 3.12).

I think you understood it correctly.  In the case of str objects, they
are close enough since a race on any of those values will not cause a
different outcome.

I will clarify the point in the PEP.
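
As a rough illustration of why a race on such a cache is harmless, consider a toy object whose cached value is a pure function of its immutable payload. This is not how str is implemented; the class, the hash formula and the field name are all made up for the sketch.

```python
class CachedHashText:
    """Toy stand-in for an object with a lazily filled hash cache."""

    def __init__(self, text):
        self.text = text
        self.cached_hash = None      # hypothetical cache slot, empty at first

    def compute_hash(self):
        # Pure function of the immutable payload: same input, same result.
        return sum((i + 1) * ord(ch) for i, ch in enumerate(self.text))

    def get_hash(self):
        if self.cached_hash is None:
            self.cached_hash = self.compute_hash()
        return self.cached_hash


def simulate_race(obj):
    """Both 'threads' see an empty cache, compute, then store in turn."""
    first = obj.compute_hash()
    second = obj.compute_hash()
    obj.cached_hash = first
    obj.cached_hash = second     # the second store writes the same value
    return first, second, obj.get_hash()
```

Whichever store wins, every reader sees the same value, so the outcome does not depend on the interleaving.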

> > Object Cleanup
> > --
> >
> > In order to clean up all immortal objects during runtime finalization,
> > we must keep track of them.
> >
>
> I don't 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-21 Thread Terry Reedy

On 2/21/2022 11:11 AM, Petr Viktorin wrote:
> On 19. 02. 22 8:46, Eric Snow wrote:
>> As part of this proposal, we must make sure that users can clearly
>> understand on which parts of the refcount behavior they can rely and
>> which are considered implementation details.  Specifically, they should
>> use the existing public refcount-related API and the only refcount value
>> with any meaning is 0.  All other values are considered "not 0".
>
> Should we care about hacks/optimizations that rely on having the only
> reference (or all references), e.g. mutating a tuple if it has refcount
> 1? Immortal objects shouldn't break them (the special case simply won't
> apply), but this wording would make them illegal.
> AFAIK CPython uses this internally, but I don't know how
> prevalent/useful it is in third-party code.


We could say that the only refcounts with any meaning are 0, 1, and > 1.


--
Terry Jan Reedy
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/C3R4FKO7PZETOSI5DTGMAXWVUTQM26AW/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-21 Thread dw-git
Petr Viktorin wrote:
> Should we care about hacks/optimizations that rely on having the only 
> reference (or all references), e.g. mutating a tuple if it has refcount 
> 1? Immortal objects shouldn't break them (the special case simply won't 
> apply), but this wording would make them illegal.
> AFAIK CPython uses this internally, but I don't know how 
> prevalent/useful it is in third-party code.

For what it's worth Cython does this for string concatenation to concatenate in 
place if possible (this optimization was copied from CPython). It could be 
disabled relatively easily if it became a problem (it's already CPython only 
and version checked so it'd just need another upper-bound version check).
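
The "concatenate in place if we hold the only reference" idea can be modelled in a few lines. This is a toy sketch, not Cython's or CPython's code: the refcount is passed in explicitly, a list stands in for the string buffer, and the constant is the value the macros quoted elsewhere in this thread would give on a 64-bit build.

```python
IMMORTAL_REFCNT = (1 << 60) + (1 << 59)   # value of the quoted macro on 64-bit


def concat(left, left_refcnt, right):
    """Toy model of 'concatenate in place if we hold the only reference'.

    `left` is a list standing in for a buffer; `left_refcnt` is passed in
    explicitly because this sketch does not touch real refcounts.
    Returns (result, reused_buffer).
    """
    if left_refcnt == 1:
        left.extend(right)        # sole owner: nobody else can observe this
        return left, True
    return left + right, False    # shared (or immortal): always copy


if __name__ == "__main__":
    buf = list("ab")
    print(concat(buf, 1, list("cd")))                # fast path taken
    print(concat(buf, IMMORTAL_REFCNT, list("e")))   # fast path skipped
```

An immortal refcount is simply never equal to 1, so the fast path is skipped and the slower copying path still gives the right answer.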
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CDNQK5RMXSLLYFNIXRORL7GTKU6B4BVR/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-21 Thread Petr Viktorin

On 19. 02. 22 8:46, Eric Snow wrote:

> Thanks to all those that provided feedback.  I've worked to
> substantially update the PEP in response.  The text is included below.
> Further feedback is appreciated.


Thank you! This version is much clearer. I like the PEP more and more!

I've sent a PR with some typo fixes: 
https://github.com/python/peps/pull/2348

and I have a few comments:


> [...]
>
> Public Refcount Details
>
> [...]
>
> As part of this proposal, we must make sure that users can clearly
> understand on which parts of the refcount behavior they can rely and
> which are considered implementation details.  Specifically, they should
> use the existing public refcount-related API and the only refcount value
> with any meaning is 0.  All other values are considered "not 0".


Should we care about hacks/optimizations that rely on having the only 
reference (or all references), e.g. mutating a tuple if it has refcount 
1? Immortal objects shouldn't break them (the special case simply won't 
apply), but this wording would make them illegal.
AFAIK CPython uses this internally, but I don't know how 
prevalent/useful it is in third-party code.



> [...]
>
> _Py_IMMORTAL_REFCNT
> ---
>
> We will add two internal constants::
>
>  #define _Py_IMMORTAL_BIT (1LL << (8 * sizeof(Py_ssize_t) - 4))
>  #define _Py_IMMORTAL_REFCNT (_Py_IMMORTAL_BIT + (_Py_IMMORTAL_BIT / 2))


As a nitpick: could you say this in prose?

* ``_Py_IMMORTAL_BIT`` has the third top-most bit set.
* ``_Py_IMMORTAL_REFCNT`` has the third and fourth top-most bits set.
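
To see which bits the two macros set, they can be evaluated with Python integers. This is a sketch under the assumption that `Py_ssize_t` is 64 or 32 bits wide; the helper names are hypothetical. Note that "third top-most" appears to count from the highest value bit (bit 62 on 64-bit), leaving out the sign bit.

```python
def immortal_constants(ssize_bits):
    """Evaluate the two quoted macros for a Py_ssize_t of `ssize_bits` bits."""
    immortal_bit = 1 << (ssize_bits - 4)
    immortal_refcnt = immortal_bit + immortal_bit // 2
    return immortal_bit, immortal_refcnt


def set_bits(value):
    """Positions of the 1 bits in `value`, highest first (bit 0 is lowest)."""
    return [i for i in range(value.bit_length() - 1, -1, -1) if value >> i & 1]


if __name__ == "__main__":
    for bits in (64, 32):
        bit, refcnt = immortal_constants(bits)
        print(bits, set_bits(bit), set_bits(refcnt))
```

On a 64-bit build this gives bit 60 for `_Py_IMMORTAL_BIT` and bits 60 and 59 for `_Py_IMMORTAL_REFCNT`; on a 32-bit build, bit 28 and bits 28 and 27.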


> [...]
>
> Immortal Global Objects
> ---
>
> All objects that we expect to be shared globally (between interpreters)
> will be made immortal.  That includes the following:
>
> * singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
> * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
> * all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
>    small ints)
>
> All such objects will be immutable.  In the case of the static types,
> they will be effectively immutable.  ``PyTypeObject`` has some mutable
> state (``tp_dict`` and ``tp_subclasses``), but we can work around this
> by storing that state on ``PyInterpreterState`` instead of on the
> respective static type object.  Then the ``__dict__``, etc. getter
> will do a lookup on the current interpreter, if appropriate, instead
> of using ``tp_dict``.
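
The idea of moving the mutable part of a static type onto the interpreter can be pictured with a toy model. Everything here is hypothetical (class names, the `static_type_dicts` field, the getter); it is a sketch of the lookup described in the quoted paragraph, not of any actual CPython data structure.

```python
class StaticType:
    """Toy stand-in for a static PyTypeObject shared by all interpreters."""

    def __init__(self, name):
        self.name = name              # immutable part, safe to share


class Interpreter:
    """Toy stand-in for PyInterpreterState (the field name is made up)."""

    def __init__(self):
        self.static_type_dicts = {}   # maps type name -> that type's dict


def get_type_dict(interp, static_type):
    """What a __dict__ getter might do: look up state on the interpreter."""
    return interp.static_type_dicts.setdefault(static_type.name, {})


if __name__ == "__main__":
    shared_type = StaticType("int")
    main, sub = Interpreter(), Interpreter()
    get_type_dict(main, shared_type)["patched"] = True
    print(get_type_dict(main, shared_type), get_type_dict(sub, shared_type))
```

In this model a change made through one interpreter is invisible to the other, while the shared type object itself is never written to. It says nothing about what happens to code that reads ``tp_dict`` directly.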


But tp_dict is also public C-API. How will that be handled?
Perhaps naively, I thought static types' dicts could be treated as 
(deeply) immutable, and shared?


Perhaps it would be best to leave it out here and say "The details 
of sharing ``PyTypeObject`` across interpreters are left to another PEP"?
Even so, I'd love to know the plan. (And even if these are internals, 
changes to them should be mentioned in What's New, for the sake of 
people who need to maintain old extensions.)





> Object Cleanup
> --
>
> In order to clean up all immortal objects during runtime finalization,
> we must keep track of them.
>
> For GC objects ("containers") we'll leverage the GC's permanent
> generation by pushing all immortalized containers there.  During
> runtime shutdown, the strategy will be to first let the runtime try
> to do its best effort of deallocating these instances normally.  Most
> of the module deallocation will now be handled by
> ``pylifecycle.c:finalize_modules()`` which cleans up the remaining
> modules as best as we can.  It will change which modules are available
> during __del__ but that's already documented as undefined behavior in
> the docs.  Optionally, we could do some topological ordering to guarantee
> that user modules will be deallocated first before the stdlib modules.
> Finally, anything leftover (if any) can be found through the permanent
> generation gc list which we can clear after finalize_modules().
>
> For non-container objects, the tracking approach will vary on a
> case-by-case basis.  In nearly every case, each such object is directly
> accessible on the runtime state, e.g. in a ``_PyRuntimeState`` or
> ``PyInterpreterState`` field.  We may need to add a tracking mechanism
> to the runtime state for a small number of objects.
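
The permanent generation mentioned above is already reachable from Python through `gc.freeze()`, `gc.get_freeze_count()` and `gc.unfreeze()` (available since Python 3.7). The sketch below only demonstrates that existing public API on ordinary objects. It does not show the internal mechanism the PEP would use for immortalized containers, and `Node` and `freeze_and_count` are made-up names.

```python
import gc


class Node:
    """A container object, so the cyclic GC tracks its instances."""

    def __init__(self):
        self.children = []


def freeze_and_count():
    """Move tracked objects to the permanent generation and report its size."""
    keep_alive = [Node() for _ in range(100)]
    gc.freeze()                       # tracked objects are now ignored by collections
    frozen = gc.get_freeze_count()
    gc.unfreeze()                     # move them back to the oldest generation
    return frozen, gc.get_freeze_count(), len(keep_alive)


if __name__ == "__main__":
    print(freeze_and_count())
```

Objects in the permanent generation are skipped by later collections, which is what makes it a convenient place to park containers that should never be collected and to find them again at shutdown.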


Out of curiosity: How does this extra work affect performance? Is 
it part of the 4% slowdown?




And from the other thread:

On 17. 02. 22 18:23, Eric Snow wrote:
> On Thu, Feb 17, 2022 at 3:42 AM Petr Viktorin  wrote:
>>>> Weren't you planning a PEP on subinterpreter GIL as well? Do you want to
>>>> submit them together?
>>>
>>> I'd have to think about that.  The other PEP I'm writing for
>>> per-interpreter GIL doesn't require immortal objects.  They just
>>> simplify a number of things.  That's my motivation for writing this
>>> PEP, in fact. :)
>>
>> Please think about it.
>> If you removed the benefits for per-interpreter GIL, the motivation
>> section would be reduced to memory savings for fork/CoW. (And lots of
>> performance improvements that are great in theory but sum up to a 4% loss.)
>
> 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-19 Thread Inada Naoki
Hi,

I hope per-interpreter GIL succeeds at some point, and I know this is
needed for per-interpreter GIL.

But I worry that per-interpreter GIL may be too complex to
implement and maintain for core developers and extension writers.
As you know, immortal doesn't mean shareable between interpreters. It is
too difficult to know which objects can be shared, and where the
shareable objects are leaked to other interpreters.
So I am not sure that per-interpreter GIL is an achievable goal.

So I think it's too early to introduce immortal objects in Python
3.11, unless they *improve* performance without per-interpreter GIL.
Instead, we can add a configuration option such as
`--enable-experimental-immortal`.


On Sat, Feb 19, 2022 at 4:52 PM Eric Snow  wrote:
>
> Reducing CPU Cache Invalidation
> ---
>
> Avoiding Data Races
> ---
>

Both benefits require a per-interpreter GIL.

>
> Avoiding Copy-on-Write
> --
>
> For some applications it makes sense to get the application into
> a desired initial state and then fork the process for each worker.
> This can result in a large performance improvement, especially
> in memory usage.  Several enterprise Python users (e.g. Instagram,
> YouTube) have taken advantage of this.  However, the above
> refcount semantics drastically reduce the benefits and
> have led to some sub-optimal workarounds.
>

As I wrote before, fork is very difficult to use safely. We cannot
recommend it to many users.
And I don't think reducing the size of a patch at Instagram or YouTube
is a good rationale for this kind of change.


> Also note that "fork" isn't the only operating system mechanism
> that uses copy-on-write semantics.  Anything that uses ``mmap``
> relies on copy-on-write, including sharing data from shared object
> files between processes.
>

It is very difficult to reduce CoW with mmap(MAP_PRIVATE).

You may need to write the hash of bytes and unicode objects. You may
need to write `tp_type`.
Immortal objects can "reduce" the memory writes. But "at least one
memory write" is enough to trigger the CoW.


> Accidental Immortality
> --
>
> While it isn't impossible, this accidental scenario is so unlikely
> that we need not worry.  Even if done deliberately by using
> ``Py_INCREF()`` in a tight loop and each iteration only took 1 CPU
> cycle, it would take 2^61 cycles (on a 64-bit processor).  At a fast
> 5 GHz that would still take nearly 500,000,000 seconds (over 5,000 days)!
> If that CPU were 32-bit then it is (technically) more possible though
> still highly unlikely.
>

Technically, `[obj] * (2**(32-4))` is a 1GB array on 32-bit.


>
> Constraints
> ---
>
> * ensure that otherwise immutable objects can be truly immutable
> * be careful when immortalizing objects that are not otherwise immutable

I am not sure about what this means.
For example, unicode objects are not immutable because they have a hash,
a utf8 cache and a wchar_t cache. (The wchar_t cache will be removed in
Python 3.12).


>
> Object Cleanup
> --
>
> In order to clean up all immortal objects during runtime finalization,
> we must keep track of them.
>

I don't think we need to clean up all immortal objects.

Of course, we should take care of objects that are immortal by default.
But for user-marked immortal objects, it's very difficult to guarantee
that __del__ or a weakref callback is called safely.

Additionally, if they are marked immortal to avoid CoW, cleanup causes CoW.

Regards,
-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7FCNNQOTIUZTBFZUPYRDSLND6WCVM3JO/