[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-14 Thread Petr Viktorin

On 12. 03. 22 2:45, Eric Snow wrote:

responses inline


I'll snip some discussion for a reason I'll get to later, and get right 
to the third alternative:



[...]

"Special-casing immortal objects in tp_dealloc() for the relevant types
(but not int, due to frequency?)" sounds promising.

The "relevant types" are those for which we skip calling incref/decref
entirely, like in Py_RETURN_NONE. This skipping is one of the optional
optimizations, so we're entirely in control of if/when to apply it.


We would definitely do it for those types.  NoneType and bool already
have a tp_dealloc that calls Py_FatalError() if triggered.  The
tp_dealloc for str & tuple have special casing for some singletons
that do likewise.  In PyType_Type.tp_dealloc we have a similar assert
for static types.  In each case we would instead reset the refcount to
the initial immortal value.  Regardless, in practice we may only need
to worry (as noted above) about the problem for the most commonly used
global objects, so perhaps we could stop there.

However, it depends on what the level of risk is, such that it would
warrant incurring additional potential performance/maintenance costs.
What is the likelihood of actual crashes due to pathological
de-immortalization in older stable ABI extensions?  I don't have a
clear answer to offer on that but I'd only expect it to be a problem
if such extensions are used heavily in (very) long-running processes.


How much would it slow things back down if it wasn't done for ints at all?


I'll look into that.  We're talking about the ~260 small ints, so it
depends on how much they are used relative to all the other int
objects that are used in a program.


Not only that -- as far as I understand, it's only cases where we know 
at compile time that a small int is being returned. AFAIK, that would be 
fast branches of arithmetic code, but not much else.


If not optimizing small ints is OK performance-wise, then everything 
looks good: we say that the “skip incref/decref” optimization can only 
be done for types whose instances are *all* immortal, leave it to future 
discussions to relax the requirement, and PEP 683 is good to go!


With that in mind, I snipped your discussion of the previous alternative. 
Going with this one wouldn't prevent us from doing something more clever 
in the future.




Some more reasoning for not worrying about de-immortalizing in types
without this optimization:
These objects will be de-immortalized with refcount around 2^29, and
then incref/decref go back to being paired properly. If 2^29 is much
higher than the true reference count at de-immortalization, this'll just
cause a memory leak at shutdown.
And it's probably OK to assume that the true reference count of an
object can't be anywhere near 2^29: most of the time, to hold a
reference you also need to have a pointer to the referenced object, and
there ain't enough memory for that many pointers. This isn't a formally
sound assumption, of course -- you can incref a million times with a
single pointer if you pair the decrefs correctly. But it might be why we
had no issues with "int won't overflow", an assumption which would fail
with just 4× higher numbers.
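A back-of-the-envelope check of the "one pointer per reference" argument above may help. This is only a sketch under stated assumptions: a 32-bit build with 4-byte pointers, and reading "4× higher numbers" as the point where a signed 32-bit counter would overflow. The constant names are illustrative and are not taken from CPython.

```python
# Assumption: 32-bit build, so pointers are 4 bytes and the address
# space is 2**32 bytes.
POINTER_SIZE = 4
ADDRESS_SPACE = 2**32
INT32_MAX = 2**31 - 1

refs = 2**29

# Memory needed to hold one pointer per reference: 2**31 bytes (2 GiB).
bytes_for_pointers = refs * POINTER_SIZE

# That is already half of the whole 32-bit address space.
share_of_address_space = bytes_for_pointers / ADDRESS_SPACE

# "4x higher numbers": 2**29 * 4 == 2**31, one more than INT32_MAX,
# so a signed 32-bit refcount could no longer represent the value.
overflows_int32 = refs * 4 > INT32_MAX
```

So holding 2^29 distinct references with one pointer each would take half the address space for the pointers alone, which is why the true count is assumed to stay far below that.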


Yeah, if we're dealing with properly paired incref/decref then the
worry about crashing after de-immortalization is mostly gone.  The
problem is where in the runtime we would simply not call Py_INCREF()
on certain objects because we know they are immortal.  For instance,
Py_RETURN_NONE (outside the older stable ABI) would no longer incref,
while the problematic stable ABI extension would keep actually
decref'ing until we crash.

Again, I'm not sure what the likelihood of this case is.  It seems
very unlikely to me.
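The unpaired case described above can be modelled with a toy simulation. This is a sketch, not CPython code: `FakeObject`, `runtime_return_none` and `old_extension_use` are hypothetical names, and the starting refcount is scaled down so the loop finishes quickly.

```python
class FakeObject:
    """Stand-in for an object header that only tracks a refcount."""

    def __init__(self, refcount):
        self.refcount = refcount
        self.deallocated = False


def runtime_return_none(obj, skip_incref):
    # Models Py_RETURN_NONE: a runtime that knows the object is
    # immortal may skip the incref entirely.
    if not skip_incref:
        obj.refcount += 1
    return obj


def old_extension_use(obj):
    # Models an older stable ABI extension with the old Py_DECREF
    # compiled in: it always decrements and checks for zero.
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.deallocated = True


def calls_until_dealloc(start, skip_incref, limit):
    """Return how many call/release rounds it takes to hit zero, or None."""
    obj = FakeObject(start)
    for n in range(1, limit + 1):
        old_extension_use(runtime_return_none(obj, skip_incref))
        if obj.deallocated:
            return n
    return None
```

With paired incref/decref the count never moves, so `calls_until_dealloc(start, False, limit)` returns `None`. With the incref skipped, the count reaches zero after exactly `start` rounds, which is the crash scenario for long-running processes.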


Of course, this argument would apply to immortalization and 64-bit
builds as well. I wonder if there are holes in it :)


With the numbers involved on 64-bit the problem is super unlikely due
to the massive numbers we're talking about, so we don't need to worry.
Or perhaps I misunderstood your point?


That's true. However, as we're adjusting incref/decref documentation for 
this PEP anyway, it looks like we could add “you should keep a pointer 
around for each reference you hold”, and go from  “super unlikely” to 
“impossible in well-behaved code” :)



Oh, and if the "Special-casing immortal objects in tp_dealloc()" way is
valid, refcount values 1 and 0 can no longer be treated specially.
That's probably not a practical issue for the relevant types, but it's
one more thing to think about when applying the optimization.


Given the low chance of the pathological case, the nature of the
conditions where it might happen, and the specificity of 0 and 1
amongst all the possible values, I wouldn't consider this a problem.


+1. But it's worth mentioning that it's not a problem.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-11 Thread Eric Snow
responses inline

-eric

On Wed, Mar 9, 2022 at 8:23 AM Petr Viktorin  wrote:
> "periodically reset the refcount for immortal objects (only enable this
> if a stable ABI extension is imported?)" -- that sounds quite expensive,
> both at runtime and maintenance-wise.

Are you talking just about "(only enable this if a stable ABI
extension is imported?)"?  Such a check could certainly be expensive
but it doesn't have to be.  However, I'm guessing that you are
actually talking about the mechanism to periodically reset the
refcount.

The actual periodic reset doesn't seem like it needs to be all that
expensive overall.  It would just need to be in a place that gets
triggered often enough, but not too often such that the extra cost of
resetting the refcount would be a problem.

One important factor is whether we need to worry about potential
de-immortalization for all immortal objects or only for a specific
subset, like the most commonly used objects (at least most commonly
used by the problematic older stable ABI extensions).  Mostly, we only
need to be concerned with the objects that are likely to trigger
de-immortalization in those extensions.  Realistically, there aren't
many potential immortal objects that would be exposed to the
de-immortalization problem (e.g. None, True, False), so we could limit
this workaround to them.

A variety of options come to mind.  In each case we would reset the
refcount of a given object if it is immortal.  (We would also only do
so if the refcount actually changed--to avoid cache invalidation and
copy-on-write.)
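The "only write if the refcount actually changed" rule could look roughly like the sketch below. It assumes the set of immortal objects is known some other way (here an `immortal` flag on a toy object), and `IMMORTAL_REFCOUNT` is a made-up value, not the one the PEP uses.

```python
IMMORTAL_REFCOUNT = 2**30  # hypothetical initial immortal value


class FakeObject:
    def __init__(self, refcount, immortal):
        self.refcount = refcount
        self.immortal = immortal
        self.writes = 0  # counts stores to the refcount field


def reset_if_drifted(obj):
    """Restore the immortal refcount, writing only when it has changed."""
    if not obj.immortal:
        return False
    if obj.refcount == IMMORTAL_REFCOUNT:
        # No store happens, so the memory page stays clean:
        # no cache invalidation and no copy-on-write.
        return False
    obj.refcount = IMMORTAL_REFCOUNT
    obj.writes += 1
    return True
```

The read-before-write check is the whole point: an unconditional store would dirty the page on every pass even when nothing had changed.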

If we need to worry about *all* immortal objects then I see several options:

1. in a single place where stable ABI extensions are likely to pass
all objects often enough
2. in a single place where all objects pass through often enough

On the other hand, if we only need to worry about a fixed set of
objects, the following options come to mind:

1. in a single place that is likely to be called by older stable ABI extensions
2. in a place that runs often enough, targeting a hard-coded group of
immortal objects (common static globals like None)
   * perhaps in the eval breaker code, in exception handling, etc.
3. like (2) but rotate through subsets of the hard-coded group (to
reduce the overall cost)
4. like (2), but spread out in type-specific code (e.g. static
types could be reset in type_dealloc())

Again, none of those should be in code that runs often enough that the
overhead would add up.
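Option 3 from the second list (rotating through subsets of a hard-coded group) might be sketched as follows. Everything here is illustrative: the class name, the dict of refcounts standing in for real objects, and the batch size are assumptions, not part of the proposal.

```python
class RotatingResetter:
    """Reset a fixed group of refcounts, a few per call, in round-robin order."""

    def __init__(self, refcounts, batch_size, immortal_value):
        self.refcounts = refcounts          # name -> current refcount
        self.names = list(refcounts)        # the hard-coded group
        self.batch_size = batch_size
        self.immortal_value = immortal_value
        self.position = 0

    def tick(self):
        """Visit the next batch; return the names whose refcount was reset."""
        touched = []
        for _ in range(min(self.batch_size, len(self.names))):
            name = self.names[self.position]
            self.position = (self.position + 1) % len(self.names)
            # Only write when the value has drifted.
            if self.refcounts[name] != self.immortal_value:
                self.refcounts[name] = self.immortal_value
                touched.append(name)
        return touched
```

Each `tick()` stands for one visit from wherever the hook ends up living (the eval breaker, exception handling, and so on), so the per-visit cost is bounded by `batch_size` rather than by the size of the group.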

> "provide a runtime flag for disabling immortality" also doesn't sound
> workable to me. We'd essentially need to run all tests twice every time
> to make sure it stays working.

Yeah, that makes it not worth it.

> "Special-casing immortal objects in tp_dealloc() for the relevant types
> (but not int, due to frequency?)" sounds promising.
>
> The "relevant types" are those for which we skip calling incref/decref
> entirely, like in Py_RETURN_NONE. This skipping is one of the optional
> optimizations, so we're entirely in control of if/when to apply it.

We would definitely do it for those types.  NoneType and bool already
have a tp_dealloc that calls Py_FatalError() if triggered.  The
tp_dealloc for str & tuple have special casing for some singletons
that do likewise.  In PyType_Type.tp_dealloc we have a similar assert
for static types.  In each case we would instead reset the refcount to
the initial immortal value.  Regardless, in practice we may only need
to worry (as noted above) about the problem for the most commonly used
global objects, so perhaps we could stop there.
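As a rough model of the change described above, the sketch below contrasts a dealloc hook that aborts (standing in for Py_FatalError()) with one that resets the refcount. The names and the `IMMORTAL_REFCOUNT` value are hypothetical, and a Python exception is used only as a stand-in for a fatal error.

```python
IMMORTAL_REFCOUNT = 2**30  # hypothetical initial immortal value


class FakeSingleton:
    def __init__(self, refcount):
        self.refcount = refcount


def dealloc_current(obj):
    # Today's behaviour for e.g. None: deallocation is treated as fatal.
    raise SystemError("deallocating an immortal singleton")


def dealloc_proposed(obj):
    # Proposed behaviour: quietly restore the immortal refcount.
    obj.refcount = IMMORTAL_REFCOUNT


def decref(obj, dealloc):
    """Old-style decref: decrement, then call dealloc when zero is reached."""
    obj.refcount -= 1
    if obj.refcount == 0:
        dealloc(obj)
```

With `dealloc_proposed`, an object that an old extension has driven down to zero simply becomes immortal again instead of crashing the process.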

However, it depends on what the level of risk is, such that it would
warrant incurring additional potential performance/maintenance costs.
What is the likelihood of actual crashes due to pathological
de-immortalization in older stable ABI extensions?  I don't have a
clear answer to offer on that but I'd only expect it to be a problem
if such extensions are used heavily in (very) long-running processes.

> How much would it slow things back down if it wasn't done for ints at all?

I'll look into that.  We're talking about the ~260 small ints, so it
depends on how much they are used relative to all the other int
objects that are used in a program.
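One crude way to get a feel for that ratio is to measure, for a sample of values, what share of the ints fall in the cached range. The sketch assumes CPython's small-int cache covers -5 through 256 (262 values, in line with the "~260" above); the helper name is made up.

```python
# Assumption: CPython's cached small ints are -5..256 inclusive.
SMALL_INTS = range(-5, 257)


def small_int_fraction(values):
    """Fraction of the int objects in `values` that are cached small ints."""
    ints = [v for v in values if type(v) is int]  # excludes bool
    if not ints:
        return 0.0
    return sum(v in SMALL_INTS for v in ints) / len(ints)
```

A real measurement would have to count int usage inside the interpreter rather than in a Python list, so this only illustrates what "relative to all the other int objects" means.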

> Some more reasoning for not worrying about de-immortalizing in types
> without this optimization:
> These objects will be de-immortalized with refcount around 2^29, and
> then incref/decref go back to being paired properly. If 2^29 is much
> higher than the true reference count at de-immortalization, this'll just
> cause a memory leak at shutdown.
> And it's probably OK to assume that the true reference count of an
> object can't be anywhere near 2^29: most of the time, to hold a
> reference you also need to have a pointer to the referenced object, and
> there ain't enough memory for that many pointers. This isn't a formally
> sound assumption, of course -- you can incref a million times with a
> single pointer if you pair the decrefs correctly. But it 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-10 Thread Petr Viktorin

On 10. 03. 22 3:35, Jim J. Jewett wrote:

"periodically reset the refcount for immortal objects (only enable this
if a stable ABI extension is imported?)" -- that sounds quite expensive,
both at runtime and maintenance-wise.


As I understand it, the plan is to represent an immortal object by setting two 
high-order bits to 1.  The higher bit is the actual test, and the one 
representing half of that is a safety margin.

When reducing the reference count, CPython already checks whether the refcount's 
new value is 0.  It could instead check whether refcount & (not !immortal_bit) 
is 0, which would detect when the safety margin has been reduced to 0 -- and could 
then add it back in.  Since the bit manipulation is not conditional, the only extra 
branch will occur when an object is about to be de-allocated, and that might be 
rare enough to be an acceptable cost.  (It still doesn't prevent rollover from too 
many increfs,  but ... that should indeed be rare in the wild.)


The problem is that Py_DECREF is a macro, and its old behavior is 
compiled into some extensions. We can't change this without breaking the 
stable ABI.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FPKK46N4RBAEJUU5PCQZ3SIZ7GREETTH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-09 Thread Jim J. Jewett
> "periodically reset the refcount for immortal objects (only enable this
> if a stable ABI extension is imported?)" -- that sounds quite expensive, 
> both at runtime and maintenance-wise.

As I understand it, the plan is to represent an immortal object by setting two 
high-order bits to 1.  The higher bit is the actual test, and the one 
representing half of that is a safety margin.

When reducing the reference count, CPython already checks whether the 
refcount's new value is 0.  It could instead check whether refcount & (not 
!immortal_bit) is 0, which would detect when the safety margin has been reduced 
to 0 -- and could then add it back in.  Since the bit manipulation is not 
conditional, the only extra branch will occur when an object is about to be 
de-allocated, and that might be rare enough to be an acceptable cost.  (It 
still doesn't prevent rollover from too many increfs,  but ... that should 
indeed be rare in the wild.)
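The check sketched above can be modelled like this. It reads "refcount & (not !immortal_bit)" as "the refcount with the immortal bit masked off", which is an interpretation; the bit positions are assumptions too. As pointed out elsewhere in the thread, Py_DECREF is a macro already compiled into existing extensions, so this could only apply to newly built code.

```python
IMMORTAL_BIT = 1 << 30    # hypothetical "actual test" bit
SAFETY_MARGIN = 1 << 29   # hypothetical bit worth half of that
IMMORTAL_INITIAL = IMMORTAL_BIT | SAFETY_MARGIN


def decref(refcount):
    """Return (new_refcount, should_dealloc) for one decrement."""
    refcount -= 1
    if (refcount & ~IMMORTAL_BIT) == 0:
        # Reached only when a mortal object hits zero, or when an
        # immortal object has used up its whole safety margin.
        if refcount & IMMORTAL_BIT:
            return refcount | SAFETY_MARGIN, False  # add the margin back
        return 0, True
    return refcount, False
```

The common path (a mortal object with other references left) takes no extra branch; the extra test only runs on the path that would otherwise deallocate.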

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O324Q4KMMXL2UHOQIZZWS52U7YHJGYEI/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-09 Thread Petr Viktorin

On 09. 03. 22 4:58, Eric Snow wrote:

On Mon, Feb 28, 2022 at 6:01 PM Eric Snow  wrote:

The updated PEP text is included below.  The largest changes involve
either the focus of the PEP (internal mechanism to mark objects
immortal) or the possible ways that things can break on older 32-bit
stable ABI extensions.  All other changes are smaller.


In particular, I'm hoping to get your thoughts on the "Accidental
De-Immortalizing" section.  While I'm confident we will find a good
solution, I'm not yet confident about the specific solution.  So
feedback would be appreciated.  Thanks!


Hi,
I like the newest version, except this one section is concerning.


"periodically reset the refcount for immortal objects (only enable this 
if a stable ABI extension is imported?)" -- that sounds quite expensive, 
both at runtime and maintenance-wise.


"provide a runtime flag for disabling immortality" also doesn't sound 
workable to me. We'd essentially need to run all tests twice every time 
to make sure it stays working.



"Special-casing immortal objects in tp_dealloc() for the relevant types 
(but not int, due to frequency?)" sounds promising.


The "relevant types" are those for which we skip calling incref/decref 
entirely, like in Py_RETURN_NONE. This skipping is one of the optional 
optimizations, so we're entirely in control of if/when to apply it. How 
much would it slow things back down if it wasn't done for ints at all?




Some more reasoning for not worrying about de-immortalizing in types 
without this optimization:
These objects will be de-immortalized with refcount around 2^29, and 
then incref/decref go back to being paired properly. If 2^29 is much 
higher than the true reference count at de-immortalization, this'll just 
cause a memory leak at shutdown.
And it's probably OK to assume that the true reference count of an 
object can't be anywhere near 2^29: most of the time, to hold a 
reference you also need to have a pointer to the referenced object, and 
there ain't enough memory for that many pointers. This isn't a formally 
sound assumption, of course -- you can incref a million times with a 
single pointer if you pair the decrefs correctly. But it might be why we 
had no issues with "int won't overflow", an assumption which would fail 
with just 4× higher numbers.


Of course, this argument would apply to immortalization and 64-bit 
builds as well. I wonder if there are holes in it :)


Oh, and if the "Special-casing immortal objects in tp_dealloc()" way is 
valid, refcount values 1 and 0 can no longer be treated specially. 
That's probably not a practical issue for the relevant types, but it's 
one more thing to think about when applying the optimization.



There's also the other direction to consider: if an old stable-ABI 
extension does unpaired *increfs* on an immortal object, it'll 
eventually overflow the refcount.
When the refcount is negative, decref will currently crash if built with 
Py_DEBUG, and I think we want to keep that check/crash. (Note that 
either Python itself or any extension could be built with Py_DEBUG.)
Hopefully we can live with that, and hope anyone running with Py_DEBUG 
will send a useful use case report.

Or is there another bit before the sign this'll mess up?
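To put numbers on the overflow direction: the sketch below assumes a signed 32-bit refcount that starts at 2^29 and wraps in two's-complement fashion. Signed overflow is formally undefined behaviour in C, so the wraparound is what typically happens in practice, not a guarantee; the starting value is also an assumption.

```python
def as_int32(value):
    """Wrap an integer into the signed 32-bit range (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


START = 2**29  # hypothetical immortal refcount on a 32-bit build

# Number of unpaired increfs before the counter would turn negative.
increfs_to_go_negative = 2**31 - START

last_positive = as_int32(START + increfs_to_go_negative - 1)
first_negative = as_int32(START + increfs_to_go_negative)
```

Under these assumptions it takes about 1.6 billion unpaired increfs to flip the sign, at which point a Py_DEBUG build would hit the negative-refcount check on the next decref.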
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7ZSLUMOIOV676UH42LIWGQASFMXBWSBN/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-08 Thread Eric Snow
On Mon, Feb 28, 2022 at 6:01 PM Eric Snow  wrote:
> The updated PEP text is included below.  The largest changes involve
> either the focus of the PEP (internal mechanism to mark objects
> immortal) or the possible ways that things can break on older 32-bit
> stable ABI extensions.  All other changes are smaller.

In particular, I'm hoping to get your thoughts on the "Accidental
De-Immortalizing" section.  While I'm confident we will find a good
solution, I'm not yet confident about the specific solution.  So
feedback would be appreciated.  Thanks!

-eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2NNPKXRL6HY7IYUDMEQ6DS5RC3AYQKYQ/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-28 Thread Eric Snow
On Wed, Feb 23, 2022 at 4:21 PM Antonio Cuni  wrote:
> When refcheck=True (the default), numpy raises an error if you try to resize 
> an array inplace whose refcnt > 2 (although I don't understand why > 2 and 
> not > 1, and the docs aren't very clear about this).
>
> That said, relying on the exact value of the refcnt is very bad for 
> alternative implementations and for HPy, and in particular it is impossible 
> to implement ndarray.resize(refcheck=True) correctly on PyPy. So from this 
> point of view, a wording which explicitly restricts the "legal" usage of the 
> refcnt details would be very welcome.

Thanks for the feedback and example.  It helps.

-eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/D23Z3C7CQIIGALDRSU4RDDM7GVUAASGW/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-28 Thread Eric Snow
On Wed, Feb 23, 2022 at 9:16 AM Petr Viktorin  wrote:
>>> But tp_dict is also public C-API. How will that be handled?
>>> Perhaps naively, I thought static types' dicts could be treated as
>>> (deeply) immutable, and shared?
>>
>> They are immutable from Python code but not from C (due to tp_dict).
>> Basically, we will document that tp_dict should not be used directly
>> (in the public API) and refer users to a public getter function.  I'll
>> note this in the PEP.
>
> What worries me is that existing users of the API haven't read the new
> documentation. What will happen if users do use it?
> Or worse, add things to it?

We will probably set it to NULL, so the user code would fail or crash.
I suppose we could set it to a dummy object that emits helpful errors.

However, I don't think that is worth it.  We're talking about where
users are directly accessing tp_dict of the builtin static types, not
their own.  That is already something they should definitely not be
doing.

> (Hm, the current docs are already rather confusing -- 3.2 added a note
> that "It is not safe to ... modify tp_dict with the dictionary C-API.",
> but above that it says "extra attributes for the type may be added to
> this dictionary [in some cases]")

Yeah, the docs will have to be clarified.

>> Having thought about it some more, I don't think this PEP should be
>> strictly bound to per-interpreter GIL.  That is certainly my personal
>> motivation.  However, we have a small set of users that would benefit
>> significantly, the change is relatively small and simple, and the risk
>> of breaking users is also small.
>
> Right, with the recent performance improvements it's looking like it
> might stand on its own after all.

Great!

>> Honestly, it might not have needed a PEP in the first place if I
>> had been a bit more clear about the idea earlier.
>
> Maybe it's good to have a PEP to clear that up :)

Yeah, the PEP process has been helpful for that. :)

-eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AKFMFZ45UJXED24YRB4NHQ4HT442XVSP/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-28 Thread Eric Snow
Responses inline below.

-eric

On Tue, Feb 22, 2022 at 7:22 PM Inada Naoki  wrote:
> > For a recent example, see
> > https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/.
>
> It is not proven example, but just a hope at the moment. So option is
> fine to prove the idea.
>
> Although I can not read the code, they said "patching ASLR by patching
> `ob_type` fields;".
> It will cause CoW for most objects, isn't it?
>
> So reducing memory write don't directly means reducing CoW.
> Unless we can stop writing on a page completely, the page will be copied.

Yeah, they would have to address that.
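The point about pages can be shown with a small calculation. This is a sketch assuming 4096-byte pages and a made-up layout of 100 objects spaced 64 bytes apart; `dirty_pages` is a hypothetical helper.

```python
PAGE_SIZE = 4096  # bytes; a common page size, assumed here


def dirty_pages(write_addresses):
    """Return the set of page numbers touched by the given writes."""
    return {addr // PAGE_SIZE for addr in write_addresses}


# 100 objects, 64 bytes apart: they span two pages.
object_addresses = list(range(0, 6400, 64))

# Writing every object's header dirties both pages.
pages_all_writes = dirty_pages(object_addresses)

# Writing just one object per page dirties the same two pages.
pages_few_writes = dirty_pages([0, 4096])
```

Going from 100 writes to 2 leaves the number of copied pages unchanged, which is why fewer writes do not translate directly into less copy-on-write.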

> > CPU cache invalidation exists regardless.  With the current GIL the
> > effect is reduced significantly.
>
> It's an interesting point. We can not see the benefit from
> pyperformance, because it doesn't use much data and it runs one process
> at a time.
> So the pyperformance can not make enough stress to the last level
> cache which is shared by many cores.
>
> We need multiprocess performance benchmark apart from pyperformance,
> to stress the last level cache from multiple cores.
> It helps not only this PEP, but also optimizing containers like dict and set.

+1

> Can proposed optimizations to eliminate the penalty guarantee that
> every __del__, weakref are not broken,
> and no memory leak occurs when the Python interpreter is initialized
> and finalized multiple times?
> I haven't confirmed it yet.

They will not break __del__ or weakrefs.  No memory will leak after
finalization.  If any of that happens then it is a bug.

> FWIW, I filed an issue to remove hash cache from bytes objects.
> https://github.com/faster-cpython/ideas/issues/290
>
> Code objects have many bytes objects, (e.g. co_code, co_linetable, etc...)
> Removing it will save some RAM usage and make immortal bytes truly
> immutable, safe to be shared between interpreters.

+1  Thanks!
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QKMPALMWGF5366C6PQRSIIFVNXKF4UAM/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-23 Thread Sebastian Berg
On Thu, 2022-02-24 at 00:21 +0100, Antonio Cuni wrote:
> On Mon, Feb 21, 2022 at 5:18 PM Petr Viktorin wrote:
>
> > Should we care about hacks/optimizations that rely on having the only
> > reference (or all references), e.g. mutating a tuple if it has refcount
> > 1? Immortal objects shouldn't break them (the special case simply won't
> > apply), but this wording would make them illegal.
> > AFAIK CPython uses this internally, but I don't know how
> > prevalent/useful it is in third-party code.
> > 
> 
> FWIW, a real world example of this is numpy.ndarray.resize(...,
> refcheck=True):
> https://numpy.org/doc/stable/reference/generated/numpy.ndarray.resize.html#numpy.ndarray.resize
> https://github.com/numpy/numpy/blob/main/numpy/core/src/multiarray/shape.c#L114
> 
> When refcheck=True (the default), numpy raises an error if you try to
> resize an array inplace whose refcnt > 2 (although I don't understand
> why > 2 and not > 1, and the docs aren't very clear about this).
> 
> That said, relying on the exact value of the refcnt is very bad for
> alternative implementations and for HPy, and in particular it is
> impossible to implement ndarray.resize(refcheck=True) correctly on PyPy.
> So from this point of view, a wording which explicitly restricts the
> "legal" usage of the refcnt details would be very welcome.

Yeah, NumPy resizing is a bit of an awkward point, I would be on-board
for just replacing resize for non

NumPy does also have a bit of magic akin to the "string concat" trick
for operations like:

a + b + c

where it will try do magic and use the knowledge that it can
mutate/reuse the temporary array, effectively doing:

tmp = a + b
tmp += c

(which requires some stack walking magic additionally to the refcount!)
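The saving from that rewrite can be seen with a toy vector type that counts buffer allocations. This does not show how NumPy detects the temporary (that is the refcount and stack-walking part); `Vec` is a made-up class that only shows the two forms give the same result with one allocation fewer.

```python
class Vec:
    allocations = 0  # how many result buffers have been created

    def __init__(self, data):
        self.data = list(data)
        Vec.allocations += 1

    def __add__(self, other):
        # Always allocates a fresh result.
        return Vec(x + y for x, y in zip(self.data, other.data))

    def __iadd__(self, other):
        # Reuses the existing buffer.
        for i, y in enumerate(other.data):
            self.data[i] += y
        return self


a, b, c = Vec([1, 2]), Vec([3, 4]), Vec([5, 6])

Vec.allocations = 0
naive = a + b + c            # two temporaries/results allocated
naive_allocations = Vec.allocations

Vec.allocations = 0
tmp = a + b                  # one allocation
tmp += c                     # reuses tmp's buffer
reuse_allocations = Vec.allocations
```

The in-place form is only safe when nothing else can observe `tmp`, which is exactly what the refcount check is used to establish.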

Cheers,

Sebastian


___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HSCF5XPQMWRX45Y2PVNPVSCDT4GC6PTB/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-23 Thread Antonio Cuni
On Mon, Feb 21, 2022 at 5:18 PM Petr Viktorin  wrote:

> Should we care about hacks/optimizations that rely on having the only
> reference (or all references), e.g. mutating a tuple if it has refcount
> 1? Immortal objects shouldn't break them (the special case simply won't
> apply), but this wording would make them illegal.
> AFAIK CPython uses this internally, but I don't know how
> prevalent/useful it is in third-party code.
>

FWIW, a real world example of this is numpy.ndarray.resize(...,
refcheck=True):
https://numpy.org/doc/stable/reference/generated/numpy.ndarray.resize.html#numpy.ndarray.resize
https://github.com/numpy/numpy/blob/main/numpy/core/src/multiarray/shape.c#L114

When refcheck=True (the default), numpy raises an error if you try to
resize an array inplace whose refcnt > 2 (although I don't understand
why > 2 and not > 1, and the docs aren't very clear about this).

That said, relying on the exact value of the refcnt is very bad for
alternative implementations and for HPy, and in particular it is impossible
to implement ndarray.resize(refcheck=True) correctly on PyPy. So from this
point of view, a wording which explicitly restricts the "legal" usage of
the refcnt details would be very welcome.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ACJIER45M6XLKUWT6TCLB6QXVZSB74EH/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-23 Thread Brett Cannon
On Wed, Feb 23, 2022 at 8:19 AM Petr Viktorin  wrote:

> On 23. 02. 22 2:46, Eric Snow wrote:
>
>
[SNIP]


>
> > So it seems like the bar should be pretty low for this one (assuming
> > we get the performance penalty low enough).  If it were some massive
> > or broadly impactful (or even clearly public) change then I suppose
> > you could call the motivation weak.  However, this isn't that sort of
> > PEP.


Yes, but PEPs are not just about complexity, but also impact on users. And
"impact" covers backwards-compatibility which includes performance
regressions (i.e. making Python slower means it may no longer be viable
for someone with specific performance requirements). So with the initial 4%
performance regression it made sense to write a PEP.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Z4SXVNRLHFWRPLB4UQZQVW7SKDUJH6GY/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-23 Thread Petr Viktorin

On 23. 02. 22 2:46, Eric Snow wrote:

Thanks for the responses.  I've replied inline below.


Same here :)



Immortal Global Objects
---

All objects that we expect to be shared globally (between interpreters)
will be made immortal.  That includes the following:

* singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
* all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
* all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
small ints)

All such objects will be immutable.  In the case of the static types,
they will be effectively immutable.  ``PyTypeObject`` has some mutable
state (``tp_dict`` and ``tp_subclasses``), but we can work around this
by storing that state on ``PyInterpreterState`` instead of on the
respective static type object.  Then the ``__dict__``, etc. getter
will do a lookup on the current interpreter, if appropriate, instead
of using ``tp_dict``.
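A toy model of that workaround is sketched below. None of these names exist in CPython: `InterpreterState.static_type_dicts`, `StaticType` and `get_type_dict` are hypothetical stand-ins for per-interpreter storage and a getter that consults it.

```python
class InterpreterState:
    def __init__(self, name):
        self.name = name
        # Hypothetical per-interpreter storage for static types' dicts.
        self.static_type_dicts = {}


class StaticType:
    def __init__(self, name, builtin_attrs):
        self.name = name
        # Shared between interpreters; never mutated after creation.
        self._builtin_attrs = dict(builtin_attrs)


def get_type_dict(interp, static_type):
    """Return the type's dict as seen by one interpreter."""
    per_interp = interp.static_type_dicts
    if static_type.name not in per_interp:
        # Each interpreter gets its own copy on first access.
        per_interp[static_type.name] = dict(static_type._builtin_attrs)
    return per_interp[static_type.name]
```

Mutations made through one interpreter's view then stay in that interpreter, while the static type object itself is never written to and can be shared safely.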


But tp_dict is also public C-API. How will that be handled?
Perhaps naively, I thought static types' dicts could be treated as
(deeply) immutable, and shared?


They are immutable from Python code but not from C (due to tp_dict).
Basically, we will document that tp_dict should not be used directly
(in the public API) and refer users to a public getter function.  I'll
note this in the PEP.


What worries me is that existing users of the API haven't read the new 
documentation. What will happen if users do use it?

Or worse, add things to it?

(Hm, the current docs are already rather confusing -- 3.2 added a note 
that "It is not safe to ... modify tp_dict with the dictionary C-API.", 
but above that it says "extra attributes for the type may be added to 
this dictionary [in some cases]")



[...]

And from the other thread:

On 17. 02. 22 18:23, Eric Snow wrote:
  > On Thu, Feb 17, 2022 at 3:42 AM Petr Viktorin  wrote:
  >>>> Weren't you planning a PEP on subinterpreter GIL as well? Do you want to
  >>>> submit them together?
  >>>
  >>> I'd have to think about that.  The other PEP I'm writing for
  >>> per-interpreter GIL doesn't require immortal objects.  They just
  >>> simplify a number of things.  That's my motivation for writing this
  >>> PEP, in fact. :)
  >>
  >> Please think about it.
  >> If you removed the benefits for per-interpreter GIL, the motivation
  >> section would be reduced to memory savings for fork/CoW. (And lots of
  >> performance improvements that are great in theory but sum up to a 4%
  >> loss.)
  >
  > Sounds good.  Would this involve more than a note at the top of the PEP?

No, a note would work great. If you read the motivation carefully, it's
(IMO) clear that it's rather weak without the other PEP. But that
realization shouldn't come as a surprise to the reader.


Having thought about it some more, I don't think this PEP should be
strictly bound to per-interpreter GIL.  That is certainly my personal
motivation.  However, we have a small set of users that would benefit
significantly, the change is relatively small and simple, and the risk
of breaking users is also small.  In fact, we regularly have more
disruptive changes to internals that do not require a PEP.


Right, with the recent performance improvements it's looking like it 
might stand on its own after all.



So it seems like the bar should be pretty low for this one (assuming
we get the performance penalty low enough).  If it were some massive
or broadly impactful (or even clearly public) change then I suppose
you could call the motivation weak.  However, this isn't that sort of
PEP.  Honestly, it might not have needed a PEP in the first place if I
had been a bit more clear about the idea earlier.


Maybe it's good to have a PEP to clear that up :)
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZTON72YXUUFV5MX5KIEM3DDNAUAZT4M6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
On Tue, Feb 22, 2022, 20:26 Larry Hastings  wrote:

> Are these optimizations specifically for the PR, or are these
> optimizations we could apply without taking the immortal objects?  Kind of
> like how Sam tried to offset the nogil slowdown by adding optimizations
> that we went ahead and added anyway ;-)
>

Basically all the optimizations require immortal objects.

-eric
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7VJVBFBWE3HWTPRVZH3WLSR7EZHZD337/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Larry Hastings

On 2/22/22 6:00 PM, Eric Snow wrote:

On Sat, Feb 19, 2022 at 12:46 AM Eric Snow  wrote:

Performance
---

A naive implementation shows `a 4% slowdown`_.
Several promising mitigation strategies will be pursued in the effort
to bring it closer to performance-neutral.  See the `mitigation`_
section below.

FYI, Eddie has been able to get us back to performance-neutral after
applying several of the mitigation strategies we discussed. :)



Are these optimizations specifically for the PR, or are these 
optimizations we could apply without taking the immortal objects? Kind 
of like how Sam tried to offset the nogil slowdown by adding 
optimizations that we went ahead and added anyway ;-)



//arry/

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7GX4C3DQ23B2K5JXTOYQQPT2ZLJD7CP4/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-22 Thread Inada Naoki
On Wed, Feb 23, 2022 at 1:46 AM Eddie Elizondo via Python-Dev
 wrote:
>
>
> That article is five years old so it doesn't reflect the current state of the 
> system! We have continuous profiling and monitoring of Copy on Writes and 
> after introducing the techniques described in this PEP, we have largely fixed 
> the majority of scenarios where this happens.
>
> You are right that just addressing reference counting will not
> fix all CoW issues. The trick here is also to leverage the permanent GC
> generation used for the `gc.freeze` API. That is, if you have a container
> that is known to be immortal, it should be pushed into the permanent GC
> generation. This will guarantee that the GC itself will not change the GC
> headers of said instance.
>
> Thus, if you immortalize your heap before forking (using the techniques in: 
> https://github.com/python/cpython/pull/31489) then you'll end up removing the 
> vast majority of scenarios where CoW takes place. I can look into writing a 
> new technical article for Instagram with more up to date info but this might 
> take time to get through!
>
> Now, I said that we've largely fixed the CoW issue because there are still 
> places where it happens such as: free lists, the small object allocator, etc. 
> But these are relatively small compared to the ones coming from reference 
> counts and the GC head mutations.

The same technique doesn't guarantee the same benefit. Just as gc.freeze() is
needed before immortalizing to avoid CoW, some other tricks may be
needed too.
A new article is welcome, but I want a sample application we can run
and profile to measure the benefits.

Regards,

-- 
Inada Naoki  
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/AUF5R62E7YT22LL4DJ5HI3FCS3ZPHSTL/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Inada Naoki
On Wed, Feb 23, 2022 at 10:12 AM Eric Snow  wrote:
>
> Thanks for the feedback.  I've responded inline below.
>
> -eric
>
> On Sat, Feb 19, 2022 at 8:50 PM Inada Naoki  wrote:
> > I hope per-interpreter GIL succeeds at some point, and I know this is
> > needed for per-interpreter GIL.
> >
> > But I worry that per-interpreter GIL may be too complex to
> > implement and maintain for core developers and extension writers.
> > As you know, immortal doesn't mean shareable between interpreters. It is
> > too difficult to know which objects can be shared, and where the
> > shareable objects are leaked to other interpreters.
> > So I am not sure that per-interpreter GIL is an achievable goal.
>
> I plan on addressing this in the PEP I am working on for
> per-interpreter GIL.  In the meantime, I doubt the issue will impact
> any core devs.
>

It's nice to hear!


> > So I think it's too early to introduce immortal objects in Python
> > 3.11, unless they *improve* performance without per-interpreter GIL.
> > Instead, we can add a configuration option such as
> > `--enable-experimental-immortal`.
>
> I agree that immortal objects aren't quite as appealing in general
> without per-interpreter GIL.  However, there are actual users that
> will benefit from it, assuming we can reduce the performance penalty
> to acceptable levels.  For a recent example, see
> https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/.
>

It is not a proven example, just a hope at the moment. So an option is
fine for proving the idea.

Although I cannot read the code, they said "patching ASLR by patching
`ob_type` fields;".
That will cause CoW for most objects, won't it?

So reducing memory writes doesn't directly mean reducing CoW.
Unless we can stop writing to a page completely, the page will be copied.


> > On Sat, Feb 19, 2022 at 4:52 PM Eric Snow  
> > wrote:
> > >
> > > Reducing CPU Cache Invalidation
> > > ---
> > >
> > > Avoiding Data Races
> > > ---
> > >
> >
> > Both benefits require a per-interpreter GIL.
>
> CPU cache invalidation exists regardless.  With the current GIL the
> effect is reduced significantly.
>

It's an interesting point. We cannot see the benefit from
pyperformance, because it doesn't use much data and it runs one process
at a time.
So pyperformance cannot put enough stress on the last-level
cache, which is shared by many cores.

We need a multiprocess performance benchmark apart from pyperformance,
to stress the last-level cache from multiple cores.
It would help not only this PEP, but also optimizing containers like dict and set.


> >
> > As I wrote before, fork is very difficult to use safely. We cannot
> > recommend it to many users.
> > And I don't think reducing the size of the patch at Instagram or YouTube
> > is a good rationale for this kind of change.
>
> What do you mean by "this kind of change"?  The proposed change is
> relatively small.  It certainly isn't nearly as intrusive as many
> changes we make to internals without a PEP.  If you are talking about
> the performance penalty, we should be able to eliminate it.
>

Can the proposed optimizations for eliminating the penalty guarantee that
no __del__ or weakref is broken,
and that no memory leak occurs when the Python interpreter is initialized
and finalized multiple times?
I haven't confirmed it yet.


> > > Also note that "fork" isn't the only operating system mechanism
> > > that uses copy-on-write semantics.  Anything that uses ``mmap``
> > > relies on copy-on-write, including sharing data from shared object
> > > files between processes.
> > >
> >
> > It is very difficult to reduce CoW with mmap(MAP_PRIVATE).
> >
> > You may need to write the hash of bytes and unicode objects. You may need to
> > write `tp_type`.
> > Immortal objects can "reduce" the memory write. But "at least one
> > memory write" is enough to trigger the CoW.
>
> Correct.  However, without immortal objects (AKA immutable per-object
> runtime-state) it goes from "very difficult" to "basically
> impossible".
>

A configuration option won't make it impossible.


> > >
> > > Constraints
> > > ---
> > >
> > > * ensure that otherwise immutable objects can be truly immutable
> > > * be careful when immortalizing objects that are not otherwise immutable
> >
> > I am not sure about what this means.
> > For example, unicode objects are not immutable because they have hash,
> > utf8 cache and wchar_t cache. (wchar_t cache will be removed in Python
> > 3.12).
>
> I think you understood it correctly.  In the case of str objects, they
> are close enough since a race on any of those values will not cause a
> different outcome.
>
> I will clarify the point in the PEP.
>

FWIW, I filed an issue to remove the hash cache from bytes objects.
https://github.com/faster-cpython/ideas/issues/290

Code objects have many bytes objects, (e.g. co_code, co_linetable, etc...)
Removing it will save some RAM usage and make 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
On Sat, Feb 19, 2022 at 12:46 AM Eric Snow  wrote:
> Performance
> ---
>
> A naive implementation shows `a 4% slowdown`_.
> Several promising mitigation strategies will be pursued in the effort
> to bring it closer to performance-neutral.  See the `mitigation`_
> section below.

FYI, Eddie has been able to get us back to performance-neutral after
applying several of the mitigation strategies we discussed. :)

-eric
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZYGZEQSVBS6ODVAHPL3QN4CJ7JN4FYWO/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
On Mon, Feb 21, 2022 at 4:56 PM Terry Reedy  wrote:
> We could say that the only refcounts with any meaning are 0, 1, and > 1.

Yeah, that should work.

-eric
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7HZ7VBJQOYHXFV3ZD4V7DCMLBL4Q34WP/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
On Mon, Feb 21, 2022 at 10:56 AM  wrote:
> For what it's worth Cython does this for string concatenation to concatenate 
> in place if possible (this optimization was copied from CPython). It could be 
> disabled relatively easily if it became a problem (it's already CPython only 
> and version checked so it'd just need another upper-bound version check).

That's good to know.

-eric
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/OEZS4KGQJET5DL3M2OTB76I4W7F56FJC/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
Thanks for the responses.  I've replied inline below.

-eric

On Mon, Feb 21, 2022 at 9:11 AM Petr Viktorin  wrote:
>
> On 19. 02. 22 8:46, Eric Snow wrote:
> > Thanks to all those that provided feedback.  I've worked to
> > substantially update the PEP in response.  The text is included below.
> > Further feedback is appreciated.
>
> Thank you! This version is much clearer. I like the PEP more and more!

Great!

> I've sent a PR with some typo fixes:
> https://github.com/python/peps/pull/2348

Thank you.

> > Public Refcount Details
> [...]
> > As part of this proposal, we must make sure that users can clearly
> > understand on which parts of the refcount behavior they can rely and
> > which are considered implementation details.  Specifically, they should
> > use the existing public refcount-related API and the only refcount value
> > with any meaning is 0.  All other values are considered "not 0".
>
> Should we care about hacks/optimizations that rely on having the only
> reference (or all references), e.g. mutating a tuple if it has refcount
> 1? Immortal objects shouldn't break them (the special case simply won't
> apply), but this wording would make them illegal.
> AFAIK CPython uses this internally, but I don't know how
> prevalent/useful it is in third-party code.

Good point.  As Terry suggested, we could also let 1 have meaning.

Regardless, any documented restriction would only apply to users of
the public C-API, not to internal code.

> > _Py_IMMORTAL_REFCNT
> > ---
> >
> > We will add two internal constants::
> >
> >  #define _Py_IMMORTAL_BIT (1LL << (8 * sizeof(Py_ssize_t) - 4))
> >  #define _Py_IMMORTAL_REFCNT (_Py_IMMORTAL_BIT + (_Py_IMMORTAL_BIT / 2))
>
> As a nitpick: could you say this in prose?
>
> * ``_Py_IMMORTAL_BIT`` has the third top-most bit set.
> * ``_Py_IMMORTAL_REFCNT`` has the third and fourth top-most bits set.

Sure.
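
For anyone who wants to check the prose against the macros, here is a
minimal sketch that evaluates the two #defines in Python for a given
Py_ssize_t size. The helper name immortal_constants is made up for this
illustration and is not part of CPython. Not counting the sign bit, the
set bits land on the third and fourth positions from the top.

```python
def immortal_constants(ssize_t_bytes):
    """Evaluate the two #defines for a Py_ssize_t of the given size in bytes.

    Hypothetical helper for illustration only; it mirrors the macro
    arithmetic quoted above and nothing else.
    """
    immortal_bit = 1 << (8 * ssize_t_bytes - 4)
    immortal_refcnt = immortal_bit + immortal_bit // 2
    return immortal_bit, immortal_refcnt


for size in (8, 4):  # 64-bit and 32-bit Py_ssize_t
    bit, refcnt = immortal_constants(size)
    width = 8 * size
    # Print zero-padded binary so the bit positions are easy to read.
    print(f"{width}-bit BIT    = {bit:0{width}b}")
    print(f"{width}-bit REFCNT = {refcnt:0{width}b}")
```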

> > Immortal Global Objects
> > ---
> >
> > All objects that we expect to be shared globally (between interpreters)
> > will be made immortal.  That includes the following:
> >
> > * singletons (``None``, ``True``, ``False``, ``Ellipsis``, 
> > ``NotImplemented``)
> > * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
> > * all static objects in ``_PyRuntimeState.global_objects`` (e.g. 
> > identifiers,
> >small ints)
> >
> > All such objects will be immutable.  In the case of the static types,
> > they will be effectively immutable.  ``PyTypeObject`` has some mutable
> > state (``tp_dict`` and ``tp_subclasses``), but we can work around this
> > by storing that state on ``PyInterpreterState`` instead of on the
> > respective static type object.  Then the ``__dict__``, etc. getter
> > will do a lookup on the current interpreter, if appropriate, instead
> > of using ``tp_dict``.
>
> But tp_dict is also public C-API. How will that be handled?
> Perhaps naively, I thought static types' dicts could be treated as
> (deeply) immutable, and shared?

They are immutable from Python code but not from C (due to tp_dict).
Basically, we will document that tp_dict should not be used directly
(in the public API) and refer users to a public getter function.  I'll
note this in the PEP.

> Perhaps it would be best to leave it out here and say "The details
> of sharing ``PyTypeObject`` across interpreters are left to another PEP"?
> Even so, I'd love to know the plan.

What else would you like to know?  There isn't much to it.  For each
of the builtin static types we will keep the relevant mutable state on
PyInterpreterState and look it up there in the relevant getters (e.g.
__dict__ and __subclasses__).
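
As a rough picture of that lookup, here is a toy Python model. It is only
a sketch of the idea: the class names Interpreter and StaticType and their
methods are invented for this example and do not correspond to any real
CPython structure or API. The point is that the shared type object holds
no mutable state, while each interpreter owns its own copy.

```python
class Interpreter:
    """Stand-in for PyInterpreterState (hypothetical, illustration only)."""

    def __init__(self, name):
        self.name = name
        # Mutable state of static types lives here, keyed by type name.
        self.static_type_state = {}

    def state_for(self, type_name):
        # Create the per-interpreter state lazily on first access.
        return self.static_type_state.setdefault(
            type_name, {"dict": {}, "subclasses": []}
        )


class StaticType:
    """Stand-in for an immortal static type that stores nothing mutable."""

    def __init__(self, name):
        self.name = name

    def get_dict(self, interp):
        # The getter consults the given interpreter, not the type object.
        return interp.state_for(self.name)["dict"]

    def get_subclasses(self, interp):
        return interp.state_for(self.name)["subclasses"]


shared_type = StaticType("int")
main, sub = Interpreter("main"), Interpreter("sub")
shared_type.get_dict(main)["extra"] = 1
print(shared_type.get_dict(main), shared_type.get_dict(sub))
```

In this model a write through one interpreter's getter is invisible to the
other interpreter, which is the property the plan relies on.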

> (And even if these are internals,
> changes to them should be mentioned in What's New, for the sake of
> people who need to maintain old extensions.)

+1

> > Object Cleanup
> > --
> >
> > In order to clean up all immortal objects during runtime finalization,
> > we must keep track of them.
> >
> > For GC objects ("containers") we'll leverage the GC's permanent
> > generation by pushing all immortalized containers there.  During
> > runtime shutdown, the strategy will be to first let the runtime try
> > to do its best effort of deallocating these instances normally.  Most
> > of the module deallocation will now be handled by
> > ``pylifecycle.c:finalize_modules()`` which cleans up the remaining
> > modules as best as we can.  It will change which modules are available
> > during __del__ but that's already defined as undefined behavior by the
> > docs.  Optionally, we could do some topological ordering to guarantee
> > that user modules will be deallocated first before the stdlib modules.
> > Finally, anything leftover (if any) can be found through the permanent
> > generation gc list which we can clear after finalize_modules().
> >
> > For non-container objects, the tracking approach will vary on a
> > case-by-case basis.  In nearly every case, each such object is directly
> > accessible on the 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-22 Thread Eric Snow
Thanks for the feedback.  I've responded inline below.

-eric

On Sat, Feb 19, 2022 at 8:50 PM Inada Naoki  wrote:
> I hope per-interpreter GIL succeeds at some point, and I know this is
> needed for per-interpreter GIL.
>
> But I worry that per-interpreter GIL may be too complex to
> implement and maintain for core developers and extension writers.
> As you know, immortal doesn't mean shareable between interpreters. It is
> too difficult to know which objects can be shared, and where the
> shareable objects are leaked to other interpreters.
> So I am not sure that per-interpreter GIL is an achievable goal.

I plan on addressing this in the PEP I am working on for
per-interpreter GIL.  In the meantime, I doubt the issue will impact
any core devs.

> So I think it's too early to introduce immortal objects in Python
> 3.11, unless they *improve* performance without per-interpreter GIL.
> Instead, we can add a configuration option such as
> `--enable-experimental-immortal`.

I agree that immortal objects aren't quite as appealing in general
without per-interpreter GIL.  However, there are actual users that
will benefit from it, assuming we can reduce the performance penalty
to acceptable levels.  For a recent example, see
https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/.

> On Sat, Feb 19, 2022 at 4:52 PM Eric Snow  wrote:
> >
> > Reducing CPU Cache Invalidation
> > ---
> >
> > Avoiding Data Races
> > ---
> >
>
> Both benefits require a per-interpreter GIL.

CPU cache invalidation exists regardless.  With the current GIL the
effect is reduced significantly.

Per-interpreter GIL is only one situation where data races matter.
Any attempt to generally eliminate the GIL must deal with races on the
per-object runtime state.

> >
> > Avoiding Copy-on-Write
> > --
> >
> > For some applications it makes sense to get the application into
> > a desired initial state and then fork the process for each worker.
> > This can result in a large performance improvement, especially
> > in memory usage.  Several enterprise Python users (e.g. Instagram,
> > YouTube) have taken advantage of this.  However, the above
> > refcount semantics drastically reduce the benefits and
> > have led to some sub-optimal workarounds.
> >
>
> As I wrote before, fork is very difficult to use safely. We cannot
> recommend it to many users.
> And I don't think reducing the size of the patch at Instagram or YouTube
> is a good rationale for this kind of change.

What do you mean by "this kind of change"?  The proposed change is
relatively small.  It certainly isn't nearly as intrusive as many
changes we make to internals without a PEP.  If you are talking about
the performance penalty, we should be able to eliminate it.

> > Also note that "fork" isn't the only operating system mechanism
> > that uses copy-on-write semantics.  Anything that uses ``mmap``
> > relies on copy-on-write, including sharing data from shared object
> > files between processes.
> >
>
> It is very difficult to reduce CoW with mmap(MAP_PRIVATE).
>
> You may need to write the hash of bytes and unicode objects. You may need to
> write `tp_type`.
> Immortal objects can "reduce" the memory write. But "at least one
> memory write" is enough to trigger the CoW.

Correct.  However, without immortal objects (AKA immutable per-object
runtime-state) it goes from "very difficult" to "basically
impossible".
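
To make the "at least one memory write" point concrete, here is a small
sketch using Python's mmap module with ACCESS_COPY, which requests a
private copy-on-write mapping. The function name is made up for this
example. It writes a single byte through the mapping and then checks that
the file on disk still has its original content, so the write went to a
private copy. The operating system works on whole pages, so that one byte
is enough to privatize the page it sits on.

```python
import mmap
import os
import tempfile


def private_write_leaves_file_untouched():
    """Write one byte via a copy-on-write mapping; return (mapped, on_disk)."""
    fd, path = tempfile.mkstemp()
    try:
        # Fill exactly one page so the mapping covers real file content.
        os.write(fd, b"a" * mmap.PAGESIZE)
        # ACCESS_COPY: writes change memory but are never written to the file.
        with mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_COPY) as view:
            view[0:1] = b"b"          # a single one-byte write
            mapped_first = view[0:1]  # what this process now sees
        with open(path, "rb") as fh:
            file_first = fh.read(1)   # what the file still contains
        return mapped_first, file_first
    finally:
        os.close(fd)
        os.unlink(path)


print(private_write_leaves_file_untouched())
```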

> > Accidental Immortality
> > --
> >
> > While it isn't impossible, this accidental scenario is so unlikely
> > that we need not worry.  Even if done deliberately by using
> > ``Py_INCREF()`` in a tight loop and each iteration only took 1 CPU
> > cycle, it would take 2^61 cycles (on a 64-bit processor).  At a fast
> > 5 GHz that would still take nearly 500,000,000 seconds (over 5,000 days)!
> > If that CPU were 32-bit then it is (technically) more possible though
> > still highly unlikely.
> >
>
> Technically, `[obj] * (2**(32-4))` is a 1GB array on 32-bit.

The question is if this matters.  If really necessary, the PEP can
demonstrate that it doesn't matter in practice.

(Also, the magic value on 32-bit would be 2**29.)
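
For reference, the arithmetic behind both figures can be reproduced with a
few lines of Python. This is only a back-of-the-envelope check under the
assumptions stated in the quoted text: 2^61 increments at one per cycle on
a 5 GHz CPU, and 4-byte pointers on a 32-bit build. The variable names are
chosen for this example.

```python
CYCLES = 2 ** 61           # increments needed, per the quoted PEP text
CLOCK_HZ = 5_000_000_000   # the "fast 5 GHz" CPU, one Py_INCREF() per cycle

seconds = CYCLES / CLOCK_HZ
days = seconds / 86_400    # seconds per day
print(f"{seconds:,.0f} seconds, about {days:,.0f} days")

# The 32-bit list case: 2**(32-4) references at 4 bytes per pointer.
POINTER_BYTES_32 = 4
list_bytes = 2 ** (32 - 4) * POINTER_BYTES_32
print(f"{list_bytes / 2**30:g} GiB of pointers")
```

The first figure comes out a little above 460 million seconds, which is
consistent with "nearly 500,000,000 seconds (over 5,000 days)".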

> >
> > Constraints
> > ---
> >
> > * ensure that otherwise immutable objects can be truly immutable
> > * be careful when immortalizing objects that are not otherwise immutable
>
> I am not sure about what this means.
> For example, unicode objects are not immutable because they have hash,
> utf8 cache and wchar_t cache. (wchar_t cache will be removed in Python
> 3.12).

I think you understood it correctly.  In the case of str objects, they
are close enough since a race on any of those values will not cause a
different outcome.

I will clarify the point in the PEP.

> > Object Cleanup
> > --
> >
> > In order to clean up all immortal objects during runtime finalization,
> > we must keep track of them.
> >
>
> I don't 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-22 Thread Eddie Elizondo via Python-Dev
Hey Inada, thanks for the feedback

> Generally speaking, fork is a legacy API. It is too difficult to know which 
> library is fork-safe, even for stdlibs.

Yes, Instagram has to go to great lengths to make sure
that we get the entire execution into a state where it's safe to fork. It
works, but it's hard to maintain. We'd rather have a simpler model!

> I hope per-interpreter GIL replaces fork use cases.

We hope so too, hence the big push towards having immutable shared state across 
the interpreters. For large applications like Instagram, this is a must, 
otherwise copying state into every interpreter would be too costly.

> Anyway, I don't believe stopping refcounting will fix the CoW issue yet. See 
> this article [1] again.

That article is five years old so it doesn't reflect the current state of the 
system! We have continuous profiling and monitoring of Copy on Writes and after 
introducing the techniques described in this PEP, we have largely fixed the 
majority of scenarios where this happens.

You are right that just addressing reference counting will not fix
all CoW issues. The trick here is also to leverage the permanent GC generation
used for the `gc.freeze` API. That is, if you have a container that is known
to be immortal, it should be pushed into the permanent GC generation. This will
guarantee that the GC itself will not change the GC headers of said instance.

Thus, if you immortalize your heap before forking (using the techniques in: 
https://github.com/python/cpython/pull/31489) then you'll end up removing the 
vast majority of scenarios where CoW takes place. I can look into writing a new 
technical article for Instagram with more up to date info but this might take 
time to get through!

Now, I said that we've largely fixed the CoW issue because there are still 
places where it happens such as: free lists, the small object allocator, etc. 
But these are relatively small compared to the ones coming from reference 
counts and the GC head mutations.
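
For readers who have not used the permanent generation, here is a minimal
sketch of the `gc.freeze` half of this using only the public gc module. It
does not show the immortalization from the PR above, and warm_state is
just a stand-in for whatever an application preloads. The ordering follows
the gc module documentation: disable the collector in the parent, freeze
right before forking, and enable it again in each child.

```python
import gc

gc.disable()  # parent process: no collections while the shared state is built

# Stand-in for the application state a pre-fork server loads once.
warm_state = [{"id": i, "tags": [i]} for i in range(1000)]

gc.freeze()  # move everything tracked so far into the permanent generation
frozen = gc.get_freeze_count()
print(f"{frozen} objects frozen")

# A pre-fork server would call os.fork() here; each child would then run:
gc.enable()

# Demo cleanup only: return the frozen objects to the normal generations.
gc.unfreeze()
```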
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/TBXKYD6OOR7I5QAMTE3VAJT5YCDISOET/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-21 Thread Terry Reedy

On 2/21/2022 11:11 AM, Petr Viktorin wrote:

On 19. 02. 22 8:46, Eric Snow wrote:



As part of this proposal, we must make sure that users can clearly
understand on which parts of the refcount behavior they can rely and
which are considered implementation details.  Specifically, they should
use the existing public refcount-related API and the only refcount value
with any meaning is 0.  All other values are considered "not 0".


Should we care about hacks/optimizations that rely on having the only 
reference (or all references), e.g. mutating a tuple if it has refcount 
1? Immortal objects shouldn't break them (the special case simply won't 
apply), but this wording would make them illegal.
AFAIK CPython uses this internally, but I don't know how 
prevalent/useful it is in third-party code.


We could say that the only refcounts with any meaning are 0, 1, and > 1.


--
Terry Jan Reedy
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/C3R4FKO7PZETOSI5DTGMAXWVUTQM26AW/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-21 Thread Chris Angelico
On Tue, 22 Feb 2022 at 03:00, Larry Hastings  wrote:
>
>
> On 2/21/22 22:06, Chris Angelico wrote:
>
> On Mon, 21 Feb 2022 at 16:47, Larry Hastings  wrote:
>
> While I don't think it's fine to play devil's advocate, given the choice 
> between "this will help a common production use-case" (pre-fork servers) and 
> "this could hurt a hypothetical production use case" (long-running 
> applications that reload modules enough times this could waste a significant 
> amount of memory), I think the former is more important.
>
> Can the cost be mitigated by reusing immortal objects? So, for
> instance, a module-level constant of 60*60*24*365 might be made
> immortal, meaning it doesn't get disposed of with the module, but if
> the module gets reloaded, no *additional* object would be created.
>
> I'm assuming here that any/all objects unmarshalled with the module
> can indeed be shared in this way. If that isn't always true, then that
> would reduce the savings here.
>
>
> It could, but we don't have any general-purpose mechanism for that.  We have 
> "interned strings" and "small ints", but we don't have e.g. "interned tuples" 
> or "frequently-used large ints and floats".
>
> That said, in this hypothetical scenario wherein someone is constantly 
> reloading modules but we also have immortal objects, maybe someone could 
> write a smart reloader that lets them somehow propagate existing immortal 
> objects to the new module.  It wouldn't even have to be that sophisticated, 
> just some sort of hook into the marshal step combined with a per-module 
> persistent cache of unmarshalled constants.
>

Fair enough. Since only immortal objects would affect this, it may be
possible for the smart reloader to simply be told of all new
immortals, and it can then intern things itself.
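
A rough sketch of what that interning step might look like, purely as an
illustration: ConstantCache and its intern method are hypothetical names,
and no such hook exists in marshal or importlib today. The cache hands
back the object it already holds whenever a reload produces an equal
constant.

```python
class ConstantCache:
    """Hypothetical per-module cache of constants that survives reloads."""

    def __init__(self):
        self._seen = {}

    def intern(self, value):
        # Key on type and repr rather than equality, so that 1 and 1.0,
        # or 0.0 and -0.0, are never merged into one object.
        key = (type(value), repr(value))
        return self._seen.setdefault(key, value)


cache = ConstantCache()
first_load = cache.intern(int("31536000"))   # e.g. 60*60*24*365 on first import
second_load = cache.intern(int("31536000"))  # an equal constant after a reload
print(second_load is first_load)
```

With something like this in place, a reload would reuse the existing
object instead of adding another one that is never freed.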

IMO that strengthens the argument that prefork servers are a more
significant use-case than reloading, without necessarily compromising
the rarer case.

Thanks for the explanation.

ChrisA
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/3D436RCMHODEIBVDIWIJLKZU2TGHBE4J/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-21 Thread dw-git
Petr Viktorin wrote:
> Should we care about hacks/optimizations that rely on having the only 
> reference (or all references), e.g. mutating a tuple if it has refcount 
> 1? Immortal objects shouldn't break them (the special case simply won't 
> apply), but this wording would make them illegal.
> AFAIK CPython uses this internally, but I don't know how 
> prevalent/useful it is in third-party code.

For what it's worth Cython does this for string concatenation to concatenate in 
place if possible (this optimization was copied from CPython). It could be 
disabled relatively easily if it became a problem (it's already CPython only 
and version checked so it'd just need another upper-bound version check).
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/CDNQK5RMXSLLYFNIXRORL7GTKU6B4BVR/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-21 Thread Petr Viktorin

On 19. 02. 22 8:46, Eric Snow wrote:

Thanks to all those that provided feedback.  I've worked to
substantially update the PEP in response.  The text is included below.
Further feedback is appreciated.


Thank you! This version is much clearer. I like the PEP more and more!

I've sent a PR with some typo fixes:
https://github.com/python/peps/pull/2348

and I have a few comments:


[...]

Public Refcount Details

[...]

As part of this proposal, we must make sure that users can clearly
understand on which parts of the refcount behavior they can rely and
which are considered implementation details.  Specifically, they should
use the existing public refcount-related API and the only refcount value
with any meaning is 0.  All other values are considered "not 0".


Should we care about hacks/optimizations that rely on having the only 
reference (or all references), e.g. mutating a tuple if it has refcount 
1? Immortal objects shouldn't break them (the special case simply won't 
apply), but this wording would make them illegal.
AFAIK CPython uses this internally, but I don't know how 
prevalent/useful it is in third-party code.



[...]


_Py_IMMORTAL_REFCNT
---

We will add two internal constants::

 #define _Py_IMMORTAL_BIT (1LL << (8 * sizeof(Py_ssize_t) - 4))
 #define _Py_IMMORTAL_REFCNT (_Py_IMMORTAL_BIT + (_Py_IMMORTAL_BIT / 2))


As a nitpick: could you say this in prose?

* ``_Py_IMMORTAL_BIT`` has the third top-most bit set.
* ``_Py_IMMORTAL_REFCNT`` has the third and fourth top-most bits set.
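
For readers who want to check the bit positions, here is a minimal sketch that redoes the arithmetic of the two quoted macros with plain Python integers. The helper names `immortal_constants` and `set_bits` are hypothetical and exist only for this illustration; the Py_ssize_t widths of 8 and 4 bytes are assumptions standing in for typical 64-bit and 32-bit builds.

```python
def immortal_constants(ssize_t_bytes):
    """Mirror the two quoted #defines for a given sizeof(Py_ssize_t)."""
    bit = 1 << (8 * ssize_t_bytes - 4)      # _Py_IMMORTAL_BIT
    refcnt = bit + bit // 2                 # _Py_IMMORTAL_REFCNT
    return bit, refcnt


def set_bits(value):
    """List the indices of the set bits, highest first (bit 0 is the lowest)."""
    return [i for i in range(value.bit_length() - 1, -1, -1) if (value >> i) & 1]


for width in (8, 4):                        # assumed 64-bit and 32-bit builds
    bit, refcnt = immortal_constants(width)
    print(f"{8 * width}-bit: BIT bits={set_bits(bit)}, REFCNT bits={set_bits(refcnt)}")
```

With an 8-byte Py_ssize_t the set bits are 60 for the first constant and 60 and 59 for the second. Since bit 63 is the sign bit, these are the third and fourth bits below it, which seems to be the counting the prose suggestion above uses.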


[...]


Immortal Global Objects
---

All objects that we expect to be shared globally (between interpreters)
will be made immortal.  That includes the following:

* singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
* all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
* all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
   small ints)

All such objects will be immutable.  In the case of the static types,
they will be effectively immutable.  ``PyTypeObject`` has some mutable
state (``tp_dict`` and ``tp_subclasses``), but we can work around this
by storing that state on ``PyInterpreterState`` instead of on the
respective static type object.  Then the ``__dict__``, etc. getter
will do a lookup on the current interpreter, if appropriate, instead
of using ``tp_dict``.


But tp_dict is also public C-API. How will that be handled?
Perhaps naively, I thought static types' dicts could be treated as 
(deeply) immutable, and shared?


Perhaps it would be best to leave it out here and say "The details 
of sharing ``PyTypeObject`` across interpreters are left to another PEP"?
Even so, I'd love to know the plan. (And even if these are internals, 
changes to them should be mentioned in What's New, for the sake of 
people who need to maintain old extensions.)





Object Cleanup
--

In order to clean up all immortal objects during runtime finalization,
we must keep track of them.

For GC objects ("containers") we'll leverage the GC's permanent
generation by pushing all immortalized containers there.  During
runtime shutdown, the strategy will be to first let the runtime try
to do its best effort of deallocating these instances normally.  Most
of the module deallocation will now be handled by
``pylifecycle.c:finalize_modules()`` which cleans up the remaining
modules as best as we can.  It will change which modules are available
during __del__ but that's already defined as undefined behavior by the
docs.  Optionally, we could do some topological disorder to guarantee
that user modules will be deallocated first before the stdlib modules.
Finally, anything leftover (if any) can be found through the permanent
generation gc list which we can clear after finalize_modules().

For non-container objects, the tracking approach will vary on a
case-by-case basis.  In nearly every case, each such object is directly
accessible on the runtime state, e.g. in a ``_PyRuntimeState`` or
``PyInterpreterState`` field.  We may need to add a tracking mechanism
to the runtime state for a small number of objects.


Out of curiosity: How does this extra work affect performance? Is 
it part of the 4% slowdown?




And from the other thread:

On 17. 02. 22 18:23, Eric Snow wrote:
> On Thu, Feb 17, 2022 at 3:42 AM Petr Viktorin  wrote:
>>>> Weren't you planning a PEP on subinterpreter GIL as well? Do you want to
>>>> submit them together?
>>>
>>> I'd have to think about that.  The other PEP I'm writing for
>>> per-interpreter GIL doesn't require immortal objects.  They just
>>> simplify a number of things.  That's my motivation for writing this
>>> PEP, in fact. :)
>>
>> Please think about it.
>> If you removed the benefits for per-interpreter GIL, the motivation
>> section would be reduced to memory savings for fork/CoW. (And lots of
>> performance improvements that are great in theory but sum up to a 4% loss.)
>

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-21 Thread Larry Hastings


On 2/21/22 22:06, Chris Angelico wrote:

On Mon, 21 Feb 2022 at 16:47, Larry Hastings  wrote:

While I don't think it's fine to play devil's advocate, given the choice between "this will 
help a common production use-case" (pre-fork servers) and "this could hurt a hypothetical 
production use case" (long-running applications that reload modules enough times this could 
waste a significant amount of memory), I think the former is more important.


Can the cost be mitigated by reusing immortal objects? So, for
instance, a module-level constant of 60*60*24*365 might be made
immortal, meaning it doesn't get disposed of with the module, but if
the module gets reloaded, no *additional* object would be created.

I'm assuming here that any/all objects unmarshalled with the module
can indeed be shared in this way. If that isn't always true, then that
would reduce the savings here.



It could, but we don't have any general-purpose mechanism for that.  We 
have "interned strings" and "small ints", but we don't have e.g. 
"interned tuples" or "frequently-used large ints and floats".


That said, in this hypothetical scenario wherein someone is constantly 
reloading modules but we also have immortal objects, maybe someone could 
write a smart reloader that lets them somehow propagate existing 
immortal objects to the new module. It wouldn't even have to be that 
sophisticated, just some sort of hook into the marshal step combined 
with a per-module persistent cache of unmarshalled constants.
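
A rough sketch of the cache half of that idea follows. It is only an illustration under stated assumptions: `ConstantCache` and its `intern` method are hypothetical names, the hook into the marshal step is not shown, and the `(type, value)` key is a simplification that a real implementation would need to tighten (for example, `(1,)` and `(1.0,)` compare equal and have the same type).

```python
class ConstantCache:
    """Hypothetical per-module cache of unmarshalled (hashable) constants."""

    def __init__(self):
        self._by_module = {}

    def intern(self, module_name, const):
        """Return the previously cached equal constant, or cache this one."""
        cache = self._by_module.setdefault(module_name, {})
        # Include the type so that 1, 1.0 and True do not collapse into one entry.
        key = (type(const), const)
        return cache.setdefault(key, const)


cache = ConstantCache()

# Build the constants at runtime so each "load" starts from its own object.
first_load = int("31536000")     # 60*60*24*365
second_load = int("31536000")    # the same constant after a simulated reload

kept = cache.intern("mymod", first_load)
reused = cache.intern("mymod", second_load)
print(kept is first_load, reused is first_load)
```

On the simulated reload the cache hands back the object from the first load, so no additional long-lived object is kept for that constant.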



//arry/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/UN3BIEHDK2CCL563MSIJ4DXDWOWHNKHR/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-21 Thread Larry Hastings


On 2/21/22 21:44, Larry Hastings wrote:


> While I don't think it's fine to play devil's advocate,



Oh!  Please ignore the word "don't" in the above sentence.  I /do/ think 
it's fine to play devil's advocate.


Sheesh,


//arry/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/TABGFU4OFTUDPGF72LY5QMSDTKDUUHHY/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-20 Thread Chris Angelico
On Mon, 21 Feb 2022 at 16:47, Larry Hastings  wrote:
>
>
> While I don't think it's fine to play devil's advocate, given the choice 
> between "this will help a common production use-case" (pre-fork servers) and 
> "this could hurt a hypothetical production use case" (long-running 
> applications that reload modules enough times this could waste a significant 
> amount of memory), I think the former is more important.
>

Can the cost be mitigated by reusing immortal objects? So, for
instance, a module-level constant of 60*60*24*365 might be made
immortal, meaning it doesn't get disposed of with the module, but if
the module gets reloaded, no *additional* object would be created.

I'm assuming here that any/all objects unmarshalled with the module
can indeed be shared in this way. If that isn't always true, then that
would reduce the savings here.

ChrisA
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7XDU2THWGEX2YUD32VYY5FJXL4GFQ675/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-20 Thread Larry Hastings


While I don't think it's fine to play devil's advocate, given the choice 
between "this will help a common production use-case" (pre-fork servers) 
and "this could hurt a hypothetical production use case" (long-running 
applications that reload modules enough times this could waste a 
significant amount of memory), I think the former is more important.



//arry/

On 2/20/22 06:01, Antoine Pitrou wrote:

On Sat, 19 Feb 2022 12:05:22 -0500
Larry Hastings  wrote:

On 2/19/22 04:41, Antoine Pitrou wrote:

On Fri, 18 Feb 2022 14:56:10 -0700
Eric Snow   wrote:

On Wed, Feb 16, 2022 at 11:06 AM Larry Hastings   wrote:

   He suggested(*) all the constants unmarshalled as part of loading a module should be 
"immortal", and if we could rejigger how we allocated them to store them in 
their own memory pages, that would dovetail nicely with COW semantics, cutting down on 
the memory use of preforked server processes.

Cool idea.  I may mention it in the PEP as a possibility.  Thanks!

That is not so cool if for some reason an application routinely loads
and unloads modules.

Do applications do that for some reason?  Python module reloading is
already so marginal, I thought hardly anybody did it.

I have no data point, but I would be surprised if there wasn't at least
one example of such usage somewhere in the world, for example to
hotload fixes in specific parts of an application without restarting it
(or as part of a plugin / extension / mod system).

There's also the auto-reload functionality in some Web servers or
frameworks, but that is admittedly more of a development feature.

Regards

Antoine.


Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VYGLHB4JXSYKTNJ2AOLDFUKO4GDHWVIV/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-20 Thread Antoine Pitrou
On Sat, 19 Feb 2022 12:05:22 -0500
Larry Hastings  wrote:
> On 2/19/22 04:41, Antoine Pitrou wrote:
> > On Fri, 18 Feb 2022 14:56:10 -0700
> > Eric Snow  wrote:  
> >> On Wed, Feb 16, 2022 at 11:06 AM Larry Hastings  
> >> wrote:  
> >>>   He suggested(*) all the constants unmarshalled as part of loading a 
> >>> module should be "immortal", and if we could rejigger how we allocated 
> >>> them to store them in their own memory pages, that would dovetail nicely 
> >>> with COW semantics, cutting down on the memory use of preforked server 
> >>> processes.  
> >> Cool idea.  I may mention it in the PEP as a possibility.  Thanks!  
> > That is not so cool if for some reason an application routinely loads
> > and unloads modules.  
> 
> Do applications do that for some reason?  Python module reloading is 
> already so marginal, I thought hardly anybody did it.

I have no data point, but I would be surprised if there wasn't at least
one example of such usage somewhere in the world, for example to
hotload fixes in specific parts of an application without restarting it
(or as part of a plugin / extension / mod system).

There's also the auto-reload functionality in some Web servers or
frameworks, but that is admittedly more of a development feature.

Regards

Antoine.


Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/C2MWXHPOFFH5CLLPKJCVEQD4EGHKTD24/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-19 Thread Inada Naoki
Hi,

I hope per-interpreter GIL succeeds at some point, and I know this is
needed for per-interpreter GIL.

But I worry that per-interpreter GIL may be too complex for core
developers and extension writers to implement and maintain.
As you know, immortal doesn't mean shareable between interpreters. It is
too difficult to know which objects can be shared, and where the
shareable objects are leaked to other interpreters.
So I am not sure that per-interpreter GIL is an achievable goal.

So I think it's too early to introduce immortal objects in Python
3.11, unless they *improve* performance without per-interpreter GIL.
Instead, we can add a configuration option such as
`--enable-experimental-immortal`.


On Sat, Feb 19, 2022 at 4:52 PM Eric Snow  wrote:
>
> Reducing CPU Cache Invalidation
> ---
>
> Avoiding Data Races
> ---
>

Both benefits require a per-interpreter GIL.

>
> Avoiding Copy-on-Write
> --
>
> For some applications it makes sense to get the application into
> a desired initial state and then fork the process for each worker.
> This can result in a large performance improvement, especially
> memory usage.  Several enterprise Python users (e.g. Instagram,
> YouTube) have taken advantage of this.  However, the above
> refcount semantics drastically reduce the benefits and
> have led to some sub-optimal workarounds.
>

As I wrote before, fork is very difficult to use safely. We cannot
recommend it to many users.
And I don't think reducing the size of a patch at Instagram or YouTube
is a good rationale for this kind of change.


> Also note that "fork" isn't the only operating system mechanism
> that uses copy-on-write semantics.  Anything that uses ``mmap``
> relies on copy-on-write, including sharing data from shared object
> files between processes.
>

It is very difficult to reduce CoW with mmap(MAP_PRIVATE).

You may need to write the hash of bytes and unicode objects. You may
also need to write `tp_type`.
Immortal objects can "reduce" the memory writes. But "at least one
memory write" is enough to trigger CoW.


> Accidental Immortality
> --
>
> While it isn't impossible, this accidental scenario is so unlikely
> that we need not worry.  Even if done deliberately by using
> ``Py_INCREF()`` in a tight loop and each iteration only took 1 CPU
> cycle, it would take 2^61 cycles (on a 64-bit processor).  At a fast
> 5 GHz that would still take nearly 500,000,000 seconds (over 5,000 days)!
> If that CPU were 32-bit then it is (technically) more possible though
> still highly unlikely.
>

Technically, `[obj] * (2**(32-4))` is a 1 GB array on 32-bit.
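
The numbers in this exchange can be checked with a short calculation. This is only a back-of-the-envelope sketch: the 2^61 figure and the 5 GHz clock come from the quoted PEP text, while the 4-byte pointer size for a 32-bit build is an assumption.

```python
CLOCK_HZ = 5_000_000_000        # the "fast 5 GHz" from the quoted PEP text
INCREFS_NEEDED = 2 ** 61        # figure quoted from the PEP for a 64-bit build

seconds = INCREFS_NEEDED / CLOCK_HZ   # one incref per cycle, as the PEP assumes
days = seconds / 86_400

# 32-bit case: a list holding 2**(32-4) references to the same object.
POINTER_SIZE_32 = 4             # assumption: 4-byte pointers on a 32-bit build
references = 2 ** (32 - 4)
list_bytes = references * POINTER_SIZE_32

print(f"{seconds:,.0f} seconds, about {days:,.0f} days")
print(f"{references:,} references need {list_bytes / 2**30:.0f} GiB of pointers")
```

The first result is roughly 461 million seconds (a little over 5,300 days), consistent with the PEP's "nearly 500,000,000 seconds". The second shows that the list's pointer array alone is exactly 2^30 bytes, so on a 32-bit build the immortal bit could be reached without any tight loop, provided the process can allocate that much.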


>
> Constraints
> ---
>
> * ensure that otherwise immutable objects can be truly immutable
> * be careful when immortalizing objects that are not otherwise immutable

I am not sure about what this means.
For example, unicode objects are not immutable because they have hash,
utf8 cache and wchar_t cache. (wchar_t cache will be removed in Python
3.12).


>
> Object Cleanup
> --
>
> In order to clean up all immortal objects during runtime finalization,
> we must keep track of them.
>

I don't think we need to clean up all immortal objects.

Of course, we should take care of objects that are immortal by default.
But for user-marked immortal objects, it's very difficult to guarantee
that __del__ or a weakref callback is called safely.

Additionally, if they are marked immortal to avoid CoW, cleanup causes CoW.

Regards,
-- 
Inada Naoki  
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7FCNNQOTIUZTBFZUPYRDSLND6WCVM3JO/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-19 Thread Larry Hastings


On 2/19/22 04:41, Antoine Pitrou wrote:

On Fri, 18 Feb 2022 14:56:10 -0700
Eric Snow  wrote:

On Wed, Feb 16, 2022 at 11:06 AM Larry Hastings  wrote:

  He suggested(*) all the constants unmarshalled as part of loading a module should be 
"immortal", and if we could rejigger how we allocated them to store them in 
their own memory pages, that would dovetail nicely with COW semantics, cutting down on 
the memory use of preforked server processes.

Cool idea.  I may mention it in the PEP as a possibility.  Thanks!

That is not so cool if for some reason an application routinely loads
and unloads modules.



Do applications do that for some reason?  Python module reloading is 
already so marginal, I thought hardly anybody did it.


Anyway, my admittedly-dim understanding is that COW is most helpful for 
the "pre-fork" server model, and I bet those folks never bother to 
unload modules.



//arry/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/N7UFJCMQLO6W4HUJ6DL5M55JOU4CEX4K/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-19 Thread Antoine Pitrou
On Fri, 18 Feb 2022 14:56:10 -0700
Eric Snow  wrote:
> On Wed, Feb 16, 2022 at 11:06 AM Larry Hastings  wrote:
> > I experimented with this at the EuroPython sprints in Berlin years ago.  I 
> > was sitting next to MvL, who had an interesting observation about it.  
> 
> Classic MvL! :)
> 
> >  He suggested(*) all the constants unmarshalled as part of loading a module 
> > should be "immortal", and if we could rejigger how we allocated them to 
> > store them in their own memory pages, that would dovetail nicely with COW 
> > semantics, cutting down on the memory use of preforked server processes.  
> 
> Cool idea.  I may mention it in the PEP as a possibility.  Thanks!

That is not so cool if for some reason an application routinely loads
and unloads modules.

Regards

Antoine.


Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4M4J3656UXOQ7X6YFGCHUAQHMNBUEV4O/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-18 Thread Eric Snow
On Wed, Feb 16, 2022 at 11:06 AM Larry Hastings  wrote:
> I experimented with this at the EuroPython sprints in Berlin years ago.  I 
> was sitting next to MvL, who had an interesting observation about it.

Classic MvL! :)

>  He suggested(*) all the constants unmarshalled as part of loading a module 
> should be "immortal", and if we could rejigger how we allocated them to store 
> them in their own memory pages, that would dovetail nicely with COW 
> semantics, cutting down on the memory use of preforked server processes.

Cool idea.  I may mention it in the PEP as a possibility.  Thanks!

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/2PRODXEVNO53YYFRL6JUWZQF77WOYS4C/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-18 Thread Eric Snow
On Wed, Feb 16, 2022 at 8:45 PM Inada Naoki  wrote:
> Is there any common tool that utilizes CoW by mmap?
> If you know of one, please add its link to the PEP.
> If there is no common tool, most Python users cannot get benefit from this.

Sorry, I'm not aware of any, but I also haven't researched the topic
much.  Regardless, that would be a good line of inquiry.  A reference
like that would probably help make the PEP a bit more justifiable
without per-interpreter GIL. :)

> Generally speaking, fork is a legacy API. It is too difficult to know
> which library is fork-safe, even for stdlibs. And Windows users can
> not use fork.
> Optimizing for non-fork use case is much better than optimizing for
> fork use cases.

+1

> I hope per-interpreter GIL replaces fork use cases.

Yeah, that's definitely one big benefit.

> But tools using CoW without fork are also welcome, especially if they
> support Windows.

+1

> Anyway, I don't believe stopping refcounting will fix the CoW issue
> yet. See this article [1] again.
>
> [1] 
> https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

That's definitely an important point, given that the main objective of
the proposal is to allow disabling mutation of runtime-internal object
state so that some objects can be made truly immutable.

I'm sure Eddie has some good insight on the matter (and may have even
been involved in writing that article).  Eddie?

> Note that they failed to fix CoW by stopping refcounting code objects! (*)
> Most CoW was caused by cyclic GC and finalization caused most CoW.

That's a good observation!

> (*) It is not surprising to me because the eval loop doesn't incref/decref
> most code attributes. It borrows references from the code object.

+1

> So we need a sample application and profile it, before saying it fixes CoW.
> Could you provide some data, or drop the CoW issue from this PEP until
> it is proved?

We'll look into that.

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ESRBMP4WTNONED3K6Z5HMYYY2WIMQZT3/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-17 Thread Eric Snow
Again, thanks for the reply.  It's helpful.  My further responses are
inline below.

-eric

On Thu, Feb 17, 2022 at 3:42 AM Petr Viktorin  wrote:
> > Agreed.  However, what behavior do users expect and what guarantees do
> > we make?  Do we indicate how to interpret the refcount value they
> > receive?  What are the use cases under which a user would set an
> > object's refcount to a specific value?  Are users setting the refcount
> > of objects they did not create?
>
> That's what I hoped the PEP would tell me. Instead of simply claiming
> that there won't be issues, it should explain why we won't have any issues.
> [snip]
> IMO, the reasoning should start from the assumption that things will
> break, and explain why they won't (or why the breakage is acceptable).
> If the PEP simply tells me upfront that things will be OK, I have a hard
> time trusting it.
>
> IOW, it's clear you've thought about this a lot (especially after
> reading your replies here), but it's not clear from the PEP.
> That might be editorial nitpicking, if it wasn't for the fact that I
> want to find any gaps in your research and reasoning, and invite everyone
> else to look for them as well.

Good point. It's easy to dump a bunch of unnecessary info into a PEP,
and it was hard for me to know where the line was in this case.  There
hadn't been much discussion previously about the possible ways this
change might break users.  So thanks for bringing this up.  I'll be
sure to put a more detailed explanation in the PEP, with a bit more
evidence too.

> Ah, I see. I was confused by this:

No worries!  I'm glad we cleared it up.  I'll make sure the PEP is
more understandable about this.

> > This is also true even with the GIL, though the impact is smaller.
>
> Smaller than what? The baseline for that comparison is a hypothetical
> GIL-less interpreter, which is only introduced in the next section.
> Perhaps say something like "Python's GIL helps avoid this effect, but
> doesn't eliminate it."

Good point.  I'll clarify the point.

> >> Weren't you planning a PEP on subinterpreter GIL as well? Do you want to
> >> submit them together?
> >
> > I'd have to think about that.  The other PEP I'm writing for
> > per-interpreter GIL doesn't require immortal objects.  They just
> > simplify a number of things.  That's my motivation for writing this
> > PEP, in fact. :)
>
> Please think about it.
> If you removed the benefits for per-interpreter GIL, the motivation
> section would be reduced to memory savings for fork/CoW. (And lots of
> performance improvements that are great in theory but sum up to a 4% loss.)

Sounds good.  Would this involve more than a note at the top of the PEP?

And just to be clear, I don't think the fate of a per-interpreter GIL
PEP should depend on this one.

> > It wouldn't match _Py_IMMORTAL_REFCNT, but the high bit of
> > _Py_IMMORTAL_REFCNT would still match.  That bit is what we would
> > actually be checking, rather than the full value.
>
> It makes sense once you know _Py_IMMORTAL_REFCNT has two bits set. Maybe
> it'd be good to note that detail -- it's an internal detail, but crucial
> for making things safe.

Will do.
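
To make the "check the bit, not the full value" point concrete, here is a small model using plain Python integers. It is a sketch under assumptions, not the reference implementation: a 64-bit Py_ssize_t is assumed, and `is_immortal` is a hypothetical stand-in for whatever check the runtime would perform.

```python
PY_SSIZE_T_BITS = 64                               # assumption: 64-bit build
IMMORTAL_BIT = 1 << (PY_SSIZE_T_BITS - 4)          # models _Py_IMMORTAL_BIT
IMMORTAL_REFCNT = IMMORTAL_BIT + IMMORTAL_BIT // 2 # models _Py_IMMORTAL_REFCNT


def is_immortal(refcnt):
    """Hypothetical check: test only the immortal bit, not the exact value."""
    return bool(refcnt & IMMORTAL_BIT)


# An extension built against older macros keeps modifying the refcount.
refcnt = IMMORTAL_REFCNT
for _ in range(1000):
    refcnt -= 1                                    # old-style decref
print(refcnt == IMMORTAL_REFCNT, is_immortal(refcnt))

# How far the count has to drift before the bit no longer reads as set.
decrefs_until_lost = IMMORTAL_REFCNT - IMMORTAL_BIT + 1
increfs_until_lost = 2 * IMMORTAL_BIT - IMMORTAL_REFCNT
print(decrefs_until_lost, increfs_until_lost)
```

After the 1000 decrefs the value no longer equals the initial immortal refcount, yet the bit test still succeeds. Starting halfway between two multiples of the bit leaves a margin of about 2^59 net increfs or decrefs in either direction in this model.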

> >> What about extensions compiled with Python 3.11 (with this PEP) that use
> >> an older version of the stable ABI, and thus should be compatible with
> >> 3.2+? Will they use the old versions of the macros? How will that be 
> >> tested?
> >
> > It wouldn't matter unless an object's refcount reached
> > _Py_IMMORTAL_REFCNT, at which point incref/decref would start
> > noop'ing.  What is the likelihood (in real code) that an object's
> > refcount would grow that far?  Even then, would such an object ever be
> > expected to go back to 0 (and be dealloc'ed)?  Otherwise the point is
> > moot.
>
> That's exactly the questions I'd hope the PEP to answer. I could
> estimate that likelihood myself, but I'd really rather just check your
> work ;)
>
> (Hm, maybe I couldn't even estimate this myself. The PEP doesn't say
> what the value of _Py_IMMORTAL_REFCNT is, and in the ref implementation
> a comment says "This can be safely changed to a smaller value".)

Got it.  I'll be sure that the PEP is more clear about that.  Thanks
for letting me know.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/LRUQDLVTC7GV4K3HHZK2ESPW3AHW4NKJ/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-17 Thread Eric Snow
On Wed, Feb 16, 2022 at 10:43 PM Jim J. Jewett  wrote:
> I suggest being a little more explicit (even blatant) that the particular 
> details of:
> [snip]
> are not only CPython-specific, but are also private implementation details 
> that are expected to change in subsequent versions.

Excellent point.

> Ideally, things like the interned string dictionary or the constants from a 
> pyc file will be not merely immortal, but stored in an immortal-only memory 
> page, so that they won't be flushed or CoW-ed when a nearby non-immortal 
> object is modified.

That's definitely worth looking into.

> Getting those details right will make a difference to performance, and you 
> don't want to be locked in to the first draft.

Yep, that is one big reason I was trying to avoid spelling out every
detail of our plan. :)

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/535SKVXHPFZQMKRB2YC6UVQLN2TZ4RMY/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-17 Thread Petr Viktorin

On 17. 02. 22 2:13, Eric Snow wrote:

Thanks for the feedback.  My responses are inline below.

-eric


On Wed, Feb 16, 2022 at 6:36 AM Petr Viktorin  wrote:

Thank you very much for writing this down! It's very helpful to see a
concrete proposal, and the current state of this idea.
I like the change,


That's good to hear. :)


but I think it's unfortunately more complicated than
the PEP suggests.


That would be unsurprising. :)


This proposal is CPython-specific and, effectively, describes
internal implementation details.


I think that is a naïve statement. Refcounting is
implementation-specific, but it's hardly an *internal* detail.


Sorry for any confusion.  I didn't mean to say that refcounting is an
internal detail.  Rather, I was talking about how the proposed change
in refcounting behavior doesn't affect any guaranteed/documented
behavior, hence "internal".

Perhaps I missed some documented behavior?  I was going off the following:

* https://docs.python.org/3.11/c-api/intro.html#objects-types-and-reference-counts
* https://docs.python.org/3.11/c-api/structures.html#c.Py_REFCNT


There is
code that targets CPython specifically, and relies on the details.


Could you elaborate?  Do you mean such code relies on specific refcount values?


The refcount has public getters and setters,


Agreed.  However, what behavior do users expect and what guarantees do
we make?  Do we indicate how to interpret the refcount value they
receive?  What are the use cases under which a user would set an
object's refcount to a specific value?  Are users setting the refcount
of objects they did not create?


That's what I hoped the PEP would tell me. Instead of simply claiming 
that there won't be issues, it should explain why we won't have any issues.




and you need a pretty good
grasp of the concept to write a C extension.


I would not expect this to be affected by this PEP, except in cases
where users are checking/modifying refcounts for objects they did not
create (since none of their objects will be immortal).


I think that it's safe to assume that this will break people's code,


Do you have some use case in mind, or an example?  From my perspective
I'm having a hard time seeing what this proposed change would break.

That said, Kevin Modzelewski indicated [1] that there were affected
cases for Pyston (though their change in behavior is slightly
different).

[1] 
https://mail.python.org/archives/list/python-dev@python.org/message/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/


IMO, the reasoning should start from the assumption that things will 
break, and explain why they won't (or why the breakage is acceptable).
If the PEP simply tells me upfront that things will be OK, I have a hard 
time trusting it.


IOW, it's clear you've thought about this a lot (especially after 
reading your replies here), but it's not clear from the PEP.
That might be editorial nitpicking, if it wasn't for the fact that I 
want to find any gaps in your research and reasoning, and invite everyone 
else to look for them as well.



[...]

Every modification of a refcount causes the corresponding cache
line to be invalidated.  This has a number of effects.

For one, the write must be propagated to other cache levels
and to main memory.  This has small effect on all Python programs.
Immortal objects would provide a slight relief in that regard.

On top of that, multi-core applications pay a price.  If two threads
are interacting with the same object (e.g. ``None``)  then they will
end up invalidating each other's caches with each incref and decref.
This is true even for otherwise immutable objects like ``True``,
``0``, and ``str`` instances.  This is also true even with
the GIL, though the impact is smaller.


This looks out of context. Python has a per-process GIL. Should it go
after the next section?


This isn't about a data race.  I'm talking about how if an object is
active in two different threads (on distinct cores) then incref/decref
in one thread will invalidate the cache (line) in the other thread.
The only impact of the GIL in this case is that the two threads aren't
running simultaneously and the cache invalidation on the idle thread
has less impact.

Perhaps I've missed something?


Ah, I see. I was confused by this:

This is also true even with the GIL, though the impact is smaller.


Smaller than what? The baseline for that comparison is a hypothetical 
GIL-less interpreter, which is only introduced in the next section.
Perhaps say something like "Python's GIL helps avoid this effect, but 
doesn't eliminate it."




The proposed solution is obvious enough that two people came to the
same conclusion (and implementation, more or less) independently.


Who was it? Assuming it's not a secret :)


Me and Eddie. :)  I don't mind saying so.


In the case of per-interpreter GIL, the only realistic alternative
is to move all global objects into ``PyInterpreterState`` and add
one or more lookup functions to access them.  Then we'd have to

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Jim J. Jewett
I suggest being a little more explicit (even blatant) that the particular 
details of:

(1)  which subset of functionally immortal objects are marked as immortal
(2)  how to mark something as immortal
(3)  how to recognize something as immortal
(4)  which memory-management activities are skipped or modified for immortal 
objects

are not only CPython-specific, but are also private implementation details that 
are expected to change in subsequent versions.


Ideally, things like the interned string dictionary or the constants from a pyc 
file will be not merely immortal, but stored in an immortal-only memory page, 
so that they won't be flushed or CoW-ed when a nearby non-immortal object is 
modified.  Getting those details right will make a difference to performance, 
and you don't want to be locked in to the first draft.

-jJ
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EPH3PGNKUBUZK26Z2M4SQSPUVIGXZUNB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Inada Naoki
On Thu, Feb 17, 2022 at 7:01 AM Eric Snow  wrote:
>
> > > Also note that "fork" isn't the only operating system mechanism
> > > that uses copy-on-write semantics.
> >
> > Could you elaborate? mmap, maybe?
> > [snip]
> > So if you know how to get benefit from CoW without fork, I want to know it.
>
> Sorry if I got your hopes up.  Yeah, I was talking about mmap.
>

Is there any common tool that utilizes CoW via mmap?
If you know of one, please add a link to it in the PEP.
If there is no common tool, most Python users cannot benefit from this.

Generally speaking, fork is a legacy API. It is too difficult to know
which library is fork-safe, even for stdlibs. And Windows users can
not use fork.
Optimizing for non-fork use case is much better than optimizing for
fork use cases.

* https://gist.github.com/nicowilliams/a8a07b0fc75df05f684c23c18d7db234
* https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
* https://www.evanjones.ca/fork-is-dangerous.html
* https://bugs.python.org/issue33725

I hope per-interpreter GIL replaces fork use cases.
But tools using CoW without fork are also welcome, especially if they
support Windows.

Anyway, I don't believe stopping refcounting will fix the CoW issue
yet. See this article [1] again.

[1] 
https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

Note that they failed to fix CoW by stopping refcounting code objects! (*)
Most CoW was caused by cyclic GC and finalization.

(*) It is not surprising to me because the eval loop doesn't incref/decref
most code attributes. It borrows references from the code object.
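
For reference, the GC side of the CoW problem can already be addressed from Python code, independently of this PEP. The following is a minimal sketch of the pattern described in the `gc` module documentation (Python 3.7+): disable the collector early in the parent, freeze right before forking, and re-enable it in the child. The helper names are hypothetical, and the sketch does nothing about refcount writes, which is the part the PEP targets.

```python
import gc


def prepare_parent_for_fork():
    """Hypothetical helper: call in the parent right before os.fork()."""
    # Stop automatic collections so the parent does not leave freed
    # "holes" in memory pages that children would later write into.
    gc.disable()
    # Move every currently tracked object to the permanent generation,
    # so later collections never touch their GC headers.
    gc.freeze()
    return gc.get_freeze_count()


def restore_gc_in_child():
    """Hypothetical helper: call early in the child process."""
    gc.enable()
```

Whether this is enough for a given application still needs measurement, which is the point of the request for profiling data.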

So we need to profile a sample application before saying it fixes CoW.
Could you provide some data, or drop the CoW issue from this PEP until
it is proven?

Regards,

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J53GY7XKFOI4KWHSTTA7FUL7TJLE7WG6/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Eric Snow
On Wed, Feb 16, 2022 at 2:41 PM Terry Reedy  wrote:
> > * the naive implementation shows a 4% slowdown
>
> Without understanding all the benefits, this seems a bit too much for
> me.  2% would be much better.

Yeah, we consider 4% to be too much.  2% would be great.
Performance-neutral would be even better, of course. :)

> > * we have a number of strategies that should reduce that penalty
>
> I would like to see that before approving the PEP.

I expect it would be enough to show where things stand with benchmark
results.  It did not seem like the actual mitigation strategies were
as important, so I opted to leave them out to avoid clutter.  Plus it
isn't clear yet what approaches will help the most, nor how much we
can win back.  So I didn't want to distract with hypotheticals.  If
it's important I can add that in.

> > * without immortal objects, the implementation for per-interpreter GIL
> > will require a number of non-trivial workarounds
>
> To me, that says to speed up immortality first.

Agreed.

> > That last one is particularly meaningful to me since it means we would
> > definitely miss the 3.11 feature freeze.
>
> 3 1/2 months from now.
>
> > With immortal objects, 3.11 would still be in reach.
>
> Is it worth trying to rush it a bit?

I'd rather not rush this.  I'm saying that, for per-interpreter GIL,
3.11 is within reach without rushing if we have immortal objects.
Without them, 3.11 is not realistic without rushing things.

-eric
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CYPYFPFGB7ONMVSTDHFDKZL26E7KG6MO/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Eric Snow
On Wed, Feb 16, 2022 at 12:14 PM Kevin Modzelewski  wrote:
> fwiw Pyston has immortal objects, though with a slightly different goal and 
> thus design [1]. I'm not necessarily advocating for our design (it makes most 
> sense if there is a JIT involved), but just writing to report our experience 
> of making a change like this and the compatibility effects.

Thanks!

> Importantly, our system allows for the reference count of immortal objects to 
> change, as long as it doesn't go below half of the original very-high value. 
> So extension code with no concept of immortality will still update the 
> reference counts of immortal objects, but this is fine. Because of this we 
> haven't seen any issues with extension modules.

As Guido noted, we are taking a similar approach for the sake of older
extensions built with the limited API.  As a precaution, we start the
refcount for immortal objects basically at _Py_IMMORTAL_REFCNT * 1.5.
Then we only need to check the high bit of _Py_IMMORTAL_REFCNT to see
if an object is immortal.
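
A rough model of that arithmetic, written in Python purely for illustration. The bit position (`1 << 62`) and every name below are assumptions made for this sketch, not the values or identifiers used in the implementation; only the "start at 1.5 times the marker, then test the high bit" idea comes from the paragraph above.

```python
# Assumed stand-in for _Py_IMMORTAL_REFCNT: a single high bit.
IMMORTAL_BIT = 1 << 62

# Start halfway between the marker bit and the next power of two,
# i.e. "basically at _Py_IMMORTAL_REFCNT * 1.5".
INITIAL_IMMORTAL_REFCNT = IMMORTAL_BIT + IMMORTAL_BIT // 2


def is_immortal(refcnt: int) -> bool:
    """Immortality check: only the marker bit is inspected."""
    return (refcnt & IMMORTAL_BIT) != 0


def headroom_for_legacy_decrefs() -> int:
    """Unbalanced decrefs an old extension could apply while the
    marker bit stays set (in this model)."""
    return INITIAL_IMMORTAL_REFCNT - IMMORTAL_BIT
```

In this model an extension that knows nothing about immortality can still incref and decref the object; the check keeps succeeding until the count has drifted by roughly half the marker value in either direction.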

> The small amount of compatibility challenges we've run into have been in 
> testing code that checks for memory leaks. For example this code breaks on 
> Pyston:
> [snip]
> This might work with this PEP, but we've also seen code that asserts that the 
> refcount increases by a specific value, which I believe wouldn't.

Right, this is less of an issue for us since normally we do not change
the refcount of immortal objects.  Also, CPython's test suite keeps us
honest about leaking references and memory blocks. :)

> For Pyston we've simply disabled these tests, figuring that our users still 
> have CPython to test on. Personally I consider this breakage to be small, but 
> I hadn't seen anyone mention the potential usage of sys.getrefcount() so I 
> thought I'd bring it up.

Thanks again for that.

> [1] Our goal is to entirely remove refcounting operations when we can prove 
> we are operating on an immortal object. We can prove it in a couple cases: 
> sometimes simply, such as in Py_RETURN_NONE, but mostly our JIT will often 
> know the immortality of objects it embeds into the code. So if we can prove 
> statically that an object is immortal then we elide the incref/decrefs, and 
> if we can't then we use an unmodified Py_INCREF/Py_DECREF. This means that 
> our reference counts on immortal objects will change, so we detect 
> immortality by checking if the reference count is at least half of the 
> original very-high value.

FWIW, we anticipate that we can take a similar approach in CPython's
eval loop, specializing for immortal objects.  We are also updating
Py_RETURN_NONE, etc. to stop incref'ing.

-eric
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CDBGYUDROQZNEM6LAREIEKSZSQ72BLOH/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Eric Snow
Thanks for the feedback.  My responses are inline below.

-eric


On Wed, Feb 16, 2022 at 6:36 AM Petr Viktorin  wrote:
> Thank you very much for writing this down! It's very helpful to see a
> concrete proposal, and the current state of this idea.
> I like the change,

That's good to hear. :)

> but I think it's unfortunately more complicated than
> the PEP suggests.

That would be unsurprising. :)

> > This proposal is CPython-specific and, effectively, describes
> > internal implementation details.
>
> I think that is a naïve statement. Refcounting is
> implementation-specific, but it's hardly an *internal* detail.

Sorry for any confusion.  I didn't mean to say that refcounting is an
internal detail.  Rather, I was talking about how the proposed change
in refcounting behavior doesn't affect any guaranteed/documented
behavior, hence "internal".

Perhaps I missed some documented behavior?  I was going off the following:

* 
https://docs.python.org/3.11/c-api/intro.html#objects-types-and-reference-counts
* https://docs.python.org/3.11/c-api/structures.html#c.Py_REFCNT

> There is
> code that targets CPython specifically, and relies on the details.

Could you elaborate?  Do you mean such code relies on specific refcount values?

> The refcount has public getters and setters,

Agreed.  However, what behavior do users expect and what guarantees do
we make?  Do we indicate how to interpret the refcount value they
receive?  What are the use cases under which a user would set an
object's refcount to a specific value?  Are users setting the refcount
of objects they did not create?

> and you need a pretty good
> grasp of the concept to write a C extension.

I would not expect this to be affected by this PEP, except in cases
where users are checking/modifying refcounts for objects they did not
create (since none of their objects will be immortal).

> I think that it's safe to assume that this will break people's code,

Do you have some use case in mind, or an example?  From my perspective
I'm having a hard time seeing what this proposed change would break.

That said, Kevin Modzelewski indicated [1] that there were affected
cases for Pyston (though their change in behavior is slightly
different).

[1] 
https://mail.python.org/archives/list/python-dev@python.org/message/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/

> and
> this PEP should convince us that the breakage is worth it rather than
> dismiss the issue.

Sorry, I didn't mean to be dismissive.  I agree that if there is
breakage this PEP must address it.

> It would be good to note that “container” refers to the GC term, as in
> https://devguide.python.org/garbage_collector/#identifying-reference-cycles
>
> and not e.g.
> https://docs.python.org/3/library/collections.abc.html#collections.abc.Container

+1

> > This has a concrete impact on active projects in the Python community.
> > Below we describe several ways in which refcount modification has
> > a real negative effect on those projects.  None of that would
> > happen for objects that are truly immutable.
> >
> > Reducing Cache Invalidation
> > ---------------------------
>
> Explicitly saying “CPU cache” would make the PEP easier to skim.

+1

> > Every modification of a refcount causes the corresponding cache
> > line to be invalidated.  This has a number of effects.
> >
> > For one, the write must be propagated to other cache levels
> > and to main memory.  This has a small effect on all Python programs.
> > Immortal objects would provide a slight relief in that regard.
> >
> > On top of that, multi-core applications pay a price.  If two threads
> > are interacting with the same object (e.g. ``None``)  then they will
> > end up invalidating each other's caches with each incref and decref.
> > This is true even for otherwise immutable objects like ``True``,
> > ``0``, and ``str`` instances.  This is also true even with
> > the GIL, though the impact is smaller.
>
> This looks out of context. Python has a per-process GIL. Should it go
> after the next section?

This isn't about a data race.  I'm talking about how if an object is
active in two different threads (on distinct cores) then incref/decref
in one thread will invalidate the cache (line) in the other thread.
The only impact of the GIL in this case is that the two threads aren't
running simultaneously and the cache invalidation on the idle thread
has less impact.

Perhaps I've missed something?

> > The proposed solution is obvious enough that two people came to the
> > same conclusion (and implementation, more or less) independently.
>
> Who was it? Assuming it's not a secret :)

Me and Eddie. :)  I don't mind saying so.

> > In the case of per-interpreter GIL, the only realistic alternative
> > is to move all global objects into ``PyInterpreterState`` and add
> > one or more lookup functions to access them.  Then we'd have to
> > add some hacks to the C-API to preserve compatibility for the
> > many objects exposed there.  The story is much, much 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Eric Snow
On Wed, Feb 16, 2022 at 12:37 AM Inada Naoki  wrote:
> +1 for overall idea.

Great!

> > Also note that "fork" isn't the only operating system mechanism
> > that uses copy-on-write semantics.
>
> Could you elaborate? mmap, maybe?
> [snip]
> So if you know how to get benefit from CoW without fork, I want to know it.

Sorry if I got your hopes up.  Yeah, I was talking about mmap.

> > There will likely be others we have not enumerated here.
>
> How about interned strings?

Marking every interned string as immortal may make sense.

> Should the interned dict belong to the runtime or to the (sub)interpreter?
>
> If the interned dict belongs to the runtime, all interned strings should
> be immortal so they can be shared between subinterpreters.

Excellent questions.  Making immutable objects immortal is relatively
simple.  For the most part, mutable objects should not be shared
between interpreters without protection (e.g. the GIL).  The interned
dict isn't exposed to Python code or the C-API, so there's less risk,
but it still wouldn't work without cleverness.  So it should be
per-interpreter.  It would be nice if it were global though. :)
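
For readers less familiar with interning, here is a small sketch of what the interned dict provides as seen from Python code. It only uses `sys.intern()`; the helper name and the example string are made up for illustration, and nothing here shows how the dict is stored internally.

```python
import sys


def build(parts):
    """Build a string at runtime so it is not a compile-time constant."""
    return "".join(parts)


first = build(["immortal", "_", "candidate"])
second = build(["immortal", "_", "candidate"])

# sys.intern() returns the canonical copy held in the interned dict,
# so equal strings map to one shared object after interning.
interned_first = sys.intern(first)
interned_second = sys.intern(second)
same_object = interned_first is interned_second
```

That shared canonical object is what would need to be either immortal (if the dict were global) or duplicated per interpreter.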

> If the interned dict belongs to the interpreter, should we register
> immortalized strings with all interpreters?

That's a good point.  It may be worth doing something like that.

-eric
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VQYLSPHHP2EE2KPDWCXDLMBAXYAE72D3/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Terry Reedy

On 2/15/2022 7:10 PM, Eric Snow wrote:


* the naive implementation shows a 4% slowdown


Without understanding all the benefits, this seems a bit too much for 
me.  2% would be much better.



* we have a number of strategies that should reduce that penalty


I would like to see that before approving the PEP.


* without immortal objects, the implementation for per-interpreter GIL
will require a number of non-trivial workarounds


To me, that says to speed up immortality first.


That last one is particularly meaningful to me since it means we would
definitely miss the 3.11 feature freeze.


3 1/2 months from now.


With immortal objects, 3.11 would still be in reach.


Is it worth trying to rush it a bit?

--
Terry Jan Reedy

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A2HYQ7M7RH4SXEQBYECRQKAUH3FHOZC6/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Guido van Rossum
Thanks!

On Wed, Feb 16, 2022 at 11:19 AM Kevin Modzelewski  wrote:

> Importantly, our system allows for the reference count of immortal objects
> to change, as long as it doesn't go below half of the original very-high
> value. So extension code with no concept of immortality will still update
> the reference counts of immortal objects, but this is fine. Because of this
> we haven't seen any issues with extension modules.
>

In CPython we will *have* to allow this in order to support binary packages
built with earlier CPython versions (assuming they only use the stable
ABI). Those packages will necessarily use INCREF/DECREF macros that don't
check for the immortality bit. Yes, it will break COW, but nevertheless we
have to support the Stable ABI, and INCREF/DECREF are in the Stable ABI. If
you want COW you will have to compile such packages from source.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4OZLYDYN5Z6HNHQ654PF2IA5O6QH3TNU/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Kevin Modzelewski
fwiw Pyston has immortal objects, though with a slightly different goal and
thus design [1]. I'm not necessarily advocating for our design (it makes
most sense if there is a JIT involved), but just writing to report our
experience of making a change like this and the compatibility effects.

Importantly, our system allows for the reference count of immortal objects
to change, as long as it doesn't go below half of the original very-high
value. So extension code with no concept of immortality will still update
the reference counts of immortal objects, but this is fine. Because of this
we haven't seen any issues with extension modules.

The small amount of compatibility challenges we've run into have been in
testing code that checks for memory leaks. For example this code breaks on
Pyston:

import sys

def test():
    starting_refcount = sys.getrefcount(1)
    doABunchOfStuff()  # placeholder for the code under test
    assert sys.getrefcount(1) == starting_refcount

This might work with this PEP, but we've also seen code that asserts that
the refcount increases by a specific value, which I believe wouldn't.
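
To make the second pattern concrete, here is a minimal sketch of a "refcount grows by exactly N" check. The function name is hypothetical. For an ordinary object the delta equals the number of new references on CPython; for an object whose refcount is fixed, the same measurement would presumably report no change, which is the kind of assertion that would stop holding.

```python
import sys


def refcount_delta(obj, extra_refs):
    """Return how much sys.getrefcount(obj) grows while a list holds
    `extra_refs` additional references to obj."""
    before = sys.getrefcount(obj)
    holder = [obj] * extra_refs  # creates extra_refs new references
    after = sys.getrefcount(obj)
    del holder
    return after - before


# An ordinary, freshly created object: the delta matches on CPython.
mortal_delta = refcount_delta(object(), 5)

# A global singleton: the result depends on whether the running
# interpreter treats None as immortal, so it is only printed here.
print("delta for None:", refcount_delta(None, 5))
```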

For Pyston we've simply disabled these tests, figuring that our users still
have CPython to test on. Personally I consider this breakage to be small,
but I hadn't seen anyone mention the potential usage of sys.getrefcount()
so I thought I'd bring it up.

- kmod

[1] Our goal is to entirely remove refcounting operations when we can prove
we are operating on an immortal object. We can prove it in a couple cases:
sometimes simply, such as in Py_RETURN_NONE, but mostly our JIT will often
know the immortality of objects it embeds into the code. So if we can prove
statically that an object is immortal then we elide the incref/decrefs, and
if we can't then we use an unmodified Py_INCREF/Py_DECREF. This means that
our reference counts on immortal objects will change, so we detect
immortality by checking if the reference count is at least half of the
original very-high value.

On Tue, Feb 15, 2022 at 7:13 PM Eric Snow 
wrote:

> Eddie and I would appreciate your feedback on this proposal to support
> treating some objects as "immortal".  The fundamental characteristic
> of the approach is that we would provide stronger guarantees about
> immutability for some objects.
>
> A few things to note:
>
> * this is essentially an internal-only change:  there are no
> user-facing changes (aside from affecting any 3rd party code that
> directly relies on specific refcounts)
> * the naive implementation shows a 4% slowdown
> * we have a number of strategies that should reduce that penalty
> * without immortal objects, the implementation for per-interpreter GIL
> will require a number of non-trivial workarounds
>
> That last one is particularly meaningful to me since it means we would
> definitely miss the 3.11 feature freeze.  With immortal objects, 3.11
> would still be in reach.
>
> -eric
>
> ---
>
> PEP: 683
> Title: Immortal Objects, Using a Fixed Refcount
> Author: Eric Snow , Eddie Elizondo
> 
> Discussions-To: python-dev@python.org
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 10-Feb-2022
> Python-Version: 3.11
> Post-History:
> Resolution:
>
>
> Abstract
> ========
>
> Under this proposal, any object may be marked as immortal.
> "Immortal" means the object will never be cleaned up (at least until
> runtime finalization).  Specifically, the `refcount`_ for an immortal
> object is set to a sentinel value, and that refcount is never changed
> by ``Py_INCREF()``, ``Py_DECREF()``, or ``Py_SET_REFCNT()``.
> For immortal containers, the ``PyGC_Head`` is never
> changed by the garbage collector.
>
> Avoiding changes to the refcount is an essential part of this
> proposal.  For what we call "immutable" objects, it makes them
> truly immutable.  As described further below, this allows us
> to avoid performance penalties in scenarios that
> would otherwise be prohibitive.
>
> This proposal is CPython-specific and, effectively, describes
> internal implementation details.
>
> .. _refcount:
> https://docs.python.org/3.11/c-api/intro.html#reference-counts
>
>
> Motivation
> ==========
>
> Without immortal objects, all objects are effectively mutable.  That
> includes "immutable" objects like ``None`` and ``str`` instances.
> This is because every object's refcount is frequently modified
> as it is used during execution.  In addition, for containers
> the runtime may modify the object's ``PyGC_Head``.  This
> runtime-internal state currently prevents
> full immutability.
>
> This has a concrete impact on active projects in the Python community.
> Below we describe several ways in which refcount modification has
> a real negative effect on those projects.  None of that would
> happen for objects that are truly immutable.
>
> Reducing Cache Invalidation
> ---------------------------
>
> Every modification of a refcount causes the corresponding cache
> line to be invalidated.  This has a number of effects.
>
> For one, the write must be 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Larry Hastings


I experimented with this at the EuroPython sprints in Berlin years ago.  
I was sitting next to MvL, who had an interesting observation about it.  
He suggested(*) all the constants unmarshalled as part of loading a 
module should be "immortal", and if we could rejigger how we allocated 
them to store them in their own memory pages, that would dovetail nicely 
with COW semantics, cutting down on the memory use of preforked server 
processes.
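
As a small illustration of which objects this would cover: the constants loaded with a module live in its code object's `co_consts`, and a `.pyc` file stores that code object in marshal format. The sketch below round-trips a tiny, made-up module through `marshal` to mimic the load step; it shows only where the constants end up, not how they could be placed in separate memory pages.

```python
import marshal

# A tiny, made-up module body with two constants.
source = "GREETING = 'some constant'\nPOINT = (1, 2, 3)\n"
code = compile(source, "<example>", "exec")

# Round-trip through marshal, as happens when a cached module is loaded.
restored = marshal.loads(marshal.dumps(code))

# These unmarshalled objects are the candidates for immortality.
constants = restored.co_consts
```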



//arry/

(*) Assuming I remember what he said accurately, of course.  If any of 
this is dumb assume it's my fault.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E2AVH3BSINO7Z55BGQ47LSIE5VKTOGFB/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Petr Viktorin

On 16. 02. 22 1:10, Eric Snow wrote:

Eddie and I would appreciate your feedback on this proposal to support
treating some objects as "immortal".  The fundamental characteristic
of the approach is that we would provide stronger guarantees about
immutability for some objects.

A few things to note:

* this is essentially an internal-only change:  there are no
user-facing changes (aside from affecting any 3rd party code that
directly relies on specific refcounts)
* the naive implementation shows a 4% slowdown
* we have a number of strategies that should reduce that penalty
* without immortal objects, the implementation for per-interpreter GIL
will require a number of non-trivial workarounds

That last one is particularly meaningful to me since it means we would
definitely miss the 3.11 feature freeze.  With immortal objects, 3.11
would still be in reach.

-eric


Thank you very much for writing this down! It's very helpful to see a 
concrete proposal, and the current state of this idea.
I like the change, but I think it's unfortunately more complicated than 
the PEP suggests.





---

PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow , Eddie Elizondo

Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History:
Resolution:


Abstract
========

Under this proposal, any object may be marked as immortal.
"Immortal" means the object will never be cleaned up (at least until
runtime finalization).  Specifically, the `refcount`_ for an immortal
object is set to a sentinel value, and that refcount is never changed
by ``Py_INCREF()``, ``Py_DECREF()``, or ``Py_SET_REFCNT()``.
For immortal containers, the ``PyGC_Head`` is never
changed by the garbage collector.
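
The behavior described in this paragraph can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the sentinel value, the `ModelObject` class, and the `model_*` function names are all invented for illustration and do not correspond to the C implementation.

```python
from dataclasses import dataclass

# Assumed sentinel; the real value is an implementation detail.
IMMORTAL_SENTINEL = 1 << 62


@dataclass
class ModelObject:
    refcnt: int = 1
    deallocated: bool = False


def model_is_immortal(obj: ModelObject) -> bool:
    return obj.refcnt == IMMORTAL_SENTINEL


def model_incref(obj: ModelObject) -> None:
    if model_is_immortal(obj):
        return  # refcount of an immortal object is never written
    obj.refcnt += 1


def model_decref(obj: ModelObject) -> None:
    if model_is_immortal(obj):
        return  # never reaches zero, so never deallocated
    obj.refcnt -= 1
    if obj.refcnt == 0:
        obj.deallocated = True


def model_set_refcnt(obj: ModelObject, value: int) -> None:
    if model_is_immortal(obj):
        return  # explicit sets are ignored as well
    obj.refcnt = value
```

The extra branch in each operation is where the performance cost mentioned in the cover note would come from.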

Avoiding changes to the refcount is an essential part of this
proposal.  For what we call "immutable" objects, it makes them
truly immutable.  As described further below, this allows us
to avoid performance penalties in scenarios that
would otherwise be prohibitive.

This proposal is CPython-specific and, effectively, describes
internal implementation details.


I think that is a naïve statement. Refcounting is 
implementation-specific, but it's hardly an *internal* detail. There is 
code that targets CPython specifically, and relies on the details. The 
refcount has public getters and setters, and you need a pretty good 
grasp of the concept to write a C extension.
I think that it's safe to assume that this will break people's code, and 
this PEP should convince us that the breakage is worth it rather than 
dismiss the issue.




.. _refcount: https://docs.python.org/3.11/c-api/intro.html#reference-counts


Motivation
==========

Without immortal objects, all objects are effectively mutable.  That
includes "immutable" objects like ``None`` and ``str`` instances.
This is because every object's refcount is frequently modified
as it is used during execution.  In addition, for containers
the runtime may modify the object's ``PyGC_Head``.  This
runtime-internal state currently prevents
full immutability.


It would be good to note that “container” refers to the GC term, as in 
https://devguide.python.org/garbage_collector/#identifying-reference-cycles


and not e.g. 
https://docs.python.org/3/library/collections.abc.html#collections.abc.Container




This has a concrete impact on active projects in the Python community.
Below we describe several ways in which refcount modification has
a real negative effect on those projects.  None of that would
happen for objects that are truly immutable.

Reducing Cache Invalidation
---------------------------


Explicitly saying “CPU cache” would make the PEP easier to skim.


Every modification of a refcount causes the corresponding cache
line to be invalidated.  This has a number of effects.

For one, the write must be propagated to other cache levels
and to main memory.  This has a small effect on all Python programs.
Immortal objects would provide a slight relief in that regard.

On top of that, multi-core applications pay a price.  If two threads
are interacting with the same object (e.g. ``None``)  then they will
end up invalidating each other's caches with each incref and decref.
This is true even for otherwise immutable objects like ``True``,
``0``, and ``str`` instances.  This is also true even with
the GIL, though the impact is smaller.


This looks out of context. Python has a per-process GIL. Should it go 
after the next section?




Avoiding Data Races
-------------------

Speaking of multi-core, we are considering making the GIL
a per-interpreter lock, which would enable true multi-core parallelism.
Among other things, the GIL currently protects against races between
multiple threads that concurrently incref or decref.  Without a shared
GIL, two running interpreters could not safely share any objects,
even otherwise immutable ones like ``None``.

This means that, to have a 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-15 Thread Inada Naoki
+1 for overall idea.

Some comments:

>
> Also note that "fork" isn't the only operating system mechanism
> that uses copy-on-write semantics.
>

Could you elaborate? mmap, maybe?

Generally speaking, fork is very difficult to use safely.
My company's web apps load applications and libraries *after* fork,
not *before* fork, for safety.
We changed multiprocessing to use spawn by default on macOS.
So I don't recommend fork to most Python users.

So if you know how to get benefit from CoW without fork, I want to know it.

>
> Immortal Global Objects
> -----------------------
>
> The following objects will be made immortal:
>
> * singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
> * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
> * all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
>   small ints)
>
> There will likely be others we have not enumerated here.
>

How about interned strings?
Should the interned dict belong to the runtime or to the (sub)interpreter?

If the interned dict belongs to the runtime, all interned strings should
be immortal so they can be shared between subinterpreters.
If the interned dict belongs to the interpreter, should we register
immortalized strings with all interpreters?

Regards,
-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DQV6ECSUB2VD2EXX6CVCC45RJA6NR2ZZ/