[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-09 Thread Petr Viktorin

On 09. 03. 22 4:58, Eric Snow wrote:

On Mon, Feb 28, 2022 at 6:01 PM Eric Snow  wrote:

The updated PEP text is included below.  The largest changes involve
either the focus of the PEP (internal mechanism to mark objects
immortal) or the possible ways that things can break on older 32-bit
stable ABI extensions.  All other changes are smaller.


In particular, I'm hoping to get your thoughts on the "Accidental
De-Immortalizing" section.  While I'm confident we will find a good
solution, I'm not yet confident about the specific solution.  So
feedback would be appreciated.  Thanks!


Hi,
I like the newest version, except this one section is concerning.


"periodically reset the refcount for immortal objects (only enable this 
if a stable ABI extension is imported?)" -- that sounds quite expensive, 
both at runtime and maintenance-wise.


"provide a runtime flag for disabling immortality" also doesn't sound 
workable to me. We'd essentially need to run all tests twice every time 
to make sure it stays working.



"Special-casing immortal objects in tp_dealloc() for the relevant types 
(but not int, due to frequency?)" sounds promising.


The "relevant types" are those for which we skip calling incref/decref 
entirely, like in Py_RETURN_NONE. This skipping is one of the optional 
optimizations, so we're entirely in control of if/when to apply it. How 
much would it slow things back down if it wasn't done for ints at all?




Some more reasoning for not worrying about de-immortalizing in types 
without this optimization:
These objects will be de-immortalized with refcount around 2^29, and 
then incref/decref go back to being paired properly. If 2^29 is much 
higher than the true reference count at de-immortalization, this'll just 
cause a memory leak at shutdown.
And it's probably OK to assume that the true reference count of an 
object can't be anywhere near 2^29: most of the time, to hold a 
reference you also need to have a pointer to the referenced object, and 
there ain't enough memory for that many pointers. This isn't a formally 
sound assumption, of course -- you can incref a million times with a 
single pointer if you pair the decrefs correctly. But it might be why we 
had no issues with "int won't overflow", an assumption which would fail 
with just 4× higher numbers.


Of course, this argument would apply to immortalization and 64-bit 
builds as well. I wonder if there are holes in it :)


Oh, and if the "Special-casing immortal objects in tp_dealloc()" way is 
valid, refcount values 1 and 0 can no longer be treated specially. 
That's probably not a practical issue for the relevant types, but it's 
one more thing to think about when applying the optimization.



There's also the other direction to consider: if an old stable-ABI 
extension does unpaired *increfs* on an immortal object, it'll 
eventually overflow the refcount.
When the refcount is negative, decref will currently crash if built with 
Py_DEBUG, and I think we want to keep that check/crash. (Note that 
either Python itself or any extension could be built with Py_DEBUG.)
Hopefully we can live with that, and hope anyone running with Py_DEBUG 
will send a useful use case report.

Or is there another bit before the sign this'll mess up?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7ZSLUMOIOV676UH42LIWGQASFMXBWSBN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 684: A Per-Interpreter GIL

2022-03-09 Thread Petr Viktorin

On 09. 03. 22 4:38, Eric Snow wrote:

I'd really appreciate feedback on this new PEP about making the GIL
per-interpreter.


Yay! Thank you!




The PEP targets 3.11, but we'll see if that is too close.  I don't
mind waiting one more
release, though I'd prefer 3.11 (obviously).  Regardless, I have no
intention of rushing
this through at the expense of cutting corners.  Hence, we'll see how it goes.


How mature is the implementation?

If it ends up in 3.12, I'd consider asking the release manager for an 
extra alpha release so people can start playing with the feature early.


(With my Fedora hat on: I'd love to test it with thousands of packages!)



The PEP text is included inline below.  Thanks!

-eric

===

PEP: 684
Title: A Per-Interpreter GIL
Author: Eric Snow 
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst


This iteration of the PEP should also have `Requires: 683` (Immortal 
Objects).



Created: 08-Mar-2022
Python-Version: 3.11
Post-History: 08-Mar-2022
Resolution:





Abstract
========

Since Python 1.5 (1997), CPython users can run multiple interpreters
in the same process.  However, interpreters in the same process
have always shared a significant
amount of global state.  This is a source of bugs, with a growing
impact as more and more people use the feature.  Furthermore,
sufficient isolation would facilitate true multi-core parallelism,
where interpreters no longer share the GIL.  The changes outlined in
this proposal will result in that level of interpreter isolation.


High-Level Summary
==================

At a high level, this proposal changes CPython in the following ways:

* stops sharing the GIL between interpreters, given sufficient isolation
* adds several new interpreter config options for isolation settings
* adds some public C-API for fine-grained control when creating interpreters
* keeps incompatible extensions from causing problems

The GIL
-------

The GIL protects concurrent access to most of CPython's runtime state.
So all that GIL-protected global state must move to each interpreter
before the GIL can.

(In a handful of cases, other mechanisms can be used to ensure
thread-safe sharing instead, such as locks or "immortal" objects.)

CPython Runtime State
---------------------

Properly isolating interpreters requires that most of CPython's
runtime state be stored in the ``PyInterpreterState`` struct.  Currently,
only a portion of it is; the rest is found either in global variables
or in ``_PyRuntimeState``.  Most of that will have to be moved.

This directly coincides with an ongoing effort (of many years) to greatly
reduce internal use of C global variables and consolidate the runtime
state into ``_PyRuntimeState`` and ``PyInterpreterState``.
(See `Consolidating Runtime Global State`_ below.)  That project has
`significant merit on its own `_
and has faced little controversy.  So, while a per-interpreter GIL
relies on the completion of that effort, that project should not be
considered a part of this proposal--only a dependency.

Other Isolation Considerations
------------------------------

CPython's interpreters must be strictly isolated from each other, with
few exceptions.  To a large extent they already are.  Each interpreter
has its own copy of all modules, classes, functions, and variables.
The CPython C-API docs `explain further `_.

.. _caveats: https://docs.python.org/3/c-api/init.html#bugs-and-caveats

However, aside from what has already been mentioned (e.g. the GIL),
there are a couple of ways in which interpreters still share some state.

First of all, some process-global resources (e.g. memory,
file descriptors, environment variables) are shared.  There are no
plans to change this.

Second, some isolation is faulty due to bugs or implementations that
did not take multiple interpreters into account.  This includes
CPython's runtime and the stdlib, as well as extension modules that
rely on global variables.  Bugs should be opened in these cases,
as some already have been.

Depending on Immortal Objects
-----------------------------

:pep:`683` introduces immortal objects as a CPython-internal feature.
With immortal objects, we can share any otherwise immutable global
objects between all interpreters.  Consequently, this PEP does not
need to address how to deal with the various objects
`exposed in the public C-API `_.
It also simplifies the question of what to do about the builtin
static types.  (See `Global Objects`_ below.)

Both issues have alternate solutions, but everything is simpler with
immortal objects.  If PEP 683 is not accepted then this one will be
updated with the alternatives.  This lets us reduce noise in this
proposal.


Motivation
==========

The fundamental problem we're solving here is a lack of true multi-core
parallelism (for Python code) in the CPython runtime.  The GIL is the
cause.  While it usually isn't a problem in practice, at the very least
it makes Python's multi-core story murky, which makes the GIL
a consistent distraction.

[...]

[Python-Dev] Re: PEP 684: A Per-Interpreter GIL

2022-03-09 Thread Petr Viktorin
Oops, I hit Send by mistake! Please disregard the previous message. (I
often draft questions I later find answered, so I delete them.)

On Wed, Mar 9, 2022 at 5:53 PM Petr Viktorin  wrote:
>
> On 09. 03. 22 4:38, Eric Snow wrote:
> > I'd really appreciate feedback on this new PEP about making the GIL
> > per-interpreter.
>
> Yay! Thank you!
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RLBJEE2MLXMJNN2R444AFZDN54JDRWI7/


[Python-Dev] Re: PEP 684: A Per-Interpreter GIL

2022-03-09 Thread Petr Viktorin

On 09. 03. 22 4:38, Eric Snow wrote:

I'd really appreciate feedback on this new PEP about making the GIL
per-interpreter.


Yay! Thank you!
This PEP definitely makes per-interpreter GIL sound possible :)



The PEP targets 3.11, but we'll see if that is too close.  I don't
mind waiting one more
release, though I'd prefer 3.11 (obviously).  Regardless, I have no
intention of rushing
this through at the expense of cutting corners.  Hence, we'll see how it goes.

The PEP text is included inline below.  Thanks!

-eric

===

PEP: 684
Title: A Per-Interpreter GIL
Author: Eric Snow 
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst


This iteration of the PEP should also have `Requires: 683` (Immortal 
Objects).


[...]


Motivation
==========

The fundamental problem we're solving here is a lack of true multi-core
parallelism (for Python code) in the CPython runtime.  The GIL is the
cause.  While it usually isn't a problem in practice, at the very least
it makes Python's multi-core story murky, which makes the GIL
a consistent distraction.

Isolated interpreters are also an effective mechanism to support
certain concurrency models.  :pep:`554` discusses this in more detail.

Indirect Benefits
-----------------

Most of the effort needed for a per-interpreter GIL has benefits that
make those tasks worth doing anyway:

* makes multiple-interpreter behavior more reliable
* has led to fixes for long-standing runtime bugs that otherwise
  hadn't been prioritized
* has been exposing (and inspiring fixes for) previously unknown
  runtime bugs
* has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
* has driven cleaner and more complete runtime finalization
* led to structural layering of the C-API (e.g. ``Include/internal``)
* also see `Benefits to Consolidation`_ below


Do you want to dig up some bpo examples, to make these more convincing 
to the casual reader?




Furthermore, much of that work benefits other CPython-related projects:

* performance improvements ("faster-cpython")
* pre-fork application deployment (e.g. Instagram)


Maybe say “e.g. with Instagram's Cinder” – both the household name and 
the project you can link to?



* extension module isolation (see :pep:`630`, etc.)
* embedding CPython


A lot of these points are duplicated in "Benefits to Consolidation" list 
below, maybe there'd be, ehm, benefits to consolidating them?


[...]

PEP 554
-------


Please spell out "PEP 554 (Multiple Interpreters in the Stdlib)", for 
people who don't remember the magic numbers but want to skim the table 
of contents.



:pep:`554` is strictly about providing a minimal stdlib module
to give users access to multiple interpreters from Python code.
In fact, it specifically avoids proposing any changes related to
the GIL.  Consider, however, that users of that module would benefit
from a per-interpreter GIL, which makes PEP 554 more appealing.


Rationale
=========

During initial investigations in 2014, a variety of possible solutions
for multi-core Python were explored, but each had its drawbacks
without simple solutions:

* the existing practice of releasing the GIL in extension modules
   * doesn't help with Python code
* other Python implementations (e.g. Jython, IronPython)
   * CPython dominates the community
* remove the GIL (e.g. gilectomy, "no-gil")
   * too much technical risk (at the time)
* Trent Nelson's "PyParallel" project
   * incomplete; Windows-only at the time
* ``multiprocessing``

   * too much work to make it effective enough;
 high penalties in some situations (at large scale, Windows)

* other parallelism tools (e.g. dask, ray, MPI)
   * not a fit for the stdlib
* give up on multi-core (e.g. async, do nothing)
   * this can only end in tears


This list doesn't render correctly in ReST, you need blank lines everywhere.
There are more cases like this below.

[...]

Per-Interpreter State
---------------------

The following runtime state will be moved to ``PyInterpreterState``:

* all global objects that are not safely shareable (fully immutable)
* the GIL
* mutable, currently protected by the GIL


Spelling out “mutable state” in these lists would make this clearer, 
since “state” isn't elided from all the points.



* mutable, currently protected by some other per-interpreter lock
* mutable, may be used independently in different interpreters


This includes extension modules (with multi-phase init), right?


* all other mutable (or effectively mutable) state
   not otherwise excluded below

Furthermore, a number of parts of the global state have already been
moved to the interpreter, such as GC, warnings, and atexit hooks.

The following state will not be moved:

* global objects that are safely shareable, if any
* immutable, often ``const``
* treated as immutable


Do you have an example for this?


* related to CPython's ``main()`` execution
* related to the REPL


Would “only used by” work instead of “related to”?

[...]

[Python-Dev] Re: Defining tiered platform support

2022-03-09 Thread Charalampos Stratakis
On Fri, Mar 4, 2022 at 9:49 AM Christian Heimes 
wrote:

> Hi Brett,
>
> thanks for starting the discussion! Much appreciated.
>
> On 04/03/2022 00.30, Brett Cannon wrote:
> > Tier 1 is the stuff we run CI against: latest Windows, latest macOS,
> > Linux w/ the latest glibc (I don't know of a better way to define Linux
> > support as I don't know if a per-distro list is the right abstraction).
> > These are platforms we won't even let code be committed for if they
> > would break; they block releases if they don't work. These platforms we
> > all implicitly promise to support.
>  >
> > Tier 2 is the platforms we would revert a change within 24 hours if they
> > broke: latest FreeBSD, older Windows, older macOS, Linux w/ older
> > glibc. This is historically the "stable buildbot plus a core dev" group
> > of platforms. The change I would like to see is two core devs (in case
> > one is on vacation), and a policy as to how a platform ends up here
> > (e.g. SC must okay it based on consensus of everyone). The stable
> > buildbot would still be needed to know if a release is blocked as we
> > would hold a release up if they were red. The platform and the core devs
> > supporting these platforms would be listed in PEP 11.
>
> I would like to see an explicit statement about glibc compatibility.
> glibc's API and ABI is very stable. We have autoconf feature checks for
> newer glibc features, so I'm not overly concerned with breaking
> compatibility with glibc. Anyhow we should also ensure that we are
> backwards compatible with older glibc releases that are commonly used in
> the community.
>
> Therefore I propose that we target the oldest manylinux standard
> accepted by PyPI, for which the operating system has not reached its
> EOL. At the moment this is manylinux2014, aka CentOS 7, with EOL June 
> 2024. We could also state that we aim for compatibility with oldest
> Debian Stable and Ubuntu LTS with standard, free security updates. As of
> today, Debian 10 Buster and Ubuntu 18.04 Bionic are the oldest versions with 
> regular updates.
>
>
> Apropos libc, what is our plan concerning musl libc (Alpine)? It's a
> popular distro for containers. CPython's test suite is failing on latest
> Alpine (https://bugs.python.org/issue46390). Some of the problems seem
> to be caused by issues or missing features in musl libc. I'd like to see 
> the problems fixed before we claim basic support for Alpine.
>
>
> > I would expect PEP 11 to list the appropriate C symbol that's set for
> > that platform, e.g. __linux__.
>
> For POSIX-like OS I would rather follow the example of Rust and use
> platform target triplet. The triplet encodes machine (CPU arch), vendor,
> and operating system. The OS part can encode libc. For example
> x86_64-*-linux-gnu for "x86_64 arch", "any vendor", and "Linux with GNU 
> libc (glibc)". Commands like ./config.guess or gcc -dumpmachine return
> the current triplet.
>
> The target triplet is used by autoconf's ./configure script a lot.
>
>
> > I don't know if we want to bother listing CPU architectures since we are
> > a pure C project which makes CPU architecture less of a thing, but I'm
> > personally open to the idea of CPU architectures being a part of the
> > platform definition.
>
> I strongly recommend that we include machine architecture, too. We have
> some code that uses machine specific instructions or features, e.g.
> unaligned memory access. We cannot debug problems on CPU archs unless we
> have access to the hardware.
>
>
Agreed, there have been various architecture specific bugs in the past and
the buildbots provide good coverage in that respect.


>
> > I don't think we should cover C compilers here as that's covered by PEP
> > 7. Otherwise PEP 7 should only list C versions/features and PEP 11 lists
> > compilers and their versions.
>
> We should say something about compilers. I wouldn't list compiler
> versions, though. Compiler features like C99 support should be sufficient.
>
> Do we target the platform's default compiler or are we targeting the
> latest compiler that is officially supported for the platform? CentOS 7
> comes with an old GCC, but has newer GCC versions in SCL (Developer
> Toolset 8). I'm asking because CentOS 7's default gcc does not support
> stdatomic.h. The official manylinux2014 OSCI container image ships GCC
> from devtoolset-8.
>
>
That's an interesting question, and RHEL7 is a bit of a special case. If
mimalloc, for example, is to be used in CPython, RHEL7/CentOS7 support is
out of the question with regard to its default compiler. I've already
changed the config of some RHEL7 buildbots to use a later GCC version
through the Developer Toolset 8, i.e. GCC 8.

The latest Python shipped through Red Hat Software Collection channels in
RHEL7 is Python 3.8, built using Developer Toolset 9 (GCC 9).

However, David Edelsohn and I are the only ones providing RHEL7 buildbots,
so coordinating a change to all the configs to use a later GCC version
should be easy enough.

Another thing

[Python-Dev] Re: PEP 684: A Per-Interpreter GIL

2022-03-09 Thread Eric Snow
Thanks for the feedback, Petr!  Responses inline below.

-eric

On Wed, Mar 9, 2022 at 10:58 AM Petr Viktorin  wrote:
> This PEP definitely makes per-interpreter GIL sound possible :)

Oh good. :)

> > PEP: 684
> > Title: A Per-Interpreter GIL
> > Author: Eric Snow 
> > Discussions-To: python-dev@python.org
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
>
> This iteration of the PEP should also have `Requires: 683` (Immortal
> Objects).

+1

> > Most of the effort needed for a per-interpreter GIL has benefits that
> > make those tasks worth doing anyway:
> >
> > * makes multiple-interpreter behavior more reliable
> > * has led to fixes for long-standing runtime bugs that otherwise
> >   hadn't been prioritized
> > * has been exposing (and inspiring fixes for) previously unknown
> >   runtime bugs
> > * has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
> > * has driven cleaner and more complete runtime finalization
> > * led to structural layering of the C-API (e.g. ``Include/internal``)
> > * also see `Benefits to Consolidation`_ below
>
> Do you want to dig up some bpo examples, to make these more convincing
> to the casual reader?

Heh, the casual reader isn't really my target audience. :)  I actually
have a stockpile of links but left them all out until they were
needed.  Would the decision-makers benefit from the links?  I'm trying
to avoid adding to the already sizeable clutter in this PEP. :)  I'll
add some links in if you think it matters.

> > Furthermore, much of that work benefits other CPython-related projects:
> >
> > * performance improvements ("faster-cpython")
> > * pre-fork application deployment (e.g. Instagram)
>
> Maybe say “e.g. with Instagram's Cinder” – both the household name and
> the project you can link to?

+1

Note that Instagram isn't exactly using Cinder.  I'll have to check if
Cinder uses the pre-fork model.

> > * extension module isolation (see :pep:`630`, etc.)
> > * embedding CPython
>
> A lot of these points are duplicated in "Benefits to Consolidation" list
> below, maybe there'd be, ehm, benefits to consolidating them?

There shouldn't be any direct overlap.

FWIW, the whole "Extra Context" section is essentially a separate PEP
that I inlined (with the caveat that it really isn't worth its own
PEP).  I'm still considering yanking it, so the above list should
stand on its own.

> > PEP 554
> > ---
>
> Please spell out "PEP 554 (Multiple Interpreters in the Stdlib)", for
> people who don't remember the magic numbers but want to skim the table
> of contents.

+1

> This list doesn't render correctly in ReST, you need blank lines everywhere.
> There are more cases like this below.

Hmm, I had blank lines and the PEP editor told me I needed to remove them.

> [...]
> > Per-Interpreter State
> > ---------------------
> >
> > The following runtime state will be moved to ``PyInterpreterState``:
> >
> > * all global objects that are not safely shareable (fully immutable)
> > * the GIL
> > * mutable, currently protected by the GIL
>
> Spelling out “mutable state” in these lists would make this clearer,
> since “state” isn't elided from all the points.

+1

> > * mutable, currently protected by some other per-interpreter lock
> > * mutable, may be used independently in different interpreters
>
> This includes extension modules (with multi-phase init), right?

Yep.

> > The following state will not be moved:
> >
> > * global objects that are safely shareable, if any
> > * immutable, often ``const``
> > * treated as immutable
>
> Do you have an example for this?

Strings (PyUnicodeObject) actually cache some info, making them not
strictly immutable, but they are close enough to be treated as such.
I'll add a note to the PEP.

> > * related to CPython's ``main()`` execution
> > * related to the REPL
>
> Would “only used by” work instead of “related to”?

Sure.

> > * set during runtime init, then treated as immutable
>
> `main()`, REPL and runtime init look like special cases of functionality
> that only runs in one interpreter. If it's so, maybe generalize this?

+1

> > * ``_PyInterpreterConfig``
> > * ``_Py_NewInterpreter()`` (as ``Py_NewInterpreterEx()``)
>
> Since the API is not documented (and _PyInterpreterConfig is not even in
> main yet!), it would be good to sketch out the docs (intended behavior)
> here.

+1

> > The following fields will be added to ``PyInterpreterConfig``:
> >
> > * ``own_gil`` - (bool) create a new interpreter lock
> >(instead of sharing with the main interpreter)
>
> As a user of the API, what should I consider when setting this flag?
> Would the GIL be shared with the *parent* interpreter or the main one?

The GIL would be shared with the main interpreter.  I state that there
but it looks like I wasn't clear enough.

> What are the restrictions/implications of this flag?

Good point.  I'll add a brief explanation of why you would want to
keep sharing the GIL (e.g. the status quo) and what is different if

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-09 Thread Jim J. Jewett
> "periodically reset the refcount for immortal objects (only enable this
> if a stable ABI extension is imported?)" -- that sounds quite expensive, 
> both at runtime and maintenance-wise.

As I understand it, the plan is to represent an immortal object by setting two 
high-order bits to 1.  The higher bit is the actual test, and the one 
representing half of that is a safety margin.

When reducing the reference count, CPython already checks whether the 
refcount's new value is 0.  It could instead check whether 
refcount & ~immortal_bit is 0, which would detect when the safety margin 
has been reduced to 0 -- and could then add it back in.  Since the bit 
manipulation is not conditional, the only extra branch will occur when an 
object is about to be de-allocated, and that might be rare enough to be an 
acceptable cost.  (It still doesn't prevent rollover from too many 
increfs, but ... that should indeed be rare in the wild.)

-jJ
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O324Q4KMMXL2UHOQIZZWS52U7YHJGYEI/


[Python-Dev] Re: PEP 684: A Per-Interpreter GIL

2022-03-09 Thread Carl Meyer
Hi Eric, just one note:

On Wed, Mar 9, 2022 at 7:13 PM Eric Snow  wrote:
> > Maybe say “e.g. with Instagram's Cinder” – both the household name and
> > the project you can link to?
>
> +1
>
> Note that Instagram isn't exactly using Cinder.

This sounds like a misunderstanding somewhere. Instagram server is
"exactly using Cinder" :)

>  I'll have to check if  Cinder uses the pre-fork model.

It doesn't really make sense to ask whether "Cinder uses the pre-fork
model" -- Cinder is just a CPython variant, it can work with all the
same execution models CPython can. Instagram server uses Cinder with a
pre-fork execution model. Some other workloads use Cinder without
pre-forking.

Carl
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5A3E6VCEY5XZXEFPGHNGKPM3HXQEJRTX/