[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant
On Mon, Nov 28, 2022 at 6:45 PM Steven D'Aprano wrote:
> On Tue, Nov 29, 2022 at 01:34:54PM +1300, Greg Ewing wrote:
> > I got the impression that there were some internal language reasons
> > to want stable dicts, e.g. so that the class dict passed to __prepare__
> > preserves the order in which names are assigned in the class body. Are
> > there any such use cases for stable sets?
>
> Some people wanted order preserving kwargs, I think for web frameworks.
> There was even a discussion for a while about using OrderedDict for
> kwargs and leaving dicts unordered.

See https://peps.python.org/pep-0468/ (kwargs) and https://peps.python.org/pep-0520/ (class definition body).

I re-implemented OrderedDict in C for this purpose. Literally right after I had finished that, Inada-san showed up with his compact dict implementation. Many of us were at the first core sprint at the time and there was a lot of excitement about compact dict. It was merged right away (for 3.6) and there was quick agreement that we could depend on dict insertion ordering internally (for a variety of use cases, IIRC). Thus, suddenly both my PEPs were effectively implemented, so we marked them as approved and moved on.

FWIW, making the insertion ordering an official part of the language happened relatively soon afterward, though for 3.7, not 3.6. [1] I'm pretty sure there's a python-dev thread about that. The stdtypes docs were updated [2] soon after, and we finally got around to updating the language [3] a couple of years later.
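The ordering guarantees those two PEPs asked for are all observable today in any Python 3.7+; a quick sketch (the names `show_kwargs` and `Example` are just illustrative, not anything from the PEPs):

```python
# Since 3.7, dict insertion ordering is a language guarantee, which
# covers both the kwargs case from PEP 468 and the class-definition
# body case from PEP 520 without needing OrderedDict.

def show_kwargs(**kwargs):
    # kwargs preserves the order the arguments were passed in
    return list(kwargs)

assert show_kwargs(b=1, a=2, c=3) == ["b", "a", "c"]

class Example:
    # the class dict records names in assignment order
    z = 1
    a = 2
    m = 3

assert [n for n in vars(Example) if not n.startswith("__")] == ["z", "a", "m"]
```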
-eric

[1] https://docs.python.org/3/whatsnew/3.7.html#summary-release-highlights
[2] https://bugs.python.org/issue33609
[3] https://bugs.python.org/issue39879

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5QYN66BWHO4GHWD34DIY43NLBMAM4UPZ/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Switching to Discourse
On Thu, Jul 21, 2022 at 12:19 AM Stefan Behnel wrote:
> I'm actually reading python-dev, c.l.py etc. through Gmane, and have done
> that ever since I joined. Simply because it's a mailing list of which I
> don't need a local (content) copy, and wouldn't want one. Gmane seems to
> have a complete archive that's searchable, regardless of "when I subscribed".

+1

> It's really sad that Discourse lacks an NNTP interface. There's an
> unmaintained bridge to NNTP servers [1], but not an emulating interface
> that would serve the available discussions via NNTP messages, so that users
> can get them into their NNTP/Mail clients to read them in proper discussion
> threads. I think adding that next to the existing web interface would serve
> everyone's needs just perfectly.

Perhaps the possible mirroring-to-mailman that Steve (Turnbull) mentioned would be enough to provide continuity for NNTP?

-eric
[Python-Dev] Re: Switching to Discourse
On Mon, Jul 18, 2022 at 11:48 AM wrote:
> LLVM did the same recently (though they imported all previous messages from
> the mailing list, thus making them searchable in Discourse) [2 - announcement;
> 3 - retro], and by and large, I think it was a success.
>
> One of the comments in the retro was:
> > Searching the archives is much easier and has found me many old threads
> > that I probably would have had trouble finding before, since I haven't been
> > subscribed for that long.
>
> I think it would be worth considering importing the mailing list into a
> separate Discourse category that's then archived, but at least searchable.
> This would also lower the hurdle for new(er) contributors to investigate
> previous discussion on a given topic.

+1

-eric
[Python-Dev] Re: Switching to Discourse
On Fri, Jul 15, 2022 at 5:21 AM Petr Viktorin wrote:
> The Steering Council would like to switch from python-dev to
> discuss.python.org.

This seems like a net win for the community, so +1 from me. (For me personally it amounts to disruption with little advantage, so I'd probably be -0. However, I am not python-dev, and discuss.python.org is probably a better fit for most of the participants.)

(Message threading on discuss.python.org feels like a step backward in usability, though. This is especially true with long threads, support for which (I expect) Discourse has not prioritized.)

My only real concern is one I've brought up before, when we started splitting discussions onto DPO (discuss.python.org), as well as with the GitHub issues migration: message archives. I consider the ability to search message archives to be essential to effective contribution, both in attracting/integrating new contributors and in providing "offline" context for active contributors. The existing archives have aided me personally so many times in both ways.

There are three relevant aspects to archival and search that are worth asking about here:

1. search functionality on the [archive] web site
2. ability to search using other tools (e.g. my favorite: Google search with "site:...")
3. single archive vs. split archive

Regarding (1), currently it is relatively easy to search through message archives on https://mail.python.org/archives/list/. The DPO UI search functionality seems fine.

Regarding (2), currently it's easy to search using other tools and the results are clean (not noisy). With DPO, is that possible? (A quick attempt was a complete failure.) Would the results be good enough? Would they be noisier?

Regarding (3), it's a small thing but, IMHO, having a single archive is valuable. Most notably (for me, at least), with a split archive it becomes a little harder to make sure searches cover the full message history of a given channel.
It would be nice if at least one of the sites could preserve *all* the history. In the case of python-dev, either we'd forward all relevant DPO messages to python-dev@python.org (or otherwise directly send them to https://mail.python.org/archives/list/python-dev@python.org) or we'd import the archived mailing list into DPO. Or maybe it would require more work than it would be worth?

> - You can use discuss.python.org's “mailing list mode” (which subscribes
> you to all new posts), possibly with filtering and/or categorizing
> messages locally.

FWIW, I've been using mailing list mode (for consumption) since we started discuss.python.org and it's been fine. I've hit a couple of minor annoyances [1][2], but overall I don't have any real complaints. Mailing list mode is straightforward to configure, the messages have a "mailing list" header set (for easy filtering), and jumping over to the web UI to start a thread or respond (or react) is trivial.

-eric

[1] My mobile email notifications format the messages oddly.
[2] The messages are significantly noisier than regular (text) email.
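The local filtering described above needs nothing beyond the stdlib. Here's a minimal sketch; note that the header name (`List-Id`) and its value are assumptions for illustration -- check the raw headers of a real message from the Discourse instance:

```python
from email.parser import Parser

# A made-up message carrying the kind of list header Discourse's
# mailing-list mode sets (exact header name/value are assumptions).
raw = """\
From: someone@example.com
List-Id: Core Development <core-dev.discuss.python.org>
Subject: Re: Switching to Discourse

body text
"""

msg = Parser().parsestr(raw)

def folder_for(msg, default="inbox"):
    """Pick a local folder based on the List-Id header, if any."""
    list_id = msg.get("List-Id", "")
    if "discuss.python.org" in list_id:
        return list_id.split("<")[0].strip()  # e.g. the category name
    return default

assert folder_for(msg) == "Core Development"
```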
[Python-Dev] Re: Switching to Discourse
On Fri, Jul 15, 2022 at 12:15 PM Barry Warsaw wrote:
> I agree that the experiment has proven successful enough that there’s more
> value at this point in consolidating discussions.

We've only been running this experiment since 2017(?), so maybe it's too soon to say it's a success?

-eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)
responses inline

-eric

On Wed, Mar 9, 2022 at 8:23 AM Petr Viktorin wrote:
> "periodically reset the refcount for immortal objects (only enable this
> if a stable ABI extension is imported?)" -- that sounds quite expensive,
> both at runtime and maintenance-wise.

Are you talking just about "(only enable this if a stable ABI extension is imported?)"? Such a check could certainly be expensive, but it doesn't have to be.

However, I'm guessing that you are actually talking about the mechanism to periodically reset the refcount. The actual periodic reset doesn't seem like it needs to be all that expensive overall. It would just need to be in a place that gets triggered often enough, but not so often that the extra cost of resetting the refcount would be a problem.

One important factor is whether we need to worry about potential de-immortalization for all immortal objects or only for a specific subset, like the most commonly used objects (at least, most commonly used by the problematic older stable ABI extensions). Mostly, we only need to be concerned with the objects that are likely to trigger de-immortalization in those extensions. Realistically, there aren't many potential immortal objects that would be exposed to the de-immortalization problem (e.g. None, True, False), so we could limit this workaround to them.

A variety of options come to mind. In each case we would reset the refcount of a given object if it is immortal. (We would also only do so if the refcount actually changed--to avoid cache invalidation and copy-on-write.)

If we need to worry about *all* immortal objects then I see several options:

1. in a single place where stable ABI extensions are likely to pass all objects often enough
2. in a single place where all objects pass through often enough

On the other hand, if we only need to worry about a fixed set of objects, the following options come to mind:

1. in a single place that is likely to be called by older stable ABI extensions
2.
in a place that runs often enough, targeting a hard-coded group of immortal objects (common static globals like None)
   * perhaps in the eval breaker code, in exception handling, etc.
3. like (2), but rotate through subsets of the hard-coded group (to reduce the overall cost)
4. like (2), but spread out in type-specific code (e.g. static types could be reset in type_dealloc())

Again, none of those should be in code that runs so often that the overhead would add up.

> "provide a runtime flag for disabling immortality" also doesn't sound
> workable to me. We'd essentially need to run all tests twice every time
> to make sure it stays working.

Yeah, that makes it not worth it.

> "Special-casing immortal objects in tp_dealloc() for the relevant types
> (but not int, due to frequency?)" sounds promising.
>
> The "relevant types" are those for which we skip calling incref/decref
> entirely, like in Py_RETURN_NONE. This skipping is one of the optional
> optimizations, so we're entirely in control of if/when to apply it.

We would definitely do it for those types. NoneType and bool already have a tp_dealloc that calls Py_FatalError() if triggered. The tp_dealloc for str & tuple have special casing for some singletons that do likewise. In PyType_Type.tp_dealloc we have a similar assert for static types. In each case we would instead reset the refcount to the initial immortal value.

Regardless, in practice we may only need to worry (as noted above) about the problem for the most commonly used global objects, so perhaps we could stop there. However, it depends on the level of risk, such that it would warrant incurring additional potential performance/maintenance costs. What is the likelihood of actual crashes due to pathological de-immortalization in older stable ABI extensions? I don't have a clear answer to offer on that, but I'd only expect it to be a problem if such extensions are used heavily in (very) long-running processes.
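To make the reset idea concrete, here is a toy model in Python (illustration only -- the real mechanism would live in C inside the runtime, and the constant below is made up):

```python
# A toy model of the "periodically reset the refcount" workaround.
# Older stable-ABI extensions still do plain increfs/decrefs, so an
# immortal object's refcount can drift; resetting it from a
# hot-but-not-too-hot code path keeps it from ever reaching zero.

IMMORTAL_REFCNT = 2**30  # placeholder for whatever value marks immortality

class ToyObject:
    def __init__(self, immortal=False):
        self.immortal = immortal
        self.refcnt = IMMORTAL_REFCNT if immortal else 1

def maybe_reset(obj):
    """Called periodically (e.g. from the eval breaker in the real design)."""
    # Only write if the refcount actually drifted, to avoid needless
    # cache invalidation / copy-on-write.
    if obj.immortal and obj.refcnt != IMMORTAL_REFCNT:
        obj.refcnt = IMMORTAL_REFCNT

none_like = ToyObject(immortal=True)
# an old extension pairs its decrefs but skips the immortality check:
for _ in range(1000):
    none_like.refcnt -= 1
maybe_reset(none_like)
assert none_like.refcnt == IMMORTAL_REFCNT
```

The key detail is the guard against unnecessary writes, which preserves the cache and copy-on-write benefits that motivate immortality in the first place.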
> How much would it slow things back down if it wasn't done for ints at all?

I'll look into that. We're talking about the ~260 small ints, so it depends on how much they are used relative to all the other int objects that are used in a program.

> Some more reasoning for not worrying about de-immortalizing in types
> without this optimization:
> These objects will be de-immortalized with refcount around 2^29, and
> then incref/decref go back to being paired properly. If 2^29 is much
> higher than the true reference count at de-immortalization, this'll just
> cause a memory leak at shutdown.
> And it's probably OK to assume that the true reference count of an
> object can't be anywhere near 2^29: most of the time, to hold a
> reference you also need to have a pointer to the referenced object, and
> there ain't enough memory for that many pointers. This isn't a formally
> sound assumption, of course -- you can incref a million times with a
> single pointer if you pair the decrefs correctly. But it
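The back-of-the-envelope arithmetic behind that 2^29 argument is worth spelling out:

```python
# If every "true" reference needs at least a pointer somewhere in
# memory, then 2**29 live references imply 2**29 stored pointers:
refs = 2**29
bytes_per_pointer = 8  # 64-bit build
total = refs * bytes_per_pointer
assert total == 4 * 2**30  # 4 GiB just for the pointers themselves
```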
[Python-Dev] Re: PEP 684: A Per-Interpreter GIL
On Wed, Mar 9, 2022 at 7:37 PM Carl Meyer wrote:
> > Note that Instagram isn't exactly using Cinder.
>
> This sounds like a misunderstanding somewhere. Instagram server is
> "exactly using Cinder" :)

:) Thanks for clarifying, Carl.

> > I'll have to check if Cinder uses the pre-fork model.
>
> It doesn't really make sense to ask whether "Cinder uses the pre-fork
> model" -- Cinder is just a CPython variant; it can work with all the
> same execution models CPython can. Instagram server uses Cinder with a
> pre-fork execution model. Some other workloads use Cinder without
> pre-forking.

+1

-eric
[Python-Dev] Re: PEP 684: A Per-Interpreter GIL
Thanks for the feedback, Petr! Responses inline below.

-eric

On Wed, Mar 9, 2022 at 10:58 AM Petr Viktorin wrote:
> This PEP definitely makes per-interpreter GIL sound possible :)

Oh good. :)

> > PEP: 684
> > Title: A Per-Interpreter GIL
> > Author: Eric Snow
> > Discussions-To: python-dev@python.org
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
>
> This iteration of the PEP should also have `Requires: 683` (Immortal
> Objects).

+1

> > Most of the effort needed for a per-interpreter GIL has benefits that
> > make those tasks worth doing anyway:
> >
> > * makes multiple-interpreter behavior more reliable
> > * has led to fixes for long-standing runtime bugs that otherwise
> >   hadn't been prioritized
> > * has been exposing (and inspiring fixes for) previously unknown
> >   runtime bugs
> > * has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
> > * has driven cleaner and more complete runtime finalization
> > * led to structural layering of the C-API (e.g. ``Include/internal``)
> > * also see `Benefits to Consolidation`_ below
>
> Do you want to dig up some bpo examples, to make these more convincing
> to the casual reader?

Heh, the casual reader isn't really my target audience. :) I actually have a stockpile of links but left them all out until they were needed. Would the decision-makers benefit from the links? I'm trying to avoid adding to the already sizeable clutter in this PEP. :) I'll add some links in if you think it matters.

> > Furthermore, much of that work benefits other CPython-related projects:
> >
> > * performance improvements ("faster-cpython")
> > * pre-fork application deployment (e.g. Instagram)
>
> Maybe say “e.g. with Instagram's Cinder” – both the household name and
> the project you can link to?

+1

Note that Instagram isn't exactly using Cinder. I'll have to check if Cinder uses the pre-fork model.

> > * extension module isolation (see :pep:`630`, etc.)
> > * embedding CPython
>
> A lot of these points are duplicated in the "Benefits to Consolidation" list
> below; maybe there'd be, ehm, benefits to consolidating them?

There shouldn't be any direct overlap. FWIW, the whole "Extra Context" section is essentially a separate PEP that I inlined (with the caveat that it really isn't worth its own PEP). I'm still considering yanking it, so the above list should stand on its own.

> > PEP 554
> > -------
>
> Please spell out "PEP 554 (Multiple Interpreters in the Stdlib)", for
> people who don't remember the magic numbers but want to skim the table
> of contents.

+1

> This list doesn't render correctly in ReST; you need blank lines everywhere.
> There are more cases like this below.

Hmm, I had blank lines and the PEP editor told me I needed to remove them.

[...]

> > Per-Interpreter State
> > ---------------------
> >
> > The following runtime state will be moved to ``PyInterpreterState``:
> >
> > * all global objects that are not safely shareable (fully immutable)
> > * the GIL
> > * mutable, currently protected by the GIL
>
> Spelling out “mutable state” in these lists would make this clearer,
> since “state” isn't elided from all the points.

+1

> > * mutable, currently protected by some other per-interpreter lock
> > * mutable, may be used independently in different interpreters
>
> This includes extension modules (with multi-phase init), right?

Yep.

> > The following state will not be moved:
> >
> > * global objects that are safely shareable, if any
> > * immutable, often ``const``
> > * treated as immutable
>
> Do you have an example for this?

Strings (PyUnicodeObject) actually cache some info, making them not strictly immutable, but they are close enough to be treated as such. I'll add a note to the PEP.

> > * related to CPython's ``main()`` execution
> > * related to the REPL
>
> Would “only used by” work instead of “related to”?

Sure.
> > * set during runtime init, then treated as immutable
>
> `main()`, REPL and runtime init look like special cases of functionality
> that only runs in one interpreter. If it's so, maybe generalize this?

+1

> > * ``_PyInterpreterConfig``
> > * ``_Py_NewInterpreter()`` (as ``Py_NewInterpreterEx()``)
>
> Since the API is not documented (and _PyInterpreterConfig is not even in
> main yet!), it would be good to sketch out the docs (intended behavior)
> here.

+1

> > The following fields will be added to ``PyInterpreterConfig``:
> >
> > * ``own_gil`` - (bool) create a new interpreter lock
> >   (instead of sharing with the main interpre
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)
On Mon, Feb 28, 2022 at 6:01 PM Eric Snow wrote:
> The updated PEP text is included below. The largest changes involve
> either the focus of the PEP (internal mechanism to mark objects
> immortal) or the possible ways that things can break on older 32-bit
> stable ABI extensions. All other changes are smaller.

In particular, I'm hoping to get your thoughts on the "Accidental De-Immortalizing" section. While I'm confident we will find a good solution, I'm not yet confident about the specific solution. So feedback would be appreciated.

Thanks!

-eric
[Python-Dev] PEP 684: A Per-Interpreter GIL
I'd really appreciate feedback on this new PEP about making the GIL per-interpreter. The PEP targets 3.11, but we'll see if that is too close. I don't mind waiting one more release, though I'd prefer 3.11 (obviously). Regardless, I have no intention of rushing this through at the expense of cutting corners. Hence, we'll see how it goes.

The PEP text is included inline below.

Thanks!

-eric

===

PEP: 684
Title: A Per-Interpreter GIL
Author: Eric Snow
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Mar-2022
Python-Version: 3.11
Post-History: 08-Mar-2022
Resolution:

Abstract
========

Since Python 1.5 (1997), CPython users can run multiple interpreters in the same process. However, interpreters in the same process have always shared a significant amount of global state. This is a source of bugs, with a growing impact as more and more people use the feature. Furthermore, sufficient isolation would facilitate true multi-core parallelism, where interpreters no longer share the GIL. The changes outlined in this proposal will result in that level of interpreter isolation.

High-Level Summary
==================

At a high level, this proposal changes CPython in the following ways:

* stops sharing the GIL between interpreters, given sufficient isolation
* adds several new interpreter config options for isolation settings
* adds some public C-API for fine-grained control when creating interpreters
* keeps incompatible extensions from causing problems

The GIL
-------

The GIL protects concurrent access to most of CPython's runtime state. So all that GIL-protected global state must move to each interpreter before the GIL can. (In a handful of cases, other mechanisms can be used to ensure thread-safe sharing instead, such as locks or "immortal" objects.)

CPython Runtime State
---------------------

Properly isolating interpreters requires that most of CPython's runtime state be stored in the ``PyInterpreterState`` struct.
Currently, only a portion of it is; the rest is found either in global variables or in ``_PyRuntimeState``. Most of that will have to be moved.

This directly coincides with an ongoing effort (of many years) to greatly reduce internal use of C global variables and consolidate the runtime state into ``_PyRuntimeState`` and ``PyInterpreterState``. (See `Consolidating Runtime Global State`_ below.)

That project has `significant merit on its own `_ and has faced little controversy. So, while a per-interpreter GIL relies on the completion of that effort, that project should not be considered a part of this proposal--only a dependency.

Other Isolation Considerations
------------------------------

CPython's interpreters must be strictly isolated from each other, with few exceptions. To a large extent they already are. Each interpreter has its own copy of all modules, classes, functions, and variables. The CPython C-API docs `explain further `_.

.. _caveats: https://docs.python.org/3/c-api/init.html#bugs-and-caveats

However, aside from what has already been mentioned (e.g. the GIL), there are a couple of ways in which interpreters still share some state.

First of all, some process-global resources (e.g. memory, file descriptors, environment variables) are shared. There are no plans to change this.

Second, some isolation is faulty due to bugs or implementations that did not take multiple interpreters into account. This includes CPython's runtime and the stdlib, as well as extension modules that rely on global variables. Bugs should be opened in these cases, as some already have been.

Depending on Immortal Objects
-----------------------------

:pep:`683` introduces immortal objects as a CPython-internal feature. With immortal objects, we can share any otherwise immutable global objects between all interpreters. Consequently, this PEP does not need to address how to deal with the various objects `exposed in the public C-API `_. It also simplifies the question of what to do about the builtin static types.
(See `Global Objects`_ below.)

Both issues have alternate solutions, but everything is simpler with immortal objects. If PEP 683 is not accepted then this one will be updated with the alternatives. This lets us reduce noise in this proposal.

Motivation
==========

The fundamental problem we're solving here is a lack of true multi-core parallelism (for Python code) in the CPython runtime. The GIL is the cause. While it usually isn't a problem in practice, at the very least it makes Python's multi-core story murky, which makes the GIL a consistent distraction.

Isolated interpreters are also an effective mechanism to support certain concurrency models. :pep:`554` discusses this in more detail.

Indirect Benefits
-----------------

Most of the effort needed for a per-interpreter GIL has benefits that make those tasks worth doing anyway:

* make
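The multi-core problem described in this Motivation section is easy to demonstrate: with a single shared GIL, CPU-bound Python threads gain little or nothing over running the same work sequentially. A small sketch (timings are machine-dependent, so none are hard-coded):

```python
import threading
import time

def spin(n=1_000_000):
    # pure-Python CPU-bound work; holds the GIL the whole time
    while n:
        n -= 1

start = time.perf_counter()
spin(); spin()
sequential = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=spin) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# With a shared GIL, the two-thread version is about as slow as the
# sequential one (or slower, due to lock contention).
print(f"sequential: {sequential:.3f}s  two threads: {threaded:.3f}s")
```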
[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)
I've updated PEP 683 for the feedback I've gotten. Thanks again for that!

The updated PEP text is included below. The largest changes involve either the focus of the PEP (internal mechanism to mark objects immortal) or the possible ways that things can break on older 32-bit stable ABI extensions. All other changes are smaller.

Given the last round of discussion, I'm hoping this will be the last round before we go to the steering council.

-eric

PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow , Eddie Elizondo 
Discussions-To: https://mail.python.org/archives/list/python-dev@python.org/thread/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History: 15-Feb-2022, 19-Feb-2022, 28-Feb-2022
Resolution:

Abstract
========

Currently the CPython runtime maintains a `small amount of mutable state `_ in the allocated memory of each object. Because of this, otherwise immutable objects are actually mutable. This can have a large negative impact on CPU and memory performance, especially for approaches to increasing Python's scalability.

This proposal mandates that, internally, CPython will support marking an object as one for which that runtime state will no longer change. Consequently, such an object's refcount will never reach 0, and so the object will never be cleaned up. We call these objects "immortal". (Normally, only a relatively small number of internal objects will ever be immortal.) The fundamental improvement here is that now an object can be truly immutable.

Scope
-----

Object immortality is meant to be an internal-only feature. So this proposal does not include any changes to public API or behavior (with one exception). As usual, we may still add some private (yet publicly accessible) API to do things like immortalize an object or tell if one is immortal. Any effort to expose this feature to users would need to be proposed separately.
There is one exception to "no change in behavior": refcounting semantics for immortal objects will differ in some cases from user expectations. This exception, and the solution, are discussed below.

Most of this PEP focuses on an internal implementation that satisfies the above mandate. However, those implementation details are not meant to be strictly proscriptive. Instead, at the least they are included to help illustrate the technical considerations required by the mandate. The actual implementation may deviate somewhat as long as it satisfies the constraints outlined below. Furthermore, the acceptability of any specific implementation detail described below does not depend on the status of this PEP, unless explicitly specified.

For example, the particular details of:

* how to mark something as immortal
* how to recognize something as immortal
* which subset of functionally immortal objects are marked as immortal
* which memory-management activities are skipped or modified for immortal objects

are not only CPython-specific but are also private implementation details that are expected to change in subsequent versions.

Implementation Summary
----------------------

Here's a high-level look at the implementation:

If an object's refcount matches a very specific value (defined below) then that object is treated as immortal. The CPython C-API and runtime will not modify the refcount (or other runtime state) of an immortal object.

Aside from the change to refcounting semantics, there is one other possible negative impact to consider. A naive implementation of the approach described below makes CPython roughly 4% slower. However, the implementation is performance-neutral once known mitigations are applied.

Motivation
==========

As noted above, currently all objects are effectively mutable. That includes "immutable" objects like ``str`` instances. This is because every object's refcount is frequently modified as the object is used during execution.
This is especially significant for a number of commonly used global (builtin) objects, e.g. ``None``. Such objects are used a lot, both in Python code and internally. That adds up to a consistently high volume of refcount changes.

The effective mutability of all Python objects has a concrete impact on parts of the Python community, e.g. projects that aim for scalability like Instagram or the effort to make the GIL per-interpreter. Below we describe several ways in which refcount modification has a real negative effect on such projects. None of that would happen for objects that are truly immutable.

Reducing CPU Cache Invalidation
-------------------------------

Every modification of a refcount causes the corresponding CPU cache line to be invalidated. This has a number of effects.

For one, the write must be propagated to other cache levels and to main memory. This has a small effect on all Python programs. Immortal objects
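That refcount churn is visible from pure Python. On builds without immortal objects, merely creating references to ``None`` mutates it (a sketch; on an interpreter that already implements PEP 683, the count won't move):

```python
import sys

before = sys.getrefcount(None)
xs = [None] * 100          # each slot holds a new reference to None
after = sys.getrefcount(None)

if after == before:
    print("refcount unchanged -- this build has immortal objects")
else:
    print(f"None's refcount grew by {after - before}")
```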
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Wed, Feb 23, 2022 at 4:21 PM Antonio Cuni wrote:
> When refcheck=True (the default), numpy raises an error if you try to resize
> an array in place whose refcnt > 2 (although I don't understand why > 2 and
> not > 1, and the docs aren't very clear about this).
>
> That said, relying on the exact value of the refcnt is very bad for
> alternative implementations and for HPy, and in particular it is impossible
> to implement ndarray.resize(refcheck=True) correctly on PyPy. So from this
> point of view, a wording which explicitly restricts the "legal" usage of the
> refcnt details would be very welcome.

Thanks for the feedback and example. It helps.

-eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Wed, Feb 23, 2022 at 9:16 AM Petr Viktorin wrote:
> > > But tp_dict is also public C-API. How will that be handled?
> > > Perhaps naively, I thought static types' dicts could be treated as
> > > (deeply) immutable, and shared?
> >
> > They are immutable from Python code but not from C (due to tp_dict).
> > Basically, we will document that tp_dict should not be used directly
> > (in the public API) and refer users to a public getter function. I'll
> > note this in the PEP.
>
> What worries me is that existing users of the API haven't read the new
> documentation. What will happen if users do use it?
> Or worse, add things to it?

We will probably set it to NULL, so the user code would fail or crash. I suppose we could set it to a dummy object that emits helpful errors. However, I don't think that is worth it. We're talking about users directly accessing tp_dict of the builtin static types, not their own. That is already something they should definitely not be doing.

> (Hm, the current docs are already rather confusing -- 3.2 added a note
> that "It is not safe to ... modify tp_dict with the dictionary C-API.",
> but above that it says "extra attributes for the type may be added to
> this dictionary [in some cases]")

Yeah, the docs will have to be clarified.

> > Having thought about it some more, I don't think this PEP should be
> > strictly bound to per-interpreter GIL. That is certainly my personal
> > motivation. However, we have a small set of users that would benefit
> > significantly, the change is relatively small and simple, and the risk
> > of breaking users is also small.
>
> Right, with the recent performance improvements it's looking like it
> might stand on its own after all.

Great!

> > Honestly, it might not have needed a PEP in the first place if I
> > had been a bit more clear about the idea earlier.
>
> Maybe it's good to have a PEP to clear that up :)

Yeah, the PEP process has been helpful for that.
:) -eric
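The getter-based plan discussed above — keep each static type's mutable state on the interpreter state rather than on the type object — can be modeled in a few lines of Python. This is only an illustrative sketch with hypothetical names; the real change would live in C on ``PyInterpreterState``:

```python
class InterpreterState:
    """Toy stand-in for PyInterpreterState: each interpreter owns the
    mutable state (e.g. the __dict__ contents) of the builtin static types."""
    def __init__(self):
        self._static_type_dicts = {}  # type name -> that type's dict

def get_static_type_dict(interp, type_name):
    # Replaces direct tp_dict access: the dict lives on the current
    # interpreter, so the static type object itself never changes.
    return interp._static_type_dicts.setdefault(type_name, {})

interp_a, interp_b = InterpreterState(), InterpreterState()
get_static_type_dict(interp_a, "int")["x"] = 1
assert "x" not in get_static_type_dict(interp_b, "int")  # per-interpreter isolation
```

This is why direct tp_dict users would break: the state they expect on the type object would no longer be there.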
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
Responses inline below. -eric On Tue, Feb 22, 2022 at 7:22 PM Inada Naoki wrote: > > For a recent example, see > > https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/. > > It is not a proven example, but just a hope at the moment. So an option is > fine to prove the idea. > > Although I can not read the code, they said "patching ASLR by patching > `ob_type` fields;". > It will cause CoW for most objects, won't it? > > So reducing memory writes doesn't directly mean reducing CoW. > Unless we can stop writing to a page completely, the page will be copied. Yeah, they would have to address that. > > CPU cache invalidation exists regardless. With the current GIL the > > effect is reduced significantly. > > It's an interesting point. We can not see the benefit from > pyperformance, because it doesn't use much data and it runs one process > at a time. > So pyperformance can not put enough stress on the last-level > cache, which is shared by many cores. > > We need a multiprocess performance benchmark, apart from pyperformance, > to stress the last-level cache from multiple cores. > It would help not only this PEP, but also optimizing containers like dict and set. +1 > Can the proposed optimizations to eliminate the penalty guarantee that > __del__ and weakrefs are not broken, > and that no memory leaks occur when the Python interpreter is initialized > and finalized multiple times? > I haven't confirmed it yet. They will not break __del__ or weakrefs. No memory will leak after finalization. If any of that happens then it is a bug. > FWIW, I filed an issue to remove the hash cache from bytes objects. > https://github.com/faster-cpython/ideas/issues/290 > > Code objects hold many bytes objects (e.g. co_code, co_linetable, etc...) > Removing it would save some RAM and make immortal bytes truly > immutable, safe to be shared between interpreters. +1 Thanks!
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Tue, Feb 22, 2022, 20:26 Larry Hastings wrote: > Are these optimizations specifically for the PR, or are these > optimizations we could apply without taking the immortal objects? Kind of > like how Sam tried to offset the nogil slowdown by adding optimizations > that we went ahead and added anyway ;-) Basically all the optimizations require immortal objects. -eric
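To make the dependency concrete, here is a toy Python model (illustrative only, not CPython's implementation) of the kind of incref fast path being discussed: the refcount write can only be skipped when a fixed bit marks the object immortal, so the optimization has nothing to offer without immortal objects.

```python
IMMORTAL_BIT = 1 << 60  # model of the proposed marker bit (64-bit build)

def incref(refcnt: int) -> int:
    """Toy immortal-aware Py_INCREF: an immortal refcount is returned
    unchanged, so no memory write (and thus no cache-line invalidation
    or copy-on-write) happens for immortal objects."""
    if refcnt & IMMORTAL_BIT:
        return refcnt
    return refcnt + 1

immortal = IMMORTAL_BIT + (IMMORTAL_BIT >> 1)
assert incref(immortal) == immortal  # no write for immortals
assert incref(7) == 8                # ordinary objects are still counted
```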
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Sat, Feb 19, 2022 at 12:46 AM Eric Snow wrote: > Performance > --- > > A naive implementation shows `a 4% slowdown`_. > Several promising mitigation strategies will be pursued in the effort > to bring it closer to performance-neutral. See the `mitigation`_ > section below. FYI, Eddie has been able to get us back to performance-neutral after applying several of the mitigation strategies we discussed. :) -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Mon, Feb 21, 2022 at 4:56 PM Terry Reedy wrote: > We could say that the only refcounts with any meaning are 0, 1, and > 1. Yeah, that should work. -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Mon, Feb 21, 2022 at 10:56 AM wrote: > For what it's worth Cython does this for string concatenation to concatenate > in place if possible (this optimization was copied from CPython). It could be > disabled relatively easily if it became a problem (it's already CPython only > and version checked so it'd just need another upper-bound version check). That's good to know. -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
Thanks for the responses. I've replied inline below. -eric On Mon, Feb 21, 2022 at 9:11 AM Petr Viktorin wrote: > > On 19. 02. 22 8:46, Eric Snow wrote: > > Thanks to all those that provided feedback. I've worked to > > substantially update the PEP in response. The text is included below. > > Further feedback is appreciated. > > Thank you! This version is much clearer. I like the PEP more and more! Great! > I've sent a PR with some typo fixes: > https://github.com/python/peps/pull/2348 Thank you. > > Public Refcount Details > [...] > > As part of this proposal, we must make sure that users can clearly > > understand on which parts of the refcount behavior they can rely and > > which are considered implementation details. Specifically, they should > > use the existing public refcount-related API and the only refcount value > > with any meaning is 0. All other values are considered "not 0". > > Should we care about hacks/optimizations that rely on having the only > reference (or all references), e.g. mutating a tuple if it has refcount > 1? Immortal objects shouldn't break them (the special case simply won't > apply), but this wording would make them illegal. > AFAIK CPython uses this internally, but I don't know how > prevalent/useful it is in third-party code. Good point. As Terry suggested, we could also let 1 have meaning. Regardless, any documented restriction would only apply to users of the public C-API, not to internal code. > > _Py_IMMORTAL_REFCNT > > --- > > > > We will add two internal constants:: > > > > #define _Py_IMMORTAL_BIT (1LL << (8 * sizeof(Py_ssize_t) - 4)) > > #define _Py_IMMORTAL_REFCNT (_Py_IMMORTAL_BIT + (_Py_IMMORTAL_BIT / 2)) > > As a nitpick: could you say this in prose? > > * ``_Py_IMMORTAL_BIT`` has the third top-most bit set. > * ``_Py_IMMORTAL_REFCNT`` has the third and fourth top-most bits set. Sure. 
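The quoted C definitions can be sanity-checked with a small Python model (assuming a 64-bit ``Py_ssize_t``; this is a model for illustration, not CPython code):

```python
# Model of the proposed constants, assuming sizeof(Py_ssize_t) == 8.
SIZEOF_PY_SSIZE_T = 8  # bytes, on a 64-bit build

IMMORTAL_BIT = 1 << (8 * SIZEOF_PY_SSIZE_T - 4)      # 1 << 60
IMMORTAL_REFCNT = IMMORTAL_BIT + IMMORTAL_BIT // 2   # bits 60 and 59 set

assert IMMORTAL_BIT == 1 << 60
assert IMMORTAL_REFCNT == (1 << 60) + (1 << 59)
assert bin(IMMORTAL_REFCNT).count("1") == 2  # exactly two bits set
```

Note that ``_Py_IMMORTAL_REFCNT`` is exactly 1.5 times ``_Py_IMMORTAL_BIT``, which is why checking that single bit is enough to detect immortality.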
> > Immortal Global Objects > > --- > > > > All objects that we expect to be shared globally (between interpreters) > > will be made immortal. That includes the following: > > > > * singletons (``None``, ``True``, ``False``, ``Ellipsis``, > > ``NotImplemented``) > > * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``) > > * all static objects in ``_PyRuntimeState.global_objects`` (e.g. > > identifiers, > > small ints) > > > > All such objects will be immutable. In the case of the static types, > > they will be effectively immutable. ``PyTypeObject`` has some mutable > > state (``tp_dict`` and ``tp_subclasses``), but we can work around this > > by storing that state on ``PyInterpreterState`` instead of on the > > respective static type object. Then the ``__dict__``, etc. getter > > will do a lookup on the current interpreter, if appropriate, instead > > of using ``tp_dict``. > > But tp_dict is also public C-API. How will that be handled? > Perhaps naively, I thought static types' dicts could be treated as > (deeply) immutable, and shared? They are immutable from Python code but not from C (due to tp_dict). Basically, we will document that tp_dict should not be used directly (in the public API) and refer users to a public getter function. I'll note this in the PEP. > Perhaps it would be best to leave it out here and say "The details > of sharing ``PyTypeObject`` across interpreters are left to another PEP"? > Even so, I'd love to know the plan. What else would you like to know? There isn't much to it. For each of the builtin static types we will keep the relevant mutable state on PyInterpreterState and look it up there in the relevant getters (e.g. __dict__ and __subclasses__). > (And even if these are internals, > changes to them should be mentioned in What's New, for the sake of > people who need to maintain old extensions.) 
+1 > > Object Cleanup > > -- > > > > In order to clean up all immortal objects during runtime finalization, > > we must keep track of them. > > > > For GC objects ("containers") we'll leverage the GC's permanent > > generation by pushing all immortalized containers there. During > > runtime shutdown, the strategy will be to first let the runtime try > > to do its best effort of deallocating these instances normally. Most > > of the module deallocation will now be handled by > > ``pylifecycle.c:finalize_modules()`` which cleans up the remaining > > modules as best as we can. It will change which modules are available > > during __del__ but that's already defined as undefined behavior by the > > docs. Optionally, we
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
Thanks for the feedback. I've responded inline below. -eric On Sat, Feb 19, 2022 at 8:50 PM Inada Naoki wrote: > I hope per-interpreter GIL succeeds at some point, and I know this is > needed for per-interpreter GIL. > > But I am worrying that per-interpreter GIL may be too complex to > implement and maintain for core developers and extension writers. > As you know, immortal doesn't mean sharable between interpreters. It is > too difficult to know which object can be shared, and where the > shareable objects leak to other interpreters. > So I am not sure that per-interpreter GIL is an achievable goal. I plan on addressing this in the PEP I am working on for per-interpreter GIL. In the meantime, I doubt the issue will impact any core devs. > So I think it's too early to introduce the immortal objects in Python > 3.11, unless it *improves* performance without per-interpreter GIL. > Instead, we can add a configuration option such as > `--enable-experimental-immortal`. I agree that immortal objects aren't quite as appealing in general without per-interpreter GIL. However, there are actual users that will benefit from it, assuming we can reduce the performance penalty to acceptable levels. For a recent example, see https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/. > On Sat, Feb 19, 2022 at 4:52 PM Eric Snow wrote: > > > > Reducing CPU Cache Invalidation > > --- > > > > Avoiding Data Races > > --- > > > > Both benefits require a per-interpreter GIL. CPU cache invalidation exists regardless. With the current GIL the effect is reduced significantly. Per-interpreter GIL is only one situation where data races matter. Any attempt to generally eliminate the GIL must deal with races on the per-object runtime state. > > > > Avoiding Copy-on-Write > > -- > > > > For some applications it makes sense to get the application into > > a desired initial state and then fork the process for each worker. 
> > This can result in a large performance improvement, especially > > in memory usage. Several enterprise Python users (e.g. Instagram, > > YouTube) have taken advantage of this. However, the above > > refcount semantics drastically reduce the benefits and > > have led to some sub-optimal workarounds. > > > > As I wrote before, fork is very difficult to use safely. We can not > recommend it to most users. > And I don't think reducing the size of a patch at Instagram or YouTube > is a good rationale for this kind of change. What do you mean by "this kind of change"? The proposed change is relatively small. It certainly isn't nearly as intrusive as many changes we make to internals without a PEP. If you are talking about the performance penalty, we should be able to eliminate it. > > Also note that "fork" isn't the only operating system mechanism > > that uses copy-on-write semantics. Anything that uses ``mmap`` > > relies on copy-on-write, including sharing data from shared object > > files between processes. > > > > It is very difficult to reduce CoW with mmap(MAP_PRIVATE). > > You may need to write the hash of bytes and unicode. You may need to > write `tp_type`. > Immortal objects can "reduce" the memory writes. But "at least one > memory write" is enough to trigger the CoW. Correct. However, without immortal objects (AKA immutable per-object runtime-state) it goes from "very difficult" to "basically impossible". > > Accidental Immortality > > -- > > > > While it isn't impossible, this accidental scenario is so unlikely > > that we need not worry. Even if done deliberately by using > > ``Py_INCREF()`` in a tight loop and each iteration only took 1 CPU > > cycle, it would take 2^61 cycles (on a 64-bit processor). At a fast > > 5 GHz that would still take nearly 500,000,000 seconds (over 5,000 days)! > > If that CPU were 32-bit then it is (technically) more possible though > > still highly unlikely. 
> > > > Technically, `[obj] * (2**(32-4))` is a 1GB array on 32-bit. The question is if this matters. If really necessary, the PEP can demonstrate that it doesn't matter in practice. (Also, the magic value on 32-bit would be 2**29.) > > > > Constraints > > --- > > > > * ensure that otherwise immutable objects can be truly immutable > > * be careful when immortalizing objects that are not otherwise immutable > > I am not sure about what this means. > For example, unicode objects are not immutable because they have a hash, > a utf8 cache and a wchar_t cache. (wchar
[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
Thanks to all those that provided feedback. I've worked to substantially update the PEP in response. The text is included below. Further feedback is appreciated. -eric PEP: 683 Title: Immortal Objects, Using a Fixed Refcount Author: Eric Snow , Eddie Elizondo Discussions-To: https://mail.python.org/archives/list/python-dev@python.org/thread/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/ Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 10-Feb-2022 Python-Version: 3.11 Post-History: 15-Feb-2022 Resolution: Abstract Currently the CPython runtime maintains a `small amount of mutable state `_ in the allocated memory of each object. Because of this, otherwise immutable objects are actually mutable. This can have a large negative impact on CPU and memory performance, especially for approaches to increasing Python's scalability. The solution proposed here provides a way to mark an object as one for which that per-object runtime state should not change. Specifically, if an object's refcount matches a very specific value (defined below) then that object is treated as "immortal". If an object is immortal then its refcount will never be modified by ``Py_INCREF()``, etc. Consequently, the refcount will never reach 0, so that object will never be cleaned up (unless explicitly done, e.g. during runtime finalization). Additionally, all other per-object runtime state for an immortal object will be considered immutable. This approach has some possible negative impact, which is explained below, along with mitigations. A critical requirement for this change is that the performance regression be no more than 2-3%. Anything worse than performance-neutral requires that the other benefits be proportionally large. Aside from specific applications, the fundamental improvement here is that now an object can be truly immutable. (This proposal is meant to be CPython-specific and to affect only internal implementation details. 
There are some slight exceptions to that which are explained below. See `Backward Compatibility`_, `Public Refcount Details`_, and `scope`_.) Motivation == As noted above, currently all objects are effectively mutable. That includes "immutable" objects like ``str`` instances. This is because every object's refcount is frequently modified as the object is used during execution. This is especially significant for a number of commonly used global (builtin) objects, e.g. ``None``. Such objects are used a lot, both in Python code and internally. That adds up to a consistently high volume of refcount changes. The effective mutability of all Python objects has a concrete impact on parts of the Python community, e.g. projects that aim for scalability like Instagram or the effort to make the GIL per-interpreter. Below we describe several ways in which refcount modification has a real negative effect on such projects. None of that would happen for objects that are truly immutable. Reducing CPU Cache Invalidation --- Every modification of a refcount causes the corresponding CPU cache line to be invalidated. This has a number of effects. For one, the write must be propagated to other cache levels and to main memory. This has a small effect on all Python programs. Immortal objects would provide a slight relief in that regard. On top of that, multi-core applications pay a price. If two threads (running simultaneously on distinct cores) are interacting with the same object (e.g. ``None``) then they will end up invalidating each other's caches with each incref and decref. This is true even for otherwise immutable objects like ``True``, ``0``, and ``str`` instances. CPython's GIL helps reduce this effect, since only one thread runs at a time, but it doesn't completely eliminate the penalty. Avoiding Data Races --- Speaking of multi-core, we are considering making the GIL a per-interpreter lock, which would enable true multi-core parallelism. 
Among other things, the GIL currently protects against races between multiple concurrent threads that may incref or decref the same object. Without a shared GIL, two running interpreters could not safely share any objects, even otherwise immutable ones like ``None``. This means that, to have a per-interpreter GIL, each interpreter must have its own copy of *every* object. That includes the singletons and static types. We have a viable strategy for that but it will require a meaningful amount of extra effort and extra complexity. The alternative is to ensure that all shared objects are truly immutable. There would be no races because there would be no modification. This is something that the immortality proposed here would enable for otherwise immutable objects. With immortal objects, support for a per-interpreter GIL becomes much simpler. Avoiding Copy-on-Write -- For some applications it makes sense to
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 11:06 AM Larry Hastings wrote: > I experimented with this at the EuroPython sprints in Berlin years ago. I > was sitting next to MvL, who had an interesting observation about it. Classic MvL! :) > He suggested(*) all the constants unmarshalled as part of loading a module > should be "immortal", and if we could rejigger how we allocated them to store > them in their own memory pages, that would dovetail nicely with COW > semantics, cutting down on the memory use of preforked server processes. Cool idea. I may mention it in the PEP as a possibility. Thanks! -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 8:45 PM Inada Naoki wrote: > Is there any common tool that utilizes CoW by mmap? > If you know of one, please add its link to the PEP. > If there is no common tool, most Python users can not get benefit from this. Sorry, I'm not aware of any, but I also haven't researched the topic much. Regardless, that would be a good line of inquiry. A reference like that would probably help make the PEP a bit more justifiable without per-interpreter GIL. :) > Generally speaking, fork is a legacy API. It is too difficult to know > which library is fork-safe, even for stdlibs. And Windows users can > not use fork. > Optimizing for the non-fork use case is much better than optimizing for > fork use cases. +1 > I hope per-interpreter GIL replaces fork use cases. Yeah, that's definitely one big benefit. > But tools using CoW without fork are also welcome, especially if they > support Windows. +1 > Anyway, I don't believe stopping refcounting will fix the CoW issue > yet. See this article [1] again. > > [1] > https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172 That's definitely an important point, given that the main objective of the proposal is to allow disabling mutation of runtime-internal object state so that some objects can be made truly immutable. I'm sure Eddie has some good insight on the matter (and may have even been involved in writing that article). Eddie? > Note that they failed to fix CoW by stopping refcounting code objects! (*) > Most CoW was caused by cyclic GC and finalization. That's a good observation! > (*) It is not surprising to me because the eval loop doesn't incref/decref > most code attributes. They borrow references from the code object. +1 > So we need a sample application and profile it, before saying it fixes CoW. > Could you provide some data, or drop the CoW issue from this PEP until > it is proven? We'll look into that. 
-eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
Again, thanks for the reply. It's helpful. My further responses are inline below. -eric On Thu, Feb 17, 2022 at 3:42 AM Petr Viktorin wrote: > > Agreed. However, what behavior do users expect and what guarantees do > > we make? Do we indicate how to interpret the refcount value they > > receive? What are the use cases under which a user would set an > > object's refcount to a specific value? Are users setting the refcount > > of objects they did not create? > > That's what I hoped the PEP would tell me. Instead of simply claiming > that there won't be issues, it should explain why we won't have any issues. > [snip] > IMO, the reasoning should start from the assumption that things will > break, and explain why they won't (or why the breakage is acceptable). > If the PEP simply tells me upfront that things will be OK, I have a hard > time trusting it. > > IOW, it's clear you've thought about this a lot (especially after > reading your replies here), but it's not clear from the PEP. > That might be editorial nitpicking, if it wasn't for the fact that I > want to find any gaps in your research and reasoning, and invite everyone > else to look for them as well. Good point. It's easy to dump a bunch of unnecessary info into a PEP, and it was hard for me to know where the line was in this case. There hadn't been much discussion previously about the possible ways this change might break users. So thanks for bringing this up. I'll be sure to put a more detailed explanation in the PEP, with a bit more evidence too. > Ah, I see. I was confused by this: No worries! I'm glad we cleared it up. I'll make sure the PEP is more understandable about this. > > This is also true even with the GIL, though the impact is smaller. > > Smaller than what? The baseline for that comparison is a hypothetical > GIL-less interpreter, which is only introduced in the next section. > Perhaps say something like "Python's GIL helps avoid this effect, but > doesn't eliminate it." Good point. 
I'll clarify the point. > >> Weren't you planning a PEP on subinterpreter GIL as well? Do you want to > >> submit them together? > > > > I'd have to think about that. The other PEP I'm writing for > > per-interpreter GIL doesn't require immortal objects. They just > > simplify a number of things. That's my motivation for writing this > > PEP, in fact. :) > > Please think about it. > If you removed the benefits for per-interpreter GIL, the motivation > section would be reduced to memory savings for fork/CoW. (And lots of > performance improvements that are great in theory but sum up to a 4% loss.) Sounds good. Would this involve more than a note at the top of the PEP? And just to be clear, I don't think the fate of a per-interpreter GIL PEP should depend on this one. > > It wouldn't match _Py_IMMORTAL_REFCNT, but the high bit of > > _Py_IMMORTAL_REFCNT would still match. That bit is what we would > > actually be checking, rather than the full value. > > It makes sense once you know _Py_IMMORTAL_REFCNT has two bits set. Maybe > it'd be good to note that detail -- it's an internal detail, but crucial > for making things safe. Will do. > >> What about extensions compiled with Python 3.11 (with this PEP) that use > >> an older version of the stable ABI, and thus should be compatible with > >> 3.2+? Will they use the old versions of the macros? How will that be > >> tested? > > > > It wouldn't matter unless an object's refcount reached > > _Py_IMMORTAL_REFCNT, at which point incref/decref would start > > noop'ing. What is the likelihood (in real code) that an object's > > refcount would grow that far? Even then, would such an object ever be > > expected to go back to 0 (and be dealloc'ed)? Otherwise the point is > > moot. > > Those are exactly the questions I'd hope the PEP to answer. I could > estimate that likelihood myself, but I'd really rather just check your > work ;) > > (Hm, maybe I couldn't even estimate this myself. 
The PEP doesn't say > what the value of _Py_IMMORTAL_REFCNT is, and in the ref implementation > a comment says "This can be safely changed to a smaller value".) Got it. I'll be sure that the PEP is more clear about that. Thanks for letting me know.
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 10:43 PM Jim J. Jewett wrote: > I suggest being a little more explicit (even blatant) that the particular > details of: > [snip] > are not only Cpython-specific, but are also private implementation details > that are expected to change in subsequent versions. Excellent point. > Ideally, things like the interned string dictionary or the constants from a > pyc file will be not merely immortal, but stored in an immortal-only memory > page, so that they won't be flushed or CoW-ed when a nearby non-immortal > object is modified. That's definitely worth looking into. > Getting those details right will make a difference to performance, and you > don't want to be locked in to the first draft. Yep, that is one big reason I was trying to avoid spelling out every detail of our plan. :) -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 2:41 PM Terry Reedy wrote: > > * the naive implementation shows a 4% slowdown > > Without understanding all the benefits, this seems a bit too much for > me. 2% would be much better. Yeah, we consider 4% to be too much. 2% would be great. Performance-neutral would be even better, of course. :) > > * we have a number of strategies that should reduce that penalty > > I would like to see that before approving the PEP. I expect it would be enough to show where things stand with benchmark results. It did not seem like the actual mitigation strategies were as important, so I opted to leave them out to avoid clutter. Plus it isn't clear yet what approaches will help the most, nor how much we can win back. So I didn't want to distract with hypotheticals. If it's important I can add that in. > > * without immortal objects, the implementation for per-interpreter GIL > > will require a number of non-trivial workarounds > > To me, that says to speed up immortality first. Agreed. > > That last one is particularly meaningful to me since it means we would > > definitely miss the 3.11 feature freeze. > > 3 1/2 months from now. > > > With immortal objects, 3.11 would still be in reach. > > Is it worth trying to rush it a bit? I'd rather not rush this. I'm saying that, for per-interpreter GIL, 3.11 is within reach without rushing if we have immortal objects. Without them, 3.11 isn't realistic without rushing things. -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 12:14 PM Kevin Modzelewski wrote: > fwiw Pyston has immortal objects, though with a slightly different goal and > thus design [1]. I'm not necessarily advocating for our design (it makes most > sense if there is a JIT involved), but just writing to report our experience > of making a change like this and the compatibility effects. Thanks! > Importantly, our system allows for the reference count of immortal objects to > change, as long as it doesn't go below half of the original very-high value. > So extension code with no concept of immortality will still update the > reference counts of immortal objects, but this is fine. Because of this we > haven't seen any issues with extension modules. As Guido noted, we are taking a similar approach for the sake of older extensions built with the limited API. As a precaution, we start the refcount for immortal objects basically at _Py_IMMORTAL_BIT * 1.5 (i.e. _Py_IMMORTAL_REFCNT). Then we only need to check the high bit (_Py_IMMORTAL_BIT) of the refcount to see if an object is immortal. > The small amount of compatibility challenges we've run into have been in > testing code that checks for memory leaks. For example this code breaks on > Pyston: > [snip] > This might work with this PEP, but we've also seen code that asserts that the > refcount increases by a specific value, which I believe wouldn't. Right, this is less of an issue for us since normally we do not change the refcount of immortal objects. Also, CPython's test suite keeps us honest about leaking references and memory blocks. :) > For Pyston we've simply disabled these tests, figuring that our users still > have CPython to test on. Personally I consider this breakage to be small, but > I hadn't seen anyone mention the potential usage of sys.getrefcount() so I > thought I'd bring it up. Thanks again for that. > [1] Our goal is to entirely remove refcounting operations when we can prove > we are operating on an immortal object. 
We can prove it in a couple cases: > sometimes simply, such as in Py_RETURN_NONE, but mostly our JIT will often > know the immortality of objects it embeds into the code. So if we can prove > statically that an object is immortal then we elide the incref/decrefs, and > if we can't then we use an unmodified Py_INCREF/Py_DECREF. This means that > our reference counts on immortal objects will change, so we detect > immortality by checking if the reference count is at least half of the > original very-high value. FWIW, we anticipate that we can take a similar approach in CPython's eval loop, specializing for immortal objects. We are also updating Py_RETURN_NONE, etc. to stop incref'ing. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CDBGYUDROQZNEM6LAREIEKSZSQ72BLOH/ Code of Conduct: http://python.org/psf/codeofconduct/
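The leak-checking pattern Kevin describes can be illustrated at the Python level with sys.getrefcount(). This is a sketch of the kind of exact-delta assertion that holds today for ordinary objects but would stop holding for immortal objects (whose refcount would no longer change):

```python
import sys

# A leak-check pattern like the one described above: assert that an
# operation changes an object's refcount by an exact amount.
obj = object()
before = sys.getrefcount(obj)   # includes the temporary argument reference

holder = [obj, obj]             # two new references
assert sys.getrefcount(obj) == before + 2

del holder                      # references released
assert sys.getrefcount(obj) == before

# For an immortal object (e.g. ``None`` under this PEP), incref and
# decref would leave the count untouched, so the same exact-delta
# assertion would fail.
```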
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
Thanks for the feedback. My responses are inline below. -eric On Wed, Feb 16, 2022 at 6:36 AM Petr Viktorin wrote: > Thank you very much for writing this down! It's very helpful to see a > concrete proposal, and the current state of this idea. > I like the change, That's good to hear. :) > but I think it's unfortunately more complicated than > the PEP suggests. That would be unsurprising. :) > > This proposal is CPython-specific and, effectively, describes > > internal implementation details. > > I think that is a naïve statement. Refcounting is > implementation-specific, but it's hardly an *internal* detail. Sorry for any confusion. I didn't mean to say that refcounting is an internal detail. Rather, I was talking about how the proposed change in refcounting behavior doesn't affect any guaranteed/documented behavior, hence "internal". Perhaps I missed some documented behavior? I was going off the following: * https://docs.python.org/3.11/c-api/intro.html#objects-types-and-reference-counts * https://docs.python.org/3.11/c-api/structures.html#c.Py_REFCNT > There is > code that targets CPython specifically, and relies on the details. Could you elaborate? Do you mean such code relies on specific refcount values? > The refcount has public getters and setters, Agreed. However, what behavior do users expect and what guarantees do we make? Do we indicate how to interpret the refcount value they receive? What are the use cases under which a user would set an object's refcount to a specific value? Are users setting the refcount of objects they did not create? > and you need a pretty good > grasp of the concept to write a C extension. I would not expect this to be affected by this PEP, except in cases where users are checking/modifying refcounts for objects they did not create (since none of their objects will be immortal). > I think that it's safe to assume that this will break people's code, Do you have some use case in mind, or an example? 
From my perspective I'm having a hard time seeing what this proposed change would break. That said, Kevin Modzelewski indicated [1] that there were affected cases for Pyston (though their change in behavior is slightly different). [1] https://mail.python.org/archives/list/python-dev@python.org/message/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/ > and > this PEP should convince us that the breakage is worth it rather than > dismiss the issue. Sorry, I didn't mean to be dismissive. I agree that if there is breakage this PEP must address it. > It would be good to note that “container” refers to the GC term, as in > https://devguide.python.org/garbage_collector/#identifying-reference-cycles > > and not e.g. > https://docs.python.org/3/library/collections.abc.html#collections.abc.Container +1 > > This has a concrete impact on active projects in the Python community. > > Below we describe several ways in which refcount modification has > > a real negative effect on those projects. None of that would > > happen for objects that are truly immutable. > > > > Reducing Cache Invalidation > > --- > > Explicitly saying “CPU cache” would make the PEP easier to skim. +1 > > Every modification of a refcount causes the corresponding cache > > line to be invalidated. This has a number of effects. > > > > For one, the write must be propagated to other cache levels > > and to main memory. This has a small effect on all Python programs. > > Immortal objects would provide a slight relief in that regard. > > > > On top of that, multi-core applications pay a price. If two threads > > are interacting with the same object (e.g. ``None``) then they will > > end up invalidating each other's caches with each incref and decref. > > This is true even for otherwise immutable objects like ``True``, > > ``0``, and ``str`` instances. This is also true even with > > the GIL, though the impact is smaller. > > This looks out of context. Python has a per-process GIL. It should go > after the next section.
This isn't about a data race. I'm talking about how if an object is active in two different threads (on distinct cores) then incref/decref in one thread will invalidate the cache (line) in the other thread. The only impact of the GIL in this case is that the two threads aren't running simultaneously and the cache invalidation on the idle thread has less impact. Perhaps I've missed something? > > The proposed solution is obvious enough that two people came to the > > same conclusion (and implementation, more or less) independently. > > Who was it? Assuming it's not a secret :) Me and Eddie. :) I don't mind saying so. > > In the case of per-interpreter GIL, the only realistic alternative > > is to move all global objects into ``PyInterpreterState`` and add > > one or more lookup functions to access them. Then we'd have to > > add some hacks to the C-API to preserve compatibility for the > > many objects exposed there. The story is much, much
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 12:37 AM Inada Naoki wrote: > +1 for overall idea. Great! > > Also note that "fork" isn't the only operating system mechanism > > that uses copy-on-write semantics. > Could you elaborate? mmap, maybe? > [snip] > So if you know how to get benefit from CoW without fork, I want to know it. Sorry if I got your hopes up. Yeah, I was talking about mmap. > > There will likely be others we have not enumerated here. > How about interned strings? Marking every interned string as immortal may make sense. > Should the intern dict belong to the runtime, or to the (sub)interpreter? > > If the interned dict belongs to the runtime, all interned strings should > be immortal to be shared between subinterpreters. Excellent questions. Making immutable objects immortal is relatively simple. For the most part, mutable objects should not be shared between interpreters without protection (e.g. the GIL). The interned dict isn't exposed to Python code or the C-API, so there's less risk, but it still wouldn't work without cleverness. So it should be per-interpreter. It would be nice if it were global though. :) > If the interned dict belongs to the interpreter, should we register > immortalized strings in all interpreters? That's a good point. It may be worth doing something like that. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VQYLSPHHP2EE2KPDWCXDLMBAXYAE72D3/
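The interned dict discussed here lives at the C level and isn't exposed to Python code, but sys.intern() shows the effect interning has — equal strings funnel through one shared object:

```python
import sys

# sys.intern() routes equal strings to a single canonical object.
a = sys.intern("immortal-interned-example")
b = sys.intern("-".join(["immortal", "interned", "example"]))

assert a is b                         # same object, not merely equal
assert a == "immortal-interned-example"
```

Under the idea discussed above, each such canonical string would be a natural candidate for immortality.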
[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount"
Eddie and I would appreciate your feedback on this proposal to support treating some objects as "immortal". The fundamental characteristic of the approach is that we would provide stronger guarantees about immutability for some objects. A few things to note:

* this is essentially an internal-only change: there are no user-facing changes (aside from affecting any 3rd party code that directly relies on specific refcounts)
* the naive implementation shows a 4% slowdown
* we have a number of strategies that should reduce that penalty
* without immortal objects, the implementation for per-interpreter GIL will require a number of non-trivial workarounds

That last one is particularly meaningful to me since it means we would definitely miss the 3.11 feature freeze. With immortal objects, 3.11 would still be in reach.

-eric

---

PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow, Eddie Elizondo
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History:
Resolution:

Abstract
========

Under this proposal, any object may be marked as immortal. "Immortal" means the object will never be cleaned up (at least until runtime finalization). Specifically, the `refcount`_ for an immortal object is set to a sentinel value, and that refcount is never changed by ``Py_INCREF()``, ``Py_DECREF()``, or ``Py_SET_REFCNT()``. For immortal containers, the ``PyGC_Head`` is never changed by the garbage collector.

Avoiding changes to the refcount is an essential part of this proposal. For what we call "immutable" objects, it makes them truly immutable. As described further below, this allows us to avoid performance penalties in scenarios that would otherwise be prohibitive.

This proposal is CPython-specific and, effectively, describes internal implementation details.

..
_refcount: https://docs.python.org/3.11/c-api/intro.html#reference-counts

Motivation
==========

Without immortal objects, all objects are effectively mutable. That includes "immutable" objects like ``None`` and ``str`` instances. This is because every object's refcount is frequently modified as it is used during execution. In addition, for containers the runtime may modify the object's ``PyGC_Head``. This runtime-internal state currently prevents full immutability.

This has a concrete impact on active projects in the Python community. Below we describe several ways in which refcount modification has a real negative effect on those projects. None of that would happen for objects that are truly immutable.

Reducing Cache Invalidation
---------------------------

Every modification of a refcount causes the corresponding cache line to be invalidated. This has a number of effects.

For one, the write must be propagated to other cache levels and to main memory. This has a small effect on all Python programs. Immortal objects would provide a slight relief in that regard.

On top of that, multi-core applications pay a price. If two threads are interacting with the same object (e.g. ``None``) then they will end up invalidating each other's caches with each incref and decref. This is true even for otherwise immutable objects like ``True``, ``0``, and ``str`` instances. This is also true even with the GIL, though the impact is smaller.

Avoiding Data Races
-------------------

Speaking of multi-core, we are considering making the GIL a per-interpreter lock, which would enable true multi-core parallelism. Among other things, the GIL currently protects against races between multiple threads that concurrently incref or decref. Without a shared GIL, two running interpreters could not safely share any objects, even otherwise immutable ones like ``None``.

This means that, to have a per-interpreter GIL, each interpreter must have its own copy of *every* object, including the singletons and static types.
We have a viable strategy for that but it will require a meaningful amount of extra effort and extra complexity. The alternative is to ensure that all shared objects are truly immutable. There would be no races because there would be no modification. This is something that the immortality proposed here would enable for otherwise immutable objects. With immortal objects, support for a per-interpreter GIL becomes much simpler.

Avoiding Copy-on-Write
----------------------

For some applications it makes sense to get the application into a desired initial state and then fork the process for each worker. This can result in a large performance improvement, especially for memory usage. Several enterprise Python users (e.g. Instagram, YouTube) have taken advantage of this. However, the above refcount semantics drastically reduce the benefits and have led to some sub-optimal workarounds.

Also note that "fork" isn't the only operating system mechanism that uses copy-on-write semantics.
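As a rough model of the mechanism the PEP describes (not the actual C implementation — the sentinel value and the names below are invented for illustration), incref and decref simply leave an immortal refcount untouched:

```python
# Hypothetical sketch of the PEP's core mechanism, in plain Python.
# _Py_IMMORTAL_REFCNT is a made-up sentinel here; the real sentinel,
# Py_INCREF, and Py_DECREF live in C.
_Py_IMMORTAL_REFCNT = 1 << 62

def incref(ob):
    # Immortal objects are left untouched, so their memory never dirties.
    if ob["refcnt"] != _Py_IMMORTAL_REFCNT:
        ob["refcnt"] += 1

def decref(ob):
    if ob["refcnt"] != _Py_IMMORTAL_REFCNT:
        ob["refcnt"] -= 1

mortal = {"refcnt": 1}
immortal = {"refcnt": _Py_IMMORTAL_REFCNT}

incref(mortal)
incref(immortal)
assert mortal["refcnt"] == 2                        # normal behavior
assert immortal["refcnt"] == _Py_IMMORTAL_REFCNT    # unchanged
```

Because the immortal refcount never changes, the object's memory stays clean across threads and across forked processes, which is exactly what the cache-invalidation and copy-on-write sections above rely on.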
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 10:15 AM Eric Snow wrote: > Yes, I plan on benchmarking the change as soon as we can run > pyperformance on main. I just ran the benchmarks and the PR makes CPython 4% slower. See https://github.com/python/cpython/pull/19474#issuecomment-1032944709. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3TT6Q5TQLUMLL5TWTKHRTXQ3XATHIUBW/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Fri, Feb 4, 2022 at 8:25 PM Eric Snow wrote: > On Fri, Feb 4, 2022, 16:03 Guido van Rossum wrote: >> I wonder if a better solution than that PR wouldn't be to somehow change the >> implementation of _Py_IDENTIFIER() to do that, > > Yeah, I had the same realization today. I'm going to try it out. I updated _Py_IDENTIFIER() to use a statically initialized string object and it isn't too bad. The tricky thing is that PyASCIIObject expects the data to be an array after the object. So the field must be a pre-sized array (like I did in gh-30928). That makes things messier. The alternative is to do what Steve is suggesting. I ran the benchmarks and making _Py_IDENTIFIER() a statically initialized object makes things 2% slower (instead of 1% faster). There are a few things I could do to speed that up a little, but at best we'd get back to performance-neutral. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DDWOJLFOTXTZ35LMBCPH2DHFMCSVLHH5/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Fri, Feb 4, 2022, 16:03 Guido van Rossum wrote: > I wonder if a better solution than that PR wouldn't be to somehow change > the implementation of _Py_IDENTIFIER() to do that, Yeah, I had the same realization today. I'm going to try it out. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/A7Q4TBBOCEAXZYOY6GSY3NA2FSVNUMHL/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 3:49 PM Eric Snow wrote: > I suppose I'd like to know what the value of _Py_IDENTIFIER() is for > 3rd party modules. Between Guido, Victor, Stefan, and Sebastian, I'm getting the sense that a public replacement for _Py_IDENTIFIER() would be worth pursuing. Considering that it would probably help numpy move toward subinterpreter support, I may work on this after all. :) (For core CPython we'll still benefit from the statically initialized strings, AKA gh-30928.) -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/D2LHEZZUQH66Q5ZIOEJTGSCEMQEMKCUQ/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Fri, Feb 4, 2022 at 8:21 AM Stefan Behnel wrote: > Correct. We (intentionally) have our own way to intern strings and do not > depend on CPython's identifier framework. You're talking about __Pyx_StringTabEntry (and __Pyx_InitString())? -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Q5AL3SLW5BCUA6FLDBUNZTH5Z7ZYAHER/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 4:01 PM Guido van Rossum wrote: > Why not read through some of that code and see what they are doing with it? Yep, I'm planning on it. > I imagine one advantage is that _Py_IDENTIFIER() can be used entirely local > to a function. Yeah, they'd have to put something like this in their module init:

    state->partial_str = PyUnicode_InternFromString("partial");
    if (state->partial_str == NULL) {
        return NULL;
    }

> E.g. (from _operator.c):
>
>     _Py_IDENTIFIER(partial);
>     functools = PyImport_ImportModule("functools");
>     if (!functools)
>         return NULL;
>     partial = _PyObject_GetAttrId(functools, &PyId_partial);
>
> That's convenient since it means they don't have to pass module state around.

I might call that cheating. :) For an extension module this means they are storing a little bit of their state in the runtime/interpreter state instead of in their module state. Is there precedent for that with any of our other API? Regardless, the status quo certainly is simpler (if they aren't already using module state in the function). Without _Py_IDENTIFIER() it would look like:

    functools = PyImport_ImportModule("functools");
    if (!functools)
        return NULL;
    my_struct *state = (my_struct*)PyModule_GetState(module);
    if (state == NULL) {
        Py_DECREF(functools);
        return NULL;
    }
    partial = PyObject_GetAttr(functools, state->partial_str);

If they are already using the module state in their function then the code would be simpler:

    functools = PyImport_ImportModule("functools");
    if (!functools)
        return NULL;
    partial = PyObject_GetAttr(functools, state->partial_str);

-eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4625QOLXLZAAU2XNXEQM5W2JWX3FH4VM/
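For comparison, here is a rough Python-level analogue of the module-state pattern in the C snippets above: intern the attribute name once at "init" time, then reuse it for lookups (the names here are illustrative, not part of any real API):

```python
import sys
import functools

# Stand-in for state->partial_str: interned once, reused for every lookup.
PARTIAL_STR = sys.intern("partial")

# Stand-in for PyObject_GetAttr(functools, state->partial_str):
partial = getattr(functools, PARTIAL_STR)
assert partial is functools.partial
```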
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 7:26 AM Victor Stinner wrote: > In bpo-39465, I made the _PyUnicode_FromId() compatible with running > sub-interpreters in parallel (one GIL per interpreter). > > A "static" PyUnicodeObject would have to share the reference count > between sub-interpreters, whereas Py_INCREF/Py_DECREF are not > thread-safe: there is no lock to prevent data races. Yeah, if we end up not being able to safely share the global string objects between interpreters then we would move them under PyInterpreterState. Currently I'm putting them under _PyRuntimeState. Doing that might reduce the performance benefits a little, since Py_GET_GLOBAL_STRING() would have to look up the interpreter to use (or we'd have to pass it in). That doesn't seem like much of a penalty though and doesn't impact the other benefits of the change. > Is there a way to push the "immortal objects" strategy discussed in > bpo-40255? I'm planning on circling back to that next week. > The deepfreeze already pushed some functions related to > that, like _PyObject_IMMORTAL_INIT() in the internal C API. > Moreover... deepfreeze already produces "immortal" PyUnicodeObject > strings using the "ob_refcnt = 9" hack. Note we only set the value really high as a safety precaution since these objects are all statically allocated. Eddie Elizondo's proposal involves a number of other key points, including keeping the refcount from changing. > IMO we should decide on a strategy. Either we move towards immortal > objects (modify Py_INCREF/Py_DECREF to not modify the ref count if an > object is immortal), or we make sure that no Python object is shared between > two Python interpreters. +1 The catch is that things get messier when we make some objects per-interpreter while others stay runtime-global.
I'm going to write a bit more about this next week, but the best strategy will probably be to first consolidate all the global objects under _PyRuntimeState and then move them to PyInterpreterState all at once when we can do it safely. > > I'd also like to actually get rid of _Py_IDENTIFIER(), along with > > other related API including ~14 (private) C-API functions. Dropping > > all that helps reduce maintenance costs. > Is it required by your work on static strings, or is it more about > removing the API which would no longer be consumed by Python itself? It is definitely not required for that. Rather, we won't need it any more so we should benefit from getting rid of it. The only blocker is that some 3rd party modules are using it. > If it's not required, would it make sense to follow the PEP 387 > deprecation (mark functions as deprecated, document the deprecation, > and wait 2 releases to remove it)? If you think it's worth it. It's a private API. I'd rather work to get 3rd party modules off it and then move on sooner. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WR33NVCSIHOMN5X7YGCL2DHNCBQGKWAU/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 7:17 AM Victor Stinner wrote: > In the top 5000 PyPI projects, I found 11 projects using them: > [snip] > They use these 17 functions: Thanks! That is super helpful. > If the _Py_IDENTIFIER() API is removed, it would be *nice* to provide > a migration path (tool?) to help these projects move away from the > _Py_IDENTIFIER() API. Or at least do the work to update these 11 > projects. If something like _Py_IDENTIFIER() provides genuine value then we should consider a proper public API. Otherwise I agree that we should work with those projects to stop using it. I guess either way they should stop using the "private" API. :) -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IGXAXUSDBKYOOVVFSAUYLE5R5TXVZT4A/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 6:46 AM Ronald Oussoren wrote: > Although my gut feeling is that adding the CI check you mention is good > enough and adding the tooling for generating code isn’t worth the additional > complexity. Yeah, I came to the same conclusion. :) -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FAMDFSTN5KR2Z7LOVTK5GGF6YKR6G65Z/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Wed, Feb 2, 2022 at 11:50 PM Inada Naoki wrote: > It would be nice to provide something similar to _Py_IDENTIFIER, but > designed (and documented) for 3rd party modules like this. I suppose I'd like to know what the value of _Py_IDENTIFIER() is for 3rd party modules. They can already use PyUnicode_InternFromString() to get a "global" object and then store it in their module state. I would not expect _Py_IDENTIFIER() to provide much of an advantage over that. Perhaps I'm missing something? If there is a real benefit then we should definitely figure out a good public API for it (if the current private one isn't sufficient). I won't be authoring that PEP though. :) -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AVJKKWITJPHUQTE2IXDYBCTQTKVPZPD7/
[Python-Dev] Moving away from _Py_IDENTIFIER().
I'm planning on moving us to a simpler, more efficient alternative to _Py_IDENTIFIER(), but want to see if there are any objections first before moving ahead. Also see https://bugs.python.org/issue46541.

_Py_IDENTIFIER() was added in 2011 to replace several internal string object caches and to support cleaning up the cached objects during finalization. A number of "private" functions (each with a _Py_Identifier param) were added at that time, mostly corresponding to existing functions that take PyObject* or char*. Note that at present there are several hundred uses of _Py_IDENTIFIER(), including a number of duplicates.

My plan is to replace our use of _Py_IDENTIFIER() with statically initialized string objects (as fields under _PyRuntimeState). That involves the following:

* add a PyUnicodeObject field (not a pointer) to _PyRuntimeState for each string that currently uses _Py_IDENTIFIER() (or _Py_static_string())
* statically initialize each object as part of the initializer for _PyRuntimeState
* add a macro to look up a given global string
* update each location that currently uses _Py_IDENTIFIER() to use the new macro instead

Pros:

* reduces indirection (and extra calls) for C-API functions that need the strings (making the code a little easier to understand and speeding it up)
* the objects are referenced from a fixed address in the static data section instead of the heap (speeding things up and allowing the C compiler to optimize better)
* there is no lazy allocation (or lookup, etc.) so there are fewer possible failures when the objects get used (thus less error return checking)
* saves memory (a little, at least)
* if needed, the approach for per-interpreter is simpler
* helps us get rid of several hundred static variables throughout the code base
* allows us to get rid of _Py_IDENTIFIER() and a bunch of related C-API functions
* "deep frozen" modules can use the global strings
* commonly-used strings could be pre-allocated by adding _PyRuntimeState fields for them

Cons:

* a little less convenient: adding a global string requires modifying a separate file from the one where you actually want to use the string
* strings can get "orphaned" (I'm planning on checking in CI)
* some strings may never get used for any given ./python invocation (not that big a difference though)

I have a PR up (https://github.com/python/cpython/pull/30928) that adds the global strings and replaces use of _Py_IDENTIFIER() in our code base, except in non-builtin stdlib extension modules. (Those will be handled separately if we proceed.) The PR also adds a CI check for "orphaned" strings. It leaves _Py_IDENTIFIER() for now, but disallows any Py_BUILD_CORE code from using it. With that change I'm seeing a 1% improvement in performance (see https://github.com/faster-cpython/ideas/issues/230).

I'd also like to actually get rid of _Py_IDENTIFIER(), along with other related API including ~14 (private) C-API functions. Dropping all that helps reduce maintenance costs. However, at least one PyPI project (blender) is using _Py_IDENTIFIER(). So, before we could get rid of it, we'd first have to deal with that project (and any others).

To sum up, I wanted to see if there are any objections before I start merging anything. Thanks!
-eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DNMZAMB4M6RVR76RDZMUK2WRLI6KAAYS/
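To make the shape of the plan concrete, here is a hypothetical Python model of the eager global-strings table (the real version embeds PyUnicodeObject values directly in _PyRuntimeState and the lookup is a C macro; all names below are invented for illustration):

```python
import sys
from types import SimpleNamespace

# Hypothetical model of the plan above: every needed string is created
# eagerly, up front, in one table -- so later lookups never allocate
# and never fail, unlike _Py_IDENTIFIER()'s lazy per-site caches.
GLOBAL_STRINGS = SimpleNamespace(
    partial=sys.intern("partial"),
    update=sys.intern("update"),
)

# The lookup "macro" then reduces to a plain attribute access:
assert GLOBAL_STRINGS.partial is sys.intern("partial")
assert GLOBAL_STRINGS.update is sys.intern("update")
```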
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Wed, Feb 2, 2022 at 3:41 PM Eric Snow wrote: > I'd also like to actually get rid of _Py_IDENTIFIER(), along with > other related API including ~14 (private) C-API functions. FTR, here is the (private/internal) C-API affected by getting rid of _Py_IDENTIFIER():

* 21 C-API functions with `_Py_Identifier` parameters - would be dropped
  + _PyUnicode_FromId()
  + _PyUnicode_EqualToASCIIId()
  + _PyObject_CallMethodId()
  + _PyObject_CallMethodId_SizeT()
  + _PyObject_CallMethodIdObjArgs()
  + _PyObject_VectorcallMethodId()
  + _PyObject_CallMethodIdNoArgs()
  + _PyObject_CallMethodIdOneArg()
  + _PyEval_GetBuiltinId()
  + _PyDict_GetItemId()
  + _PyDict_SetItemId()
  + _PyDict_DelItemId()
  + _PyDict_ContainsId()
  + _PyImport_GetModuleId()
  + _PyType_LookupId()
  + _PyObject_LookupSpecial()
  + _PyObject_GetAttrId()
  + _PyObject_SetAttrId()
  + _PyObject_LookupAttrId()
  + _PySys_GetObjectId()
  + _PySys_SetObjectId()
* 7 new internal functions to replace the _Py*Id() functions that didn't already have a normal counterpart
  + _PyObject_CallMethodObj()
  + _PyObject_IsSingleton()
  + _PyEval_GetBuiltin()
  + _PySys_SetAttr()
  + _PyObject_LookupSpecial() (with PyObject* param)
  + _PyDict_GetItemWithError()
  + _PyObject_CallMethod()
* the runtime state related to identifiers - would be dropped
* _Py_Identifier, _Py_IDENTIFIER(), _Py_static_string() - would be dropped

-eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OXKABHIUDUQETWXXBKUWD63XN65IVC22/
[Python-Dev] Re: Python no longer leaks memory at exit
On Thu, Jan 27, 2022 at 8:40 AM Victor Stinner wrote: > tl; dr Python no longer leaks memory at exit on the "python -c pass" command > ;-) Thanks to all for the effort on this! Would it be worth adding a test to make sure we don't start leaking memory again? -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ELLXKXMQAZ3WMLDDNKU7QLR6AGE36JJR/
[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)
On Wed, Jan 5, 2022, 15:02 Trent Nelson wrote: > I thought that was pretty interesting. Potentially many, many upper > bits for the taking. The code also had some logic that would int 3 > as soon as a 32-bit refcnt overflowed, and that never hit either > (obviously, based on the numbers above). > > I also failed to come up with real-life code that would result in a > Python object having a reference count higher than None's refcnt, but > that may have just been from lack of creativity. > > Just thought I'd share. Thanks, Trent. That's super helpful. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FMSE7AFZVJVBFRQMMYAEAXELITHN2E3B/
[Python-Dev] Re: Static types and subinterpreters running in parallel
On Thu, Dec 16, 2021 at 10:54 AM Guido van Rossum wrote: > Eric has been looking into this. It's probably the only solution if we can't > get immutable objects. Yep. I've investigated the following approach (for the objects exposed in the public and limited C-API):

* add a pointer field to PyInterpreterState (or a sub-struct) for each of the objects
* for the main interpreter, set those pointers to the existing statically declared objects
* for subinterpreters make a copy (memcpy()?) and fix it up
* add a lookup API and encourage extensions to use it
* for 3.11+ change the symbols to macros:
  + in the internal C-API (Py_BUILD_CORE), the macro would resolve to the corresponding PyInterpreterState field
  + in the public C-API (and limited API extensions built with 3.11+), the macro would resolve to a call to a (non-inline) lookup function
  + for limited API extensions built against earlier Python versions we'd still export the existing symbols
* limited API extensions built against pre-3.11 Python would only be allowed to run in the main interpreter on 3.11+
  + they probably weren't built with subinterpreters in mind anyway

There are still a number of details to sort out, but nothing that seems like a huge obstacle. Here are the ones that come to mind, along with other details, caveats, and open questions:

* the static types exposed in the C-API are PyObject values rather than pointers
  + I solved this by dereferencing the result of the lookup function (Guido's idea), e.g. #define PyTuple_Type (*(_Py_GetObject_Tuple()))
* there is definitely a penalty to using a per-interpreter lookup function
  + this would only apply to extension modules since internally we would access the PyInterpreterState fields directly
  + this is mostly a potential problem only when the object is directly referenced frequently (e.g. a tight loop)
  + the impact would probably center on use of the high-frequency singletons (None, True, False) and possibly with Py*_CheckExact() calls
  + would it be enough of a problem to be worth mitigating? how would we do so?
* static types in extensions can't have tp_base set to a builtin type (since the macro won't resolve)
  + extensions that support subinterpreters (i.e. PEP 489) won't be using static types (a weak assumption)
  + extensions that do not support subinterpreters and still have static types would probably break
  + how to fix that?
* limited API extensions built against 3.11+ but running under older Python versions would break?
  + how to fix that?

> But I would prefer the latter, if we can get the performance penalty low > enough.

Absolutely. Using immortal objects to solve this is a much simpler solution. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7RPTHCLEUHR34PIJKRN453UEWCAI56NW/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On Thu, Dec 16, 2021 at 4:34 AM Antoine Pitrou wrote: > As a data point, in PyArrow, we have a bunch of C++ code that interacts > with Python but doesn't belong in a particular Python module. That C++ > code can of course have global state, including perhaps Python objects. Thanks for that example! > What might be nice would be a C API to allow creating interpreter-local > opaque structs, for example: > > void* Py_GetInterpreterLocal(const char* unique_name); > void* Py_SetInterpreterLocal(const char* unique_name, > void* ptr, void(*)() destructor); That's interesting. I can imagine that as just a step beyond the module state API, with the module being implicit. Do you think this would be an improvement over using module state? (I'm genuinely curious.) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KB7ET6XXJFTJDBHL7ABEPSGTD3M2RNAW/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)
On Thu, Dec 16, 2021 at 2:48 AM Petr Viktorin wrote: > But does the sign bit need to stay intact, and do we actually need to > rely on the immortal bit to always be set for immortal objects? > If the refcount rolls over to zero, an immortal object's dealloc could > bump it back and give itself another few minutes. > Allowing such rollover would mean having to deal with negative > refcounts, but that might be acceptable. FWIW, my original attempt at immortal objects (quite a while ago) used the sign bit as the marker (negative refcount meant immortal). However, this broke GC and Py_DECREF() and getting those to work right was a pain. It also made a few things harder to debug because a negative refcount no longer necessarily indicated something had gone wrong. In the end I switched to a really high bit as the marker and it was all much simpler. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LJ2WVSUPJY2X3VVJW4EEEFNOBRJ7AB4V/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)
On Tue, Dec 14, 2021 at 10:12 AM Eric Snow wrote: > * it is fully backward compatible and the C-API is essentially unaffected Hmm, this is a little misleading. It will definitely be backward incompatible for extension modules that don't work under multiple subinterpreters (or rely on the GIL to protect global state). Hence that other thread I started. :) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UDMQXP6GO5SYJGHKHX2W4VRSNAZ55PMI/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 12:18 PM Chris Angelico wrote: > Sorry if this is a dumb question, but would it be possible to solve > that last point with an immortal arena [1] from which immortal objects > could be allocated? None/True/False could be allocated there, but so > could anything that is more dynamic, if it's decided as important > enough. It would still be possible to recognize them by pointer (since > the immortal arena would be a specific block of memory). That's an interesting idea. An immortal arena would certainly be one approach to investigate. However, I'm not convinced there is enough value to justify going out of our way to allow dynamically allocated objects to be immortal. Keep in mind that the concept of immortal objects would probably not be available outside the internal API, and, internally, any objects we want to be immortal will probably be statically allocated. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/I2ZG4J577Q4CDWXQHYCOMOFMPJPP5XJT/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Tue, Dec 14, 2021 at 11:19 AM Eric Snow wrote: > There is one solution that would help both of the above in a nice way: > "immortal" objects. FYI, here are some observations that came up during some discussions with the "faster-cpython" team today: * immortal objects should probably only be immutable ones (other than ob_refcnt, of course) * GC concerns are less of an issue if a really high ref count (bit) is used to identify immortal objects * ob_refcnt is part of the public API (sadly), so using it to mark immortal objects may be sensitive to interference * ob_refcnt is part of the stable ABI (even more sadly), affecting any solution using ref counts * using the ref count isn't the only viable approach; another would be checking the pointer itself + put the object in a specific section of static data and compare the pointer against the bounds + this avoids loading the actual object data if it is immortal + for objects that are mostly treated as markers (e.g. None), this could have a meaningful impact + not compatible with dynamically allocated objects -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LVLFPOIOXM34NQ2G73BAXIRS4TIN74JV/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)
On Wed, Dec 15, 2021 at 6:16 AM Antoine Pitrou wrote: > Did you try to take into account the envisioned project for adding a > "complete" GC and removing the GIL? Yeah. I was going to start a separate thread about per-interpreter GIL vs. no-gil, but figured I was already pushing my luck with 3 simultaneous related threads here. :) It would definitely be covered by the info doc/PEP. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XHJ3PNBW23HXCT4BI3LXYFE4Q5NW576P/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 8:16 AM Skip Montanaro wrote: > It might be worth (re)reviewing Sam Gross's nogil effort to see how he > approached this: Yeah, there is good info in there. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BRQQ4FKWPXIEBSPKR4G2UUC4U4LDF3OV/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 4:03 AM Victor Stinner wrote: > The last time I saw a benchmark on immortal objects, it was clearly 10% > slower overall on the pyperformance benchmark suite. That's a major > slowdown. Yes, I plan on benchmarking the change as soon as we can run pyperformance on main. > > * abandon all hope > > I wrote https://bugs.python.org/issue39511 and > https://github.com/python/cpython/pull/18301 to have per-interpreter > None, True and False singletons. Yeah, I took a similar approach in the alternative to immortal objects that I prototyped. > By the way, I made the _Py_IDENTIFIER() API and _PyUnicode_FromId() > compatible with subinterpreters in Python 3.10. This change caused a > subtle regression when using subinterpreters (because an optimization > relied on an assumption about interned strings which is no longer true). > The fix is trivial but I didn't write it yet: > https://bugs.python.org/issue46006 FYI, I'm looking into statically allocating (and initializing) all the string objects currently using _Py_IDENTIFIER(). -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3XG4QY77MCRXEFUCJHB44RRIHFEM4MDD/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Tue, Dec 14, 2021 at 11:19 AM Eric Snow wrote: > The idea of objects that never get deallocated isn't new and has been > explored here several times. Not that long ago I tried it out by > setting the refcount really high. That worked. Around the same time > Eddie Elizondo at Facebook did something similar but modified > Py_INCREF() and Py_DECREF() to keep the refcount from changing. Our > solutions were similar but with different goals in mind. (Facebook > wants to avoid copy-on-write in their pre-fork model.) FTR, here are links to the above efforts: * reducing CoW (Instagram): https://bugs.python.org/issue40255 * Eddie's PR: https://github.com/python/cpython/pull/19474 * my PR: https://github.com/python/cpython/pull/24828 * some other discussion: https://github.com/faster-cpython/ideas/issues/14 (I don't have a link to any additional work Eddie did to reduce the performance penalty.) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OUJHQY22BZY5TJXYGPQQOBTCLUWB6OVQ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 2:42 AM Christian Heimes wrote: > Would it be possible to write the Py_INCREF() and Py_DECREF() macros in > a way that does not depend on branching? For example we could use the > highest bit of the ref count as an immutable indicator and do something like As Antoine pointed out, wouldn't that cause too much cache invalidation between threads, especially for None, True, and False. That's the main reason I abandoned my previous effort (https://github.com/ericsnowcurrently/cpython/pull/9). -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UA7CVGRI4N6ADOHDPMM4GC66XYKTW3KL/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 2:50 AM Pablo Galindo Salgado wrote: > One thing to consider: ideally, inmortal objects should not participate in > the GC. There is nothing inheritly wrong if they do but we would need to > update the GC (and therefore add more branching in possible hot paths) to > deal with these as the algorithm requires the refcount to be exact to > correctly compute the cycles. That's a good point. Do static types and the global singletons already opt out of GC participation? -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ASIWOGWC5CKB3TNIFYS6767HEES5ATSP/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Tue, Dec 14, 2021 at 4:09 PM Brett Cannon wrote: > There's also the concern of memory usage if these immortal objects are never > collected. > > But which objects are immortal? You only listed None, True, and False. > Otherwise assume/remember I'm management and provide a list and/or link of > what would get marked as immortal so we can have an idea of the memory impact. Pretty much we would mark as immortal any object which would exist for the lifetime of the runtime (or the respective interpreter in some cases). So currently that would include the global singletons (None, True, False, small ints, empty tuple, etc.) and the static types. We would likely also include cached strings (_Py_Identifier, interned, etc.). From another angle: I'm working on static allocation for nearly all the objects currently dynamically allocated during runtime/interpreter init. All of them would be marked immortal. This is similar to the approach taken by Eddie with walking the heap and marking all objects found. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JKRY6FQYZIFFYQ64BSKLFGWUKX74NZ7M/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
Yeah, no (mutable) global state at the C level. It would also be good to implement multi-phase init (PEP 489), but I don't expect that to require much work itself. -eric On Tue, Dec 14, 2021 at 4:04 PM Brett Cannon wrote: > > > > On Tue, Dec 14, 2021 at 9:41 AM Eric Snow wrote: >> >> One of the open questions relative to subinterpreters is: how to >> reduce the amount of work required for extension modules to support >> them? Thanks to Petr Viktorin for a lot of work he's done in this >> area (e.g. PEP 489)! Extensions also have the option to opt out of >> subinterpreter support. >> >> However, that's only one part of the story. A while back Nathaniel >> expressed concerns with how making subinterpreters more accessible >> will have a negative side effect affecting projects that publish large >> extensions, e.g. numpy. Not all extensions support subinterpreters >> due to global state (incl. in library dependencies). The amount of >> work to get there may be large. As subinterpreters increase in usage >> in the community, so will demand increase for subinterpreter support >> in those extensions. Consequently, such projects be pressured to do >> the extra work (which is made even more stressful by the short-handed >> nature of most open source projects) . >> >> So we (the core devs) would effectively be requiring those extensions >> to support subinterpreters, regardless of letting them opt out. This >> situation has been weighing heavily on my mind since Nathaniel brought >> this up. Here are some ideas I've had or heard of about what we could >> do to help: >> >> * add a page to the C-API documentation about how to support subinterpreters >> * identify the extensions most likely to be impacted and offer to help >> * add more helpers to the C-API to make adding subinterpreter support >> less painful >> * fall back to loading the extension in its own namespace (e.g. 
use >> ldm_open()) >> * fall back to copying the extension's file and loading from the copied file >> * ... >> >> I'd appreciate your thoughts on what we can do to help. Thanks! > > > What are the requirements put upon an extension in order to support > subinterpreters? you hint at global state at the C level, but nothing else is > mentioned. Is that it? ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BQU3PVN6MHR2P24RAUPJSWFS547W7FPM/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] "immortal" objects and how they would help per-interpreter GIL
Most of the work toward interpreter isolation and a per-interpreter GIL involves moving static global variables to _PyRuntimeState or PyInterpreterState (or module state). Through the effort of quite a few people, we've made good progress. However, many globals still remain, with the majority being objects and most of those being static strings (e.g. _Py_Identifier), static types (incl. exceptions), and singletons. On top of that, a number of those objects are exposed in the public C-API and even in the limited API. :( Dealing with this specifically is probably the trickiest thing I've had to work through in this project. There is one solution that would help both of the above in a nice way: "immortal" objects. The idea of objects that never get deallocated isn't new and has been explored here several times. Not that long ago I tried it out by setting the refcount really high. That worked. Around the same time Eddie Elizondo at Facebook did something similar but modified Py_INCREF() and Py_DECREF() to keep the refcount from changing. Our solutions were similar but with different goals in mind. (Facebook wants to avoid copy-on-write in their pre-fork model.) A while back I concluded that neither approach would work for us. The approach I had taken would have significant cache performance penalties in a per-interpreter GIL world. The approach that modifies Py_INCREF() has a significant performance penalty due to the extra branch on such a frequent operation. Recently I've come back to the idea of immortal objects because it's much simpler than the alternate (working) solution I found. So how do we get around that performance penalty? Let's say it makes CPython 5% slower. We have some options: * live with the full penalty * make other changes to reduce the penalty to a more acceptable threshold than 5% * eliminate the penalty (e.g. claw back 5% elsewhere) * abandon all hope Mark Shannon suggested to me some things we can do. 
Also, from a recent conversation with Dino Viehland it sounds like Eddie was able to reach performance neutrality with a few techniques. So here are some things we can do to reduce or eliminate that penalty: * reduce refcount operations on high-activity objects (e.g. None, True, False) * reduce refcount operations in general * walk the heap at the end of runtime initialization and mark all objects as immortal * mark all global objects as immortal (statics or in _PyRuntimeState; not needed for PyInterpreterState) What do you think? Does this sound realistic? Are there additional things we can do to counter that penalty? -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7O3FUA52QGTVDC6MDAV5WXKNFEDRK5D6/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] subinterpreters and their possible impact on large extension projects
One of the open questions relative to subinterpreters is: how to reduce the amount of work required for extension modules to support them? Thanks to Petr Viktorin for a lot of work he's done in this area (e.g. PEP 489)! Extensions also have the option to opt out of subinterpreter support. However, that's only one part of the story. A while back Nathaniel expressed concerns about how making subinterpreters more accessible will have a negative side effect affecting projects that publish large extensions, e.g. numpy. Not all extensions support subinterpreters due to global state (incl. in library dependencies). The amount of work to get there may be large. As subinterpreters increase in usage in the community, so will the demand for subinterpreter support in those extensions. Consequently, such projects would be pressured to do the extra work (made even more stressful by the short-handed nature of most open-source projects). So we (the core devs) would effectively be requiring those extensions to support subinterpreters, regardless of letting them opt out. This situation has been weighing heavily on my mind since Nathaniel brought this up. Here are some ideas I've had or heard of about what we could do to help: * add a page to the C-API documentation about how to support subinterpreters * identify the extensions most likely to be impacted and offer to help * add more helpers to the C-API to make adding subinterpreter support less painful * fall back to loading the extension in its own namespace (e.g. use dlmopen()) * fall back to copying the extension's file and loading from the copied file * ... I'd appreciate your thoughts on what we can do to help. Thanks!
-eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/X3ZOSP2A4RTSKTBZ4XYHROSJBONCEDID/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] my plans for subinterpreters (and a per-interpreter GIL)
Hi all, I'm still hoping to land a per-interpreter GIL for 3.11. There is still a decent amount of work to be done but little of it will require solving any big problems: * pull remaining static globals into _PyRuntimeState and PyInterpreterState * minor updates to PEP 554 * finish up the last couple pieces of the PEP 554 implementation * maybe publish a companion PEP about per-interpreter GIL There are also a few decisions to be made. I'll open a couple of other threads to get feedback on those. Here I'd like your thoughts on the following: Do we need a PEP about per-interpreter GIL? I haven't thought there would be much value in such a PEP. There doesn't seem to be any decision that needs to be made. At best the PEP would be an explanation of the project, where: * the objective has gotten a lot of support (and we're working on addressing the concerns of the few objectors) * most of the required work is worth doing regardless (e.g. improve runtime init/fini, eliminate static globals) * the performance impact is likely to be a net improvement * it is fully backward compatible and the C-API is essentially unaffected So the value of a PEP would be in consolidating an explanation of the project into a single document. It seems like a poor fit for a PEP. (You might wonder, "what about PEP 554?" I purposefully avoided any discussion of the GIL in PEP 554. Its purpose is to expose subinterpreters to Python code.) However, perhaps I'm too close to it all. I'd like your thoughts on the matter. Thanks! -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PNLBJBNIQDMG2YYGPBCTGOKOAVXRBJWY/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Explicit markers for special C-API situations (re: Clarification regarding Stable ABI and _Py_*)
On Thu, Dec 9, 2021, 11:26 Petr Viktorin wrote: > I'll not get back to CPython until Tuesday, but I'll add a quick note > for now. It's a bit blunt for lack of time; please don't be offended. Not at all. :) The tooling is a secondary concern to my point. Mostly, I wish the declarations in the header files had the extra classifications, rather than having to remember to refer to a separate text file. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OV7BOBOINCBWLZS3DZRWWJGY3BE4IOZB/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Explicit markers for special C-API situations (re: Clarification regarding Stable ABI and _Py_*)
(replying to https://mail.python.org/archives/list/python-dev@python.org/message/OJ65FPCJ2NVUFNZDXVNK5DU3R3JGLL3J/) On Wed, Dec 8, 2021 at 10:06 AM Eric Snow wrote: > What about the various symbols listed in Misc/stable_abi.txt that were > accidentally added to the limited API? Can we move toward dropping > them from the stable ABI? tl;dr We should consider making classifications related to the stable ABI harder to miss. Knowing what is in the limited API is fairly straightforward. [1] However, it's clear that identifying what is part of the stable ABI, and why, is not so easy. Currently, we must rely on Misc/stable_abi.txt [2] (and the associated Tools/scripts/stable_abi.py). Documentation (C-API docs, PEPs, devguide) help too. Yet, there's a concrete disconnect here: the header files are by definition the authoritative single-source-of-truth for the C-API and it's too easy to forget about supplemental info in another file or document. This out-of-sight-out-of-mind situation is part of how we accidentally added things to the limited API for a while. [3] The stable ABI isn't the only area where we must identify different subsets of the C-API. However, in those other cases we use different structural/naming conventions to explicitly group things. Most importantly, each of those conventions makes the grouping unavoidable when reading the code. [4] For example: * closely related declarations go in the same header file (and then also exposed via Include/Python.h) * prefixes (e.g. Py_, PyDict_) provides similar grouping * an additional underscore prefix identifies "private" C-API * symbols are explicitly identified as part of the C-API via macros (PyAPI_FUNC, PyAPI_DATA) [5] * relatively recently, different directories correspond to different API layers (Include, Include/cpython, Include/internal) [3] Could we take a similar explicit, coupled-to-the-code approach to identify when the different stable ABI situations apply? 
Here's the specific approach I had in mind, with macros similar to PyAPI_FUNC: * PyAPI_ABI_FUNC - in stable ABI when it wouldn't be normally (e.g. underscore prefix, in Include/internal) * PyAPI_ABI_INDIRECT - exposed in stable ABI due to a macro * PyAPI_ABI_ONLY - it only exists for ABI compatibility and isn't actually used any more * PyAPI_ABI_ACCIDENTAL - unintentionally added to limited API, probably not used there (...or perhaps use a PyABI_ prefix, though that's a bit easy to miss when reading.) As a reader I would find markers like this helpful in recognizing those special situations, as well as the constraints those situations impose on modification. At the least such macros would indicate something different is going on, and the macro name would be something I could look up if I needed more info. I expect others reading the code would get comparable value. I also expect tools like Tools/scripts/stable_abi.py would benefit. -eric [1] in Include/*.h and not #ifndef Py_LIMITED_API (sadly also making it easy to accidentally add things to the limited API, see [3]) [2] Before that you had to rely on comments or external documents or, in the worst case, work it out through careful study of the code, commit history, and mailing list archives. [3] The addition of Include/cpython and Include/internal helped us stop accidentally adding to the limited API. [4] It also makes the groupings deterministically discoverable by tools. [5] explicit use of "extern" indicates a different intent ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7BSVTXDYCEOURQTLDRUXPXNPRYMM3I4G/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*
On Thu, Dec 9, 2021 at 1:56 AM Petr Viktorin wrote: > It's possible to remove them just like _PyObject_GC_Malloc was removed, > but check that it was unusable (e.g. not called from public macros) in > all versions of Python from 3.2 up to now. That's what I expected. Thanks. > Could you check if this PR makes things clear? > https://github.com/python/devguide/pull/778 Yeah, that text is super helpful. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TB7JEBQXUJJKK4SZVLCMUNOTRTD5KQ5C/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*
On Wed, Dec 8, 2021 at 2:23 AM Petr Viktorin wrote: > That really depends on what function we'd want to remove. There are > usually alternatives to deleting things, but the options depend on the > function. If we run out of other options we can make the function always > fail or make it leak memory. > And the regular backwards compatibility policy gives us 2 years to > figure something out :) What about the various symbols listed in Misc/stable_abi.txt that were accidentally added to the limited API? Can we move toward dropping them from the stable ABI? Most notably, there are quite a few functions listed there that are in the stable ABI but no longer in the limited API. This implies that either they were already deprecated in the limited API (and removed) or they were just removed. At least in some cases they were moved to header files in Include/cpython or Include/internal. So I would not expect extensions to be using them. This subset of those symbols seems entirely appropriate to remove from the stable ABI. Is that okay? Do we even need to bother deprecating them? What about just the "private" ones? For example, I went to change/remove _PyThreadState_Init() (internal API declared in Include/internal/pycore_pystate.h) and found that it is in the stable ABI but not the limited API. It's highly unlikely anyone is using it, but I plan on double-checking. As far as I can tell, the function was accidentally exposed in the limited API and stable ABI and later removed from the limited API. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OJ65FPCJ2NVUFNZDXVNK5DU3R3JGLL3J/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 6:55 AM Eric V. Smith wrote: > As a compromise, how about go with #1, but print a warning if python > detects that it's not built with optimizations or is run from a source > tree (the conditions in #2 and #3)? The warning could suggest running > with "-X frozen_modules=off". I realize that it will probably be ignored > over time, but maybe it will provide enough of a reminder if someone is > debugging and sees the warning. Yeah, that would probably be sufficient (and much simpler). I'll try it out. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/EFNVXABK36DTKO6IDFC2PTP6P4OHM46B/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 6:47 AM Pablo Galindo Salgado wrote: > One interesting consequence of what Eric mentioned (They have a different > loader and repr. Also, frozen modules do not > have __file__ set (and __path__ is always []).) is that frozen modules don't > have a `__file__` attribute IIRC and therefore > tracebacks won't include the source. FYI, we are planning on setting __file__ on the frozen stdlib modules, whenever possible. (We can do that whenever we can determine the stdlib dir during startup. See https://bugs.python.org/issue45211.) Regardless, for tracebacks we would need to set co_filename on the module's code objects, right? -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/W6F2V3H3KHGLOL5CJDLTO7DGO37LYIG5/ Code of Conduct: http://python.org/psf/codeofconduct/
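[Editorial aside: to illustrate the co_filename point, here is a small hedged sketch (the replacement path is a made-up placeholder). A code object's filename is baked in at compile time, but since Python 3.8 `CodeType.replace()` can derive a copy with a different one, which is what tracebacks for that code would then report.]

```python
def example():
    return 42

code = example.__code__
print(code.co_filename)  # wherever this file was compiled from

# Derive a copy of the code object bound to a different (hypothetical)
# source path; tracebacks through this code would show the new filename.
patched = code.replace(co_filename="/usr/lib/python3/os.py")
print(patched.co_filename)  # /usr/lib/python3/os.py
```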
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 6:36 AM Victor Stinner wrote: > Honestly, for me, #1: always on, is the most reasonable choice. > > I dislike when Python behaves differently depending on subtle things > like "was it built with optimizations" or "is Python started from its > source tree"? > > When I built Python without optimization and/or from its source tree, > I do that to debug an issue. If the bug goes away in this case, it can > waste my time. > > So I prefer to teach everybody how to use "-X frozen_modules=off" if > they want to hack the stdlib for their greatest pleasure. I prefer > that such special use case requires an opt-in option, the special use > case is not special enough to be the default. Agreed. I just don't want to discourage potential contributors nor waste anyone's time. I suppose that's the fundamental question I originally posted: would it be too annoying for contributors if we made the default "on" always? I expect most non-docs contributions are made against the stdlib so that factors in. > It means that the site module module can no longer be "customized" by > modifying directly the site.py file (inject a path in PYTHONPATH env > var where the customized site.py lives). But there is already a > supported way to customize the site module: create a module named > "sitecustomize" or "usercustomizer". I recall that virtualenv likes to > override stdlib site.py with its own code. tox uses virtualenv by > default. Someone should check if freezing site doesn't break > virtualenv and tox, since they seem to be popular in Python. The venv > doesn't need to override site.py and tox can use venv if I recall > correctly. > > If site.py customization is too popular, I would suggest to not freeze > this one, until the community stops doing that. Good point. I'll look into that. 
-eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/M53U66ZP7QUSHDBYK2HONALLKW2EKSFQ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 2:22 AM Marc-Andre Lemburg wrote: > #3 sounds like a good solution, but how would you detect "running > from the source tree" ? This sounds like you need another stat call > somewhere, which is what the frozen modules try to avoid. We already look for the stdlib dir in Modules/getpath.c. We can use that information without an extra stat. (See https://bugs.python.org/issue45211.) > I'd like to suggest adding an environment variable to enable / > disable the setting instead. This makes it easy to customize the > behavior without introducing complicated logic. That's essentially what "-X frozen_modules=..." provides, though with an env var you don't have to adjust your CLI invocation each time. That said, there are a couple reasons why an env var might not be suitable. For one, I expect use of the -X option to be very uncommon, especially outside of core development, so more of a one-off feature. In contrast, to me environment variables imply repeated usage. Also, if we use an env var to override the default (of "on"), contributors will still get bitten by the problem I described originally. To me, it's important that the default in that case be "off" without any other intervention. FWIW, I consider the "complicated logic" part as the negative side of going with running-in-source-tree. So, at this point I'm leaning more toward Brett's suggestion of using "configure --with-pydebug" (AKA Py_DEBUG) to determine the default. That should be a suitable approximation of running-in-source-tree. We can circle back if it proves inadequate. On Tue, Sep 28, 2021 at 2:26 AM Marc-Andre Lemburg wrote: > Just to clarify: the modules would still always be frozen with > the env var setting, but Python would simply not import them > as frozen modules, but instead go and look on the PYTHONPATH > for the modules. 
> > This could be achieved by special casing the frozen module > finder function to only trigger on importlib modules and > return NULL for all other possibly frozen modules. Right. That is essentially what we're doing. (See find_frozen() in Python/import.c.) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BVXMNZYHBPTKYC4QEVHGWUKQMLR2XGSZ/ Code of Conduct: http://python.org/psf/codeofconduct/
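[Editorial aside: for reference, the runtime value of a -X option can be inspected from Python through the CPython-specific `sys._xoptions` mapping, so code can tell which mode was requested; the fallback string below is just illustrative.]

```python
import sys

# "python -X frozen_modules=off" shows up as {"frozen_modules": "off"};
# if the option wasn't passed on the command line, the key is absent.
mode = sys._xoptions.get("frozen_modules", "<build default>")
print(mode)
```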
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 6:02 AM Ronald Oussoren via Python-Dev wrote: > Of course. I mentioned it because the proposal is to add a new option that’s > enabled after installation, and basically not when the testsuite is run. > That’s not a problem, we could just enable the option in most CI jobs. FYI, I already added the CLI option (-X frozen_modules=[on|off]) a couple weeks ago, with the default always "off", and have frozen about 10 of the stdlib modules (see _imp._frozen_module_names()). This thread is about a satisfactory approach to changing the default to "on". -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HD7GBPNT74GPY6COVQ6W4V7MTJ4NIHUT/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 2:54 AM Ronald Oussoren via Python-Dev wrote: > I agree, but… Most CPython tests are run while running from the source tree, > that means that there will have to be testrunner configurations that run with > “-X frozen_modules=on”. If the build option that determines the default is covered by existing builtbots then we will be running the test suite in both modes without any extra work. The alternative is that we do for other modules what we do with importlib: run the relevant tests one in each mode. However, it's better to run the whole suite in both modes, so I'd favor relying on the build-option-specific buildbots to get us coverage. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/M2LMJYETM7KXVAWQ6UY7DMAZUXO6H33K/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 3:31 PM Victor Stinner wrote:
> Which stdlib modules are currently frozen? If I really want to hack
> site.py or os.py for whatever reason, I just have to use "python3 -X
> frozen_modules=off"?

The single source of truth is Tools/scripts/freeze_modules.py. After running "make regen-frozen" you'll find a cleaner list in Python/frozen_modules/MANIFEST. You can also look at the generated code in Makefile.pre.in or Python/frozen.c. Finally, you can run "./python -X frozen_modules=on -c 'import _imp; print(_imp._frozen_module_names())'".

> > 1. always default to "on" (the annoyance for contributors isn't big enough?)
>
> What is the annoyance?

The annoyance of changes to the .py files not getting used (at least not until after running "make all").

> What is different between frozen and not frozen?

They have a different loader and repr. Also, frozen modules do not have __file__ set (and __path__ is always []).

-eric
___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/J5IINOU6JBHNBA4ZOTXWDCBC3QIQT2EF/ Code of Conduct: http://python.org/psf/codeofconduct/
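[Editorial aside: the loader difference can also be checked at runtime. A hedged sketch using the private `_imp` helpers mentioned above (private, so subject to change); the `os` lookup just shows the pattern, and its result depends on the build and on the -X frozen_modules setting.]

```python
import _imp
import importlib.util

# importlib's bootstrap modules are frozen in every CPython build.
print(_imp.is_frozen("_frozen_importlib"))  # True

# A module spec's origin shows how it would be loaded: a filesystem
# path for source modules, or "frozen" for the frozen importer.
spec = importlib.util.find_spec("os")
print(spec.origin)
```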
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 3:04 PM Barry Warsaw wrote: > If you’re planning a runtime -X option, then does that mean that the modules > will be frozen at build time but Python will decide at runtime whether to use > the frozen modules or the unfrozen ones? Correct. FYI, this was already done. > Are you planning on including the currently frozen importlib modules in that > same mechanism? No. They must always be frozen. See is_essential_frozen_module() in Python/import.c. > Will `make test` and/or CI run Python with both options? How will we make > sure that frozen modules (or not) don’t break Python? If "configure --with-optimizations" always sets the default to "on" and the default is "off" otherwise, then the PGO buildbots will exercise the frozen path. Likewise if "--with-pydebug" (or in-source-tree) makes the default "off" and otherwise it's "on". Without a build-time option already handled by one of the buildbots, we'd need to either add a dedicated buildbot or run it both ways (like we do with importlib). I expect that won't be necessary. > Option #3 seems like the most reasonable one to me, with the ability to turn > it on when running from the source tree. It's definitely the one that fits most naturally for me. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/I2JTBQSFFA2GFMSGRDGHDARUPSZTLMQ2/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 2:59 PM Brett Cannon wrote: > What about opting out when `--with-pydebug` is used? I'm not sure how many > people actively develop in a non-debug build other than testing something, > but at that point I would be having to run `make` probably anyway for > whatever I'm mucking with if it's that influenced by a debug build. Yeah, that's an option too. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/C2IMGPQMJQAFCH26SYHE4JE4WJRCPDBM/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 12:40 PM Steve Dower wrote: > Having it be implied by an "--enable-optimizations" option is totally > fine (and we'd add one to build.bat for this), but I still think it > needs to be discoverable later whether the frozen modules build option > was used or not, independent of other build options. That's reasonable. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KKLMPEB6SI2EC34MUPLSTFBJJYG4O4WE/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 10:51 AM Eric Snow wrote: > Possible solutions: > > 1. always default to "on" (the annoyance for contributors isn't big enough?) > 2. default to "on" if it's a PGO build (and "off" otherwise) > 3. default to "on" unless running from the source tree FWIW, I'm planning on doing (2) (and (3) if it isn't complicated). Mostly I wanted to verify my assumptions about the possible annoyance before getting too far. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/J656PLVTGTVDCLV2GSZPNV46UTKU4S7M/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 11:09 AM Chris Angelico wrote: > When exactly does the freezing happen? When you build the executable (e.g. "make -j8", ".\PCbuild\build.bat"). So your changes to those .py files wouldn't show up until then. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IGDYUDVFHDU77OLPP3744FIG3IHZWS4D/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] The Default for python -X frozen_modules.
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread. Possible solutions: 1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree Thoughts? -eric [1] https://bugs.python.org/issue45020 [2] FWIW, we may end up also freezing the modules imported for "python -m ...", along with some other commonly used modules (like argparse). That is a separate discussion. ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4ESW3NNOX43DRFKLEW3IMDXDKPDMNRGR/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: A better way to freeze modules
On Fri, Sep 3, 2021 at 5:32 AM Paul Moore wrote: > On Fri, 3 Sept 2021 at 10:29, Simon Cross > wrote: > > I think adding a meta path importer that reads from a standard > > optimized format could be a great addition. > > I think the biggest open question would be "what benefits does this > have over the existing zipimport?" +1 > > As you mentioned in your email, this is a big detour from the current > > start-up performance work, so I think practically the people working > > on performance are unlikely to take a detour from their detour right > > now. > > Agreed, it would probably have to be an independent development > initially. If it delivers better performance, then switching the > startup work to use it would give a second set of performance > improvements, which no-one is going to object to. Similarly, if it's > simpler to manage, then the maintainability benefits could justify > switching over. +1 > > * Write the meta path importer in a separate package (it sounds like > > you've already done a lot of the work and gained a lot of > > understanding of the issues while writing PyOxidizer!) > > This is the key thing, though. The import machinery allows new > importers to be written as standalone modules, so I'd strongly > recommend that the proposed format/importer gets developed as a PyPI > module initially, with the PEP then being simply a proposal that the > module gets added to the stdlib and/or built into the interpreter. FWIW, I'm a big fan of folks taking advantage of the flexibility of the import machinery and writing importers like this (especially ones that folks must explicitly enable). As noted elsewhere, it would need to prove its worth before we consider putting it into importlib. > The key argument would be bootstrapping, IMO. I would definitely expect > interest in something like this to be lower if it's an external module > (needing a dependency to load your other dependencies is suboptimal). 
> Conversely, though, if no-one shows any interest in a PyPI version of > this idea, that would strongly imply that it's not as useful in > practice as you'd hoped. Excellent point! > In particular, I'd involve the maintainers of pyinstaller in the > design. If a new "frozen module importer" mechanism isn't of interest > to them, it's probably not going to get the necessary support to be > worth adding to the stdlib. +1 > On a personal note, I love the flexibility of Python's import system, > and I've always wanted to write importers for additional storage > formats (import from a sqlite database, for instance). But I've never > actually done so, because a zipfile is basically always sufficient for > any practical use case I've had. One day I hope to find a real use > case, though :-) Cool! I'd love to see what you make. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YV2K3BPVDZRZTGLM4HWQEJWMVPI6BGHD/ Code of Conduct: http://python.org/psf/codeofconduct/
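[Editorial aside: to make "importers can be written as standalone modules" concrete, here is a minimal hedged sketch of a meta path importer. The in-memory `_STORE` dict and the `demo_mod` name are invented for the example, while the finder/loader hooks (`find_spec`, `create_module`, `exec_module`) are the standard importlib protocol.]

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "module store"; a real importer might read from
# a zipfile, a database, or frozen data compiled into the binary.
_STORE = {"demo_mod": "VALUE = 42\n"}

class StoreImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, fullname, path=None, target=None):
        if fullname in _STORE:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the other finders handle everything else

    def create_module(self, spec):
        return None  # default module creation is fine

    def exec_module(self, module):
        source = _STORE[module.__name__]
        exec(compile(source, f"<store:{module.__name__}>", "exec"),
             module.__dict__)

sys.meta_path.insert(0, StoreImporter())

import demo_mod
print(demo_mod.VALUE)  # 42
```

A PyPI package would swap the dict for an optimized on-disk format, but the import-system plumbing stays the same.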
[Python-Dev] Re: A better way to freeze modules
On Thu, Sep 2, 2021 at 10:46 PM Gregory Szorc wrote:
> Over in https://bugs.python.org/issue45020 there is some exciting work around
> expanding the use of the frozen importer to speed up Python interpreter
> startup. I wholeheartedly support the effort and don't want to discourage
> progress in this area.
>
> Simultaneously, I've been down this path before with PyOxidizer and feel like
> I have some insight to share.

Thanks for the support and for taking the time to share your insight! Your work on PyOxidizer is really neat.

Before I dive into replying, I want to be clear about what we are discussing here. There are two related topics: the impact of freezing stdlib modules and usability problems with frozen modules in general (stdlib or not). https://bugs.python.org/issue45020 is concerned with the former but prompted some good discussion about the latter. From what I understand, this python-dev thread is more about the latter (and then some). That's totally worth discussing! I just don't want the two topics to be unnecessarily conflated.

FYI, frozen modules (effectively the .pyc data) are compiled into the Python binary and then loaded from there during import rather than from the filesystem. This allows us to avoid disk access, giving us a performance benefit, but we still have to unmarshal and execute the module code. It also allows us to have the import machinery written in pure Python (importlib._bootstrap and importlib._bootstrap_external). (Thanks Brett!)

While frozen modules are derived from .py files, they currently have some differences from the corresponding source modules: the loader (which has less capability), the repr, frozen packages have __path__ set to [], and frozen modules don't have __file__, __cached__, etc. set. This has been the case for a long time. MAL worked on addressing __file__ but the effort stalled out. (See https://bugs.python.org/issue45020#msg400769 and especially https://bugs.python.org/issue21736.)
The challenge with solving this for non-stdlib modules is that the frozen importer would need help to know where to find the corresponding .py files.

bpo-45020 is about freezing a small subset of the stdlib as a performance improvement. It's the 11 stdlib modules (plus encodings) that get imported every time during "./python -c pass". Freezing them provides a roughly 15% startup time improvement. (The 11 modules are: abc, codecs, encodings, io, _collections_abc, _sitebuiltins, os, os.path, genericpath, site, and stat. Maybe there are a few other modules it would make sense to freeze but we're starting with those 11.)

This work is probably somewhat affected by the differences between frozen and source modules, and we may need to set an appropriate __file__ on frozen stdlib modules to avoid impacting folks that expect any of those stdlib modules to have it set. Otherwise, for bpo-45020 there likely isn't much more we need to do about frozen stdlib modules shipping with CPython by default. Regardless, bpo-45020 doesn't introduce any new problems; rather it slightly exposes the existing ones.

In contrast to the use of frozen modules in default Python builds, there are a number of tools in the community for freezing modules (both stdlib and not) into custom Python binaries, like PyOxidizer and MAL's PyRun. Such tools would benefit from broader compatibility between frozen modules and the corresponding source modules. Consequently the tool maintainers would be the most likely drivers of any effort to improve frozen modules (which the discussion with MAL and Gregory bears out). The tools would especially benefit if those improvements could apply to non-stdlib modules, which requires a more complex solution than is needed for stdlib modules. At the (relative) extreme is to throw out the existing frozen module approach (or even the "unmarshal + exec" approach of source-based modules) and replace it with something more efficient and/or more compatible (and cross-platform).
From what I understood, this is the main focus of this thread. It's interesting stuff and I hope the discussion renders a productive result. FTR, in bpo-45020 Gregory helpfully linked to some insightful material related to PyOxidizer and frozen modules: * https://github.com/indygreg/PyOxidizer/issues/69 * https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer_behavior_and_compliance.html?highlight=__file__#file-and-cached-module-attributes * https://pypi.org/project/oxidized-importer/ and https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer.html With that said, on to replying. :) > I don't think I'll be offending anyone by saying the existing CPython frozen > importer is quite primitive in terms of functionality: it does the minimum it > needs to do to support importing module bytecode embedded in the interpreter > binary [for purposes of bootstrapping the Python-based importlib modules]. > The C struct representing frozen modules is literally just the
[Python-Dev] Re: Is anyone relying on new-bugs-announce/python-bugs-list/bugs.python.org summaries
On Mon, Aug 23, 2021 at 4:16 PM Ammar Askar wrote: > As part of PEP 588, migrating bugs.python.org issues to Github, Thanks for working on this! > 1. Weekly summary emails with bug counts and issues from the week, > 2. Emails sent to the new-bugs-announce and python-bugs-list for new I rely on both these. They help improve signal-to-noise and make it easier to quickly get back up-to-date if I'm out for a while. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IZJQIDKKL7KBGPUEL2YQO44FZIMLTZPO/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Making code object APIs unstable
On Fri, Aug 13, 2021 at 11:29 AM Guido van Rossum wrote:
> If these weren't part of the stable ABI, I'd choose (E).

They aren't in the stable ABI (or limited API). Instead, they are part of the broader public API (declared in Include/cpython/code.h, along with "struct PyCodeObject" and others). FWIW, there is actually very little API related to PyCodeObject that is in the limited API:

* Include/code.h:typedef struct PyCodeObject PyCodeObject;
* Include/genobject.h:PyCodeObject *prefix##_code;
* Include/pyframe.h:PyAPI_FUNC(PyCodeObject *) PyFrame_GetCode(PyFrameObject *frame);

All that said, the issue of compatibility remains. I mostly agree with Guido's analysis and his choice of (E), as long as it's appropriately documented as unstable. However, I'd probably pick (C) with a caveat.

We already have a classification for this sort of unstable API: "internal". Given how code objects are so coupled to the CPython internals, I suggest that most API related to PyCodeObject belongs in the internal API (in Include/internal/pycore_code.h) and thus moved out of the public API. Folks that are creating code objects manually via the C-API are probably already doing low-level stuff that requires other "internal" API (via Py_BUILD_CORE, etc.). Otherwise they should use types.CodeType instead.

Making that change would naturally include dropping PyCode_New() and PyCode_NewWithPosArgs(), as described in (C). However, we already have _PyCode_New() in the internal API. (It is slightly different but effectively equivalent.) We could either drop the underscore on _PyCode_New() or move the existing PyCode_NewWithPosArgs() (renamed to PyCode_New) to live beside it.
-eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KWWRLL56EI2S5BVADKMDCG4UED76GXXG/ Code of Conduct: http://python.org/psf/codeofconduct/
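[Editorial aside: the `types.CodeType` route suggested above can be sketched from pure Python. `CodeType.replace()` (3.8+) fills in the many positional fields that PyCode_New() and PyCode_NewWithPosArgs() take in C; the function and the new name below are invented for the example.]

```python
import types

def greet():
    return "hello"

# Derive a tweaked code object in pure Python instead of calling
# PyCode_New() from C; replace() copies every field we don't override.
renamed = greet.__code__.replace(co_name="farewell")
farewell = types.FunctionType(renamed, greet.__globals__, "farewell")

print(farewell.__code__.co_name)  # farewell
print(farewell())                 # hello
```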
[Python-Dev] Re: Why aren't we allowing the use of C11?
On Thu, Jan 28, 2021 at 9:28 AM Mark Shannon wrote: > Is there a good reason not to start using C11 now? Would C17 be a better choice? It sounds like it exists to fix problems with C11 (and doesn't actually add any new features). -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/2D25C5KI73LBRVLFHDBGH4OIKSCCEPUO/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: PEP 622 railroaded through?
On Fri, Jul 3, 2020, 12:40 Eric Snow wrote:
> Also, keep in mind that PEPs are a tool for the decision maker (i.e.
> BDFL delegate). Effectively, everything else is convention. The process
> usually involves community feedback, but has never been community-driven.
> All this has become more painful for volunteers as the Python community has
> grown.
>
> -eric

To further elaborate on that, a PEP isn't legislation to be approved by the community. Rather, it is meant to capture the proposal and discussion sufficiently that the BDFL/delegate can make a good decision. Ultimately there isn't much more to the process than that, beyond convention. The BDFL-delegate is trusted to do the right thing and the steering council is there as a backstop. It's up to the decision maker to reach a conclusion and it makes sense that they especially consider community impact. However, there is no requirement of community approval.

This is not new. Over the years quite a few decisions by Guido (as BDFL) sparked controversy, yet in hindsight Python is better for each of those decisions. (See PEP 20.)

The main difference in recent years is the growth of the Python community, which is a happy problem even if a complex one. :) There has been a huge influx of folks without context on Python's governance but with contrary expectations and loud voices. On the downside, growth has greatly increased communications traffic and lowered the signal-to-noise ratio, as well as somewhat shifted the tone in the wrong direction. Unfortunately all this contributed to us losing our BDFL. :( Thankfully we have the steering council as a replacement.

Regardless, Python is not run as a democracy nor by a representative body. Instead, it is run by a group of trusted volunteers who are trying their best to keep Python going and make it better. The sacrifices they make reflect how much they care about the language and the community, especially as dissenting voices increase in volume and vitriol. That negativity has a real impact.
-eric > ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GTOD2OHTIJU34DQS6XH756X4K2FLL2C2/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: PEP 622 railroaded through?
On Fri, Jul 3, 2020, 09:18 Antoine Pitrou wrote: > I think what you describe as "the usual procedure" isn't as usual as > you think. > +1 Also, keep in mind that PEPs are a tool for the decision maker (i.e. BDFL delegate). Effectively, everything else is convention. The process usually involves community feedback, but has never been community-driven. All this has become more painful for volunteers as the Python community has grown. -eric > ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QUFB5NI64ASIESOCWHNPUQZPR5BEMXQF/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?
On Wed, Jun 17, 2020 at 11:42 AM Emily Bowman wrote: > So most likely there wouldn't be any way to share something like a bytearray > or another > buffer interface-compatible type for some time. That's too bad, I was hoping > to have > shared arrays that I could put a memoryview on in each thread/interpreter and > deal with > locking if I need to, Earlier versions of PEP 554 did have a "SendChannel.send_buffer()" method for this but we tabled it in the interest of simplifying. That said, I expect we'll add something like that separately later. > but I suppose I can work through an extension once the changes stabilize. Yep. This should be totally doable in an extension and hopefully without much effort. > Packages like NumPy have had their own opaque C types and C-only routines to > handle all the big threading outside of Python as a workaround for a long > time now. As a workaround for what? This sounds interesting. :) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/D2APLOLR4UL7VXLNRFGFWOUN5MPIO2BV/ Code of Conduct: http://python.org/psf/codeofconduct/
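[Editorial aside: for context on the buffer-sharing use case, the closest stdlib tool today is `multiprocessing.shared_memory`, which shares a raw buffer between processes rather than interpreters; a channel-based `send_buffer()` would presumably feel similar. A minimal sketch:]

```python
from multiprocessing import shared_memory

# Create a 16-byte shared block; a second handle attaches by name,
# as another process (or, hypothetically, interpreter) would.
shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[0] = 255

other = shared_memory.SharedMemory(name=shm.name)
print(other.buf[0])  # 255

# Each side deals with its own cleanup (and any locking it needs).
other.close()
shm.close()
shm.unlink()
```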
[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)
On Fri, Jun 12, 2020 at 2:49 AM Mark Shannon wrote:
> The overhead largely comes from what you do with the process. The
> additional cost of starting a new interpreter is the same regardless
> of whether it is in the same process or not.

FWIW, there's more to it than that:

* there is some overhead to starting the runtime and main interpreter that does not apply to additional in-process interpreters
* I don't see why we shouldn't be able to come up with a strategy for interpreter startup that does not involve copying or sharing a lot of interpreter state, thus reducing startup time and memory consumption
* I'm guessing that re-importing builtin/extension modules in a subinterpreter is faster than importing them anew in a separate process

-eric
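The first bullet above is easy to see for yourself: spawning a whole new Python process pays for runtime and main-interpreter initialization every time. This is a rough illustration, not a benchmark, and the absolute number will vary wildly by machine:

```python
# Rough illustration (not a benchmark) of the per-process startup cost
# under discussion: a brand-new Python process pays for runtime + main
# interpreter init, which an additional in-process interpreter skips.
import subprocess
import sys
import time

start = time.perf_counter()
subprocess.run([sys.executable, "-S", "-c", "pass"], check=True)
per_process = time.perf_counter() - start

print(f"bare '{sys.executable} -c pass': {per_process * 1000:.1f} ms")
```

Even with -S (skipping site import), this typically lands in the tens of milliseconds, which is the floor a same-process interpreter would not have to pay in full.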
[Python-Dev] Re: Can we stop adding to the C API, please?
On Wed, Jun 3, 2020 at 7:12 AM Mark Shannon wrote:
> The size of the C API, as measured by `git grep PyAPI_FUNC | wc -l`, has
> been steadily increasing over the last few releases.
>
>   3.5  1237
>   3.6  1304
>   3.7  1408
>   3.8  1478
>   3.9  1518
>
> For reference, the 2.7 branch has "only" 973 functions.

It isn't as bad as that. Here I'm only looking at PyAPI_FUNC under Include/. From 3.5 to master the *public* C-API has increased by 71 functions (and the "private"/internal C-API by 189). "Private" here means functions whose names start with "_":

    VER   TOT    PUB + "_"
    2.7    932   (752 + 178)
    3.5   1181   (846 + 320)
    3.6   1247   (851 + 380)
    3.7   1350   (875 + 460 + 13 internal)
    3.8   1424   (908 + 422 + 79 internal)
    3.9   1447   (917 + 403 + 110 internal)
    m     1443   (917 + 401 + 108 internal)

(This does not count changes in the number of macros, which may have gone down ... or not.)

FWIW, relative to the "cpython" API split that happened in 3.8 (and "internal" in 3.7):

    VER  total  Include/*.h        Include/cpython/*.h  Include/internal/*.h
    2.7   932    932 (752 + 178)   -                    -
    3.5  1181   1181 (846 + 320)   -                    -
    3.6  1247   1247 (851 + 380)   -                    -
    3.7  1350   1350 (875 + 460)   -                    13 (0 + 13)
    3.8  1424   1050 (800 + 249)   295 (108 + 173)      79 (0 + 79)
    3.9  1447    944 (789 + 153)   393 (128 + 250)      110 (105 + 5)
    m    1443    941 (789 + 150)   394 (128 + 251)      108 (103 + 5)

Here's the "command" I ran:

    for pat in 'Include/' 'Include/*.h' 'Include/cpython/*.h' 'Include/internal/*.h'; do
        echo " -- $pat --"
        echo $(git grep 'PyAPI_FUNC(' -- $pat | wc -l) '('$(git grep 'PyAPI_FUNC(.*) [^_]' -- $pat | wc -l) '+' $(git grep 'PyAPI_FUNC(.*) [_]' -- $pat | wc -l)')'
    done

> Every one of these functions represents a maintenance burden.
> Removing them is painful and takes a lot of effort, but adding them is
> done casually, without a PEP or, in many cases, even a review.

I agree, with regards to the public C-API, particularly the stable API.

> We need to address what to do about the C API in the long term, but for
> now can we just stop making it larger?

Please.
> Also, can we remove all the new API functions added in 3.9 before the
> release and it is too late?

In 3.9 we have added 9 functions to the public C-API and removed 19 from the "private" C-API. The "internal" C-API grew by 31, but I don't see the point in changing any of those.

-eric
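For anyone who wants to reproduce these counts without a shell handy, here is a rough Python equivalent of the git-grep loop above. The regexes are my approximation of the same public/"private" split (a declared name starting with "_" counts as private), and the counts are approximate in the same ways the grep is (macros, wrapped declarations, etc.):

```python
# Rough Python equivalent of the shell loop above. The regex split is an
# assumption mirroring the grep patterns: a PyAPI_FUNC declaration whose
# name starts with "_" is counted as "private".
import re
from pathlib import Path

PRIVATE = re.compile(r'PyAPI_FUNC\(.*\)\s*_')
PUBLIC = re.compile(r'PyAPI_FUNC\(.*\)\s*[^_\s]')

def count_api_funcs(root):
    pub = priv = 0
    for path in Path(root).rglob('*.h'):
        for line in path.read_text(errors='ignore').splitlines():
            if PRIVATE.search(line):
                priv += 1
            elif PUBLIC.search(line):
                pub += 1
    return pub, priv

# Tiny self-check on a synthetic header instead of a real CPython checkout.
import tempfile
tmp = Path(tempfile.mkdtemp())
(tmp / 'demo.h').write_text(
    'PyAPI_FUNC(int) Py_Public(void);\n'
    'PyAPI_FUNC(PyObject *) _Py_Private(void);\n'
)
pub, priv = count_api_funcs(tmp)
```

Point it at Include/, Include/cpython/, and Include/internal/ in a CPython checkout to get the per-directory breakdown from the tables above.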
[Python-Dev] Re: Summary of Python tracker Issues
On Fri, May 29, 2020 at 12:16 PM Python tracker wrote:
> ACTIVITY SUMMARY (2020-05-22 - 2020-05-29)
> Python tracker at https://bugs.python.org/
>
> To view or respond to any of the issues listed below, click on the issue.
> Do NOT respond to this message.
>
> Issues counts and deltas:
>   open    7487 ( +9)
>   closed 45080 (+80)
>   total  52567 (+89)
>
> ...

How hard would it be to add PRs (in the same way) to this weekly report? Also, where is the script for this hosted, and where is its source repo (if any)? It might be helpful to have a link back to that info, perhaps somewhere in the devguide.

-eric
[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workaround
On Thu, May 7, 2020 at 2:50 AM Emily Bowman wrote:
> While large object copies are fairly fast -- I wouldn't say trivial, a
> gigabyte copy will introduce noticeable lag when processing enough of
> them -- the flip side of having large objects is that you want to avoid
> having so many copies that you run into memory pressure and the dreaded
> swapping. A multiprocessing engine that's fully parallel, where every
> fork takes chunks of data and does everything needed to them, won't
> gain much from zero-copy as long as memory limits aren't hit. But a
> pipeline of processing would involve many copies, especially if you
> have a central dispatch thread that passes things from stage to stage.
> This is a big deal where stages may run longer or slower at any time,
> especially in low-latency applications, like video conferencing, where
> dispatch needs the flexibility to skip steps or add extra workers to
> shove a frame out the door, and using signals to interact with separate
> processes to tell them to do so means more latency and overhead.
>
> Not that I'm recommending someone go out and make a pure Python
> videoconferencing unit right now, but it's a use case I'm familiar
> with. (Since I use Python to test new ideas before converting them
> into C++.)

Thanks for the insight, Emily (and everyone else). It's really helpful to get many different expert perspectives on the matter. I am definitely not an expert on big-data/high-performance use cases so, personally, I rely on folks like Nathaniel, Travis Oliphant, and yourself. The more, the better. :)

Again, thanks!

-eric
[Python-Dev] Re: Latest PEP 554 updates.
On Wed, May 6, 2020 at 2:25 PM Jeff Allen wrote:
> Many thanks for working on this so carefully for so long. I'm happy to
> see the per-interpreter GIL will now be studied fully before final
> commitment to subinterpreters in the stdlib. I would have chipped in in
> those terms to the review, but others successfully argued for
> "provisional" inclusion, and I was content with that.

No problem. :)

> My reason for worrying about this is that, while the C-API has been
> there for some time, it has not had heavy use in taxing cases AFAIK,
> and I think there is room for it to be incorrect. I am thinking more
> about Jython than CPython, but ideally they are the same structures.
> When I put the structures to taxing use cases on paper, they don't seem
> quite to work. Jython has been used in environments with thread-pools,
> concurrency, and multiple interpreters, and this aspect has had to be
> "fixed" several times.

That insight would be super helpful and much appreciated. :) Is that all in the docs you've linked?

> My use cases include sharing objects between interpreters, which I know
> the PEP doesn't. The C-API docs acknowledge that object sharing can't
> be prevented, but do their best to discourage it because of the hazards
> around allocation. Trouble is, I think it can happen unawares. The fact
> that Java takes on lifecycle management suggests it shouldn't be a
> fundamental problem in Jython. I know from other discussion it's where
> many would like to end up, even in CPython.

Yeah, for now we will strictly disallow sharing actual objects between interpreters in Python code. It would be an interesting project to try loosening that at some point (especially with immutable types), but we're going to start from the safer position. We have no plans to add any similar restrictions to the C-API, where you're typically much more free to shoot your own foot. :)

> This is all theory: I don't have even a model implementation, so I
> won't pontificate. However, I do have pictures, without which I find it
> impossible to think about this subject. I couldn't find your pictures,
> so I share mine here (WiP):
>
> https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#runtime-thread-and-interpreter-cpython
>
> I would be interested in how you solve the problem of finding the
> current interpreter, discussed in the article. My preferred answer is:
>
> https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#critical-structures-revisited
>
> That's the API change I think is needed. It might not have a visible
> effect on the PEP, but it's worth bearing in mind the risk of exposing
> a thing you might shortly find you want to change.

This is great stuff, Jeff! Thanks for sharing it. I was able to skim through but don't have time to dig in at the moment. I'll reply in detail as soon as I can.

In the meantime, the implementation of PEP 554 exposes a single part of PyInterpreterState: the ID (an int). The only other internal-ish info we expose is whether or not an interpreter (by ID) is currently running. The only functionality we provide is: create, destroy, and run_string().

-eric
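To make that minimal create/run_string/destroy surface concrete, here is a toy stand-in. `FakeInterpreter` is entirely invented for illustration: it isolates only a globals namespace, whereas a real subinterpreter isolates modules, sys state, and all other per-interpreter runtime state.

```python
# Toy stand-in for the minimal PEP 554 surface (create, run a code
# string, close). "FakeInterpreter" is invented for illustration only:
# a real subinterpreter isolates far more than a globals dict.
class FakeInterpreter:
    _next_id = 0

    def __init__(self):
        self.id = FakeInterpreter._next_id   # the one exposed field: an int ID
        FakeInterpreter._next_id += 1
        self._ns = {"__builtins__": __builtins__}
        self._closed = False

    def run(self, code):
        if self._closed:
            raise RuntimeError("interpreter is closed")
        exec(code, self._ns)                 # names defined stay in this namespace

    def close(self):
        self._closed = True


interp = FakeInterpreter()
interp.run("x = 40 + 2")
result = interp._ns["x"]
interp.close()
```

The point of the sketch is the shape of the API, not the mechanism: code runs by source string, state stays inside the interpreter, and a closed interpreter refuses further work.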
[Python-Dev] Re: Latest PEP 554 updates.
On Mon, May 4, 2020 at 11:30 AM Eric Snow wrote:
> Further feedback is welcome, though I feel like the PR is ready (or
> very close to ready) for pronouncement. Thanks again to all.

FYI, after consulting with the steering council, I've decided to change the target release to 3.10, when we expect to have the per-interpreter GIL landed. That will help maximize the impact of the module and avoid any confusion.

I'm undecided on releasing a 3.9-only module on PyPI. If I do, it will only be for folks to try it out early, and I probably won't advertise it much.

-eric
[Python-Dev] Re: Latest PEP 554 updates.
On Mon, May 4, 2020 at 1:22 PM Paul Moore wrote:
> One thing I would like to see is a comment confirming that, as part of
> the implementation, all stdlib modules will be made
> subinterpreter-safe.

Yeah, I'd meant to put in a note. I'll add one. Thanks!

-eric
[Python-Dev] Re: Latest PEP 554 updates.
On Mon, May 4, 2020 at 11:30 AM Eric Snow wrote:
> Further feedback is welcome, though I feel like the PR is ready (or
> very close to ready) for pronouncement. Thanks again to all.

Oops. s/the PR is ready/the PEP is ready/

-eric
[Python-Dev] Latest PEP 554 updates.
Hi all,

Thanks for the great feedback. I've updated PEP 554 (Multiple Interpreters in the Stdlib) following feedback.

https://www.python.org/dev/peps/pep-0554/

Here's a summary of the main changes:

* [API] dropped/deferred the "release" and "close" methods from RecvChannel and SendChannel (they were unnecessary and the "association" stuff was too confusing)
* [API] dropped RecvChannel/SendChannel.interpreters
* [API] dropped/deferred SendChannel.send_buffer()
* [API] renamed Interpreter.destroy() to Interpreter.close()
* [API] added a per-interpreter "isolated" mode (default: on)
* added a section about "Help for Extension Module Maintainers"
* added a section about documentation
* added many entries to the "deferred" and "rejected" sections

Further feedback is welcome, though I feel like the PR is ready (or very close to ready) for pronouncement. Thanks again to all.

-eric

------

PEP: 554
Title: Multiple Interpreters in the Stdlib
Author: Eric Snow
BDFL-Delegate: Antoine Pitrou
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2017-09-05
Python-Version: 3.9
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017, 09-May-2018, 20-Apr-2020, 01-May-2020

Abstract
========

CPython has supported multiple interpreters in the same process (AKA "subinterpreters") since version 1.5 (1997). The feature has been available via the C-API. [c-api]_ Subinterpreters operate in relative isolation from one another, which facilitates novel alternative approaches to concurrency.

This proposal introduces the stdlib ``interpreters`` module. The module will be provisional. It exposes the basic functionality of subinterpreters already provided by the C-API, along with new (basic) functionality for sharing data between interpreters.

A Disclaimer about the GIL
==========================

To avoid any confusion up front: This PEP is unrelated to any efforts to stop sharing the GIL between subinterpreters.
At most, this proposal will allow users to take advantage of any results of work on the GIL. The position here is that exposing subinterpreters to Python code is worth doing, even if they still share the GIL.

Proposal
========

The ``interpreters`` module will be added to the stdlib. To help authors of extension modules, a new page will be added to the "Extending Python" docs. More information on both is found in the immediately following sections.

The "interpreters" Module
-------------------------

The ``interpreters`` module will provide a high-level interface to subinterpreters and wrap a new low-level ``_interpreters`` module (in the same way as the ``threading`` module). See the `Examples`_ section for concrete usage and use cases.

Along with exposing the existing (in CPython) subinterpreter support, the module will also provide a mechanism for sharing data between interpreters. This mechanism centers around "channels", which are similar to queues and pipes.

Note that *objects* are not shared between interpreters since they are tied to the interpreter in which they were created. Instead, the objects' *data* is passed between interpreters. See the `Shared data`_ section for more details about sharing between interpreters.

At first only the following types will be supported for sharing:

* None
* bytes
* str
* int
* PEP 554 channels

Support for other basic types (e.g. bool, float, Ellipsis) will be added later.

API summary for interpreters module
-----------------------------------

Here is a summary of the API for the ``interpreters`` module. For a more in-depth explanation of the proposed classes and functions, see the `"interpreters" Module API`_ section below.

For creating and using interpreters:

+----------------------------------------------+----------------------------------------------+
| signature                                    | description                                  |
+==============================================+==============================================+
| ``list_all() -> [Interpreter]``              | Get all existing interpreters.              |
+----------------------------------------------+----------------------------------------------+
| ``get_current() -> Interpreter``             | Get the currently running interpreter.      |
+----------------------------------------------+----------------------------------------------+
| ``get_main() -> Interpreter``                | Get the main interpreter.                   |
+----------------------------------------------+----------------------------------------------+
| ``create(*, isolated=True) -> Interpreter``  | Initialize a new (idle) Python interpreter. |
+----------------------------------------------+----------------------------------------------+
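The "objects are not shared, their *data* is" rule above can be illustrated with plain pickling. Pickle is used here only as a stand-in transport (the PEP does not specify the channel mechanism); the point is that the receiving side always gets an equal-but-distinct object:

```python
# Illustrating "objects are not shared, their *data* is passed": whatever
# the channel transport actually is, the receiving interpreter ends up
# with an equal but distinct object. Pickle is only a stand-in here.
import pickle

payload = "spam" * 100           # str is one of the initially shareable types
wire = pickle.dumps(payload)     # data leaving the "sending" interpreter
received = pickle.loads(wire)    # data arriving in the "receiving" one

same_value = received == payload    # the data round-trips intact
same_object = received is payload   # but it is a fresh object, not a shared one
```

This is also why the initial shareable types are limited to simple immutable values: their identity does not matter, only their data.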
[Python-Dev] Re: PEP 554 comments
On Wed, Apr 29, 2020, 22:05 Greg Ewing wrote:
> > Furthermore, IMHO "release" is better at communicating the
> > per-interpreter nature than "close".
>
> Channels are a similar enough concept to pipes that I think
> it would be confusing to have "close" mean "close for all
> interpreters". Everyone understands that "closing" a pipe
> only means you're closing your reference to one end of it,
> and they will probably assume closing a channel means the
> same.

FWIW, I'd compare channels more closely to queues than to pipes.

-eric
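The queue analogy can be sketched with the stdlib's own queue.Queue: like a PEP 554 channel, a queue carries discrete objects and any number of consumers can take turns receiving, whereas a pipe is a single byte stream between two fixed ends. This uses threads only because queue.Queue is thread-based; it is an analogy, not the channel implementation:

```python
# Why "queue" is the closer analogy: a queue carries discrete objects
# and multiple consumers can each receive items, unlike a pipe's single
# byte stream between two fixed ends.
import queue
import threading

ch = queue.Queue()
received = []
lock = threading.Lock()

def consumer():
    item = ch.get()            # each receiver pops one discrete item
    with lock:
        received.append(item)

threads = [threading.Thread(target=consumer) for _ in range(2)]
for t in threads:
    t.start()
ch.put("eggs")
ch.put("spam")
for t in threads:
    t.join()
```

Each consumer gets exactly one whole object, with no framing or byte-stream parsing, which is the behavior channels aim for.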
[Python-Dev] Re: PEP 554 for 3.9 or 3.10?
Thanks for the thoughtful post! I'm going to address some of your comments here and some in a separate discussion in the next few days.

On Wed, Apr 29, 2020 at 10:36 AM Sebastian Berg wrote:
> While I still think it is probably not part of PEP 554 as such, I guess
> it needs a full-blown PEP on its own, saying that Python should
> implement subinterpreters. (I am saying "implement" because I believe
> you must consider subinterpreters basically a non-feature at this time.
> It has neither users nor reasonable ecosystem support.)

FWIW, at this point it would be hard to justify removing the existing public subinterpreters C-API. There are several large public projects using it and likely many more private ones we do not know about. That's not to say that alone justifies exposing the C-API, of course. :)

> In many ways I assume that a lot of the ground work for subinterpreters
> was useful on its own.

There has definitely been a lot of code-health effort related to the CPython runtime code, partly motivated by this project. :)

> But please do not underestimate how much effort
> it will take to make subinterpreters first-class citizens in the
> language!

If you are talking about the CPython side, most of the work is already done. The implementation of PEP 554 is nearly complete, and subinterpreter support in the runtime has only a few rough edges to buff out. The big question is the effort it will demand of the Python community, which is the point Nathaniel has been emphasizing (understandably).

> Believe me, I have been there and it's tough to write these documents
> and then get feedback which you are not immediately sure what to make
> of. Thus, I hope those supporting the idea of subinterpreters will help
> you out and formulate a better framework and clarify PEP 554 when it
> comes to the fuzzy long-term user-impact side of the PEP.

FYI, I started working on this project in 2015 and proposed PEP 554 in 2017. This is actually the 6th round of discussion since then. :)

-eric
[Python-Dev] Re: PEP 554 for 3.9 or 3.10?
Thanks for the great insights into PyObjC!

On Wed, Apr 29, 2020 at 9:02 AM Ronald Oussoren wrote:
> I don't know how much the move of global state to per-interpreter state
> affects extensions, other than references to singletons and static
> types.

That's the million-dollar question. :)

FYI, one additional challenge is when an extension module depends on a third-party C library which itself keeps global state that might leak between subinterpreters. The Cryptography project ran into this several years ago with OpenSSL, and they were understandably grumpy about it.

> But with some macro trickery that could be made source-compatible for
> extensions.

Yeah, that's one approach we've discussed in the past (e.g. at the last core sprint).

-eric
[Python-Dev] Re: PEP 554 for 3.9 or 3.10?
On Wed, Apr 29, 2020 at 6:27 AM Julien Salort wrote:
> If your proposal leads to an intelligible actual error, and a clear
> warning in the documentation, instead of a silent crash, this sounds
> like progress, even for those packages which won't work on
> subinterpreters anytime soon...

That's helpful. Thanks!

-eric
[Python-Dev] Re: PEP 554 for 3.9 or 3.10?
On Wed, Apr 29, 2020 at 1:52 AM Paul Moore wrote:
> One thing that isn't at all clear to me here is that when you say
> "Subinterpreters run all Python code", do you *just* mean the core
> language? Or the core language plus all builtins? Or the core
> language, builtins and the standard library? Because I think that the
> vast majority of users would expect a core/stdlib function like
> subinterpreters to support the full core+stdlib language.

Agreed.

> So my question would be, do all of the stdlib C extension modules
> support subinterpreters[1]? If they don't, then I think it's very
> reasonable to expect that to be fixed, in the spirit of "eating our
> own dogfood" - if we aren't willing or able to make the stdlib support
> subinterpreters, it's not exactly reasonable or fair to expect 3rd
> party extensions to do so.

That is definitely the right question. :) Honestly, I had not thought of it that way (nor checked, of course). While many stdlib modules have been updated to use heap types (see PEP 384) and to support PEP 489 (Multi-phase Extension Module Initialization), there are still a few stragglers. Furthermore, I expect that there are a few modules that would give us trouble (maybe ssl, cdecimal). It's all about global state that gets shared inadvertently between subinterpreters. Probably the best way to find out is to run the entire test suite in a subinterpreter. I'll do that as soon as I can.

> If, on the other hand, the stdlib *is* supported, then I think that
> "all of Python and the stdlib, plus all 3rd party pure Python
> packages" is a significant base of functionality, and an entirely
> reasonable starting point for the feature.

Yep, that's what I meant. I just need to identify the modules where we need fixes. Thanks for bringing this up!

> It certainly still excludes
> big parts of the Python ecosystem (notably scientific / data science
> users) but that seems fine to me - big extension users like those can
> be expected to have additional limitations. It's not really that
> different from the situation around C extension support in PyPy.

Agreed.

-eric