[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant
On Mon, Nov 28, 2022 at 6:45 PM Steven D'Aprano wrote:
> On Tue, Nov 29, 2022 at 01:34:54PM +1300, Greg Ewing wrote:
> > I got the impression that there were some internal language reasons
> > to want stable dicts, e.g. so that the class dict passed to __prepare__
> > preserves the order in which names are assigned in the class body. Are
> > there any such use cases for stable sets?
>
> Some people wanted order preserving kwargs, I think for web frameworks.
> There was even a discussion for a while about using OrderedDict for
> kwargs and leaving dicts unordered.

See https://peps.python.org/pep-0468/ (kwargs) and https://peps.python.org/pep-0520/ (class definition body).

I re-implemented OrderedDict in C for this purpose. Literally right after I had finished that, Inada-san showed up with his compact dict implementation. Many of us were at the first core sprint at the time and there was a lot of excitement about compact dict. It was merged right away (for 3.6) and there was quick agreement that we could depend on dict insertion ordering internally (for a variety of use cases, IIRC). Thus, suddenly both my PEPs were effectively implemented, so we marked them as approved and moved on.

FWIW, making the insertion ordering an official part of the language happened relatively soon afterward, though for 3.7, not 3.6. [1] I'm pretty sure there's a python-dev thread about that. The stdtypes docs were updated [2] soon after, and we finally got around to updating the language [3] a couple of years later.
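The ordering guarantees those two PEPs asked for are all observable today in any Python 3.7+; a quick sketch (the names `show_kwargs` and `Example` are just illustrative, not anything from the PEPs):

```python
# Since 3.7, dict insertion ordering is a language guarantee, which
# covers both the kwargs case from PEP 468 and the class-definition
# body case from PEP 520 without needing OrderedDict.

def show_kwargs(**kwargs):
    # kwargs preserves the order the arguments were passed in
    return list(kwargs)

assert show_kwargs(b=1, a=2, c=3) == ["b", "a", "c"]

class Example:
    # the class dict records names in assignment order
    z = 1
    a = 2
    m = 3

assert [n for n in vars(Example) if not n.startswith("__")] == ["z", "a", "m"]
```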
-eric

[1] https://docs.python.org/3/whatsnew/3.7.html#summary-release-highlights
[2] https://bugs.python.org/issue33609
[3] https://bugs.python.org/issue39879

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5QYN66BWHO4GHWD34DIY43NLBMAM4UPZ/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Switching to Discourse
On Thu, Jul 21, 2022 at 12:19 AM Stefan Behnel wrote:
> I'm actually reading python-dev, c.l.py etc. through Gmane, and have done
> that ever since I joined. Simply because it's a mailing list of which I
> don't need a local (content) copy, and wouldn't want one. Gmane seems to
> have a complete archive that's searchable, regardless of "when I subscribed".

+1

> It's really sad that Discourse lacks an NNTP interface. There's an
> unmaintained bridge to NNTP servers [1], but not an emulating interface
> that would serve the available discussions via NNTP messages, so that users
> can get them into their NNTP/Mail clients to read them in proper discussion
> threads. I think adding that next to the existing web interface would serve
> everyone's needs just perfectly.

Perhaps the possible mirroring-to-mailman that Steve (Turnbull) mentioned would be enough to provide continuity for NNTP?

-eric
[Python-Dev] Re: Switching to Discourse
On Mon, Jul 18, 2022 at 11:48 AM wrote:
> LLVM did the same recently (though they imported all previous messages from
> the mailing list, thus making them searchable in Discourse) [2 - announcement;
> 3 - retro], and by and large, I think it was a success.
>
> One of the comments in the retro was:
> > Searching the archives is much easier and has found me many old threads
> > that I probably would have had trouble finding before, since I haven't been
> > subscribed for that long.
>
> I think it would be worth considering importing the mailing list into a
> separate Discourse category that's then archived, but at least searchable.
> This would also lower the hurdle for new(er) contributors to investigate
> previous discussion on a given topic.

+1

-eric
[Python-Dev] Re: Switching to Discourse
On Fri, Jul 15, 2022 at 5:21 AM Petr Viktorin wrote:
> The Steering Council would like to switch from python-dev to
> discuss.python.org.

This seems like a net win for the community, so +1 from me. (For me personally it amounts to disruption with little advantage, so I'd probably be -0. However, I am not python-dev, and discuss.python.org is probably a better fit for most of the participants.)

(Message threading on discuss.python.org feels like a step backward in usability, though. This is especially true with long threads, support for which (I expect) Discourse has not prioritized.)

My only real concern is one I've brought up before, when we started splitting discussions onto DPO (discuss.python.org), as well as with the GitHub issues migration: message archives. I consider the ability to search message archives to be essential to effective contribution, both in attracting/integrating new contributors and in providing "offline" context for active contributors. The existing archives have aided me personally so many times in both ways.

There are three relevant aspects to archival and search that are worth asking about here:

1. search functionality on the [archive] web site
2. ability to search using other tools (e.g. my favorite: Google search with "site:...")
3. single archive vs. split archive

Regarding (1), currently it is relatively easy to search through message archives on https://mail.python.org/archives/list/. The DPO UI search functionality seems fine.

Regarding (2), currently it's easy to search using other tools and the results are clean (not noisy). With DPO, is that possible? (A quick attempt was a complete failure.) Would the results be good enough? Would they be noisier?

Regarding (3), it's a small thing but, IMHO, having a single archive is valuable. Most notably (for me, at least), with a split archive it becomes a little harder to make sure searches cover the full message history of a given channel.
It would be nice if at least one of the sites could preserve *all* the history. In the case of python-dev, either we'd forward all relevant DPO messages to python-dev@python.org (or otherwise directly send them to https://mail.python.org/archives/list/python-dev@python.org) or we'd import the archived mailing list into DPO. Or maybe it would require more work than it would be worth?

> - You can use discuss.python.org's “mailing list mode” (which subscribes
> you to all new posts), possibly with filtering and/or categorizing
> messages locally.

FWIW, I've been using mailing list mode (for consumption) since we started discuss.python.org and it's been fine. I've hit a couple of minor annoyances [1][2], but overall I don't have any real complaints. Mailing list mode is straightforward to configure, the messages have a "mailing list" header set (for easy filtering), and jumping over to the web UI to start a thread or respond (or react) is trivial.

-eric

[1] My mobile email notifications format the messages oddly.
[2] The messages are significantly noisier than regular (text) email.
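The local filtering described above needs nothing beyond the stdlib. Here's a minimal sketch; note that the header name (`List-Id`) and its value are assumptions for illustration -- check the raw headers of a real message from the Discourse instance:

```python
from email.parser import Parser

# A made-up message carrying the kind of list header Discourse's
# mailing-list mode sets (exact header name/value are assumptions).
raw = """\
From: someone@example.com
List-Id: Core Development <core-dev.discuss.python.org>
Subject: Re: Switching to Discourse

body text
"""

msg = Parser().parsestr(raw)

def folder_for(msg, default="inbox"):
    """Pick a local folder based on the List-Id header, if any."""
    list_id = msg.get("List-Id", "")
    if "discuss.python.org" in list_id:
        return list_id.split("<")[0].strip()  # e.g. the category name
    return default

assert folder_for(msg) == "Core Development"
```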
[Python-Dev] Re: Switching to Discourse
On Fri, Jul 15, 2022 at 12:15 PM Barry Warsaw wrote:
> I agree that the experiment has proven successful enough that there’s more
> value at this point in consolidating discussions.

We've only been running this experiment since 2017(?), so maybe it's too soon to say it's a success?

-eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)
responses inline

-eric

On Wed, Mar 9, 2022 at 8:23 AM Petr Viktorin wrote:
> "periodically reset the refcount for immortal objects (only enable this
> if a stable ABI extension is imported?)" -- that sounds quite expensive,
> both at runtime and maintenance-wise.

Are you talking just about "(only enable this if a stable ABI extension is imported?)"? Such a check could certainly be expensive, but it doesn't have to be.

However, I'm guessing that you are actually talking about the mechanism to periodically reset the refcount. The actual periodic reset doesn't seem like it needs to be all that expensive overall. It would just need to be in a place that gets triggered often enough, but not so often that the extra cost of resetting the refcount would be a problem.

One important factor is whether we need to worry about potential de-immortalization for all immortal objects or only for a specific subset, like the most commonly used objects (at least, most commonly used by the problematic older stable ABI extensions). Mostly, we only need to be concerned with the objects that are likely to trigger de-immortalization in those extensions. Realistically, there aren't many potential immortal objects that would be exposed to the de-immortalization problem (e.g. None, True, False), so we could limit this workaround to them.

A variety of options come to mind. In each case we would reset the refcount of a given object if it is immortal. (We would also only do so if the refcount actually changed--to avoid cache invalidation and copy-on-write.)

If we need to worry about *all* immortal objects then I see several options:

1. in a single place where stable ABI extensions are likely to pass all objects often enough
2. in a single place where all objects pass through often enough

On the other hand, if we only need to worry about a fixed set of objects, the following options come to mind:

1. in a single place that is likely to be called by older stable ABI extensions
2.
in a place that runs often enough, targeting a hard-coded group of immortal objects (common static globals like None)
   * perhaps in the eval breaker code, in exception handling, etc.
3. like (2), but rotate through subsets of the hard-coded group (to reduce the overall cost)
4. like (2), but spread out in type-specific code (e.g. static types could be reset in type_dealloc())

Again, none of those should be in code that runs so often that the overhead would add up.

> "provide a runtime flag for disabling immortality" also doesn't sound
> workable to me. We'd essentially need to run all tests twice every time
> to make sure it stays working.

Yeah, that makes it not worth it.

> "Special-casing immortal objects in tp_dealloc() for the relevant types
> (but not int, due to frequency?)" sounds promising.
>
> The "relevant types" are those for which we skip calling incref/decref
> entirely, like in Py_RETURN_NONE. This skipping is one of the optional
> optimizations, so we're entirely in control of if/when to apply it.

We would definitely do it for those types. NoneType and bool already have a tp_dealloc that calls Py_FatalError() if triggered. The tp_dealloc for str & tuple have special casing for some singletons that do likewise. In PyType_Type.tp_dealloc we have a similar assert for static types. In each case we would instead reset the refcount to the initial immortal value.

Regardless, in practice we may only need to worry (as noted above) about the problem for the most commonly used global objects, so perhaps we could stop there. However, it depends on the level of risk, such that it would warrant incurring additional potential performance/maintenance costs. What is the likelihood of actual crashes due to pathological de-immortalization in older stable ABI extensions? I don't have a clear answer to offer on that, but I'd only expect it to be a problem if such extensions are used heavily in (very) long-running processes.
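To make the reset idea concrete, here is a toy model in Python (illustration only -- the real mechanism would live in C inside the runtime, and the constant below is made up):

```python
# A toy model of the "periodically reset the refcount" workaround.
# Older stable-ABI extensions still do plain increfs/decrefs, so an
# immortal object's refcount can drift; resetting it from a
# hot-but-not-too-hot code path keeps it from ever reaching zero.

IMMORTAL_REFCNT = 2**30  # placeholder for whatever value marks immortality

class ToyObject:
    def __init__(self, immortal=False):
        self.immortal = immortal
        self.refcnt = IMMORTAL_REFCNT if immortal else 1

def maybe_reset(obj):
    """Called periodically (e.g. from the eval breaker in the real design)."""
    # Only write if the refcount actually drifted, to avoid needless
    # cache invalidation / copy-on-write.
    if obj.immortal and obj.refcnt != IMMORTAL_REFCNT:
        obj.refcnt = IMMORTAL_REFCNT

none_like = ToyObject(immortal=True)
# an old extension pairs its decrefs but skips the immortality check:
for _ in range(1000):
    none_like.refcnt -= 1
maybe_reset(none_like)
assert none_like.refcnt == IMMORTAL_REFCNT
```

The key detail is the guard against unnecessary writes, which preserves the cache and copy-on-write benefits that motivate immortality in the first place.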
> How much would it slow things back down if it wasn't done for ints at all?

I'll look into that. We're talking about the ~260 small ints, so it depends on how much they are used relative to all the other int objects that are used in a program.

> Some more reasoning for not worrying about de-immortalizing in types
> without this optimization:
> These objects will be de-immortalized with refcount around 2^29, and
> then incref/decref go back to being paired properly. If 2^29 is much
> higher than the true reference count at de-immortalization, this'll just
> cause a memory leak at shutdown.
> And it's probably OK to assume that the true reference count of an
> object can't be anywhere near 2^29: most of the time, to hold a
> reference you also need to have a pointer to the referenced object, and
> there ain't enough memory for that many pointers. This isn't a formally
> sound assumption, of course -- you can incref a million times with a
> single pointer if you pair the decrefs correctly. But it
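The back-of-the-envelope arithmetic behind that 2^29 argument is worth spelling out:

```python
# If every "true" reference needs at least a pointer somewhere in
# memory, then 2**29 live references imply 2**29 stored pointers:
refs = 2**29
bytes_per_pointer = 8  # 64-bit build
total = refs * bytes_per_pointer
assert total == 4 * 2**30  # 4 GiB just for the pointers themselves
```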
[Python-Dev] Re: PEP 684: A Per-Interpreter GIL
On Wed, Mar 9, 2022 at 7:37 PM Carl Meyer wrote:
> > Note that Instagram isn't exactly using Cinder.
>
> This sounds like a misunderstanding somewhere. Instagram server is
> "exactly using Cinder" :)

:) Thanks for clarifying, Carl.

> > I'll have to check if Cinder uses the pre-fork model.
>
> It doesn't really make sense to ask whether "Cinder uses the pre-fork
> model" -- Cinder is just a CPython variant; it can work with all the
> same execution models CPython can. Instagram server uses Cinder with a
> pre-fork execution model. Some other workloads use Cinder without
> pre-forking.

+1

-eric
[Python-Dev] Re: PEP 684: A Per-Interpreter GIL
Thanks for the feedback, Petr! Responses inline below.

-eric

On Wed, Mar 9, 2022 at 10:58 AM Petr Viktorin wrote:
> This PEP definitely makes per-interpreter GIL sound possible :)

Oh good. :)

> > PEP: 684
> > Title: A Per-Interpreter GIL
> > Author: Eric Snow
> > Discussions-To: python-dev@python.org
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
>
> This iteration of the PEP should also have `Requires: 683` (Immortal
> Objects).

+1

> > Most of the effort needed for a per-interpreter GIL has benefits that
> > make those tasks worth doing anyway:
> >
> > * makes multiple-interpreter behavior more reliable
> > * has led to fixes for long-standing runtime bugs that otherwise
> >   hadn't been prioritized
> > * has been exposing (and inspiring fixes for) previously unknown
> >   runtime bugs
> > * has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
> > * has driven cleaner and more complete runtime finalization
> > * led to structural layering of the C-API (e.g. ``Include/internal``)
> > * also see `Benefits to Consolidation`_ below
>
> Do you want to dig up some bpo examples, to make these more convincing
> to the casual reader?

Heh, the casual reader isn't really my target audience. :) I actually have a stockpile of links but left them all out until they were needed. Would the decision-makers benefit from the links? I'm trying to avoid adding to the already sizeable clutter in this PEP. :) I'll add some links in if you think it matters.

> > Furthermore, much of that work benefits other CPython-related projects:
> >
> > * performance improvements ("faster-cpython")
> > * pre-fork application deployment (e.g. Instagram)
>
> Maybe say “e.g. with Instagram's Cinder” – both the household name and
> the project you can link to?

+1

Note that Instagram isn't exactly using Cinder. I'll have to check if Cinder uses the pre-fork model.

> > * extension module isolation (see :pep:`630`, etc.)
> > * embedding CPython
>
> A lot of these points are duplicated in the "Benefits to Consolidation" list
> below; maybe there'd be, ehm, benefits to consolidating them?

There shouldn't be any direct overlap. FWIW, the whole "Extra Context" section is essentially a separate PEP that I inlined (with the caveat that it really isn't worth its own PEP). I'm still considering yanking it, so the above list should stand on its own.

> > PEP 554
> > -------
>
> Please spell out "PEP 554 (Multiple Interpreters in the Stdlib)", for
> people who don't remember the magic numbers but want to skim the table
> of contents.

+1

> This list doesn't render correctly in ReST; you need blank lines everywhere.
> There are more cases like this below.

Hmm, I had blank lines and the PEP editor told me I needed to remove them.

[...]

> > Per-Interpreter State
> > ---------------------
> >
> > The following runtime state will be moved to ``PyInterpreterState``:
> >
> > * all global objects that are not safely shareable (fully immutable)
> > * the GIL
> > * mutable, currently protected by the GIL
>
> Spelling out “mutable state” in these lists would make this clearer,
> since “state” isn't elided from all the points.

+1

> > * mutable, currently protected by some other per-interpreter lock
> > * mutable, may be used independently in different interpreters
>
> This includes extension modules (with multi-phase init), right?

Yep.

> > The following state will not be moved:
> >
> > * global objects that are safely shareable, if any
> > * immutable, often ``const``
> > * treated as immutable
>
> Do you have an example for this?

Strings (PyUnicodeObject) actually cache some info, making them not strictly immutable, but they are close enough to be treated as such. I'll add a note to the PEP.

> > * related to CPython's ``main()`` execution
> > * related to the REPL
>
> Would “only used by” work instead of “related to”?

Sure.
> > * set during runtime init, then treated as immutable
>
> `main()`, REPL and runtime init look like special cases of functionality
> that only runs in one interpreter. If it's so, maybe generalize this?

+1

> > * ``_PyInterpreterConfig``
> > * ``_Py_NewInterpreter()`` (as ``Py_NewInterpreterEx()``)
>
> Since the API is not documented (and _PyInterpreterConfig is not even in
> main yet!), it would be good to sketch out the docs (intended behavior)
> here.

+1

> > The following fields will be added to ``PyInterpreterConfig``:
> >
> > * ``own_gil`` - (bool) create a new interpreter lock
> >   (instead of sharing with the main interpre
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)
On Mon, Feb 28, 2022 at 6:01 PM Eric Snow wrote:
> The updated PEP text is included below. The largest changes involve
> either the focus of the PEP (internal mechanism to mark objects
> immortal) or the possible ways that things can break on older 32-bit
> stable ABI extensions. All other changes are smaller.

In particular, I'm hoping to get your thoughts on the "Accidental De-Immortalizing" section. While I'm confident we will find a good solution, I'm not yet confident about the specific solution. So feedback would be appreciated.

Thanks!

-eric
[Python-Dev] PEP 684: A Per-Interpreter GIL
I'd really appreciate feedback on this new PEP about making the GIL per-interpreter. The PEP targets 3.11, but we'll see if that is too close. I don't mind waiting one more release, though I'd prefer 3.11 (obviously). Regardless, I have no intention of rushing this through at the expense of cutting corners. Hence, we'll see how it goes.

The PEP text is included inline below.

Thanks!

-eric

===

PEP: 684
Title: A Per-Interpreter GIL
Author: Eric Snow
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Mar-2022
Python-Version: 3.11
Post-History: 08-Mar-2022
Resolution:

Abstract
========

Since Python 1.5 (1997), CPython users can run multiple interpreters in the same process. However, interpreters in the same process have always shared a significant amount of global state. This is a source of bugs, with a growing impact as more and more people use the feature. Furthermore, sufficient isolation would facilitate true multi-core parallelism, where interpreters no longer share the GIL. The changes outlined in this proposal will result in that level of interpreter isolation.

High-Level Summary
==================

At a high level, this proposal changes CPython in the following ways:

* stops sharing the GIL between interpreters, given sufficient isolation
* adds several new interpreter config options for isolation settings
* adds some public C-API for fine-grained control when creating interpreters
* keeps incompatible extensions from causing problems

The GIL
-------

The GIL protects concurrent access to most of CPython's runtime state. So all that GIL-protected global state must move to each interpreter before the GIL can. (In a handful of cases, other mechanisms can be used to ensure thread-safe sharing instead, such as locks or "immortal" objects.)

CPython Runtime State
---------------------

Properly isolating interpreters requires that most of CPython's runtime state be stored in the ``PyInterpreterState`` struct.
Currently, only a portion of it is; the rest is found either in global variables or in ``_PyRuntimeState``. Most of that will have to be moved.

This directly coincides with an ongoing effort (of many years) to greatly reduce internal use of C global variables and consolidate the runtime state into ``_PyRuntimeState`` and ``PyInterpreterState``. (See `Consolidating Runtime Global State`_ below.)

That project has `significant merit on its own `_ and has faced little controversy. So, while a per-interpreter GIL relies on the completion of that effort, that project should not be considered a part of this proposal--only a dependency.

Other Isolation Considerations
------------------------------

CPython's interpreters must be strictly isolated from each other, with few exceptions. To a large extent they already are. Each interpreter has its own copy of all modules, classes, functions, and variables. The CPython C-API docs `explain further `_.

.. _caveats: https://docs.python.org/3/c-api/init.html#bugs-and-caveats

However, aside from what has already been mentioned (e.g. the GIL), there are a couple of ways in which interpreters still share some state.

First of all, some process-global resources (e.g. memory, file descriptors, environment variables) are shared. There are no plans to change this.

Second, some isolation is faulty due to bugs or implementations that did not take multiple interpreters into account. This includes CPython's runtime and the stdlib, as well as extension modules that rely on global variables. Bugs should be opened in these cases, as some already have been.

Depending on Immortal Objects
-----------------------------

:pep:`683` introduces immortal objects as a CPython-internal feature. With immortal objects, we can share any otherwise immutable global objects between all interpreters. Consequently, this PEP does not need to address how to deal with the various objects `exposed in the public C-API `_. It also simplifies the question of what to do about the builtin static types.
(See `Global Objects`_ below.)

Both issues have alternate solutions, but everything is simpler with immortal objects. If PEP 683 is not accepted then this one will be updated with the alternatives. This lets us reduce noise in this proposal.

Motivation
==========

The fundamental problem we're solving here is a lack of true multi-core parallelism (for Python code) in the CPython runtime. The GIL is the cause. While it usually isn't a problem in practice, at the very least it makes Python's multi-core story murky, which makes the GIL a consistent distraction.

Isolated interpreters are also an effective mechanism to support certain concurrency models. :pep:`554` discusses this in more detail.

Indirect Benefits
-----------------

Most of the effort needed for a per-interpreter GIL has benefits that make those tasks worth doing anyway:

* make
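The multi-core problem described in this Motivation section is easy to demonstrate: with a single shared GIL, CPU-bound Python threads gain little or nothing over running the same work sequentially. A small sketch (timings are machine-dependent, so none are hard-coded):

```python
import threading
import time

def spin(n=1_000_000):
    # pure-Python CPU-bound work; holds the GIL the whole time
    while n:
        n -= 1

start = time.perf_counter()
spin(); spin()
sequential = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=spin) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# With a shared GIL, the two-thread version is about as slow as the
# sequential one (or slower, due to lock contention).
print(f"sequential: {sequential:.3f}s  two threads: {threaded:.3f}s")
```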
[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)
I've updated PEP 683 for the feedback I've gotten. Thanks again for that!

The updated PEP text is included below. The largest changes involve either the focus of the PEP (internal mechanism to mark objects immortal) or the possible ways that things can break on older 32-bit stable ABI extensions. All other changes are smaller.

Given the last round of discussion, I'm hoping this will be the last round before we go to the steering council.

-eric

PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow , Eddie Elizondo 
Discussions-To: https://mail.python.org/archives/list/python-dev@python.org/thread/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History: 15-Feb-2022, 19-Feb-2022, 28-Feb-2022
Resolution:

Abstract
========

Currently the CPython runtime maintains a `small amount of mutable state `_ in the allocated memory of each object. Because of this, otherwise immutable objects are actually mutable. This can have a large negative impact on CPU and memory performance, especially for approaches to increasing Python's scalability.

This proposal mandates that, internally, CPython will support marking an object as one for which that runtime state will no longer change. Consequently, such an object's refcount will never reach 0, and so the object will never be cleaned up. We call these objects "immortal". (Normally, only a relatively small number of internal objects will ever be immortal.) The fundamental improvement here is that now an object can be truly immutable.

Scope
-----

Object immortality is meant to be an internal-only feature. So this proposal does not include any changes to public API or behavior (with one exception). As usual, we may still add some private (yet publicly accessible) API to do things like immortalize an object or tell if one is immortal. Any effort to expose this feature to users would need to be proposed separately.
There is one exception to "no change in behavior": refcounting semantics for immortal objects will differ in some cases from user expectations. This exception, and the solution, are discussed below.

Most of this PEP focuses on an internal implementation that satisfies the above mandate. However, those implementation details are not meant to be strictly proscriptive. Instead, at the least they are included to help illustrate the technical considerations required by the mandate. The actual implementation may deviate somewhat as long as it satisfies the constraints outlined below. Furthermore, the acceptability of any specific implementation detail described below does not depend on the status of this PEP, unless explicitly specified.

For example, the particular details of:

* how to mark something as immortal
* how to recognize something as immortal
* which subset of functionally immortal objects are marked as immortal
* which memory-management activities are skipped or modified for immortal objects

are not only CPython-specific but are also private implementation details that are expected to change in subsequent versions.

Implementation Summary
----------------------

Here's a high-level look at the implementation:

If an object's refcount matches a very specific value (defined below) then that object is treated as immortal. The CPython C-API and runtime will not modify the refcount (or other runtime state) of an immortal object.

Aside from the change to refcounting semantics, there is one other possible negative impact to consider. A naive implementation of the approach described below makes CPython roughly 4% slower. However, the implementation is performance-neutral once known mitigations are applied.

Motivation
==========

As noted above, currently all objects are effectively mutable. That includes "immutable" objects like ``str`` instances. This is because every object's refcount is frequently modified as the object is used during execution.
This is especially significant for a number of commonly used global (builtin) objects, e.g. ``None``. Such objects are used a lot, both in Python code and internally. That adds up to a consistently high volume of refcount changes.

The effective mutability of all Python objects has a concrete impact on parts of the Python community, e.g. projects that aim for scalability like Instagram or the effort to make the GIL per-interpreter. Below we describe several ways in which refcount modification has a real negative effect on such projects. None of that would happen for objects that are truly immutable.

Reducing CPU Cache Invalidation
-------------------------------

Every modification of a refcount causes the corresponding CPU cache line to be invalidated. This has a number of effects.

For one, the write must be propagated to other cache levels and to main memory. This has a small effect on all Python programs. Immortal objects
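That refcount churn is visible from pure Python. On builds without immortal objects, merely creating references to ``None`` mutates it (a sketch; on an interpreter that already implements PEP 683, the count won't move):

```python
import sys

before = sys.getrefcount(None)
xs = [None] * 100          # each slot holds a new reference to None
after = sys.getrefcount(None)

if after == before:
    print("refcount unchanged -- this build has immortal objects")
else:
    print(f"None's refcount grew by {after - before}")
```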
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Wed, Feb 23, 2022 at 4:21 PM Antonio Cuni wrote:
> When refcheck=True (the default), numpy raises an error if you try to resize
> an array in place whose refcnt > 2 (although I don't understand why > 2 and
> not > 1, and the docs aren't very clear about this).
>
> That said, relying on the exact value of the refcnt is very bad for
> alternative implementations and for HPy, and in particular it is impossible
> to implement ndarray.resize(refcheck=True) correctly on PyPy. So from this
> point of view, a wording which explicitly restricts the "legal" usage of the
> refcnt details would be very welcome.

Thanks for the feedback and example. It helps.

-eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Wed, Feb 23, 2022 at 9:16 AM Petr Viktorin wrote:
> > > But tp_dict is also public C-API. How will that be handled?
> > > Perhaps naively, I thought static types' dicts could be treated as
> > > (deeply) immutable, and shared?
> >
> > They are immutable from Python code but not from C (due to tp_dict).
> > Basically, we will document that tp_dict should not be used directly
> > (in the public API) and refer users to a public getter function. I'll
> > note this in the PEP.
>
> What worries me is that existing users of the API haven't read the new
> documentation. What will happen if users do use it?
> Or worse, add things to it?

We will probably set it to NULL, so the user code would fail or crash. I suppose we could set it to a dummy object that emits helpful errors. However, I don't think that is worth it. We're talking about users directly accessing tp_dict of the builtin static types, not their own. That is already something they should definitely not be doing.

> (Hm, the current docs are already rather confusing -- 3.2 added a note
> that "It is not safe to ... modify tp_dict with the dictionary C-API.",
> but above that it says "extra attributes for the type may be added to
> this dictionary [in some cases]")

Yeah, the docs will have to be clarified.

> > Having thought about it some more, I don't think this PEP should be
> > strictly bound to per-interpreter GIL. That is certainly my personal
> > motivation. However, we have a small set of users that would benefit
> > significantly, the change is relatively small and simple, and the risk
> > of breaking users is also small.
>
> Right, with the recent performance improvements it's looking like it
> might stand on its own after all.

Great!

> > Honestly, it might not have needed a PEP in the first place if I
> > had been a bit more clear about the idea earlier.
>
> Maybe it's good to have a PEP to clear that up :)

Yeah, the PEP process has been helpful for that.
:) -eric
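The getter-based plan discussed above — keep each static type's mutable state on the interpreter state rather than on the type object — can be modeled in a few lines of Python. This is only an illustrative sketch with hypothetical names; the real change would live in C on ``PyInterpreterState``:

```python
class InterpreterState:
    """Toy stand-in for PyInterpreterState: each interpreter owns the
    mutable state (e.g. the __dict__ contents) of the builtin static types."""
    def __init__(self):
        self._static_type_dicts = {}  # type name -> that type's dict

def get_static_type_dict(interp, type_name):
    # Replaces direct tp_dict access: the dict lives on the current
    # interpreter, so the static type object itself never changes.
    return interp._static_type_dicts.setdefault(type_name, {})

interp_a, interp_b = InterpreterState(), InterpreterState()
get_static_type_dict(interp_a, "int")["x"] = 1
assert "x" not in get_static_type_dict(interp_b, "int")  # per-interpreter isolation
```

This is why direct tp_dict users would break: the state they expect on the type object would no longer be there.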
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
Responses inline below. -eric On Tue, Feb 22, 2022 at 7:22 PM Inada Naoki wrote: > > For a recent example, see > > https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/. > > It is not a proven example, but just a hope at the moment. So an option is > fine to prove the idea. > > Although I can not read the code, they said "patching ASLR by patching > `ob_type` fields;". > It will cause CoW for most objects, won't it? > > So reducing memory writes doesn't directly mean reducing CoW. > Unless we can stop writing to a page completely, the page will be copied. Yeah, they would have to address that. > > CPU cache invalidation exists regardless. With the current GIL the > > effect is reduced significantly. > > It's an interesting point. We can not see the benefit from > pyperformance, because it doesn't use much data and it runs one process > at a time. > So pyperformance can not put enough stress on the last-level > cache, which is shared by many cores. > > We need a multiprocess performance benchmark, apart from pyperformance, > to stress the last-level cache from multiple cores. > It would help not only this PEP, but also optimizing containers like dict and set. +1 > Can the proposed optimizations to eliminate the penalty guarantee that > __del__ and weakrefs are not broken, > and that no memory leaks occur when the Python interpreter is initialized > and finalized multiple times? > I haven't confirmed it yet. They will not break __del__ or weakrefs. No memory will leak after finalization. If any of that happens then it is a bug. > FWIW, I filed an issue to remove the hash cache from bytes objects. > https://github.com/faster-cpython/ideas/issues/290 > > Code objects hold many bytes objects (e.g. co_code, co_linetable, etc...) > Removing it would save some RAM and make immortal bytes truly > immutable, safe to be shared between interpreters. +1 Thanks!
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Tue, Feb 22, 2022, 20:26 Larry Hastings wrote: > Are these optimizations specifically for the PR, or are these > optimizations we could apply without taking the immortal objects? Kind of > like how Sam tried to offset the nogil slowdown by adding optimizations > that we went ahead and added anyway ;-) Basically all the optimizations require immortal objects. -eric
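To make the dependency concrete, here is a toy Python model (illustrative only, not CPython's implementation) of the kind of incref fast path being discussed: the refcount write can only be skipped when a fixed bit marks the object immortal, so the optimization has nothing to offer without immortal objects.

```python
IMMORTAL_BIT = 1 << 60  # model of the proposed marker bit (64-bit build)

def incref(refcnt: int) -> int:
    """Toy immortal-aware Py_INCREF: an immortal refcount is returned
    unchanged, so no memory write (and thus no cache-line invalidation
    or copy-on-write) happens for immortal objects."""
    if refcnt & IMMORTAL_BIT:
        return refcnt
    return refcnt + 1

immortal = IMMORTAL_BIT + (IMMORTAL_BIT >> 1)
assert incref(immortal) == immortal  # no write for immortals
assert incref(7) == 8                # ordinary objects are still counted
```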
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Sat, Feb 19, 2022 at 12:46 AM Eric Snow wrote: > Performance > --- > > A naive implementation shows `a 4% slowdown`_. > Several promising mitigation strategies will be pursued in the effort > to bring it closer to performance-neutral. See the `mitigation`_ > section below. FYI, Eddie has been able to get us back to performance-neutral after applying several of the mitigation strategies we discussed. :) -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Mon, Feb 21, 2022 at 4:56 PM Terry Reedy wrote: > We could say that the only refcounts with any meaning are 0, 1, and > 1. Yeah, that should work. -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
On Mon, Feb 21, 2022 at 10:56 AM wrote: > For what it's worth Cython does this for string concatenation to concatenate > in place if possible (this optimization was copied from CPython). It could be > disabled relatively easily if it became a problem (it's already CPython only > and version checked so it'd just need another upper-bound version check). That's good to know. -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
Thanks for the responses. I've replied inline below. -eric On Mon, Feb 21, 2022 at 9:11 AM Petr Viktorin wrote: > > On 19. 02. 22 8:46, Eric Snow wrote: > > Thanks to all those that provided feedback. I've worked to > > substantially update the PEP in response. The text is included below. > > Further feedback is appreciated. > > Thank you! This version is much clearer. I like the PEP more and more! Great! > I've sent a PR with some typo fixes: > https://github.com/python/peps/pull/2348 Thank you. > > Public Refcount Details > [...] > > As part of this proposal, we must make sure that users can clearly > > understand on which parts of the refcount behavior they can rely and > > which are considered implementation details. Specifically, they should > > use the existing public refcount-related API and the only refcount value > > with any meaning is 0. All other values are considered "not 0". > > Should we care about hacks/optimizations that rely on having the only > reference (or all references), e.g. mutating a tuple if it has refcount > 1? Immortal objects shouldn't break them (the special case simply won't > apply), but this wording would make them illegal. > AFAIK CPython uses this internally, but I don't know how > prevalent/useful it is in third-party code. Good point. As Terry suggested, we could also let 1 have meaning. Regardless, any documented restriction would only apply to users of the public C-API, not to internal code. > > _Py_IMMORTAL_REFCNT > > --- > > > > We will add two internal constants:: > > > > #define _Py_IMMORTAL_BIT (1LL << (8 * sizeof(Py_ssize_t) - 4)) > > #define _Py_IMMORTAL_REFCNT (_Py_IMMORTAL_BIT + (_Py_IMMORTAL_BIT / 2)) > > As a nitpick: could you say this in prose? > > * ``_Py_IMMORTAL_BIT`` has the third top-most bit set. > * ``_Py_IMMORTAL_REFCNT`` has the third and fourth top-most bits set. Sure. 
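The quoted C definitions can be sanity-checked with a small Python model (assuming a 64-bit ``Py_ssize_t``; this is a model for illustration, not CPython code):

```python
# Model of the proposed constants, assuming sizeof(Py_ssize_t) == 8.
SIZEOF_PY_SSIZE_T = 8  # bytes, on a 64-bit build

IMMORTAL_BIT = 1 << (8 * SIZEOF_PY_SSIZE_T - 4)      # 1 << 60
IMMORTAL_REFCNT = IMMORTAL_BIT + IMMORTAL_BIT // 2   # bits 60 and 59 set

assert IMMORTAL_BIT == 1 << 60
assert IMMORTAL_REFCNT == (1 << 60) + (1 << 59)
assert bin(IMMORTAL_REFCNT).count("1") == 2  # exactly two bits set
```

Note that ``_Py_IMMORTAL_REFCNT`` is exactly 1.5 times ``_Py_IMMORTAL_BIT``, which is why checking that single bit is enough to detect immortality.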
> > Immortal Global Objects > > --- > > > > All objects that we expect to be shared globally (between interpreters) > > will be made immortal. That includes the following: > > > > * singletons (``None``, ``True``, ``False``, ``Ellipsis``, > > ``NotImplemented``) > > * all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``) > > * all static objects in ``_PyRuntimeState.global_objects`` (e.g. > > identifiers, > > small ints) > > > > All such objects will be immutable. In the case of the static types, > > they will be effectively immutable. ``PyTypeObject`` has some mutable > > state (``tp_dict`` and ``tp_subclasses``), but we can work around this > > by storing that state on ``PyInterpreterState`` instead of on the > > respective static type object. Then the ``__dict__``, etc. getter > > will do a lookup on the current interpreter, if appropriate, instead > > of using ``tp_dict``. > > But tp_dict is also public C-API. How will that be handled? > Perhaps naively, I thought static types' dicts could be treated as > (deeply) immutable, and shared? They are immutable from Python code but not from C (due to tp_dict). Basically, we will document that tp_dict should not be used directly (in the public API) and refer users to a public getter function. I'll note this in the PEP. > Perhaps it would be best to leave it out here and say "The details > of sharing ``PyTypeObject`` across interpreters are left to another PEP"? > Even so, I'd love to know the plan. What else would you like to know? There isn't much to it. For each of the builtin static types we will keep the relevant mutable state on PyInterpreterState and look it up there in the relevant getters (e.g. __dict__ and __subclasses__). > (And even if these are internals, > changes to them should be mentioned in What's New, for the sake of > people who need to maintain old extensions.) 
+1 > > Object Cleanup > > -- > > > > In order to clean up all immortal objects during runtime finalization, > > we must keep track of them. > > > > For GC objects ("containers") we'll leverage the GC's permanent > > generation by pushing all immortalized containers there. During > > runtime shutdown, the strategy will be to first let the runtime try > > to do its best effort of deallocating these instances normally. Most > > of the module deallocation will now be handled by > > ``pylifecycle.c:finalize_modules()`` which cleans up the remaining > > modules as best as we can. It will change which modules are available > > during __del__ but that's already defined as undefined behavior by the > > docs. Optionally, we
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
Thanks for the feedback. I've responded inline below. -eric On Sat, Feb 19, 2022 at 8:50 PM Inada Naoki wrote: > I hope per-interpreter GIL succeeds at some point, and I know this is > needed for per-interpreter GIL. > > But I am worrying that per-interpreter GIL may be too complex to > implement and maintain for core developers and extension writers. > As you know, immortal doesn't mean sharable between interpreters. It is > too difficult to know which object can be shared, and where the > shareable objects leak to other interpreters. > So I am not sure that per-interpreter GIL is an achievable goal. I plan on addressing this in the PEP I am working on for per-interpreter GIL. In the meantime, I doubt the issue will impact any core devs. > So I think it's too early to introduce the immortal objects in Python > 3.11, unless it *improves* performance without per-interpreter GIL. > Instead, we can add a configuration option such as > `--enable-experimental-immortal`. I agree that immortal objects aren't quite as appealing in general without per-interpreter GIL. However, there are actual users that will benefit from it, assuming we can reduce the performance penalty to acceptable levels. For a recent example, see https://mail.python.org/archives/list/python-dev@python.org/message/B77BQQFDSTPY4KA4HMHYXJEV3MOU7W3X/. > On Sat, Feb 19, 2022 at 4:52 PM Eric Snow wrote: > > > > Reducing CPU Cache Invalidation > > --- > > > > Avoiding Data Races > > --- > > > > Both benefits require a per-interpreter GIL. CPU cache invalidation exists regardless. With the current GIL the effect is reduced significantly. Per-interpreter GIL is only one situation where data races matter. Any attempt to generally eliminate the GIL must deal with races on the per-object runtime state. > > > > Avoiding Copy-on-Write > > -- > > > > For some applications it makes sense to get the application into > > a desired initial state and then fork the process for each worker. 
> > This can result in a large performance improvement, especially > > in memory usage. Several enterprise Python users (e.g. Instagram, > > YouTube) have taken advantage of this. However, the above > > refcount semantics drastically reduce the benefits and > > have led to some sub-optimal workarounds. > > > > As I wrote before, fork is very difficult to use safely. We can not > recommend it to most users. > And I don't think reducing the size of a patch at Instagram or YouTube > is a good rationale for this kind of change. What do you mean by "this kind of change"? The proposed change is relatively small. It certainly isn't nearly as intrusive as many changes we make to internals without a PEP. If you are talking about the performance penalty, we should be able to eliminate it. > > Also note that "fork" isn't the only operating system mechanism > > that uses copy-on-write semantics. Anything that uses ``mmap`` > > relies on copy-on-write, including sharing data from shared object > > files between processes. > > > > It is very difficult to reduce CoW with mmap(MAP_PRIVATE). > > You may need to write the hash of bytes and unicode. You may need to > write `tp_type`. > Immortal objects can "reduce" the memory writes. But "at least one > memory write" is enough to trigger the CoW. Correct. However, without immortal objects (AKA immutable per-object runtime-state) it goes from "very difficult" to "basically impossible". > > Accidental Immortality > > -- > > > > While it isn't impossible, this accidental scenario is so unlikely > > that we need not worry. Even if done deliberately by using > > ``Py_INCREF()`` in a tight loop and each iteration only took 1 CPU > > cycle, it would take 2^61 cycles (on a 64-bit processor). At a fast > > 5 GHz that would still take nearly 500,000,000 seconds (over 5,000 days)! > > If that CPU were 32-bit then it is (technically) more possible though > > still highly unlikely. 
> > > > Technically, `[obj] * (2**(32-4))` is a 1GB array on 32-bit. The question is if this matters. If really necessary, the PEP can demonstrate that it doesn't matter in practice. (Also, the magic value on 32-bit would be 2**29.) > > > > Constraints > > --- > > > > * ensure that otherwise immutable objects can be truly immutable > > * be careful when immortalizing objects that are not otherwise immutable > > I am not sure about what this means. > For example, unicode objects are not immutable because they have a hash, > a utf8 cache and a wchar_t cache. (wchar
[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)
Thanks to all those that provided feedback. I've worked to substantially update the PEP in response. The text is included below. Further feedback is appreciated. -eric PEP: 683 Title: Immortal Objects, Using a Fixed Refcount Author: Eric Snow , Eddie Elizondo Discussions-To: https://mail.python.org/archives/list/python-dev@python.org/thread/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/ Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 10-Feb-2022 Python-Version: 3.11 Post-History: 15-Feb-2022 Resolution: Abstract Currently the CPython runtime maintains a `small amount of mutable state `_ in the allocated memory of each object. Because of this, otherwise immutable objects are actually mutable. This can have a large negative impact on CPU and memory performance, especially for approaches to increasing Python's scalability. The solution proposed here provides a way to mark an object as one for which that per-object runtime state should not change. Specifically, if an object's refcount matches a very specific value (defined below) then that object is treated as "immortal". If an object is immortal then its refcount will never be modified by ``Py_INCREF()``, etc. Consequently, the refcount will never reach 0, so that object will never be cleaned up (unless explicitly done, e.g. during runtime finalization). Additionally, all other per-object runtime state for an immortal object will be considered immutable. This approach has some possible negative impact, which is explained below, along with mitigations. A critical requirement for this change is that the performance regression be no more than 2-3%. Anything worse than performance-neutral requires that the other benefits be proportionally large. Aside from specific applications, the fundamental improvement here is that now an object can be truly immutable. (This proposal is meant to be CPython-specific and to affect only internal implementation details. 
There are some slight exceptions to that which are explained below. See `Backward Compatibility`_, `Public Refcount Details`_, and `scope`_.) Motivation == As noted above, currently all objects are effectively mutable. That includes "immutable" objects like ``str`` instances. This is because every object's refcount is frequently modified as the object is used during execution. This is especially significant for a number of commonly used global (builtin) objects, e.g. ``None``. Such objects are used a lot, both in Python code and internally. That adds up to a consistently high volume of refcount changes. The effective mutability of all Python objects has a concrete impact on parts of the Python community, e.g. projects that aim for scalability like Instagram or the effort to make the GIL per-interpreter. Below we describe several ways in which refcount modification has a real negative effect on such projects. None of that would happen for objects that are truly immutable. Reducing CPU Cache Invalidation --- Every modification of a refcount causes the corresponding CPU cache line to be invalidated. This has a number of effects. For one, the write must be propagated to other cache levels and to main memory. This has a small effect on all Python programs. Immortal objects would provide a slight relief in that regard. On top of that, multi-core applications pay a price. If two threads (running simultaneously on distinct cores) are interacting with the same object (e.g. ``None``) then they will end up invalidating each other's caches with each incref and decref. This is true even for otherwise immutable objects like ``True``, ``0``, and ``str`` instances. CPython's GIL helps reduce this effect, since only one thread runs at a time, but it doesn't completely eliminate the penalty. Avoiding Data Races --- Speaking of multi-core, we are considering making the GIL a per-interpreter lock, which would enable true multi-core parallelism. 
Among other things, the GIL currently protects against races between multiple concurrent threads that may incref or decref the same object. Without a shared GIL, two running interpreters could not safely share any objects, even otherwise immutable ones like ``None``. This means that, to have a per-interpreter GIL, each interpreter must have its own copy of *every* object. That includes the singletons and static types. We have a viable strategy for that but it will require a meaningful amount of extra effort and extra complexity. The alternative is to ensure that all shared objects are truly immutable. There would be no races because there would be no modification. This is something that the immortality proposed here would enable for otherwise immutable objects. With immortal objects, support for a per-interpreter GIL becomes much simpler. Avoiding Copy-on-Write -- For some applications it makes sense to
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 11:06 AM Larry Hastings wrote: > I experimented with this at the EuroPython sprints in Berlin years ago. I > was sitting next to MvL, who had an interesting observation about it. Classic MvL! :) > He suggested(*) all the constants unmarshalled as part of loading a module > should be "immortal", and if we could rejigger how we allocated them to store > them in their own memory pages, that would dovetail nicely with COW > semantics, cutting down on the memory use of preforked server processes. Cool idea. I may mention it in the PEP as a possibility. Thanks! -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 8:45 PM Inada Naoki wrote: > Is there any common tool that utilizes CoW by mmap? > If you know of one, please add its link to the PEP. > If there is no common tool, most Python users can not get benefit from this. Sorry, I'm not aware of any, but I also haven't researched the topic much. Regardless, that would be a good line of inquiry. A reference like that would probably help make the PEP a bit more justifiable without per-interpreter GIL. :) > Generally speaking, fork is a legacy API. It is too difficult to know > which library is fork-safe, even for stdlibs. And Windows users can > not use fork. > Optimizing for the non-fork use case is much better than optimizing for > fork use cases. +1 > I hope per-interpreter GIL replaces fork use cases. Yeah, that's definitely one big benefit. > But tools using CoW without fork are also welcome, especially if they > support Windows. +1 > Anyway, I don't believe stopping refcounting will fix the CoW issue > yet. See this article [1] again. > > [1] > https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172 That's definitely an important point, given that the main objective of the proposal is to allow disabling mutation of runtime-internal object state so that some objects can be made truly immutable. I'm sure Eddie has some good insight on the matter (and may have even been involved in writing that article). Eddie? > Note that they failed to fix CoW by stopping refcounting code objects! (*) > Most CoW was caused by cyclic GC and finalization. That's a good observation! > (*) It is not surprising to me because the eval loop doesn't incref/decref > most code attributes. They borrow references from the code object. +1 > So we need a sample application and profile it, before saying it fixes CoW. > Could you provide some data, or drop the CoW issue from this PEP until > it is proven? We'll look into that. 
-eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
Again, thanks for the reply. It's helpful. My further responses are inline below. -eric On Thu, Feb 17, 2022 at 3:42 AM Petr Viktorin wrote: > > Agreed. However, what behavior do users expect and what guarantees do > > we make? Do we indicate how to interpret the refcount value they > > receive? What are the use cases under which a user would set an > > object's refcount to a specific value? Are users setting the refcount > > of objects they did not create? > > That's what I hoped the PEP would tell me. Instead of simply claiming > that there won't be issues, it should explain why we won't have any issues. > [snip] > IMO, the reasoning should start from the assumption that things will > break, and explain why they won't (or why the breakage is acceptable). > If the PEP simply tells me upfront that things will be OK, I have a hard > time trusting it. > > IOW, it's clear you've thought about this a lot (especially after > reading your replies here), but it's not clear from the PEP. > That might be editorial nitpicking, if it wasn't for the fact that I > want to find any gaps in your research and reasoning, and invite everyone > else to look for them as well. Good point. It's easy to dump a bunch of unnecessary info into a PEP, and it was hard for me to know where the line was in this case. There hadn't been much discussion previously about the possible ways this change might break users. So thanks for bringing this up. I'll be sure to put a more detailed explanation in the PEP, with a bit more evidence too. > Ah, I see. I was confused by this: No worries! I'm glad we cleared it up. I'll make sure the PEP is more understandable about this. > > This is also true even with the GIL, though the impact is smaller. > > Smaller than what? The baseline for that comparison is a hypothetical > GIL-less interpreter, which is only introduced in the next section. > Perhaps say something like "Python's GIL helps avoid this effect, but > doesn't eliminate it." Good point. 
I'll clarify the point. > >> Weren't you planning a PEP on subinterpreter GIL as well? Do you want to > >> submit them together? > > > > I'd have to think about that. The other PEP I'm writing for > > per-interpreter GIL doesn't require immortal objects. They just > > simplify a number of things. That's my motivation for writing this > > PEP, in fact. :) > > Please think about it. > If you removed the benefits for per-interpreter GIL, the motivation > section would be reduced to memory savings for fork/CoW. (And lots of > performance improvements that are great in theory but sum up to a 4% loss.) Sounds good. Would this involve more than a note at the top of the PEP? And just to be clear, I don't think the fate of a per-interpreter GIL PEP should depend on this one. > > It wouldn't match _Py_IMMORTAL_REFCNT, but the high bit of > > _Py_IMMORTAL_REFCNT would still match. That bit is what we would > > actually be checking, rather than the full value. > > It makes sense once you know _Py_IMMORTAL_REFCNT has two bits set. Maybe > it'd be good to note that detail -- it's an internal detail, but crucial > for making things safe. Will do. > >> What about extensions compiled with Python 3.11 (with this PEP) that use > >> an older version of the stable ABI, and thus should be compatible with > >> 3.2+? Will they use the old versions of the macros? How will that be > >> tested? > > > > It wouldn't matter unless an object's refcount reached > > _Py_IMMORTAL_REFCNT, at which point incref/decref would start > > noop'ing. What is the likelihood (in real code) that an object's > > refcount would grow that far? Even then, would such an object ever be > > expected to go back to 0 (and be dealloc'ed)? Otherwise the point is > > moot. > > Those are exactly the questions I'd hope the PEP to answer. I could > estimate that likelihood myself, but I'd really rather just check your > work ;) > > (Hm, maybe I couldn't even estimate this myself. 
The PEP doesn't say > what the value of _Py_IMMORTAL_REFCNT is, and in the ref implementation > a comment says "This can be safely changed to a smaller value".) Got it. I'll be sure that the PEP is more clear about that. Thanks for letting me know.
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 10:43 PM Jim J. Jewett wrote: > I suggest being a little more explicit (even blatant) that the particular > details of: > [snip] > are not only Cpython-specific, but are also private implementation details > that are expected to change in subsequent versions. Excellent point. > Ideally, things like the interned string dictionary or the constants from a > pyc file will be not merely immortal, but stored in an immortal-only memory > page, so that they won't be flushed or CoW-ed when a nearby non-immortal > object is modified. That's definitely worth looking into. > Getting those details right will make a difference to performance, and you > don't want to be locked in to the first draft. Yep, that is one big reason I was trying to avoid spelling out every detail of our plan. :) -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 2:41 PM Terry Reedy wrote: > > * the naive implementation shows a 4% slowdown > > Without understanding all the benefits, this seems a bit too much for > me. 2% would be much better. Yeah, we consider 4% to be too much. 2% would be great. Performance-neutral would be even better, of course. :) > > * we have a number of strategies that should reduce that penalty > > I would like to see that before approving the PEP. I expect it would be enough to show where things stand with benchmark results. It did not seem like the actual mitigation strategies were as important, so I opted to leave them out to avoid clutter. Plus it isn't clear yet what approaches will help the most, nor how much we can win back. So I didn't want to distract with hypotheticals. If it's important I can add that in. > > * without immortal objects, the implementation for per-interpreter GIL > > will require a number of non-trivial workarounds > > To me, that says to speed up immortality first. Agreed. > > That last one is particularly meaningful to me since it means we would > > definitely miss the 3.11 feature freeze. > > 3 1/2 months from now. > > > With immortal objects, 3.11 would still be in reach. > > Is it worth trying to rush it a bit? I'd rather not rush this. I'm saying that, for per-interpreter GIL, 3.11 is within reach without rushing if we have immortal objects. Without them, 3.11 isn't realistic without rushing things. -eric
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 12:14 PM Kevin Modzelewski wrote: > fwiw Pyston has immortal objects, though with a slightly different goal and > thus design [1]. I'm not necessarily advocating for our design (it makes most > sense if there is a JIT involved), but just writing to report our experience > of making a change like this and the compatibility effects. Thanks! > Importantly, our system allows for the reference count of immortal objects to > change, as long as it doesn't go below half of the original very-high value. > So extension code with no concept of immortality will still update the > reference counts of immortal objects, but this is fine. Because of this we > haven't seen any issues with extension modules. As Guido noted, we are taking a similar approach for the sake of older extensions built with the limited API. As a precaution, we start the refcount for immortal objects basically at _Py_IMMORTAL_BIT * 1.5 (i.e. _Py_IMMORTAL_REFCNT). Then we only need to check the high bit (_Py_IMMORTAL_BIT) of the refcount to see if an object is immortal. > The small amount of compatibility challenges we've run into have been in > testing code that checks for memory leaks. For example this code breaks on > Pyston: > [snip] > This might work with this PEP, but we've also seen code that asserts that the > refcount increases by a specific value, which I believe wouldn't. Right, this is less of an issue for us since normally we do not change the refcount of immortal objects. Also, CPython's test suite keeps us honest about leaking references and memory blocks. :) > For Pyston we've simply disabled these tests, figuring that our users still > have CPython to test on. Personally I consider this breakage to be small, but > I hadn't seen anyone mention the potential usage of sys.getrefcount() so I > thought I'd bring it up. Thanks again for that. > [1] Our goal is to entirely remove refcounting operations when we can prove > we are operating on an immortal object. 
We can prove it in a couple cases: > sometimes simply, such as in Py_RETURN_NONE, but mostly our JIT will often > know the immortality of objects it embeds into the code. So if we can prove > statically that an object is immortal then we elide the incref/decrefs, and > if we can't then we use an unmodified Py_INCREF/Py_DECREF. This means that > our reference counts on immortal objects will change, so we detect > immortality by checking if the reference count is at least half of the > original very-high value. FWIW, we anticipate that we can take a similar approach in CPython's eval loop, specializing for immortal objects. We are also updating Py_RETURN_NONE, etc. to stop incref'ing. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CDBGYUDROQZNEM6LAREIEKSZSQ72BLOH/ Code of Conduct: http://python.org/psf/codeofconduct/
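The leak-checking pattern Kevin describes can be illustrated at the Python level with sys.getrefcount(). This is a sketch of the kind of exact-delta assertion that holds today for ordinary objects but would stop holding for immortal objects (whose refcount would no longer change):

```python
import sys

# A leak-check pattern like the one described above: assert that an
# operation changes an object's refcount by an exact amount.
obj = object()
before = sys.getrefcount(obj)   # includes the temporary argument reference

holder = [obj, obj]             # two new references
assert sys.getrefcount(obj) == before + 2

del holder                      # references released
assert sys.getrefcount(obj) == before

# For an immortal object (e.g. ``None`` under this PEP), incref and
# decref would leave the count untouched, so the same exact-delta
# assertion would fail.
```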
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
Thanks for the feedback. My responses are inline below. -eric On Wed, Feb 16, 2022 at 6:36 AM Petr Viktorin wrote: > Thank you very much for writing this down! It's very helpful to see a > concrete proposal, and the current state of this idea. > I like the change, That's good to hear. :) > but I think it's unfortunately more complicated than > the PEP suggests. That would be unsurprising. :) > > This proposal is CPython-specific and, effectively, describes > > internal implementation details. > > I think that is a naïve statement. Refcounting is > implementation-specific, but it's hardly an *internal* detail. Sorry for any confusion. I didn't mean to say that refcounting is an internal detail. Rather, I was talking about how the proposed change in refcounting behavior doesn't affect any guaranteed/documented behavior, hence "internal". Perhaps I missed some documented behavior? I was going off the following: * https://docs.python.org/3.11/c-api/intro.html#objects-types-and-reference-counts * https://docs.python.org/3.11/c-api/structures.html#c.Py_REFCNT > There is > code that targets CPython specifically, and relies on the details. Could you elaborate? Do you mean such code relies on specific refcount values? > The refcount has public getters and setters, Agreed. However, what behavior do users expect and what guarantees do we make? Do we indicate how to interpret the refcount value they receive? What are the use cases under which a user would set an object's refcount to a specific value? Are users setting the refcount of objects they did not create? > and you need a pretty good > grasp of the concept to write a C extension. I would not expect this to be affected by this PEP, except in cases where users are checking/modifying refcounts for objects they did not create (since none of their objects will be immortal). > I think that it's safe to assume that this will break people's code, Do you have some use case in mind, or an example? 
From my perspective I'm having a hard time seeing what this proposed change would break. That said, Kevin Modzelewski indicated [1] that there were affected cases for Pyston (though their change in behavior is slightly different). [1] https://mail.python.org/archives/list/python-dev@python.org/message/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/ > and > this PEP should convince us that the breakage is worth it rather than > dismiss the issue. Sorry, I didn't mean to be dismissive. I agree that if there is breakage this PEP must address it. > It would be good to note that “container” refers to the GC term, as in > https://devguide.python.org/garbage_collector/#identifying-reference-cycles > > and not e.g. > https://docs.python.org/3/library/collections.abc.html#collections.abc.Container +1 > > This has a concrete impact on active projects in the Python community. > > Below we describe several ways in which refcount modification has > > a real negative effect on those projects. None of that would > > happen for objects that are truly immutable. > > > > Reducing Cache Invalidation > > --- > > Explicitly saying “CPU cache” would make the PEP easier to skim. +1 > > Every modification of a refcount causes the corresponding cache > > line to be invalidated. This has a number of effects. > > > > For one, the write must be propagated to other cache levels > > and to main memory. This has a small effect on all Python programs. > > Immortal objects would provide a slight relief in that regard. > > > > On top of that, multi-core applications pay a price. If two threads > > are interacting with the same object (e.g. ``None``) then they will > > end up invalidating each other's caches with each incref and decref. > > This is true even for otherwise immutable objects like ``True``, > > ``0``, and ``str`` instances. This is also true even with > > the GIL, though the impact is smaller. > > This looks out of context. Python has a per-process GIL. It should go > after the next section.
This isn't about a data race. I'm talking about how if an object is active in two different threads (on distinct cores) then incref/decref in one thread will invalidate the cache (line) in the other thread. The only impact of the GIL in this case is that the two threads aren't running simultaneously and the cache invalidation on the idle thread has less impact. Perhaps I've missed something? > > The proposed solution is obvious enough that two people came to the > > same conclusion (and implementation, more or less) independently. > > Who was it? Assuming it's not a secret :) Me and Eddie. :) I don't mind saying so. > > In the case of per-interpreter GIL, the only realistic alternative > > is to move all global objects into ``PyInterpreterState`` and add > > one or more lookup functions to access them. Then we'd have to > > add some hacks to the C-API to preserve compatibility for the > > many objects exposed there. The story is much, much
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
On Wed, Feb 16, 2022 at 12:37 AM Inada Naoki wrote: > +1 for overall idea. Great! > > Also note that "fork" isn't the only operating system mechanism > > that uses copy-on-write semantics. > Could you elaborate? mmap, maybe? > [snip] > So if you know how to get benefit from CoW without fork, I want to know it. Sorry if I got your hopes up. Yeah, I was talking about mmap. > > There will likely be others we have not enumerated here. > How about interned strings? Marking every interned string as immortal may make sense. > Should the intern dict belong to the runtime, or to the (sub)interpreter? > > If the interned dict belongs to the runtime, all interned strings should > be immortal to be shared between subinterpreters. Excellent questions. Making immutable objects immortal is relatively simple. For the most part, mutable objects should not be shared between interpreters without protection (e.g. the GIL). The interned dict isn't exposed to Python code or the C-API, so there's less risk, but it still wouldn't work without cleverness. So it should be per-interpreter. It would be nice if it were global though. :) > If the interned dict belongs to the interpreter, should we register > immortalized strings in all interpreters? That's a good point. It may be worth doing something like that. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VQYLSPHHP2EE2KPDWCXDLMBAXYAE72D3/
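The interned dict discussed here lives at the C level and isn't exposed to Python code, but sys.intern() shows the effect interning has — equal strings funnel through one shared object:

```python
import sys

# sys.intern() routes equal strings to a single canonical object.
a = sys.intern("immortal-interned-example")
b = sys.intern("-".join(["immortal", "interned", "example"]))

assert a is b                         # same object, not merely equal
assert a == "immortal-interned-example"
```

Under the idea discussed above, each such canonical string would be a natural candidate for immortality.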
[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount"
Eddie and I would appreciate your feedback on this proposal to support treating some objects as "immortal". The fundamental characteristic of the approach is that we would provide stronger guarantees about immutability for some objects. A few things to note:

* this is essentially an internal-only change: there are no user-facing changes (aside from affecting any 3rd party code that directly relies on specific refcounts)
* the naive implementation shows a 4% slowdown
* we have a number of strategies that should reduce that penalty
* without immortal objects, the implementation for per-interpreter GIL will require a number of non-trivial workarounds

That last one is particularly meaningful to me since it means we would definitely miss the 3.11 feature freeze. With immortal objects, 3.11 would still be in reach.

-eric

---

PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow, Eddie Elizondo
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History:
Resolution:

Abstract
========

Under this proposal, any object may be marked as immortal. "Immortal" means the object will never be cleaned up (at least until runtime finalization). Specifically, the `refcount`_ for an immortal object is set to a sentinel value, and that refcount is never changed by ``Py_INCREF()``, ``Py_DECREF()``, or ``Py_SET_REFCNT()``. For immortal containers, the ``PyGC_Head`` is never changed by the garbage collector.

Avoiding changes to the refcount is an essential part of this proposal. For what we call "immutable" objects, it makes them truly immutable. As described further below, this allows us to avoid performance penalties in scenarios that would otherwise be prohibitive.

This proposal is CPython-specific and, effectively, describes internal implementation details.

..
_refcount: https://docs.python.org/3.11/c-api/intro.html#reference-counts

Motivation
==========

Without immortal objects, all objects are effectively mutable. That includes "immutable" objects like ``None`` and ``str`` instances. This is because every object's refcount is frequently modified as it is used during execution. In addition, for containers the runtime may modify the object's ``PyGC_Head``. This runtime-internal state currently prevents full immutability.

This has a concrete impact on active projects in the Python community. Below we describe several ways in which refcount modification has a real negative effect on those projects. None of that would happen for objects that are truly immutable.

Reducing Cache Invalidation
---------------------------

Every modification of a refcount causes the corresponding cache line to be invalidated. This has a number of effects.

For one, the write must be propagated to other cache levels and to main memory. This has a small effect on all Python programs. Immortal objects would provide a slight relief in that regard.

On top of that, multi-core applications pay a price. If two threads are interacting with the same object (e.g. ``None``) then they will end up invalidating each other's caches with each incref and decref. This is true even for otherwise immutable objects like ``True``, ``0``, and ``str`` instances. This is also true even with the GIL, though the impact is smaller.

Avoiding Data Races
-------------------

Speaking of multi-core, we are considering making the GIL a per-interpreter lock, which would enable true multi-core parallelism. Among other things, the GIL currently protects against races between multiple threads that concurrently incref or decref. Without a shared GIL, two running interpreters could not safely share any objects, even otherwise immutable ones like ``None``.

This means that, to have a per-interpreter GIL, each interpreter must have its own copy of *every* object, including the singletons and static types.
We have a viable strategy for that but it will require a meaningful amount of extra effort and extra complexity. The alternative is to ensure that all shared objects are truly immutable. There would be no races because there would be no modification. This is something that the immortality proposed here would enable for otherwise immutable objects. With immortal objects, support for a per-interpreter GIL becomes much simpler.

Avoiding Copy-on-Write
----------------------

For some applications it makes sense to get the application into a desired initial state and then fork the process for each worker. This can result in a large performance improvement, especially for memory usage. Several enterprise Python users (e.g. Instagram, YouTube) have taken advantage of this. However, the above refcount semantics drastically reduce the benefits and have led to some sub-optimal workarounds.

Also note that "fork" isn't the only operating system mechanism that uses copy-on-write semantics.
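As a rough model of the mechanism the PEP describes (not the actual C implementation — the sentinel value and the names below are invented for illustration), incref and decref simply leave an immortal refcount untouched:

```python
# Hypothetical sketch of the PEP's core mechanism, in plain Python.
# _Py_IMMORTAL_REFCNT is a made-up sentinel here; the real sentinel,
# Py_INCREF, and Py_DECREF live in C.
_Py_IMMORTAL_REFCNT = 1 << 62

def incref(ob):
    # Immortal objects are left untouched, so their memory never dirties.
    if ob["refcnt"] != _Py_IMMORTAL_REFCNT:
        ob["refcnt"] += 1

def decref(ob):
    if ob["refcnt"] != _Py_IMMORTAL_REFCNT:
        ob["refcnt"] -= 1

mortal = {"refcnt": 1}
immortal = {"refcnt": _Py_IMMORTAL_REFCNT}

incref(mortal)
incref(immortal)
assert mortal["refcnt"] == 2                        # normal behavior
assert immortal["refcnt"] == _Py_IMMORTAL_REFCNT    # unchanged
```

Because the immortal refcount never changes, the object's memory stays clean across threads and across forked processes, which is exactly what the cache-invalidation and copy-on-write sections above rely on.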
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 10:15 AM Eric Snow wrote: > Yes, I plan on benchmarking the change as soon as we can run > pyperformance on main. I just ran the benchmarks and the PR makes CPython 4% slower. See https://github.com/python/cpython/pull/19474#issuecomment-1032944709. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3TT6Q5TQLUMLL5TWTKHRTXQ3XATHIUBW/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Fri, Feb 4, 2022 at 8:25 PM Eric Snow wrote: > On Fri, Feb 4, 2022, 16:03 Guido van Rossum wrote: >> I wonder if a better solution than that PR wouldn't be to somehow change the >> implementation of _Py_IDENTIFIER() to do that, > > Yeah, I had the same realization today. I'm going to try it out. I updated _Py_IDENTIFIER() to use a statically initialized string object and it isn't too bad. The tricky thing is that PyASCIIObject expects the data to be an array after the object. So the field must be a pre-sized array (like I did in gh-30928). That makes things messier. The alternative is to do what Steve is suggesting. I ran the benchmarks and making _Py_IDENTIFIER() a statically initialized object makes things 2% slower (instead of 1% faster). There are a few things I could do to speed that up a little, but at best we'd get back to performance-neutral. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DDWOJLFOTXTZ35LMBCPH2DHFMCSVLHH5/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Fri, Feb 4, 2022, 16:03 Guido van Rossum wrote: > I wonder if a better solution than that PR wouldn't be to somehow change > the implementation of _Py_IDENTIFIER() to do that, Yeah, I had the same realization today. I'm going to try it out. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/A7Q4TBBOCEAXZYOY6GSY3NA2FSVNUMHL/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 3:49 PM Eric Snow wrote: > I suppose I'd like to know what the value of _Py_IDENTIFIER() is for > 3rd party modules. Between Guido, Victor, Stefan, and Sebastian, I'm getting the sense that a public replacement for _Py_IDENTIFIER() would be worth pursuing. Considering that it would probably help numpy move toward subinterpreter support, I may work on this after all. :) (For core CPython we'll still benefit from the statically initialized strings, AKA gh-30928.) -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/D2LHEZZUQH66Q5ZIOEJTGSCEMQEMKCUQ/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Fri, Feb 4, 2022 at 8:21 AM Stefan Behnel wrote: > Correct. We (intentionally) have our own way to intern strings and do not > depend on CPython's identifier framework. You're talking about __Pyx_StringTabEntry (and __Pyx_InitString())? -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Q5AL3SLW5BCUA6FLDBUNZTH5Z7ZYAHER/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 4:01 PM Guido van Rossum wrote: > Why not read through some of that code and see what they are doing with it? Yep, I'm planning on it. > I imagine one advantage is that _Py_IDENTIFIER() can be used entirely local > to a function. Yeah, they'd have to put something like this in their module init:

    state->partial_str = PyUnicode_InternFromString("partial");
    if (state->partial_str == NULL) {
        return NULL;
    }

> E.g. (from _operator.c):
>
>     _Py_IDENTIFIER(partial);
>     functools = PyImport_ImportModule("functools");
>     if (!functools)
>         return NULL;
>     partial = _PyObject_GetAttrId(functools, &PyId_partial);
>
> That's convenient since it means they don't have to pass module state around.

I might call that cheating. :) For an extension module this means they are storing a little bit of their state in the runtime/interpreter state instead of in their module state. Is there precedent for that with any of our other API? Regardless, the status quo certainly is simpler (if they aren't already using module state in the function). Without _Py_IDENTIFIER() it would look like:

    functools = PyImport_ImportModule("functools");
    if (!functools)
        return NULL;
    my_struct *state = (my_struct*)PyModule_GetState(module);
    if (state == NULL) {
        Py_DECREF(functools);
        return NULL;
    }
    partial = PyObject_GetAttr(functools, state->partial_str);

If they are already using the module state in their function then the code would be simpler:

    functools = PyImport_ImportModule("functools");
    if (!functools)
        return NULL;
    partial = PyObject_GetAttr(functools, state->partial_str);

-eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4625QOLXLZAAU2XNXEQM5W2JWX3FH4VM/
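For comparison, here is a rough Python-level analogue of the module-state pattern in the C snippets above: intern the attribute name once at "init" time, then reuse it for lookups (the names here are illustrative, not part of any real API):

```python
import sys
import functools

# Stand-in for state->partial_str: interned once, reused for every lookup.
PARTIAL_STR = sys.intern("partial")

# Stand-in for PyObject_GetAttr(functools, state->partial_str):
partial = getattr(functools, PARTIAL_STR)
assert partial is functools.partial
```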
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 7:26 AM Victor Stinner wrote: > In bpo-39465, I made the _PyUnicode_FromId() compatible with running > sub-interpreters in parallel (one GIL per interpreter). > > A "static" PyUnicodeObject would have to share the reference count > between sub-interpreters, whereas Py_INCREF/Py_DECREF are not > thread-safe: there is no lock to prevent data races. Yeah, if we end up not being able to safely share the global string objects between interpreters then we would move them under PyInterpreterState. Currently I'm putting them under _PyRuntimeState. Doing that might reduce the performance benefits a little, since Py_GET_GLOBAL_STRING() would have to look up the interpreter to use (or we'd have to pass it in). That doesn't seem like much of a penalty though and doesn't impact the other benefits of the change. > Is there a way to push the "immortal objects" strategy discussed in > bpo-40255? I'm planning on circling back to that next week. > The deepfreeze already pushed some functions related to > that, like _PyObject_IMMORTAL_INIT() in the internal C API. > Moreover... deepfreeze already produces "immortal" PyUnicodeObject > strings using the "ob_refcnt = 9" hack. Note we only set the value really high as a safety precaution since these objects are all statically allocated. Eddie Elizondo's proposal involves a number of other key points, including keeping the refcount from changing. > IMO we should decide on a strategy. Either we move towards immortal > objects (modify Py_INCREF/Py_DECREF to not modify the ref count if an > object is immortal), or we make sure that no Python object is shared between > two Python interpreters. +1 The catch is that things get messier when we make some objects per-interpreter while others stay runtime-global.
I'm going to write a bit more about this next week, but the best strategy will probably be to first consolidate all the global objects under _PyRuntimeState and then move them to PyInterpreterState all at once when we can do it safely. > > I'd also like to actually get rid of _Py_IDENTIFIER(), along with > > other related API including ~14 (private) C-API functions. Dropping > > all that helps reduce maintenance costs. > Is it required by your work on static strings, or is it more about > removing the API which would no longer be consumed by Python itself? It is definitely not required for that. Rather, we won't need it any more so we should benefit from getting rid of it. The only blocker is that some 3rd party modules are using it. > If it's not required, would it make sense to follow the PEP 387 > deprecation (mark functions as deprecated, document the deprecation, > and wait 2 releases to remove it)? If you think it's worth it. It's a private API. I'd rather work to get 3rd party modules off it and then move on sooner. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WR33NVCSIHOMN5X7YGCL2DHNCBQGKWAU/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 7:17 AM Victor Stinner wrote: > In the top 5000 PyPI projects, I found 11 projects using them: > [snip] > They use these 17 functions: Thanks! That is super helpful. > If the _Py_IDENTIFIER() API is removed, it would be *nice* to provide > a migration path (tool?) to help these projects move away from the > _Py_IDENTIFIER() API. Or at least do the work to update these 11 > projects. If something like _Py_IDENTIFIER() provides genuine value then we should consider a proper public API. Otherwise I agree that we should work with those projects to stop using it. I guess either way they should stop using the "private" API. :) -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IGXAXUSDBKYOOVVFSAUYLE5R5TXVZT4A/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Thu, Feb 3, 2022 at 6:46 AM Ronald Oussoren wrote: > Although my gut feeling is that adding the CI check you mention is good > enough and adding the tooling for generating code isn’t worth the additional > complexity. Yeah, I came to the same conclusion. :) -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FAMDFSTN5KR2Z7LOVTK5GGF6YKR6G65Z/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Wed, Feb 2, 2022 at 11:50 PM Inada Naoki wrote: > It would be nice to provide something similar to _Py_IDENTIFIER, but > designed (and documented) for 3rd party modules like this. I suppose I'd like to know what the value of _Py_IDENTIFIER() is for 3rd party modules. They can already use PyUnicode_InternFromString() to get a "global" object and then store it in their module state. I would not expect _Py_IDENTIFIER() to provide much of an advantage over that. Perhaps I'm missing something? If there is a real benefit then we should definitely figure out a good public API for it (if the current private one isn't sufficient). I won't be authoring that PEP though. :) -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AVJKKWITJPHUQTE2IXDYBCTQTKVPZPD7/
[Python-Dev] Moving away from _Py_IDENTIFIER().
I'm planning on moving us to a simpler, more efficient alternative to _Py_IDENTIFIER(), but want to see if there are any objections first before moving ahead. Also see https://bugs.python.org/issue46541.

_Py_IDENTIFIER() was added in 2011 to replace several internal string object caches and to support cleaning up the cached objects during finalization. A number of "private" functions (each with a _Py_Identifier param) were added at that time, mostly corresponding to existing functions that take PyObject* or char*. Note that at present there are several hundred uses of _Py_IDENTIFIER(), including a number of duplicates.

My plan is to replace our use of _Py_IDENTIFIER() with statically initialized string objects (as fields under _PyRuntimeState). That involves the following:

* add a PyUnicodeObject field (not a pointer) to _PyRuntimeState for each string that currently uses _Py_IDENTIFIER() (or _Py_static_string())
* statically initialize each object as part of the initializer for _PyRuntimeState
* add a macro to look up a given global string
* update each location that currently uses _Py_IDENTIFIER() to use the new macro instead

Pros:

* reduces indirection (and extra calls) for C-API functions that need the strings (making the code a little easier to understand and speeding it up)
* the objects are referenced from a fixed address in the static data section instead of the heap (speeding things up and allowing the C compiler to optimize better)
* there is no lazy allocation (or lookup, etc.) so there are fewer possible failures when the objects get used (thus less error return checking)
* saves memory (a little, at least)
* if needed, the approach for per-interpreter is simpler
* helps us get rid of several hundred static variables throughout the code base
* allows us to get rid of _Py_IDENTIFIER() and a bunch of related C-API functions
* "deep frozen" modules can use the global strings
* commonly-used strings could be pre-allocated by adding _PyRuntimeState fields for them

Cons:

* a little less convenient: adding a global string requires modifying a separate file from the one where you actually want to use the string
* strings can get "orphaned" (I'm planning on checking in CI)
* some strings may never get used for any given ./python invocation (not that big a difference though)

I have a PR up (https://github.com/python/cpython/pull/30928) that adds the global strings and replaces use of _Py_IDENTIFIER() in our code base, except in non-builtin stdlib extension modules. (Those will be handled separately if we proceed.) The PR also adds a CI check for "orphaned" strings. It leaves _Py_IDENTIFIER() for now, but disallows any Py_BUILD_CORE code from using it. With that change I'm seeing a 1% improvement in performance (see https://github.com/faster-cpython/ideas/issues/230).

I'd also like to actually get rid of _Py_IDENTIFIER(), along with other related API including ~14 (private) C-API functions. Dropping all that helps reduce maintenance costs. However, at least one PyPI project (blender) is using _Py_IDENTIFIER(). So, before we could get rid of it, we'd first have to deal with that project (and any others).

To sum up, I wanted to see if there are any objections before I start merging anything. Thanks!
-eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DNMZAMB4M6RVR76RDZMUK2WRLI6KAAYS/
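To make the shape of the plan concrete, here is a hypothetical Python model of the eager global-strings table (the real version embeds PyUnicodeObject values directly in _PyRuntimeState and the lookup is a C macro; all names below are invented for illustration):

```python
import sys
from types import SimpleNamespace

# Hypothetical model of the plan above: every needed string is created
# eagerly, up front, in one table -- so later lookups never allocate
# and never fail, unlike _Py_IDENTIFIER()'s lazy per-site caches.
GLOBAL_STRINGS = SimpleNamespace(
    partial=sys.intern("partial"),
    update=sys.intern("update"),
)

# The lookup "macro" then reduces to a plain attribute access:
assert GLOBAL_STRINGS.partial is sys.intern("partial")
assert GLOBAL_STRINGS.update is sys.intern("update")
```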
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
On Wed, Feb 2, 2022 at 3:41 PM Eric Snow wrote: > I'd also like to actually get rid of _Py_IDENTIFIER(), along with > other related API including ~14 (private) C-API functions. FTR, here is the (private/internal) C-API affected by getting rid of _Py_IDENTIFIER():

* 21 C-API functions with `_Py_Identifier` parameters - would be dropped
  + _PyUnicode_FromId()
  + _PyUnicode_EqualToASCIIId()
  + _PyObject_CallMethodId()
  + _PyObject_CallMethodId_SizeT()
  + _PyObject_CallMethodIdObjArgs()
  + _PyObject_VectorcallMethodId()
  + _PyObject_CallMethodIdNoArgs()
  + _PyObject_CallMethodIdOneArg()
  + _PyEval_GetBuiltinId()
  + _PyDict_GetItemId()
  + _PyDict_SetItemId()
  + _PyDict_DelItemId()
  + _PyDict_ContainsId()
  + _PyImport_GetModuleId()
  + _PyType_LookupId()
  + _PyObject_LookupSpecial()
  + _PyObject_GetAttrId()
  + _PyObject_SetAttrId()
  + _PyObject_LookupAttrId()
  + _PySys_GetObjectId()
  + _PySys_SetObjectId()
* 7 new internal functions to replace the _Py*Id() functions that didn't already have a normal counterpart
  + _PyObject_CallMethodObj()
  + _PyObject_IsSingleton()
  + _PyEval_GetBuiltin()
  + _PySys_SetAttr()
  + _PyObject_LookupSpecial() (with PyObject* param)
  + _PyDict_GetItemWithError()
  + _PyObject_CallMethod()
* the runtime state related to identifiers - would be dropped
* _Py_Identifier, _Py_IDENTIFIER(), _Py_static_string() - would be dropped

-eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OXKABHIUDUQETWXXBKUWD63XN65IVC22/
[Python-Dev] Re: Python no longer leaks memory at exit
On Thu, Jan 27, 2022 at 8:40 AM Victor Stinner wrote: > tl; dr Python no longer leaks memory at exit on the "python -c pass" command > ;-) Thanks to all for the effort on this! Would it be worth adding a test to make sure we don't start leaking memory again? -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ELLXKXMQAZ3WMLDDNKU7QLR6AGE36JJR/
[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)
On Wed, Jan 5, 2022, 15:02 Trent Nelson wrote: > I thought that was pretty interesting. Potentially many, many upper > bits for the taking. The code also had some logic that would int 3 > as soon as a 32-bit refcnt overflowed, and that never hit either > (obviously, based on the numbers above). > > I also failed to come up with real-life code that would result in a > Python object having a reference count higher than None's refcnt, but > that may have just been from lack of creativity. > > Just thought I'd share. Thanks, Trent. That's super helpful. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FMSE7AFZVJVBFRQMMYAEAXELITHN2E3B/
[Python-Dev] Re: Static types and subinterpreters running in parallel
On Thu, Dec 16, 2021 at 10:54 AM Guido van Rossum wrote: > Eric has been looking into this. It's probably the only solution if we can't > get immutable objects. Yep. I've investigated the following approach (for the objects exposed in the public and limited C-API):

* add a pointer field to PyInterpreterState (or a sub-struct) for each of the objects
* for the main interpreter, set those pointers to the existing statically declared objects
* for subinterpreters make a copy (memcpy()?) and fix it up
* add a lookup API and encourage extensions to use it
* for 3.11+ change the symbols to macros:
  + in the internal C-API (Py_BUILD_CORE), the macro would resolve to the corresponding PyInterpreterState field
  + in the public C-API (and limited API extensions built with 3.11+), the macro would resolve to a call to a (non-inline) lookup function
  + for limited API extensions built against earlier Python versions we'd still export the existing symbols
* limited API extensions built against pre-3.11 Python would only be allowed to run in the main interpreter on 3.11+
  + they probably weren't built with subinterpreters in mind anyway

There are still a number of details to sort out, but nothing that seems like a huge obstacle. Here are the ones that come to mind, along with other details, caveats, and open questions:

* the static types exposed in the C-API are PyObject values rather than pointers
  + I solved this by dereferencing the result of the lookup function (Guido's idea), e.g. #define PyTuple_Type (*(_Py_GetObject_Tuple()))
* there is definitely a penalty to using a per-interpreter lookup function
  + this would only apply to extension modules since internally we would access the PyInterpreterState fields directly
  + this is mostly a potential problem only when the object is directly referenced frequently (e.g. a tight loop)
  + the impact would probably center on use of the high-frequency singletons (None, True, False) and possibly with Py*_CheckExact() calls
  + would it be enough of a problem to be worth mitigating? how would we do so?
* static types in extensions can't have tp_base set to a builtin type (since the macro won't resolve)
  + extensions that support subinterpreters (i.e. PEP 489) won't be using static types (a weak assumption)
  + extensions that do not support subinterpreters and still have static types would probably break
  + how to fix that?
* limited API extensions built against 3.11+ but running under older Python versions would break?
  + how to fix that?

> But I would prefer the latter, if we can get the performance penalty low > enough.

Absolutely. Using immortal objects to solve this is a much simpler solution. -eric Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7RPTHCLEUHR34PIJKRN453UEWCAI56NW/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
On Thu, Dec 16, 2021 at 4:34 AM Antoine Pitrou wrote: > As a data point, in PyArrow, we have a bunch of C++ code that interacts > with Python but doesn't belong in a particular Python module. That C++ > code can of course have global state, including perhaps Python objects. Thanks for that example! > What might be nice would be a C API to allow creating interpreter-local > opaque structs, for example: > > void* Py_GetInterpreterLocal(const char* unique_name); > void* Py_SetInterpreterLocal(const char* unique_name, > void* ptr, void(*)() destructor); That's interesting. I can imagine that as just a step beyond the module state API, with the module being implicit. Do you think this would be an improvement over using module state? (I'm genuinely curious.) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KB7ET6XXJFTJDBHL7ABEPSGTD3M2RNAW/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)
On Thu, Dec 16, 2021 at 2:48 AM Petr Viktorin wrote: > But does the sign bit need to stay intact, and do we actually need to > rely on the immortal bit to always be set for immortal objects? > If the refcount rolls over to zero, an immortal object's dealloc could > bump it back and give itself another few minutes. > Allowing such rollover would mean having to deal with negative > refcounts, but that might be acceptable. FWIW, my original attempt at immortal objects (quite a while ago) used the sign bit as the marker (negative refcount meant immortal). However, this broke GC and Py_DECREF() and getting those to work right was a pain. It also made a few things harder to debug because a negative refcount no longer necessarily indicated something had gone wrong. In the end I switched to a really high bit as the marker and it was all much simpler. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LJ2WVSUPJY2X3VVJW4EEEFNOBRJ7AB4V/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)
On Tue, Dec 14, 2021 at 10:12 AM Eric Snow wrote: > * it is fully backward compatible and the C-API is essentially unaffected Hmm, this is a little misleading. It will definitely be backward incompatible for extension modules that don't work under multiple subinterpreters (or rely on the GIL to protect global state). Hence that other thread I started. :) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UDMQXP6GO5SYJGHKHX2W4VRSNAZ55PMI/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 12:18 PM Chris Angelico wrote: > Sorry if this is a dumb question, but would it be possible to solve > that last point with an immortal arena [1] from which immortal objects > could be allocated? None/True/False could be allocated there, but so > could anything that is more dynamic, if it's decided as important > enough. It would still be possible to recognize them by pointer (since > the immortal arena would be a specific block of memory). That's an interesting idea. An immortal arena would certainly be one approach to investigate. However, I'm not convinced there is enough value to justify going out of our way to allow dynamically allocated objects to be immortal. Keep in mind that the concept of immortal objects would probably not be available outside the internal API, and, internally, any objects we want to be immortal will probably be statically allocated. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/I2ZG4J577Q4CDWXQHYCOMOFMPJPP5XJT/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Tue, Dec 14, 2021 at 11:19 AM Eric Snow wrote: > There is one solution that would help both of the above in a nice way: > "immortal" objects. FYI, here are some observations that came up during some discussions with the "faster-cpython" team today: * immortal objects should probably only be immutable ones (other than ob_refcnt, of course) * GC concerns are less of an issue if a really high ref count (bit) is used to identify immortal objects * ob_refcnt is part of the public API (sadly), so using it to mark immortal objects may be sensitive to interference * ob_refcnt is part of the stable ABI (even more sadly), affecting any solution using ref counts * using the ref count isn't the only viable approach; another would be checking the pointer itself + put the object in a specific section of static data and compare the pointer against the bounds + this avoids loading the actual object data if it is immortal + for objects that are mostly treated as markers (e.g. None), this could have a meaningful impact + not compatible with dynamically allocated objects -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LVLFPOIOXM34NQ2G73BAXIRS4TIN74JV/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)
On Wed, Dec 15, 2021 at 6:16 AM Antoine Pitrou wrote: > Did you try to take into account the envisioned project for adding a > "complete" GC and removing the GIL? Yeah. I was going to start a separate thread about per-interpreter GIL vs. no-gil, but figured I was already pushing my luck with 3 simultaneous related threads here. :) It would definitely be covered by the info doc/PEP. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XHJ3PNBW23HXCT4BI3LXYFE4Q5NW576P/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 8:16 AM Skip Montanaro wrote: > It might be worth (re)reviewing Sam Gross's nogil effort to see how he > approached this: Yeah, there is good info in there. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BRQQ4FKWPXIEBSPKR4G2UUC4U4LDF3OV/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 4:03 AM Victor Stinner wrote: > The last time I saw a benchmark on immortal objects, it was clearly 10% > slower overall on the pyperformance benchmark suite. That's a major > slowdown. Yes, I plan on benchmarking the change as soon as we can run pyperformance on main. > > * abandon all hope > > I wrote https://bugs.python.org/issue39511 and > https://github.com/python/cpython/pull/18301 to have per-interpreter > None, True and False singletons. Yeah, I took a similar approach in the alternative to immortal objects that I prototyped. > By the way, I made the _Py_IDENTIFIER() API and _PyUnicode_FromId() > compatible with subinterpreters in Python 3.10. This change caused a > subtle regression when using subinterpreters (because an optimization > relied on an assumption about interned strings which is no longer true). > The fix is trivial but I didn't write it yet: > https://bugs.python.org/issue46006 FYI, I'm looking into statically allocating (and initializing) all the string objects currently using _Py_IDENTIFIER(). -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3XG4QY77MCRXEFUCJHB44RRIHFEM4MDD/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Tue, Dec 14, 2021 at 11:19 AM Eric Snow wrote: > The idea of objects that never get deallocated isn't new and has been > explored here several times. Not that long ago I tried it out by > setting the refcount really high. That worked. Around the same time > Eddie Elizondo at Facebook did something similar but modified > Py_INCREF() and Py_DECREF() to keep the refcount from changing. Our > solutions were similar but with different goals in mind. (Facebook > wants to avoid copy-on-write in their pre-fork model.) FTR, here are links to the above efforts: * reducing CoW (Instagram): https://bugs.python.org/issue40255 * Eddie's PR: https://github.com/python/cpython/pull/19474 * my PR: https://github.com/python/cpython/pull/24828 * some other discussion: https://github.com/faster-cpython/ideas/issues/14 (I don't have a link to any additional work Eddie did to reduce the performance penalty.) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OUJHQY22BZY5TJXYGPQQOBTCLUWB6OVQ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 2:42 AM Christian Heimes wrote: > Would it be possible to write the Py_INCREF() and Py_DECREF() macros in > a way that does not depend on branching? For example we could use the > highest bit of the ref count as an immutable indicator and do something like As Antoine pointed out, wouldn't that cause too much cache invalidation between threads, especially for None, True, and False. That's the main reason I abandoned my previous effort (https://github.com/ericsnowcurrently/cpython/pull/9). -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UA7CVGRI4N6ADOHDPMM4GC66XYKTW3KL/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Wed, Dec 15, 2021 at 2:50 AM Pablo Galindo Salgado wrote: > One thing to consider: ideally, inmortal objects should not participate in > the GC. There is nothing inheritly wrong if they do but we would need to > update the GC (and therefore add more branching in possible hot paths) to > deal with these as the algorithm requires the refcount to be exact to > correctly compute the cycles. That's a good point. Do static types and the global singletons already opt out of GC participation? -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ASIWOGWC5CKB3TNIFYS6767HEES5ATSP/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
On Tue, Dec 14, 2021 at 4:09 PM Brett Cannon wrote: > There's also the concern of memory usage if these immortal objects are never > collected. > > But which objects are immortal? You only listed None, True, and False. > Otherwise assume/remember I'm management and provide a list and/or link of > what would get marked as immortal so we can have an idea of the memory impact. Pretty much we would mark as immortal any object which would exist for the lifetime of the runtime (or the respective interpreter in some cases). So currently that would include the global singletons (None, True, False, small ints, empty tuple, etc.) and the static types. We would likely also include cached strings (_Py_Identifier, interned, etc.). From another angle: I'm working on static allocation for nearly all the objects currently dynamically allocated during runtime/interpreter init. All of them would be marked immortal. This is similar to the approach taken by Eddie with walking the heap and marking all objects found. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JKRY6FQYZIFFYQ64BSKLFGWUKX74NZ7M/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
Yeah, no (mutable) global state at the C level. It would also be good to implement multi-phase init (PEP 489), but I don't expect that to require much work itself. -eric On Tue, Dec 14, 2021 at 4:04 PM Brett Cannon wrote: > > > > On Tue, Dec 14, 2021 at 9:41 AM Eric Snow wrote: >> >> One of the open questions relative to subinterpreters is: how to >> reduce the amount of work required for extension modules to support >> them? Thanks to Petr Viktorin for a lot of work he's done in this >> area (e.g. PEP 489)! Extensions also have the option to opt out of >> subinterpreter support. >> >> However, that's only one part of the story. A while back Nathaniel >> expressed concerns with how making subinterpreters more accessible >> will have a negative side effect affecting projects that publish large >> extensions, e.g. numpy. Not all extensions support subinterpreters >> due to global state (incl. in library dependencies). The amount of >> work to get there may be large. As subinterpreters increase in usage >> in the community, so will demand increase for subinterpreter support >> in those extensions. Consequently, such projects be pressured to do >> the extra work (which is made even more stressful by the short-handed >> nature of most open source projects) . >> >> So we (the core devs) would effectively be requiring those extensions >> to support subinterpreters, regardless of letting them opt out. This >> situation has been weighing heavily on my mind since Nathaniel brought >> this up. Here are some ideas I've had or heard of about what we could >> do to help: >> >> * add a page to the C-API documentation about how to support subinterpreters >> * identify the extensions most likely to be impacted and offer to help >> * add more helpers to the C-API to make adding subinterpreter support >> less painful >> * fall back to loading the extension in its own namespace (e.g. 
use >> ldm_open()) >> * fall back to copying the extension's file and loading from the copied file >> * ... >> >> I'd appreciate your thoughts on what we can do to help. Thanks! > > > What are the requirements put upon an extension in order to support > subinterpreters? you hint at global state at the C level, but nothing else is > mentioned. Is that it? ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BQU3PVN6MHR2P24RAUPJSWFS547W7FPM/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] "immortal" objects and how they would help per-interpreter GIL
Most of the work toward interpreter isolation and a per-interpreter GIL involves moving static global variables to _PyRuntimeState or PyInterpreterState (or module state). Through the effort of quite a few people, we've made good progress. However, many globals still remain, with the majority being objects and most of those being static strings (e.g. _Py_Identifier), static types (incl. exceptions), and singletons. On top of that, a number of those objects are exposed in the public C-API and even in the limited API. :( Dealing with this specifically is probably the trickiest thing I've had to work through in this project. There is one solution that would help both of the above in a nice way: "immortal" objects. The idea of objects that never get deallocated isn't new and has been explored here several times. Not that long ago I tried it out by setting the refcount really high. That worked. Around the same time Eddie Elizondo at Facebook did something similar but modified Py_INCREF() and Py_DECREF() to keep the refcount from changing. Our solutions were similar but with different goals in mind. (Facebook wants to avoid copy-on-write in their pre-fork model.) A while back I concluded that neither approach would work for us. The approach I had taken would have significant cache performance penalties in a per-interpreter GIL world. The approach that modifies Py_INCREF() has a significant performance penalty due to the extra branch on such a frequent operation. Recently I've come back to the idea of immortal objects because it's much simpler than the alternate (working) solution I found. So how do we get around that performance penalty? Let's say it makes CPython 5% slower. We have some options: * live with the full penalty * make other changes to reduce the penalty to a more acceptable threshold than 5% * eliminate the penalty (e.g. claw back 5% elsewhere) * abandon all hope Mark Shannon suggested to me some things we can do. 
Also, from a recent conversation with Dino Viehland it sounds like Eddie was able to reach performance neutrality with a few techniques. So here are some things we can do to reduce or eliminate that penalty: * reduce refcount operations on high-activity objects (e.g. None, True, False) * reduce refcount operations in general * walk the heap at the end of runtime initialization and mark all objects as immortal * mark all global objects as immortal (statics or in _PyRuntimeState; not needed for PyInterpreterState) What do you think? Does this sound realistic? Are there additional things we can do to counter that penalty? -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7O3FUA52QGTVDC6MDAV5WXKNFEDRK5D6/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] subinterpreters and their possible impact on large extension projects
One of the open questions relative to subinterpreters is: how to reduce the amount of work required for extension modules to support them? Thanks to Petr Viktorin for a lot of work he's done in this area (e.g. PEP 489)! Extensions also have the option to opt out of subinterpreter support. However, that's only one part of the story. A while back Nathaniel expressed concerns about how making subinterpreters more accessible will have a negative side effect affecting projects that publish large extensions, e.g. numpy. Not all extensions support subinterpreters due to global state (incl. in library dependencies). The amount of work to get there may be large. As subinterpreters increase in usage in the community, so will the demand for subinterpreter support in those extensions. Consequently, such projects would be pressured to do the extra work (made even more stressful by the short-handed nature of most open-source projects). So we (the core devs) would effectively be requiring those extensions to support subinterpreters, regardless of letting them opt out. This situation has been weighing heavily on my mind since Nathaniel brought this up. Here are some ideas I've had or heard of about what we could do to help: * add a page to the C-API documentation about how to support subinterpreters * identify the extensions most likely to be impacted and offer to help * add more helpers to the C-API to make adding subinterpreter support less painful * fall back to loading the extension in its own namespace (e.g. use dlmopen()) * fall back to copying the extension's file and loading from the copied file * ... I'd appreciate your thoughts on what we can do to help. Thanks!
-eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/X3ZOSP2A4RTSKTBZ4XYHROSJBONCEDID/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] my plans for subinterpreters (and a per-interpreter GIL)
Hi all, I'm still hoping to land a per-interpreter GIL for 3.11. There is still a decent amount of work to be done but little of it will require solving any big problems: * pull remaining static globals into _PyRuntimeState and PyInterpreterState * minor updates to PEP 554 * finish up the last couple pieces of the PEP 554 implementation * maybe publish a companion PEP about per-interpreter GIL There are also a few decisions to be made. I'll open a couple of other threads to get feedback on those. Here I'd like your thoughts on the following: Do we need a PEP about per-interpreter GIL? I haven't thought there would be much value in such a PEP. There doesn't seem to be any decision that needs to be made. At best the PEP would be an explanation of the project, where: * the objective has gotten a lot of support (and we're working on addressing the concerns of the few objectors) * most of the required work is worth doing regardless (e.g. improve runtime init/fini, eliminate static globals) * the performance impact is likely to be a net improvement * it is fully backward compatible and the C-API is essentially unaffected So the value of a PEP would be in consolidating an explanation of the project into a single document. It seems like a poor fit for a PEP. (You might wonder, "what about PEP 554?" I purposefully avoided any discussion of the GIL in PEP 554. Its purpose is to expose subinterpreters to Python code.) However, perhaps I'm too close to it all. I'd like your thoughts on the matter. Thanks! -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PNLBJBNIQDMG2YYGPBCTGOKOAVXRBJWY/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Explicit markers for special C-API situations (re: Clarification regarding Stable ABI and _Py_*)
On Thu, Dec 9, 2021, 11:26 Petr Viktorin wrote: > I'll not get back to CPython until Tuesday, but I'll add a quick note > for now. It's a bit blunt for lack of time; please don't be offended. Not at all. :) The tooling is a secondary concern to my point. Mostly, I wish the declarations in the header files had the extra classifications, rather than having to remember to refer to a separate text file. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OV7BOBOINCBWLZS3DZRWWJGY3BE4IOZB/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Explicit markers for special C-API situations (re: Clarification regarding Stable ABI and _Py_*)
(replying to https://mail.python.org/archives/list/python-dev@python.org/message/OJ65FPCJ2NVUFNZDXVNK5DU3R3JGLL3J/) On Wed, Dec 8, 2021 at 10:06 AM Eric Snow wrote: > What about the various symbols listed in Misc/stable_abi.txt that were > accidentally added to the limited API? Can we move toward dropping > them from the stable ABI? tl;dr We should consider making classifications related to the stable ABI harder to miss. Knowing what is in the limited API is fairly straightforward. [1] However, it's clear that identifying what is part of the stable ABI, and why, is not so easy. Currently, we must rely on Misc/stable_abi.txt [2] (and the associated Tools/scripts/stable_abi.py). Documentation (C-API docs, PEPs, devguide) help too. Yet, there's a concrete disconnect here: the header files are by definition the authoritative single-source-of-truth for the C-API and it's too easy to forget about supplemental info in another file or document. This out-of-sight-out-of-mind situation is part of how we accidentally added things to the limited API for a while. [3] The stable ABI isn't the only area where we must identify different subsets of the C-API. However, in those other cases we use different structural/naming conventions to explicitly group things. Most importantly, each of those conventions makes the grouping unavoidable when reading the code. [4] For example: * closely related declarations go in the same header file (and then also exposed via Include/Python.h) * prefixes (e.g. Py_, PyDict_) provides similar grouping * an additional underscore prefix identifies "private" C-API * symbols are explicitly identified as part of the C-API via macros (PyAPI_FUNC, PyAPI_DATA) [5] * relatively recently, different directories correspond to different API layers (Include, Include/cpython, Include/internal) [3] Could we take a similar explicit, coupled-to-the-code approach to identify when the different stable ABI situations apply? 
Here's the specific approach I had in mind, with macros similar to PyAPI_FUNC: * PyAPI_ABI_FUNC - in stable ABI when it wouldn't be normally (e.g. underscore prefix, in Include/internal) * PyAPI_ABI_INDIRECT - exposed in stable ABI due to a macro * PyAPI_ABI_ONLY - it only exists for ABI compatibility and isn't actually used any more * PyAPI_ABI_ACCIDENTAL - unintentionally added to limited API, probably not used there (...or perhaps use a PyABI_ prefix, though that's a bit easy to miss when reading.) As a reader I would find markers like this helpful in recognizing those special situations, as well as the constraints those situations impose on modification. At the least such macros would indicate something different is going on, and the macro name would be something I could look up if I needed more info. I expect others reading the code would get comparable value. I also expect tools like Tools/scripts/stable_abi.py would benefit. -eric [1] in Include/*.h and not #ifndef Py_LIMITED_API (sadly also making it easy to accidentally add things to the limited API, see [3]) [2] Before that you had to rely on comments or external documents or, in the worst case, work it out through careful study of the code, commit history, and mailing list archives. [3] The addition of Include/cpython and Include/internal helped us stop accidentally adding to the limited API. [4] It also makes the groupings deterministically discoverable by tools. [5] explicit use of "extern" indicates a different intent ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7BSVTXDYCEOURQTLDRUXPXNPRYMM3I4G/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*
On Thu, Dec 9, 2021 at 1:56 AM Petr Viktorin wrote: > It's possible to remove them just like _PyObject_GC_Malloc was removed, > but check that it was unusable (e.g. not called from public macros) in > all versions of Python from 3.2 up to now. That's what I expected. Thanks. > Could you check if this PR makes things clear? > https://github.com/python/devguide/pull/778 Yeah, that text is super helpful. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TB7JEBQXUJJKK4SZVLCMUNOTRTD5KQ5C/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*
On Wed, Dec 8, 2021 at 2:23 AM Petr Viktorin wrote: > That really depends on what function we'd want to remove. There are > usually alternatives to deleting things, but the options depend on the > function. If we run out of other options we can make the function always > fail or make it leak memory. > And the regular backwards compatibility policy gives us 2 years to > figure something out :) What about the various symbols listed in Misc/stable_abi.txt that were accidentally added to the limited API? Can we move toward dropping them from the stable ABI? Most notably, there are quite a few functions listed there that are in the stable ABI but no longer in the limited API. This implies that either they were already deprecated in the limited API (and removed) or they were just removed. At least in some cases they were moved to header files in Include/cpython or Include/internal. So I would not expect extensions to be using them. This subset of those symbols seems entirely appropriate to remove from the stable ABI. Is that okay? Do we even need to bother deprecating them? What about just the "private" ones? For example, I went to change/remove _PyThreadState_Init() (internal API declared in Include/internal/pycore_pystate.h) and found that it is in the stable ABI but not the limited API. It's highly unlikely anyone is using it, but I plan on double-checking. As far as I can tell, the function was accidentally exposed in the limited API and stable ABI and later removed from the limited API. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OJ65FPCJ2NVUFNZDXVNK5DU3R3JGLL3J/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 6:55 AM Eric V. Smith wrote: > As a compromise, how about go with #1, but print a warning if python > detects that it's not built with optimizations or is run from a source > tree (the conditions in #2 and #3)? The warning could suggest running > with "-X frozen_modules=off". I realize that it will probably be ignored > over time, but maybe it will provide enough of a reminder if someone is > debugging and sees the warning. Yeah, that would probably be sufficient (and much simpler). I'll try it out. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/EFNVXABK36DTKO6IDFC2PTP6P4OHM46B/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 6:47 AM Pablo Galindo Salgado wrote: > One interesting consequence of what Eric mentioned (They have a different > loader and repr. Also, frozen modules do not > have __file__ set (and __path__ is always []).) is that frozen modules don't > have a `__file__` attribute IIRC and therefore > tracebacks won't include the source. FYI, we are planning on setting __file__ on the frozen stdlib modules, whenever possible. (We can do that whenever we can determine the stdlib dir during startup. See https://bugs.python.org/issue45211.) Regardless, for tracebacks we would need to set co_filename on the module's code objects, right? -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/W6F2V3H3KHGLOL5CJDLTO7DGO37LYIG5/ Code of Conduct: http://python.org/psf/codeofconduct/
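[Editorial aside: to illustrate the co_filename point, here is a small hedged sketch (the replacement path is a made-up placeholder). A code object's filename is baked in at compile time, but since Python 3.8 `CodeType.replace()` can derive a copy with a different one, which is what tracebacks for that code would then report.]

```python
def example():
    return 42

code = example.__code__
print(code.co_filename)  # wherever this file was compiled from

# Derive a copy of the code object bound to a different (hypothetical)
# source path; tracebacks through this code would show the new filename.
patched = code.replace(co_filename="/usr/lib/python3/os.py")
print(patched.co_filename)  # /usr/lib/python3/os.py
```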
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 6:36 AM Victor Stinner wrote: > Honestly, for me, #1: always on, is the most reasonable choice. > > I dislike when Python behaves differently depending on subtle things > like "was it built with optimizations" or "is Python started from its > source tree"? > > When I built Python without optimization and/or from its source tree, > I do that to debug an issue. If the bug goes away in this case, it can > waste my time. > > So I prefer to teach everybody how to use "-X frozen_modules=off" if > they want to hack the stdlib for their greatest pleasure. I prefer > that such special use case requires an opt-in option, the special use > case is not special enough to be the default. Agreed. I just don't want to discourage potential contributors nor waste anyone's time. I suppose that's the fundamental question I originally posted: would it be too annoying for contributors if we made the default "on" always? I expect most non-docs contributions are made against the stdlib so that factors in. > It means that the site module module can no longer be "customized" by > modifying directly the site.py file (inject a path in PYTHONPATH env > var where the customized site.py lives). But there is already a > supported way to customize the site module: create a module named > "sitecustomize" or "usercustomizer". I recall that virtualenv likes to > override stdlib site.py with its own code. tox uses virtualenv by > default. Someone should check if freezing site doesn't break > virtualenv and tox, since they seem to be popular in Python. The venv > doesn't need to override site.py and tox can use venv if I recall > correctly. > > If site.py customization is too popular, I would suggest to not freeze > this one, until the community stops doing that. Good point. I'll look into that. 
-eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/M53U66ZP7QUSHDBYK2HONALLKW2EKSFQ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 2:22 AM Marc-Andre Lemburg wrote: > #3 sounds like a good solution, but how would you detect "running > from the source tree" ? This sounds like you need another stat call > somewhere, which is what the frozen modules try to avoid. We already look for the stdlib dir in Modules/getpath.c. We can use that information without an extra stat. (See https://bugs.python.org/issue45211.) > I'd like to suggest adding an environment variable to enable / > disable the setting instead. This makes it easy to customize the > behavior without introducing complicated logic. That's essentially what "-X frozen_modules=..." provides, though with an env var you don't have to adjust your CLI invocation each time. That said, there are a couple reasons why an env var might not be suitable. For one, I expect use of the -X option to be very uncommon, especially outside of core development, so more of a one-off feature. In contrast, to me environment variables imply repeated usage. Also, if we use an env var to override the default (of "on"), contributors will still get bitten by the problem I described originally. To me, it's important that the default in that case be "off" without any other intervention. FWIW, I consider the "complicated logic" part as the negative side of going with running-in-source-tree. So, at this point I'm leaning more toward Brett's suggestion of using "configure --with-pydebug" (AKA Py_DEBUG) to determine the default. That should be a suitable approximation of running-in-source-tree. We can circle back if it proves inadequate. On Tue, Sep 28, 2021 at 2:26 AM Marc-Andre Lemburg wrote: > Just to clarify: the modules would still always be frozen with > the env var setting, but Python would simply not import them > as frozen modules, but instead go and look on the PYTHONPATH > for the modules. 
> > This could be achieved by special casing the frozen module > finder function to only trigger on importlib modules and > return NULL for all other possibly frozen modules. Right. That is essentially what we're doing. (See find_frozen() in Python/import.c.) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BVXMNZYHBPTKYC4QEVHGWUKQMLR2XGSZ/ Code of Conduct: http://python.org/psf/codeofconduct/
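[Editorial aside: for reference, the runtime value of a -X option can be inspected from Python through the CPython-specific `sys._xoptions` mapping, so code can tell which mode was requested; the fallback string below is just illustrative.]

```python
import sys

# "python -X frozen_modules=off" shows up as {"frozen_modules": "off"};
# if the option wasn't passed on the command line, the key is absent.
mode = sys._xoptions.get("frozen_modules", "<build default>")
print(mode)
```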
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 6:02 AM Ronald Oussoren via Python-Dev wrote: > Of course. I mentioned it because the proposal is to add a new option that’s > enabled after installation, and basically not when the testsuite is run. > That’s not a problem, we could just enable the option in most CI jobs. FYI, I already added the CLI option (-X frozen_modules=[on|off]) a couple weeks ago, with the default always "off", and have frozen about 10 of the stdlib modules (see _imp._frozen_module_names()). This thread is about a satisfactory approach to changing the default to "on". -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HD7GBPNT74GPY6COVQ6W4V7MTJ4NIHUT/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Tue, Sep 28, 2021 at 2:54 AM Ronald Oussoren via Python-Dev wrote: > I agree, but… Most CPython tests are run while running from the source tree, > that means that there will have to be testrunner configurations that run with > “-X frozen_modules=on”. If the build option that determines the default is covered by existing builtbots then we will be running the test suite in both modes without any extra work. The alternative is that we do for other modules what we do with importlib: run the relevant tests one in each mode. However, it's better to run the whole suite in both modes, so I'd favor relying on the build-option-specific buildbots to get us coverage. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/M2LMJYETM7KXVAWQ6UY7DMAZUXO6H33K/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 3:31 PM Victor Stinner wrote:
> Which stdlib modules are currently frozen? If I really want to hack
> site.py or os.py for whatever reason, I just have to use "python3 -X
> frozen_modules=off"?

The single source of truth is Tools/scripts/freeze_modules.py. After running "make regen-frozen" you'll find a cleaner list in Python/frozen_modules/MANIFEST. You can also look at the generated code in Makefile.pre.in or Python/frozen.c. Finally, you can run "./python -X frozen_modules=on -c 'import _imp; print(_imp._frozen_module_names())'".

> > 1. always default to "on" (the annoyance for contributors isn't big enough?)
>
> What is the annoyance?

The annoyance of changes to the .py files not getting used (at least not until after running "make all").

> What is different between frozen and not frozen?

They have a different loader and repr. Also, frozen modules do not have __file__ set (and __path__ is always []).

-eric
___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/J5IINOU6JBHNBA4ZOTXWDCBC3QIQT2EF/ Code of Conduct: http://python.org/psf/codeofconduct/
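[Editorial aside: the loader difference can also be checked at runtime. A hedged sketch using the private `_imp` helpers mentioned above (private, so subject to change); the `os` lookup just shows the pattern, and its result depends on the build and on the -X frozen_modules setting.]

```python
import _imp
import importlib.util

# importlib's bootstrap modules are frozen in every CPython build.
print(_imp.is_frozen("_frozen_importlib"))  # True

# A module spec's origin shows how it would be loaded: a filesystem
# path for source modules, or "frozen" for the frozen importer.
spec = importlib.util.find_spec("os")
print(spec.origin)
```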
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 3:04 PM Barry Warsaw wrote: > If you’re planning a runtime -X option, then does that mean that the modules > will be frozen at build time but Python will decide at runtime whether to use > the frozen modules or the unfrozen ones? Correct. FYI, this was already done. > Are you planning on including the currently frozen importlib modules in that > same mechanism? No. They must always be frozen. See is_essential_frozen_module() in Python/import.c. > Will `make test` and/or CI run Python with both options? How will we make > sure that frozen modules (or not) don’t break Python? If "configure --with-optimizations" always sets the default to "on" and the default is "off" otherwise, then the PGO buildbots will exercise the frozen path. Likewise if "--with-pydebug" (or in-source-tree) makes the default "off" and otherwise it's "on". Without a build-time option already handled by one of the buildbots, we'd need to either add a dedicated buildbot or run it both ways (like we do with importlib). I expect that won't be necessary. > Option #3 seems like the most reasonable one to me, with the ability to turn > it on when running from the source tree. It's definitely the one that fits most naturally for me. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/I2JTBQSFFA2GFMSGRDGHDARUPSZTLMQ2/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 2:59 PM Brett Cannon wrote: > What about opting out when `--with-pydebug` is used? I'm not sure how many > people actively develop in a non-debug build other than testing something, > but at that point I would be having to run `make` probably anyway for > whatever I'm mucking with if it's that influenced by a debug build. Yeah, that's an option too. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/C2IMGPQMJQAFCH26SYHE4JE4WJRCPDBM/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 12:40 PM Steve Dower wrote: > Having it be implied by an "--enable-optimizations" option is totally > fine (and we'd add one to build.bat for this), but I still think it > needs to be discoverable later whether the frozen modules build option > was used or not, independent of other build options. That's reasonable. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KKLMPEB6SI2EC34MUPLSTFBJJYG4O4WE/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 10:51 AM Eric Snow wrote: > Possible solutions: > > 1. always default to "on" (the annoyance for contributors isn't big enough?) > 2. default to "on" if it's a PGO build (and "off" otherwise) > 3. default to "on" unless running from the source tree FWIW, I'm planning on doing (2) (and (3) if it isn't complicated). Mostly I wanted to verify my assumptions about the possible annoyance before getting too far. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/J656PLVTGTVDCLV2GSZPNV46UTKU4S7M/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: The Default for python -X frozen_modules.
On Mon, Sep 27, 2021 at 11:09 AM Chris Angelico wrote: > When exactly does the freezing happen? When you build the executable (e.g. "make -j8", ".\PCbuild\build.bat"). So your changes to those .py files wouldn't show up until then. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IGDYUDVFHDU77OLPP3744FIG3IHZWS4D/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] The Default for python -X frozen_modules.
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread. Possible solutions: 1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree Thoughts? -eric [1] https://bugs.python.org/issue45020 [2] FWIW, we may end up also freezing the modules imported for "python -m ...", along with some other commonly used modules (like argparse). That is a separate discussion. ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4ESW3NNOX43DRFKLEW3IMDXDKPDMNRGR/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: A better way to freeze modules
On Fri, Sep 3, 2021 at 5:32 AM Paul Moore wrote: > On Fri, 3 Sept 2021 at 10:29, Simon Cross > wrote: > > I think adding a meta path importer that reads from a standard > > optimized format could be a great addition. > > I think the biggest open question would be "what benefits does this > have over the existing zipimport?" +1 > > As you mentioned in your email, this is a big detour from the current > > start-up performance work, so I think practically the people working > > on performance are unlikely to take a detour from their detour right > > now. > > Agreed, it would probably have to be an independent development > initially. If it delivers better performance, then switching the > startup work to use it would give a second set of performance > improvements, which no-one is going to object to. Similarly, if it's > simpler to manage, then the maintainability benefits could justify > switching over. +1 > > * Write the meta path importer in a separate package (it sounds like > > you've already done a lot of the work and gained a lot of > > understanding of the issues while writing PyOxidizer!) > > This is the key thing, though. The import machinery allows new > importers to be written as standalone modules, so I'd strongly > recommend that the proposed format/importer gets developed as a PyPI > module initially, with the PEP then being simply a proposal that the > module gets added to the stdlib and/or built into the interpreter. FWIW, I'm a big fan of folks taking advantage of the flexibility of the import machinery and writing importers like this (especially ones that folks must explicitly enable). As noted elsewhere, it would need to prove its worth before we consider putting it into importlib. > The key argument would be bootstrapping, IMO. I would definitely expect > interest in something like this to be lower if it's an external module > (needing a dependency to load your other dependencies is suboptimal). 
> Conversely, though, if no-one shows any interest in a PyPI version of > this idea, that would strongly imply that it's not as useful in > practice as you'd hoped. Excellent point! > In particular, I'd involve the maintainers of pyinstaller in the > design. If a new "frozen module importer" mechanism isn't of interest > to them, it's probably not going to get the necessary support to be > worth adding to the stdlib. +1 > On a personal note, I love the flexibility of Python's import system, > and I've always wanted to write importers for additional storage > formats (import from a sqlite database, for instance). But I've never > actually done so, because a zipfile is basically always sufficient for > any practical use case I've had. One day I hope to find a real use > case, though :-) Cool! I'd love to see what you make. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YV2K3BPVDZRZTGLM4HWQEJWMVPI6BGHD/ Code of Conduct: http://python.org/psf/codeofconduct/
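[Editorial aside: to make "importers can be written as standalone modules" concrete, here is a minimal hedged sketch of a meta path importer. The in-memory `_STORE` dict and the `demo_mod` name are invented for the example, while the finder/loader hooks (`find_spec`, `create_module`, `exec_module`) are the standard importlib protocol.]

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "module store"; a real importer might read from
# a zipfile, a database, or frozen data compiled into the binary.
_STORE = {"demo_mod": "VALUE = 42\n"}

class StoreImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, fullname, path=None, target=None):
        if fullname in _STORE:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the other finders handle everything else

    def create_module(self, spec):
        return None  # default module creation is fine

    def exec_module(self, module):
        source = _STORE[module.__name__]
        exec(compile(source, f"<store:{module.__name__}>", "exec"),
             module.__dict__)

sys.meta_path.insert(0, StoreImporter())

import demo_mod
print(demo_mod.VALUE)  # 42
```

A PyPI package would swap the dict for an optimized on-disk format, but the import-system plumbing stays the same.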
[Python-Dev] Re: A better way to freeze modules
On Thu, Sep 2, 2021 at 10:46 PM Gregory Szorc wrote:
> Over in https://bugs.python.org/issue45020 there is some exciting work around
> expanding the use of the frozen importer to speed up Python interpreter
> startup. I wholeheartedly support the effort and don't want to discourage
> progress in this area.
>
> Simultaneously, I've been down this path before with PyOxidizer and feel like
> I have some insight to share.

Thanks for the support and for taking the time to share your insight! Your work on PyOxidizer is really neat.

Before I dive into replying, I want to be clear about what we are discussing here. There are two related topics: the impact of freezing stdlib modules and usability problems with frozen modules in general (stdlib or not). https://bugs.python.org/issue45020 is concerned with the former but prompted some good discussion about the latter. From what I understand, this python-dev thread is more about the latter (and then some). That's totally worth discussing! I just don't want the two topics to be unnecessarily conflated.

FYI, frozen modules (effectively the .pyc data) are compiled into the Python binary and then loaded from there during import rather than from the filesystem. This allows us to avoid disk access, giving us a performance benefit, but we still have to unmarshal and execute the module code. It also allows us to have the import machinery written in pure Python (importlib._bootstrap and importlib._bootstrap_external). (Thanks Brett!)

While frozen modules are derived from .py files, they currently have some differences from the corresponding source modules: the loader (which has less capability), the repr, frozen packages have __path__ set to [], and frozen modules don't have __file__, __cached__, etc. set. This has been the case for a long time. MAL worked on addressing __file__ but the effort stalled out. (See https://bugs.python.org/issue45020#msg400769 and especially https://bugs.python.org/issue21736.)
The challenge with solving this for non-stdlib modules is that the frozen importer would need help to know where to find the corresponding .py files.

bpo-45020 is about freezing a small subset of the stdlib as a performance improvement. It's the 11 stdlib modules (plus encodings) that get imported every time during "./python -c pass". Freezing them provides a roughly 15% startup time improvement. (The 11 modules are: abc, codecs, encodings, io, _collections_abc, _sitebuiltins, os, os.path, genericpath, site, and stat. Maybe there are a few other modules it would make sense to freeze but we're starting with those 11.)

This work is probably somewhat affected by the differences between frozen and source modules, and we may need to set an appropriate __file__ on frozen stdlib modules to avoid impacting folks that expect any of those stdlib modules to have it set. Otherwise, for bpo-45020 there likely isn't much more we need to do about frozen stdlib modules shipping with CPython by default. Regardless, bpo-45020 doesn't introduce any new problems; rather it slightly exposes the existing ones.

In contrast to the use of frozen modules in default Python builds, there are a number of tools in the community for freezing modules (both stdlib and not) into custom Python binaries, like PyOxidizer and MAL's PyRun. Such tools would benefit from broader compatibility between frozen modules and the corresponding source modules. Consequently the tool maintainers would be the most likely drivers of any effort to improve frozen modules (which the discussion with MAL and Gregory bears out). The tools would especially benefit if those improvements could apply to non-stdlib modules, which requires a more complex solution than is needed for stdlib modules. At the (relative) extreme is to throw out the existing frozen module approach (or even the "unmarshal + exec" approach of source-based modules) and replace it with something more efficient and/or more compatible (and cross-platform).
From what I understood, this is the main focus of this thread. It's interesting stuff and I hope the discussion renders a productive result. FTR, in bpo-45020 Gregory helpfully linked to some insightful material related to PyOxidizer and frozen modules: * https://github.com/indygreg/PyOxidizer/issues/69 * https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer_behavior_and_compliance.html?highlight=__file__#file-and-cached-module-attributes * https://pypi.org/project/oxidized-importer/ and https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer.html With that said, on to replying. :) > I don't think I'll be offending anyone by saying the existing CPython frozen > importer is quite primitive in terms of functionality: it does the minimum it > needs to do to support importing module bytecode embedded in the interpreter > binary [for purposes of bootstrapping the Python-based importlib modules]. > The C struct representing frozen modules is literally just the
[Python-Dev] Re: Is anyone relying on new-bugs-announce/python-bugs-list/bugs.python.org summaries
On Mon, Aug 23, 2021 at 4:16 PM Ammar Askar wrote: > As part of PEP 588, migrating bugs.python.org issues to Github, Thanks for working on this! > 1. Weekly summary emails with bug counts and issues from the week, > 2. Emails sent to the new-bugs-announce and python-bugs-list for new I rely on both these. They help improve signal-to-noise and make it easier to quickly get back up-to-date if I'm out for a while. -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IZJQIDKKL7KBGPUEL2YQO44FZIMLTZPO/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Making code object APIs unstable
On Fri, Aug 13, 2021 at 11:29 AM Guido van Rossum wrote:
> If these weren't part of the stable ABI, I'd choose (E).

They aren't in the stable ABI (or limited API). Instead, they are part of the broader public API (declared in Include/cpython/code.h, along with "struct PyCodeObject" and others). FWIW, there is actually very little API related to PyCodeObject that is in the limited API:

* Include/code.h:typedef struct PyCodeObject PyCodeObject;
* Include/genobject.h:PyCodeObject *prefix##_code;
* Include/pyframe.h:PyAPI_FUNC(PyCodeObject *) PyFrame_GetCode(PyFrameObject *frame);

All that said, the issue of compatibility remains. I mostly agree with Guido's analysis and his choice of (E), as long as it's appropriately documented as unstable. However, I'd probably pick (C) with a caveat.

We already have a classification for this sort of unstable API: "internal". Given how code objects are so coupled to the CPython internals, I suggest that most API related to PyCodeObject belongs in the internal API (in Include/internal/pycore_code.h) and thus moved out of the public API. Folks that are creating code objects manually via the C-API are probably already doing low-level stuff that requires other "internal" API (via Py_BUILD_CORE, etc.). Otherwise they should use types.CodeType instead.

Making that change would naturally include dropping PyCode_New() and PyCode_NewWithPosArgs(), as described in (C). However, we already have _PyCode_New() in the internal API. (It is slightly different but effectively equivalent.) We could either drop the underscore on _PyCode_New() or move the existing PyCode_NewWithPosArgs() (renamed to PyCode_New) to live beside it.
-eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KWWRLL56EI2S5BVADKMDCG4UED76GXXG/ Code of Conduct: http://python.org/psf/codeofconduct/
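[Editorial aside: the `types.CodeType` route suggested above can be sketched from pure Python. `CodeType.replace()` (3.8+) fills in the many positional fields that PyCode_New() and PyCode_NewWithPosArgs() take in C; the function and the new name below are invented for the example.]

```python
import types

def greet():
    return "hello"

# Derive a tweaked code object in pure Python instead of calling
# PyCode_New() from C; replace() copies every field we don't override.
renamed = greet.__code__.replace(co_name="farewell")
farewell = types.FunctionType(renamed, greet.__globals__, "farewell")

print(farewell.__code__.co_name)  # farewell
print(farewell())                 # hello
```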
[Python-Dev] Re: Why aren't we allowing the use of C11?
On Thu, Jan 28, 2021 at 9:28 AM Mark Shannon wrote: > Is there a good reason not to start using C11 now? Would C17 be a better choice? It sounds like it exists to fix problems with C11 (and doesn't actually add any new features). -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/2D25C5KI73LBRVLFHDBGH4OIKSCCEPUO/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: PEP 622 railroaded through?
On Fri, Jul 3, 2020, 12:40 Eric Snow wrote:
> Also, keep in mind that PEPs are a tool for the decision maker (i.e.
> BDFL delegate). Effectively, everything else is convention. The process
> usually involves community feedback, but has never been community-driven.
> All this has become more painful for volunteers as the Python community has
> grown.
>
> -eric

To further elaborate on that, a PEP isn't legislation to be approved by the community. Rather, it is meant to capture the proposal and discussion sufficiently that the BDFL/delegate can make a good decision. Ultimately there isn't much more to the process than that, beyond convention. The BDFL-delegate is trusted to do the right thing and the steering council is there as a backstop. It's up to the decision maker to reach a conclusion and it makes sense that they especially consider community impact. However, there is no requirement of community approval.

This is not new. Over the years quite a few decisions by Guido (as BDFL) sparked controversy, yet in hindsight Python is better for each of those decisions. (See PEP 20.)

The main difference in recent years is the growth of the Python community, which is a happy problem even if a complex one. :) There has been a huge influx of folks without context on Python's governance but with contrary expectations and loud voices. On the downside, growth has greatly increased communications traffic and lowered the signal-to-noise ratio, as well as somewhat shifted the tone in the wrong direction. Unfortunately all this contributed to us losing our BDFL. :( Thankfully we have the steering council as a replacement.

Regardless, Python is not run as a democracy nor by a representative body. Instead, it is run by a group of trusted volunteers who are trying their best to keep Python going and make it better. The sacrifices they make reflect how much they care about the language and the community, especially as dissenting voices increase in volume and vitriol. That negativity has a real impact.
-eric > ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GTOD2OHTIJU34DQS6XH756X4K2FLL2C2/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: PEP 622 railroaded through?
On Fri, Jul 3, 2020, 09:18 Antoine Pitrou wrote: > I think what you describe as "the usual procedure" isn't as usual as > you think. > +1 Also, keep in mind that PEPs are a tool for the decision maker (i.e. BDFL delegate). Effectively, everything else is convention. The process usually involves community feedback, but has never been community-driven. All this has become more painful for volunteers as the Python community has grown. -eric > ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QUFB5NI64ASIESOCWHNPUQZPR5BEMXQF/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?
On Wed, Jun 17, 2020 at 11:42 AM Emily Bowman wrote: > So most likely there wouldn't be any way to share something like a bytearray > or another > buffer interface-compatible type for some time. That's too bad, I was hoping > to have > shared arrays that I could put a memoryview on in each thread/interpreter and > deal with > locking if I need to, Earlier versions of PEP 554 did have a "SendChannel.send_buffer()" method for this but we tabled it in the interest of simplifying. That said, I expect we'll add something like that separately later. > but I suppose I can work through an extension once the changes stabilize. Yep. This should be totally doable in an extension and hopefully without much effort. > Packages like NumPy have had their own opaque C types and C-only routines to > handle all the big threading outside of Python as a workaround for a long > time now. As a workaround for what? This sounds interesting. :) -eric ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/D2APLOLR4UL7VXLNRFGFWOUN5MPIO2BV/ Code of Conduct: http://python.org/psf/codeofconduct/
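[Editorial aside: for context on the buffer-sharing use case, the closest stdlib tool today is `multiprocessing.shared_memory`, which shares a raw buffer between processes rather than interpreters; a channel-based `send_buffer()` would presumably feel similar. A minimal sketch:]

```python
from multiprocessing import shared_memory

# Create a 16-byte shared block; a second handle attaches by name,
# as another process (or, hypothetically, interpreter) would.
shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[0] = 255

other = shared_memory.SharedMemory(name=shm.name)
print(other.buf[0])  # 255

# Each side deals with its own cleanup (and any locking it needs).
other.close()
shm.close()
shm.unlink()
```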
[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)
On Fri, Jun 12, 2020 at 2:49 AM Mark Shannon wrote:
> The overhead largely comes from what you do with the process. The
> additional cost of starting a new interpreter is the same regardless
> of whether it is in the same process or not.

FWIW, there's more to it than that:

* there is some overhead to starting the runtime and main interpreter that does not apply to additional in-process interpreters
* I don't see why we shouldn't be able to come up with a strategy for interpreter startup that does not involve copying or sharing a lot of interpreter state, thus reducing startup time and memory consumption
* I'm guessing that re-importing builtin/extension modules in a subinterpreter is faster than importing them anew in a separate process

-eric
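The first bullet above is easy to see for yourself: spawning a whole new Python process pays for runtime and main-interpreter initialization every time. This is a rough illustration, not a benchmark, and the absolute number will vary wildly by machine:

```python
# Rough illustration (not a benchmark) of the per-process startup cost
# under discussion: a brand-new Python process pays for runtime + main
# interpreter init, which an additional in-process interpreter skips.
import subprocess
import sys
import time

start = time.perf_counter()
subprocess.run([sys.executable, "-S", "-c", "pass"], check=True)
per_process = time.perf_counter() - start

print(f"bare '{sys.executable} -c pass': {per_process * 1000:.1f} ms")
```

Even with -S (skipping site import), this typically lands in the tens of milliseconds, which is the floor a same-process interpreter would not have to pay in full.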
[Python-Dev] Re: Can we stop adding to the C API, please?
On Wed, Jun 3, 2020 at 7:12 AM Mark Shannon wrote:
> The size of the C API, as measured by `git grep PyAPI_FUNC | wc -l`, has
> been steadily increasing over the last few releases.
>
>   3.5  1237
>   3.6  1304
>   3.7  1408
>   3.8  1478
>   3.9  1518
>
> For reference, the 2.7 branch has "only" 973 functions.

It isn't as bad as that. Here I'm only looking at PyAPI_FUNC under Include/. From 3.5 to master the *public* C-API has increased by 71 functions (and the "private"/internal C-API by 189). "Private" here means functions whose names start with "_":

    VER   TOT    PUB + "_"
    2.7    932   (752 + 178)
    3.5   1181   (846 + 320)
    3.6   1247   (851 + 380)
    3.7   1350   (875 + 460 + 13 internal)
    3.8   1424   (908 + 422 + 79 internal)
    3.9   1447   (917 + 403 + 110 internal)
    m     1443   (917 + 401 + 108 internal)

(This does not count changes in the number of macros, which may have gone down ... or not.)

FWIW, relative to the "cpython" API split that happened in 3.8 (and "internal" in 3.7):

    VER  total  Include/*.h        Include/cpython/*.h  Include/internal/*.h
    2.7   932    932 (752 + 178)   -                    -
    3.5  1181   1181 (846 + 320)   -                    -
    3.6  1247   1247 (851 + 380)   -                    -
    3.7  1350   1350 (875 + 460)   -                    13 (0 + 13)
    3.8  1424   1050 (800 + 249)   295 (108 + 173)      79 (0 + 79)
    3.9  1447    944 (789 + 153)   393 (128 + 250)      110 (105 + 5)
    m    1443    941 (789 + 150)   394 (128 + 251)      108 (103 + 5)

Here's the "command" I ran:

    for pat in 'Include/' 'Include/*.h' 'Include/cpython/*.h' 'Include/internal/*.h'; do
        echo " -- $pat --"
        echo $(git grep 'PyAPI_FUNC(' -- $pat | wc -l) '('$(git grep 'PyAPI_FUNC(.*) [^_]' -- $pat | wc -l) '+' $(git grep 'PyAPI_FUNC(.*) [_]' -- $pat | wc -l)')'
    done

> Every one of these functions represents a maintenance burden.
> Removing them is painful and takes a lot of effort, but adding them is
> done casually, without a PEP or, in many cases, even a review.

I agree, with regards to the public C-API, particularly the stable API.

> We need to address what to do about the C API in the long term, but for
> now can we just stop making it larger?

Please.
> Also, can we remove all the new API functions added in 3.9 before the
> release and it is too late?

In 3.9 we have added 9 functions to the public C-API and removed 19 from the "private" C-API. The "internal" C-API grew by 31, but I don't see the point in changing any of those.

-eric
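For anyone who wants to reproduce these counts without a shell handy, here is a rough Python equivalent of the git-grep loop above. The regexes are my approximation of the same public/"private" split (a declared name starting with "_" counts as private), and the counts are approximate in the same ways the grep is (macros, wrapped declarations, etc.):

```python
# Rough Python equivalent of the shell loop above. The regex split is an
# assumption mirroring the grep patterns: a PyAPI_FUNC declaration whose
# name starts with "_" is counted as "private".
import re
from pathlib import Path

PRIVATE = re.compile(r'PyAPI_FUNC\(.*\)\s*_')
PUBLIC = re.compile(r'PyAPI_FUNC\(.*\)\s*[^_\s]')

def count_api_funcs(root):
    pub = priv = 0
    for path in Path(root).rglob('*.h'):
        for line in path.read_text(errors='ignore').splitlines():
            if PRIVATE.search(line):
                priv += 1
            elif PUBLIC.search(line):
                pub += 1
    return pub, priv

# Tiny self-check on a synthetic header instead of a real CPython checkout.
import tempfile
tmp = Path(tempfile.mkdtemp())
(tmp / 'demo.h').write_text(
    'PyAPI_FUNC(int) Py_Public(void);\n'
    'PyAPI_FUNC(PyObject *) _Py_Private(void);\n'
)
pub, priv = count_api_funcs(tmp)
```

Point it at Include/, Include/cpython/, and Include/internal/ in a CPython checkout to get the per-directory breakdown from the tables above.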
[Python-Dev] Re: Summary of Python tracker Issues
On Fri, May 29, 2020 at 12:16 PM Python tracker wrote:
> ACTIVITY SUMMARY (2020-05-22 - 2020-05-29)
> Python tracker at https://bugs.python.org/
>
> To view or respond to any of the issues listed below, click on the issue.
> Do NOT respond to this message.
>
> Issues counts and deltas:
>   open    7487 ( +9)
>   closed 45080 (+80)
>   total  52567 (+89)
>
> ...

How hard would it be to add PRs (in the same way) to this weekly report? Also, where is the script for this hosted, and where is its source repo (if any)? It might be helpful to have a link back to that info, perhaps somewhere in the devguide.

-eric
[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workaround
On Thu, May 7, 2020 at 2:50 AM Emily Bowman wrote:
> While large object copies are fairly fast -- I wouldn't say trivial, a
> gigabyte copy will introduce noticeable lag when processing enough of
> them -- the flip side of having large objects is that you want to avoid
> having so many copies that you run into memory pressure and the dreaded
> swapping. A multiprocessing engine that's fully parallel, where every
> fork takes chunks of data and does everything needed to them, won't
> gain much from zero-copy as long as memory limits aren't hit. But a
> pipeline of processing would involve many copies, especially if you
> have a central dispatch thread that passes things from stage to stage.
> This is a big deal where stages may run longer or slower at any time,
> especially in low-latency applications, like video conferencing, where
> dispatch needs the flexibility to skip steps or add extra workers to
> shove a frame out the door, and using signals to interact with separate
> processes to tell them to do so means more latency and overhead.
>
> Not that I'm recommending someone go out and make a pure Python
> videoconferencing unit right now, but it's a use case I'm familiar
> with. (Since I use Python to test new ideas before converting them
> into C++.)

Thanks for the insight, Emily (and everyone else). It's really helpful to get many different expert perspectives on the matter. I am definitely not an expert on big-data/high-performance use cases so, personally, I rely on folks like Nathaniel, Travis Oliphant, and yourself. The more, the better. :)

Again, thanks!

-eric
[Python-Dev] Re: Latest PEP 554 updates.
On Wed, May 6, 2020 at 2:25 PM Jeff Allen wrote:
> Many thanks for working on this so carefully for so long. I'm happy to
> see the per-interpreter GIL will now be studied fully before final
> commitment to subinterpreters in the stdlib. I would have chipped in in
> those terms to the review, but others successfully argued for
> "provisional" inclusion, and I was content with that.

No problem. :)

> My reason for worrying about this is that, while the C-API has been
> there for some time, it has not had heavy use in taxing cases AFAIK,
> and I think there is room for it to be incorrect. I am thinking more
> about Jython than CPython, but ideally they are the same structures.
> When I put the structures to taxing use cases on paper, they don't seem
> quite to work. Jython has been used in environments with thread-pools,
> concurrency, and multiple interpreters, and this aspect has had to be
> "fixed" several times.

That insight would be super helpful and much appreciated. :) Is that all in the docs you've linked?

> My use cases include sharing objects between interpreters, which I know
> the PEP doesn't. The C-API docs acknowledge that object sharing can't
> be prevented, but do their best to discourage it because of the hazards
> around allocation. Trouble is, I think it can happen unawares. The fact
> that Java takes on lifecycle management suggests it shouldn't be a
> fundamental problem in Jython. I know from other discussion it's where
> many would like to end up, even in CPython.

Yeah, for now we will strictly disallow sharing actual objects between interpreters in Python code. It would be an interesting project to try loosening that at some point (especially with immutable types), but we're going to start from the safer position. We have no plans to add any similar restrictions to the C-API, where you're typically much more free to shoot your own foot. :)

> This is all theory: I don't have even a model implementation, so I
> won't pontificate. However, I do have pictures, without which I find it
> impossible to think about this subject. I couldn't find your pictures,
> so I share mine here (WiP):
>
> https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#runtime-thread-and-interpreter-cpython
>
> I would be interested in how you solve the problem of finding the
> current interpreter, discussed in the article. My preferred answer is:
>
> https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#critical-structures-revisited
>
> That's the API change I think is needed. It might not have a visible
> effect on the PEP, but it's worth bearing in mind the risk of exposing
> a thing you might shortly find you want to change.

This is great stuff, Jeff! Thanks for sharing it. I was able to skim through but don't have time to dig in at the moment. I'll reply in detail as soon as I can.

In the meantime, the implementation of PEP 554 exposes a single part of PyInterpreterState: the ID (an int). The only other internal-ish info we expose is whether or not an interpreter (by ID) is currently running. The only functionality we provide is: create, destroy, and run_string().

-eric
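To make that minimal create/run_string/destroy surface concrete, here is a toy stand-in. `FakeInterpreter` is entirely invented for illustration: it isolates only a globals namespace, whereas a real subinterpreter isolates modules, sys state, and all other per-interpreter runtime state.

```python
# Toy stand-in for the minimal PEP 554 surface (create, run a code
# string, close). "FakeInterpreter" is invented for illustration only:
# a real subinterpreter isolates far more than a globals dict.
class FakeInterpreter:
    _next_id = 0

    def __init__(self):
        self.id = FakeInterpreter._next_id   # the one exposed field: an int ID
        FakeInterpreter._next_id += 1
        self._ns = {"__builtins__": __builtins__}
        self._closed = False

    def run(self, code):
        if self._closed:
            raise RuntimeError("interpreter is closed")
        exec(code, self._ns)                 # names defined stay in this namespace

    def close(self):
        self._closed = True


interp = FakeInterpreter()
interp.run("x = 40 + 2")
result = interp._ns["x"]
interp.close()
```

The point of the sketch is the shape of the API, not the mechanism: code runs by source string, state stays inside the interpreter, and a closed interpreter refuses further work.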
[Python-Dev] Re: Latest PEP 554 updates.
On Mon, May 4, 2020 at 11:30 AM Eric Snow wrote:
> Further feedback is welcome, though I feel like the PR is ready (or
> very close to ready) for pronouncement. Thanks again to all.

FYI, after consulting with the steering council, I've decided to change the target release to 3.10, when we expect to have the per-interpreter GIL landed. That will help maximize the impact of the module and avoid any confusion.

I'm undecided on releasing a 3.9-only module on PyPI. If I do, it will only be for folks to try it out early, and I probably won't advertise it much.

-eric
[Python-Dev] Re: Latest PEP 554 updates.
On Mon, May 4, 2020 at 1:22 PM Paul Moore wrote:
> One thing I would like to see is a comment confirming that, as part of
> the implementation, all stdlib modules will be made
> subinterpreter-safe.

Yeah, I'd meant to put in a note. I'll add one. Thanks!

-eric
[Python-Dev] Re: Latest PEP 554 updates.
On Mon, May 4, 2020 at 11:30 AM Eric Snow wrote:
> Further feedback is welcome, though I feel like the PR is ready (or
> very close to ready) for pronouncement. Thanks again to all.

Oops. s/the PR is ready/the PEP is ready/

-eric
[Python-Dev] Latest PEP 554 updates.
Hi all,

Thanks for the great feedback. I've updated PEP 554 (Multiple Interpreters in the Stdlib) following feedback.

https://www.python.org/dev/peps/pep-0554/

Here's a summary of the main changes:

* [API] dropped/deferred the "release" and "close" methods from RecvChannel and SendChannel (they were unnecessary and the "association" stuff was too confusing)
* [API] dropped RecvChannel/SendChannel.interpreters
* [API] dropped/deferred SendChannel.send_buffer()
* [API] renamed Interpreter.destroy() to Interpreter.close()
* [API] added a per-interpreter "isolated" mode (default: on)
* added a section about "Help for Extension Module Maintainers"
* added a section about documentation
* added many entries to the "deferred" and "rejected" sections

Further feedback is welcome, though I feel like the PR is ready (or very close to ready) for pronouncement. Thanks again to all.

-eric

------

PEP: 554
Title: Multiple Interpreters in the Stdlib
Author: Eric Snow
BDFL-Delegate: Antoine Pitrou
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2017-09-05
Python-Version: 3.9
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017, 09-May-2018, 20-Apr-2020, 01-May-2020

Abstract
========

CPython has supported multiple interpreters in the same process (AKA "subinterpreters") since version 1.5 (1997). The feature has been available via the C-API. [c-api]_ Subinterpreters operate in relative isolation from one another, which facilitates novel alternative approaches to concurrency.

This proposal introduces the stdlib ``interpreters`` module. The module will be provisional. It exposes the basic functionality of subinterpreters already provided by the C-API, along with new (basic) functionality for sharing data between interpreters.

A Disclaimer about the GIL
==========================

To avoid any confusion up front: This PEP is unrelated to any efforts to stop sharing the GIL between subinterpreters.
At most, this proposal will allow users to take advantage of any results of work on the GIL. The position here is that exposing subinterpreters to Python code is worth doing, even if they still share the GIL.

Proposal
========

The ``interpreters`` module will be added to the stdlib. To help authors of extension modules, a new page will be added to the "Extending Python" docs. More information on both is found in the immediately following sections.

The "interpreters" Module
-------------------------

The ``interpreters`` module will provide a high-level interface to subinterpreters and wrap a new low-level ``_interpreters`` module (in the same way as the ``threading`` module). See the `Examples`_ section for concrete usage and use cases.

Along with exposing the existing (in CPython) subinterpreter support, the module will also provide a mechanism for sharing data between interpreters. This mechanism centers around "channels", which are similar to queues and pipes.

Note that *objects* are not shared between interpreters since they are tied to the interpreter in which they were created. Instead, the objects' *data* is passed between interpreters. See the `Shared data`_ section for more details about sharing between interpreters.

At first only the following types will be supported for sharing:

* None
* bytes
* str
* int
* PEP 554 channels

Support for other basic types (e.g. bool, float, Ellipsis) will be added later.

API summary for interpreters module
-----------------------------------

Here is a summary of the API for the ``interpreters`` module. For a more in-depth explanation of the proposed classes and functions, see the `"interpreters" Module API`_ section below.

For creating and using interpreters:

+----------------------------------------------+----------------------------------------------+
| signature                                    | description                                  |
+==============================================+==============================================+
| ``list_all() -> [Interpreter]``              | Get all existing interpreters.              |
+----------------------------------------------+----------------------------------------------+
| ``get_current() -> Interpreter``             | Get the currently running interpreter.      |
+----------------------------------------------+----------------------------------------------+
| ``get_main() -> Interpreter``                | Get the main interpreter.                   |
+----------------------------------------------+----------------------------------------------+
| ``create(*, isolated=True) -> Interpreter``  | Initialize a new (idle) Python interpreter. |
+----------------------------------------------+----------------------------------------------+
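The "objects are not shared, their *data* is" rule above can be illustrated with plain pickling. Pickle is used here only as a stand-in transport (the PEP does not specify the channel mechanism); the point is that the receiving side always gets an equal-but-distinct object:

```python
# Illustrating "objects are not shared, their *data* is passed": whatever
# the channel transport actually is, the receiving interpreter ends up
# with an equal but distinct object. Pickle is only a stand-in here.
import pickle

payload = "spam" * 100           # str is one of the initially shareable types
wire = pickle.dumps(payload)     # data leaving the "sending" interpreter
received = pickle.loads(wire)    # data arriving in the "receiving" one

same_value = received == payload    # the data round-trips intact
same_object = received is payload   # but it is a fresh object, not a shared one
```

This is also why the initial shareable types are limited to simple immutable values: their identity does not matter, only their data.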
[Python-Dev] Re: PEP 554 comments
On Wed, Apr 29, 2020, 22:05 Greg Ewing wrote:
> > Furthermore, IMHO "release" is better at communicating the
> > per-interpreter nature than "close".
>
> Channels are a similar enough concept to pipes that I think
> it would be confusing to have "close" mean "close for all
> interpreters". Everyone understands that "closing" a pipe
> only means you're closing your reference to one end of it,
> and they will probably assume closing a channel means the
> same.

FWIW, I'd compare channels more closely to queues than to pipes.

-eric
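The queue analogy can be sketched with the stdlib's own queue.Queue: like a PEP 554 channel, a queue carries discrete objects and any number of consumers can take turns receiving, whereas a pipe is a single byte stream between two fixed ends. This uses threads only because queue.Queue is thread-based; it is an analogy, not the channel implementation:

```python
# Why "queue" is the closer analogy: a queue carries discrete objects
# and multiple consumers can each receive items, unlike a pipe's single
# byte stream between two fixed ends.
import queue
import threading

ch = queue.Queue()
received = []
lock = threading.Lock()

def consumer():
    item = ch.get()            # each receiver pops one discrete item
    with lock:
        received.append(item)

threads = [threading.Thread(target=consumer) for _ in range(2)]
for t in threads:
    t.start()
ch.put("eggs")
ch.put("spam")
for t in threads:
    t.join()
```

Each consumer gets exactly one whole object, with no framing or byte-stream parsing, which is the behavior channels aim for.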
[Python-Dev] Re: PEP 554 for 3.9 or 3.10?
Thanks for the thoughtful post! I'm going to address some of your comments here and some in a separate discussion in the next few days.

On Wed, Apr 29, 2020 at 10:36 AM Sebastian Berg wrote:
> While I still think it is probably not part of PEP 554 as such, I guess
> it needs a full-blown PEP on its own, saying that Python should
> implement subinterpreters. (I am saying "implement" because I believe
> you must consider subinterpreters basically a non-feature at this time.
> It has neither users nor reasonable ecosystem support.)

FWIW, at this point it would be hard to justify removing the existing public subinterpreters C-API. There are several large public projects using it and likely many more private ones we do not know about. That's not to say that alone justifies exposing the C-API, of course. :)

> In many ways I assume that a lot of the ground work for subinterpreters
> was useful on its own.

There has definitely been a lot of code-health effort related to the CPython runtime code, partly motivated by this project. :)

> But please do not underestimate how much effort
> it will take to make subinterpreters first-class citizens in the
> language!

If you are talking about the CPython side, most of the work is already done. The implementation of PEP 554 is nearly complete, and subinterpreter support in the runtime has only a few rough edges to buff out. The big question is the effort it will demand of the Python community, which is the point Nathaniel has been emphasizing (understandably).

> Believe me, I have been there and it's tough to write these documents
> and then get feedback which you are not immediately sure what to make
> of. Thus, I hope those supporting the idea of subinterpreters will help
> you out and formulate a better framework and clarify PEP 554 when it
> comes to the fuzzy long-term user-impact side of the PEP.

FYI, I started working on this project in 2015 and proposed PEP 554 in 2017. This is actually the 6th round of discussion since then. :)

-eric
[Python-Dev] Re: PEP 554 for 3.9 or 3.10?
Thanks for the great insights into PyObjC!

On Wed, Apr 29, 2020 at 9:02 AM Ronald Oussoren wrote:
> I don't know how much the move of global state to per-interpreter state
> affects extensions, other than references to singletons and static
> types.

That's the million-dollar question. :)

FYI, one additional challenge is when an extension module depends on a third-party C library which itself keeps global state that might leak between subinterpreters. The Cryptography project ran into this several years ago with OpenSSL, and they were understandably grumpy about it.

> But with some macro trickery that could be made source-compatible for
> extensions.

Yeah, that's one approach we've discussed in the past (e.g. at the last core sprint).

-eric
[Python-Dev] Re: PEP 554 for 3.9 or 3.10?
On Wed, Apr 29, 2020 at 6:27 AM Julien Salort wrote:
> If your proposal leads to an intelligible actual error, and a clear
> warning in the documentation, instead of a silent crash, this sounds
> like progress, even for those packages which won't work on
> subinterpreters anytime soon...

That's helpful. Thanks!

-eric
[Python-Dev] Re: PEP 554 for 3.9 or 3.10?
On Wed, Apr 29, 2020 at 1:52 AM Paul Moore wrote:
> One thing that isn't at all clear to me here is that when you say
> "Subinterpreters run all Python code", do you *just* mean the core
> language? Or the core language plus all builtins? Or the core
> language, builtins and the standard library? Because I think that the
> vast majority of users would expect a core/stdlib function like
> subinterpreters to support the full core+stdlib language.

Agreed.

> So my question would be, do all of the stdlib C extension modules
> support subinterpreters[1]? If they don't, then I think it's very
> reasonable to expect that to be fixed, in the spirit of "eating our
> own dogfood" - if we aren't willing or able to make the stdlib support
> subinterpreters, it's not exactly reasonable or fair to expect 3rd
> party extensions to do so.

That is definitely the right question. :) Honestly, I had not thought of it that way (nor checked, of course). While many stdlib modules have been updated to use heap types (see PEP 384) and to support PEP 489 (Multi-phase Extension Module Initialization), there are still a few stragglers. Furthermore, I expect that there are a few modules that would give us trouble (maybe ssl, cdecimal). It's all about global state that gets shared inadvertently between subinterpreters. Probably the best way to find out is to run the entire test suite in a subinterpreter. I'll do that as soon as I can.

> If, on the other hand, the stdlib *is* supported, then I think that
> "all of Python and the stdlib, plus all 3rd party pure Python
> packages" is a significant base of functionality, and an entirely
> reasonable starting point for the feature.

Yep, that's what I meant. I just need to identify the modules where we need fixes. Thanks for bringing this up!

> It certainly still excludes
> big parts of the Python ecosystem (notably scientific / data science
> users) but that seems fine to me - big extension users like those can
> be expected to have additional limitations. It's not really that
> different from the situation around C extension support in PyPy.

Agreed.

-eric