[Python-Dev] PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 2)

2022-02-18 Thread Eric Snow
Thanks to all those that provided feedback.  I've worked to
substantially update the PEP in response.  The text is included below.
Further feedback is appreciated.

-eric



PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow, Eddie Elizondo

Discussions-To:
https://mail.python.org/archives/list/python-dev@python.org/thread/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History: 15-Feb-2022
Resolution:


Abstract
========

Currently the CPython runtime maintains a
small amount of mutable state in the
allocated memory of each object.  Because of this, otherwise immutable
objects are actually mutable.  This can have a large negative impact
on CPU and memory performance, especially for approaches to increasing
Python's scalability.  The solution proposed here provides a way
to mark an object as one for which that per-object
runtime state should not change.

Specifically, if an object's refcount matches a very specific value
(defined below) then that object is treated as "immortal".  If an object
is immortal then its refcount will never be modified by ``Py_INCREF()``,
etc.  Consequently, the refcount will never reach 0, so that object will
never be cleaned up (unless explicitly done, e.g. during runtime
finalization).  Additionally, all other per-object runtime state
for an immortal object will be considered immutable.
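
The rule described above can be illustrated with a toy model.  This is
only a sketch in pure Python, not CPython's C implementation: the
``ToyObject`` class, the ``incref()``/``decref()`` helpers, and the
``IMMORTAL_REFCNT`` sentinel are hypothetical names chosen for
illustration, and the sentinel value is made up (the PEP defines the
real one later).

```python
# Toy model of the immortality check; NOT CPython's actual implementation.
# IMMORTAL_REFCNT is a made-up sentinel; the PEP specifies the real value.
IMMORTAL_REFCNT = 1 << 60


class ToyObject:
    """Stand-in for an object header that carries a refcount."""

    def __init__(self, name, refcnt=1):
        self.name = name
        self.refcnt = refcnt
        self.deallocated = False


def is_immortal(obj):
    return obj.refcnt == IMMORTAL_REFCNT


def incref(obj):
    # Immortal objects are left untouched: no write, so no mutation.
    if is_immortal(obj):
        return
    obj.refcnt += 1


def decref(obj):
    if is_immortal(obj):
        return
    obj.refcnt -= 1
    if obj.refcnt == 0:
        obj.deallocated = True  # a real runtime would free the memory here


mortal = ToyObject("temp")
immortal = ToyObject("None-like singleton", refcnt=IMMORTAL_REFCNT)

# One incref followed by two decrefs drops a mortal object to zero,
# while the immortal object's refcount never moves.
for obj in (mortal, immortal):
    incref(obj)
    decref(obj)
    decref(obj)

print(mortal.refcnt, mortal.deallocated)
print(is_immortal(immortal), immortal.deallocated)
```

The only point of the sketch is that the immortal branch performs no
write at all, which is what lets the object's memory stay unchanged.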

This approach has some possible negative impact, which is explained
below, along with mitigations.  A critical requirement for this change
is that the performance regression be no more than 2-3%.  Anything worse
than performance-neutral requires that the other benefits are proportionally
large.  Aside from specific applications, the fundamental improvement
here is that now an object can be truly immutable.

(This proposal is meant to be CPython-specific and to affect only
internal implementation details.  There are some slight exceptions
to that which are explained below.  See `Backward Compatibility`_,
`Public Refcount Details`_, and `scope`_.)


Motivation
==========

As noted above, currently all objects are effectively mutable.  That
includes "immutable" objects like ``str`` instances.  This is because
every object's refcount is frequently modified as the object is used
during execution.  This is especially significant for a number of
commonly used global (builtin) objects, e.g. ``None``.  Such objects
are used a lot, both in Python code and internally.  That adds up to
a consistent high volume of refcount changes.
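
A small experiment makes this visible.  It is a CPython-specific sketch
that relies on ``sys.getrefcount()``; the variable names are arbitrary,
and a fresh ``object()`` is used rather than ``None`` so that the result
does not depend on how a given CPython version treats its singletons.

```python
import sys

value = object()                  # a fresh, ordinary object
before = sys.getrefcount(value)
holders = [value] * 5             # five new references to the same object
after = sys.getrefcount(value)

# Merely storing the object in a list wrote to the object's own memory.
print("refcount grew by", after - before)
```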

The effective mutability of all Python objects has a concrete impact
on parts of the Python community, e.g. projects that aim for
scalability like Instagram or the effort to make the GIL
per-interpreter.  Below we describe several ways in which refcount
modification has a real negative effect on such projects.
None of that would happen for objects that are truly immutable.

Reducing CPU Cache Invalidation
-------------------------------

Every modification of a refcount causes the corresponding CPU cache
line to be invalidated.  This has a number of effects.

For one, the write must be propagated to other cache levels
and to main memory.  This has a small effect on all Python programs.
Immortal objects would provide a slight relief in that regard.

On top of that, multi-core applications pay a price.  If two threads
(running simultaneously on distinct cores) are interacting with the
same object (e.g. ``None``)  then they will end up invalidating each
other's caches with each incref and decref.  This is true even for
otherwise immutable objects like ``True``, ``0``, and ``str`` instances.
CPython's GIL helps reduce this effect, since only one thread runs at a
time, but it doesn't completely eliminate the penalty.

Avoiding Data Races
-------------------

Speaking of multi-core, we are considering making the GIL
a per-interpreter lock, which would enable true multi-core parallelism.
Among other things, the GIL currently protects against races between
multiple concurrent threads that may incref or decref the same object.
Without a shared GIL, two running interpreters could not safely share
any objects, even otherwise immutable ones like ``None``.

This means that, to have a per-interpreter GIL, each interpreter must
have its own copy of *every* object.  That includes the singletons and
static types.  We have a viable strategy for that but it will require
a meaningful amount of extra effort and extra complexity.

The alternative is to ensure that all shared objects are truly immutable.
There would be no races because there would be no modification.  This
is something that the immortality proposed here would enable for
otherwise immutable objects.  With immortal objects,
support for a per-interpreter GIL
becomes much simpler.

Avoiding Copy-on-Write
----------------------

For some applications it makes sense to get the application into
a 

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-18 Thread Eric Snow
On Wed, Feb 16, 2022 at 11:06 AM Larry Hastings wrote:
> I experimented with this at the EuroPython sprints in Berlin years ago.  I 
> was sitting next to MvL, who had an interesting observation about it.

Classic MvL! :)

>  He suggested(*) all the constants unmarshalled as part of loading a module 
> should be "immortal", and if we could rejigger how we allocated them to store 
> them in their own memory pages, that would dovetail nicely with COW 
> semantics, cutting down on the memory use of preforked server processes.

Cool idea.  I may mention it in the PEP as a possibility.  Thanks!

-eric
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2PRODXEVNO53YYFRL6JUWZQF77WOYS4C/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Github Issues Migration is coming soon

2022-02-18 Thread Łukasz Langa
I just published the migration plan and a call to action on Discourse:
https://discuss.python.org/t/github-issues-migration-is-coming-soon/13791 


--
See you there,
Łukasz Langa
CPython Developer in Residence
Python Software Foundation



Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/YC3ZS27CRPA5JHCKTHYTRKA3IFONTIJ6/


[Python-Dev] Summary of Python tracker Issues

2022-02-18 Thread Python tracker

ACTIVITY SUMMARY (2022-02-11 - 2022-02-18)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    7218 (+40)
  closed 51299 (+27)
  total  58517 (+67)

Open issues with patches: 2932 


Issues opened (52)
==================

#46725: Unpacking without parentheses is allowed since 3.9
https://bugs.python.org/issue46725  opened by pablogsal

#46726: Thread spuriously marked dead after interrupting a join call
https://bugs.python.org/issue46726  opened by Kevin Shweh

#46727: Should shutil functions support bytes paths?
https://bugs.python.org/issue46727  opened by Jelle Zijlstra

#46729: Better str() for BaseExceptionGroup
https://bugs.python.org/issue46729  opened by iritkatriel

#46731: posix._fcopyfile flags addition
https://bugs.python.org/issue46731  opened by devnexen

#46732: object.__bool__ docstring is wrong
https://bugs.python.org/issue46732  opened by Jelle Zijlstra

#46733: pathlib.Path methods can raise NotImplementedError
https://bugs.python.org/issue46733  opened by barneygale

#46734: Add Maildir.get_flags() to access message flags without openin
https://bugs.python.org/issue46734  opened by gildea

#46735: gettext.translations crashes when locale is unset
https://bugs.python.org/issue46735  opened by amazingminecrafter2015

#46736: Generate HTML 5 with SimpleHTTPRequestHandler.list_directory
https://bugs.python.org/issue46736  opened by dom1310df

#46740: Improve Telnetlib's throughput
https://bugs.python.org/issue46740  opened by martin_kirch

#46742: Add '-d $fd' option to trace module, akin to bash -x feature
https://bugs.python.org/issue46742  opened by PenelopeFudd

#46743: Enable usage of object.__orig_class__ in __init__
https://bugs.python.org/issue46743  opened by Gobot1234

#46744: installers on ARM64 suggest wrong folders
https://bugs.python.org/issue46744  opened by conio

#46746: IDLE: Consistently handle non .py source files
https://bugs.python.org/issue46746  opened by terry.reedy

#46748: Python.h includes stdbool.h
https://bugs.python.org/issue46748  opened by petr.viktorin

#46749: Support cross compilation on macOS
https://bugs.python.org/issue46749  opened by autoantwort

#46750: some code paths in ssl and _socket still import idna unconditi
https://bugs.python.org/issue46750  opened by slingamn

#46751: Windows-style path is not recognized under cygwin
https://bugs.python.org/issue46751  opened by mikekaganski

#46752: Introduce task groups to asyncio and change task cancellation 
https://bugs.python.org/issue46752  opened by gvanrossum

#46753: Statically allocate and initialize the empty tuple.
https://bugs.python.org/issue46753  opened by eric.snow

#46754: Improve Python Language Reference based on [Köhl 2020]
https://bugs.python.org/issue46754  opened by gvanrossum

#46755: QueueHandler logs stack_info twice
https://bugs.python.org/issue46755  opened by erik.montnemery

#46756: Incorrect authorization check in urllib.request
https://bugs.python.org/issue46756  opened by serhiy.storchaka

#46757: dataclasses should define an empty __post_init__
https://bugs.python.org/issue46757  opened by NeilGirdhar

#46758: Incorrect behaviour creating a Structure with ctypes.c_bool bi
https://bugs.python.org/issue46758  opened by dudenwatschn

#46759: sys.excepthook documentation doesn't mention that it isn't cal
https://bugs.python.org/issue46759  opened by cjwatson

#46760: test_dis should test the dis module, not everything else
https://bugs.python.org/issue46760  opened by Mark.Shannon

#46761: functools.update_wrapper breaks the signature of functools.par
https://bugs.python.org/issue46761  opened by larry

#46763: os.path.samefile incorrect results for shadow copies
https://bugs.python.org/issue46763  opened by nijave

#46764: Wrapping a bound method with a @classmethod no longer works
https://bugs.python.org/issue46764  opened by msullivan

#46765: Replace Locally Cached Strings with Statically Initialized Obj
https://bugs.python.org/issue46765  opened by eric.snow

#46767: [Doc] sqlite3 Cursor.execute() return value is unspecified
https://bugs.python.org/issue46767  opened by kephas

#46769: Improve documentation for `typing.TypeVar`
https://bugs.python.org/issue46769  opened by AlexWaygood

#46770: ConfigParser(dict_type=) not behaving as expected
https://bugs.python.org/issue46770  opened by malonn

#46771: Add some form of cancel scopes
https://bugs.python.org/issue46771  opened by gvanrossum

#46772: Statically Initialize PyArg_Parser in clinic.py
https://bugs.python.org/issue46772  opened by eric.snow

#46773: Add a Private API for Looking Up Global Objects
https://bugs.python.org/issue46773  opened by eric.snow

#46774: Importlib.metadata.version picks first distribution not latest
https://bugs.python.org/issue46774  opened by kkirsche-github

#46775: [Windows] OSError should unconditionally call winerror_to_errn

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-18 Thread Eric Snow
On Wed, Feb 16, 2022 at 8:45 PM Inada Naoki wrote:
> Is there any common tool that utilizes CoW via mmap?
> If you know of one, please add a link to it in the PEP.
> If there is no common tool, most Python users can get benefit from this.

Sorry, I'm not aware of any, but I also haven't researched the topic
much.  Regardless, that would be a good line of inquiry.  A reference
like that would probably help make the PEP a bit more justifiable
without per-interpreter GIL. :)

> Generally speaking, fork is a legacy API. It is too difficult to know
> which library is fork-safe, even for stdlibs. And Windows users can
> not use fork.
> Optimizing for non-fork use case is much better than optimizing for
> fork use cases.

+1

> I hope per-interpreter GIL replaces fork use cases.

Yeah, that's definitely one big benefit.

> But tools using CoW without fork are also welcome, especially if they
> support Windows.

+1

> Anyway, I don't believe stopping refcounting will fix the CoW issue
> yet. See this article [1] again.
>
> [1] 
> https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

That's definitely an important point, given that the main objective of
the proposal is to allow disabling mutation of runtime-internal object
state so that some objects can be made truly immutable.

I'm sure Eddie has some good insight on the matter (and may have even
been involved in writing that article).  Eddie?

> Note that they failed to fix CoW by stopping refcounting code objects! (*)
> Most CoW was caused by cyclic GC and finalization caused most CoW.

That's a good observation!

> (*) It is not surprising to me because the eval loop doesn't incref/decref
> most code attributes. It borrows references from the code object.

+1

> So we need a sample application and profile it, before saying it fixes CoW.
> Could you provide some data, or drop the CoW issue from this PEP until
> it is proved?

We'll look into that.

-eric
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ESRBMP4WTNONED3K6Z5HMYYY2WIMQZT3/


[Python-Dev] Re: Compiling of ast.Module in Python 3.10 and co_firstlineno behavior

2022-02-18 Thread Mark Shannon

Hi Fabio,

On 17/02/2022 7:30 pm, Fabio Zadrozny wrote:


> On Thu, Feb 17, 2022 at 16:05, Mark Shannon <m...@hotpy.org> wrote:
>
>> Hi Fabio,
>>
>> This happened as part of implementing PEP 626.
>> The previous behavior isn't very robust w.r.t. doc strings and
>> compiler optimizations.
>>
>> OOI, why would you want to revert to the old behavior?


> Hi Mark,
>
> The issue I'm facing is that ipython uses an approach of obtaining the ast for
> a function to be executed and then it goes on node by node executing it.
>
> When running in the debugger, the debugger caches some information based on
> (co_firstlineno, co_name, co_filename) to have information saved across multiple calls to
> the same function, which works in general because each function in a given python file
> would have its own co_firstlineno, but in this specific case here it gets a single function
> and then recompiles it expression by expression -- so, it'll have the same co_filename
> () and the same co_name (), but then the co_firstlineno would be
> different (because the statement resides in a different line), but with Python 3.10 this
> assumption fails as even the co_firstlineno will be the same...
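
The collision described above can be reproduced with a short sketch. It is not the debugger's or ipython's actual code: the `cache_key()` helper and the `"<hypothetical-cell>"` filename are made up for illustration, and the two statements are compiled from source strings (padded with blank lines so each keeps its original line number) rather than from AST nodes.

```python
import sys


def cache_key(code):
    # The key described above: (co_firstlineno, co_name, co_filename).
    return (code.co_firstlineno, code.co_name, code.co_filename)


# Two statements taken from different lines of the same hypothetical cell.
# The leading newlines keep the second statement on line 4.
FILENAME = "<hypothetical-cell>"
first = compile("a = 1\n", FILENAME, "exec")
second = compile("\n\n\nb = 2\n", FILENAME, "exec")

print(sys.version_info[:2], cache_key(first), cache_key(second))
print("keys collide:", cache_key(first) == cache_key(second))
```

On Python 3.10 and later both code objects are expected to report the same `co_firstlineno`, so the two keys compare equal even though the statements come from different lines.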


A bit off topic, but why not use a different name for each cell?



> You can see the actual issues at:
> https://github.com/microsoft/vscode-jupyter/issues/8803
> https://github.com/ipython/ipykernel/issues/841/
> https://github.com/microsoft/debugpy/issues/844
>
> After tinkering a bit it seems it's possible to create a new code object based
> on an existing code object with `code.replace` (re-assembling the
> co_lnotab/co_firstlineno), so, I'm going to propose that as a fix to ipython,
> but I found it really strange that this did change in Python 3.10 in the first
> place as the old behavior seemed reasonable for me (i.e.: with the new behavior
> it's a bit strange that the user is compiling something with a single statement
> on line 99 and yet the resulting code object will have the co_firstlineno == 1).


That's the behavior for functions. If I define a function on line 10, but the 
first line of code in that function is on line 100, then 
`func.__code__.co_firstlineno == 10`, not 100. Modules start on line 1, by 
definition.

You can find the first line of actual code using the `co_lines()` iterator.

    firstline = next(mod.__code__.co_lines())[2]

Cheers,
Mark.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4JWS4QUENUSBWVXUFPNR5IWYFMC7AV53/


[Python-Dev] Re: Compiling of ast.Module in Python 3.10 and co_firstlineno behavior

2022-02-18 Thread Fabio Zadrozny
On Thu, Feb 17, 2022 at 17:55, Gabriele wrote:

> Hi Fabio
>
> Does the actual function object get re-created as well during the
> recompilation process that you have described? Perhaps it might help
> to note that the __code__ attribute of a function object f can be
> mutated and that f is hashable?
>

Thank you for the reminder... Right now, the way it works in ipython is
that the code object is recreated each time and then executed directly
(which kind of makes sense, since cells are expected to change between
re-evaluations).

I had previously considered caching in the debugger using the code object,
but as code objects can be created during the regular execution, the
debugger could end up creating a huge leak.

Best regards,

Fabio
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/Z5DH3HOV73OS2N3C4ZKYI4UB2WQYTS2I/