[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL

2021-12-15 Thread Nathaniel Smith
On Wed, Dec 15, 2021 at 3:07 AM Victor Stinner  wrote:
> I wrote https://bugs.python.org/issue39511 and
> https://github.com/python/cpython/pull/18301 to have per-interpreter
> None, True and False singletons. My change is backward compatible on
> the C API: you can still use "Py_None" in your C code. The code gets
> the singleton object from the current interpreter with a function
> call:
>
> #define Py_None Py_GetNone()
>
> Py_GetNone() is implemented as: "return _PyInterpreterState_GET()->none;"

It's backward compatible for the C API, but not for the stable C ABI
-- that exports Py_None directly as a symbol.

You also need a solution for all the static global PyTypeObjects in C
extensions. I don't think there's any API-compatible way to make those
heap-allocated.
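
Concretely, here's a minimal sketch of why the symbol export matters. It is
CPython-specific and relies on two things worth flagging: id() returning an
object's address is an implementation detail, and ctypes.pythonapi seeing the
interpreter's exported symbols assumes a typical shared-library build:

    import ctypes

    # The stable ABI exports the None object itself as a data symbol, so any
    # extension built against it has this address baked in -- exactly what a
    # per-interpreter None would break.
    none_struct = ctypes.c_char.in_dll(ctypes.pythonapi, "_Py_NoneStruct")
    print(ctypes.addressof(none_struct) == id(None))  # True on CPython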

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL

2021-12-15 Thread Nathaniel Smith
On Wed, Dec 15, 2021 at 2:21 AM Antoine Pitrou  wrote:
>
> On Wed, 15 Dec 2021 10:42:17 +0100
> Christian Heimes  wrote:
> > On 14/12/2021 19.19, Eric Snow wrote:
> > > A while back I concluded that neither approach would work for us.  The
> > > approach I had taken would have significant cache performance
> > > penalties in a per-interpreter GIL world.  The approach that modifies
> > > Py_INCREF() has a significant performance penalty due to the extra
> > > branch on such a frequent operation.
> >
> > Would it be possible to write the Py_INCREF() and Py_DECREF() macros in
> > a way that does not depend on branching? For example we could use the
> > highest bit of the ref count as an immutable indicator and do something like
> >
> >  ob_refcnt += !(ob_refcnt >> 63)
> >
> > instead of
> >
> >  ob_refcnt++
>
> Probably, but that would also issue spurious writes to immortal
> refcounts from different threads at once, so might end up worse
> performance-wise.

Unless the CPU is clever enough to skip claiming the cacheline in
exclusive-mode for a "+= 0". Which I guess is something you'd have to
check empirically on every microarch and instruction pattern you care
about, because there's no way it's documented. But maybe? CPUs are
very smart, except when they aren't.
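
As a pure-Python sketch of the arithmetic in Christian's trick (illustrative
only -- the bit layout and names here are made up, and real refcounts live in
C as Py_ssize_t):

    IMMORTAL_BIT = 1 << 63   # pretend the top refcount bit means "immortal"
    MASK = (1 << 64) - 1     # simulate 64-bit wraparound

    def incref(refcnt):
        # Adds 1 for mortal objects and 0 for immortal ones, with no branch.
        # In real memory the store to ob_refcnt still happens either way --
        # which is exactly Antoine's spurious-write concern.
        return (refcnt + ((refcnt >> 63) ^ 1)) & MASK

    print(incref(41) == 42)                              # True: mortal increments
    print(incref(IMMORTAL_BIT | 7) == (IMMORTAL_BIT | 7))  # True: immortal is a no-op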

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)

2021-12-14 Thread Nathaniel Smith
Whoops, never mind, I see that you started the "immortal objects"
thread to discuss this.

On Tue, Dec 14, 2021 at 4:54 PM Nathaniel Smith  wrote:
>
> How did you end up solving the issue where Py_None is a static global
> that's exposed as part of the stable C ABI?
>
> On Tue, Dec 14, 2021 at 9:13 AM Eric Snow  wrote:
> >
> > Hi all,
> >
> > I'm still hoping to land a per-interpreter GIL for 3.11.  There is
> > still a decent amount of work to be done but little of it will require
> > solving any big problems:
> >
> > * pull remaining static globals into _PyRuntimeState and PyInterpreterState
> > * minor updates to PEP 554
> > * finish up the last couple pieces of the PEP 554 implementation
> > * maybe publish a companion PEP about per-interpreter GIL
> >
> > There are also a few decisions to be made.  I'll open a couple of
> > other threads to get feedback on those.  Here I'd like your thoughts
> > on the following:
> >
> > Do we need a PEP about per-interpreter GIL?
> >
> > I haven't thought there would be much value in such a PEP.  There
> > doesn't seem to be any decision that needs to be made.  At best the
> > PEP would be an explanation of the project, where:
> >
> > * the objective has gotten a lot of support (and we're working on
> > addressing the concerns of the few objectors)
> > * most of the required work is worth doing regardless (e.g. improve
> > runtime init/fini, eliminate static globals)
> > * the performance impact is likely to be a net improvement
> > * it is fully backward compatible and the C-API is essentially unaffected
> >
> > So the value of a PEP would be in consolidating an explanation of the
> > project into a single document.  It seems like a poor fit for a PEP.
> >
> > (You might wonder, "what about PEP 554?"  I purposefully avoided any
> > discussion of the GIL in PEP 554.  Its purpose is to expose
> > subinterpreters to Python code.)
> >
> > However, perhaps I'm too close to it all.  I'd like your thoughts on the 
> > matter.
> >
> > Thanks!
> >
> > -eric
>
>
>
> --
> Nathaniel J. Smith -- https://vorpus.org



-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)

2021-12-14 Thread Nathaniel Smith
How did you end up solving the issue where Py_None is a static global
that's exposed as part of the stable C ABI?

On Tue, Dec 14, 2021 at 9:13 AM Eric Snow  wrote:
>
> Hi all,
>
> I'm still hoping to land a per-interpreter GIL for 3.11.  There is
> still a decent amount of work to be done but little of it will require
> solving any big problems:
>
> * pull remaining static globals into _PyRuntimeState and PyInterpreterState
> * minor updates to PEP 554
> * finish up the last couple pieces of the PEP 554 implementation
> * maybe publish a companion PEP about per-interpreter GIL
>
> There are also a few decisions to be made.  I'll open a couple of
> other threads to get feedback on those.  Here I'd like your thoughts
> on the following:
>
> Do we need a PEP about per-interpreter GIL?
>
> I haven't thought there would be much value in such a PEP.  There
> doesn't seem to be any decision that needs to be made.  At best the
> PEP would be an explanation of the project, where:
>
> * the objective has gotten a lot of support (and we're working on
> addressing the concerns of the few objectors)
> * most of the required work is worth doing regardless (e.g. improve
> runtime init/fini, eliminate static globals)
> * the performance impact is likely to be a net improvement
> * it is fully backward compatible and the C-API is essentially unaffected
>
> So the value of a PEP would be in consolidating an explanation of the
> project into a single document.  It seems like a poor fit for a PEP.
>
> (You might wonder, "what about PEP 554?"  I purposefully avoided any
> discussion of the GIL in PEP 554.  Its purpose is to expose
> subinterpreters to Python code.)
>
> However, perhaps I'm too close to it all.  I'd like your thoughts on the 
> matter.
>
> Thanks!
>
> -eric



-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Python multithreading without the GIL

2021-10-08 Thread Nathaniel Smith
On Thu, Oct 7, 2021 at 7:54 PM Sam Gross  wrote:
> Design overview:
> https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd9l0/edit

Whoa, this is impressive work.

I notice the fb.com address -- is this a personal project or something
facebook is working on? what's the relationship to Cinder, if any?

Regarding the tricky lock-free dict/list reads: I guess the more
straightforward approach would be to use a plain ol' mutex that's
optimized for this kind of fine-grained per-object lock with short
critical sections and minimal contention, like WTF::Lock. Did you try
alternatives like that? If so, I assume they didn't work well -- can
you give more details?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Worried about Python release schedule and lack of stable C-API

2021-09-28 Thread Nathaniel Smith
On Tue, Sep 28, 2021 at 12:40 AM Guido van Rossum  wrote:
>
> What I have heard repeatedly, from people who are paid to know, is that most 
> users don’t care about the latest features, and would rather stick to a 
> release until it becomes unsupported. (Extreme example: Python 2.)
>
> Numpy isn’t random, it’s at the bottom of the food chain for a large 
> ecosystem or two — if it doesn’t support a new Python release, none of its 
> dependent packages can even start porting. (I guess only Cython is even 
> lower, but it’s a build-time tool. And indeed it has supported 3.10 for a 
> long time.)

Well, no, it wasn't entirely random :-).

Being on the bottom of the food chain is important, but I don't think
it's the full story -- Tensorflow is also at the bottom of a huge
ecosystem. I think it's also related to NumPy being mostly
volunteer-run, which means they're sensitive to feedback from
individual enthusiasts, and enthusiasts are the most aggressive early
adopters. OTOH Tensorflow is a huge commercial collaboration, and
companies *hate* upgrading.

Either way though, it doesn't seem to be anything to do with CPython's
ABI stability or release cadence.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Worried about Python release schedule and lack of stable C-API

2021-09-28 Thread Nathaniel Smith
On Sun, Sep 26, 2021 at 3:38 AM  wrote:
> Open3D is an example. Will finally move to Python 3.9 some time the coming 
> month. Its dependency graph contains about 70 other packages.
>
> In this specific case, the underlying problem was that TensorFlow was stuck 
> at 3.8. The TensorFlow codebase got ported in November 2020, then released 
> early 2021. Then Open3D included the new Tensorflow (plus whatever else 
> needed to be adapted) in their codebase in May. They’re now going through 
> their release schedule, and their 0.14 release should be up on PyPI soon.

I took a minute to look up the release dates to fill in this timeline:

Python 3.9 released: October 2020
Tensorflow adds 3.9 support: November 2020
Tensorflow v2.5.0 released with the new 3.9 support: May 2021
Open3d adds 3.9 support: May 2021
First Open3d release to include the new 3.9 support: ~October 2021

So it seems like in this case at least, the year long delay consists
of ~1 month of porting work, and ~11 months of projects letting the
finished code sit in their repos without shipping to users.

It seems like the core problem here is that these projects don't
consider it important to keep up with the latest Python release. I'm
not sure what CPython upstream can do about that. Maybe you could
lobby these projects to ship releases more promptly?

By contrast, to pick a random library that uses the unstable C API
extensively, NumPy is already shipping wheels for 3.10 -- and 3.10
isn't even out yet. So it's certainly possible to do, even for
projects with a tiny fraction of Tensorflow's engineering budget.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Worried about Python release schedule and lack of stable C-API

2021-09-25 Thread Nathaniel Smith
On Sat, Sep 25, 2021 at 5:40 PM  wrote:
> PyPI packages and wheels are targeted to specific Python versions, which 
> means that any project that depends on some of the larger extension packages 
> (of which there are many, and many of which are must-have for many projects) 
> now start lagging Python versions by years, because somewhere deep down in 
> the dependency graph there is something that is still stuck at Python 3.8 
> (for example).

Can you give some examples of the packages you're thinking of, that
are prominent/must-have and stuck on years-old Pythons?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: python-dev thread w/ Marco Sulla

2021-08-16 Thread Nathaniel Smith
Was this post intended to go to python-dev or...?

On Mon, Aug 16, 2021 at 9:53 AM Brett Cannon  wrote:
>
> https://mail.python.org/archives/list/python-dev@python.org/thread/JRFJ4QH7TR35HFRQWOYPPCGOYRFAXK24/
>
> I can't be objective with Marco as I believe we have recorded issues with him 
> previously (as with Steven if you take Marco's initial side with this).
>
> The thing that pushed me over the edge to report this was 
> https://mail.python.org/archives/list/python-dev@python.org/message/O3JB3FE33KMT3OHZCVH3XO6VNJTGH5NL/.



-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 558, the simplest thing I could come up with

2021-07-29 Thread Nathaniel Smith
On Thu, Jul 29, 2021 at 4:52 PM Nick Coghlan  wrote:
>
> On Fri, 30 Jul 2021, 6:05 am Mark Shannon,  wrote:
>>
>> Hi Nick,
>>
>> Our discussion on PEP 558 got me thinking
>> "What is the simplest thing that would work?".
>>
>> This is what I came up (in the form of a draft PEP):
>> https://github.com/markshannon/peps/blob/pep-locals/pep-06xx.rst
>>
>> It doesn't have O(1) len(f_locals), and it does break
>> `PyEval_GetLocals()` but I think the that is a small price to pay for
>> simplicity and consistency.
>
>
> I don't think it is OK to break PyEval_GetLocals() when we really don't need 
> to,
> and the proposal also discards all the feedback that I received on earlier 
> iterations of PEP 558. (I particularly recommend reading Nathaniel's analysis 
> of why returning the proxy from locals() would be more likely to cause bugs 
> in existing code than it would be to eliminate any).

Heh, I was actually just re-reading PEP 558 and going to ask you to
include more details to justify the complexity, as compared to
something like Mark's latest proposal here -- I'd totally forgotten I
wrote that old post :-). So that was a timely reminder!

Looking at the references in the PEP, is this the writeup you're talking about?

https://mail.python.org/pipermail/python-dev/2019-May/157738.html

The conclusion there is:

> I'm leaning towards saying that on net, [snapshot] beats [PEP-minus-tracing]: 
> it's dramatically simpler, and the backwards incompatibilities that we've 
> found so far seem pretty minor, on par with what we do in every point 
> release. (In fact, in 3/4 of the cases I looked at, [snapshot] is actually 
> what users seemed to be trying to use in the first place.)
>
> For [proxy] versus [snapshot], a lot depends on what we think of changing the 
> semantics of exec(). [proxy] is definitely more consistent and elegant, and 
> if we could go back in time I think it's what we'd have done from the start. 
> Its compatibility is maybe a bit worse than [snapshot] on non-exec() cases, 
> but this seems pretty minor overall (it often doesn't matter, and if it does 
> just write dict(locals()) instead of locals(), like you would in non-function 
> scope). But the change in exec() semantics is an actual language change, even 
> though it may not affect much real code, so that's what stands out for me.

I *think* (please correct me if I'm wrong) that what that calls
[PEP-minus-tracing] is now corresponds to the current PEP draft, and
[proxy] corresponds to Mark's draft at the beginning of this thread?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Roundup to GitHub Issues migration

2021-06-22 Thread Nathaniel Smith
On Tue, Jun 22, 2021 at 4:42 AM Sebastian Rittau  wrote:
>
> Am 22.06.21 um 10:00 schrieb Tiziano Zito:
> > I think it is important to notice that GitHub actively blocks user
> > registration and activity from countries that are sanctioned by the US
> > government. At least in 2019 GitHub was blocking users from IPs
> > located in Cuba, North Korea, Syria, Crimea, Iran, etc. (see for
> > example [1]). They block, of course, users of any nationality, if they
> > happen to be traveling or living in those countries.
> >
> > I could not find any clear official statement from GitHub, but I think
> > this is something to consider nonetheless, especially now that the
> > Python community is making great efforts to become more welcoming and
> > diverse. The fact of excluding a significant part of the potential
> > contributors based on a random list by a random government over which
> > the Python community as a whole has no influence whatsoever seems a
> > move in the wrong direction.
>
> I was overall in favor of moving Python issues over to GitHub, for
> convenience, easier access, and a more usable interface. But I think the
> issue above is a showstopper. This problem of course already exists for
> pull requests, but discriminating against users based on their place of
> residence is absolutely unacceptable to me. In fact, it is directly in
> violation to the PSF's mission statement that says in part: "... to
> support and facilitate the growth of a diverse and international
> community of Python programmers." This issue hasn't been addressed in
> PEP 581, so I believe it wasn't considered when accepting the PEP. But
> it's serious enough that I would like to ask the steering council to
> reconsider their decision to accept PEP 581.

As much as we might wish otherwise, the PSF is also a US entity and
has to comply with US laws. GitHub's official policy at

   https://docs.github.com/en/github/site-policy/github-and-trade-controls

gives the impression that they're reading the law as narrowly as
possible, and allowing access to every person that they legally can.
In particular, that policy page claims that there are no restrictions
on users from Cuba or Iran, and that users from Syria and Crimea are
allowed to participate in OSS projects, just not give GitHub money.
(They do disallow use by North Koreans and "Specially Designated
Nationals".)

Is it even possible for the PSF to do better without breaking the law?
I'm not an expert in this area at all, so happy to be educated if
so...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Critique of PEP 657 -- Include Fine Grained Error Locations in Tracebacks

2021-05-20 Thread Nathaniel Smith
On Wed, May 19, 2021 at 7:28 PM Pablo Galindo Salgado
 wrote:
>>
>> Excellent point! Do you know how reliable this is in practice, i.e.
>> what proportion of bytecode source spans are something you can
>> successfully pass to ast.parse? If it works it's obviously nicer, but
>> I can't tell how often it works. E.g. anything including
>> return/break/continue/yield/await will fail, since those require an
>> enclosing context to be legal. I doubt return/break/continue will
>> raise exceptions often, but yield/await do all the time.
>
>
> All those limitations are compiler-time limitations because they imply
> scoping. A valid AST is any piece of a converted parse tree, or a piece
> of the PEG sub grammar:
>
> >>> ast.dump(ast.parse("yield"))
> 'Module(body=[Expr(value=Yield())], type_ignores=[])'
> >>> ast.dump(ast.parse("return"))
> 'Module(body=[Return()], type_ignores=[])'
> >>> ast.dump(ast.parse("continue"))
> 'Module(body=[Continue()], type_ignores=[])'
> >>> ast.dump(ast.parse("await x"))
> "Module(body=[Expr(value=Await(value=Name(id='x', ctx=Load(], 
> type_ignores=[])"

Ah, nice! I guess I was confused by memories of the behavior in 3.6
and earlier, where 'await' was a pseudokeyword:

❯ docker run -it --rm python:3.6-alpine
>>> import ast
>>> ast.parse("await f()")
SyntaxError: invalid syntax

Hopefully if we add more pseudokeywords in the future it won't break
the ability to parse traceback spans.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Critique of PEP 657 -- Include Fine Grained Error Locations in Tracebacks

2021-05-19 Thread Nathaniel Smith
On Tue, May 18, 2021 at 2:49 PM Pablo Galindo Salgado
 wrote:
> * It actually doesn't have more advantages. The current solution in the PEP 
> can do exactly the same as this solution if you allow reparsing when
> displaying tracebacks. This is because with the start line, end line, start 
> offset and end offset and the original file, you can extract the source that
> is associated with the instruction, parse it (and this
> is much faster because you just need to parse the tiny fragment) and then you 
> get an AST node that you can use for whatever you want.

Excellent point! Do you know how reliable this is in practice, i.e.
what proportion of bytecode source spans are something you can
successfully pass to ast.parse? If it works it's obviously nicer, but
I can't tell how often it works. E.g. anything including
return/break/continue/yield/await will fail, since those require an
enclosing context to be legal. I doubt return/break/continue will
raise exceptions often, but yield/await do all the time.

You could kluge it by wrapping the source span in a dummy 'async def'
before parsing, since that makes yield/await legal, but OTOH it makes
'yield from' and 'from X import *' illegal.

I guess you could have a helper that attempts passing the string to
ast.parse, and if that fails tries wrapping it in a loop/sync
def/async def/etc. until one of them succeeds. Maybe that would be a
useful utility to add to the traceback module?
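
Something like this hypothetical helper (an untested sketch; it handles
single-line spans only -- multi-line spans would need re-indentation):

    import ast

    _WRAPPERS = [
        "{}",                    # try the span as-is first
        "while True:\n {}",      # legalizes break/continue
        "def _f():\n {}",        # legalizes return, yield, yield from
        "async def _f():\n {}",  # legalizes await
    ]

    def parse_span(source):
        for template in _WRAPPERS:
            try:
                return ast.parse(template.format(source))
            except SyntaxError:
                continue
        raise SyntaxError(f"no context makes this span parse: {source!r}")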

Or add a PyCF_YOLO flag that tries to make sense of an arbitrary
out-of-context string.

(Are there any other bits of syntax that require specific contexts
that I'm not thinking of? If __enter__/__exit__ raise an exception,
then what's the corresponding span? The entire 'with' block, or just
the 'with' line itself?)

-n

PS: this is completely orthogonal to PEP 657, but if you're excited
about making tracebacks more readable, another piece of low-hanging
fruit would be to print method __qualname__s instead of __name__s in
the traceback output. The reason we don't do that now is that
__qualname__ lives on the function object, but in a traceback, we
can't get the function object. The traceback only has access to the
code object, and the code object doesn't have __qualname__, just
__name__. Probably the cleanest way to do this would be to make the
traceback or code object have a pointer back to the function object.
See also https://bugs.python.org/issue12857.
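
A quick self-contained demo of the gap, as it stands on the interpreters
current as of this message (3.10 and earlier):

    import traceback

    class Frobnicator:
        def frob(self):
            raise RuntimeError("boom")

    try:
        Frobnicator().frob()
    except RuntimeError:
        tb = traceback.format_exc()

    # The frame is labelled with the bare __name__; the far more helpful
    # __qualname__ never appears, because the code object doesn't carry it.
    print("in frob" in tb)                 # True on 3.10
    print(Frobnicator.frob.__qualname__)   # 'Frobnicator.frob'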

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Critique of PEP 657 -- Include Fine Grained Error Locations in Tracebacks

2021-05-17 Thread Nathaniel Smith
On Mon, May 17, 2021 at 6:18 AM Mark Shannon  wrote:
> 2. Repeated binary operations on the same line.
>
> A single location can also be clearer when all the code is on one line.
>
> i1 + i2 + s1
>
> PEP 657:
>
> i1 + i2 + s1
> ^^^^^^^^^^^^
>
> Using a single location:
>
> i1 + i2 + s1
>  ^

It's true this case is a bit confusing with the whole operation span
highlighted, but I'm not sure the single location version is much better. I
feel like a Really Good UI would like, highlight the two operands in
different colors or something, or at least underline the two separate items
whose type is incompatible separately:

TypeError: unsupported operand type(s) for +: 'int' + 'str':
i1 + i2 + s1
^^^   ~~

More generally, these error messages are the kind of thing where the UI can
always be tweaked to improve further, and those tweaks can make good use of
any rich source information that's available.

So, here's another option to consider:

- When parsing, assign each AST node a unique, deterministic id (e.g.
sequentially across the AST tree from top-to-bottom, left-to-right).
- For each bytecode offset, store the corresponding AST node id in an
lnotab-like table
- When displaying a traceback, we already need to go find and read the
original .py file to print source code at all. Re-parse it, and use the ids
to find the original AST node, in context with full structure. Let the
traceback formatter do whatever clever stuff it wants with this info.
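
A toy sketch of the numbering half of that idea (illustrative only; in
CPython the numbering would happen in the compiler, and the offset-to-id
table would live on the code object, like lnotab):

    import ast

    def number_nodes(tree):
        # ast.walk is a deterministic breadth-first traversal, so the ids
        # are stable across re-parses of identical source.
        return {i: node for i, node in enumerate(ast.walk(tree))}

    source = "x = a + b * c"
    ids_at_compile_time = number_nodes(ast.parse(source))
    ids_at_traceback_time = number_nodes(ast.parse(source))

    # An id recorded at compile time resolves to the same node later:
    assert all(
        ast.dump(ids_at_compile_time[i]) == ast.dump(ids_at_traceback_time[i])
        for i in ids_at_compile_time
    )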

Of course if the .py and .pyc files don't match, this might produce
gibberish. We already have that problem with showing source lines, but it
might be even more confusing if we get some random unrelated AST node. This
could be avoided by storing some kind of hash in the code object, so that
we can validate the .py file we find hasn't changed (sha512 if we're
feeling fancy, crc32 if we want to save space, either way is probably fine).

This would make traceback printing more expensive, but only if you want the
fancy features, and traceback printing is already expensive (it does file
I/O!). Usually by the time you're rendering a traceback it's more important
to optimize for human time than CPU time. It would take less memory than
PEP 657, and the same as Mark's proposal (both only track one extra integer
per bytecode offset). And it would allow for arbitrarily rich traceback
display.

(I guess in theory you could make this even cheaper by using it to replace
lnotab, instead of extending it. But I think keeping lnotab around is a
good idea, as a fallback for cases where you can't find the original source
but still want some hint at location information.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Future PEP: Include Fine Grained Error Locations in Tracebacks

2021-05-07 Thread Nathaniel Smith
On Fri, May 7, 2021 at 8:14 PM Neil Schemenauer  wrote:
>
> On 2021-05-07, Pablo Galindo Salgado wrote:
> > Technically the main concern may be the size of the unmarshalled
> > pyc files in memory, more than the storage size of disk.
>
> It would be cool if we could mmap the pyc files and have the VM run
> code without an unmarshal step.  One idea is something similar to
> the Facebook "not another freeze" PR but with a twist.  Their
> approach was to dump out code objects so they could be loaded as if
> they were statically defined structures.
>
> Instead, could we dump out the pyc data in a format similar to Cap'n
> Proto?  That way no unmarshal is needed.  The VM would have to be
> extensively changed to run code in that format.  That's the hard
> part.
>
> The benefit would be faster startup times.  The unmarshal step is
> costly.  It would mostly solve the concern about these larger
> linenum/colnum tables.  We would only load that data into memory if
> the table is accessed.

A simpler version would be to pack just the docstrings/lnotab/column
numbers into a separate part of the .pyc, and store a reference to the
file + offset to load them lazily on demand. No need for mmap.

Could also store them in memory, but with some cheap compression
applied, and decompress on access. None of these get accessed often.
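
A toy sketch of the compression variant (the names are made up; the real
tables would hang off the code object):

    import zlib

    class LazyTable:
        """Hold a rarely-used table compressed; decompress only on access."""

        def __init__(self, raw):
            self._blob = zlib.compress(raw)   # cheap compression at load time

        @property
        def data(self):
            return zlib.decompress(self._blob)  # pay the cost only on access

    table = LazyTable(b"\x00\x01" * 10_000)   # stand-in for lnotab/column data
    print(len(table._blob), "bytes held in memory")
    print(len(table.data), "bytes materialized on demand")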

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-05-06 Thread Nathaniel Smith
On Thu, May 6, 2021 at 2:17 AM Nathaniel Smith  wrote:
>
> On Thu, Apr 29, 2021 at 9:14 AM Yury Selivanov  
> wrote:
> > Nathaniel, at this point it's clear that this thread somehow does not
> > help us understand what you want. Could you please just write your own
> > PEP clearly outlining your proposal, its upsides and downsides?
> > Without a PEP from you this thread is just a distraction.
>
> If that's the best way to move forward, then ok. My main thing is just
> that I don't want to make this some antagonistic me-vs-you thing.
> After all, we all want the best design to be chosen, and none of us
> know what that is yet, so there's no need for conflict :-).
>
> Irit, Yury, would you be interested in co-authoring a PEP for the
> "flat EG" approach? Basically trying to set down the best possible
> version of each approach, so that we can put them next to each other?

Uh, probably this is obvious, but I mean "co-authoring with me". I'm
not suggesting you two go off to do it without me :-)

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-05-06 Thread Nathaniel Smith
On Thu, Apr 29, 2021 at 9:14 AM Yury Selivanov  wrote:
> Nathaniel, at this point it's clear that this thread somehow does not
> help us understand what you want. Could you please just write your own
> PEP clearly outlining your proposal, its upsides and downsides?
> Without a PEP from you this thread is just a distraction.

If that's the best way to move forward, then ok. My main thing is just
that I don't want to make this some antagonistic me-vs-you thing.
After all, we all want the best design to be chosen, and none of us
know what that is yet, so there's no need for conflict :-).

Irit, Yury, would you be interested in co-authoring a PEP for the
"flat EG" approach? Basically trying to set down the best possible
version of each approach, so that we can put them next to each other?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-04-28 Thread Nathaniel Smith
On Fri, Apr 23, 2021 at 4:08 AM Irit Katriel  wrote:
>
> On Fri, Apr 23, 2021 at 9:22 AM Nathaniel Smith  wrote:
>> I'm not trying to filibuster here -- I really want some form of EGs to
>> land.
>
> I'm very glad to hear that. It's been hard to know where you stand, because 
> you didn't even decline our invitation in October to work together on this, 
> and several later invitations to look at the implementation and try it with 
> Trio -- you just didn't reply.   The only responses I saw were public, on 
> this list, taking us right back to the drawing board (for example - the 
> suggestion you mention in the previous paragraph as 
> not-sufficiently-explored, actually appears in the rejected ideas section of 
> the PEP, and the reason for rejection is not that it's incompatible with 
> nesting). So thank you for clarifying that you are, ultimately, supportive of 
> our efforts.

Yes, I apologize again for the radio silence there -- I had real world
stuff that left me with no cope for open-source. I, uh, was feeling
guilty about not getting back to you the whole time, if that helps?
Probably that doesn't help.

My memory is that in our initial discussions, I suggested having each
'except Blah as exc' clause be executed once, receiving an
ExceptionGroup containing all the Blah exceptions. Guido pointed out
that this broke typing -- 'exc' would not longer have type 'Blah' --
and I was like... okay yeah that's a fatal flaw, never mind. And I
never seriously raised the 'execute a single clause multiple times'
option, because of the issue where in the nested design, taking
individual exceptions out of an ExceptionGroup breaks tracebacks.

Looking at the relevant section of the PEP again [1], it notes the
same fatal flaw with my first suggestion, and then says that the
multiple-except-executions option should be rejected because users
have written code like 'except SomeError: ...' with the expectation
that the 'except' clause would run exactly once. That's definitely
true, and it's a downside of the multiple-except-executions approach,
but I don't think it's convincing enough to rule this out on its own.
The problem is, *all* our options for how 'except' should interact
with ExceptionGroups will somehow break previous expectations.

Concretely: imagine you have a pre-existing 'except SomeError', and
some new code inside the 'try' block raises some number of
'SomeError's wrapped in an ExceptionGroup. There are three options:

- Execute the 'except' block multiple times. This breaks the
expectation that it should be executed at most once.
- Execute the 'except' block exactly once. But if there are multiple
SomeError's, this requires they be grouped and delivered as a single
exception, which breaks typing.
- Execute the 'except' block zero times. This is what the current PEP
chooses, and breaks the expectation that 'except SomeError' should
catch 'SomeError'.

So we have to pick our poison.

[1] 
https://www.python.org/dev/peps/pep-0654/#extend-except-to-handle-exception-groups
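
For concreteness, the third option is the semantics PEP 654 ultimately
shipped in Python 3.11; the "zero times" behavior is easy to see there:

    try:
        raise ExceptionGroup("g", [ValueError("boom")])
    except ValueError:
        # Never reached: a plain 'except' does not look inside the group.
        print("caught a ValueError")
    except ExceptionGroup as eg:
        print("the group is only caught as a group:", eg.exceptions)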

> We do realize that we didn't adequately credit you for the contributions of 
> your 2018 work to this PEP, and have now added an acknowledgements section 
> for that purpose. Apologies for that.

Oh, that didn't bother me at all, but thanks :-). And I'm sorry if I
was denying you credit for things that you had actually independently
re-invented.

> I'm confused about the flattening suggestion - above you talk about "flat 
> EG", but below about tracebacks. It's not clear to me whether you want EG to 
> be flat (ie no nesting of EGs) or just the traceback to be flat (but you can 
> still have a nested EG).

Hmm, I was thinking about making both of them flat, so no nested EGs.
In all my designs, the only reason I ever had nesting was because I
couldn't figure out any other way to make the tracebacks work. Do you
have some other motivation for wanting nesting? If so that would be
interesting, because it might point to why we're talking past each
other and help us understand the problem better...

> I also don't know what problem you are trying to solve with this.

I'm not saying that there's some fatal problem with the current PEP.
(In my first message I explicitly said that it would be an improvement
over the status quo :-).) But I think that nesting will be really
counterintuitive/confusing for users in some ways. And concurrency
APIs will be offputting if they force you to use a different special
form of 'except' all the time. Basically the 'flat' version might be a
lot more ergonomic, and that's important for a language like Python.

-n

--
Nathaniel J. Smith -- https://vorpus.org

[Python-Dev] Re: Existing asyncio features scheduled for removal in Python 3.9 and 3.10

2021-04-26 Thread Nathaniel Smith
@asyncio.coroutine and @types.coroutine are different beasts.
@asyncio.coroutine is the deprecated one; @types.coroutine is
lower-level and not deprecated.
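
A minimal illustration of the surviving low-level form (self-contained; the
function names are just for the example):

    import types

    @types.coroutine
    def nap():
        yield "trap-for-the-event-loop"   # a bare yield the driver sees

    async def main():
        await nap()   # legal: types.coroutine marks nap() as awaitable

    coro = main()
    print(coro.send(None))   # the yielded value surfaces to the driver
    try:
        coro.send(None)
    except StopIteration:
        print("coroutine finished")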

On Mon, Apr 26, 2021 at 2:48 PM Luciano Ramalho  wrote:
>
> I don't understand how it's possible to "Deprecate @coroutine for sake
> of async def" when native coroutines ultimately depend on a generator
> to be driven by the event loop.
>
> What am I missing?
>
> Perhaps in asyncio the generator magic is now written in C, but as
> Nathaniel J. Smith points out, Trio and Curio both use Python
> generators at their cores.
>
> Cheers,
>
> Luciano
>
> On Mon, Apr 26, 2021 at 5:55 PM Illia Volochii  
> wrote:
> >
> > Hi everyone,
> >
> > There are a couple of uncompleted asyncio feature removals scheduled
> > for 3.9 and 3.10 releases.
> > It will be great if we either complete them or reschedule before the
> > 3.10 feature freeze. There are two stale pull requests related to
> > this.
> >
> > Removal of @asyncio.coroutine in version 3.10 deprecated since version 3.8
> > Documentation: 
> > https://docs.python.org/3.10/library/asyncio-task.html#asyncio.coroutine
> > Issue deprecating the decorator: https://bugs.python.org/issue36921
> > Issue for the removal: https://bugs.python.org/issue43216
> > There is no pull request yet, mainly because of an unclarified
> > question regarding types.coroutine in 36921.
> >
> > Prohibiting non-ThreadPoolExecutor in loop.set_default_executor
> > Warning scheduling the prohibiting in version 3.9:
> > https://github.com/python/cpython/blob/425434dadc30d96dc1c0c628f954f9b6f5edd2c9/Lib/asyncio/base_events.py#L816-L821
> > Issue: https://bugs.python.org/issue43234
> > Stale pull request: https://github.com/python/cpython/pull/24540
> >
> > Prohibiting previously deprecated operations on 
> > asyncio.trsock.TransportSocket
> > Warning scheduling the prohibiting in version 3.9:
> > https://github.com/python/cpython/blob/425434dadc30d96dc1c0c628f954f9b6f5edd2c9/Lib/asyncio/trsock.py#L20-L24
> > Issue: https://bugs.python.org/issue43232
> > Stale pull request: https://github.com/python/cpython/pull/24538
> >
> > Thanks,
> > Illia
>
>
>
> --
> Luciano Ramalho
> |  Author of Fluent Python (O'Reilly, 2015)
> | http://shop.oreilly.com/product/0636920032519.do
> |  Technical Principal at ThoughtWorks
> |  Twitter: @ramalhoorg



-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-04-25 Thread Nathaniel Smith
On Fri, Apr 23, 2021 at 2:45 AM Chris Angelico  wrote:
>
> On Fri, Apr 23, 2021 at 6:25 PM Nathaniel Smith  wrote:
> > The main possibility that I don't think we've examined fully is to
> > make 'except' blocks fire multiple times when there are multiple
> > exceptions.
>
> Vanilla except blocks? Not sure if I'm misunderstanding, but that
> could cause some problems. Consider:
>
> trn.begin()
> try:
>     ...
> except BaseException:
>     trn.rollback()
>     raise
> else:
>     trn.commit()
>
> What happens if multiple exceptions get raised? Will the transaction
> be rolled back more than once? What gets reraised?
>
> If I'm completely misunderstanding you here, my apologies.

Yeah, you've understood correctly, and you see why I wrote "both the
current proposal and the alternative have very complex implications
and downsides" :-)

A lot depends on the details, too. One possible design is "in general,
vanilla except blocks will trigger multiple times, but as a special
case, except: and except BaseException: will only fire once, receiving
the whole ExceptionGroup as a single exception". (Rationale for making
a special case: if you're saying "catch anything, I don't care what",
then you obviously didn't intend to do anything in particular with
those exceptions, plus this is the only case where the types work.) In
that design, your example becomes correct again.

Some other interesting cases:

# Filters out all OSError's that match the condition, while letting
# all other OSError's continue
try:
    ...
except OSError as exc:
    if exc.errno != errno.EWHATEVER:
        raise

# If this gets an ExceptionGroup([ValueError, KeyboardInterrupt]),
# then it logs the ValueError and then exits with the selected error code
try:
    ...
except Exception as exc:
    logger.log_exception(exc)
except KeyboardInterrupt:
    sys.exit(17)

OTOH you can still come up with cases that aren't handled correctly.
If you have old 'except' code mixed with new code raising
ExceptionGroup, then there's no way to make that do the right thing in
*all* cases. By definition, when the 'except' code was written, the
author wasn't thinking about ExceptionGroups! So the question is which
combination of semantics ends up doing the least harm on average
across all sorts of different real-world cases...

-n

--
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Anyone else gotten bizarre personal replies to mailing list posts?

2021-04-23 Thread Nathaniel Smith
I just got the reply below sent directly to my personal account, and I'm
confused about what's going on. If it's just a one off I'll chalk it up to
random internet weirdness, but if other folks are getting these too it
might be something the list admins should look into? Or... something?

-- Forwarded message -
From: Hoi lam Poon 
Date: Fri, Apr 23, 2021, 02:01
Subject: Re: [Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]
To: Nathaniel Smith 


Stop pretending, I can definitely get the key control file, your working
group, all past actions and instructions cannot be cleared in front of me
at all. You have been playing around for a few days, and I won’t stop you.
Your face? I won’t, you know, you can’t drive me away, and that file is
all, after I get it, you will be convicted even if you disband, I swear

On Fri, Apr 23, 2021 at 16:23, Nathaniel Smith  wrote:

> On Wed, Apr 21, 2021 at 4:50 PM Guido van Rossum  wrote:
> > On Wed, Apr 21, 2021 at 3:26 PM Nathaniel Smith  wrote:
> >> Sure. This was in my list of reasons why the backwards compatibility
> >> tradeoffs are forcing us into awkward compromises. I only elaborated
> >> on it b/c in your last email you said you didn't understand why this
> >> was a problem :-). And except* is definitely useful. But I think there
> >> are options for 'except' that haven't been considered fully.
> >
> > Do you have any suggestions, or are you just telling us to think harder?
> Because we've already thought as hard as we could and within all the
> constraints (backwards compatibility and otherwise) we just couldn't think
> of a better one.
>
> The main possibility that I don't think we've examined fully is to
> make 'except' blocks fire multiple times when there are multiple
> exceptions. We ruled it out early b/c it's incompatible with nested
> EGs, but if flat EGs are better anyway, then the balance shifts around
> and it might land somewhere different. It's a tricky discussion
> though, b/c both the current proposal and the alternative have very
> complex implications and downsides. So we probably shouldn't get too
> distracted by that until after the flat vs nested discussion has
> settled down more.
>
> I'm not trying to filibuster here -- I really want some form of EGs to
> land. I think python has the potential to be the most elegant and
> accessible language around for writing concurrent programs, and EGs
> are a key part of that. I don't want to fight about anything; I just
> want to work together to make sure we have a full picture of our
> options, so we can be confident we're making the best choice.
>
> > The real cost here is that we would need a new "TracebackGroup" concept,
> since the internal data structures and APIs keep the traceback chain and
> the exception object separated until the exception is caught. In our early
> design stages we actually explored this and the complexity of the data
> structures was painful. We eventually realized that we didn't need this
> concept at all, and the result is much clearer, despite what you seem to
> think.
>
> I'm not talking about TracebackGroups (at least, I think I'm not?). I
> think it can be done with exactly our current data structures, nothing
> new.
>
> - When an EG is raised, build the traceback for just that EG while
> it's unwinding. This means if any C code peeks at exc_info while it's
> in flight, it'll only see the current branch of the traceback tree,
> but that seems fine.
> - When the exception is caught and we go to write back the traceback
> to its __traceback__ attribute, instead "peek through" the EG and
> append the built-up traceback entries onto each of the constituent
> exceptions.
>
> You could get cleverer for efficiency, but that basic concept seems
> pretty simple and viable to me. What am I missing?
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>


[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-04-23 Thread Nathaniel Smith
On Wed, Apr 21, 2021 at 4:50 PM Guido van Rossum  wrote:
> On Wed, Apr 21, 2021 at 3:26 PM Nathaniel Smith  wrote:
>> Sure. This was in my list of reasons why the backwards compatibility
>> tradeoffs are forcing us into awkward compromises. I only elaborated
>> on it b/c in your last email you said you didn't understand why this
>> was a problem :-). And except* is definitely useful. But I think there
>> are options for 'except' that haven't been considered fully.
>
> Do you have any suggestions, or are you just telling us to think harder? 
> Because we've already thought as hard as we could and within all the 
> constraints (backwards compatibility and otherwise) we just couldn't think of 
> a better one.

The main possibility that I don't think we've examined fully is to
make 'except' blocks fire multiple times when there are multiple
exceptions. We ruled it out early b/c it's incompatible with nested
EGs, but if flat EGs are better anyway, then the balance shifts around
and it might land somewhere different. It's a tricky discussion
though, b/c both the current proposal and the alternative have very
complex implications and downsides. So we probably shouldn't get too
distracted by that until after the flat vs nested discussion has
settled down more.

I'm not trying to filibuster here -- I really want some form of EGs to
land. I think python has the potential to be the most elegant and
accessible language around for writing concurrent programs, and EGs
are a key part of that. I don't want to fight about anything; I just
want to work together to make sure we have a full picture of our
options, so we can be confident we're making the best choice.

> The real cost here is that we would need a new "TracebackGroup" concept, 
> since the internal data structures and APIs keep the traceback chain and the 
> exception object separated until the exception is caught. In our early design 
> stages we actually explored this and the complexity of the data structures 
> was painful. We eventually realized that we didn't need this concept at all, 
> and the result is much clearer, despite what you seem to think.

I'm not talking about TracebackGroups (at least, I think I'm not?). I
think it can be done with exactly our current data structures, nothing
new.

- When an EG is raised, build the traceback for just that EG while
it's unwinding. This means if any C code peeks at exc_info while it's
in flight, it'll only see the current branch of the traceback tree,
but that seems fine.
- When the exception is caught and we go to write back the traceback
to its __traceback__ attribute, instead "peek through" the EG and
append the built-up traceback entries onto each of the constituent
exceptions.

You could get cleverer for efficiency, but that basic concept seems
pretty simple and viable to me. What am I missing?

-n

--
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-04-21 Thread Nathaniel Smith
On Tue, Apr 20, 2021 at 3:15 AM Irit Katriel  wrote:
> On Tue, Apr 20, 2021 at 2:48 AM Nathaniel Smith  wrote:
>>
>>
>> The problem is that most of the time, even if you're using concurrency
>> internally so multiple things *could* go wrong at once, only one thing
>> actually *does* go wrong. So it's unfortunate if some existing code is
>> prepared for a specific exception that it knows can be raised, that
>> exact exception is raised... and the existing code fails to catch it
>> because now it's wrapped in an EG.
>
> Yes, this was discussed at length on this list. Raising an exception group is 
> an API-breaking change. If a function starts raising exception groups its 
> callers need to be prepared for that. Realistically we think exception groups 
> will be raised by new APIs.  We tried and were unable to define exception 
> group semantics for except that would be reasonable and backwards compatible. 
> That's why we added except*.

Sure. This was in my list of reasons why the backwards compatibility
tradeoffs are forcing us into awkward compromises. I only elaborated
on it b/c in your last email you said you didn't understand why this
was a problem :-). And except* is definitely useful. But I think there
are options for 'except' that haven't been considered fully.

Saying that you have to make a new API every time you start using
concurrency inside a function is extremely restrictive. The whole
point of structured concurrency (and structured programming in
general) is that function callers don't need to know about the control
flow decisions inside a function. So right now, the EG proposal is
like saying that if you ever have a function that doesn't contain a
'for' loop, and then you want to add a 'for' loop, then you have to
define a whole new API instead of modifying the existing one.

I absolutely get why the proposal looks like that. I'm just making the
point that we should make sure we've exhausted all our options before
settling for that as a compromise.

>> > It is easy enough to write a denormalize() function in traceback.py that 
>> > constructs this from the current EG structure, if you need it (use the 
>> > leaf_generator from the PEP). I'm not sure I see why we should trouble the 
>> > interpreter with this.
>>
>> In the current design, once an exception is wrapped in an EG, then it
>> can never be unwrapped, because its traceback information is spread
>> across the individual exception + the EG tree around it. This is
>> confusing to users ("how do I check errno?"), and makes the design
>> more complicated (the need for topology-preserving .split(), the
>> inability to define a sensible EG.__iter__, ...). The advantage of
>> making the denormalized form the native form is that now the leaf
>> exceptions would be self-contained objects like they are now, so you
>> don't need EG nesting at all, and users can write intuitive code like:
>>
>> except OSError as *excs:
>>     remainder = [exc for exc in excs if exc.errno != ...]
>>     if remainder:
>>         raise ExceptionGroup(remainder)
>
>
> We have this precise example in the PEP:
>     match, rest = excs.split(lambda e: e.errno != ...)
>
> You use split() instead of iteration for that.  split() preserves all 
> __context__, __cause__ and __traceback__ information, on all leaf and 
> non-leaf exceptions.

Well, yeah, I know, I'm the one who invented split() :-). My point was
to compare these two options: in the flat EG example, most Python
users could write and read that code without knowing anything except
"there can be multiple exceptions now". It's all old, well-known
constructs used in the obvious way.

For the .split() version, you have to write a lambda (which is allowed
to access parts of the exception object, but not all of it!), and use
this idiosyncratic method that only makes sense if you know about EG
tree structures. That's a lot more stuff that users have to understand
and hold in their head.

Or here's another example. Earlier you suggested:

> If you are using concurrency internally and don't want to raise EGs 
> externally, then surely you will catch EGs, select one of the exceptions to 
> raise and throw away all the others

But with nested EGs, this is difficult, because you *can't* just pull
out one of the exceptions to raise. The leaf exceptions aren't
standalone objects; you need some obscure traceback manipulation to do
this. I guarantee that users will get this wrong, even if we provide
the tools, because the explanation about when and why you need the
tools is complicated and people won't internalize it.

With flat EGs, this is trivial: it's just `raise
ExceptionGroup[the_selected_exception_index]`.
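
Spelled out as a full handler, under the same flat-EG assumptions (the
task-running function and the index are made up for illustration, and
.exceptions holds self-contained leaf exceptions):

try:
    run_concurrent_tasks()
except ExceptionGroup as eg:
    # Each leaf carries its own complete traceback, so no traceback
    # surgery is needed before re-raising one on its own:
    raise eg.exceptions[the_selected_exception_index]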

>> > For display purposes, it is probably nicer to 

[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-04-21 Thread Nathaniel Smith
On Tue, Apr 20, 2021 at 2:15 PM srku...@mail.de  wrote:
>
> So, forgive me my relatively simple mental model about ExceptionGroup. I 
> still try to create one for daily use.
>
> As noted in the discussion, an EG provides a way to collect exceptions from 
> different sources and raise them as a bundle. They have no apparent relation 
> up until this point in time (for whatever reason they have been separate and 
> for whatever reason they are bundled now). The result would be a tree graph 
> in any case.
>
> A usual datastructure for a tree is to store all child nodes at the parent 
> node.
>
> That was the idea behind the content of BaseException.__group__: it’s the 
> list of child exceptions bundled at a specific point in time and raised as 
> such a bundle. So all exceptions could become EGs with the additional 
> semantics you‘ve described in the PEP.
>
> Illustrative Example:
> >>> bundle_exc.__group__
> [IOError(123), RuntimeError('issue somewhere')]
>
> I was wondering what of the PEP could be removed to make it simpler and more 
> acceptable/less confusing (also looking at reactions from Twitter etc.) and I 
> found these additional classes to be a part of it. Additionally, I fail to 
> see how to access these bundled exceptions in an easy manner like __cause__ 
> and __context__. (As the PEP also referring to them). So, I removed the 
> classes and added a regular attribute.

This seems more confusing to me too. Instead of having a single
ExceptionGroup class, you're suggesting that all exceptions should
effectively become ExceptionGroups. I know what
ExceptionGroup([ValueError]) means -- it means there was a ValueError,
and it got put in a group, so it should probably be handled the same
way as a ValueError. I have no idea what a KeyError([ValueError])
would mean. Is that a KeyError or a ValueError? Adding flexibility
doesn't necessarily make things simpler :-)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Keeping Python a Duck Typed Language.

2021-04-20 Thread Nathaniel Smith
On Tue, Apr 20, 2021 at 10:07 AM Mark Shannon  wrote:
>
> Hi everyone,
>
> Once upon a time Python was a purely duck typed language.
>
> Then came along abstract base classes, and some nominal typing started
> to creep into the language.
>
> If you guarded your code with `isinstance(foo, Sequence)` then I could
> not use it with my `Foo` even if my `Foo` quacked like a sequence. I was
> forced to use nominal typing; inheriting from Sequence, or explicitly
> registering as a Sequence.

You say this like it's a bad thing, but how is this avoidable, even in
principle? Structural typing lets you check whether Foo is duck-shaped
-- has appropriate attribute names, etc. But quacking like a duck is
harder: you also have to implement the Sequence behavioral contract,
and realistically the only way to know that is if the author of Foo
tells you.
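
A quick illustration of that with today's abc machinery (Foo is a
made-up class):

from collections.abc import Sequence

class Foo:  # duck-shaped: it has the right method names
    def __len__(self): return 0
    def __getitem__(self, i): raise IndexError

print(isinstance(Foo(), Sequence))  # False: shape alone isn't checked
Sequence.register(Foo)              # the author vouches for the contract
print(isinstance(Foo(), Sequence))  # True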

I'm not even sure that this *is* nominal typing. You could just as
well argue that "the operation `isinstance(..., Sequence)` returns
`True`" is just another of the behavioral constraints that are
required to quack like a sequence.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-04-19 Thread Nathaniel Smith
On Mon, Apr 5, 2021 at 9:48 AM Irit Katriel  wrote:
> On Mon, Apr 5, 2021 at 11:01 AM Nathaniel Smith  wrote:
>> - I'm uncomfortable with how in some contexts we treat EG's as placeholders 
>> for the contained exceptions, and other places we treat them like a single 
>> first-class exception. (Witness all the feedback about "why not just catch 
>> the ExceptionGroup and handle it by hand?", and imagine writing the docs 
>> explaining all the situations where that is or isn't a good idea and the 
>> pitfalls involved...) If we could somehow pick one and stick to it then I 
>> think that would make it easier for users to grasp. (My gut feeling is that 
>> making them pure containers is better, which to me is why it makes sense for 
>> them to be @final and why I keep hoping we can figure out some better way 
>> for plain 'except' and EGs to interact.)
>
>
> I'm not sure what you mean by "a placeholder". An exception group is an 
> exception, and except* is a new syntax that helps you manipulate exception 
> groups.  I don't think the confusion you mention was due to the fact that 
> except can catch EGs, but rather due to the fact that the simplest examples 
> don't show you why except is not good enough, and why we need except*.
>
> Suppose we did as you suggest and made exception group a container which is 
> not exception. Now suppose that an exception is raised in an except* block. 
> What is the context of this exception? Do we change the requirement that an 
> exception's context is always an exception?
>
> How do you raise an exception group if it's not an Exception? Do we go back 
> to allowing random objects being raised?

Ah, right. I'm talking about the conceptual/semantic level, not the
implementation. For the implementation, backcompat certainly means we
need some single object that can represent the whole group of
exceptions, so that it can be returned by sys.exc_info(), C code can
access it through the tstate, etc. But the more important question is
how we explain this thing to users, and make it easy for them to do
the right thing. Is the EG the exception they're trying to catch or
otherwise work with? or is the EG just a detail that most users should
ignore because they're focused on the leaf exceptions inside?

For the concurrency use case (asyncio/trio/etc.), it's all about the
leaf exceptions. ExceptionGroup([SomeError()]) and SomeError() both
mean exactly the same thing, and treating them differently is
basically always a bug. The ExceptionGroup object doesn't tell you
anything about what went wrong, like normal exceptions do. It just
tells you that there was some frame in between where the exception was
raised and where it was caught where some other exception *could* have
happened, but didn't. You *can* give 'except ExceptionGroup' a
meaning, but that meaning doesn't make sense -- it's like saying "I
want to catch any exception that was raised by code that *could* have
raised a different exception instead" or "I want to catch all
exceptions whose traceback contains an entry that's a certain subclass
of TracebackType". Similarly, it doesn't make sense to attach error
strings to an EG, or define custom subclasses, etc.. Ideally, plain
'except' would (somehow) handle 'EG([SomeError()])' and 'SomeError()'
in exactly the same way; and if it doesn't, then using plain 'except'
in concurrent code is going to usually be a bug.

For other use cases, it does make sense to think of EG as a regular
exception. Like, if Hypothesis wants to report that it ran some tests
and there were failures, then modelling that as a single
HypothesisError seems like a nice API. You never need 'except*'
semantics to catch part of a HypothesisError. 'except HypothesisError'
is totally sensible. The only real value you get from the PEP is that
if you *don't* catch the HypothesisError, then the default traceback
machinery can now automatically include info about the individual test
failures.

If we were starting from scratch, I don't think this would be a big
conflict. I think the PEP would focus 100% on the concurrency case.
That's the one that's really difficult to solve without language
support. And, if you solve that, then there's a very natural way to
handle the Hypothesis case too: do 'raise HypothesisError from
EG(individual failures)'. That makes it explicit that HypothesisError
is a single exception that should be handled as a unit, while still
letting you introspect the individual exceptions and including them in
the default traceback output. (In fact Trio uses this trick right now,
in its TCP connection code [1]. We report a single
OSError("connection failed"), but then include all the details about
each individual failure in its __cause__.)
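
In code, the trick looks roughly like this (the per-attempt errors are
invented for illustration; ExceptionGroup is PEP 654's proposed class):

def connect_somehow():
    attempt_errors = [OSError("IPv4 connect failed"),
                      OSError("IPv6 connect failed")]
    # One first-class exception for callers to catch, with the full
    # detail preserved on its __cause__:
    raise OSError("connection failed") from ExceptionGroup(
        "all connection attempts failed", attempt_errors)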

The actual problem is that we're not starting from scratch; we're
trying to retrofit the concurrency-s

[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

2021-04-05 Thread Nathaniel Smith
OK, better late than never... here's a much-delayed review of the PEP.
Thank you Irit and Guido for carrying this forward while I've been AWOL!
It's fantastic to see my old design sketches turned into something,
like, actually real.

== Overall feelings ==

Honestly, I have somewhat mixed feelings about ExceptionGroups. I don't see any
way around adding ExceptionGroups in some form, because it's just a fact of
life that in a concurrent program, multiple things can go wrong at once,
and we want Python to be usable for writing concurrent programs. Right now
the state of the art is "exceptions in background threads/tasks get dropped
on the floor", and almost anything is better than that. The current PEP is
definitely better than that. But at the same time, there are a lot of
compromises needed to retrofit this onto Python's existing system, and the
current proposal feels like a bunch of awkward hacks with hacks on top.
That's largely my fault for not thinking of something better, and maybe
there is nothing better. But I still wish we could come up with something
more elegant, and I do see why this proposal has made people uncomfortable.
For example:

- I'm uncomfortable with how in some contexts we treat EG's as placeholders
for the contained exceptions, and other places we treat them like a single
first-class exception. (Witness all the feedback about "why not just catch
the ExceptionGroup and handle it by hand?", and imagine writing the docs
explaining all the situations where that is or isn't a good idea and the
pitfalls involved...) If we could somehow pick one and stick to it then I
think that would make it easier for users to grasp. (My gut feeling is that
making them pure containers is better, which to me is why it makes sense
for them to be @final and why I keep hoping we can figure out some better
way for plain 'except' and EGs to interact.)

- If a function wants to start using concurrency internally, then now *all*
its exceptions have to get wrapped in EGs and callers have to change *all*
their exception handling code to use except* or similar. You would think
this was an internal implementation detail that the caller shouldn't have
to care about, but instead it forces a major change on the function's
public API. And this is because regular 'except' can't do anything useful
with EGs. (A short sketch of this hazard follows this list.)

- We have a special-case hack to keep 'except Exception' working, but it
has tricky edge cases (Exceptions can still sneak past if they're paired up
with a BaseException), and it really is specific to 'except Exception'; it
doesn't work for any other 'except SomeError' code. This smells funny.
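
Here's the second point above in miniature (f() is a toy stand-in,
using PEP 654 semantics):

def f():
    # Used to be: raise KeyError("k"). After going concurrent internally:
    raise ExceptionGroup("task failures", [KeyError("k")])

try:
    f()
except KeyError:
    print("handled")  # never runs now; the wrapped KeyError sails past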

Anyway, that's just abstract context to give an idea where I'm coming from.
Maybe we just have to accept these trade-offs, but if anyone has any ideas,
speak up...

== Most important comment ==

Flat ExceptionGroups: there were two basic design approaches we discussed
last year, which I'll call "flat" vs "nested". The current PEP uses the
nested design, where ExceptionGroups form a tree, and traceback information
is distributed in pieces over this tree. This is the source of a lot of the
complexity in the current PEP: for example, it's why EG's don't have one
obvious iteration semantics, and it's why once an exception is wrapped in
an EG, it can never be unwrapped again (because it would lose traceback
information).

The idea of the "flat" design is to instead store all the traceback info
directly on the leaf exceptions, so the EG itself can be just a pure
container holding a list of exceptions, that's it, with no nesting. The
downside is that it requires changes to the interpreter's code for updating
__traceback__ attributes, which is currently hard-coded to only update one
__traceback__ at a time.

For a third-party library like Trio, changing the interpreter is obviously
impossible, so we never considered it seriously. But in a PEP, changing the
interpreter is possible. And now I'm worried that we ruled out a better
solution early on for reasons that no longer apply. The more I think about
it, the more I suspect that flat EGs would end up being substantially
simpler all around? So I think we should at least think through what that
would look like (and Irit, I'd love your thoughts here now that you're the
expert on the CPython details!), and document an explicit decision one way
or another. (Maybe we should do a call or something to go over the details?
I'm trying to keep this email from ballooning out of control...)

== Smaller points ==

- In my original proposal, EGs didn't just hold a list of exceptions, but
also a list of "origins" for each exception. The idea being that if, say,
you attempt to connect to a host with an IPv4 address and an IPv6 address,
and they raised two different OSErrors that got bundled together into one
EG, then it would be nice to know which OSError came from which attempt. Or
in asyncio/trio it would be nice if tracebacks could show which task each
exception came from. It seems like this got dropped 

[Python-Dev] Re: PEP 654 -- Exception Groups and except* : request for feedback for SC submission

2021-02-26 Thread Nathaniel Smith
On Fri, Feb 26, 2021 at 5:05 AM Irit Katriel  wrote:
> I'm not sure it's safe to assume that it is necessarily a programming error, 
> and that the interpreter can essentially break the program in this case.
> Is this not allowed?
>
> try:
>     try:
>         obj.func()    # function that raises ExceptionGroups
>     except AttributeError:
>         logger.info("obj doesn't have a func")
> except *(AttributeError, SyntaxError):
>     logger.info("func had some problems")

I'd be fine with disallowing that. The intuition is that things will
be simplest if ExceptionGroup is kept as transparent and meaningless
as possible, i.e. ExceptionGroup(ValueError) and ValueError mean
exactly the same thing -- "some code inside this block raised
ValueError" -- and ideally should be processed in exactly the same
way. (Of course we can't quite achieve that due to backcompat issues,
but the closer we can get, the better, I think?)

If you need to distinguish between the AttributeError from
'obj.__getattr__("func")' vs the AttributeError from the call to
func(), then there's already an obvious way to do that, that works for
all functions, not just ones that happen to raise ExceptionGroups:

try:
    f = obj.func
except AttributeError:
    ...
try:
    f()
except ...:    # or except *...:
    ...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 654 -- Exception Groups and except* : request for feedback for SC submission

2021-02-25 Thread Nathaniel Smith
On Thu, Feb 25, 2021 at 10:23 PM Glenn Linderman  wrote:
> So then why do you need  except*  at all?  Only to catch unwrapped
> ExceptionGroup before it gets wrapped?
>
> So why not write except ExceptionGroup, and let it catch unwrapped
> ExceptionGroup?
>
> That "CUTE BIT" could be done only when hitting an except chain that
> doesn't include an except ExceptionGroup.
>
> Nope, I didn't read the PEP, and don't understand the motivation, but
> the discussion sure sounded confusing. This is starting to sound almost
> reasonable.

I'm not sure what to make of the complaint that not reading the
motivation makes the motivation confusing :-).

But very briefly: the core reason we need ExceptionGroup is because in
concurrent programs, multiple things can go wrong at the same time, so
our error handling system needs to have some way to represent and cope
with that. This means that in concurrent programs (e.g. anything using
asyncio, trio, etc.), it's safest to assume that *any* exception could
be wrapped in an ExceptionGroup and use except* for everything. The
question is how to make error handling in those programs as ergonomic
as Python exceptions are currently in non-concurrent programs, without
creating too many issues for existing code.

So in programs like this, you have to assume that 'except
ExceptionGroup' catches *all* exceptions, or worse, a
random/unpredictable subset. Saying that that would be good enough is
like saying that bare 'except:' is good enough for regular
non-concurrent Python (after all, you can always do an isinstance
check inside the 'except' block and re-raise the ones you don't want,
right?), and that 'except <SpecificError>' is a pointless frivolity. I
think most people would disagree with that :-).

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 654 -- Exception Groups and except* : request for feedback for SC submission

2021-02-25 Thread Nathaniel Smith
On Thu, Feb 25, 2021 at 2:13 PM Guido van Rossum  wrote:
>
> So is "fail-fast if you forget to handle an ExceptionGroup" really a feature? 
> (Do we call this out in the PEP?)
>
> We may believe that "except Exception" is an abuse, but it is too common to 
> dismiss out of hand. I think if some app has e.g. a main loop where they 
> repeatedly do something that may fail in many ways (e.g. handle a web 
> request), catch all errors and then just log the error and continue from the 
> top, it's a better experience if it logs "ExceptionGroup: <msg> [<list of subexceptions>]" than if it crashes.

Yeah, 'except Exception' happens a lot in the wild, and what to do
about that has been a major sticking point in the ExceptionGroup
debates all along. I wouldn't say that 'except Exception' is an abuse
even -- what do you want gunicorn to do if your buggy flask app raises
some random exception? Crash your entire web server, or log it and
attempt to keep going? (This is almost your example, but adding in the
part where gunicorn is reliable and well-respected, and that its whole
job is to invoke arbitrarily flaky code written by random users.)
Yury/I/others did discuss the idea of a
BaseExceptionGroup/ExceptionGroup split a lot, and I think the general
feeling is that it could potentially work, but feels like a
complicated and awkward hack, so no-one was super excited about it.
For a while we also had a compromise design where only
BaseExceptionGroup was built-in, but we left it non-final specifically
so asyncio could define an ExceptionsOnlyExceptionGroup.

Another somewhat-related awkward part of the API is how ExceptionGroup
and plain-old 'except' should interact *in general*. The intuition is
that if you have 'except ValueError' and you get an
'ExceptionGroup(ValueError)', then the user's code has some kind of
problem and we should probably do something? to let them know? One
idea I had was that we should raise a RuntimeError if this happens,
sort of similar to PEP 479. But I could never quite figure out how
this would help (gunicorn crashing with a RuntimeError isn't obviously
better than gunicorn crashing with an ExceptionGroup).

== NEW IDEA THAT MAYBE SOLVES BOTH PROBLEMS ==

Proposal:

- any time an unwinding ExceptionGroup encounters a traditional
try/except, then it gets replaced with a RuntimeError whose __cause__
is set to the original ExceptionGroup and whose first traceback entry
points to the offending try/except block

- CUTE BIT I ONLY JUST THOUGHT OF: this substitution happens right
*before* we start evaluating 'except' clauses for this try/except

So for example:

If an ExceptionGroup hits an 'except Exception': The ExceptionGroup is
replaced by a RuntimeError. RuntimeError is an Exception, so the
'except Exception' clause catches it. And presumably logs it or
something. This way your log contains both a notification that you
might want to switch to except* (from the RuntimeError), *along with*
the full original exception details (from the __cause__ attribute). If
it was an ExceptionGroup(KeyboardInterrupt), then it still gets caught
and that's not so great, but at least you get the RuntimeError to
point out that something has gone wrong and tell you where?
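
As a rough model of the substitution step itself (nothing like this
exists in any Python version; it just restates the proposal as code):

def degrade_to_runtime_error(eg, block_location):
    # Before a traditional try/except starts matching clauses, the
    # in-flight group would be swapped for a RuntimeError chained to it.
    err = RuntimeError("ExceptionGroup reached a non-star try/except "
                       "at " + block_location)
    err.__cause__ = eg
    return err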

If an ExceptionGroup(ValueError) hits an 'except ValueError': it
doesn't get caught, *but* a RuntimeError keeps propagating out to tell
you you have a problem. And when that RuntimeError eventually hits the
top of your program or ends up in your webserver logs or whatever,
then the RuntimeError's traceback will point you to the 'except
ValueError' that needs to be fixed.

If you write 'except ExceptionGroup': this clause is a no-op that will
never execute, because it's impossible to still have an ExceptionGroup
when we start matching 'except' clauses. (We could additionally emit a
diagnostic if we want.)

If you write bare 'except:', or 'except BaseException': the clause
always executes (as before), but they get the RuntimeError instead of
the ExceptionGroup. If you really *wanted* the ExceptionGroup, you can
retrieve it from the __cause__ attribute. (The only case I can think
of where this would be useful is if you're writing code that has to
straddle both old and new Python versions *and* wants to do something
clever with ExceptionGroups. I think this would happen if you're
implementing Trio, or implementing a higher-level backport library for
catching ExceptionGroups, something like that. So this only applies to
like half a dozen users total, but they are important users :-).)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Why aren't we allowing the use of C11?

2021-01-28 Thread Nathaniel Smith
On Thu, Jan 28, 2021 at 9:03 PM Gregory P. Smith  wrote:
>
> On Thu, Jan 28, 2021 at 10:52 AM Charalampos Stratakis  
> wrote:
>>
>>
>>
>> - Original Message -
>> > From: "Mark Shannon" 
>> > To: "Python Dev" 
>> > Sent: Thursday, January 28, 2021 5:26:37 PM
>> > Subject: [Python-Dev] Why aren't we allowing the use of C11?
>> >
>> > Hi everyone,
>> >
>> > PEP 7 says that C code should conform to C89 with a subset of C99 allowed.
>> > It's 2021 and all the major compilers support C11 (ignoring the optional
>> > parts).
>> >
>> > C11 has support for thread locals, static asserts, and anonymous structs
>> > and unions. All useful features.
>> >
>> > Is there a good reason not to start using C11 now?
>> >
>> > Cheers,
>> > Mark.
>> >
>> >
>>
>> Depends what platforms the python core developers are willing to support.
>>
>> Currently downstream on e.g. RHEL7 we compile versions of CPython under gcc 
>> 4.8.2 which does not support C11.
>>
>> In addition the manylinux2014 base image is also based on CentOS 7, which 
>> wouldn't support C11 as well.
>
>
> I suspect this is the primary technical reason not to adopt C11 left.
>
> But aren't things like manylinux2014 defined by the contents of a centrally 
> maintained docker container?
> If so (I'm not one who knows how wrong my guess likely is...), can we get 
> those updated to include a more modern compiler so we can move on sooner than 
> the deprecation of manylinux2014?

RedHat maintains builds of gcc 8.2.1 for CentOS/RHEL 7, that have some
clever hacks to guarantee that the resulting binaries will work on
CentOS/RHEL 7: https://www.softwarecollections.org/en/scls/rhscl/devtoolset-8/

I'm pretty sure that's what the manylinux2014 image is using.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Speeding up CPython

2020-10-22 Thread Nathaniel Smith
Hi Mark,

This sounds really cool. Can you give us more details? Some questions that
occurred to me while reading:

- You're suggesting that the contractor would only be paid if the desired
50% speedup is achieved, so I guess we'd need some objective Python
benchmark that boils down to a single speedup number. Did you have
something in mind for this?

- How much of the work has already been completed?

- Do you have any preliminary results of applying that work to that
benchmark? Even if it's preliminary, it would still help a lot in making
the case for this being a realistic plan.

-n

On Tue, Oct 20, 2020 at 6:00 AM Mark Shannon  wrote:

> Hi everyone,
>
> CPython is slow. We all know that, yet little is done to fix it.
>
> I'd like to change that.
> I have a plan to speed up CPython by a factor of five over the next few
> years. But it needs funding.
>
> I am aware that there have been several promised speed ups in the past
> that have failed. You might wonder why this is different.
>
> Here are three reasons:
> 1. I already have working code for the first stage.
> 2. I'm not promising a silver bullet. I recognize that this is a
> substantial amount of work and needs funding.
> 3. I have extensive experience in VM implementation, not to mention a
> PhD in the subject.
>
> My ideas for possible funding, as well as the actual plan of
> development, can be found here:
>
> https://github.com/markshannon/faster-cpython
>
> I'd love to hear your thoughts on this.
>
> Cheers,
> Mark.


-- 
Nathaniel J. Smith -- https://vorpus.org 


[Python-Dev] Re: PEP 525: Non-empty return statement

2020-08-24 Thread Nathaniel Smith
It was decided to leave out 'yield from' support for async generators,
at least for now, due to the implementation complexity. And non-empty
returns in generators are only intended for use with 'yield from', so
they got left out as well.
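
A minimal illustration of the asymmetry:

def gen():
    yield 1
    return "done"  # fine: becomes StopIteration("done").value

g = gen()
next(g)
try:
    next(g)
except StopIteration as e:
    assert e.value == "done"

# But the async equivalent is rejected at compile time:
#
#   async def agen():
#       yield 1
#       return "done"  # SyntaxError: 'return' with value in async generator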

On Mon, Aug 24, 2020 at 4:48 PM Paul Bryan  wrote:
>
> Per PEP 525:
>
> It is a SyntaxError to have a non-empty return statement in an asynchronous 
> generator.
>
>
> Synchronous generators can return values, which are included in the StopIteration 
> exception. Why would a return value in an asynchronous generator not do the 
> same in the StopAsyncIteration?
>



-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: [Python-ideas] Re: Amend PEP-8 to require clear, understandable comments instead of Strunk & White Standard English comments

2020-06-29 Thread Nathaniel Smith
On Mon, Jun 29, 2020 at 5:04 AM Paul Sokolovsky  wrote:
>
> Hello,
>
> On Mon, 29 Jun 2020 14:35:08 +0300
> "Jim F.Hilliard"  wrote:
>
> > I believe I'm not the only one with this question, but how is Strunk &
> > White connected with white supremacy?
>
> I wouldn't be surprised if the only connection between them is the word
> "white".

It's not Strunk and White per se, it's the idea of enforcing "standard
English", where "standard" here means "talks like a American with an
Ivy league education".

You all are displaying breathtaking levels of ignorance here.
There's nothing wrong with being ignorant – we can't be experts in
everything, and your education probably didn't spend a lot of time
talking about the long history of language "standards" and the many
ways they've been used, intentionally, systematically, and violently
to enforce racist/classist/etc. policies. But using a thread on
python-dev to make clueless speculations like this is *incredibly*
inappropriate and offensive.

I'm not going to try to educate you on that history – it's completely
off-topic for this list, and you can do your own work if you care to.
But let's let this thread die here.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: [Python-ideas] Re: Amend PEP-8 to require clear, understandable comments instead of Strunk & White Standard English comments

2020-06-29 Thread Nathaniel Smith
On Mon, Jun 29, 2020 at 2:31 AM Steve Holden  wrote:
> The commit message used, however, reveals implementation details of the 
> change which are irrelevant to the stated aim, which is making the 
> documentation clear and concise. Use of such language is certainly 
> regrettable, since it carries with it the implication that the Python 
> developer community has somehow been wilfully sanctioning "relics of white 
> supremacy" up until the change was made.
>
> There certainly is a place in tech for politics, as I have argued many times, 
> and I am sure nobody wishes to continue to use language that might be 
> offensive to readers. But I would suggest that the politics can safely be 
> omitted from commit messages, since they can only properly be fully addressed 
> in the conversation about the PR in advance. The wording of the commit 
> message has the appearance (probably specious) of wanting to rub former 
> misdeeds in the face of a largely innocent community, and that is the 
> principal reason I found it distasteful and unnecessary.

I just re-read the commit message, and I think you're being
oversensitive and imagining things that aren't there. The actual
commit message is written in a straightforward and factual way, and
spends special effort on *absolving* the community of this kind of
guilt. In particular, it emphasizes that the new text is accomplishing
"the same goal", "maintaining the original intent", and describes the
old text as a "relic", which is another way of saying that the
problems were only there by historical accident, rather than by anyone
intentionally keeping it there. Merely mentioning the concept of white
supremacy is not an attack on you or the community [1].

-n

[1] https://en.wikipedia.org/wiki/White_defensiveness

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Can we stop adding to the C API, please?

2020-06-03 Thread Nathaniel Smith
On Wed, Jun 3, 2020 at 2:10 PM Victor Stinner  wrote:
> For the short term, my plan is to make structure opaque in the limited
> C API, before breaking more stuff in the public C API :-)

But you're also breaking the public C API:
https://github.com/MagicStack/immutables/issues/46
https://github.com/pycurl/pycurl/pull/636

I'm not saying you're wrong to do so, I'm just confused about whether
your plan is to break stuff or not and on which timescale.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Map errno==ETIME to TimeoutError

2020-05-25 Thread Nathaniel Smith
On Mon, May 25, 2020 at 1:25 AM Serhiy Storchaka  wrote:
>
> On 24.05.20 17:48, Eric V. Smith wrote:
> > Does anyone have an opinion on https://bugs.python.org/issue39673? It
> > maps ETIME to TimeoutError, in addition to the already existing ETIMEDOUT.
> >
> > http://man7.org/linux/man-pages/man3/errno.3.html says:
> >
> > *ETIME* Timer expired (POSIX.1 (XSI STREAMS option)).
> > (POSIX.1 says "STREAM ioctl(2) timeout".)
> >
> > *ETIMEDOUT* Connection timed out (POSIX.1-2001).
> >
> >
> > It seems like a reasonable change to me, but I'm not a subject matter
> > expert on STREAMS, or what other affect this might have.
>
> Why it was not mapped at first place? Was there any discussion?

AFAICT from a few minutes of searching, ETIME is almost never used,
which probably explains it. It doesn't show up in glibc at all, and
only a few times in the Linux kernel sources, most notably in the
graphics subsystem -- and apparently this causes some annoyance for
the *BSDs, which share a bunch of that code and don't have ETIME, so
they #define ETIME ETIMEDOUT to get the code to build.

I'm not sure there's any point in making the change – the BPO doesn't
even have an example of it, just someone who was poking around in
obscure corners of errno and noticed it – but it seems harmless. It
sounds like literally no-one knows what the difference between these
is supposed to be.
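
For what it's worth, the existing mapping is easy to see from Python,
since OSError picks its subclass from the errno (the second snippet is
the proposed behavior, not today's):

import errno

exc = OSError(errno.ETIMEDOUT, "connection timed out")
assert type(exc) is TimeoutError  # ETIMEDOUT is mapped today

if hasattr(errno, "ETIME"):
    exc2 = OSError(errno.ETIME, "timer expired")
    # today this is a plain OSError; under bpo-39673 it would become
    # a TimeoutError too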

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-08 Thread Nathaniel Smith
On Fri, May 8, 2020 at 12:30 AM Sebastian Krause  wrote:
>
> Guido van Rossum  wrote:
> > Is there some kind of optimized communication possible yet between
> > subinterpreters? (Otherwise I still worry that it's no better than
> > subprocesses -- and it could be worse because when one
> > subinterpreter experiences a hard crash or runs out of memory, all
> > others have to die with it.)
>
> The use case that I have in mind with subinterpreters is
> Windows. With its lack of fork() and the way it spawns a fresh
> interpreter process it always feels a bit weird to use
> multiprocessing on Windows. Would it be faster and/or cleaner to
> start a new in-process subinterpreter instead?

Subinterpreters don't support fork() either -- they can't share any
objects, so each one has to start from a blank slate and go through
the Python startup sequence, re-import all modules from scratch, etc.
Subinterpreters do get to skip the OS process spawn overhead, but most
of the startup costs are the same.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Nathaniel Smith
On Wed, May 6, 2020 at 2:34 PM Paul Ganssle  wrote:
> I think I tried something similar for tests that involved an environment 
> variable and found that it doesn't play nicely with coverage.py at all.

This is a solvable problem:
https://coverage.readthedocs.io/en/coverage-5.1/subprocess.html
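
The documented recipe boils down to a two-line sitecustomize.py, plus
'parallel = true' in the config file and the COVERAGE_PROCESS_START
environment variable pointing at that config:

# sitecustomize.py
import coverage
coverage.process_startup()  # no-op unless COVERAGE_PROCESS_START is set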

But yeah, convincing your test framework to jump through the necessary
hoops might be tricky. (Last time I did this I was using pytest-cov,
which automatically takes care of all the details, so I'm not sure how
tough it is.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Nathaniel Smith
On Wed, May 6, 2020 at 7:52 AM Paul Ganssle  wrote:
>
> As part of PEP 399, an idiom for testing both C and pure Python versions of a 
> library is suggested, making use of import_fresh_module.
>
> Unfortunately, I'm finding that this is not amazingly robust. We have this 
> issue: https://bugs.python.org/issue40058, where the tester for datetime 
> needs to do some funky manipulations to the state of sys.modules for reasons 
> that are now somewhat unclear, and still sys.modules is apparently left in a 
> bad state.
>
> When implementing PEP 615, I ran into similar issues and found it very 
> difficult to get two independent instances of the same module – one with the 
> C extension blocked and one with it intact. I ended up manually importing the 
> C and Python extensions and grafting them onto two "fresh" imports with 
> nothing blocked.

When I've had to deal with similar issues in the past, I've given up
on messing with sys.modules and just had one test spawn a subprocess
to do the import+run the actual tests. It's a big hammer, but the nice
thing about big hammers is that there's no subtle issues, either they
smash the thing or they don't.
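
A minimal sketch of the hammer, using heapq as a stand-in for any
accelerated module:

import subprocess, sys, textwrap

code = textwrap.dedent("""
    import sys
    sys.modules['_heapq'] = None  # block the C accelerator
    import heapq                  # falls back to the pure Python code
    assert heapq.heappush.__module__ == 'heapq'
""")
subprocess.run([sys.executable, "-c", code], check=True)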

But, I don't know how awkward that would be to fit into Python's
unittest system, if you have lots of tests you need to run this way.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Nathaniel Smith
On Wed, May 6, 2020 at 10:03 AM Antoine Pitrou  wrote:
>
> On Tue, 5 May 2020 18:59:34 -0700
> Nathaniel Smith  wrote:
> > On Tue, May 5, 2020 at 3:47 PM Guido van Rossum  wrote:
> > >
> > > This sounds like a significant milestone!
> > >
> > > Is there some kind of optimized communication possible yet between 
> > > subinterpreters? (Otherwise I still worry that it's no better than 
> > > subprocesses -- and it could be worse because when one subinterpreter 
> > > experiences a hard crash or runs out of memory, all others have to die 
> > > with it.)
> >
> > As far as I understand it, the subinterpreter folks have given up on
> > optimized passing of objects, and are only hoping to do optimized
> > (zero-copy) passing of raw memory buffers.
>
> Which would be useful already, especially with pickle out-of-band
> buffers.

Sure, zero cost is always better than some cost, I'm not denying that
:-). What I'm trying to understand is whether the difference is
meaningful enough to justify subinterpreters' increased complexity,
fragility, and ecosystem breakage.

If your data is in large raw memory buffers to start with (like numpy
arrays or arrow dataframes), then yeah, serialization costs are
smaller proportion of IPC costs. And out-of-band buffers are an
elegant way of letting pickle users take advantage of that speedup
while still using the familiar pickle API. Thanks for writing that PEP
:-).

But when you're in the regime where you're working with large raw
memory buffers, then that's also the regime where inter-process
shared-memory becomes really efficient. Hence projects like Ray/Plasma
[1], which exist today, and even work for sharing data across
languages and across multi-machine clusters. And the pickle
out-of-band buffer API is general enough to work with shared memory
too.

And even if you can't quite manage zero-copy, and have to settle for
one-copy... optimized raw data copying is just *really fast*, similar
to memory access speeds. And CPU-bound, big-data-crunching apps are by
definition going to access that memory and do stuff with it that's
much more expensive than a single memcpy. So I still have trouble
figuring out how skipping a single memcpy will make subinterpreters
significantly faster that subprocesses in any real-world scenario.

-n

[1]
https://arrow.apache.org/blog/2017/08/08/plasma-in-memory-object-store/
https://github.com/ray-project/ray

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-06 Thread Nathaniel Smith
On Wed, May 6, 2020 at 5:41 AM Victor Stinner  wrote:
>
> Hi Nathaniel,
>
> Le mer. 6 mai 2020 à 04:00, Nathaniel Smith  a écrit :
> > As far as I understand it, the subinterpreter folks have given up on
> > optimized passing of objects, and are only hoping to do optimized
> > (zero-copy) passing of raw memory buffers.
>
> I think that you misunderstood the PEP 554. It's a bare minimum API,
> and the idea is to *extend* it later to have an efficient
> implementation of "shared objects".

No, I get this part :-)

> IMO it should be easy to share *data* (object "content") between
> subinterpreters, but each interpreter should have its own PyObject
> which exposes the data at the Python level. Think of the PyObject as a
> proxy to the data.

So when you say "shared object" you mean that you're sharing a raw
memory buffer, and then you're writing a Python object that stores its
data inside that memory buffer instead of inside its __dict__:

import struct  # needed by the sketch below

class MySharedObject:
    def __init__(self, shared_memview, shared_lock):
        self._shared_memview = shared_memview
        self._shared_lock = shared_lock

    @property
    def my_attr(self):
        with self._shared_lock:
            return struct.unpack_from(MY_ATTR_FORMAT,
                                      self._shared_memview, MY_ATTR_OFFSET)[0]

    @my_attr.setter
    def my_attr(self, new_value):
        with self._shared_lock:
            struct.pack_into(MY_ATTR_FORMAT, self._shared_memview,
                             MY_ATTR_OFFSET, new_value)

This is an interesting idea, but I think when most people say "sharing
objects between subinterpreters", they mean being able to pass some
pre-existing object between subinterpreters cheaply, while this
requires defining custom objects with custom locking. So we should
probably use different terms for them to avoid confusion :-).

It's also true that this angle isn't considered in
the post you're responding to: there I was focusing on copying objects, not
sharing objects on an ongoing basis. You can't implement this kind of
"shared object" using a pipe/socket, because those create two
independent copies of the data.

But... if this is what you want, you can do the exact same thing with
subprocesses too. OSes provide inter-process shared memory and
inter-process locks. 'MySharedObject' above would work exactly the
same. So I think the conclusion still holds: there aren't any plans to
make IPC between subinterpreters meaningfully faster than IPC between
subprocesses.
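
For example, the same sketch works across real processes with nothing
but the stdlib (3.8+), reusing the MySharedObject class from above:

from multiprocessing import Lock, shared_memory

shm = shared_memory.SharedMemory(create=True, size=64)
lock = Lock()
obj = MySharedObject(shm.buf, lock)  # shm.buf is already a memoryview
# A child process can attach with
# shared_memory.SharedMemory(name=shm.name) and build its own proxy.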

> I don't think that we have to reinvent the wheel. threading,
> multiprocessing and asyncio already designed such APIs. We should
> design similar APIs and even simply reuse code.

Or, we could simply *use* the code instead of using subinterpreters
:-). (Or write new and better code, I feel like there's a lot of room
for a modern 'multiprocessing' competitor.) The question I'm trying to
figure out is what advantage subinterpreters give us over these proven
technologies, and I'm still not seeing it.

> My hope is that "synchronization" (in general, locks in specific) will
> be more efficient in the same process, than synchronization between
> multiple processes.

Hmm, I would be surprised by that – the locks in modern OSes are
highly-optimized, and designed to work across subprocesses. For
example, on Linux, futexes work across processes. Have you done any
benchmarks?

Also btw, note that if you want to use async within your
subinterpreters, then that rules out a lot of tools like regular
locks, because they can't be integrated into an event loop. If your
subinterpreters are using async, then you pretty much *have* to use
full-fledged sockets or equivalent for synchronization.

> I would be interested to have a generic implementation of "remote
> object": a empty proxy object which forward all operations to a
> different interpreter. It will likely be inefficient, but it may be
> convenient for a start. If a method returns an object, a new proxy
> should be created. Simple scalar types like int and short strings may
> be serialized (copied).

How would this be different than
https://docs.python.org/3/library/multiprocessing.html#proxy-objects ?

How would you handle input arguments -- would those get proxied as well?

Also, does this mean the other subinterpreter has to be running an
event loop to process these incoming requests? Or is the idea that the
other subinterpreter would process these inside a traditional Python
thread, so users are exposed to all the classic shared-everything
locking issues?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workload

2020-05-05 Thread Nathaniel Smith
On Tue, May 5, 2020 at 3:47 PM Guido van Rossum  wrote:
>
> This sounds like a significant milestone!
>
> Is there some kind of optimized communication possible yet between 
> subinterpreters? (Otherwise I still worry that it's no better than 
> subprocesses -- and it could be worse because when one subinterpreter 
> experiences a hard crash or runs out of memory, all others have to die with 
> it.)

As far as I understand it, the subinterpreter folks have given up on
optimized passing of objects, and are only hoping to do optimized
(zero-copy) passing of raw memory buffers.

On my laptop, some rough measurements [1] suggest that simply piping
bytes between processes goes at ~2.8 gigabytes/second, and that
pickle/unpickle is ~10x slower than that. So that would suggest that
once subinterpreters are fully optimized, they might provide a maximum
~10% speedup vs multiprocessing, for a program that's doing nothing
except passing pickled objects back and forth. Of course, any real
program that's spawning parallel workers will presumably be designed
so its workers spend most of their time doing work on that data, not
just passing it back and forth. That makes a 10% speedup highly
unrealistic; in real-world programs it will be much smaller.
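
For the curious, the pickle half of that estimate is easy to reproduce
(absolute numbers are machine-dependent; the pipe half needs a second
process, so it's omitted here):

import pickle, time

data = [b"x" * 4096] * 256  # ~1 MB of payload per round trip
n = 200
t0 = time.perf_counter()
for _ in range(n):
    pickle.loads(pickle.dumps(data))
dt = time.perf_counter() - t0
mb = len(pickle.dumps(data)) * n / 1e6
print(f"pickle round-trip: {mb / dt:.0f} MB/s")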

So IIUC, subinterpreter communication is currently about the same
speed as multiprocessing communication, and the plan is to keep it
that way.

-n

[1] Of course there are a lot of assumptions in my quick
back-of-the-envelope calculation: pickle speed depends on the details
of the objects being pickled, there are other serialization formats,
there are other IPC methods that might be faster but are more
complicated (shared memory), the stdlib 'multiprocessing' library
might not be as good as it could be (the above measurements are for an
ideal multiprocessing library, I haven't tested the one we currently
have in the stdlib), etc. So maybe there's some situation where
subinterpreters look better. But I've been pointing out this issue to
Eric et al for years and they haven't disputed it, so I guess they
haven't found one yet.

-- 
Nathaniel J. Smith -- https://vorpus.org


[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

2020-04-28 Thread Nathaniel Smith
On Mon, Apr 20, 2020 at 6:21 PM Eric Snow  wrote:
>
> Nathaniel,
>
> Your tone and approach to this conversation concern me.  I appreciate
> that you have strong feelings here and readily recognize I have my own
> biases, but it's becoming increasingly hard to draw any constructive
> insight from what tend to be very longs posts from you.  It ends up
> being a large commitment of time for small gains.  And honestly, it's
> also becoming hard to not counter some of your more elaborate
> statements with my own unhelpful prose.  In the interest of making
> things better, please take it all down a notch or two.

I'm sorry it's landing that way on you. I am frustrated, and I think
that's a reasonable reaction. But I know we're all here because we
want to make Python better. So let me try again to explain my
position, to maybe reboot the conversation in a more productive way.

All engineering decisions come down to costs vs. benefits. My
frustration is about how you're approaching the costs, and how you're
approaching the benefits.

**Costs**

I think you've been downplaying the impact of subinterpreter support
on the existing extension ecosystem. All features have a cost, which
is why PEPs always require substantial rationales and undergo intense
scrutiny. But subinterpreters are especially expensive. Most features
only affect a small group of modules (e.g. async/await affected
twisted and tornado, but 99% of existing libraries didn't care); OTOH
subinterpreters require updates to every C extension module. And if we
start telling users that subinterpreters are a supported way to run
arbitrary Python code, then we've effectively limited extension
authors options to "update to support subinterpreters" or "explain to
users why they aren't writing a proper Python module", which is an
intense amount of pressure; for most features maintainers have the
option of saying "well, that isn't relevant to me", but with
subinterpreter support that option's been removed. (You object to my
calling this an API break, but you're literally saying that old code
that worked fine is being redefined to be incorrect, and that all
maintainers need to learn new techniques. That's the definition of an
API break!) And until everything is updated, you're creating a schism
in the ecosystem, between modules that support subinterpreters and
those that don't.

I did just read your reply to Sebastian, and it sounds like you're
starting to appreciate this impact more, which I'm glad to see.

None of this means that subinterpreters are necessarily a bad idea.
For example, the Python 2 -> Python 3 transition was very similar, in
terms of maintainers being forced to go along and creating a temporary
schism in the ecosystem, and that was justified by the deep, unfixable
problems with Python 2. But it does mean that subinterpreters need an
even stronger rationale than most features.

And IMO, the point where PEP 554 is accepted and we start adding new
public APIs for subinterpreters is the point where most of these costs
kick in, because that's when we start sending the message that this is
a real thing and start forcing third-party maintainers to update their
code. So that's when we need the rationale.

**Benefits**

In talks and informal conversations, you paint a beautiful picture of
all the wonderful things subinterpreters will do. Lots of people are
excited by these wonderful things. I tried really hard to be excited
too. (In fact I spent a few weeks trying to work out a
subinterpreter-style proposal myself way back before you started
working on this!) But the problem is, whenever I look more closely at
the exciting benefits, I end up convincing myself that they're a
mirage, and either they don't work at all (e.g. quickly sharing
arbitrary objects between interpreters), or else end up being
effectively a more complex, fragile version of things that already
exist.

I've been in lots of groups before where everyone (including me!) got
excited about a cool plan, focused exclusively on the positives, and
ignored critical flaws until it was too late. See also: "groupthink",
"confirmation bias", etc. The whole subinterpreter discussion feels
very familiar that way. I'm worried that that's what's happening.

Now, I might be right, or I might be wrong, I dunno; subinterpreters
are a complex topic. Generally the way we sort these things out is to
write down the arguments for and against and figure out the technical
merits. That's one of the purposes of writing a PEP. But: you've been
*systematically refusing to do this.* Every time I've raised a concern
about one rationale, then instead of discussing the technical
substance of my concern, you switch to a different rationale, or say
"oh well, that rationale isn't the important one right now". And the
actual text in PEP 554 is *super* vague, like it's so vague it's kind
of an insult to the PEP process.

From your responses in this thread, I think your core position now is
that the rationale is irrelevant, 

[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

2020-04-20 Thread Nathaniel Smith
On Mon, Apr 20, 2020 at 5:36 PM Edwin Zimmerman  wrote:
>
> On 4/20/2020 7:33 PM, Nathaniel Smith wrote:
> > On Mon, Apr 20, 2020 at 4:26 PM Edwin Zimmerman  
> > wrote:
> >> On 4/20/2020 6:30 PM, Nathaniel Smith wrote:
> >>> We already have robust support for threads for low-isolation and
> >>> subprocesses for high-isolation. Can you name some use cases where
> >>> neither of these are appropriate and you instead want an in-between
> >>> isolation – like subprocesses, but more fragile and with odd edge
> >>> cases where state leaks between them?
> >> I don't know if this has been mentioned before or not, but I'll bring it 
> >> up now: massively concurrent networking code on Windows.  Socket 
> >> connections could be passed off from the main interpreter to 
> >> sub-interpreters for concurrent processing that simply isn't possible with 
> >> the global GIL (provided the GIL actually becomes per-interpreter).  On 
> >> *nix you can fork, this would give CPython on Windows similar capabilities.
> > Both Windows and Unix have APIs for passing sockets between related or
> > unrelated processes -- no fork needed. On Windows, it's exposed as the
> > socket.share method:
> > https://docs.python.org/3/library/socket.html#socket.socket.share
> >
> > The APIs for managing and communicating between processes are
> > definitely not the most obvious or simplest to use, but they're very
> > mature and powerful, and it's a lot easier to wrap them up in a
> > high-level API than it is to effectively reimplement process
> > separation from scratch inside CPython.
> >
> > -n
> +1 on not being most obvious or simplest to use.  Not only that, but to use 
> it you have to write Windows-specific code.  PEP 554 would provide a uniform, 
> cross-platform capability that I would choose any day over a random pile of 
> os-specific hacks.

I mean, sure, if you've decided to build one piece of hypothetical
software well and another badly, then the good one will be better than
the bad one, but that doesn't really say much, does it?

In real life, I don't see how it's possible to get PEP 554's
implementation to the point where it works reliably and robustly –
i.e., I just don't think the promises the PEP makes can actually be
fulfilled. And even if you did, it would still be several orders of
magnitude easier to build a uniform, robust, cross-platform API on top
of tools like socket.share than it would be to force changes on every
C extension. PEP 554 is hugely expensive; you can afford a *lot* of
careful systems engineering while still coming in way under that
budget.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/L63XKYQDFVCOCNZC2VN27KFW2C3NTBKZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

2020-04-20 Thread Nathaniel Smith
On Mon, Apr 20, 2020 at 4:26 PM Edwin Zimmerman  wrote:
>
> On 4/20/2020 6:30 PM, Nathaniel Smith wrote:
> > We already have robust support for threads for low-isolation and
> > subprocesses for high-isolation. Can you name some use cases where
> > neither of these are appropriate and you instead want an in-between
> > isolation – like subprocesses, but more fragile and with odd edge
> > cases where state leaks between them?
> I don't know if this has been mentioned before or not, but I'll bring it up 
> now: massively concurrent networking code on Windows.  Socket connections 
> could be passed off from the main interpreter to sub-interpreters for 
> concurrent processing that simply isn't possible with the global GIL 
> (provided the GIL actually becomes per-interpreter).  On *nix you can fork, 
> this would give CPython on Windows similar capabilities.

Both Windows and Unix have APIs for passing sockets between related or
unrelated processes -- no fork needed. On Windows, it's exposed as the
socket.share method:
https://docs.python.org/3/library/socket.html#socket.socket.share

The APIs for managing and communicating between processes are
definitely not the most obvious or simplest to use, but they're very
mature and powerful, and it's a lot easier to wrap them up in a
high-level API than it is to effectively reimplement process
separation from scratch inside CPython.
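
For concreteness, here's roughly what the low-level flow looks like on
Windows -- a sketch only, with "worker.py", the port, and the pipe
plumbing as placeholders:

    import socket, subprocess, sys

    # parent: accept a connection, then hand it to a pre-spawned worker
    lsock = socket.create_server(("127.0.0.1", 8000))
    worker = subprocess.Popen([sys.executable, "worker.py"],
                              stdin=subprocess.PIPE)
    conn, _ = lsock.accept()
    worker.stdin.write(conn.share(worker.pid))  # serialize for that pid
    worker.stdin.close()
    conn.close()  # the worker now holds its own handle

    # worker.py:
    #     import socket, sys
    #     conn = socket.fromshare(sys.stdin.buffer.read())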

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KMS6JEGPB62STE4SE7YWGFALNFUE2LUX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

2020-04-20 Thread Nathaniel Smith
On Fri, Apr 17, 2020 at 3:57 PM Eric Snow  wrote:
>
> On Fri, Apr 17, 2020 at 2:59 PM Nathaniel Smith  wrote:
> > I think some perspective might be useful here :-).
> >
> > The last time we merged a new concurrency model in the stdlib, it was 
> > asyncio.
> >
> > [snip]
> >
> > OTOH, AFAICT the new concurrency model in PEP 554 has never actually
> > been used, and it isn't even clear whether it's useful at all.
>
> Perhaps I didn't word things quite right.  PEP 554 doesn't provide a
> new concurrency model so much as it provides functionality that could
> probably be used as the foundation for one.

That makes it worse, right? If I wrote a PEP saying "here's some
features that could possibly someday be used to make a new concurrency
model", that wouldn't make it past the first review.

> Ultimately the module
> proposed in the PEP does the following:
>
> * exposes the existing subinterpreters functionality almost as-is

So I think this is a place where we see things really differently.

I guess your perspective is, subinterpreters are already a CPython
feature, so we're not adding anything, and we don't really need to
talk about whether CPython should support subinterpreters.

But this simply isn't true. Yes, there's some APIs for subinterpreters
added back in the 1.x days, but they were never really thought
through, and have never actually worked. There are exactly 3 users,
and all have serious issues, and a strategy for avoiding
subinterpreters because of the brokenness. In practice, the existing
ecosystem of C extensions has never supported subinterpreters.

This is clearly not a great state of affairs – we should either
support them or not support them. Shipping a broken feature doesn't
help anyone. But the current status isn't terribly harmful, because
the general consensus across the ecosystem is that they don't work and
aren't used.

If we start exposing them in the stdlib and encouraging people to use
them, though, that's a *huge* change. Our users trust us. If we tell
them that subinterpreters are a real thing now, then they'll spend
lots of effort on trying to support them.

Since subinterpreters are confusing, and break the C API/ABI, this
means that every C extension author will have to spend a substantial
amount of time figuring out what subinterpreters are, how they work,
squinting at PEP 489, asking questions, auditing their code, etc. This
will take years, and in the mean time, users will expect
subinterpreters to work, be confused at why they break, yell at random
third-party maintainers, spend days trying to track down mysterious
problems that turn out to be caused by subinterpreters, etc. There
will be many many blog posts trying to explain subinterpreters and
understand when they're useful (if ever), arguments about whether to
support them. Twitter threads. Production experiments. If you consider
that we have thousands of existing C extensions and millions of users,
accepting PEP 554 means forcing people you don't know to collectively
spend many person-years on subinterpreters.

Random story time: NumPy deprecated some C APIs some years ago, a
little bit before I got involved. Unfortunately, it wasn't fully
thought through; the new APIs were a bit nicer-looking, but didn't
enable any new features, didn't provide any path to getting rid of the
old APIs, and in fact it turned out that there were some critical use
cases that still required the old API. So in practice, the deprecation
was never going anywhere; the old APIs work just as well and are never
going to get removed, so spending time migrating to the new APIs was,
unfortunately, a completely pointless waste of time that provided zero
value to anyone.

Nonetheless, our users trusted us, so lots and lots of projects spend
substantial effort on migrating to the new API: figuring out how it
worked, making PRs, reviewing them, writing shims to work across the
old and new API, having big discussions about how to make the new API
work with Cython, debating what to do about the cases where the new
APIs were inadequate, etc. None of this served any purpose: they just
did it because they trusted us, and we misled them. It's pretty
shameful, honestly. Everyone meant well, but in retrospect it was a
terrible betrayal of our users' trust.

Now, that only affected projects that were using the NumPy C API, and
even then, only developers who were diligent and trying to follow the
latest updates; there were no runtime warnings, nothing visible to
end-users, etc. Your proposal has something like 100x-1000x more
impact, because you want to make all C extensions in Python get
updated or at least audited, and projects that aren't updated will
produce mysterious crashes, incorrect output, or loud error messages
that cause users to come after the developers and demand fixes.

Now maybe that's worth it. I think on net the Py3 transition was worth
it, and 

[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

2020-04-17 Thread Nathaniel Smith
On Fri, Apr 17, 2020 at 11:50 AM Eric Snow  wrote:
> Dilemma
> 
>
> Many folks have conflated PEP 554 with having a per-interpreter GIL.
> In fact, I was careful to avoid any mention of parallelism or the GIL
> in the PEP.  Nonetheless some are expecting that when PEP 554 lands we
> will reach multi-core nirvana.
>
> While PEP 554 might be accepted and the implementation ready in time
> for 3.9, the separate effort toward a per-interpreter GIL is unlikely
> to be sufficiently done in time.  That will likely happen in the next
> couple months (for 3.10).
>
> So...would it be sufficiently problematic for users if we land PEP 554
> in 3.9 without per-interpreter GIL?
>
> Options
> 
>
> Here are the options as I see them (if the PEP is accepted in time for 3.9):
>
> 1. merge PEP 554 into 3.9 even if per-interpreter GIL doesn't get into
> 3.9 (they get parallelism for free in 3.10)
> 2. like 1, but mark the module as provisional until per-interpreter GIL lands
> 3. do not merge PEP 554 until per-interpreter GIL is merged
> 4. like 3, but publish a 3.9-only module to PyPI in the meantime

I think some perspective might be useful here :-).

The last time we merged a new concurrency model in the stdlib, it was asyncio.

In that case, the process went something like:

- We started with two extremely mature libraries (Twisted + Tornado)
with long histories of real-world use
- The asyncio designers (esp. Guido) did a very extensive analysis of
these libraries' design choices, spoke to the maintainers about what
they'd learned from hard experience, etc.
- Asyncio was initially shipped outside the stdlib to allow for
testing and experimentation, and at this stage it was used to build
non-trivial projects (e.g. the aiohttp project's first commits use
tulip, not asyncio)
- When it was eventually added to the stdlib, it was still marked
provisional for multiple python releases, and underwent substantial
and disruptive changes during this time
- Even today, the limitations imposed by the stdlib release cycle
still add substantial difficulty to maintaining asyncio

OTOH, AFAICT the new concurrency model in PEP 554 has never actually
been used, and it isn't even clear whether it's useful at all.
Designing useful concurrency models is *stupidly* hard. And on top of
that, it requires major reworks of the interpreter internals +
disrupts the existing C extension module ecosystem -- which is very
different from asyncio, where folks who didn't use it could just
ignore it.

So to me, it's kind of shocking that you'd even bring up the
possibility of merging PEP 554 as-is, without even a provisional
marker. And if it's possible for it to live on PyPI, then why would we
even consider putting it into the stdlib? Personally, I'm still
leaning towards thinking that the whole subinterpreter project is
fundamentally flawed, and that on net we'd be better off removing
support for them entirely. But that's a more complex and nuanced
question that I'm not 100% certain of, while the idea of merging it
for 3.9 seems like a glaringly obvious bad idea.

I know you want folks to consider PEP 554 on its own merits, ignoring
the GIL-splitting work, but let's be realistic: purely as a
concurrency framework, there's at least a dozen more
mature/featureful/compelling options in the stdlib and on PyPI, and as
an isolation mechanism, subinterpreters have been around for >20 years
and in that time they've found 3 users and no previous champions.
Obviously the GIL stuff is the only reason PEP 554 might be worth
accepting. Or if PEP 554 is really a good idea on its own merits,
purely as a new concurrency API, then why not build that concurrency
API on top of multiprocessing and put it on PyPI and let real users
try it out?

One more thought. Quoting from Poul-Henning Kamp's famous email at bikeshed.org:

> Parkinson shows how you can go in to the board of directors and
> get approval for building a multi-million or even billion dollar
> atomic power plant, but if you want to build a bike shed you will
> be tangled up in endless discussions.
>
> Parkinson explains that this is because an atomic plant is so vast,
> so expensive and so complicated that people cannot grasp it, and
> rather than try, they fall back on the assumption that somebody
> else checked all the details before it got this far.   Richard P.
> Feynman gives a couple of interesting, and very much to the point,
> examples relating to Los Alamos in his books.
>
> A bike shed on the other hand.  Anyone can build one of those over
> a weekend, and still have time to watch the game on TV.  So no
> matter how well prepared, no matter how reasonable you are with
> your proposal, somebody will seize the chance to show that he is
> doing his job, that he is paying attention, that he is *here*.

Normally, when people reference this story they focus on the bikeshed,
hence the term "bikeshedding". But for PEP 554, you're building a
nuclear power plant :-). The whole conglomeration of a new 

[Python-Dev] Re: Improvement to SimpleNamespace

2020-04-15 Thread Nathaniel Smith
On Wed, Apr 15, 2020 at 2:59 PM Ivan Pozdeev via Python-Dev
 wrote:
> "Glom syntax" still excludes the delimiter, whatever it is, from use in keys. 
> So it's still a further limitation compared to the JSON spec.

Glom does let you be specific about the exact lookup keys if you want,
to handle keys that contain embedded periods, or non-string keys. The
syntax looks like:

from glom import glom, Path
glom(obj, Path("a", "b.c", 2))

https://glom.readthedocs.io/en/latest/api.html#specifier-types

For a simple case like this it's a bit wordier than obj["a"]["b.c"][2],
but OTOH you get better error message on failed lookups,
null-coalescing support by using default=, etc.
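
For example -- a quick sketch, relying on glom's documented default=
handling of failed lookups:

    from glom import glom, Path

    data = {"a": {"b.c": [10, 20, 30]}}
    glom(data, Path("a", "b.c", 2))                 # -> 30
    glom(data, Path("a", "b.c", 99), default=None)  # -> None, no exception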

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/32Q7H5LLES3C2WUIYWTPICXGWWWP4UQU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Improvement to SimpleNamespace

2020-04-14 Thread Nathaniel Smith
On Tue, Apr 14, 2020 at 9:26 PM David Mertz  wrote:
>
> I've written AttributeDict a fair number of times. Each time I write it from 
> scratch, which is only a few lines. And I only make a silly error about 50% of 
> the time when I do so.

I've also written it a number of times, and never found a way to do it
that I was really happy with. (In particular, converting all sub-dicts
into AttributeDict is necessary to support a.b.c-style access, but if
c is itself a dict, then you end up leaking AttributeDict objects into
other parts of the code that might just be expecting a regular dict.)
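
To make that concrete, a typical minimal version looks something like
this (my own sketch, not anyone's canonical implementation):

    class AttributeDict(dict):
        def __getattr__(self, name):
            try:
                value = self[name]
            except KeyError:
                raise AttributeError(name) from None
            # wrapping nested dicts is what makes a.b.c work -- and also
            # what lets AttributeDicts escape into code expecting dicts
            return AttributeDict(value) if isinstance(value, dict) else value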

These days I've given up on that approach and use Mahmoud's 'glom'
library instead: https://glom.readthedocs.io/

It has a ton of super-fancy features, but mostly I ignore those and
just write stuff like 'glom(json_obj, "a.b.c")' or maybe
'glom(json_obj, "a.b.c", default=None)'.

Like anything there are probably trade-offs and situations where
something like AttributeDict is better, but figured I'd throw that out
there as another option to consider.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q6U3AG3QRDBZU4RSV77CSYNJ62WJXYWY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 617: New PEG parser for CPython

2020-04-02 Thread Nathaniel Smith
On Thu, Apr 2, 2020 at 2:48 PM Pablo Galindo Salgado
 wrote:
>
> > About the migration, can I ask who is going to (help to) fix projects
> which rely on the AST?
>
> I think you misunderstood: The AST is exactly the same as the old and the new 
> parser. The only
> thing that the new parser does is not generate an intermediate CST (Concrete 
> Syntax Tree) and that
> is only half-exposed in the parser module.

If the AST is supposed to be the same, then would it make sense to
temporarily – maybe just during the alpha/beta period – always run
*both* parsers and confirm that they match?
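
Something like the following, say, run over the stdlib in CI --
assuming the old parser stays reachable behind a switch (IIRC the plan
is a "-X oldparser" flag, so treat that spelling as tentative):

    import subprocess, sys

    def dump_tree(path, flags=()):
        prog = ("import ast, sys; "
                "print(ast.dump(ast.parse(open(sys.argv[1]).read())))")
        res = subprocess.run([sys.executable, *flags, "-c", prog, path],
                             capture_output=True, text=True, check=True)
        return res.stdout

    assert dump_tree("some_module.py") == \
           dump_tree("some_module.py", ("-X", "oldparser"))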

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IRQUITYBQFJYUFQRKTVDXUBX4X42ARMP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Nathaniel Smith
On Sat, Mar 21, 2020 at 11:35 AM Steven D'Aprano  wrote:
>
> On Fri, Mar 20, 2020 at 06:18:20PM -0700, Nathaniel Smith wrote:
> > On Fri, Mar 20, 2020 at 11:54 AM Dennis Sweeney
> >  wrote:
> > > This is a proposal to add two new methods, ``cutprefix`` and
> > > ``cutsuffix``, to the APIs of Python's various string objects.
> >
> > The names should use "start" and "end" instead of "prefix" and
> > "suffix", to reduce the jargon factor
>
> Prefix and suffix aren't jargon. They teach those words to kids in
> primary school.

Whereas they don't have to teach "start" and "end", because kids
already know them before they start school.

> Why the concern over "jargon"? We happily talk about exception,
> metaclass, thread, process, CPU, gigabyte, async, ethernet, socket,
> hexadecimal, iterator, class, instance, HTTP, boolean, etc without
> blinking, but you're shying at prefix and suffix?

Yeah. Jargon is fine when there's no regular word with appropriate
precision, but we shouldn't use jargon just for jargon's sake. Python
has a long tradition of preferring regular words when possible, e.g.
using not/and/or instead of !/&&/||, and startswith/endswith instead
of hasprefix/hassuffix.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SMZB6KII42ZSLOFJGDMFRXXPM72UGQ3D/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Proliferation of tstate arguments.

2020-03-21 Thread Nathaniel Smith
On Fri, Mar 20, 2020 at 11:27 AM Victor Stinner  wrote:
> I would prefer to continue to experiment passing tstate explicitly in
> internal C APIs until most blocker issues will be fixed. Once early
> work on running two subinterpreters in parallel will start working
> (one "GIL" per interpreter), I will be more open to reconsider using a
> TLS variable.

The PEP for parallel subinterpreters hasn't been accepted yet either, right?

> "Inefficient signal handling in multithreaded applications"
> https://bugs.python.org/issue40010

CPython's current signal handling architecture basically assumes that
signals are always delivered to the main thread. (Fortunately, on real
systems, this is almost always true.) In particular, it assumes that
if a syscall arrives while the main thread is blocked in a
long-running syscall, then the syscall will be interrupted, which is
only true when the signal is delivered to the main thread. AFAICT if
we really care about off-main-thread signals, then the only way to
handle them properly is for the signal handler to detect when they
happen, and redeliver the signal to the main thread using
pthread_kill, and then let the main thread set its own eval_breaker
etc.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FTVIAXHDHUNQWLBZQ4YIQXTFFDZ762GL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-20 Thread Nathaniel Smith
On Fri, Mar 20, 2020 at 11:54 AM Dennis Sweeney
 wrote:
> This is a proposal to add two new methods, ``cutprefix`` and
> ``cutsuffix``, to the APIs of Python's various string objects.

The names should use "start" and "end" instead of "prefix" and
"suffix", to reduce the jargon factor and for consistency with
startswith/endswith.
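
Whatever names win, the proposed semantics are simple enough to sketch
in pure Python:

    def cutprefix(s, prefix):
        return s[len(prefix):] if s.startswith(prefix) else s

    def cutsuffix(s, suffix):
        # guard the empty suffix: s[:-0] would wrongly return ""
        return s[:-len(suffix)] if suffix and s.endswith(suffix) else s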

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/R4ND2KANMLS74AVKHUJ5BI5JM5QW5IC2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please be more precise when commenting on PEP 611.

2019-12-09 Thread Nathaniel Smith
> > On 09/12/2019 2:15 pm, Chris Angelico wrote:
> You: "We should limit things. Stuff will be faster."
> Others: "Really? Because bit masking is work. It'll be slower."
> You: "Maybe we limit it somewhere else, whatever. It'll be faster."
> Others: "Will it? How much faster?"
> You: "It'll be faster."

Mark, possibly you want to re-frame the PEP to be more like "this is
good for correctness and enabling robust reasoning about the
interpreter, which has a variety of benefits (and possibly speed will
be one of them eventually)"? My impression is that you see speedups as
a secondary motivation, while other people are getting the impression
that speedups are the entire motivation, so one way or the other the
text is confusing people.

In particular, right now the most detailed example is the compacted
object header bit, which makes it a magnet for critique. Also, I don't
understand how this idea would work at all :-). So I'd either remove
it or else make it more detailed, one or the other.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5NQWKURB45J5NIZWD5R7GDTEDAGY7U7S/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP proposal to limit various aspects of a Python program to one million.

2019-12-06 Thread Nathaniel Smith
On Thu, Dec 5, 2019 at 5:38 AM Mark Shannon  wrote:
>  From my limited googling, linux has a hard limit of about 600k file
> descriptors across all processes. So, 1M is well past any reasonable
> per-process limit. My impression is that the limits are lower on
> Windows, is that right?

Linux does limit the total number of file descriptors across all
processes, but the limit is configurable at runtime. 600k is the
default limit, but you can always make it larger (and people do).
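
You can see both knobs from Python (a Linux-only illustration):

    import resource

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)  # per-process
    print("per-process:", soft, hard)
    with open("/proc/sys/fs/file-max") as f:                 # system-wide
        print("system-wide:", int(f.read()))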

In my limited experimentation with Windows, it doesn't seem to impose
any a priori limit on how many sockets you can have open. When I wrote
a simple process that opens as many sockets as it can in a loop, I
didn't get any error; eventually the machine just locked up. (I guess
this is another example of why it can be better to have explicit
limits!)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5Z3CQQK6QDH3L466BIF7HAGCRV5SXBNW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP proposal to limit various aspects of a Python program to one million.

2019-12-03 Thread Nathaniel Smith
On Tue, Dec 3, 2019 at 8:20 AM Mark Shannon  wrote:
> The Python language does not specify limits for many of its features.
> Not having any limit to these values seems to enhance programmer freedom,
> at least superficially, but in practice the CPython VM and other Python
> virtual
> machines have implicit limits or are forced to assume that the limits are
> astronomical, which is expensive.

The basic idea makes sense to me. Well-defined limits that can be
supported properly are better than vague limits that are supported by
wishful thinking.

> This PR lists a number of features which are to have a limit of one
> million.
> If a language feature is not listed but appears unlimited and must be
> finite,
> for physical reasons if no other, then a limit of one million should be
> assumed.

This language is probably too broad... for example, there's certainly
a limit on how many objects can be alive at the same time due to the
physical limits of memory, but that limit is way higher than a
million.

> This PR proposes that the following language features and runtime values
> be limited to one million.
>
> * The number of source code lines in a module
> * The number of bytecode instructions in a code object.
> * The sum of local variables and stack usage for a code object.
> * The number of distinct names in a code object
> * The number of constants in a code object.

These are all attributes of source files, so sure, a million is
plenty, and the interpreter spends a ton of time manipulating tables
of these things.

> * The number of classes in a running interpreter.

This one isn't as obvious to me... classes are basically just objects
of type 'type', and there is definitely code out there that creates
classes dynamically. A million still seems like a lot, and I'm not
saying I'd *recommend* a design that involves creating millions of
different type objects, but it might exist already.

> * The number of live coroutines in a running interpreter.

I don't get this one. I can't think of any motivation (the
interpreter doesn't track live coroutines differently from any other
object), and the limit seems dangerously low. A million coroutines
only requires a few gigabytes of RAM, and there are definitely people
who run single process systems with >1e6 concurrent tasks (random
example: https://goroutines.com/10m)

I don't know if there's anyone doing this in Python right now, due to
Python's performance limitations, but it's nowhere near as silly as a
function with a million local variables.
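
A crude back-of-envelope, if you want to check the RAM claim yourself
(a sketch only; the per-task number varies by CPython version):

    import asyncio, tracemalloc

    async def main(n=100_000):
        tracemalloc.start()
        ev = asyncio.Event()
        tasks = [asyncio.ensure_future(ev.wait()) for _ in range(n)]
        await asyncio.sleep(0.1)  # let every task reach its suspension point
        size, _ = tracemalloc.get_traced_memory()
        print(f"~{size / n:.0f} bytes per idle task")  # x 1e6 for the total
        ev.set()
        await asyncio.gather(*tasks)

    asyncio.run(main())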

> Total number of classes in a running interpreter
> 
>
> This limit has to the potential to reduce the size of object headers
> considerably.
>
> Currently objects have a two word header, for objects without references
> (int, float, str, etc.) or a four word header for objects with references.
> By reducing the maximum number of classes, the space for the class reference
> can be reduced from 64 bits to fewer than 32 bits allowing a much more
> compact header.
>
> For example, a super-compact header format might look like this:
>
> .. code-block::
>
>  struct header {
>  uint32_t gc_flags:6; /* Needs finalisation, might be part of a
> cycle, etc. */
>  uint32_t class_id:26; /* Can be efficiently mapped to address
> by ensuring suitable alignment of classes */
>  uint32_t refcount; /* Limited memory or saturating */
>  }
>
> This format would reduce the size of a Python object without slots, on a
> 64 bit machine, from 40 to 16 bytes.

In this example, I can't figure out how you'd map your 26 bit class_id
to a class object. On a 32-bit system it would be fine, you just need
64 byte alignment, but you're talking about 64-bit systems, so... I
know you aren't suggesting classes should have 2**(64 - 26) =
~3x10**11 byte alignment :-)
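
The arithmetic, for anyone following along:

    >>> 2 ** (64 - 26)   # alignment needed per class on a 64-bit machine
    274877906944
    >>> 2 ** (32 - 26)   # vs. a perfectly sane 64 bytes on 32-bit
    64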

-n

--
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TL62SYQ6DGCCLRTIGMCUFAT5UEWMB7KN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Exposing Tools/parser/unparse.py in the stdlib?

2019-11-19 Thread Nathaniel Smith
On Mon, Nov 18, 2019 at 4:41 PM Pablo Galindo Salgado
 wrote:
>
> Hi,
>
> What do people feel about exposing Tools/parser/unparse.py in the standard 
> library? Here is my initial rationale:
>
> * The tool already needs to be maintained and updated as is tested as part of 
> the test suite.
> * I have used the tool almost all the time I needed to deal with AST 
> transformations.
> * The public interface will have a very low surface API, keeping maintaining 
> it (the public interface) a very small burden IMHO.
>
> We could add the public interface to the ast.py module or a new one if people 
> feel strongly about it.

How does it compare to Berker's popular and well-maintained PyPI
package for this? https://github.com/berkerpeksag/astor

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EDOBFEJDKANKWCAYEVLWTTXSCM3OIMXE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Pass the Python thread state to internal C functions

2019-11-17 Thread Nathaniel Smith
On Sun, Nov 17, 2019 at 1:58 PM Nick Coghlan  wrote:
> On Sat., 16 Nov. 2019, 8:26 am Nathaniel Smith,  wrote:
>>
>> As you know, I'm skeptical that PEP 554 will produce benefits that are
>> worth the effort, but let's assume for the moment that it is, and
>> we're all 100% committed to moving all globals into the threadstate.
>> Even given that, the motivation for this change seems a bit unclear to
>> me.
>>
>> I guess the possible goals are:
>>
>> - Get rid of the "ambient" threadstate entirely
>> - Make accessing the threadstate faster
>
> - Eventually make it easier for CPython maintainers to know which functions 
> require access to a live thread state, and which are stateless helper 
> functions

So the idea would be that eventually we'd remove all uses of implicit
state lookup inside CPython, and add some kind of CI check to make
sure that they're never used?

> - Eventually make it easier for embedding applications to control which 
> Python code runs in which thread state by moving the thread state activation 
> dance out of the application and into the CPython shared library

That seems like a good goal, but I don't understand how it's related
to passing threadstate explicitly as a function argument. If the plan
is to move towards passing threadstates both implicitly AND explicitly
everywhere, that seems like it would make things more error-prone, not
less, because the two states could get out of sync. Could you
elaborate?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5JKNEYXI6ZILC3P6JBXW7NKAUVMXBRQN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Pass the Python thread state to internal C functions

2019-11-15 Thread Nathaniel Smith
As you know, I'm skeptical that PEP 554 will produce benefits that are
worth the effort, but let's assume for the moment that it is, and
we're all 100% committed to moving all globals into the threadstate.
Even given that, the motivation for this change seems a bit unclear to
me.

I guess the possible goals are:

- Get rid of the "ambient" threadstate entirely
- Make accessing the threadstate faster

For the first goal, I don't think this is possible, or desirable.
Obviously if we remove the GIL somehow then at a minimum we'll need to
make the global threadstate a thread-local. But I think we'll always
have to keep it around as a thread-local, at least, because there are
situations where you simply cannot pass in the threadstate as an
argument. One example comes up when doing FFI: there are C libraries
that take callbacks, and will run them later in some arbitrary thread.
When wrapping these in Python, we need a way to bundle up a Python
function into a C function that can be called from any thread. So,
ctypes and cffi and cython all have ways to do this bundling, and they
all start with some delicate dance to figure out whether or not the
current thread holds the GIL, acquiring the GIL if not, then checking
whether or not this thread has a Python threadstate assigned, creating
it if not, etc. This is completely dependent on having the threadstate
available in ambient context. If threadstates were always passed as
arguments, then it would become impossible to wrap these C libraries.
So we can't do that.
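
For instance, with cffi the wrapping looks roughly like this (a sketch;
the registration call at the end is a made-up C API):

    import cffi

    ffi = cffi.FFI()

    @ffi.callback("void(int)")
    def on_event(x):
        # C code may call this later from a thread Python has never seen;
        # cffi's trampoline has to acquire the GIL and conjure up a
        # PyThreadState first -- using ambient state, not an argument
        print("event from C thread:", x)

    # lib.register_handler(on_event)   # hypothetical C library hook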

That said, it's fine – even if we do remove the GIL, we still won't
have a *single OS thread* executing code from two different
interpreters at the same time! So storing the threadstate in a
thread-local is fine, and we can keep the ability to grab the
threadstate at any moment, regardless of whether it was passed as an
argument.

But that means the only reason for passing the threadstate around as
an argument is if it's faster than looking it up. And AFAICT, no-one
in this thread actually knows if that's true? You mentioned that
there's an "atomic operation" there currently, but I think on x86 at
least _Py_atomic_load_relaxed is literally a no-op. Larry did some
experiments with the old pthreads thread-local storage API, but no-one
seems to have done any measurements on the new, much-faster
thread-local storage API, and no-one's done any measurements of the
cost of passing around threadstates explicitly. For all we know,
passing the threadstate around is actually slower than looking it up
every time. And we don't even know yet whether the threadstate even
will move into thread-local storage.

It seems a bit weird to start doing massive internal refactoring
before measuring those things.

-n

On Tue, Nov 12, 2019 at 2:03 PM Victor Stinner  wrote:
>
> Hi,
>
> Are you ok to modify internal C functions to pass explicitly tstate?
>
> --
>
> I started to modify internal C functions to pass explicitly "tstate"
> when calling C functions: the Python thread state (PyThreadState).
> Example of C code (after my changes):
>
> if (_Py_EnterRecursiveCall(tstate, " while calling a Python object")) 
> {
> return NULL;
> }
> PyObject *result = (*call)(callable, args, kwargs);
> _Py_LeaveRecursiveCall(tstate);
> return _Py_CheckFunctionResult(tstate, callable, result, NULL);
>
> In Python 3.8, the tstate is implicit:
>
> if (Py_EnterRecursiveCall(" while calling a Python object")) {
> return NULL;
> }
> PyObject *result = (*call)(callable, args, kwargs);
> Py_LeaveRecursiveCall();
> return _Py_CheckFunctionResult(callable, result, NULL);
>
> There are different reasons to pass explicitly tstate, but my main
> motivation is to rework Python code base to move away from implicit
> global states to states passed explicitly, to implement the PEP 554
> "Multiple Interpreters in the Stdlib". In short, the final goal is to
> run multiple isolated Python interpreters in the same process: run
> pure Python code on multiple CPUs in parallel with a single process
> (whereas multiprocessing runs multiple processes).
>
> Currently, subinterpreters are a hack: they still share a lot of
> things, the code base is not ready to implement isolated interpreters
> with one "GIL" (interpreter lock) per interpreter, and to run multiple
> interpreters in parallel. Many _PyRuntimeState fields (the global
> _PyRuntime variable) should be moved to PyInterpreterState (or maybe
> PyThreadState): per interpreter.
>
> Another simpler but more annoying example are Py_None and Py_True
> singletons which are globals. We cannot share these singletons between
> interpreters because updating their reference counter would be a
> performance bottleneck. If we put a "superglobal-GIL" to ensure that
> Py_None reference counter remains consistent, it would basically
> "serialize" all threads, rather than running them in parallel.
>
> The idea of passing tstate 

[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Nathaniel Smith
On Mon, Sep 23, 2019 at 1:30 PM Vinay Sajip via Python-Dev
 wrote:
>
> OK - but that's just one I picked at random. There are others like it - what 
> would be the process for deciding which ones need to be made private and 
> moved? Should an issue be raised to track this?

There are really two issues here:

- hiding the symbols that *aren't* marked PyAPI_*, consistently across
platforms.
- finding symbols that are currently marked PyAPI_*, but shouldn't be.

The first one is a pretty straightforward technical improvement. The
second one is a longer-term project that could easily get bogged down
in complex judgement calls. So let's worry about them separately. Even
if there are too many symbols marked PyAPI_*, we can still get started
on hiding all the symbols that we *know* should be hidden.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VLTZW3HQUK3ZAQEEKZIHCJRUOXRMUPVV/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-23 Thread Nathaniel Smith
On Mon, Sep 23, 2019, 08:28 Vinay Sajip via Python-Dev  wrote:

> > requires some newer tools like -fvisibility=hidden that work
> > differently across different platforms, and so far no-one's done the
> > work to sort out the details.
>
> I've started looking at this, but quite apart from the specifics of
> applying -fvisibility=hidden, there are some things that aren't yet clear
> to me about the intent behind some of our symbol definitions. For example,
> the file Include/fileutils.h contains the definitions
>
> PyAPI_FUNC(wchar_t *) Py_DecodeLocale(const char *arg, size_t *size);
>
> and
>
> PyAPI_FUNC(int) _Py_DecodeLocaleEx(const char *arg,
> wchar_t **wstr,
> size_t *wlen,
> const char **reason,
> int current_locale,
> _Py_error_handler errors);
>
> However, only the first of these is documented, though the definition via
> PyAPI_FUNC implies that both are part of the public API. If this is the
> case, why aren't both documented? If _Py_DecodeLocaleEx is not part of the
> public API (and the leading underscore suggests so), should it be polluting
> the symbol space?
>
> The comment for PyAPI_FUNC is "Declares a public Python API function and
> return type". Is this really the case, or has PyAPI_FUNC been coopted to
> provide external linkage for use by Python-internal code in different
> compilation units?  _Py_DecodeLocaleEx is called in
> Modules/_testcapimodule.c and also in Objects/unicodeobject.c.
>
> If we want to take steps to restrict symbol visibility, it will
> potentially affect all of the code base - so presumably, a PEP would be
> required, even though it's an implementation detail from the point of view
> of the language itself?
>

Windows already has working symbol visibility handling, and PyAPI_FUNC is
what controls it. So adding symbol visibility handling to Linux/macOS is
just about making all the platforms consistent. There might be some weird
choices being made, but I don't think you need to sort all those out as
part of this.

-n
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TKSQQ5TLVRY4XYMSKVJ7X5GMXANOIDGG/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-20 Thread Nathaniel Smith
On Fri, Sep 20, 2019 at 2:58 PM Vinay Sajip via Python-Dev
 wrote:
>
> > > Right, I'm pretty sure that right now Python doesn't have any way to
> > share symbols between .c files without also exposing them in the C
> > API.
>
> On other C projects I've worked on, the public API is expressed in one set of 
> header files, and internal APIs that need to be exposed across modules are 
> described in a different set of internal header files, and developers who 
> incorrectly use internal APIs by including the internal headers could see 
> breakage when the internals change ... excuse my naïveté, as I haven't done 
> much at Python's C level - does this discipline/approach not apply to CPython?

Visibility in C is complicated :-). The level I'm talking about is
symbol visibility, which is determined by the linker, not by headers.
If a symbol is exported, then even if you hide the headers, it's still
part of the library ABI, can still collide with user symbols, can
still by accessed by determined users, etc. It's still fixable, but it
requires some newer tools like -fvisibility=hidden that work
differently across different platforms, and so far no-one's done the
work to sort out the details.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GOQDPXFWEBQT7DDQKHHGHWZQMTOG2COI/


[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?

2019-09-20 Thread Nathaniel Smith
On Fri, Sep 20, 2019 at 1:00 PM Andrew Svetlov  wrote:
> This target is very important for keeping public API as small as possible.

Right, I'm pretty sure that right now Python doesn't have any way to
share symbols between .c files without also exposing them in the C
API.

This is fixable using "symbol visibility" features, and it would be
nice to have the option to share stuff between our own C files without
also sharing it with the world, for lots of reasons. But it might be
necessary to implement that first before doing anything to share
_Py_IDENTIFIERs.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/V4Z6EYNZMNNJFU3KFKHKFE2KCB5L5DSP/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Nathaniel Smith
On Fri, Aug 9, 2019 at 12:07 PM  wrote:
>
> Eric V. Smith wrote:
> >  Hopefully the warnings in 3.9 would be more visible that what we saw in
> > 3.7, so that library authors can take notice and do something about it
> > before 3.10 rolls around.
> > Eric
>
> Apologies for the ~double-post on the thread, but: the SymPy team has figured 
> out the right pytest incantation to expose these warnings. Given the 
> extensive adoption of pytest, perhaps it would be good to combine (1) a FR on 
> pytest to add a convenience flag enabling this mix of options with (2) an 
> aggressive "marketing push", encouraging library maintainers to add it to 
> their testing/CI.

Unfortunately, their solution isn't a pytest incantation, it's a
separate 'compileall' invocation they run on their source tree. I'm
not sure how you'd convert this into a pytest feature, because I don't
think pytest always knows which parts of your code are your code versus
which parts are supporting libraries.
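
If you do want to wire the check into CI today, the least-bad approach
I know of is a test that shells out to compileall -- a sketch, with
"mypackage/" as a placeholder:

    import subprocess, sys

    def test_no_invalid_escape_sequences():
        # -f forces re-byte-compilation so the warning is re-issued;
        # -We:invalid turns "invalid escape sequence" warnings into errors
        subprocess.run(
            [sys.executable, "-We:invalid", "-m", "compileall",
             "-f", "-q", "mypackage/"],
            check=True,
        )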

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/H36DMKUODHOQOYIIZCKW6LYKSGJLXTT4/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-06 Thread Nathaniel Smith
On Tue, Aug 6, 2019 at 3:44 PM Brett Cannon  wrote:
> I think this is a good example of how the community is not running tests with 
> warnings on and making sure that their code is warnings-free. This warning 
> has existed for at least one full release and fixing it doesn't require some 
> crazy work-around for backwards compatibility, and so this tells me people 
> are simply either ignoring the warnings or they are not aware of them.
>
> If it's the case that people are choosing to ignore warnings then that's on 
> them and there's not much we can do there.
>
> But my suspicion is it's the latter case of people simply not thinking about 
> running with warnings on and making sure to check for them. For instance, are 
> people running their CI with warnings turned on? How about making sure to 
> check the output of their CI to make sure there are no warnings? Or even 
> better, how many people are running CI with warnings turned into exceptions? 
> My guess is all of this is rather low because people are probably just doing 
> `pytest` without thinking of turning on warnings as exceptions to trigger a 
> CI failure and are only looking for CI passing versus checking its output.

There's an important point here that I think has been missed.

These days deprecation warnings are much more visible in general,
because all the major test systems enable them by default. BUT, this
SPECIFIC warning almost completely circumvented all those systems, so
almost no-one saw it.

For example, all my projects run tests with deprecation warnings
enabled and warnings turned into errors, but I never saw any of these
warnings. What happens is: the warning is issued when the .py file is
byte-compiled; but at this point, deprecation warnings probably aren't
visible. Later on, when pytest imports the file, it has warnings
enabled... but now the warning isn't issued.

Quoting Aaron Meurer from the bpo thread:

> As an anecdote, for SymPy's CI, we went through five (if I am counting 
> correctly) iterations of trying to test this. Each of the first four were 
> subtly incorrect, until we finally managed to find the correct one (for 
> reference, 'python -We:invalid -m compileall -f -q module/').  So most 
> library authors who will attempt to add tests against this will get it wrong.

Since folks don't seem to be reading that thread, I'll re-post my
comment from it as well:

> I think we haven't *actually* done a proper DeprecationWarning period for 
> this. We tried, but because of the issue with byte-compiling, the warnings 
> were unconditionally suppressed for most users -- even the users who are 
> diligent enough to enable warnings and look at warnings in their test suites.
> I can see a good argument for making the change, but if we're going to do it 
> then it's obviously the kind of change that requires a proper deprecation 
> period, and that hasn't happened.
> Maybe .pyc files need to be extended to store a list of syntax-related 
> DeprecationWarnings and SyntaxWarnings, that are re-issued every time the 
> .pyc is loaded? Then we'd at least have the technical capability to deprecate 
> this properly.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E7QCC74OBYEY3PVLNQG2ZAVRO653LD5K/


Re: [Python-Dev] PEP 595: Improving bugs.python.org

2019-05-31 Thread Nathaniel Smith
On Fri, May 31, 2019 at 11:39 AM Barry Warsaw  wrote:
>
> On May 31, 2019, at 01:22, Antoine Pitrou  wrote:
>
> > I second this.
> >
> > There are currently ~7000 bugs open on bugs.python.org.  The Web UI
> > makes a good job of actually being able to navigate through these bugs,
> > search through them, etc.
> >
> > Did the Steering Council conduct a usability study of Github Issues
> > with those ~7000 bugs open?  If not, then I think the acceptance of
> > migrating to Github is a rushed job.  Please reconsider.
>
> Thanks for your feedback Antoine.
>
> This is a tricky issue, with many factors and tradeoffs to consider.  I 
> really appreciate Ezio and Berker working on PEP 595, so we can put all these 
> issues on the table.
>
> I think one of the most important tradeoffs is balancing the needs of 
> existing developers (those who actively triage bugs today), and future 
> contributors.  But this and other UX issues are difficult to compare on our 
> actual data right now.  I fully expect that just as with the switch to git, 
> we’ll do lots of sample imports and prototyping to ensure that GitHub issues 
> will actually work for us (given our unique requirements), and to help 
> achieve the proper balance.  It does us no good to switch if we just anger 
> all the existing devs.
>
> IMHO, if the switch to GH doesn’t improve our workflow, then it definitely 
> warrants a reevaluation.  I think things will be better, but let’s prove it.

Perhaps we should put an explicit step on the transition plan, after
the prototyping, that's "gather feedback from prototypes, re-evaluate,
make final go/no-go decision"? I assume we'll want to do that anyway,
and having it formally written down might reassure people. It might
also encourage more people to actually try out the prototypes if we
make it very clear that they're going to be asked for feedback.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should I postpone PEP 558 (locals() semantics) to Python 3.9?

2019-05-31 Thread Nathaniel Smith
I wouldn't mind having a little more breathing room. It's frustrating
to miss the train, but these bugs are several decades old so I guess
nothing terrible will happen if their fixes get delayed to 3.9.

On Thu, May 30, 2019 at 4:23 PM Nick Coghlan  wrote:
>
> Hi folks,
>
> The reference implementation for PEP 558 (my attempt to fix the interaction 
> between tracing functions and closure variables) is currently segfaulting 
> somewhere deep in the garbage collector, and I've found that there's an issue 
> with the PyEval_GetLocals() API returning a borrowed reference that means I 
> need to tweak the proposed C API a bit such that PyEval_GetLocals() returns 
> the proxy at function scope, and we add a new PyEval_GetPyLocals() that 
> matches the locals() builtin.
>
> I don't *want* to postpone this to Python 3.9, but there turned out to be 
> more remaining work than I thought there was to get this ready for inclusion 
> in beta 1.
>
> I'll try to get the C API design details sorted today, but the segfault is 
> mystifying me, and prevents the option of putting the core implementation in 
> place for b1, and tidying up the documentation and comments for b2.
>
> Cheers,
> Nick.
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/njs%40pobox.com



-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [SPAM?] Re: PEP 558: Defined semantics for locals()

2019-05-28 Thread Nathaniel Smith
On Tue, May 28, 2019 at 5:24 PM Greg Ewing  wrote:
>
> Terry Reedy wrote:
> > I believe that the situation is or can be thought of as this: there is
> > exactly 1 function locals dict.  Initially, it is empty and inaccessible
> > (unusable) from code.  Each locals() call updates the dict to a current
> > snapshot and returns it.
>
> Yes, I understand *what's* happening, but not *why* it was designed
> that way.

I'm not sure of the exact history, but I think it's something like:

In the Beginning, CPython was Simple, but Slow: every frame struct had
an f_locals field, it was always a dict, the bytecode accessed the
dict, locals() returned the dict, that was that. Then one day the
serpent of Performance Optimization came, whispering of static
analysis of function scope and LOAD_FAST bytecodes. And we were
seduced by the serpent's vision, and made CPython Faster, with
semantics that were Almost The Same, and we shipped it to our users.
But now the sin of Cache Inconsistency had entered our hearts, and we
were condemned to labor endlessly: again and again, users discovered a
leak in our abstraction, and again and again we covered our sin with
new patches, until Simplicity was obscured.

(The current design does make sense, but you really have to look at
it as a hard-fought compromise between the elegant original design
versus ~30 years of real-world demands. And hey, it could be worse –
look at the fun Intel's been having with their caches.)

-n

--
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] [PEP 558] thinking through locals() semantics

2019-05-28 Thread Nathaniel Smith
On Tue, May 28, 2019 at 6:48 PM Greg Ewing  wrote:
>
> Nathaniel Smith wrote:
> > - [proxy]: Simply return the .f_locals object, so in all contexts
> > locals() returns a live mutable view of the actual environment:
> >
> >   def locals():
> >   return get_caller_frame().f_locals
>
> Not sure I quite follow this --  as far as I can see, f_locals
> currently has the same snapshot behaviour as locals().
>
> I'm assuming you mean to change things so that locals() returns a
> mutable view tracking the environment in both directions. That
> sounds like a much better idea all round to me. No weird
> shared-snapshot behaviour, and no need for anything to behave
> differently when tracing.

Yeah, I made the classic mistake and forgot that my audience isn't as
immersed in this as I am :-). Throughout the email I'm assuming we're
going to adopt PEP 558's proposal about replacing f_locals with a new
kind of mutable view object, and then given that, asking what we
should do about locals().

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] [SPAM?] Re: PEP 558: Defined semantics for locals()

2019-05-28 Thread Nathaniel Smith
On Tue, May 28, 2019 at 6:02 PM Guido van Rossum  wrote:
>
> On Tue, May 28, 2019 at 5:25 PM Greg Ewing  
> wrote:
>>
>> Terry Reedy wrote:
>> > I believe that the situation is or can be thought of as this: there is
>> > exactly 1 function locals dict.  Initially, it is empty and inaccessible
>> > (unusable) from code.  Each locals() call updates the dict to a current
>> > snapshot and returns it.
>>
>> Yes, I understand *what's* happening, but not *why* it was designed
>> that way. Would it really be prohibitively expensive to create a
>> fresh dict each time?
>
> No. But it would be inconsistent with the behavior at module level.
>
> FWIW I am leaning more and more to the [proxy] model, where locals() and 
> frame.f_locals are the same object, which *proxies* the fast locals and 
> cells. That only has one downside: it no longer returns a dict, but merely a 
> MutableMapping. But why would code care about the difference? (There used to 
> be some relevant builtins that took dicts but not general MutableMappings -- 
> but that has been fixed long ago.)

Related trivia: the exec() and eval() builtins still mandate that
their 'globals' argument be an actual no-fooling dict, but their
'locals' argument is allowed to be any kind of mapping object. This is
an intentional, documented feature [1]. And inside the exec/eval,
calls to locals() return whatever object was passed. For example:

>>> exec("print(type(locals()))", {}, collections.ChainMap())
<class 'collections.ChainMap'>

So technically speaking, it's already possible for locals() to return
a non-dict.

Of course this is incredibly uncommon in practice, so existing code
doesn't necessarily take it into account. But it's some kind of
conceptual precedent, anyway.

-n

[1] See https://docs.python.org/3/library/functions.html#eval and the
'exec' docs right below it. I think the motivation is that in the
current CPython implementation, every time you access a global it does
a direct lookup in the globals object, so it's important that we do
this lookup as fast as possible, and forcing the globals object to be
an actual dict allows some optimizations. For locals, though, we
usually use the "fast locals" mechanism and the mapping object is
mostly vestigial, so it doesn't matter how fast lookups are, so we can
support any mapping.

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] [PEP 558] thinking through locals() semantics

2019-05-27 Thread Nathaniel Smith
On Mon, May 27, 2019 at 9:18 PM Guido van Rossum  wrote:
>
> Note that the weird, Action At A Distance behavior is also visible for 
> locals() called at module scope (since there, locals() is globals(), which 
> returns the actual dict that's the module's __dict__, i.e. the Source Of 
> Truth). So I think it's unavoidable in general, and we would be wise not to 
> try and "fix" it just for function locals. (And I certainly don't want to 
> mess with globals().)

I think it's worth distinguishing between two different types of weird
Action At A Distance here:

- There's the "justified" action-at-a-distance that currently happens
at module scope, where locals().__setitem__ affects variable lookup,
and variable mutation affects locals().__getitem__. This can produce
surprising results if you pass locals() into something that's
expecting a regular dict, but it's also arguably the point of an
environment introspection API, and like you say, it's unavoidable and
expected at module scope and when using globals().

- And then there's the "spooky" action-at-a-distance that currently
happens at function scope, where calling locals() has the side-effect
of mutating the return value from previous calls to locals(), and the
objects returned from locals may or may not spontaneously mutate
themselves depending on whether some other code registered a trace
function. This is traditional, but extremely surprising if you aren't
deeply familiar with internals of CPython's implementation.
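
To make the contrast concrete, here's a minimal pair of examples, my
own illustration, assuming current CPython behavior with no tracing
function installed:

  # "Justified": at module scope, locals() is the module's __dict__,
  # so writes through it are real assignments.
  locals()["x"] = 1
  assert x == 1

  # "Spooky": at function scope, a write to locals() is silently
  # swallowed, yet a later locals() call mutates the earlier snapshot.
  def f():
      a = 1
      loc = locals()
      loc["a"] = 2
      assert a == 1           # the write never reached the variable
      assert "loc" not in loc
      locals()                # refreshing the snapshot...
      assert "loc" in loc     # ...mutated 'loc' at a distance
  f()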

Of the four designs:

[PEP] and [PEP-minus-tracing] both have "spooky" action-at-a-distance
(worse in [PEP]), but they don't have "justified"
action-at-a-distance.

[proxy] adds "justified" action-at-a-distance, and removes "spooky"
action at a distance.

[snapshot] gets rid of both kinds of action-at-a-distance (at least in
function scope).

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] [PEP 558] thinking through locals() semantics

2019-05-27 Thread Nathaniel Smith
On Mon, May 27, 2019 at 9:16 AM Guido van Rossum  wrote:
>
> I re-ran your examples and found that some of them fail.
>
> On Mon, May 27, 2019 at 8:17 AM Nathaniel Smith  wrote:
[...]
>> The interaction between f_locals and locals() is also subtle:
>>
>>   def f():
>>   a = 1
>>   loc = locals()
>>   assert "loc" not in loc
>>   # Regular variable updates don't affect 'loc'
>>   a = 2
>>   assert loc["a"] == 1
>>   # But debugging updates do:
>>   sys._getframe().f_locals["a"] = 3
>>   assert a == 3
>
>
> That assert fails; `a` is still 2 here for me.

I think you're running on current Python, and I'm talking about the
semantics in the current PEP 558 draft, which redefines f_locals so
that the assert passes. Nick has a branch here if you want to try it:
https://github.com/python/cpython/pull/3640

(Though I admit I was lazy, and haven't tried running my examples at
all -- they're just based on the text.)

>>
>>   assert loc["a"] == 3
>>   # But it's not a full writeback
>>   assert "loc" not in loc
>>   # Mutating 'loc' doesn't affect f_locals:
>>   loc["a"] = 1
>>   assert sys._getframe().f_locals["a"] == 1
>>   # Except when it does:
>>   loc["b"] = 3
>>   assert sys._getframe().f_locals["b"] == 3
>
>
> All of this can be explained by realizing `loc is sys._getframe().f_locals`. 
> IOW locals() always returns the dict in f_locals.

That's not true in the PEP version of things. locals() and
frame.f_locals become radically different. locals() is still a dict
stored in the frame object, but f_locals is a magic proxy object that
reads/writes to the fast locals array directly.

>>
>> Again, the results here are totally different if a Python-level
>> tracing/profiling function is installed.
>>
>> And you can also hit these subtleties via 'exec' and 'eval':
>>
>>   def f():
>>   a = 1
>>   loc = locals()
>>   assert "loc" not in loc
>>   # exec() triggers writeback, and then mutates the locals dict
>>   exec("a = 2; b = 3")
>>   # So now the current environment has been reflected into 'loc'
>>   assert "loc" in loc
>>   # Also loc["a"] has been changed to reflect the exec'ed assignments
>>   assert loc["a"] == 2
>>   # But if we look at the actual environment, directly or via
>>   # f_locals, we can see that 'a' has not changed:
>>   assert a == 1
>>   assert sys._getframe().f_locals["a"] == 1
>>   # loc["b"] changed as well:
>>   assert loc["b"] == 3
>>   # And this *does* show up in f_locals:
>>   assert sys._getframe().f_locals["b"] == 3
>
>
> This works indeed. My understanding is that the bytecode interpreter, when 
> accessing the value of a local variable, ignores f_locals and always uses the 
> "fast" array. But exec() and eval() don't use fast locals, their code is 
> always compiled as if it appears in a module-level scope.
>
> While the interpreter is running and no debugger is active, in a function 
> scope f_locals is not used at all, the interpreter only interacts with the 
> fast array and the cells. It is initialized by the first locals() call for a 
> function scope, and locals() copies the fast array and the cells into it. 
> Subsequent calls in the same function scope keep the same value for f_locals 
> and re-copy fast and cells into it. This also clears out deleted local 
> variables and emptied cells, but leaves "strange" keys (like "b" in the 
> examples) unchanged.
>
> The truly weird case happen when Python-level tracers are present, then the 
> contents of f_locals is written back to the fast array and cells at certain 
> points. This is intended for use by pdb (am I the only user of pdb left in 
> the world?), so one can step through a function and mutate local variables. I 
> find this essential in some cases.

Right, the original goal for the PEP was to remove the "truly weird
case" but keep pdb working

>>
>> Of course, many of these edge cases are pretty obscure, so it's not
>> clear how much they matter. But I think we can at least agree that
>> this isn't the one obvious way to do it :-).
>>
>>
>> # What's the landscape of possible semantics?
>>
>> I did some brainstorming, and came up with 4 sets of semantics that
>> seem plausible enough to at least consider:
>>
>> - [PEP]: the

[Python-Dev] [PEP 558] thinking through locals() semantics

2019-05-27 Thread Nathaniel Smith
First, I want to say: I'm very happy with PEP 558's changes to
f_locals. It solves the weird threading bugs, and exposes the
fundamental operations you need for debugging in a simple and clean
way, while leaving a lot of implementation flexibility for future
Python VMs. It's a huge improvement over what we had before.

I'm not as sure about the locals() parts of the proposal. It might be
fine, but there are some complex trade-offs here that I'm still trying
to wrap my head around. The rest of this document is me thinking out
loud to try to clarify these issues.


# What are we trying to solve?

There are two major questions, which are somewhat distinct:
- What should the behavior of locals() be in CPython?
- How much of that should be part of the language definition, vs
CPython implementation details?

The status quo is that for locals() inside function scope, the
behavior is quite complex and subtle, and it's entirely implementation
defined. In the current PEP draft, there are some small changes to the
semantics, and also it promotes them becoming part of the official
language semantics.

I think the first question, about semantics, is the more important
one. If we're promoting them to the language definition, the main
effect is just to make it more important we get the semantics right.


# What are the PEP's proposed semantics for locals()?

They're kinda subtle. [Nick: please double-check this section, both
for errors and because I think it includes some edge cases that the
PEP currently doesn't mention.]

For module/class scopes, locals() has always returned a mapping object
which acts as a "source of truth" for the actual local environment –
mutating the environment directly changes the mapping object, and
vice-versa. That's not going to change.

In function scopes, things are more complicated. The *local
environment* is conceptually well-defined, and includes:
- local variables (current source of truth: "fast locals" array)
- closed-over variables (current source of truth: cell objects)
- any arbitrary key/values written to frame.f_locals that don't
correspond to local or closed-over variables, e.g. you can do
frame.f_locals[object()] = 10, and then later read it out again.

However, the mapping returned by locals() does not directly reflect
this local environment. Instead, each function frame has a dict
associated with it. locals() returns this dict. The dict always holds
any non-local/non-closed-over variables, and also, in certain
circumstances, we write a snapshot of local and closed-over variables
back into the dict.

Specifically, we write back:

- Whenever locals() is called
- Whenever exec() or eval() is called without passing an explicit
locals argument
- After every trace/profile event, if a Python-level tracing/profiling
function is registered.

(Note: in CPython, the use of Python-level tracing/profiling functions
is extremely rare. It's more common in alternative implementations
like PyPy. For example, the coverage package uses a C-level tracing
function on CPython, which does not trigger locals updates, but on
PyPy it uses a Python-level tracing function, which does trigger
updates.)

In addition, the PEP doesn't say, but I think that any writes to
f_locals immediately update both the environment and the locals dict.

These semantics have some surprising consequences. Most obviously, in
function scope (unlike other scopes), mutating locals() does not
affect the actual local environment:

  def f():
  a = 1
  locals()["a"] = 2
  assert a == 1

The writeback rules can also produce surprising results:

  def f():
  loc1 = locals()
  # Since it's a snapshot created at the time of the call
  # to locals(), it doesn't contain 'loc1':
  assert "loc1" not in loc1
  loc2 = locals()
  # Now loc1 has changed:
  assert "loc1" in loc1

However, the results here are totally different if a Python-level
tracing/profiling function is installed – in particular, the first
assertion fails.

The interaction between f_locals and locals() is also subtle:

  def f():
  a = 1
  loc = locals()
  assert "loc" not in loc
  # Regular variable updates don't affect 'loc'
  a = 2
  assert loc["a"] == 1
  # But debugging updates do:
  sys._getframe().f_locals["a"] = 3
  assert a == 3
  assert loc["a"] == 3
  # But it's not a full writeback
  assert "loc" not in loc
  # Mutating 'loc' doesn't affect f_locals:
  loc["a"] = 1
  assert sys._getframe().f_locals["a"] == 1
  # Except when it does:
  loc["b"] = 3
  assert sys._getframe().f_locals["b"] == 3

Again, the results here are totally different if a Python-level
tracing/profiling function is installed.

And you can also hit these subtleties via 'exec' and 'eval':

  def f():
  a = 1
  loc = locals()
  assert "loc" not in loc
  # exec() triggers writeback, and then mutates the locals dict
  exec("a = 2; b = 3")
  # So now 

Re: [Python-Dev] PEP 558: Defined semantics for locals()

2019-05-25 Thread Nathaniel Smith
On Sat, May 25, 2019, 07:38 Guido van Rossum  wrote:

> This looks great.
>
> I only have two nits with the text.
>
> First, why is the snapshot called a "dynamic snapshot"? What exactly is
> dynamic about it?
>

It's dynamic in that it can spontaneously change when certain other events
happen. For example, imagine this code runs at function scope:

# take a snapshot
a = locals()

# it's a snapshot, so it doesn't include the new variable
assert "a" not in a

# take another snapshot
b = locals()

# now our first "snapshot" has changed
assert "a" in a

Overall I'm happy with the PEP, but I'm still a bit uneasy about whether
we've gotten the details of this "dynamicity" exactly right, esp. since the
PEP promotes them from implementation detail to language features. There
are a lot of complicated tradeoffs so I'm working on a longer response that
tries to lay out all the options and hopefully convince myself (and
everyone else).

-n


Re: [Python-Dev] we will probably be having a difficult discussion about the stdlib after PEP 594 is done (was: PEP 594: Removing dead batteries from the standard library)

2019-05-24 Thread Nathaniel Smith
On Thu, May 23, 2019 at 2:18 PM Brett Cannon  wrote:
> I'm personally viewing it as a first step in addressing the maintenance 
> burden we have with such a large stdlib. Christian started this work over a 
> year ago and I think it's worth seeing through. After that we should probably 
> have a discussion as a team about how we view the stdlib long-term and how 
> that ties into maintaining it so that people's opinion of the stdlib's 
> quality goes up rather than viewing the quality of it as varying 
> module-to-module.

I started a thread on discourse to discuss some "what if" scenarios
here, in the hopes it will help us make more informed decisions:
https://discuss.python.org/t/1738

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] we will probably be having a difficult discussion about the stdlib after PEP 594 is done

2019-05-24 Thread Nathaniel Smith
On Fri, May 24, 2019, 08:08 Ben Cail  wrote:

>
> Why not have the PSF hire someone (or multiple people) to be paid to
> work on the maintenance burden? This could be similar to the Django
> fellows:
> https://www.djangoproject.com/fundraising/#who-is-the-django-fellow. It
> seems like a good thing for Django, and Python is used by many more
> people than Django. Why not pay someone to do the work that others don't
> want to do? The person in this position could be guided by the PSF
> and/or the Steering Council, to do the work most necessary for the good
> of the language as a whole (like maintaining old modules that other core
> devs don't want to work on).
>
> You could market it together with the maintenance burden: "you want to
> use all these old modules, but we don't want to maintain them. So pay us
> some money, and we'll hire someone to maintain them."
>

I think the basic idea here is a good one, but:

- transitioning from an all-volunteer project to a mixed
paid-staff+volunteers project is a big step, and we'll need to take time to
figure out what that would look like before people are comfortable with it.

- even if we have folks paid to help with maintenance, it won't mean we
suddenly have infinite resources and can do everything. We'll still need to
pick which things to prioritize. And I think if you asked 100 people to
name the most critical issues facing Python today, most of them would not
say "maintaining xdrlib".

-n


Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-22 Thread Nathaniel Smith
On Wed, May 22, 2019, 04:32 Christian Heimes  wrote:

> On 22/05/2019 12.19, Steven D'Aprano wrote:
> > I don't think this PEP should become a document about "Why you should
> > use PAM". I appreciate that from your perspective as a Red Hat security
> > guy, you want everyone to use best practices as you see them, but it
> > isn't Python's position to convince Linux distros or users to use PAM.
>
> I think the PEP should make clear why spwd is bad and pining for The
> Fjords. The document should point users to correct alternatives. There is
> no correct and secure way to use the spwd module to verify user accounts.
> Any use of spwd for logins introduces critical security bugs.
>
> By the way, all relevant BSD, Linux, and Darwin (macOS) distributions come
> with PAM support. Almost all use PAM by default. AFAIK only the minimal
> Alpine container does not have PAM installed by default. This is not Red
> Hat trying to evangelize the world. PAM is *the* industry standards on
> Unix-like OS.
>

The removal of spwd seems reasonable to me, and I don't think you need to
write 20 separate PEPs for each module, but I do think you should split the
spwd/crypt modules off into their own PEP. The discussion about these
modules is qualitatively different than some of the others (the security
implications etc.), and trying to mix qualitatively different discussions
always makes people frustrated.

-n


Re: [Python-Dev] PEP 594: update 1

2019-05-22 Thread Nathaniel Smith
On Wed, May 22, 2019, 12:14 Sean Wallitsch 
wrote:

> Dear python-dev,
>
> I'm writing to provide some feedback on PEP-594, primarily the proposed
> deprecation and reason for the removal of the aifc and audioop libraries.
>
> The post production film industry continues to make heavy use of AIFFs, as
> completely uncompressed audio is preferred. Support for the consumer
> alternatives (ALAC, FLAC) is virtually non-existent, with no movement
> towards adoption of those formats. Even Apple's own professional editing
> tool Final Cut Pro does not support ALAC. Many of the applications also
> support WAV, but not all.
>
> Removal of this module from the standard library is complicated by the
> fact that a large number of film industry facilities have extremely limited
> internet access for security reasons. This does not make it impossible to
> get a library from pypi, but speaking to those devs has made me aware of
> what a painful process that is for them. They have benefited greatly from
> aifc's inclusion in the standard library.
>

That's really helpful data, thank you!

Is audioop also used? You mention both aifc and audioop at the beginning
and end of your message, but all the details in the middle focus on just
aifc.

-n


Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-21 Thread Nathaniel Smith
On Tue, May 21, 2019 at 4:25 AM Victor Stinner  wrote:
>
> On Tue, May 21, 2019 at 1:18 PM, André Malo  wrote:
> > There's software in production using both. (It doesn't mean it's on pypi or
> > even free software).
> >
> > What would be the maintenance burden of those modules anyway? (at least for
> > nntp, I guess it's not gonna change).
>
> The maintenance burden is real even if it's not visible. For example,
> test_nntplib is causing frequently issues on our CI:
>
> https://bugs.python.org/issue19756
> https://bugs.python.org/issue19613
> https://bugs.python.org/issue31850
>
> It's failing frequently since 2013, and nobody has managed to come up
> with a fix... in 6 years.

If the tests don't work and the module is unmaintained, then maybe we
should disable the tests and put a prominent notice in the docs saying
that it's unmaintained and any use is at your own risk. It's not a
pleasant thing to do, but if that's the reality of the situation then
it's probably better to be up front about it than to stick our fingers
in our ears and waste folks time with spurious test failures. And
perhaps someone would actually step up to maintain it.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] PEP 594: Removing dead batteries from the standard library

2019-05-21 Thread Nathaniel Smith
On Tue, May 21, 2019 at 10:43 AM Glenn Linderman  wrote:
> After maintaining my own version of http.server to fix or workaround some of 
> its deficiencies for some years, I discovered bottle.py. It has far more 
> capability, is far better documented, and is just as quick to deploy. While I 
> haven't yet converted all past projects to use bottle.py, it will likely 
> happen in time, unless something even simpler to use is discovered, although 
> I can hardly imagine that happening.

bottle.py uses http.server for its local development mode (the one you
see in their quickstart example at the top of their README). Same with
flask, django, and probably a bunch of other frameworks. It's *very*
widely used.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] bpo-36829: Add sys.unraisablehook()

2019-05-16 Thread Nathaniel Smith
On Thu, May 16, 2019 at 1:23 PM Victor Stinner  wrote:
>
> On Thu, May 16, 2019 at 8:58 PM, Petr Viktorin  wrote:
> > I always thought the classic (exc_type, exc_value, exc_tb) triple is a
> > holdover from older Python versions, and all the information is now in
> > the exception instance.
> > Is the triple ever different from (type(exc), exc, exc.__traceback__)?
> > (possibly with a getattr for __traceback__)
>
> I added assertions in PyErr_WriteTraceback():
>
> assert(Py_TYPE(v) == t);
> assert(PyException_GetTraceback(v) == tb);
>
> "Py_TYPE(v) == t" fails in
> test_exceptions.test_memory_error_in_PyErr_PrintEx() for example.
> PyErr_NoMemory() calls PyErr_SetNone(PyExc_MemoryError), it sets
> tstate->curexc_type to PyExc_MemoryError, but tstate->curexc_value is
> set to NULL.

This makes some sense – if you can't allocate memory, then better not
allocate an exception instance to report that! So this is legitimately
a special case.

But... it looks like everywhere else, the way we handle this when
transitioning into Python code is to create an instance. For example,
that test does 'except MemoryError as e', so an instance does need to
be created then. The comments suggest that there's some trick where we
have pre-allocated MemoryError() instances? But either way, if we can
afford to call a Python hook (which requires at least allocating a
frame!), then we can probably also afford to materialize the
MemoryError instance. So I feel like maybe we shouldn't be passing
None to the unraisable hook, even if PyErr_NoMemory() did initially
set that?

Also, in practice, the only time I've ever seen MemoryError is from
attempting to do a single massively-huge allocation. It's never meant
that regular allocation of small objects will fail.
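
(An illustrative snippet of my own, not from the thread: a single
oversized request on a 64-bit build is enough to see the instance get
materialized by the 'except ... as' binding:)

  try:
      bytearray(2**62)  # one absurdly huge allocation
  except MemoryError as exc:
      # binding 'as exc' forced a real MemoryError instance to exist
      assert type(exc) is MemoryError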

> "PyException_GetTraceback(v) == tb" fails in
> test_exceptions.test_unraisable() for example: "PyTraceBack_Here(f);"
> in the "error:" label of ceval.c creates a traceback object and sets
> it to tstate->curexec_traceback, but it doesn't set the __traceback__
> attribute of the current exception.

Isn't this just a bug that should be fixed?

> > Should new APIs use it?
>
> I tried to add a "PyErr_NormalizeException(, , );" call in
> PyErr_WriteUnraisable(): it creates an exception object (exc_value)
> for the PyErr_NoMemory() case, but it still doesn't set the
> __traceback__ attribute of the exception for the PyTraceBack_Here()
> case.
>
> It seems like PyErr_WriteUnraisable() cannot avoid having 3 variables
> (exc_type, exc_value, exc_tb), since they are not consistent as you
> may expect.

I'm actually fine with it having three arguments -- even if it's
technically unnecessary, it's currently 100% consistent across these
low-level APIs, and it doesn't hurt anything, so we might as well
stick with it for consistency.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Parser module in the stdlib

2019-05-16 Thread Nathaniel Smith
On Thu, May 16, 2019 at 2:13 PM Pablo Galindo Salgado
 wrote:
> I propose to finally remove the parser module, as it has been "deprecated" for 
> a long time, it is almost clear that nobody uses it, and it has very limited 
> usability, and replace it (maybe with a different name)
> with pgen2 (maybe with a more generic interface that is detached from lib2to3 
> particularities). This will not only help a lot of current libraries that are 
> using forks or similar solutions but also will help to keep
> the shipped grammar (that is able to parse Python2 and Python3 
> code) synchronized with the current Python one (as it will now be more 
> justified to keep them in sync).

Will the folks using forks be happy to switch to the stdlib version?
For example I can imagine that if black wants to process 3.7 input
code while running on 3.6, it might prefer a parser on PyPI even if
the stdlib version were public, since the PyPI version can be updated
independently of the host Python.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] bpo-36829: Add sys.unraisablehook()

2019-05-16 Thread Nathaniel Smith
On Thu, May 16, 2019 at 2:17 PM Steve Dower  wrote:
> You go on to say "pass an error message" and "keep repr(obj) if you
> want", but how is this different from creating an exception that
> contains the custom message, the repr of the object, and chains the
> exception that triggered it?

A clever hook might want the actual object, so it can pretty-print it,
or open an interactive debugger and let you examine it, or
something. Morally this is similar to calling repr(obj), but it
doesn't literally call repr(obj).

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] bpo-36829: Add sys.unraisablehook()

2019-05-16 Thread Nathaniel Smith
On Thu, May 16, 2019, 09:07 Steve Dower  wrote:

>
> Actually, if the default implementation prints the exception message,
> how is this different from sys.excepthook? Specifically, from the point
> of customizing the hooks.
>

sys.excepthook means the program has fully unwound and is about to exit.
This is pretty different from an exception inside a __del__ or background
thread or whatever, where the program definitely hasn't unwound and is
likely to continue. And I'm pretty sure they have behavioral differences
already, like if you pass in a SystemExit exception then sys.excepthook
doesn't print anything, but PyErr_WriteUnraisable prints a traceback.

So making them two separate hooks seems right to me. Some people will
override both; that's fine.

-n


Re: [Python-Dev] bpo-36829: Add sys.unraisablehook()

2019-05-15 Thread Nathaniel Smith
On Wed, May 15, 2019 at 6:25 PM Victor Stinner  wrote:
> I proposed a different approach: add a new sys.unraisablehook hook
> which is called to handle an "unraisable exception". To handle them
> differently, replace the hook. For example, I wrote a custom hook to
> log these exceptions into a file (the output on the Python test suite
> is interesting!). It also becomes trivial to reimplement Thomas's idea
> (kill the process):

What happens if the hook raises an exception?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] deprecation of abstractstaticmethod and abstractclassmethod

2019-05-15 Thread Nathaniel Smith
I don't care about the deprecation either way. But can we fix the
individual decorators so both orders work? Even if it requires a special
case in the code, it seems worthwhile to remove a subtle user-visible
footgun.

On Wed, May 15, 2019, 12:39 Ethan Furman  wrote:

> In issue 11610* abstractclassmethod and abstractstaticmethod were
> deprecated, apparently because they were redundant with the new technique
> of calling `classmethod` or `staticmethod` followed by a call to
> `abstractmethod`.  To put it in code:
>
> # deprecated
>
> class Foo(ABC):
>
>     @abstractclassmethod
>     def foo_happens(cls):
>         ...  # do some fooey stuff
>
> # the new(er) way
>
> class Foo(ABC):
>
>     @classmethod
>     @abstractmethod
>     def foo_happens(cls):
>         ...  # do some fooey stuff
>
>
> I would like to remove the deprecated status of `abstractclassmethod` and
> `abstractstaticmethod` mainly because:
>
> - using the combined decorator is easy to get right
>(@abstractmethod followed by @classmethod doesn't work)
>
> - getting the order wrong can be hard to spot and fix
>
> Obviously, decorator order matters for many, if not most, decorators out
> there -- so why should these two be any different?  Because 'abstract',
> 'class', and 'static' are adjectives -- they're describing the method,
> rather than changing it**; to use an example, what is the difference
> between "hot, dry sand" and "dry, hot sand"?  The sand is just as dry and
> just as hot either way.  In a debugging session looking at:
>
> @abstractmethod
> @classmethod
> def some_func(self, this, that, the_other):
>     # many
>     # many
>     ...
>     ...
>     ...
>     # many
>     # lines
>     # of
>     # code
>
> Not noticing that the two decorators are in reverse order would be very
> easy to do.
>
> Because order matters here, but cognitively should not, a helper function
> to make sure it is always done right is a prime candidate to be added to a
> module -- and, luckily for us, those helper functions already exist!
> Unfortunately, they are also deprecated, discouraging their use, when we
> should be encouraging their use.
>
> What are the reasons to /not/ remove the deprecation?
>
> --
> ~Ethan~
>
>
>
> * https://bugs.python.org/issue11610
>
> ** I realize that abstractmethod does actually change the function, but
> that's an implementation detail.


Re: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode

2019-04-29 Thread Nathaniel Smith
On Mon, Apr 29, 2019 at 5:01 PM Neil Schemenauer  wrote:
> As far as I understand, we have a similar problem already for
> gc.get_objects() because those static type objects don't have a
> PyGC_Head.  My 2-cent proposal for fixing things in the long term
> would be to introduce a function like PyType_Ready that returns a
> pointer to the new type.  The argument to it would be what is the
> current static type structure.  The function would copy things from
> the static type structure into a newly allocated type structure.

I doubt you'll be able to get rid of static types entirely, due to the
usual issues with C API breakage. And I'm guessing that static types
make up such a tiny fraction of the address space that merely tweaking
the percent up or down won't affect performance.

But your proposed new API would make it *way* easier to migrate
existing code to the stable ABI.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode

2019-04-27 Thread Nathaniel Smith
On Sat, Apr 27, 2019, 04:27 Armin Rigo  wrote:

> Hi Neil,
>
> On Wed, 24 Apr 2019 at 21:17, Neil Schemenauer 
> wrote:
> > Regarding the Py_TRACE_REFS fields, I think we can't do them without
> > breaking the ABI because of the following.  For GC objects, they are
> > always allocated by _PyObject_GC_New/_PyObject_GC_NewVar.  So, we
> > can allocate the extra space needed for the GC linked list.  For
> > non-GC objects, that's not the case.  Extensions can allocate using
> > malloc() directly or their own allocator and then pass that memory
> > to be initialized as a PyObject.
> >
> > I think that's a poor design and I think we should try to make slow
> > progress in fixing it.
>
> Such progress needs to start with the global static PyTypeObjects that
> all extensions define.  This is going to be impossible to fix without
> requiring a big fix in of *all* of them.  (Unless of course you mean
> to still allow them, but then Py_TRACE_REF can't be implemented in a
> way that doesn't break the ABI.)
>

For Py_TRACE_REFS specifically, IIUC the only goal is to be able to produce
a list of all live objects on demand. If that's the goal, then static type
objects aren't a huge deal. You can't add extra data into the type objects
themselves, but since there's a fixed set of them and they're immortal, you
can just build a static list of all of them in PyType_Ready.

-n



Re: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode

2019-04-25 Thread Nathaniel Smith
You don't necessarily need rpath actually. The Linux loader has a
bug/feature where once it has successfully loaded a library with a given
soname, then any future requests for that soname within the same process
will automatically return that same library, regardless of rpath settings
etc. So as long as the main interpreter has loaded libpython.whatever from
the correct directory, then extension modules will all get that same
version. The rpath won't matter at all.

It is annoying in general that on Linux, we have these two different ways
to build extension modules. It definitely violates TOOWTDI :-). It would be
nice at some point to get rid of one of them.

Note that we can't get rid of the two different ways entirely though – on
Windows, extension modules *must* link to libpython.dll, and on macOS,
extension modules *can't* link to libpython.dylib. So the best we can hope
for is to make Linux consistently do one of these, instead of supporting
both.

In principle, having extension modules link to libpython.so is a good
thing. Suppose that someone wants to dynamically load the python
interpreter into their program as some kind of plugin. (Examples: Apache's
mod_python, LibreOffice's support for writing macros in Python.) It would
be nice to be able to load python2 and python3 simultaneously into the same
process as distinct plugins. And this is totally doable in theory, *but* it
means that you can't assume that the interpreter's symbols will be
automagically injected into extension modules, so it's only possible if
extension modules link to libpython.so.

In practice, extension modules have never consistently linked to
libpython.so, so everybody who loads the interpreter as a plugin has
already worked around this. Specifically, they use RTLD_GLOBAL to dump all
the interpreter's symbols into the global namespace. This is why you can't
have python2 and python3 mod_python at the same time in the same Apache.
And since everyone is already working around this, linking to libpython.so
currently has zero benefit... in fact manylinux wheels are actually
forbidden to link to libpython.so, because this is the only way to get
wheels that work on every interpreter.
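
(For concreteness, the workaround usually amounts to something like
this sketch; the soname below is illustrative and depends on the
version and build flags:)

  import ctypes
  # RTLD_GLOBAL makes the interpreter's symbols visible process-wide,
  # so extension modules that never linked against libpython.so can
  # still resolve them.
  ctypes.CDLL("libpython3.7m.so.1.0", mode=ctypes.RTLD_GLOBAL)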

-n

On Wed, Apr 24, 2019, 09:54 Victor Stinner  wrote:

> Hum, I found issues with libpython: C extensions are explicitly linked
> to libpython built in release mode. So a debug python loading a C
> extension may load libpython in release mode, whereas libpython in
> debug mode is already loaded.
>
> When Python is built with --enable-shared, the python3.7 program is
> linked to libpython3.7m.so.1.0 on Linux. C extensions are explicitly
> linked to libpython3.7m as well:
>
> $ python3.7-config --ldflags
> ... -lpython3.7m ...
>
> Example with numpy:
>
> $ ldd /usr/lib64/python3.7/site-packages/numpy/core/
> umath.cpython-37m-x86_64-linux-gnu.so
> ...
> libpython3.7m.so.1.0 => /lib64/libpython3.7m.so.1.0 (...)
> ...
>
> When Python 3.7 is compiled in debug mode, libpython gets a "d" flag
> for debug: libpython3.7dm.so.1.0.
>
> I see 2 solutions:
>
> (1) Use a different directory. If "libpython" gets the same filename
> in release and debug mode, at least, they must be installed in
> different directories. If libpython build in debug mode is installed
> in /usr/lib64/python3.7-dbg/ for example, python3.7-dbg should be
> compiled with -rpath /usr/lib64/python3.7-dbg/ to get the debug
> libpython.
>
> (2) If "libpython" gets a different filename in debug mode, C
> extensions should not be linked to libpython explicitly but
> *implicitly* to avoid picking the wrong libpython. For example, remove
> "-lpython3.7m" from "python3.7-config --ldflags" output.
>
> The option (1) rely on rpath which is discouraged by Linux vendors and
> may not be supported by all operating systems.
>
> The option (2) is simpler and likely more portable.
>
> Currently, C extensions of the standard library may or may not be
> linked to libpython depending on they are built. In practice, both
> work since python3.7 is already linked to libpython: so libpython is
> already loaded in memory before C extensions are loaded.
>
> I opened https://bugs.python.org/issue34814 to discuss how C
> extensions of the standard library should be linked but I closed it
> because we failed to find a consensus and the initial use case became
> a non-issue. It seems like we should reopen the discussion :-)
>
> Victor

Re: [Python-Dev] Concurrent.futures: no type discovery for PyCharm

2019-04-23 Thread Nathaniel Smith
On Tue, Apr 23, 2019, 05:09 Andrew Svetlov  wrote:

> I agree that `from typing import TYPE_CHECKING` is not desirable from
> the import time reduction perspective.
>
> From my understanding code completion *can* be based on type hinting
> to avoid actual code execution.
> That's why I've mentioned that typeshed already has the correct type
> information.
>
> if TYPE_CHECKING:
> import ...
>
> requires mypy modification.
>
> if False:
> import ...
>
> Works right now for stdlib (mypy ignores stdlib code but uses typeshed
> anyway) but looks a little cryptic.
> Requires a comprehensive comment at least.
>

Last time I looked at this, I'm pretty sure `if False` broke at least one
popular static analysis tool (i.e., it was clever enough to ignore everything
inside `if False`) – I think either pylint or jedi?

I'd suggest checking any clever hacks against at least: mypy,
pylint/astroid, jedi, pyflakes, and pycharm. They all have their own static
analysis engines, and each one has its own idiosyncratic quirks.

We've struggled with this a *lot* in trio, and eventually ended up giving
up on all forms of dynamic export cleverness; we've even banned the use of
__all__ entirely. Static analysis has gotten good enough that users won't
accept it not working, but it hasn't gotten good enough to handle anything
but the simplest static exports in a reliable way:
https://github.com/python-trio/trio/pull/316
https://github.com/python-trio/trio/issues/542

The stdlib has more leeway because when tools don't work on the stdlib then
they tend to eventually add workarounds. I'm just saying, think twice
before diving into clever hacks to workaround static analysis limits, and
if you're going to do it then be careful to be thorough. You're basically
relying on undocumented bugs, and it gets really messy really quickly.

-n


Re: [Python-Dev] Concurrent.futures: no type discovery for PyCharm

2019-04-20 Thread Nathaniel Smith
On Sat, Apr 20, 2019 at 2:11 PM Inada Naoki  wrote:
>
> "import typing" is slow too.

Many static analysis tools will also accept:

TYPE_CHECKING = False
if TYPE_CHECKING:
...

At least mypy and pylint both treat all variables named TYPE_CHECKING
as true, regardless of where they came from. I'm not sure if this is
intentional or because they're cutting corners, but it works...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-16 Thread Nathaniel Smith
On Mon, Apr 15, 2019 at 8:58 PM Michael Sullivan  wrote:
>
> On Mon, Apr 15, 2019 at 4:06 PM Nathaniel Smith  wrote:
>>
>> On Mon, Apr 15, 2019, 15:27 Michael Sullivan  wrote:
>>>
>>> > The main question is if anyone ever used Py_TRACE_REFS? Does someone
>>> > use sys.getobjects() or PYTHONDUMPREFS environment variable?
>>>
>>> I used sys.getobjects() today to track down a memory leak in the 
>>> mypyc-compiled version of mypy.
>>>
>>> We were leaking memory badly but no sign of the leak was showing up in 
>>> mypy's gc.get_objects() based profiler. Using a debug build and switching 
>>> to sys.getobjects() showed that we were badly leaking int objects. A quick 
>>> inspection of the values in question (large and random looking) suggested 
>>> we were leaking hash values, and that quickly pointed me to 
>>> https://github.com/mypyc/mypyc/pull/562.
>>>
>>> I don't have any strong feelings about whether to keep it in the "default" 
>>> debug build, though. I was using a debug build that I built myself with 
>>> every debug feature that seemed potentially useful.
>>
>>
>> This is mostly to satisfy my curiosity, so feel free to ignore: did you try 
>> using address sanitizer or valgrind?
>>
> I didn't, mostly because I assume that valgrind wouldn't play well with 
> cpython. (I've never used address sanitizer.)
>
> I was curious, so I went back and tried it out.
> It turned out to not seem to need that much fiddling to get to work. It slows 
> things down a *lot* and produced 17,000 "loss records", though, so maybe I 
> don't have it working right. At a glance the records did not shed any light.
>
> I'd definitely believe that valgrind is up to the task of debugging this, but 
> my initial take with it shed much less light than my sys.getobjects() 
> approach. (Though note that my sys.getobjects() approach was slotting it into 
> an existing python memory profiler we had hacked up, so...)

valgrind on CPython is definitely a bit fiddly – if you need it again
you might check out Misc/README.valgrind.

Supposedly memory sanitizer is just './configure
--with-memory-sanitizer', but I haven't tried it either :-)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] PEP 591 discussion (final qualifier) happening at typing-sig@

2019-04-15 Thread Nathaniel Smith
On Mon, Apr 15, 2019 at 5:00 PM Michael Sullivan  wrote:
>
> I've submitted PEP 591 (Adding a final qualifier to typing) for discussion to 
> typing-sig [1].

I'm not on typing-sig [1] so I'm replying here.

> Here's the abstract:
> This PEP proposes a "final" qualifier to be added to the ``typing``
> module---in the form of a ``final`` decorator and a ``Final`` type
> annotation---to serve three related purposes:
>
> * Declaring that a method should not be overridden
> * Declaring that a class should not be subclassed
> * Declaring that a variable or attribute should not be reassigned

I've been meaning to start blocking subclassing at runtime (e.g. like
[2]), so being able to express that to the typechecker seems like a
nice addition. I'm assuming though that the '@final' decorator doesn't
have any runtime effect, so I'd have to say it twice?

@typing.final
class MyClass(metaclass=othermod.Final):
...

Or on 3.6+ with __init_subclass__, it's easy to define a @final
decorator that works at runtime, but I guess this would have to be a
different decorator?

@typing.final
@alsoruntime.final
class MyClass:
...

This seems kinda awkward. Have you considered giving it a runtime
effect, or providing some way for users to combine these two things
together on their own?
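
Concretely, the kind of runtime-only decorator I have in mind is
something like this sketch (using __init_subclass__; 'runtime_final'
is a made-up name, not an existing API):

  def runtime_final(cls):
      # Reject subclassing at class-creation time. __init_subclass__
      # is looked up on the parent when a subclass is being created,
      # so installing a raising version here blocks all subclasses.
      def blocker(subcls, **kwargs):
          raise TypeError(
              f"type {cls.__name__!r} is not an acceptable base type")
      cls.__init_subclass__ = classmethod(blocker)
      return cls

  @runtime_final
  class MyClass:
      pass

  # class Oops(MyClass): pass   # raises TypeError immediately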

-n

[1] https://github.com/willingc/pep-communication/issues/1
[2] https://stackoverflow.com/a/3949004/1925449

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-15 Thread Nathaniel Smith
On Mon, Apr 15, 2019, 15:27 Michael Sullivan  wrote:

> > The main question is if anyone ever used Py_TRACE_REFS? Does someone
> > use sys.getobjects() or PYTHONDUMPREFS environment variable?
>
> I used sys.getobjects() today to track down a memory leak in the
> mypyc-compiled version of mypy.
>
> We were leaking memory badly but no sign of the leak was showing up in
> mypy's gc.get_objects() based profiler. Using a debug build and switching
> to sys.getobjects() showed that we were badly leaking int objects. A quick
> inspection of the values in question (large and random looking) suggested
> we were leaking hash values, and that quickly pointed me to
> https://github.com/mypyc/mypyc/pull/562.
>
> I don't have any strong feelings about whether to keep it in the "default"
> debug build, though. I was using a debug build that I built myself with
> every debug feature that seemed potentially useful.
>

This is mostly to satisfy my curiosity, so feel free to ignore: did you try
using address sanitizer or valgrind?

-n

>


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-12 Thread Nathaniel Smith
On Fri, Apr 12, 2019 at 5:05 PM Steve Dower  wrote:
>
> On 12Apr.2019 1643, Nathaniel Smith wrote:
> > On Thu, Apr 11, 2019 at 8:26 AM Steve Dower  wrote:
> >>
> >> On 10Apr2019 1917, Nathaniel Smith wrote:
> > I don't know how many people use Py_TRACE_REFS, but if we can't find
> > anyone on python-dev who uses it then it must be pretty rare. If
> > dropping Py_TRACE_REFS would let us converge the ABIs and get rid of
> > all the stuff above, then that seems like a pretty good trade! But
> > maybe the Windows C runtime issue will foil this...
>
> The very first question I asked was whether this would let us converge
> the ABIs, and the answer was "no".
>
> Otherwise I'd have said go for it, despite the C runtime issues.

I don't see that in the thread... just Victor saying he isn't sure
whether there might be other ABI incompatibilities lurking that he
hasn't found yet. Did I miss something?

I'm mostly interested in this because of the possibility of converging
the ABIs. If you think that the C runtime thing isn't a blocker for
that, then that's useful information. Though obviously we still need
to figure out whether there are any other blockers :-).

> >>>> The reason we ship debug Python binaries is because debug builds use a
> >>>> different C Runtime, so if you do a debug build of an extension module
> >>>> you're working on it won't actually work with a non-debug build of 
> >>>> CPython.
> >>>
> >>> ...But this is an important point. I'd forgotten that MSVC has a habit
> >>> of changing the entire C runtime when you turn on the compiler's
> >>> debugging mode.
> >>
> >> Technically they are separate options, but most project files are
> >> configured such that *their* Debug/Release switch affects both the
> >> compiler options (optimization) and the linker options (C runtime linkage).
> >
> > So how do other projects handle this? I guess historically the main
> > target audience for Visual Studio was folks building monolithic apps,
> > where you can just rebuild everything with whatever options you want,
> > and compared to that Python extensions are messier. But Python isn't
> > the only project in this boat. Do ruby, nodejs, R, etc., all provide
> > separate debug builds with incompatible ABIs on Windows, and propagate
> > that information throughout their module/package ecosystem?
>
> Mostly I hear complaints about those languages *not* providing any help
> here. Python is renowned for having significantly better Windows support
> than any of them, so they're the wrong comparison to make in my opinion.
> Arguing that we should regress because other languages haven't caught up
> to us yet makes no sense.
>
> The tools that are better than Python typically don't ship debug builds
> either, unless you specifically request them. But they also don't leak
> their implementation details all over the place. If we had a better C
> API, we wouldn't have users who needed to match ABIs.

Do you happen to have a list of places where the C API leaks details
of the underlying CRT?

(I'm mostly curious because whenever I've looked my conclusion was
essentially: "Well... I don't see any places that are *definitely*
broken, so maybe mixing CRTs is fine? but I have zero confidence that
I caught everything, so probably better to play it safe?". At least on
py3 – I know the py2 C API was definitely broken if you mixed CRTs,
because of the exposed FILE*.)

> For the most part, disabling optimizations in your own extension but
> using the non-debug ABI is sufficient, and if you're having to deal with
> other people's packages then maybe you don't have any choice (though I
> do know of people who have built debug versions of numpy before - turns
> out Windows developers are often just as capable as non-Windows
> developers when it comes to building things ;)

I'm not sure why you think I was implying otherwise? I'm sorry if you
thought I was attacking your users or something. I did say that I
thought most users downloading the debug builds were probably confused
about what they were actually getting, but I didn't mean because they
were stupid Windows users, I meant because the debug builds are so
confusing that even folks on the Python core team are confused about
what they're actually getting.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-12 Thread Nathaniel Smith
On Thu, Apr 11, 2019 at 8:26 AM Steve Dower  wrote:
>
> On 10Apr2019 1917, Nathaniel Smith wrote:
> > It sounds like --with-pydebug has accumulated a big grab bag of
> > unrelated features, mostly stuff that was useful at some point for
> > some CPython dev trying to debug CPython itself? It's clearly not
> > designed with end users as the primary audience, given that no-one
> > knows what it actually does and that it makes third-party extensions
> > really awkward to run. If that's right then I think Victor's plan to
> > sort through what it's actually doing makes a lot of sense,
> > especially if we can remove the ABI breaking stuff, since that causes
> > a disproportionate amount of trouble.
>
> Does it really cause a "disproportionate" amount of trouble? It's
> definitely not meant for anyone who isn't working on C code, whether in
> CPython, an extension or a host application. If you want to use
> third-party extensions and are not able to rebuild them, that's a very
> good sign that you probably shouldn't be on the debug build at all.

Well, here's what I mean by "disproportionate". Some of the costs of
the ABI divergence are:

- The first time I had to debug a C extension, I wasted a bunch of
time trying to figure out how I was supposed to use Debian's
'python-dbg' package (the --with-pydebug build), before eventually
figuring out that it was a red herring and what I actually wanted was
the -dbgsym package (their equivalent of MSVC's /Zi /DEBUG files).

- The extension loading machinery has extra code and complexity to
track the two different ABIs. The package ecosystem does too, e.g.
distutils needs to name extensions appropriately, and we need special
wheel tags, and pip needs code to handle these tags:

https://github.com/pypa/pip/blob/54b6a91405adc79cdb8a2954e9614d6860799ccb/src/pip/_internal/pep425tags.py#L106-L109

- If you want some of the features of --with-pydebug that don't change
the ABI, then you still have to rebuild third-party extensions to get
at them, and that's a significant hassle. (I could do it if I had to,
but my time has value.)

- Everyone who uses ctypes to access a PyObject* has to include some
extra hacks to handle the difference between the regular and debug
ABIs. There are a few different versions that get copy/pasted around
as folklore, and they're all pretty obscure. For example:

https://github.com/pallets/jinja/blob/fd89fed7456e755e33ba70674c41be5ab222e193/jinja2/debug.py#L317-L334

https://github.com/johndpope/sims4-ai-engine/blob/865212e841c716dc4364e0dba286f02af8d716e8/core/framewrapper.py#L12-L41

https://github.com/python-trio/trio/blob/862ced04e1f19287e098380ed8a0635004c36dd1/trio/_core/_multierror.py#L282
  And then if you want to test this code, it means you have to add a
--with-pydebug build to your CI infrastructure...
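
(If you're wondering why those hacks are needed: Py_TRACE_REFS puts two
extra pointers at the *front* of every object, which shifts every field
that follows. Simplified from Include/object.h:

    typedef struct _object {
    #ifdef Py_TRACE_REFS
        struct _object *_ob_next;   /* doubly linked list of all live objects */
        struct _object *_ob_prev;
    #endif
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
    } PyObject;

So any ctypes code that hardcodes the offsets of ob_refcnt or ob_type
has to detect which ABI it's running under and adjust, which is exactly
what those snippets do.)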

I don't know how many people use Py_TRACE_REFS, but if we can't find
anyone on python-dev who uses it then it must be pretty rare. If
dropping Py_TRACE_REFS would let us converge the ABIs and get rid of
all the stuff above, then that seems like a pretty good trade! But
maybe the Windows C runtime issue will foil this...

> >> The reason we ship debug Python binaries is because debug builds use a
> >> different C Runtime, so if you do a debug build of an extension module
> >> you're working on it won't actually work with a non-debug build of CPython.
> >
> > ...But this is an important point. I'd forgotten that MSVC has a habit
> > of changing the entire C runtime when you turn on the compiler's
> > debugging mode.
>
> Technically they are separate options, but most project files are
> configured such that *their* Debug/Release switch affects both the
> compiler options (optimization) and the linker options (C runtime linkage).

So how do other projects handle this? I guess historically the main
target audience for Visual Studio was folks building monolithic apps,
where you can just rebuild everything with whatever options you want,
and compared to that Python extensions are messier. But Python isn't
the only project in this boat. Do ruby, nodejs, R, etc., all provide
separate debug builds with incompatible ABIs on Windows, and propagate
that information throughout their module/package ecosystem?

-n

--
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-11 Thread Nathaniel Smith
On Thu, Apr 11, 2019 at 8:32 AM Serhiy Storchaka  wrote:
> On the other hand, since using the debug allocator doesn't cause
> compatibility problems, it may be possible to use a similar technique
> for the objects' doubly linked list. Although this is not easy, because
> some objects are placed in static memory.

I guess one could track static objects separately, e.g. keep a simple
global PyList containing all statically allocated objects. (This is
easy since we know they're all immortal.) And then sys.getobjects()
could walk the heap objects and statically allocated objects
separately.
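
Something like this, say (a hypothetical sketch; _Py_static_objects and
_PyObject_RegisterStatic are names I just made up):

    /* All statically allocated objects get appended to one global
     * PyList at interpreter startup. They're immortal, so the list
     * never needs pruning. */
    static PyObject *_Py_static_objects = NULL;

    int
    _PyObject_RegisterStatic(PyObject *op)
    {
        if (_Py_static_objects == NULL) {
            _Py_static_objects = PyList_New(0);
            if (_Py_static_objects == NULL)
                return -1;
        }
        return PyList_Append(_Py_static_objects, op);
    }

Then sys.getobjects() would walk the debug allocator's heap records and
this list, one after the other.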

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Nathaniel Smith
On Wed, Apr 10, 2019 at 1:50 PM Steve Dower  wrote:
>
> On 10Apr2019 1227, Nathaniel Smith wrote:
> > On Wed, Apr 10, 2019, 04:04 Victor Stinner  wrote:
> > I don't think that I ever used sys.getobjects(), whereas many projects
> > use gc.get_objects() which is also available in release builds (not
> > only in debug builds).
> >
> >
> > Can anyone explain what pydebug builds are... for? Confession: I've
> > never used them myself, and don't know why I would want to.
> >
> > (I have to assume that most of Steve's Windows downloads are from folks
> > who thought they were downloading a python debugger.)
>
> They're for debugging :)
>
> In general, debug builds are meant for faster inner-loop development.
> They generally do incremental builds properly and much faster by
> omitting most optimisations, which also enables source mapping to be
> more accurate when debugging. Assertions are typically enabled so that
> you are notified when a precondition is first identified rather than
> when it causes the crash (compiling these out later means you don't pay
> a runtime cost once you've got the inputs correct - generally these are
> used for developer-controlled values, rather than user-provided ones).
>
> So the idea is that you can quickly edit, build, debug, fix your code in
> a debug configuration, and then use a release configuration for the
> actual released build. Full release builds may take 2-3x longer than
> full debug builds, given the extra effort they make at optimisation, and
> very often can't do minimal incremental builds at all (so they may be
> 10-100x slower if you only modified one source file). But because the
> builds behave functionally equivalently, you can iterate with the faster
> configuration and get more done.

Sure, I'm familiar with the idea of debug and optimization settings in
compilers. I build python with custom -g and -O flags all the time. (I
do it by setting OPT when running configure.) It's also handy that
many Linux distros these days let you install debug metadata for all
the binaries they ship – I've used that when debugging third-party
extension modules, to get a better idea of what was happening when a
backtrace passes through libpython. But --with-pydebug is a whole
other thing beyond that, that changes the ABI, has its own wheel tags,
requires special cases in packages that use ctypes to access PyObject*
internals, and appears to be almost entirely undocumented.

It sounds like --with-pydebug has accumulated a big grab bag of
unrelated features, mostly stuff that was useful at some point for
some CPython dev trying to debug CPython itself? It's clearly not
designed with end users as the primary audience, given that no-one
knows what it actually does and that it makes third-party extensions
really awkward to run. If that's right then I think Victor's plan to
sort through what it's actually doing makes a lot of sense,
especially if we can remove the ABI breaking stuff, since that causes
a disproportionate amount of trouble.

> The reason we ship debug Python binaries is because debug builds use a
> different C Runtime, so if you do a debug build of an extension module
> you're working on it won't actually work with a non-debug build of CPython.

...But this is an important point. I'd forgotten that MSVC has a habit
of changing the entire C runtime when you turn on the compiler's
debugging mode. (On Linux, we generally don't bother rebuilding the C
runtime unless you're actually debugging the C runtime, and anyway if
you do want to switch to a debug version of the C runtime, it's ABI
compatible so your program binaries don't have to be rebuilt.)

Is it true that if the interpreter is built against ucrtd.lib, and an
extension module is built against ucrt.lib, then they'll have
incompatible ABIs and not work together? And that this detail is part
of what's been glommed together into the "d" flag in the soabi tag on
Windows?

Is it possible for the Windows installer to include PDB files (/Zi
/DEBUG) to allow debuggers to understand the regular release
executable? (That's what I would have expected to get if I checked a
box labeled "Download debug binaries".)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Nathaniel Smith
On Wed, Apr 10, 2019, 04:04 Victor Stinner  wrote:

> On Tue, Apr 9, 2019 at 10:16 PM Steve Dower  wrote:
> > What are the other changes that would be required?
>
> I don't know.
>
> > And is there another
> > way to get the same functionality without ABI modifications?
>
> Py_TRACE_REFS is a doubly linked list of *all* Python objects. To get
> this functionality, you need to store the list somewhere. I don't know
> how to maintain such a list outside the PyObject structure.
>

I assume these pointers get updated from some generic allocation/free code.
Could that code instead overallocate by 16 bytes, use the first 16 bytes to
hold the pointers, and then return the PyObject* as (actual allocated
pointer + 16)? Basically the "container_of" trick.
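
Rough sketch of what I mean (all names made up; the real version would
use CPython's allocators, not bare malloc):

    #include <stdlib.h>

    /* The list links live *before* the object, so the public PyObject
     * layout -- and hence the ABI -- is identical with and without
     * ref tracing. */
    typedef struct _trace_links {
        struct _trace_links *next;
        struct _trace_links *prev;
    } _trace_links;                      /* 2 pointers = 16 bytes on 64-bit */

    void *
    _alloc_traced(size_t size)
    {
        _trace_links *h = malloc(sizeof(_trace_links) + size);
        if (h == NULL)
            return NULL;
        /* ... link h into the global all-objects list here ... */
        return h + 1;                    /* object starts past the header */
    }

    static _trace_links *
    _links_of(void *op)
    {
        return ((_trace_links *)op) - 1; /* the "container_of" step */
    }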

> I don't think that I ever used sys.getobjects(), whereas many projects
> use gc.get_objects() which is also available in release builds (not
> only in debug builds).


Can anyone explain what pydebug builds are... for? Confession: I've never
used them myself, and don't know why I would want to.

(I have to assume that most of Steve's Windows downloads are from folks who
thought they were downloading a python debugger.)

-n


  1   2   3   4   5   >