[Python-ideas] Re: Live variable analysis -> earlier release?

2020-04-09 Thread Gregory P. Smith
On Wed, Apr 8, 2020, 10:37 AM Antoine Pitrou  wrote:

> On Wed, 8 Apr 2020 10:18:47 -0700
> Guido van Rossum  wrote:
> >
> > > But when I leave "large" temp objects hanging and
> > > give a rip, I already stick in "del" statements anyway.  Very rarely,
> > > but it happens.
> > >
> >
> > I recall that in the development of asyncio there were a few places where
> > we had to insert del statements, not so much to free a chunk of memory,
> but
> > to cause some destructor or finalizer to run early enough. (IIRC not
> right
> > at that moment, but at some later moment even if some Futures are still
> > alive.) Those issues took considerable effort to find, and would
> > conceivably have been prevented by this proposal.
>
> If code relies on live variable analysis for correctness, it implies
> other Python implementations must implement it with exactly the same
> results.
>

As far as I know they all do? The existence of locals() as an API cements
this behavior. If you want something to disappear from locals it requires
an explicit del.  (explicit is better than implicit and all...)

I'd actually accept this optimization in something like micropython where
bending rules to fit in mere kilobytes makes sense. But in CPython I want
to see serious demonstrated practical benefit before changing this behavior
in any file by default.  (it could be implemented per file based on a
declaration; this would be a bytecode optimization pass)
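
To illustrate the locals() point — a minimal sketch (the big list is just a
stand-in for a "large" temporary):

    def f():
        x = [0] * 1_000_000      # stand-in for a large temporary
        total = sum(x)           # last textual use of x
        assert 'x' in locals()   # today: x is still visible (and alive) here
        del x                    # only an explicit del makes it disappear
        assert 'x' not in locals()
        return total

    f()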

-gps


> Regards
>
> Antoine.
>


[Python-ideas] Re: Live variable analysis -> earlier release?

2020-04-09 Thread Andrew Barnert via Python-ideas
On Apr 9, 2020, at 15:13, Wes Turner  wrote:
> 
> - > And then take a look at how @ApacheArrow
>   "supports zero-copy reads for lightning-fast data access without 
> serialization overhead."
> - .@blazingsql … #cuDF … @ApacheArrow 
>   https://docs.blazingdb.com/docs/blazingsql

This isn’t relevant here at all. How objects get constructed and manage their 
internal storage is completely orthogonal to how Python manages object 
lifetimes.

>   … New #DataFrame Interface and when that makes a copy for 2x+ memory use
>   - "A dataframe protocol for the PyData ecosystem"
> 
> https://discuss.ossdata.org/t/a-dataframe-protocol-for-the-pydata-ecosystem/267

Same here.

> Presumably, nothing about magic del statements would affect C extensions, 
> Cython, zero-copy reads, or data that's copied to the GPU for faster 
> processing; but I don't understand this or how weakrefs and c-extensions 
> share memory that could be unlinked by a del.

And same for some of this—but not all.

C extensions can do the same kind of frame hacking, etc., as Python code, so 
they will have the same problems already raised in this thread. But I don’t 
think they add anything new. (There are special rules allowing you to 
cheat with objects that haven’t been shared with Python code yet, which sounds 
like it would make things more complicated—until you realize that objects that 
haven’t been shared with Python code obviously can’t be affected by when Python 
code releases references.)

But weakrefs would be affected, and that might be a problem with the proposal 
that I don’t think anyone else has noticed before you.

Consider this toy example:

import weakref
from concurrent.futures import ThreadPoolExecutor

spam = make_giant_spam()
weakspam = weakref.ref(spam)
with ThreadPoolExecutor() as e:
    for _ in range(1000):
        e.submit(dostuff, weakspam)

Today, the spam variable lives until the end of the scope, which doesn’t happen 
until the with statement ends, which doesn’t happen until all 1000 tasks 
complete. So, the object in that variable is still alive for all of the tasks.

With Guido’s proposed change, the spam variable is deleted after the last 
statement that uses it, which is before the with statement is even entered. 
Assuming it’s the only (non-weak) reference to the object, which is probably 
true, it will get destroyed, releasing all the memory (or other expensive 
resources) used by that giant spam object. That’s the whole point of the 
proposal, after all. But that means weakspam is now a dead weakref. So all 
those dostuff tasks are now doing stuff with a dead weakref. Presumably dostuff 
is designed to handle that safely, so you won’t crash or anything—but it can’t 
do the actual stuff you wanted it to do with that spam object.
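
Presumably something like this minimal sketch (the body is a hypothetical
placeholder):

    def dostuff(ref):
        spam = ref()          # calling a weakref yields the referent, or None
        if spam is None:
            return None       # dead weakref: degrade gracefully
        return len(spam)      # placeholder for the real work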

And, while this is obviously a toy example, perfectly reasonable real code will 
do similar things. It’s pretty common to use weakrefs for cases where 99% of 
the time the object is there but occasionally it’s dead (e.g., during graceful 
shutdown), and changing that 99% to 0% or 1% will make the entire process 
useless. It’s also common to use weakrefs for cases where 80% of the time the 
object is there but 20% of the time it’s been ejected from some cache and has 
to be regenerated; changing that 80% to 1% will mean the process still 
functions, but the cache is no longer doing anything, so it functions a lot 
slower. And so on.
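
For instance, a typical weakref-backed cache looks something like this (a
sketch; Resource stands in for whatever is expensive to regenerate):

    import weakref

    class Resource:
        def __init__(self, key):
            self.key = key

    _cache = weakref.WeakValueDictionary()

    def get_resource(key):
        obj = _cache.get(key)    # hit only while someone still holds obj
        if obj is None:
            obj = Resource(key)  # stands in for an expensive rebuild
            _cache[key] = obj
        return obj

Today a caller's local reference keeps the entry alive until scope exit;
under the proposal it could be collected right after its last textual use, so
the next call would rebuild it.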

So, unless you could introduce some compiler magic to detect weakref.ref and 
weakref.WeakValueDictionary.__setitem__ and so on (which might not be feasible, especially 
since it’s often buried inside some wrapper code), this proposal might well 
break many, maybe even most, good uses of weakrefs.

> Would be interested to see the real performance impact of this potential 
> optimization:
> - 10%: 
> https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

Skimming this, it looks like this one is not just orthogonal to Guido’s 
proposal, it’s almost directly counter to it. Their goal is to have relatively 
short-lived killable children that defer refcount twiddling and destruction as 
much as possible so that fork-inherited objects don’t have to be copied and 
temporary objects don’t have to be cleaned up, they can just be abandoned. 
Guido’s goal is to get things decref’d and therefore hopefully destroyed as 
early as possible.

Anyway, their optimization is definitely useful for a special class of programs 
that meet some requirements that sound unusual until you realize a lot of web 
servers/middlewares are designed around nearly the same requirements. People 
have done similar (in fact, even more radical, akin to building CPython and all 
of your extensions with refcounting completely disabled) in C and other 
languages, and there’s no reason (if you’re really careful) it couldn’t work in 
Python. But it’s certainly not the behavior you’d want from a general-purpose 
Python implementation.

[Python-ideas] Re: Live variable analysis -> earlier release?

2020-04-09 Thread Wes Turner
Thanks for removing the mystery.

FWIW, here are some of the docs and resources for memory management in
Python;
I share these not to be obnoxious or to atone, but to point to the docs
that would need updating to explain what is going on if this is not
explicit.

- https://docs.python.org/3/reference/datamodel.html#object.__del__
- https://docs.python.org/3/extending/extending.html?highlight=__del__#thin-ice
- https://docs.python.org/3/c-api/memory.html
- https://docs.python.org/3/library/gc.html
- https://docs.python.org/3/library/tracemalloc.html
- https://devguide.python.org/gdb/
- https://devguide.python.org/garbage_collector/
- https://devguide.python.org/garbage_collector/#optimization-reusing-fields-to-save-memory
- https://doc.pypy.org/en/latest/gc_info.html
- https://github.com/jythontools/jython/blob/master/src/org/python/modules/gc.java
  https://javadoc.io/doc/org.python/jython-standalone/2.7.2/org/python/modules/gc.html
- https://github.com/IronLanguages/ironpython2/blob/master/Src/IronPython.Modules/gc.cs
  https://github.com/IronLanguages/ironpython2/blob/master/Src/StdLib/Lib/test/crashers/gc_has_finalizer.py
  https://github.com/IronLanguages/ironpython2/blob/master/Src/StdLib/Lib/test/crashers/gc_inspection.py
  https://github.com/IronLanguages/ironpython3/blob/master/Src/IronPython.Modules/gc.cs

- "[Python-Dev] Re: Mixed Python/C debugging"

https://mail.python.org/archives/list/python-...@python.org/message/Z3S2RAXRIHAWT6JEOXEBPPBTPUTMDZI7/

- @anthonypjshaw's CPython Internals book has (will have) a memory
management chapter.
- > And then take a look at how @ApacheArrow
  "supports zero-copy reads for lightning-fast data access without
serialization overhead."
- .@blazingsql … #cuDF … @ApacheArrow
  https://docs.blazingdb.com/docs/blazingsql

  … New #DataFrame Interface and when that makes a copy for 2x+ memory use
  - "A dataframe protocol for the PyData ecosystem"

https://discuss.ossdata.org/t/a-dataframe-protocol-for-the-pydata-ecosystem/267

Presumably, nothing about magic del statements would affect C extensions,
Cython, zero-copy reads, or data that's copied to the GPU for faster
processing; but I don't understand this or how weakrefs and c-extensions
share memory that could be unlinked by a del.

Would be interested to see the real performance impact of this potential
optimization:
- 10%:
https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

On Thu, Apr 9, 2020 at 2:48 PM Andrew Barnert  wrote:

> On Apr 8, 2020, at 23:53, Wes Turner  wrote:
> >
> > Could something just heuristically add del statements with an AST
> transformation that we could review with source control before committing?
> >
> > When the gc pause occurs is something I don't fully understand. For
> example:
>
> Your examples don’t have anything to do with gc pause.
>
> > FWIW, this segfaults CPython in 2 lines:
> >
> > import ctypes
> > ctypes.cast(1, ctypes.py_object)
>
> Yes, because this is ultimately trying to print the repr of (PyObject*)1,
> which means calling some function that tries to dereference some member of
> a struct at address 1, which means trying to access an int or pointer or
> whatever at address 1 or 9 or 17 or whatever. On most platforms, those
> addresses are going to be unmapped (and, on some, illegally aligned to
> boot), so you’ll get a segfault. This has nothing to do with the GC, or
> with Python objects at all.
>
> Interestingly, this (tends to?) work; even when there are, ah, scope
> closures?:
> >
> > import ctypes, gc
> > x = 22
> > _id = id(x)
> > del x
> > gc.collect()
> > y = ctypes.cast(_id, ctypes.py_object).value
> > assert y == 22
>
> The gc.collect isn’t doing anything here.
>
> First, the 22 object, like other small integers and a few other special
> cases, is immortal. Even after you del x, the object is still alive, so of
> course everything works.
>
> Even if you used a normal object that does get deleted, it would get
> deleted immediately when the last reference to the value goes away, in that
> del x statement. The collect isn’t needed and doesn’t do anything relevant
> here. (It’s there to detect reference cycles, like `a.b=b; b.a=a; del a;
> del b`. Assuming a and b were the only references to their objects at the
> start, a.b and b.a are the only references at the end. They won’t be
> deleted by refcounting because there’s still one reference to each, but
> they are garbage because they’re not accessible. The gc.collect is a cycle
> detector that handles exactly this case.)
>
> But your code may well still often work on most platforms. Deleting an
> object rarely unmaps its memory; it just returns that memory to the object
> allocator’s store. Eventually that memory will be reused for another
> object, but until it is, it will often still look like a perfectly valid
> value if you cheat and look at it (as you’re doing). (And even after it’s
> reused, it will often end up getting reused by some object of the same
> 

[Python-ideas] Re: Exception spaces

2020-04-09 Thread Paul Sokolovsky
Hello,

On Thu, 9 Apr 2020 17:40:26 -0300
André Roberge  wrote:

> On Thu, Apr 9, 2020 at 5:31 PM Soni L.  wrote:
> 
> > Sometimes, you have an API:
> >
> > SNIP  
> 
> >  Raises:
> >  PropertyError: If the property is not supported by
> > this config source.
> >  LookupError: If the property is supported, but isn't
> > available.
> >  ValueError: If the property doesn't have exactly one
> > value. """
> >  raise PropertyError
> >
> > and you don't want your API to mask bugs. How would it mask bugs?
> > For example, if API consumers do an:
> >
> >  try:
> >  x = foo.get_property_value(prop)
> >  except ValueError:
> >  # handle it
> >
> > SNIP  
> 
> If you don't want standard Python exceptions, such as ValueError to be
> confused with exceptions from your own app, just create your own
> custom exceptions such as
> 
> class MyAppValueError(Exception):
>     pass
> 
> and raise these custom exceptions when necessary.
> No need to change Python or use convoluted logic.

... And if you have a problem with a 3rd-party lib, drop them a bug
report. And if they don't listen (everybody is smart nowadays and
knows better how it should be), just fork that lib, and show everyone
how to do it right...

[]

-- 
Best regards,
 Paul  mailto:pmis...@gmail.com


[Python-ideas] Re: Exception spaces

2020-04-09 Thread André Roberge
On Thu, Apr 9, 2020 at 5:31 PM Soni L.  wrote:

> Sometimes, you have an API:
>
> SNIP

>  Raises:
>  PropertyError: If the property is not supported by this config
>  source.
>  LookupError: If the property is supported, but isn't
> available.
>  ValueError: If the property doesn't have exactly one value.
>  """
>  raise PropertyError
>
> and you don't want your API to mask bugs. How would it mask bugs? For
> example, if API consumers do an:
>
>     try:
>         x = foo.get_property_value(prop)
>     except ValueError:
>         # handle it
>
> SNIP

If you don't want standard Python exceptions, such as ValueError to be
confused with exceptions from your own app, just create your own custom
exceptions such as

class MyAppValueError(Exception):
    pass

and raise these custom exceptions when necessary.
No need to change Python or use convoluted logic.

André Roberge



[Python-ideas] Exception spaces

2020-04-09 Thread Soni L.

Sometimes, you have an API:

    @abc.abstractmethod
    def get_property_value(self, prop):
    """Returns the value associated with the given property.

    Args:
    prop (DataProperty): The property.

    Returns:
    The value associated with the given property.

    Raises:
    PropertyError: If the property is not supported by this config
    source.
    LookupError: If the property is supported, but isn't available.
    ValueError: If the property doesn't have exactly one value.
    """
    raise PropertyError

and you don't want your API to mask bugs. How would it mask bugs? For 
example, if API consumers do an:


    try:
        x = foo.get_property_value(prop)
    except ValueError:
        # handle it

then the following implementation:

    def get_property_value(self, prop):
        iterator = self.get_property_values(prop)
        try:
            # note: unpacking
            ret, = iterator
        except LookupError as exc: raise RuntimeError from exc  # don't accidentally swallow bugs in the iterator

        return ret

    def get_property_values(self, prop):
        try:
            factory = self.get_supported_properties()[prop]
        except KeyError as exc: raise PropertyError from exc
        iterator = factory(self._obj)
        try:
            first = next(iterator)
        except StopIteration: return (x for x in ())
        except abdl.exceptions.ValidationError as exc: raise LookupError from exc
        except LookupError as exc: raise RuntimeError from exc  # don't accidentally swallow bugs in the iterator

        return itertools.chain([first], iterator)

could cause you to accidentally swallow unrelated ValueErrors.

so instead this needs to be rewritten. you can't just unpack things from 
the iterator, you need to wrap the iterator into another iterator, that 
then converts ValueError into RuntimeError.


but if that ValueError is part of *your* API requirements... it's 
impossible to use!


so my proposal is that we get "exception spaces". they'd be used through 
the 'in' keyword, as in "except in" and "raise in".


for example:

    X = object()
    Y = object()

    def foo():
        raise LookupError in X

    def bar():
        try:
            foo()
        except LookupError in Y:
            print("bar: caught LookupError in Y, ignoring")

    def baz():
        try:
            bar()
        except LookupError in X:
            print("baz: caught LookupError in X, re-raising in Y")
            raise in Y

    def qux():
        try:
            baz()
        except LookupError in Y:
            print("qux: caught LookupError in Y, ignoring")

    qux()

    # would print:
    # ---
    # baz: caught LookupError in X, re-raising in Y
    # qux: caught LookupError in Y, ignoring
    # ---
    # note the lack of "bar"

(or perhaps "raise in X LookupError" and "except in Y LookupError" etc)

and then you can adjust the above implementations accordingly:
(btw, does anyone know how to tell apart a ValueError from a specific
unpacking and a ValueError from an iterator being used in that unpacking?)


    def get_property_value(self, prop, espace=None):
        iterator = self.get_property_values(prop, espace=espace)
        try:
            # note: unpacking
            ret, = iterator
        except ValueError: raise in espace
        # except LookupError as exc: raise RuntimeError from exc  # no longer needed

        return ret

    def get_property_values(self, prop, espace=None):
        try:
            factory = self.get_supported_properties()[prop]
        except KeyError as exc: raise PropertyError in espace from exc
        iterator = factory(self._obj)
        try:
            first = next(iterator)
        except StopIteration: return (x for x in ())
        except abdl.exceptions.ValidationError as exc: raise LookupError in espace from exc
        # except LookupError as exc: raise RuntimeError from exc  # no longer needed

        return itertools.chain([first], iterator)

as well as the caller:

    espace = object()
    try:
        x = foo.get_property_value(prop, espace=espace)
    except ValueError in espace:
        # handle it

I feel like this feature would significantly reduce bugs in python code, 
as well as significantly improve the error messages related to bugs. 
This would be even better than what we did with StopIteration! This 
would be comparable to Rust's Result type, where you can have 
Result<Result<T, YourError>, TheirError> and the like (except 
slightly/a lot more powerful).



[Python-ideas] Re: Live variable analysis -> earlier release?

2020-04-09 Thread Andrew Barnert via Python-ideas
On Apr 8, 2020, at 23:53, Wes Turner  wrote:
> 
> Could something just heuristically add del statements with an AST 
> transformation that we could review with source control before committing?
> 
> When the gc pause occurs is something I don't fully understand. For example:

Your examples don’t have anything to do with gc pause.

> FWIW, this segfaults CPython in 2 lines:
> 
> import ctypes
> ctypes.cast(1, ctypes.py_object)

Yes, because this is ultimately trying to print the repr of (PyObject*)1, which 
means calling some function that tries to dereference some member of a struct 
at address 1, which means trying to access an int or pointer or whatever at 
address 1 or 9 or 17 or whatever. On most platforms, those addresses are going 
to be unmapped (and, on some, illegally aligned to boot), so you’ll get a 
segfault. This has nothing to do with the GC, or with Python objects at all.

> Interestingly, this (tends to?) work; even when there are, ah, scope closures?:
>  
> import ctypes, gc
> x = 22
> _id = id(x)
> del x
> gc.collect()
> y = ctypes.cast(_id, ctypes.py_object).value
> assert y == 22

The gc.collect isn’t doing anything here. 

First, the 22 object, like other small integers and a few other special cases, 
is immortal. Even after you del x, the object is still alive, so of course 
everything works.

Even if you used a normal object that does get deleted, it would get deleted 
immediately when the last reference to the value goes away, in that del x 
statement. The collect isn’t needed and doesn’t do anything relevant here. 
(It’s there to detect reference cycles, like `a.b=b; b.a=a; del a; del b`. 
Assuming a and b were the only references to their objects at the start, a.b 
and b.a are the only references at the end. They won’t be deleted by 
refcounting because there’s still one reference to each, but they are garbage 
because they’re not accessible. The gc.collect is a cycle detector that handles 
exactly this case.)
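
Concretely, a small demonstration of that case:

    import gc

    class Node:
        pass

    a = Node()
    b = Node()
    a.b, b.a = b, a      # reference cycle
    del a, b             # refcounting alone can’t reclaim the pair now
    print(gc.collect())  # nonzero: the cycle detector frees them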

But your code may well still often work on most platforms. Deleting an object 
rarely unmaps its memory; it just returns that memory to the object allocator’s 
store. Eventually that memory will be reused for another object, but until it 
is, it will often still look like a perfectly valid value if you cheat and look 
at it (as you’re doing). (And even after it’s reused, it will often end up 
getting reused by some object of the same shape, so you won’t crash, you’ll 
just get odd results.)

Anyway, getting off this side track and back to the main point: releasing the 
locals reference to an object that’s no longer being used locally isn’t 
guaranteed to destroy the object—but in CPython, if locals is the only 
reference, the object will be destroyed immediately. That’s why Guido’s 
optimization makes sense.
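
That determinism is easy to see with a finalizer — a minimal sketch:

    import weakref

    class Giant:
        pass

    g = Giant()
    weakref.finalize(g, print, "reclaimed")
    del g   # in CPython this prints "reclaimed" immediately: refcount hit zero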

The only way gc pause is relevant is for other implementations. For example, if 
CPython stops guaranteeing that x is alive until the end of the scope under 
certain conditions, PyPy could decide to do the same thing, and in PyPy, there 
is no refcount; garbage is deleted when it’s detected by the GC. So it wouldn’t 
be deterministic when x goes away, and the question of how much earlier does it 
go away and how much benefit there is becomes more complicated than in CPython. 
But the PyPy guys seem to be really good at figuring out how to test such 
questions empirically.



[Python-ideas] Re: Live variable analysis -> earlier release?

2020-04-09 Thread Tolo Palmer
What about nested functions? I am not sure how we can predict if that
variable is going to be used or not.

I think it is a nice idea to have such an analysis, but I don't think it is
feasible.

On Thu, 9 Apr 2020 at 04:29, Caleb Donovick 
wrote:

> This would almost certainly break my code.   As a DSL developer I do a lot
> of (exec | eval | introspection | ... ) shenanigans, that would make doing
> liveness analysis undecidable.
>
> On Wed, Apr 8, 2020 at 10:51 AM Andrew Barnert via Python-ideas <
> python-ideas@python.org> wrote:
>
>> On Apr 8, 2020, at 09:57, Guido van Rossum  wrote:
>> >
>> > 
>> > Look at the following code.
>> >
>> > def foo(a, b):
>> >     x = a + b
>> >     if not x:
>> >         return None
>> >     sleep(1)  # A calculation that does not use x
>> >     return a*b
>> >
>> > This code DECREFs x when the frame is exited (at the return statement).
>> But (assuming) we can clearly see that x is not needed during the sleep
>> (representing a big calculation), we could insert a "del x" statement
>> before the sleep.
>> >
>> > I think our compiler is smart enough to find out *some* cases where it
>> could safely insert such del instructions.
>>
>> It depends on how much you’re willing to break and still call it “safely”.
>>
>> import inspect
>>
>> def sleep(n):
>>     global store
>>     store = inspect.currentframe().f_back.f_locals['x']
>>
>> This is a ridiculous example, but it shows that you can’t have all of
>> Python’s dynamic functionality and still know when locals are dead. And
>> there are less ridiculous examples with different code. If foo actually
>> calls eval, exec, locals, vars, etc., or if it has a nested function that
>> nonlocals x, etc., how can we spot that at compile time and keep x alive?
>>
>> Maybe that’s ok. After all, that code doesn’t work in a Python
>> implementation that doesn’t have stack frame support. Some of the other
>> possibilities might be more portable, but I don’t know without digging in
>> further.
>>
>> Or maybe you can add new restrictions to what locals and eval and so on
>> guarantee that will make it ok? Some code will break, but only rare
>> “expert” code, where the authors will know how to work around it.
>>
>> Or, if not, it’s definitely fine as an opt-in optimization: decorate the
>> function with @deadlocals and that decorator scans the bytecode and finds
>> any locals that are dead assuming there’s no use of locals/eval/cells/etc.
>> and, because you told it to assume that by opting in to the decorator, it
>> can insert a DELETE_FAST safely.
>>
>> People already do similar things today—e.g., I’ve (only once in live
>> code, but that’s still more than zero) used a @fastconst decorator that
>> turns globals into consts on functions that I know are safe and are
>> bottlenecks, and this would be no different. And of course you can add a
>> recursive class decorator, or an import hook (or maybe even a command line
>> flag or something) that enables it everywhere (maybe with a @nodeadlocals
>> decorator for people who want it _almost_ everywhere but need to opt out
>> one or two functions).
>>
>> Did Victor Stinner explore this as one of the optimizations for FAT
>> Python/PEP 511/etc.? Maybe not, since it’s not something you can insert a
>> guard, speculatively do, and then undo if the guard triggers, which was I
>> think his key idea.
>>


[Python-ideas] Re: Alternative to iterator unpacking that wraps iterator-produced ValueError

2020-04-09 Thread Steven D'Aprano
On Thu, Apr 09, 2020 at 08:27:14AM -0300, Soni L. wrote:

> To put it simple, unpacking raises ValueError:
[...]
> But if the iterator raises ValueError, there's no way to tell it apart 
> from the unpacking:
> 
> >>> def foo():
> ... yield None
> ... raise ValueError

You could start by reading the error message and the traceback; that 
will usually make clear the nature of the ValueError and where it 
occurred (in the generator, or where the generator was consumed).

For *debugging purposes* this is usually sufficient: the person reading 
the exception can tell the difference between an unpacking error:

# outside the iterator
a, b, c = iterator
ValueError: not enough values to unpack (expected 3, got 1)

and some other error:

# inside the iterator
yield int('aaa')
ValueError: invalid literal for int() with base 10: 'aaa'

There may be rare cases where it is difficult to tell. Perhaps the 
traceback is missing, or you are debugging a byte-code only library, or 
obfuscated code, say. But these are rare cases, and we don't have to 
solve those problems in the language.

Where this is not sufficient is for error recovery:

try:
a, b, c = iterator
except ValueError:
recover()

However, this is also true for every exception that Python might raise. 
There is no absolutely foolproof solution, but it is usually good enough 
to e.g.:

- include as little as possible inside the `try` block;

- carefully audit the contents of the `try` block to ensure it cannot
  raise the exception you want to catch;

- wrap the iterator in something that will convert ValueError to 
  another exception.


At one point some years ago I attempted to write an "Exception Guard" 
object that caught an exception and re-raised it as another exception, 
so you could do this:

guard = ExceptionGuard(catch=ValueError, throw=RuntimeError)
try:
    a, b, c = guard(iterator)
except ValueError:
    print('unpacking error')
except RuntimeError:
    print('ValueError in iterator caught and converted')

but it collapsed under the weight of over-engineering and trying to 
support versions of Python back to 2.4, and I abandoned it. Perhaps you 
will have better luck, and when you have it as a mature, working object, 
you can propose it for the standard library.
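
For the unpacking case specifically, a much more modest version can be 
written today — a minimal sketch (names hypothetical), nothing like the 
over-engineered original:

    def guard(iterator, catch=ValueError, throw=RuntimeError):
        # Re-raise `catch` coming from *inside* the iterator as `throw`,
        # so it can't be confused with unpacking's own ValueError.
        it = iter(iterator)
        while True:
            try:
                value = next(it)
            except StopIteration:
                return
            except catch as exc:
                raise throw('raised inside the iterator') from exc
            yield value

With that, `a, b, c = guard(iterator)` raises ValueError only for genuine 
arity errors; anything the iterator itself raises arrives as RuntimeError.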



-- 
Steven


[Python-ideas] Re: Alternative to iterator unpacking that wraps iterator-produced ValueError

2020-04-09 Thread jdveiga
Soni L. wrote:
> On 2020-04-09 8:48 a.m., Rhodri James wrote:
> > [Muchly snipped]
> > On 09/04/2020 12:27, Soni L. wrote:
> > > To put it simple, unpacking raises ValueError:
> > > But if the iterator raises ValueError, there's no way to tell it
> > > apart from the unpacking:
> > I don't see how this is any different from any other case when you get
> > the same exception for different errors.  If for some reason you
> > really care, subclass ValueError to make a finer-grained exception.
> > > And the workaround for this is a bit ugly. We already convert e.g.
> > > StopIteration into RuntimeError in many cases, why can't we do so
> > > here too?
> > Surely the correct "workaround" is not to do the thing that raises the
> > exception?
> Technically, I consider it a bug that bugs can shadow API-level
> exceptions. Any API defining API-level exceptions must carefully control
> the exceptions it raises. In this case I'd like to use the ValueError
> from the iterator unpacking API, on my API. But since the iterator
> unpacking API doesn't control its exceptions, this just does a great job
> of masking bugs in the code instead.
> Equally, anything doing computation in __get__ should not propagate
> LookupError except where explicitly intended. And that's not how those
> things are often implemented, unfortunately. There's a reason ppl
> complain so much about certain frameworks "eating all the exceptions".
> They use exceptions as part of their API but let user code raise those
> API-level exceptions, which, because they're part of the API, get
> handled somewhere.

Strictly speaking, there isn't any unpacking error in your example. Your example 
raises its own ValueError before any unpacking error can be raised.

Indeed

```
x, y = foo()
```

also raises your own `ValueError`, and there is no unpacking error involved.

On the other hand, an alternative design that does not raise its own exception 
does raise the proper unpacking exception. For instance:

```
def foo():
    yield True
    return

x = foo()
x, = foo()
x, y = foo()

```

outputs:

```
Traceback (most recent call last):
  File "...", line 7, in 
x, y = foo()
ValueError: not enough values to unpack (expected 2, got 1)
```

So, IMHO, you are mixing two different things here. Am I wrong? Are you talking 
about something different? Thank you.


[Python-ideas] Re: Alternative to iterator unpacking that wraps iterator-produced ValueError

2020-04-09 Thread Soni L.



On 2020-04-09 8:48 a.m., Rhodri James wrote:

> [Muchly snipped]
> On 09/04/2020 12:27, Soni L. wrote:
>> To put it simple, unpacking raises ValueError:
>>
>> But if the iterator raises ValueError, there's no way to tell it
>> apart from the unpacking:
>
> I don't see how this is any different from any other case when you get
> the same exception for different errors.  If for some reason you
> really care, subclass ValueError to make a finer-grained exception.
>
>> And the workaround for this is a bit ugly. We already convert e.g.
>> StopIteration into RuntimeError in many cases, why can't we do so
>> here too?
>
> Surely the correct "workaround" is not to do the thing that raises the
> exception?


Technically, I consider it a bug that bugs can shadow API-level 
exceptions. Any API defining API-level exceptions must carefully control 
the exceptions it raises. In this case I'd like to use the ValueError 
from the iterator unpacking API, on my API. But since the iterator 
unpacking API doesn't control its exceptions, this just does a great job 
of masking bugs in the code instead.


Equally, anything doing computation in __get__ should not propagate 
LookupError except where explicitly intended. And that's not how those 
things are often implemented, unfortunately. There's a reason ppl 
complain so much about certain frameworks "eating all the exceptions". 
They use exceptions as part of their API but let user code raise those 
API-level exceptions, which, because they're part of the API, get 
handled somewhere.



[Python-ideas] Re: Alternative to iterator unpacking that wraps iterator-produced ValueError

2020-04-09 Thread Rhodri James

[Muchly snipped]
On 09/04/2020 12:27, Soni L. wrote:
> To put it simple, unpacking raises ValueError:
>
> But if the iterator raises ValueError, there's no way to tell it apart
> from the unpacking:


I don't see how this is any different from any other case when you get 
the same exception for different errors.  If for some reason you really 
care, subclass ValueError to make a finer-grained exception.


> And the workaround for this is a bit ugly. We already convert e.g.
> StopIteration into RuntimeError in many cases, why can't we do so here too?


Surely the correct "workaround" is not to do the thing that raises the 
exception?


--
Rhodri James *-* Kynesim Ltd


[Python-ideas] Alternative to iterator unpacking that wraps iterator-produced ValueError

2020-04-09 Thread Soni L.

To put it simply, unpacking raises ValueError:

>>> x, = ()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected 1, got 0)
>>> x, = (1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 1)

But if the iterator raises ValueError, there's no way to tell it apart 
from the unpacking:


>>> def foo():
...     yield None
...     raise ValueError
...
>>> foo()
<generator object foo at 0x...>
>>> x = foo()
>>> x, = foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in foo
ValueError

And the workaround for this is a bit ugly. We already convert e.g. 
StopIteration into RuntimeError in many cases, why can't we do so here too?


For backwards compatibility, this should probably be an itertools 
utility tho.
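
For reference, the workaround available today, with no new utility — drain 
the iterator first, so the two ValueErrors can't overlap:

    values = list(foo())   # a ValueError raised inside foo() surfaces here
    x, = values            # a ValueError here is purely about arity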



[Python-ideas] Re: Optimize out unused variables

2020-04-09 Thread Serhiy Storchaka

09.04.20 03:46, Henk-Jaap Wagenaar writes:

> I like the idea of formalizing "unused variables".
>
> How about having a syntax for it? Allowing a "." instead of an
> identifier to signify this behaviour [reusing Serhiy's examples]:
>
> head, ., rest = path.partition('/')
> first, second, *. = line.split()
> for . in range(10): ...
> [k for k, . in pairs]
>
> Potentially for unpacking one could use nothing, e.g.
>
> first, second, * = line.split()


Changing the syntax has a much higher bar than implementing a 
transparent optimization. We would need very good evidence that this 
will help a significant part of programs.



[Python-ideas] Re: Optimize out unused variables

2020-04-09 Thread Serhiy Storchaka

09.04.20 00:54, Andrew Barnert via Python-ideas writes:

> Could you go so far as to remove the variable from the locals if its only
> assignment(s) are optimized out? I’m not sure how much benefit that would
> provide. (Surely it would sometimes mean an f_locals array fits into one
> cache line instead of two, or that a whole code object stays around in L2
> cache for the next time it’s called instead of being ejected, but often
> enough to make a difference? Maybe not…)


I did not do this specially; it was a natural consequence of optimizing 
out assignments. So yes, they are removed from f_locals.



> Like Guido’s idea, this seems like something that should definitely be safe
> enough as an opt-in decorator or whatever, and implementable that way. And
> that also seems like the best way to answer those doubts. Write or find
> some code that you think should benefit, add the decorator, benchmark, and
> see.
>
> Also, with an opt-in mechanism, you could relax the restrictions. For
> example, by default @killunused only kills unused assignments that meet
> your restrictions, but if I know it’s safe I can @killunused("_", "dummy")
> and it kills unused assignments to those names even if it wouldn’t normally
> do so. Then you could see if there are any cases where it’s useful, but
> only with the restrictions relaxed, and maybe use that as a guide to
> whether it’s worth finding a way to aim for looser restrictions in the
> first place or not.


It would be much more complex. It is just 30 lines of simple code added 
in symtable.c and compile.c, but with the decorator you would need to 
write complex code in Python which parses bytecode, analyzes dataflow, 
patches bytecode, removes names from locals and recalculates all locals' 
indices. I am not interested in doing this.



[Python-ideas] Re: Live variable analysis -> earlier release?

2020-04-09 Thread Wes Turner
Could something just heuristically add del statements with an AST
transformation that we could review with source control before committing?

When the gc pause occurs is something I don't fully understand. For example:

FWIW, this segfaults CPython in 2 lines:

import ctypes
ctypes.cast(1, ctypes.py_object)

Interestingly, this (tends to?) work; even when there are, ah, scope
closures?:

import ctypes, gc
x = 22
_id = id(x)
del x
gc.collect()
y = ctypes.cast(_id, ctypes.py_object).value
assert y == 22

Adding explicit calls to del with e.g. redbaron or similar would likely be
less surprising.
https://redbaron.readthedocs.io/en/latest/

Or something like a @jit decorator that pros would be aware of (who could
just add del statements to free as necessary)

On Wed, Apr 8, 2020, 10:29 PM Caleb Donovick 
wrote:

> This would almost certainly break my code.   As a DSL developer I do a lot
> of (exec | eval | introspection | ... ) shenanigans, that would make doing
> liveness analysis undecidable.
>
> On Wed, Apr 8, 2020 at 10:51 AM Andrew Barnert via Python-ideas <
> python-ideas@python.org> wrote:
>
>> On Apr 8, 2020, at 09:57, Guido van Rossum  wrote:
>> >
>> > 
>> > Look at the following code.
>> >
>> > def foo(a, b):
>> >     x = a + b
>> >     if not x:
>> >         return None
>> >     sleep(1)  # A calculation that does not use x
>> >     return a*b
>> >
>> > This code DECREFs x when the frame is exited (at the return statement).
>> But (assuming) we can clearly see that x is not needed during the sleep
>> (representing a big calculation), we could insert a "del x" statement
>> before the sleep.
>> >
>> > I think our compiler is smart enough to find out *some* cases where it
>> could safely insert such del instructions.
>>
>> It depends on how much you’re willing to break and still call it “safely”.
>>
>> import inspect
>>
>> def sleep(n):
>>     global store
>>     store = inspect.currentframe().f_back.f_locals['x']
>>
>> This is a ridiculous example, but it shows that you can’t have all of
>> Python’s dynamic functionality and still know when locals are dead. And
>> there are less ridiculous examples with different code. If foo actually
>> calls eval, exec, locals, vars, etc., or if it has a nested function that
>> nonlocals x, etc., how can we spot that at compile time and keep x alive?
>>
>> Maybe that’s ok. After all, that code doesn’t work in a Python
>> implementation that doesn’t have stack frame support. Some of the other
>> possibilities might be more portable, but I don’t know without digging in
>> further.
>>
>> Or maybe you can add new restrictions to what locals and eval and so on
>> guarantee that will make it ok? Some code will break, but only rare
>> “expert” code, where the authors will know how to work around it.
>>
>> Or, if not, it’s definitely fine as an opt-in optimization: decorate the
>> function with @deadlocals and that decorator scans the bytecode and finds
>> any locals that are dead assuming there’s no use of locals/eval/cells/etc.
>> and, because you told it to assume that by opting in to the decorator, it
>> can insert a DELETE_FAST safely.
>>
>> People already do similar things today—e.g., I’ve (only once in live
>> code, but that’s still more than zero) used a @fastconst decorator that
>> turns globals into consts on functions that I know are safe and are
>> bottlenecks, and this would be no different. And of course you can add a
>> recursive class decorator, or an import hook (or maybe even a command line
>> flag or something) that enables it everywhere (maybe with a @nodeadlocals
>> decorator for people who want it _almost_ everywhere but need to opt out
>> one or two functions).
>>
>> Did Victor Stinner explore this as one of the optimizations for FAT
>> Python/PEP 511/etc.? Maybe not, since it’s not something you can insert a
>> guard, speculatively do, and then undo if the guard triggers, which was I
>> think his key idea.
>>