[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2021-12-11 Thread Adam Johnson
On Sat, 11 Dec 2021 at 16:30, Christopher Barker  wrote:
>
> Sorry, accidentally off-list.

I did exactly the same a few days ago.

On Thu, 9 Dec 2021 at 07:49, Chris Angelico  wrote:
>
> BTW, did you intend for this to be entirely off-list?

Nope, and apologies to all, but at least it's given me the opportunity
to correct a typo & do some slight reformatting. Here it is:

On Thu, 9 Dec 2021 at 07:25, Adam Johnson  wrote:
>
> On Fri, 3 Dec 2021 at 22:38, Chris Angelico  wrote:
> >
> > On Sat, Dec 4, 2021 at 6:33 AM Adam Johnson  wrote:
> > > The first unwelcome surprise was:
> > >
> > > >>> def func(a=>[]):
> > > ...     return a
> > > ...
> > >
> > > >>> import inspect
> > > >>> inspect.signature(func).parameters['a'].default
> > > Ellipsis
> > >
> > > Here the current behaviour of returning `Ellipsis` is very unfortunate,
> > > and I think could lead to a lot of head scratching — people wondering
> > > why they are getting ellipses in their code, seemingly from nowhere.
> > > Sure, it can be noted in the official documentation that `Ellipsis` is
> > > used as the indicator of late bound defaults, but third-party resources
> > > which aim to explain the uses of `Ellipsis` would (with their current
> > > content) leave someone clueless.
> >
> > Yes. Unfortunately, since there is fundamentally no object that can be
> > valid here, this kind of thing WILL happen. So when you see Ellipsis
> > in a default, you have to do one more check to figure out whether it's
> > a late-bound default, or an actual early-bound Ellipsis...
>
> My discomfort is that any code that doesn't do that extra check will
> continue to function, but incorrectly operate under the assumption that
> `Ellipsis` was the actual intended value. I wouldn't go so far as to say
> this is outright backwards-incompatible, but perhaps
> 'backwards-misleading'.
>
> When attempting to inspect a late-bound default I'd much rather an
> exception were raised than a value returned that, as far as any existing
> machinery is concerned, could be valid. (More on this thought later...)
>
> > > Additionally I don't think it's too unreasonable an expectation that,
> > > for a function with no required parameters, either of the following (or
> > > something similar) should be equivalent to calling `func()`:
> > >
> > > pos_only_args, kwds = [], {}
> > > for name, param in inspect.signature(func).parameters.items():
> > >     if param.default is param.empty:
> > >         continue
> > >     elif param.kind is param.POSITIONAL_ONLY:
> > >         pos_only_args.append(param.default)
> > >     else:
> > >         kwds[name] = param.default
> > >
> > > func(*pos_only_args, **kwds)
> > >
> > > # or, by direct access to the dunders
> > >
> > > func(*func.__defaults__, **func.__kwdefaults__)
> >
> > The problem is that then, parameters with late-bound defaults would
> > look like mandatory parameters. The solution is another check after
> > seeing if the default is empty:
> >
> > if param.default is ... and param.extra: continue
>
> In some situations, though, late-bound defaults do essentially become
> mandatory. Picking an example you posted yourself (when demonstrating
> that not using the function's own context could be surprising):
>
> def g(x=>(a:=1), y=>a): ...
>
> In your implementation `a` is local to `g` and gets bound to `1` when no
> argument is supplied for `x` and the default is evaluated; however,
> **supplying an argument for `x` leaves `a` unbound**. Therefore, unless
> `y` is also supplied, the function immediately raises an
> `UnboundLocalError` when attempting to get the default for `y`.
>
> With the current implementation it is possible to avoid this issue, but
> it's fairly ugly — especially if calculating the value for `a` has side
> effects:
>
> def g(
>     x => (a:=next(it)),
>     y => locals()['a'] if 'a' in locals() else next(it),
> ): ...
>
> # or, if `a` is needed within the body of `g`
>
> def g(
>     x => (a:=next(it)),
>     y => locals()['a'] if 'a' in locals() else (a:=next(it)),
> ): ...
>
> > > The presence of the above if statement's first branch (which was
> > > technically unnecessary, since we established for the purpose of this
> > > example all arguments of `func` are optional / have non-empty defaults)
> > > hints that perhaps `inspect.Parameter` should grow another sentinel
> > > attribute similar to `Parameter.empty` —

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2021-12-03 Thread Adam Johnson
On Wed, 1 Dec 2021 at 06:19, Chris Angelico  wrote:
>
> I've just updated PEP 671 https://www.python.org/dev/peps/pep-0671/
> with some additional information about the reference implementation,
> and some clarifications elsewhere.
>
> *PEP 671: Syntax for late-bound function argument defaults*
>
> Questions, for you all:
>
> 1) If this feature existed in Python 3.11 exactly as described, would
> you use it?

Most likely.

---

> 2) Independently: Is the syntactic distinction between "=" and "=>" a
> cognitive burden?

Presented in isolation, like that, no — however I do feel that the
distinguishing character is on the wrong side of the equals.
Default values may start with a prefix operator (`+`, `-`, `~`), so the
`>` could be incorrectly interpreted as some sort of quote/defer prefix
operator (or simply be difficult to spot) when additional whitespace is
lacking. In other words, I think these look a little too similar:

def func(arg=-default): ...
def func(arg=>default): ...

Additionally `=>` would conflict with the proposed alternate lambda
syntax, both cognitively and syntactically — assuming the `=>` form
would be valid everywhere that a lambda expression is currently
(without requiring additional enclosing parentheses). The following is
legal syntax:

def func(arg: lambda x: x = 42): ...

# for clarification:
# func.__defaults__ == (42,)
# func.__annotations__ == {'arg': <function <lambda> at 0x...>}

It doesn't look promising to place the marker for late-bound defaults
on the other side of the equals either — depending on the choice of
character, that causes a syntactic conflict with the comparison
operators or the assignment operator (or a cognitive conflict with
augmented assignment).

This leads me to favour the `@param=default` style, and although I
agree with Abe Dillon that this somewhat mimics the `*args` and
`**kwds` syntax, I don't see this parallel as a negative. We already
have a variation of late binding in parameter lists: `*args` and
`**kwds` are both rebound upon each call of the function.
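
A minimal illustration of that rebinding, quite apart from the PEP: the
containers bound to `*args` and `**kwds` are created afresh on every call.

```python
def f(*args, **kwds):
    return args, kwds

a1, k1 = f()
a2, k2 = f()
assert a1 == a2 == () and k1 == k2 == {}
assert k1 is not k2   # a brand-new (empty) dict is bound on each call
```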

Another odd (though not useful) similarity with the current proposal is
that function objects also lack attributes containing some kind of
special representation of the `*args` and `**kwds` parameter defaults
(i.e. the empty tuple & dict). One **cannot** successfully perform
something akin to the following:

def func(**kwds):
    return kwds

func.__kwds_dict_default__ = {'keyword_one': 1}
assert func() == {'keyword_one': 1}

Just as with the proposal one cannot modify the method(s) of calculation
used to obtain the late bound default(s) once a function is defined.

I don't know that I have a strong preference for the specific marker
character, but I quite like how `@param=default` could be understood
as "at each (call) `param` defaults to `default`".

---

> 3) If "yes" to question 1, would you use it for any/all of (a) mutable
> defaults, (b) referencing things that might have changed, (c)
> referencing other arguments, (d) something else?

Likely the first three, maybe all four. A combination of (b) & (c) could be
particularly useful with methods since one of those other arguments is
`self`, for example:

class IO:
    def truncate(self, position=>self.tell()): ...

---

> 5) Do you know how to compile CPython from source, and would you be
> willing to try this out? Please? :)

I have.

The first unwelcome surprise was:

>>> def func(a=>[]):
...     return a
...

>>> import inspect
>>> inspect.signature(func).parameters['a'].default
Ellipsis

Here the current behaviour of returning `Ellipsis` is very unfortunate,
and I think could lead to a lot of head scratching — people wondering
why they are getting ellipses in their code, seemingly from nowhere.
Sure, it can be noted in the official documentation that `Ellipsis` is
used as the indicator of late bound defaults, but third-party resources
which aim to explain the uses of `Ellipsis` would (with their current
content) leave someone clueless.

Additionally I don't think it's too unreasonable an expectation that,
for a function with no required parameters, either of the following (or
something similar) should be equivalent to calling `func()`:

pos_only_args, kwds = [], {}
for name, param in inspect.signature(func).parameters.items():
    if param.default is param.empty:
        continue
    elif param.kind is param.POSITIONAL_ONLY:
        pos_only_args.append(param.default)
    else:
        kwds[name] = param.default

func(*pos_only_args, **kwds)

# or, by direct access to the dunders

func(*func.__defaults__, **func.__kwdefaults__)

The presence of the above if statement's first branch (which was
technically unnecessary, since we established for the purpose of this
example all arguments of `func` are optional / have non-empty defaults)
hints that perhaps `inspect.Parameter` should grow another sentinel
attribute similar to `Parameter.empty` — 

[Python-ideas] Re: PEP 472 - new dunder attribute, to influence item access

2020-08-29 Thread Adam Johnson
On Sat, 29 Aug 2020 at 15:12, Ricky Teachey  wrote:
>
> But if we want to have the same behavior without supporting function
> style syntax, we will have to write code like this:
>
> MISSING = object()
>
> def __getitem__(self, key, x=MISSING, y=MISSING):
>     if x is MISSING and y is MISSING:
>         x, y = key
>     if x is MISSING:
>         x, = key
>     if y is MISSING:
>         y, = key
>
> And probably that code I just wrote has bugs. And it gets more
> complicated if we want to have more arguments than just two. And
> even more complicated if we want some of the arguments to be
> positional only or any other combination of things.
>
> This is code you would not have to write if we could do this instead
> with a new dunder or subscript processor:
>
> def __getx__(self, x, y): ...
>
> And these all just work:
>
> q[1, 2]
> q[1, y=2]
> q[y=2, x=1]
>
> 1 is assigned to x and 2 is assigned to y in all of these for both
> versions, but the second certainly requires no parsing of parameters.
> Python does it for us. That's a lot of easily available flexibility.

I was partway through writing a message outlining this very point.

It's all well and good stating that named indices are an intended use
case (as in the PEP), but in cases where named indices would be useful
they presumably aren't currently being used (abuses of slice notation
notwithstanding). As such, if a variant of PEP 472 were implemented,
code that would benefit from named indices must still find a way to
support 'anonymous' indices for backwards compatibility.

I believe the code to implement that ought to be much more obvious
(both to the author and to readers) if 'anonymous' and named indices
were simply handled as positional and keyword arguments, rather than
manually parsing and validating the allowable combinations of indices.

That being said, it should be noted that even if there are no new
dunders, you don't necessarily need to parse the subscripts manually.
You could still take advantage of Python's function-argument parsing
via something resembling the following:

```python
# assumes a module-level sentinel, as in the quoted message: MISSING = object()

def _parse_subscripts(self, /, x, y):
    return x, y

def __getitem__(self, item=MISSING, /, **kwargs):
    if item is MISSING:
        args = ()
    elif isinstance(item, tuple):
        args = item
    else:
        args = (item,)

    x, y = self._parse_subscripts(*args, **kwargs)
    return do_stuff(x, y)
```
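
For reference, only the positional spellings can reach `__getitem__` under
today's syntax (keyword subscripts are precisely what the PEP would add),
but the helper-based dispatch can already be exercised; a self-contained
sketch:

```python
class Q:
    MISSING = object()

    def _parse_subscripts(self, /, x, y):
        return x, y

    def __getitem__(self, item=MISSING, /, **kwargs):
        if item is self.MISSING:
            args = ()
        elif isinstance(item, tuple):
            args = item
        else:
            args = (item,)
        return self._parse_subscripts(*args, **kwargs)

q = Q()
assert q[1, 2] == (1, 2)                     # tuple subscript -> x=1, y=2
assert q.__getitem__((1,), y=2) == (1, 2)    # what q[1, y=2] would forward to under the PEP
```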

However that's still not exactly obvious, as the 'true' signature has
been moved away from `__getitem__`, to an arbitrarily named
(non-dunder) method.

A major difference between the above, and the case where we had one or
more new dunders, is that of introspection: new dunders would mean
that there would be a 'blessed' method whose signature exactly defines
the accepted subscripts. That would be useful in terms of
documentation and could be used to implement parameter completion
within subscripts.
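
As a sketch of that introspection benefit, reusing the `__getx__` name from
the quoted message purely hypothetically (no such dunder exists today): the
accepted subscripts could be read straight off the dunder's signature.

```python
import inspect

class Grid:
    def __getx__(self, x, y=0): ...   # hypothetical subscript dunder

print(inspect.signature(Grid.__getx__))   # (self, x, y=0)
```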

---

Theoretically the signature could instead be defined in terms of
`typing.overload`s, something like the following (assuming x & y are
integers):
```python
@overload
def __getitem__(self, item: tuple[int, int], /): ...

@overload
def __getitem__(self, item: int, /, *, y: int): ...

@overload
def __getitem__(self, /, *, x: int, y: int): ...

def __getitem__(self, item=MISSING, /, **kwargs):
    # actual implementation, as above
    ...
```
However that is incredibly verbose compared to the signature of any
new dunder, and would only grow worse with a greater number of keyword
subscripts.


[Python-ideas] Re: zip(x, y, z, strict=True)

2020-05-06 Thread Adam Johnson
On Mon, 4 May 2020 at 12:41, Steven D'Aprano  wrote:
>
> On Sun, May 03, 2020 at 11:13:58PM -0400, David Mertz wrote:
>
> > It seems to me that a Python implementation of zip_equals() shouldn't do
> > the check in a loop like a version shows (I guess from more-itertools).
> > More obvious is the following, and this has only a small constant speed
> > penalty.
> >
> > def zip_equal(*its):
> >     yield from zip(*its)
> >     if any(_sentinel == next(o, _sentinel) for o in its):
> >         raise ZipLengthError
>
> Alas, that doesn't work, even with your correction of `any` to
> `not all`.
>
> py> list(zip_equal("abc", "xy"))
> [('a', 'x'), ('b', 'y')]
>
>
> The problem here is that zip consumes the "c" from the first iterator,
> exhausting it, so your check at the end finds that all the iterators are
> exhausted.

This got me thinking: what if we were to wrap (or, as it turned out,
`chain` on to the end of) each of the individual iterables instead,
thereby performing the relevant check before `zip` fully exhausts
them? Something like the following:

```python
import itertools

def zip_equal(*iterables):
    return zip(*_checked_simultaneous_exhaustion(*iterables))

def _checked_simultaneous_exhaustion(*iterables):
    if len(iterables) <= 1:
        return iterables

    def check_others():
        # first iterable exhausted, check the others are too
        sentinel = object()
        if any(next(i, sentinel) is not sentinel for i in iterators):
            raise ValueError('unequal length iterables')
        if False: yield  # never runs; makes this a generator function

    def throw():
        # one of iterables[1:] exhausted first, therefore it must be shorter
        raise ValueError('unequal length iterables')
        if False: yield  # never runs; makes this a generator function

    iterators = tuple(map(iter, iterables[1:]))
    return (
        itertools.chain(iterables[0], check_others()),
        *(itertools.chain(it, throw()) for it in iterators),
    )
```

This has the advantage that, if desired, the
`_checked_simultaneous_exhaustion` function could also be reused to
implement a previously mentioned length-checking version of `map`.
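
For instance, a checked map falls out almost for free; a minimal sketch
reusing the helper above (`map_equal` is just an illustrative name):

```python
def map_equal(function, *iterables):
    # raises ValueError if the iterables turn out to be of unequal length
    return map(function, *_checked_simultaneous_exhaustion(*iterables))
```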

Going further, if `checked_simultaneous_exhaustion` were to become a
public function (with a better name), it could be used to impose
same-length checking on the iterable arguments of any function,
provided those iterables are consumed in a compatible way.

Additionally, it would allow one to be specific about which iterables
are checked, rather than being forced to check either all of them
(`zip_equal`) or none of them (`zip`). That would let us have our cake
and eat it when mixing infinite and checked-length finite iterables,
e.g.

```python
zip(i_am_infinite, *checked_simultaneous_exhaustion(*but_we_are_finite))
# or, if they aren't contiguous
checked1, checked2 = checked_simultaneous_exhaustion(it1, it2)
zip(checked1, infinite, checked2)
```

However, as I previously alluded to, this relies upon the assumption
that each of the given iterators is advanced in turn, in the order
they were provided to `checked_simultaneous_exhaustion`. So, while
this function would be suitable for use with `zip`, `map`, and any
others which behave the same way, a more general `checked_equal_length`
function, covering iterable-consuming functions that draw from their
iterables in some haphazard order, would need something more involved,
such as keeping a running tally of how many items have been drawn from
each iterable. Even then, it could only guarantee raising on unequal
lengths if the consuming function advanced every given iterator by at
least the length of the shortest.
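
For what it's worth, here is a rough sketch of that tally-based idea (the
names and error message are placeholders, and it carries exactly the caveat
just described):

```python
def checked_equal_length(*iterables):
    # rough sketch: count how many items are drawn from each iterable,
    # and compare notes whenever one of them is exhausted
    counts = [0] * len(iterables)
    finished = {}                        # index -> final length, once exhausted

    def wrap(index, iterable):
        for item in iterable:
            counts[index] += 1
            yield item
        length = counts[index]
        finished[index] = length
        longer_still_going = any(c > length for c in counts)
        finished_at_other_length = any(l != length for l in finished.values())
        if longer_still_going or finished_at_other_length:
            raise ValueError('unequal length iterables')

    return tuple(wrap(i, it) for i, it in enumerate(iterables))
```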


[Python-ideas] Re: Access to member name/names via context during Enum instance initialization

2019-11-30 Thread Adam Johnson
Peeking at the source for Enum, the member's __new__ method is explicitly
called, and the call to __init__ is deferred until it, too, is explicitly
called, but only after the member's _name_ attribute has been assigned.
So, adapting your example:


from enum import Enum

class ChoiceEnum(Enum):
    def __init__(self, value):
        self.label = self.name.capitalize()

class Color(ChoiceEnum):
    SMALL = 'S'
    MEDIUM = 'M'
    LARGE = 'L'

print(Color.SMALL.label)  # Prints 'Small'


Side note:
I was surprised to find the docs lacking a clear example to explain this
behaviour. The DuplicateFreeEnum recipe does show that members'
name/_name_ attributes can be accessed via __init__, but that info
could have more attention drawn to it, rather than being somewhat buried
in an example where it isn't necessarily immediately obvious.

This is especially true considering the specific "When to use __new__()
vs. __init__()" section, which reads:

> __new__() must be used whenever you want to customize the actual value
> of the Enum member. Any other modifications may go in either __new__()
> or __init__(), with __init__() being preferred.

The second sentence, as per this thread, is demonstrably untrue.
Modifications that are reliant upon the name of the member *must* go in
__init__ (in cases where _generate_next_value_ & auto() are
inappropriate).
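
For completeness, a minimal sketch of the case where they *are*
appropriate, i.e. when the member's value itself (rather than an extra
attribute) should be derived from its name (`Size` is just an illustrative
name):

```python
from enum import Enum, auto

class Size(Enum):
    # _generate_next_value_ receives the member's name, so the *value*
    # can be derived from it before the member is even created
    def _generate_next_value_(name, start, count, last_values):
        return name.capitalize()

    SMALL = auto()
    MEDIUM = auto()
    LARGE = auto()

print(Size.SMALL.value)  # Prints 'Small'
```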


Re: [Python-ideas] __len__() for map()

2018-12-01 Thread Adam Johnson
On Sat, 1 Dec 2018 at 10:44, Greg Ewing  wrote:
> It's not -- the StopIteration isn't terminating the map,
> it's terminating the iteration being performed by tuple().

That was a poor choice of wording on my part; rather, it's that map
doesn't do anything special in that regard. To whatever is iterating
over the map, an unexpected StopIteration from the function isn't
distinguishable from the expected one raised when the iterable(s) are
exhausted.

This issue was dealt with in generators by PEP-479 (by replacing the
StopIteration with a RuntimeError). Whilst map, filter, and others may
not be generators, I would expect them to be consistent with that PEP
when handling the same issue.
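
For comparison, a quick sketch of the PEP 479 behaviour in a generator,
where the stray StopIteration surfaces as a RuntimeError instead of
silently ending the iteration (Python 3.7+ semantics):

```python
def takewhile_lessthan4(x):
    if x < 4:
        return x
    raise StopIteration

def gen(iterable):
    for item in iterable:
        yield takewhile_lessthan4(item)

tuple(map(takewhile_lessthan4, range(9)))   # (0, 1, 2, 3) -- silently truncated
tuple(gen(range(9)))                        # RuntimeError: generator raised StopIteration
```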


Re: [Python-ideas] __len__() for map()

2018-12-01 Thread Adam Johnson
On Sat, 1 Dec 2018 at 01:17, Steven D'Aprano  wrote:
>
> In principle, we could make this work, by turning the output of map()
> into a view like dict.keys() etc, or a lazy sequence type like range(),
> wrapping the underlying sequence. That might be worth exploring. I can't
> think of any obvious problems with a view-like interface, but that
> doesn't mean there aren't any. I've spent like 30 seconds thinking about
> it, so the fact that I can't see any problems with it means little.

Something to consider that, so far, seems to have been overlooked is
that the total length of the resulting map isn't only dependent upon the
iterable, but also the mapped function.

It is a pretty pathological case, but there is no guarantee that the
function is a pure function, free from side effects. If the iterable is
mutable and the mapped function has a reference to it (either from
scoping or the iterable (in)directly containing a reference to itself),
there is nothing to prevent the function modifying the iterable as the
map is evaluated. For example, map can be used as a filter:

it = iter((0, 16, 1, 4, 8, 29, 2, 13, 42))

def filter_odd(x):
    while x % 2 == 0:
        x = next(it)
    return x

tuple(map(filter_odd, it))
# (1, 29, 13)

The above also illustrates the second way the total length of the map
could differ from the length of the input iterable, even if it is
immutable. If StopIteration is raised within the mapped function, map
finishes early, so it can be used in a manner similar to takewhile:

def takewhile_lessthan4(x):
    if x < 4:
        return x
    raise StopIteration

tuple(map(takewhile_lessthan4, range(9)))
# (0, 1, 2, 3)

I really don't understand why this is true; under 'normal' usage, map
shouldn't have any reason to silently swallow a StopIteration raised
_within_ the mapped function.


As I opened with, I wouldn't consider using map in either of these ways
to be a good idea, and anyone doing so should probably be persuaded to
find better alternatives, but it might be something to bear in mind.


AJ