[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!
On Sat, 11 Dec 2021 at 16:30, Christopher Barker wrote:
>
> Sorry, accidentally off-list.

I did exactly the same a few days ago.

On Thu, 9 Dec 2021 at 07:49, Chris Angelico wrote:
>
> BTW, did you intend for this to be entirely off-list?

Nope, and apologies to all, but at least it's given me the opportunity to
correct a typo & do some slight reformatting. Here it is:

On Thu, 9 Dec 2021 at 07:25, Adam Johnson wrote:
>
> On Fri, 3 Dec 2021 at 22:38, Chris Angelico wrote:
> >
> > On Sat, Dec 4, 2021 at 6:33 AM Adam Johnson wrote:
> > > The first unwelcome surprise was:
> > >
> > >     >>> def func(a=>[]):
> > >     ...     return a
> > >     ...
> > >
> > >     >>> import inspect
> > >     >>> inspect.signature(func).parameters['a'].default
> > >     Ellipsis
> > >
> > > Here the current behaviour of returning `Ellipsis` is very unfortunate,
> > > and I think could lead to a lot of head scratching — people wondering
> > > why they are getting ellipses in their code, seemingly from nowhere.
> > > Sure, it can be noted in the official documentation that `Ellipsis` is
> > > used as the indicator of late bound defaults, but third-party resources
> > > which aim to explain the uses of `Ellipsis` would (with their current
> > > content) leave someone clueless.
> >
> > Yes. Unfortunately, since there is fundamentally no object that can be
> > valid here, this kind of thing WILL happen. So when you see Ellipsis
> > in a default, you have to do one more check to figure out whether it's
> > a late-bound default, or an actual early-bound Ellipsis...
>
> My discomfort is that any code that doesn't do that extra check will
> continue to function, but incorrectly operate under the assumption that
> `Ellipsis` was the actual intended value. I wouldn't go so far as to say
> this is outright backwards-incompatible, but perhaps
> 'backwards-misleading'.
>
> When attempting to inspect a late-bound default I'd much rather an
> exception were raised than a return value that, as far as any existing
> machinery is concerned, could be valid. (More on this thought later...)
>
> > > Additionally I don't think it's too unreasonable an expectation that,
> > > for a function with no required parameters, either of the following (or
> > > something similar) should be equivalent to calling `func()`:
> > >
> > >     pos_only_args, kwds = [], {}
> > >     for name, param in inspect.signature(func).parameters.items():
> > >         if param.default is param.empty:
> > >             continue
> > >         elif param.kind is param.POSITIONAL_ONLY:
> > >             pos_only_args.append(param.default)
> > >         else:
> > >             kwds[name] = param.default
> > >
> > >     func(*pos_only_args, **kwds)
> > >
> > >     # or, by direct access to the dunders
> > >
> > >     func(*func.__defaults__, **func.__kwdefaults__)
> >
> > The problem is that then, parameters with late-bound defaults would
> > look like mandatory parameters. The solution is another check after
> > seeing if the default is empty:
> >
> >     if param.default is ... and param.extra: continue
>
> In some situations, though, late-bound defaults do essentially become
> mandatory. Picking an example you posted yourself (when demonstrating
> that not using the function's own context could be surprising):
>
>     def g(x=>(a:=1), y=>a): ...
>
> In your implementation `a` is local to `g` and gets bound to `1` when no
> argument is supplied for `x` and the default is evaluated, however
> **supplying an argument for `x` leaves `a` unbound**. Therefore, unless
> `y` is also supplied, the function immediately throws an
> `UnboundLocalError` when attempting to get the default for `y`.
>
> With the current implementation it is possible to avoid this issue, but
> it's fairly ugly — especially if calculating the value for `a` has side
> effects:
>
>     def g(
>         x => (a:=next(it)),
>         y => locals()['a'] if 'a' in locals() else next(it),
>     ): ...
>
>     # or, if `a` is needed within the body of `g`
>
>     def g(
>         x => (a:=next(it)),
>         y => locals()['a'] if 'a' in locals() else (a:=next(it)),
>     ): ...
>
> > > The presence of the above if statement's first branch (which was
> > > technically unnecessary, since we established for the purpose of this
> > > example all arguments of `func` are optional / have non-empty defaults)
> > > hints that perhaps `inspect.Parameter` should grow another sentinel
> > > attribute similar to `Parameter.empty` —
[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!
On Wed, 1 Dec 2021 at 06:19, Chris Angelico wrote:
>
> I've just updated PEP 671 https://www.python.org/dev/peps/pep-0671/
> with some additional information about the reference implementation,
> and some clarifications elsewhere.
>
> *PEP 671: Syntax for late-bound function argument defaults*
>
> Questions, for you all:
>
> 1) If this feature existed in Python 3.11 exactly as described, would
> you use it?

Most likely.

---

> 2) Independently: Is the syntactic distinction between "=" and "=>" a
> cognitive burden?

Presented in isolation, like that, no — however I do feel that the
distinguishing character is at the wrong side of the equals. Default
values may start with a prefix operator (`+`, `-`, `~`), thus it could be
possible to incorrectly interpret the `>` as some sort of quote/defer
prefix operator (or just difficult to spot) when additional whitespace is
lacking. In other words, I think these look a little too similar:

    def func(arg=-default): ...
    def func(arg=>default): ...

Additionally `=>` would conflict with the proposed alternate lambda
syntax, both cognitively and syntactically — assuming the `=>` form would
be valid everywhere that a lambda expression currently is (without
requiring additional enclosing parentheses). The following is legal
syntax:

    def func(arg: lambda x: x = 42): ...

    # for clarification:
    # func.__defaults__ == (42,)
    # func.__annotations__ == {'arg': <function <lambda> at 0x...>}

It doesn't look promising to place the marker for late-bound defaults on
the other side of the equals either — causing a syntactical conflict with
the comparison operators or the assignment operator (or a cognitive
conflict with augmented assignment) depending on the choice of character.

This leads me to favour the `@param=default` style, and although I agree
with Abe Dillon that this somewhat mimics the `*args` and `**kwds`
syntax, I don't see this parallel as a negative. We already have some
variation of late binding in parameter lists, where?
`*args` and `**kwds`: both are rebound upon each call of the function.

Another odd (though not useful) similarity with the current proposal is
that function objects also lack attributes containing some kind of
special representation of the `*args` and `**kwds` parameter defaults
(i.e. the empty tuple & dict). One **cannot** successfully perform
something akin to the following:

    def func(**kwds):
        return kwds

    func.__kwds_dict_default__ = {'keyword_one': 1}
    assert func() == {'keyword_one': 1}

Just as with the proposal, one cannot modify the method(s) of calculation
used to obtain the late-bound default(s) once a function is defined.

I don't know that I have a strong preference for the specific marker
character, but I quite like how `@param=default` could be understood as
"at each (call) `param` defaults to `default`".

---

> 3) If "yes" to question 1, would you use it for any/all of (a) mutable
> defaults, (b) referencing things that might have changed, (c)
> referencing other arguments, (d) something else?

Likely all three, maybe all four. A combination of (b) & (c) could be
particularly useful with methods, since one of those other arguments is
`self`, for example:

    class IO:
        def truncate(self, position=>self.tell()):
            ...

---

> 5) Do you know how to compile CPython from source, and would you be
> willing to try this out? Please? :)

I have. The first unwelcome surprise was:

    >>> def func(a=>[]):
    ...     return a
    ...

    >>> import inspect
    >>> inspect.signature(func).parameters['a'].default
    Ellipsis

Here the current behaviour of returning `Ellipsis` is very unfortunate,
and I think could lead to a lot of head scratching — people wondering
why they are getting ellipses in their code, seemingly from nowhere.
Sure, it can be noted in the official documentation that `Ellipsis` is
used as the indicator of late-bound defaults, but third-party resources
which aim to explain the uses of `Ellipsis` would (with their current
content) leave someone clueless.
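As an aside on the `IO.truncate` example above: the closest spelling
available today is the None-sentinel idiom, which late-bound defaults
would replace. A minimal runnable sketch (the `_pos` state and its values
are purely illustrative, not part of any real IO class):

```python
class IO:
    def __init__(self):
        self._pos = 42  # stand-in state, purely for illustration

    def tell(self):
        return self._pos

    # today's equivalent of `def truncate(self, position=>self.tell())`:
    # late binding emulated manually with a None sentinel
    def truncate(self, position=None):
        if position is None:
            position = self.tell()
        return position

assert IO().truncate() == 42   # default computed at call time
assert IO().truncate(7) == 7   # explicit argument wins
```

The drawback, as discussed elsewhere in this thread, is that the
signature no longer documents the real default, and None becomes
unusable as a legitimate argument value.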
Additionally I don't think it's too unreasonable an expectation that, for
a function with no required parameters, either of the following (or
something similar) should be equivalent to calling `func()`:

    pos_only_args, kwds = [], {}
    for name, param in inspect.signature(func).parameters.items():
        if param.default is param.empty:
            continue
        elif param.kind is param.POSITIONAL_ONLY:
            pos_only_args.append(param.default)
        else:
            kwds[name] = param.default

    func(*pos_only_args, **kwds)

    # or, by direct access to the dunders

    func(*func.__defaults__, **func.__kwdefaults__)

The presence of the above if statement's first branch (which was
technically unnecessary, since we established for the purpose of this
example all arguments of `func` are optional / have non-empty defaults)
hints that perhaps `inspect.Parameter` should grow another sentinel
attribute similar to `Parameter.empty` —
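For ordinary early-bound defaults on today's Python, the equivalence
described above does hold. A quick self-contained check (the stand-in
`func` here is hypothetical, chosen to exercise both the positional-only
and keyword branches of the walk):

```python
import inspect

# a stand-in function with only optional parameters: one positional-only,
# two positional-or-keyword
def func(a=1, /, b=2, c=3):
    return (a, b, c)

pos_only_args, kwds = [], {}
for name, param in inspect.signature(func).parameters.items():
    if param.default is param.empty:
        continue
    elif param.kind is param.POSITIONAL_ONLY:
        pos_only_args.append(param.default)
    else:
        kwds[name] = param.default

assert func(*pos_only_args, **kwds) == func()
# (no keyword-only parameters here, so __kwdefaults__ is None and
# only __defaults__ is needed for the dunder form)
assert func(*func.__defaults__) == func()
```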
[Python-ideas] Re: PEP 472 - new dunder attribute, to influence item access
On Sat, 29 Aug 2020 at 15:12, Ricky Teachey wrote:
>
> But if we want to have the same behavior without supporting function
> style syntax, we will have to write code like this:
>
>     MISSING = object()
>
>     def __getitem__(self, key, x=MISSING, y=MISSING):
>         if x is MISSING and y is MISSING:
>             x, y = key
>         if x is MISSING:
>             x, = key
>         if y is MISSING:
>             y, = key
>
> And probably that code I just wrote has bugs. And it gets more
> complicated if we want to have more arguments than just two. And
> even more complicated if we want some of the arguments to be
> positional only or any other combination of things.
>
> This is code you would not have to write if we could do this instead
> with a new dunder or subscript processor:
>
>     def __getx__(self, x, y): ...
>
> And these all just work:
>
>     q[1, 2]
>     q[1, y=2]
>     q[y=2, x=1]
>
> 1 is assigned to x and 2 is assigned to y in all of these for both
> versions, but the second version requires no parsing of parameters.
> Python does it for us. That's a lot of easily available flexibility.

I was partway through writing a message outlining this very point. It's
all well and good stating that named indices are an intended use case (as
in the PEP), but in cases where named indices would be useful they
presumably aren't currently being used (abuses of slice notation
notwithstanding). As such, if a variant of PEP 472 were implemented, code
that would benefit from named indices must still find a way to support
'anonymous' indices for backwards compatibility.

I believe the code to implement that ought to be much more obvious (both
to the author and to readers) if 'anonymous' and named indices were
simply handled as positional and keyword arguments, rather than manually
parsing and validating the allowable combinations of indices.

That being said, it should be noted that even if there are no new
dunders, you don't necessarily need to parse the subscripts manually.
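For background on why any parsing is needed at all: subscript syntax
funnels everything through a single positional item, which is easy to
verify on current Python (the `Probe` class is purely illustrative):

```python
class Probe:
    # record exactly what subscript syntax passes to __getitem__
    def __getitem__(self, item):
        return item

q = Probe()
assert q[1] == 1               # a bare item arrives as itself
assert q[1, 2] == (1, 2)       # multiple indices arrive as one tuple
assert q[1:2] == slice(1, 2)   # slice notation arrives as a slice object
```

Because a single index and a tuple index are indistinguishable from a
one-argument and multi-argument call, any scheme layering
function-style arguments on top must unpack that item by hand.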
You could still take advantage of Python's function-argument parsing via
something resembling the following:

```python
def _parse_subscripts(self, /, x, y):
    return x, y

def __getitem__(self, item=MISSING, /, **kwargs):
    if item is MISSING:
        args = ()
    elif isinstance(item, tuple):
        args = item
    else:
        args = (item,)
    x, y = self._parse_subscripts(*args, **kwargs)
    return do_stuff(x, y)
```

However that's still not exactly obvious, as the 'true' signature has
been moved away from `__getitem__` to an arbitrarily named (non-dunder)
method.

A major difference between the above and the case where we had one or
more new dunders is that of introspection: new dunders would mean that
there would be a 'blessed' method whose signature exactly defines the
accepted subscripts. That would be useful in terms of documentation and
could be used to implement parameter completion within subscripts.

---

Theoretically the signature could instead be defined in terms of
`typing.overload`s, something like the following (assuming x & y are
integers):

```python
@overload
def __getitem__(self, item: tuple[int, int], /): ...
@overload
def __getitem__(self, item: int, /, *, y: int): ...
@overload
def __getitem__(self, /, *, x: int, y: int): ...
def __getitem__(self, item=MISSING, /, **kwargs):
    # actual implementation, as above
    ...
```

However that is incredibly verbose compared to the signature of any new
dunder, and would only grow worse with a greater number of keyword
subscripts.

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/RKRMT4XYTXH57VBLMY772CRC2DXFTDVX/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: zip(x, y, z, strict=True)
On Mon, 4 May 2020 at 12:41, Steven D'Aprano wrote:
>
> On Sun, May 03, 2020 at 11:13:58PM -0400, David Mertz wrote:
>
> > It seems to me that a Python implementation of zip_equals() shouldn't do
> > the check in a loop like a version shows (I guess from more-itertools).
> > More obvious is the following, and this has only a small constant speed
> > penalty.
> >
> >     def zip_equal(*its):
> >         yield from zip(*its)
> >         if any(_sentinel == next(o, _sentinel) for o in its):
> >             raise ZipLengthError
>
> Alas, that doesn't work, even with your correction of `any` to
> `not all`.
>
>     py> list(zip_equal("abc", "xy"))
>     [('a', 'x'), ('b', 'y')]
>
> The problem here is that zip consumes the "c" from the first iterator,
> exhausting it, so your check at the end finds that all the iterators are
> exhausted.

This got me thinking: what if we were to wrap (or, as it turned out,
`chain` on to the end of) each of the individual iterables instead,
thereby performing the relevant check before `zip` fully exhausted them,
something like the following:

```python
import itertools

def zip_equal(*iterables):
    return zip(*_checked_simultaneous_exhaustion(*iterables))

def _checked_simultaneous_exhaustion(*iterables):
    if len(iterables) <= 1:
        return iterables

    def check_others():
        # first iterable exhausted, check the others are too
        sentinel = object()
        if any(next(i, sentinel) is not sentinel for i in iterators):
            raise ValueError('unequal length iterables')
        if False:
            yield

    def throw():
        # one of iterables[1:] exhausted first, therefore it must be shorter
        raise ValueError('unequal length iterables')
        if False:
            yield

    iterators = tuple(map(iter, iterables[1:]))
    return (
        itertools.chain(iterables[0], check_others()),
        *(itertools.chain(it, throw()) for it in iterators),
    )
```

This has the advantage that, if desired, the
`_checked_simultaneous_exhaustion` function could also be reused to
implement a previously mentioned length-checking version of `map`.
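The core trick, chaining a generator that raises instead of yielding
onto the end of an iterable, can be seen in isolation with this minimal
sketch (the names here are illustrative, not part of the proposal):

```python
import itertools

def raise_when_reached():
    # a generator whose first next() raises instead of yielding anything
    raise ValueError('unequal length iterables')
    yield  # unreachable, but makes this a generator function

# chain the raising generator onto the end of a short iterable: the real
# items pass through untouched, then reaching the end triggers the error
short = itertools.chain([1, 2], raise_when_reached())

result, raised = [], False
try:
    for value in short:
        result.append(value)
except ValueError:
    raised = True
# result == [1, 2] and raised is True
```

Because the error fires at the moment of exhaustion rather than after
it, `zip` never gets the chance to silently discard a trailing item.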
Going further, if `checked_simultaneous_exhaustion` were to become a
public function (with a better name), it could be used to impose
same-length checking on the iterable arguments of any function, provided
those iterables are consumed in a compatible way. Additionally, it would
allow one to be specific about which iterables were checked, rather than
being forced into the option of checking either all or none by
`zip_equal` / `zip` respectively, thus allowing us to have our cake and
eat it in terms of mixing infinite and checked-length finite iterables,
e.g.

```python
zip(i_am_infinite, *checked_simultaneous_exhaustion(*but_we_are_finite))

# or, if they aren't contiguous
checked1, checked2 = checked_simultaneous_exhaustion(it1, it2)
zip(checked1, infinite, checked2)
```

However, as I previously alluded to, this relies upon the assumption that
each of the given iterators is advanced in turn, in the order they were
provided to `checked_simultaneous_exhaustion`. So -- while this function
would be suitable for use with `zip`, `map`, and any others which do the
same -- if we wanted a more general `checked_equal_length` function that
extended to cases in which the iterable-consuming function may consume
the iterables in some haphazard order, we'd need something more involved,
such as keeping a running tally of the current length of each iterable.
Even then, we could only guarantee raising on unequal lengths if the said
function advanced all the given iterators by at least the length of the
shortest.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/4D3FIYTOJSROIS3S3SYU752RTOJV27IZ/
[Python-ideas] Re: Access to member name/names via context during Enum instance initialization
Peeking at the source for Enum, the __new__ method is explicitly called,
thereby deferring the call to __init__ until it, too, is explicitly
called, but only after the member's _name_ attribute has been assigned.
So, adapting your example:

    class ChoiceEnum(Enum):
        def __init__(self, value):
            self.label = self.name.capitalize()

    class Color(ChoiceEnum):
        SMALL = 'S'
        MEDIUM = 'M'
        LARGE = 'L'

    print(Color.SMALL.label)  # Prints 'Small'

Side note: I was surprised to find the docs lacking a clear example to
explain this behaviour. The DuplicateFreeEnum recipe does show that
members' name/_name_ attributes can be accessed via __init__, but that
info could have more attention drawn to it rather than being somewhat
buried in an example where it isn't necessarily immediately obvious.

This is especially true considering the specific "When to use __new__()
vs. __init__()" section, which reads:

> __new__() must be used whenever you want to customize the actual value
> of the Enum member. Any other modifications may go in either __new__()
> or __init__(), with __init__() being preferred.

The second sentence, as per this thread, is demonstrably untrue.
Modifications that are reliant upon the name of the member *must* go in
__init__ (in cases where _generate_next_value_ & auto() are
inappropriate).

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/AEEXXQGEMBX2BQWXWO5TZEAYJBPYXJAZ/
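That ordering is easy to verify on current Python: by the time a
member's __init__ runs its _name_ is set, but inside __new__ it is not.
A minimal sketch (the `NameProbe` enum and its extra attribute are
purely illustrative):

```python
from enum import Enum

class NameProbe(Enum):
    def __new__(cls, value):
        # the documented pattern for a custom Enum __new__
        obj = object.__new__(cls)
        obj._value_ = value
        # at this point the member's _name_ has not been assigned yet
        obj.name_seen_in_new = getattr(obj, '_name_', None)
        return obj

    A = 1

assert NameProbe.A.name_seen_in_new is None  # unavailable in __new__
assert NameProbe.A.name == 'A'               # assigned before __init__
```

This is the mechanism behind the claim above: name-derived attributes
simply cannot be computed in __new__.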
Re: [Python-ideas] __len__() for map()
On Sat, 1 Dec 2018 at 10:44, Greg Ewing wrote:
> It's not -- the StopIteration isn't terminating the map,
> it's terminating the iteration being performed by tuple().

That was a poor choice of wording on my part; it's rather that map
doesn't do anything special in that regard. To whatever is iterating over
the map, any unexpected StopIteration from the function isn't
distinguishable from the expected one from the iterable(s) being
exhausted.

This issue was dealt with in generators by PEP 479 (by replacing the
StopIteration with a RuntimeError). Whilst map, filter, and others may
not be generators, I would expect them to be consistent with that PEP
when handling the same issue.
Re: [Python-ideas] __len__() for map()
On Sat, 1 Dec 2018 at 01:17, Steven D'Aprano wrote:
>
> In principle, we could make this work, by turning the output of map()
> into a view like dict.keys() etc, or a lazy sequence type like range(),
> wrapping the underlying sequence. That might be worth exploring. I can't
> think of any obvious problems with a view-like interface, but that
> doesn't mean there aren't any. I've spent like 30 seconds thinking about
> it, so the fact that I can't see any problems with it means little.

Something to consider that, so far, seems to have been overlooked is that
the total length of the resulting map isn't only dependent upon the
iterable, but also the mapped function. It is a pretty pathological case,
but there is no guarantee that the function is a pure function, free from
side effects. If the iterable is mutable and the mapped function has a
reference to it (either from scoping or the iterable (in)directly
containing a reference to itself), there is nothing to prevent the
function modifying the iterable as the map is evaluated. For example, map
can be used as a filter:

    it = iter((0, 16, 1, 4, 8, 29, 2, 13, 42))

    def filter_odd(x):
        while x % 2 == 0:
            x = next(it)
        return x

    tuple(map(filter_odd, it))
    # (1, 29, 13)

The above also illustrates the second way the total length of the map
could differ from the length of the input iterable, even if it is
immutable. If StopIteration is raised within the mapped function, map
finishes early, so can be used in a manner similar to takewhile:

    def takewhile_lessthan4(x):
        if x < 4:
            return x
        raise StopIteration

    tuple(map(takewhile_lessthan4, range(9)))
    # (0, 1, 2, 3)

I really don't understand why this is true; under 'normal' usage, map
shouldn't have any reason to silently swallow a StopIteration raised
_within_ the mapped function.

As I opened with, I wouldn't consider using map in either of these ways
to be a good idea, and anyone doing so should probably be persuaded to
find better alternatives, but it might be something to bear in mind.
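For contrast, the same early-exit attempt written as a generator is not
silently swallowed on modern Python: PEP 479 (Python 3.7+) converts a
StopIteration escaping a generator body into a RuntimeError. A quick
sketch (the generator here is an illustrative rewrite, not from the
original message):

```python
def takewhile_lessthan4_gen(iterable):
    # the same early-exit attempt, as a generator
    for x in iterable:
        if x >= 4:
            raise StopIteration  # becomes RuntimeError under PEP 479
        yield x

caught = None
try:
    tuple(takewhile_lessthan4_gen(range(9)))
except RuntimeError as exc:
    caught = exc
assert caught is not None
```

So the inconsistency noted in this thread is real: the generator loudly
fails where map quietly truncates.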
AJ