[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-31 Thread Erik Demaine

On Mon, 1 Nov 2021, Chris Angelico wrote:


This is incompatible with the existing __get__ method, so it should
get a different name. Also, functions have a __get__ method, so you
definitely don't want to have everything that takes a callback run
into this. Let's say it's __delayed__ instead.


Right, good point.  I'm clearly still learning about descriptors. :-)


I'm having a LOT of trouble seeing this as an improvement.


It's not meant to be an improvement exactly, more of a compatible explanation
of how PEP 671 works -- in the same way that `instance.method` doesn't
"magically" make a bound method, but rather Python checks whether the
looked-up attribute has a `__get__` method, and if so, calls it with
`instance` as an argument, instead of returning the raw attribute directly.
This mechanism makes the whole `instance.method` behavior less magic, more
introspectable, more overridable, etc., e.g. making classmethod and similar
decorators possible.  I'm trying to do the same thing with PEP 671 (though
possibly failing :-)).


At least it's still executing the function in its natural scope; it's 
"just" the locals() dict that gets exposed, as an argument.


Yes, which means you can't access nonlocals or globals, only locals.
So it has a subset of functionality in an awkward way.


My actual intent was to just be able to access the arguments, which are all 
locals to the function.  [Conceptually, I was thinking of the arguments being 
in their own object, and then getting accessed once like attributes, which 
triggered __get__ if defined -- but this view isn't very good, in particular 
because we don't want to redefine what it means to pass functions as 
arguments!]


But the __delayed__ method is already a function, so it has its own locals, 
nonlocals, and globals.  The difference is that those are in the frame of 
__delayed__, which is outside the function with the defaults, and I wanted to 
access that function's arguments -- hence passing in the function's locals().
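
Concretely, with the rename, the LateLength example from my earlier message
would become something like this sketch, where `len` resolves in the method's
own scope (here, builtins) rather than in the scope of the function whose
default it is:

```
class LateLength:
    '''Sketch: __delayed__ (renamed from __get__) gets the locals().'''
    def __init__(self, name):
        self.name = name
    def __delayed__(self, locals):
        # len is looked up in this method's scope, not the caller's
        return len(locals[self.name])
```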



Alternatively, we could forbid this (at least for now): perhaps a __get__
method only gets checked and called on a parameter when that parameter has its
default value (e.g. `end is bisect.__defaults__[1]`).


That part's not a problem; if this has language support, it could be
much more explicit: "if the end parameter was not set".


True.  I was trying to preserve the "skip this argument" property, but it 
might make more sense to call __delayed__ only when the argument is omitted. 
This might make it possible for defaults with __delayed__ methods to actually 
be evaluated in the function's scope, which would make it more compatible with 
the current PEP 671.



AND it becomes impossible to have an object with this
method as an early default - that's the sentinel problem.


That's true.  I guess my point is that these *are* early defaults, but act 
very much like late defaults.  Functions or function calls just treat these 
early defaults specially because they have a __delayed__ method.


I agree it's not perfect, but is there a context where you'd actually want to
have an early default that is one of these objects?  The point of adding a
method to an early default is to make the early default behave like a late
default, so this feels like expected behavior...?



The use of locals() (as an argument to __get__) is rather ugly, and probably
prevents name lookup optimization.


Yes. It also prevents use of anything other than locals. For instance,
you can't have global helper functions, or anything like that; you
could use something like len() from the builtins, but you couldn't use
a function defined in the same module. Passing both globals and locals
would be better, but still imperfect; and it incurs double lookups
every time.


That wasn't my intent.  The __delayed__ method is still a function, and has 
its own locals, nonlocals, and globals.  It can still call len (as my example 
code did) -- it's just the len visible from the __delayed__ function, not the 
len visible from the function with the default parameter.


It's true that this approach would prevent implementing something like this:

```
def foo(a => (b := 5)):
    nonlocal b
```

I'm not sure that that is particularly important: I just wanted the default 
expression to be able to access the arguments and the surrounding scopes.


Sure. Explore anything you like! But I don't think that this is any less 
ugly than either the status quo or PEP 671, both of which involve actual 
real code being parsed by the compiler.


This proposal was meant to help define what the compiler would parse PEP 671
code *into*.


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/

[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-31 Thread Erik Demaine

On Sat, 30 Oct 2021, Erik Demaine wrote:

Functions are already a form of deferred evaluation.  PEP 671 is an 
embellishment to this mechanism for some of the code in the function 
signature to actually get executed within the body scope, *just like the body 
of the function*.


I was thinking about what other forms of deferred evaluation Python has, and 
ran into descriptors [https://docs.python.org/3/howto/descriptor.html]. 
Classes support this mechanism for calling arbitrary code when accessing the 
attribute, instead of when calling the class:


```
class CallMeLater:
    '''Descriptor for calling a specified function with no arguments.'''
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        return self.func()

class Foo:
    early_list = []
    late_list = CallMeLater(lambda: [])

foo1 = Foo()
foo2 = Foo()
foo1.early_list == foo2.early_list == foo1.late_list == foo2.late_list
foo1.early_list is foo2.early_list    # the same []
foo1.late_list is not foo2.late_list  # two different []s
```

Written this way, it feels quite a bit like early and late arguments to me. 
So this got me thinking:


What if parameter defaults supported descriptors?  Specifically, something 
like the following:


If a parameter (passed or defaulted) has a __get__ method, call it with
one argument (beyond self), namely, the function scope's locals().
Parameters are processed this way in order from left to right.

(PEPs 549 and 649 are somewhat related in that they also propose extending 
descriptors.)


This would enable the following hand-rolled late-bound defaults (using two 
early-bound defaults):


```
def foo(early_list = [], late_list = CallMeLater(lambda: [])):
    ...
```

Or we could write a decorator to make this somewhat cleaner:

```
def late_defaults(func):
    '''Convert callable defaults into late-bound defaults'''
    func.__defaults__ = tuple(
        CallMeLater(default) if callable(default) else default
        for default in func.__defaults__
    )
    return func

@late_defaults
def foo(early_list = [], late_list = lambda: []):
    ...
```

It's also possible, but difficult, to write `end := len(a)` defaults:

```
class LateLength:
    '''Descriptor for calling len(specified name)'''
    def __init__(self, name):
        self.name = name
    def __get__(self, locals):
        return len(locals[self.name])
    def __repr__(self):
        # This is bad form for repr, but it makes help(bisect)
        # output the "right" thing: end=len(a)
        return f'len({self.name})'

def bisect(a, start=0, end=LateLength('a')):
    ...
```
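
Since the protocol above needs language support, here is a rough
decorator-based emulation (my own sketch, not the proposed mechanism): it
resolves marked defaults left to right, passing the dict of arguments bound
so far in place of true locals().  `Late` and `resolve_late` are made-up
names:

```
import inspect

class Late:
    '''Marker: an early-bound default that should be computed per call.'''
    def __init__(self, func):
        self.func = func  # takes the dict of arguments bound so far

def resolve_late(func):
    '''Fill in Late defaults, left to right, before calling func.'''
    sig = inspect.signature(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            if name not in bound.arguments and isinstance(param.default, Late):
                bound.arguments[name] = param.default.func(bound.arguments)
        bound.apply_defaults()
        return func(*bound.args, **bound.kwargs)
    return wrapper

@resolve_late
def bisect(a, start=0, end=Late(lambda args: len(args['a']))):
    return start, end

assert bisect([1, 2, 3]) == (0, 3)
```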

One feature/bug of this approach is that someone calling the function could 
pass in a descriptor, and its __get__ method will get called by the function 
(immediately at the start of the call).  Personally I find this dangerous, but 
those excited about general deferreds might like it?  At least it's still 
executing the function in its natural scope; it's "just" the locals() dict 
that gets exposed, as an argument.


Alternatively, we could forbid this (at least for now): perhaps a __get__ 
method only gets checked and called on a parameter when that parameter has its 
default value (e.g. `end is bisect.__defaults__[1]`).  In addition to 
feeling safer (to me), this would enable a lot of optimization:


* Parameters without defaults don't need any __get__ checking.

* Default values could be checked for the presence of a __get__ method at
function definition time (or when setting func.__defaults__), and that flag
could get checked at function call time, with __get__ semantics occurring only
when the flag is set.  (I'm not sure whether this would actually save time,
though.  Maybe if it were a single flag for the whole function -- "any
late-bound arguments here?" -- then, if not set, we keep the old behavior and
performance.)



This proposal could be compatible with PEP 671.  What I find nice about this 
proposal is that it's valid Python syntax today, just an extension of the data 
model.  But I wouldn't necessarily want to write the ugly incantations above,
and would rather use some syntactic sugar on top of them -- and that's where
PEP 671 could come in.  What this proposal might offer is a *meaning* for that
syntactic sugar, which is more general and perhaps more Pythonic (building on 
the existing Python data model).  It provides another way to think about what 
the notation in PEP 671 means, and suggests a (different) mechanism to 
implement it.


Some nice features:

* __defaults__ naturally generalizes here; no need for auxiliary structures 
or different signatures for __defaults__.  A tool looking at __defaults__ 
could either be aware of descriptors in this context or not.  All other 
introspection should be the same.


* It becomes possible to skip a positional argument again: pass in the value 
in __defaults__ and it will behave as if that argument wasn't passed.


* The syntactic sugar could build a __repr__ (or 

[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-30 Thread Erik Demaine

On Sat, 30 Oct 2021, Brendan Barnwell wrote:

	I agree it seems totally absurd to add a type of deferred expression 
but restrict it to only work inside function definitions.


Functions are already a form of deferred evaluation.  PEP 671 is an 
embellishment to this mechanism for some of the code in the function signature 
to actually get executed within the body scope, *just like the body of the 
function*.  This doesn't seem weird to me.


If we have a way to create deferred expressions we should try to make them 
more generally usable.


Does anyone have a proposal for deferred expressions that could match the ease 
of use of PEP 671 in assigning a default argument of, say, `[]`?  The 
proposals I've seen so far in this thread involve checking `isdeferred` and 
then resolving that deferred.  This doesn't seem any easier than the existing
sentinel approach for default arguments, whereas PEP 671 significantly
simplifies this use-case.
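
For comparison, the existing sentinel idiom that PEP 671 would replace:

```
_MISSING = object()  # module-private sentinel

def foo(eggs=_MISSING):
    if eggs is _MISSING:
        eggs = []  # a fresh list on every call
    ...
```

versus the proposed one-liner `def foo(eggs => []): ...`.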


I also don't see how a function could distinguish a deferred default argument 
and a deferred argument passed in from another function.  In my opinion, the 
latter would be really messy/dangerous to work with, because it could 
arbitrarily pollute your scope.  Whereas late-bound default arguments make a
lot of sense: they're written in the function itself (just in the signature 
instead of the body), so we can see by looking at the code what happens.


I've written code in dynamically scoped languages before.  I don't recall 
enjoying it.  But maybe I missed a proposal, or someone has an idea for how to 
fix these issues.


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7ZJPAUJVUXJNI2SPAXK54CL3FGR22SCW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-27 Thread Erik Demaine

On Tue, 26 Oct 2021, Christopher Barker wrote:


It's not actually documented that None indicates "use the default".
Which, it turns out is because it doesn't :-)
In [24]: bisect.bisect([1,3,4,6,8,9], 5, hi=None)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
----> 1 bisect.bisect([1,3,4,6,8,9], 5, hi=None)

TypeError: 'NoneType' object cannot be interpreted as an integer

I guess that's because in C there is a way to define optional arguments other
than using a sentinel? or it's using an undocumented sentinel?

Note: that's Python 3.8 -- I can't imagine anything's changed, but ...


It seems to have changed.  I can reproduce the error in CPython 3.8, but the
same code works in CPython 3.9 and 3.10 (all using the C version of the
module, though there's also a Python version of the module that probably
always supported hi=None).  I think it's the result of this commit:


https://github.com/python/cpython/commit/3a855b26aed02abf87fc1163ad0d564dc3da1ea3#diff-02d3dd896d6d030e5c6c3e0961f9a4760a37b50bb05a2d89e4ab627a8f1a7b9f

On the plus side, this probably means that there aren't many people using the 
hi=None API. :-)  So it might be safe to change to a late-bound default.


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PXBXUWQPX4ZGOVGMPCV2ITNAPG5KEUTW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-26 Thread Erik Demaine

On Tue, 26 Oct 2021, Ricky Teachey wrote:


At bottom I guess I'd describe the problem this way: with most APIs, there is
a way to PASS SOMETHING that says "give me the default". With this proposed
API, we don't have that; the only way to say "give me the default" is to NOT
pass something.

I don't KNOW if that's a problem, it just feels like one.


I agree that it's annoying, but it's actually an existing problem with 
early-bound defaults too.  Consider:


```
def f(eggs = [], spam = {}): ...
```

There isn't an easy way to get the defaults for the arguments, because they're 
not just *any* `[]` or `{}`, they're a specific list and dict.  So if you want 
to specify a value for the second argument but not the first, you'd need to do 
one of the following:


```
f(spam = {'more'})

f(f.__defaults__[0], {'more'})
```

The former would work just as well with PEP 671.

The latter depends on introspection, which we're still working out. 
Unfortunately, even if we can get access to the code that produces the 
default, we won't be able to actually call it, because it needs to be called 
from the function's scope.  For example, consider:


```
def g(eggs := [], spam := {}): ...
```

In this simple case, there are no dependencies, so we could do something like 
this:


```
g(g.__defaults__[0](), {'more'})
```

But in general we won't be able to make this call, because we don't have the 
scope until `g` gets called and its scope created...


So there is a bit of functionality loss with PEP 671, though I'm not sure it's 
that big a deal.



I wonder if it would make sense to offer a "missing argument" object (builtin? 
attribute of inspect.Parameter? attribute of types.FunctionType?) that 
actually simulates the behavior of that argument not being passed.  Let me 
call it `_missing` for now.  This would actually make it far easier to 
accomplish "pass in the second argument but not the first", both with early- 
and late-binding defaults:


```
f(_missing, {'more'})
g(_missing, {'more'})
```

I started thinking about `_missing` when thinking about how to implement 
late-binding defaults.  It's at least one way to do it (then the function 
itself could even do the argument checks), though perhaps there are simpler 
ways that avoid the ref count increments.
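
To make this concrete, here is a rough sketch of how `_missing` could be
emulated today with a wrapper (the real thing would presumably live in the
interpreter; `allow_missing` is a made-up name):

```
import inspect

_missing = object()  # hypothetical "argument not passed" marker

def allow_missing(func):
    '''Drop _missing arguments so func sees them as not passed at all.'''
    sig = inspect.signature(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.arguments = {name: value
                           for name, value in bound.arguments.items()
                           if value is not _missing}
        return func(*bound.args, **bound.kwargs)
    return wrapper

@allow_missing
def f(eggs=[], spam={}):
    return eggs, spam

assert f(_missing, {'more'}) == ([], {'more'})
```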


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3DLKREVEG62RHDHY4KP2R6IX2PPA633F/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Parameter tuple unpacking in the age of positional-only arguments

2021-10-26 Thread Erik Demaine

On Tue, 26 Oct 2021, Eric V. Smith wrote:

You may or may not recall that a big reason for the removal of "tuple 
parameter unpacking" in PEP 3113 was that they couldn't be supported by the 
inspect module. Quoting that PEP: "Python has very powerful introspection 
capabilities. These extend to function signatures. There are no hidden 
details as to what a function's call signature is."


(Aside: I loved tuple parameter unpacking, and used it all the time! I was 
sad to see them go, but I agreed with PEP 3113.)


Having recently heard a friend say "the removal of tuple parameter unpacking 
was one thing that Python 3 got wrong", I read this and PEP 3113 with 
interest.


It seems like another approach would be to treat tuple-unpacking parameters as
positional-only, now that this is a thing, or perhaps require that they be
explicitly marked positional-only via PEP 570 syntax:


```
def move((x, y), /): ...  # could be valid?
def move((x, y)): ...     # could remain invalid?
```
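
For reference, here's the removed Python 2 feature next to the usual Python 3
workaround (manually unpacking a single positional argument):

```
# Python 2 (removed by PEP 3113):
#   def move((x, y)): ...

# Python 3 workaround:
def move(xy, /):
    x, y = xy
    ...
```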

Is it worth revisiting parameter tuple-unpacking in the age of positional-only 
arguments?  Or is this still a no-go from the perspective of introspection, 
because it violates "There are no hidden details as to what a function's call 
signature is."?  (This may be a very short-lived thread.)


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4DEUPSGXRJMB4TWGVLEZUEOCZUX3TNMS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-26 Thread Erik Demaine

On Tue, 26 Oct 2021, Steven D'Aprano wrote:


   def func(x=x, y=>x)  # or func(x=x, @y=x)


This makes me think of a "real" use-case for assigning all early-bound 
defaults before late-bound defaults: consider using closure hacks (my main use 
of early-bound defaults) together with late-bound defaults, as in


```
for i in range(n):
    def func(arg := expensive(i), i = i):
        ...
```

I think it's pretty common to put closure hacks at the end, so they don't get 
in the way of the caller.  (The intent is that the caller never specifies 
those arguments.)  But then it'd be nice to be able to use those variables in 
the late-bound defaults.


I can't say this is beautiful code, but it is an application and would 
probably be convenient.


On Tue, 26 Oct 2021, Eric V. Smith wrote:


Among my objections to this proposal is introspection: how would that work?
The PEP mentions that the text of the expression would be available for
introspection, but that doesn't seem very useful.


I think what would make sense is for code objects to be visible, in the same 
way as `func.__code__`.  But it's definitely worth fleshing out whether:


1. Late-bound defaults are in `func.__defaults__` and `func.__kwdefaults__` --
where code objects are treated as a special kind of default value.  This seems
problematic because we can't distinguish between a late-bound default and an
early-bound default that is a code object.


or

2. There are new attributes like `func.__late_defaults__` and
`func.__late_kwdefaults__`.  The issue here is that it's not clear in what
order to mix `func.__defaults__` and `func.__late_defaults__` (each a tuple).


Perhaps most natural is to add a new introspection object, say LateDefault,
that can appear as a default value (but can't be used as an early-bound
default?), and has a __code__ attribute.
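
Roughly, something with this hypothetical shape:

```
class LateDefault:
    '''Hypothetical introspection object for one late-bound default.'''
    __slots__ = ('__code__',)
    def __init__(self, code):
        self.__code__ = code
    def __repr__(self):
        return f'<late default {self.__code__.co_name}>'
```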


---

By the way, another thing missing from the PEP: presumably lambda expressions 
can also have late-bound defaults?



On Tue, 26 Oct 2021, Marc-Andre Lemburg wrote:


Now, it may not be obvious, but the key advantage of such
deferred objects is that you can pass them around, i.e. the
"defer os.listdir(DEFAULT_DIR)" could also be passed in via
another function.


Are deferred code pieces dynamically scoped, i.e., are they evaluated in
whatever scope they end up getting evaluated in?  That would certainly be
interesting, but also kind of dangerous (about as dangerous as eval), and I
imagine fairly prone to error if they get passed around a lot.  If they're
*not* dynamically scoped, then I think they're equivalent to lambda, and then
they don't solve the default parameter problem, because they'll be evaluated
in the function's enclosing scope instead of the function's scope.


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UPC3AX7ESRJ57IJS4DWEV4MS3N4SIISO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Unpacking in tuple/list/set/dict comprehensions

2021-10-25 Thread Erik Demaine

On Sat, 16 Oct 2021, Erik Demaine wrote:

Assuming the support remains relatively unanimous for [*...], {*...}, and 
{**...} (thanks for all the quick replies!), I'll put together a PEP.


As promised, I put together a pre-PEP (together with my friend and coteacher
Adam Hartz, not currently subscribed, but I'll keep him apprised):


https://github.com/edemaine/peps/blob/unpacking-comprehensions/pep-.rst

For this to become an actual PEP, it needs a sponsor.  If a core developer 
would be willing to be the sponsor for this, please let me know.  (This is my 
first PEP, so if I'm going about this the wrong way, also let me know.)


Meanwhile, I'd welcome any comments!  In writing things up, I became convinced 
that generators should be supported, but arguments should not be supported; 
see the document for details why.


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/L6NZLEWOXM2KTGOIX7AHP5L76TLNKDPW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-25 Thread Erik Demaine
g `spam` from being defined in the function's scope, it seems more 
reasonable for your example to work, just like the following should:


```
spam = 5
def f(x := spam):
    print(x, spam)  # 5 5
f()
```


Here's another example where it matters whether the default expressions are 
computed within their own scope:


```
def f(x := (y := 5)):
    print(x)  # 5
    print(y)  # 5???
f()
```

I feel like we don't want to allow accessing `y` in the body of `f` here, 
because whether `y` is bound depends on whether `x` was passed.  (If `x` is 
passed, `y` won't get assigned.)  This would suggest evaluating default 
expressions in their own scope would be beneficial.  Intuitively, the parens 
are indicating a separate scope, in the same way that `(x for x in it)` 
creates its own scope and thus doesn't leak `x`.  On the other hand,
`((y := x) for x in it)` does seem to leak `y`, so I'm not really sure what
would be best / most consistent here.


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EWYHQLZOXLYH5DCZJIW3KQSSO3BV37TD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-25 Thread Erik Demaine

On Mon, 25 Oct 2021, Chris Angelico wrote:


On Mon, Oct 25, 2021 at 6:13 PM Steven D'Aprano  wrote:


The rules for applying parameter defaults are well-defined. I would have
to look it up to be sure...


And that right there is all the evidence I need. If you, an
experienced Python programmer, can be unsure, then there's a strong
indication that novice programmers will have far more trouble. Why
permit bad code at the price of hard-to-explain complexity?


I'm not sure how this helps; the rules are already a bit complicated. 
Steven's proposed rules are a natural way to extend the existing rules; I 
don't see the new rules as (much) more complicated.



Offer me a real use-case where this would matter. So far, we had
better use-cases for arbitrary assignment expression targets than for
back-to-front argument default references, and those were excluded.


I can think of a few examples, though they are a bit artificial:

```
def search_listdir(path = None, files := os.listdir(path),
                   start = 0, end = len(files)):
    '''specify path or files'''

# variation of the LocaleTextCalendar from stdlib (in a message of Steven's)
class Calendar:
    default_firstweekday = 0
    def __init__(self, firstweekday := Calendar.default_firstweekday,
                 locale := find_default_locale(),
                 firstweekdayname := locale.lookup_day_name(firstweekday)):
        ...

Calendar.default_firstweekday = 1
```

But I think the main advantage of the left-to-right semantics is simplicity 
and predictability.  I don't think the following functions should evaluate 
the default values in different orders.


```
def f(a := side_effect1(), b := side_effect2()): ...
def g(a := side_effect1(), b := side_effect2() + a): ...
def h(a := side_effect1() + b, b := side_effect2()): ...
```

I expect left-to-right semantics of the side effects (so function h will 
probably raise an error), just like I get from the corresponding tuple 
expressions:


```
(a := side_effect1(), b := side_effect2())
(a := side_effect1(), b := side_effect2() + a)
(a := side_effect1() + b, b := side_effect2())
```

As Jonathan Fine mentioned, if you defined the order to be a linearization of
the partial order on arguments, (a) this would be complicated and (b) it would
be ambiguous.  I think, if you're going to forbid `def f(a := b, b := a)` at
the compiler level, you would need to forbid late-bound default expressions
from referring to late-bound arguments defined to their right (at least).  But
I don't see a reason to forbid this.  It's rare that order would matter, and
if it did, a quick experiment or learning "left to right" is really easy.


The tuple expression equivalence leads me to think that `:=` is decent 
notation.  As a result, I would expect:


```
def f(a := expr1, b := expr2, c := expr3): pass
```

to behave the same as:

```
_no_a = object()
_no_b = object()
_no_c = object()
def f(a = _no_a, b = _no_b, c = _no_c):
    (a := expr1 if a is _no_a else a,
     b := expr2 if b is _no_b else b,
     c := expr3 if c is _no_c else c)
```

Given that `=` assignments within a function's parameter spec already only
mean "assign when another value isn't specified", this is pretty similar.


On Mon, 25 Oct 2021, Chris Angelico wrote:


On Sun, 24 Oct 2021, Erik Demaine wrote:

> I think the semantics are easy to specify: the argument defaults get 
> evaluated for unspecified ARGUMENT(s), in left to right order as specified 
> in the def. Those may trigger exceptions as usual.


Ah, but is it ALL argument defaults, or only those that are
late-evaluated? Either way, it's going to be inconsistent with itself
and harder to explain. That's what led me to change my mind.


I admit I missed this subtlety, though again I don't think it would often make 
a difference.  But working out subtleties is what PEPs and discussion are for. 
:-)


I'd be inclined to assign the early-bound argument defaults before the
late-bound arguments, because their values are already known (they're stored
right on the function object), so they can't cause side effects, and it could
offer slight incremental benefits, like being able to write the following
(again, somewhat artificial):


```
def manipulate(top_list):
    def recurse(start=0, end := len(rec_list), rec_list=top_list): ...
```

But I don't feel strongly either way about either interpretation.

Mixing both types of default arguments breaks the analogy to tuple expressions 
above, alas.  The corresponding tuple expression with `=` is just invalid.


Personally, I'd expect to use late-bound defaults almost all or all the time; 
they behave more how I expect and how I usually need them (I use a fair amount 
of `[]` and `{}` and `set()` as default values).  The only context I'd 
use/want the current default behavior is to hack closures, as in:


```
for thing in things:
    thing.callback = lambda thing=thing: print(thing.name)
```

I believe the general preference for late-bound

[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-24 Thread Erik Demaine

On Sun, 24 Oct 2021, Erik Demaine wrote:

I think the semantics are easy to specify: the argument defaults get 
evaluated for unspecified order, in left to right order as specified in the 
def.  Those may trigger exceptions as usual.


Sorry, that should be:

I think the semantics are easy to specify: the argument defaults get evaluated 
for unspecified ARGUMENT(s), in left to right order as specified in the def. 
Those may trigger exceptions as usual.


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SNAYBJR52DHO3U76RLXZEC7HQFJLKVEX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-24 Thread Erik Demaine

On Mon, 25 Oct 2021, Chris Angelico wrote:


On Mon, Oct 25, 2021 at 3:47 AM Chris Angelico  wrote:


On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine  wrote:


Please forgive me if it's not already been considered. Is the following valid 
syntax, and if so what's the semantics? Here it is:

def puzzle(*, a=>b+1, b=>a+1):
    return a, b


There are two possibilities: either it's a SyntaxError, or it's a
run-time UnboundLocalError if you omit both of them (in which case it
would be perfectly legal and sensible if you specify one of them).

I'm currently inclined towards SyntaxError, since permitting it would
open up some hard-to-track-down bugs, but am open to suggestions about
how it would be of value to permit this.


In fact, on subsequent consideration, I'm inclining more strongly
towards SyntaxError, due to the difficulty of explaining the actual
semantics. Changing the PEP accordingly.


I think the semantics are easy to specify: the argument defaults get evaluated 
for unspecified order, in left to right order as specified in the def.  Those 
may trigger exceptions as usual.


Erik
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/HYXNABI2ACLVCLQH5TNRDX6WWSHNOING/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults

2021-10-24 Thread Erik Demaine

On Sun, 24 Oct 2021, Chris Angelico wrote:


Is anyone interested in coauthoring this with me? Anyone who has
strong interest in seeing this happen - whether you've been around the
Python lists for years, or you're new and interested in getting
involved for the first time, or anywhere in between!


I have a strong interest in seeing this happen, and would be happy to help how 
I can.  Teaching (and using) the behavior of Python argument initializers is 
definitely a thorn in my side. :-)  I'd love to be able to easily initialize
an empty list/set/dict.


For what it's worth, here are my thoughts on some of the syntaxes proposed so 
far:


* I don't like `def f(arg => default)` exactly because it looks like a lambda, 
and so I imagine arg is an argument to that lambda, but the intended meaning 
has nothing to do with that.  I understand lambdas give delegation, but in my 
mind that should look more like `def f(arg = => default)` or `def f(arg = () 
=> default)` -- except these will have a different meaning (arg's default is a 
function, and they would be evaluated in parent scope not the function's 
scope) once `=>` is short-hand for lambda.


* I find `def f(arg := default)` reasonable.  I was actually thinking about 
this very issue before the thread started, and this was the syntax that came 
to mind.  The main plus for this is that it uses an existing operator (so 
fewer to learn) and it is "another kind of assignment".  The main minus is 
that it doesn't really have much to do with the walrus operator; we're not 
using the assigned value inline like `arg := default` would mean outside 
`def`.  Then again, `def f(arg = default)` is quite different from `arg = 
default` outside `def`.


* I find `def f(arg ?= default)` (or `def f(arg ??= default)`) reasonable,
exactly because it is similar to None-aware operators (PEP 0505, which is
currently/recently under discussion on python-dev).  The main complaint about
PEP 0505 in those discussions is that it's very None-specific, which feels
biased.  But the meaning of "omitted value" is extremely clear in a def.  If
both this were added and PEP 0505 were accepted, `def f(arg ?= default)` would
be roughly equivalent to:


```
def f(arg = None):
    arg ??= default
```

except that `def f(arg ?= default)` wouldn't trigger the default in the case
of `f(None)`, whereas the above code would.  I find this an acceptable
difference.  (FWIW, I'm also in favor of 0505.)
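
A minimal illustration of that difference:

```
f(None)  # with arg ?= default: arg stays None (None was explicitly passed)
         # with arg = None plus arg ??= default: arg becomes default
```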


* I also find `def f(@arg = default)` reasonable, though it feels a little 
inconsistent with decorators.  I expect a decorator expression after @, not 
an argument, more like `def f(@later arg = default)`.


* I'm not very familiar with thunks, but they seem a bit too magical for my 
liking.  Evaluating argument defaults only sometimes (when they get read in 
the body) feels a bit unpredictable.


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DBDNNYYOVVZ5MITYXC5Q3SC5U2P3ASUS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Unpacking in tuple/list/set/dict comprehensions

2021-10-16 Thread Erik Demaine

On Sun, 17 Oct 2021, Steven D'Aprano wrote:


On Sat, Oct 16, 2021 at 11:42:49AM -0400, Erik Demaine wrote:


I guess the question is whether to define `(*it for it in its)` to mean
tuple or generator comprehension or nothing at all.


I don't see why that is even a question. We don't have tuple
comprehensions and `(expr for x in items)` is always a generator, never
a tuple. There's no ambiguity there. Why would allowing unpacking turn
it into a tuple?


Agreed.  I got confused by the symmetry.


The only tricky corner case is that generator comprehensions can forgo
the surrounding brackets in the case of a function call:

   func( (expr for x in items) )
   func( expr for x in items )  # we can leave out the brackets

But with the unpacking operator, it is unclear whether the unpacking
star applies to the entire generator or the inner expression:

   func(*expr for x in items)

That could be read as either:

   it = (expr for x in items)
   func(*it)

or this:

   it = (*expr for x in items)
   func(it)

Of course we can disambiguate it with precedence rules, [...]


I'd be inclined to go that way, as the latter seems like the only reasonable 
(to me) parse for that syntax.  Indeed, that's how the current parser 
interprets this:


```
func(*expr for x in items)
 ^
SyntaxError: iterable unpacking cannot be used in comprehension
```

To get the former meaning, which is possible today, you already need 
parentheses, as in



   func(*(expr for x in items))




But it would be quite surprising for this minor issue to lead to the
major inconsistency of prohibiting unpacking inside generator comps when
it is allowed in list, dict and set comps.


Good point.  Now I'm much more inclined to define the generator expression 
`(*expr for x in items)`.  Thanks for your input!



On Sat, 16 Oct 2021, Serhiy Storchaka wrote:


It was considered and rejected in PEP 448. What was changed since? What
new facts or arguments have emerged?


I need to read the original discussion more (e.g. 
https://mail.python.org/pipermail/python-dev/2015-February/138564.html), but 
you can see the summary of why it was removed here: 
https://www.python.org/dev/peps/pep-0448/#variations


In particular, there was "limited support" before (and the generator ambiguity
issue discussed above).  I expect now that we've gotten to enjoy PEP 448 for 5
years, it's more "obvious" that this functionality is missing and useful.  So
far that seems true (all responses have been at least +0), but if anyone
disagrees, please say so.


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DGPZMQXAZG55J4HLACIXMBZFCTEM6FPG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Unpacking in tuple/list/set/dict comprehensions

2021-10-16 Thread Erik Demaine

On Sat, 16 Oct 2021, David Mertz, Ph.D. wrote:


On Sat, Oct 16, 2021, 10:10 AM Erik Demaine wrote:

  (*it1, *it2, *it3)  # tuple with the concatenation of three iterables
  [*it1, *it2, *it3]  # list with the concatenation of three iterables
  {*it1, *it2, *it3}  # set with the union of three iterables
  {**dict1, **dict2, **dict3}  # dict with the combination of three dicts

I'm +0 on the last three of these. 


But the first one is much more suggestive of a generator comprehension. I
would want/expect it to be equivalent to itertools.chain(), not create a
tuple.


I guess you were referring to `(*it for it in its)` (proposed notation) rather 
than `(*it1, *it2, *it3)` (which already exists and builds a tuple).


Very good point!  This is confusing.  I could also read `(*it for it in its)` 
as wanting to build the following generator (or something like it):


```
def generate():
    for it in its:
        yield from it
```
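
For reference, that generator already has a standard-library spelling:

```
import itertools

flat = itertools.chain.from_iterable(its)  # same behavior as generate() above
```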

I guess the question is whether to define `(*it for it in its)` to mean tuple 
or generator comprehension or nothing at all.  Tuples are nice because they 
mirror `(*it1, *it2, *it3)` but bad for the reasons you raise:



Moreover, it is an anti-pattern to create large and indefinite sized tuples,
whereas such large collections as lists, sets, and dicts are common and
useful.


I'd be inclined to not define `(*it for it in its)`, given the ambiguity.

Assuming the support remains relatively unanimous for [*...], {*...}, and 
{**...} (thanks for all the quick replies!), I'll put together a PEP.


On Sat, 16 Oct 2021, Guido van Rossum wrote:


Seems sensible to me. I’d write the equivalency as

for x in y: answer.extend([…x…])


Oh, nice!  That indeed works in all cases.

Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2AZBMZGKL56PERIJRCPTIJ6BRITTWHGM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Unpacking in tuple/list/set/dict comprehensions

2021-10-16 Thread Erik Demaine
Extended unpacking notation (* and **) from PEP 448 gives us great ways to 
concatenate a few iterables or dicts:


```
(*it1, *it2, *it3)  # tuple with the concatenation of three iterables
[*it1, *it2, *it3]  # list with the concatenation of three iterables
{*it1, *it2, *it3}  # set with the union of three iterables
{**dict1, **dict2, **dict3}  # dict with the combination of three dicts
# roughly equivalent to dict1 | dict2 | dict3 thanks to PEP 584
```

I propose (not for the first time) that similarly concatenating an unknown 
number of iterables or dicts should be possible via comprehensions:


```
(*it for it in its)  # tuple with the concatenation of iterables in 'its'
[*it for it in its]  # list with the concatenation of iterables in 'its'
{*it for it in its}  # set with the union of iterables in 'its'
{**d for d in dicts} # dict with the combination of dicts in 'dicts'
```

The above is my attempt to argue that the proposed notation is natural:
`[*it for it in its]` is exactly analogous to `[*its[0], *its[1], ..., 
*its[len(its)-1]]`.


There are other ways to do this, of course:

```
[x for it in its for x in it]
itertools.chain(*its)
sum((it for it in its), [])
functools.reduce(operator.concat, its, [])
```

But none are as concise and (to me, and hopefully others who understand * 
notation) as intuitive.  For example, I recently wanted to write a recursion 
like so, which accumulated a set of results from within a tree structure:


```
def recurse(node):
  # base case omitted
  return {*recurse(child) for child in node.children}
```

In fact, I am teaching a class and just asked a question on a written exam for 
which several students wrote this exact code in their solution (which inspired 
writing this message).  So I do think it's quite intuitive, even to those 
relatively new to Python.


Now, on to previous proposals.  I found this thread from 2016 (!); please let 
me know if there are others.


https://mail.python.org/archives/list/python-ideas@python.org/thread/SBM3LYESPJMI3FMTMP3VQ6JKKRDHYP7A/#DE4PCVNXBQJIGFBYRB2X7JUFZT75KYFR

There are several arguments for and against this feature in that thread.  I'll 
try to summarize:


Arguments for:

* Natural extension to PEP 448 (it's mentioned as a variant within PEP 448)

* Easy to implement: all that's needed in CPython is to *remove* some code 
blocking this.


Arguments against:

* Counterintuitive (to some)

* Hard to teach

* `[...x... for x in y]` is no longer morally equivalent to
`answer = []; for x in y: answer.append(...x...)`
(unless `list1.append(a, b)` were equivalent to `list1.extend([a, b])`)

Above I've tried to counter the first two "against" arguments.
Some counters to the third "against" argument:

1. `[*...x... for x in y]` is equivalent to
`answer = []; for x in y: answer.extend(...x...)`
(about as easy to teach, I'd say; see the demonstration after this list)

2. Maybe `list1.append(a, b)` should be equivalent to `list1.extend([a, b])`?
It is in JavaScript (`Array.push`).  And I don't see why one would expect
it to append a tuple `(a, b)`; that's what `list1.append((a, b))` is for.
I think the main argument against this is to avoid programming errors,
which is fine, but I don't see why it should hold back an advanced feature
involving both unpacking and comprehensions.
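
A quick demonstration of equivalence 1:

```
its = [[1, 2], [3], [4, 5]]
answer = []
for it in its:
    answer.extend(it)
assert answer == [1, 2, 3, 4, 5]
# the proposed [*it for it in its] would build the same list
```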

Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7G732VMDWCRMWM4PKRG6ZMUKH7SUC7SH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Accessing target name at runtime

2021-10-16 Thread Erik Demaine

On Sat, 16 Oct 2021, Steven D'Aprano wrote:


The token should preferably be:

* self-explanatory, not line-noise;

* shorter rather than longer, otherwise it is easier to just
 type the target name as a string: 'x' is easier to type than
 NAME_OF_ASSIGNMENT_TARGET;

* backwards compatible, which means it can't be anything that
 is already a legal name or expression;

* doesn't look like an error or typo.


A possible soft keyword: __lhs__ (short for 'left-hand side'):


REGION = os.getenv(__lhs__)
db_url = config[REGION][__lhs__]


It's not especially short, and it's not backward-compatible,
but at least there's a history of adding double-underscore things.
Perhaps, for backward compatibility, the feature could be disabled in any 
scope (or file?) where __lhs__ is assigned, in which case it's treated like a 
variable as usual.  The magic version only applies when it's used in a 
read-only fashion.  It's kind of like a builtin variable, but its value 
changes on every line (and it's valid only in an assignment line).


One thing I wonder: what happens if you write the following?


foo[1] = __lhs__  # or <<< or whatever


Maybe you get 'foo[1]', or maybe this is invalid syntax, in the same way that 
the following is.



def foo[1]: pass



Classes, functions, decorators and imports already satisfy the "low
hanging fruit" for this functionality. My estimate is that well over 99%
of the use-cases for this fall into just four examples, which are
already satisfied by the interpreter:
[...]
# like func = decorator(func)
# similarly for classes
@decorator
def func(): ...


This did get me wondering about how you could simulate this feature with 
decorators.  Probably obvious, but here's the best version I came up with:


```
import os

def env_var(x):
    return os.getenv(x.__name__)

@env_var
def REGION(): pass
```

It's definitely ugly to avoid repetition...  Using a class, I guess we could 
at least get several such variables at once.
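
Here's a sketch of that class-based variant (my own illustration, using class
annotations to supply the names; `Env` is a made-up helper):

```
import os

class Env:
    '''Subclasses get each annotated name looked up in the environment.'''
    def __init_subclass__(cls):
        for name in cls.__annotations__:
            setattr(cls, name, os.getenv(name))

class Config(Env):
    REGION: str
    DB_URL: str

# Config.REGION == os.getenv('REGION'), etc.
```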



If we didn't already have interpreter support for these four cases, it
would definitely be worth coming up with a solution. But the use-cases
that remain are, I think, quite niche and uncommon.


To me (a mathematician), the existence of this magic in def, class, import, 
etc. is a sign that this is indeed useful functionality.  As a fan of 
first-class language features, it definitely makes me wonder whether it could 
be generalized.


But I'm not sure what the best mechanism is.  (From the description in the 
original post, I gather that variable assignment decorators didn't work out 
well.)  I wonder about some generalized mechanism for automatically setting 
the __name__ of an assigned object (like def and class), but I'm not sure what 
it would look like...


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BHGDRTX3BBYB66NINSTOPROTCIRKZNRU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: dict_items.__getitem__?

2021-10-11 Thread Erik Demaine
There seems to be a growing list of issues with adding `itertools.first(x)` as 
shorthand for `next(iter(x))`:


* If `x` is an iterator, it modifies the iterator, which is counterintuitive 
from the name `first`.


* It'll still be difficult for new users to find/figure out.

In the end, I feel like the main case where I want `first` and `last`
functions is on `dict`s; other objects like `range`, `str`, `list`, `tuple`
all support `[0]` and `[-1]`.


So I wonder whether we should go back to this idea:

On Tue, 5 Oct 2021, Alex Waygood wrote:


[...] Another possibility I've been wondering about was
whether several methods should be added to the dict interface:
 *  dict.first_key = lambda self: next(iter(self))
 *  dict.first_val = lambda self: next(iter(self.values()))
 *  dict.first_item = lambda self: next(iter(self.items()))
 *  dict.last_key = lambda self: next(reversed(self))
 *  dict.last_val = lambda self: next(reversed(self.values()))
 *  dict.last_item = lambda self: next(reversed(self.items()))
But I think I like a lot more the idea of adding general ways of doing these
things to itertools.


At the least, I wonder whether a `dict.lastitem` method that's the 
nondestructive analog of `dict.popitem` would be good to add.  This would 
solve the case of "I want an arbitrary item from this dict, I don't care which 
one, but I don't want to modify the dict so I'd rather not use popitem" which 
I've seen repeated a few times in this thread.


By contrast, I don't think `next(iter(my_dict))` is an intuitive way to solve 
this problem, even for many experts; and I don't think it's as efficient as 
`my_dict.lastitem()` would be, because the current `dict` code maintains a 
pointer to the last item but not to the first item.


[I also admit that I've mostly forgotten the original situation where I wanted 
this functionality.  I believe it was an exhaustive search, where I wanted to 
branch on an arbitrary item of a dict, and nondestructively build new versions 
of that dict for recursive calls (instead of modifying before recursion and 
unmodifying afterward).]



One more idea to throw around: Consider the following "anonymous unpacking" 
syntax.


```
first, * = [1, 2, 3]
*, last = [1, 2, 3]
```

For someone used to unpacking syntax, this seems like a natural extension to 
what we have now, and is far more flexible than just extracting the first 
element.  The distinction from the existing methods (with e.g. `*_`) is that 
it wouldn't waste time extracting elements you don't want.  And it could work 
well with things like `dict` (and `dict_items` etc.).
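
For comparison, what we can write today, which materializes the unwanted rest:

```
first, *rest = [1, 2, 3]  # builds rest == [2, 3] even if unused
*rest, last = [1, 2, 3]   # builds rest == [1, 2]

d = {'a': 1, 'b': 2}
first_key, *_ = d         # works, at the cost of listing all the keys
```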


Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/IQ2EJM5BTDEO4URUHN3XGR6XSXX22HFR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] dict_items.__getitem__?

2021-10-04 Thread Erik Demaine
Have folks thought about allowing indexing dictionary views as in the 
following code, where d is a dict object?


d.keys()[0]
d.keys()[-1]
d.values()[0]
d.values()[-1]
d.items()[0]
d.items()[-1]  # item that would be returned by d.popitem()

I could see value to the last form in particular: you might want to inspect 
the last item of a dictionary before possibly popping it.


I've also often wanted to get an arbitrary item/key from a dictionary, and 
d.items()[0] seems natural for this.  Of course, the universal way to get the 
first item from an iterable x is


item = next(iter(x))

I can't say this is particularly readable, but it is functional and fast.  Or 
sometimes I use this pattern:


for item in x: break

If you wanted the last item of a dictionary d (the one to be returned from 
d.popitem()), you could write this beautiful code:


last = next(iter(reversed(d.items())))


Given the dictionary order guarantee from Python 3.7, adding indexing 
(__getitem__) to dict views seems natural.  The potential catch is that (I 
think) it would require linear time to access an item in the middle, because 
you need to count the dummy elements.  But accessing [i] and [-i] should be 
doable in O(|i|) time.  (I've wondered about the possibility of doing binary 
or interpolation search, but without some stored index signposts, I don't 
think it's possible.)


Python is also full of operations that take linear time to do: list.insert(0, 
x), list.pop(0), list.index(), etc.  But it may be that __getitem__ takes 
constant time on all built-in data structures, and the apparent symmetry but 
very different performance between dict()[i] and list()[i] might be confusing. 
That said, I really just want d[0] and d[-1], which are exactly the cases
where these are fast.


I found some related discussion in 
https://mail.python.org/archives/list/python-ideas@python.org/thread/QVTGZD6USSC34D4IJG76UPKZRXBBB4MM/

but not this exact idea.

Erik
--
Erik Demaine  |  edema...@mit.edu  |  http://erikdemaine.org/
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PPI747IBFYYRAVPUJDY4DKFNTJGASH3K/
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Erik Bray
On Mon, Oct 8, 2018 at 12:20 PM Cameron Simpson  wrote:
>
> On 08Oct2018 10:56, Ram Rachum  wrote:
> >That's incredibly interesting. I've never used mmap before.
> >However, there's a problem.
> >I did a few experiments with mmap now, this is the latest:
> >
> >path = pathlib.Path(r'P:\huge_file')
> >
> >with path.open('r') as file:
> >mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
>
> Just a remark: don't tromp on the "mmap" name. Maybe "mapped"?
>
> >for match in re.finditer(b'.', mmap):
> >pass
> >
> >The file is 338GB in size, and it seems that Python is trying to load it
> >into memory. The process is now taking 4GB RAM and it's growing. I saw the
> >same behavior when searching for a non-existing match.
> >
> >Should I open a Python bug for this?
>
> Probably not. First figure out what is going on. BTW, how much RAM have you
> got?
>
> As you access the mapped file the OS will try to keep it in memory in case you
> need that again. In the absence of competition, most stuff will get paged out
> to accommodate it. That's normal. All the data are "clean" (unmodified) so the
> OS can simply release the older pages instantly if something else needs the
> RAM.
>
> However, another possibility is the the regexp is consuming lots of memory.
>
> The regexp seems simple enough (b'.'), so I doubt it is leaking memory like
> mad; I'm guessing you're just seeing the OS page in as much of the file as it
> can.

Yup. Windows will aggressively fill up your RAM in cases like this
because after all why not?  There's no use to having memory just
sitting around unused.  For read-only, non-anonymous mappings it's not
much problem for the OS to drop pages that haven't been recently
accessed and use them for something else.  So I wouldn't be too
worried about the process chewing up RAM.

I feel like this is veering more into python-list territory for
further discussion though.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread Erik Bray
On Mon, Oct 8, 2018 at 12:23 PM Nathaniel Smith  wrote:
>
> On Mon, Oct 8, 2018 at 2:55 AM, Steven D'Aprano  wrote:
> >
> > On Mon, Oct 08, 2018 at 09:10:40AM +0200, Jimmy Girardet wrote:
> >> Each tool which wants to use pyproject.toml has to add a toml lib  as a
> >> conditional or hard dependency.
> >>
> >> Since toml is now the standard configuration file format,
> >
> > It is? Did I miss the memo? Because I've never even heard of TOML before
> > this very moment.
>
> He's referring to PEPs 518 and 517 [1], which indeed standardize on
> TOML as a file format for Python package build metadata.
>
> I think moving anything into the stdlib would be premature though –
> TOML libraries are under active development, and the general trend in
> the packaging space has been to move things *out* of the stdlib (e.g.
> there's repeated rumblings about moving distutils out), because the
> stdlib release cycle doesn't work well for packaging infrastructure.

If I had the energy to argue it I would also argue against using TOML
in those PEPs.  I personally don't especially care for TOML and what's
"obvious" to Tom is not at all obvious to me.  I'd rather just stick
with YAML or perhaps something even simpler than either one.


Re: [Python-ideas] Asynchronous exception handling around with/try statement borders

2018-09-24 Thread Erik Bray
On Fri, Sep 21, 2018 at 12:58 AM Chris Angelico  wrote:
>
> On Fri, Sep 21, 2018 at 8:52 AM Kyle Lahnakoski  
> wrote:
> > Since the java.lang.Thread.stop() "debacle", it has been obvious that
> > stopping code to run other code has been dangerous.  KeyboardInterrupt
> > (any interrupt really) is dangerous. Now, we can probably code a
> > solution, but how about we remove the danger:
> >
> > I suggest we remove interrupts from Python, and make them act more like
> > java.lang.Thread.interrupt(); setting a thread local bit to indicate an
> > interrupt has occurred.  Then we can write explicit code to check for
> > that bit, and raise an exception in a safe place if we wish.  This can
> > be done with Python code, or convenient places in Python's C source
> > itself.  I imagine it would be easier to whitelist where interrupts can
> > raise exceptions, rather than blacklisting where they should not.
>
> The time machine strikes again!
>
> https://docs.python.org/3/c-api/exceptions.html#signal-handling

Although my original post did not explicitly mention
PyErr_CheckSignals() and friends, it had already taken that into
account and it is not a silver bullet, at least w.r.t. the exact issue
I raised, which had to do with the behavior of context managers versus
the

setup()
try:
    do_thing()
finally:
    cleanup()

pattern, and the question of how signals are handled between Python
interpreter opcodes.  There is a still-open bug on the issue tracker
discussing the exact issue in greater detail:
https://bugs.python.org/issue29988


Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Erik Bray
On Tue, Apr 10, 2018 at 9:50 PM, Eric V. Smith  wrote:
>
>>> 3. Annotations. They are used mainly by third party tools that
>>> statically analyze sources. They are rarely used at runtime.
>>
>> Even less used than docstrings probably.
>
> typing.NamedTuple and dataclasses use annotations at runtime.

Astropy uses annotations at runtime for optional unit checking on
arguments that take dimensionful quantities:
http://docs.astropy.org/en/stable/api/astropy.units.quantity_input.html#astropy.units.quantity_input
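
For a concrete sense of what that looks like, a sketch based on the
linked documentation (the function itself is invented, and details vary
by astropy version):

import math
from astropy import units as u

@u.quantity_input
def pendulum_period(length: u.m, g: u.m / u.s**2):
    # Passing, say, length=2*u.s raises a UnitsError at the call
    # boundary instead of producing nonsense further down.
    return 2 * math.pi * (length / g) ** 0.5

print(pendulum_period(1 * u.m, 9.81 * u.m / u.s**2))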


Re: [Python-ideas] PEP proposal: unifying function/method classes

2018-03-28 Thread Erik Bray
On Fri, Mar 23, 2018 at 11:25 AM, Antoine Pitrou  wrote:
> On Fri, 23 Mar 2018 07:25:33 +0100
> Jeroen Demeyer  wrote:
>
>> On 2018-03-23 00:36, Antoine Pitrou wrote:
>> > It does make sense, since the proposal sounds ambitious (and perhaps
>> > impossible without breaking compatibility).
>>
>> Well, *some* breakage of backwards compatibility will be unavoidable.
>>
>>
>> My plan (just a plan for now!) is to preserve backwards compatibility in
>> the following ways:
>>
>> * Existing Python attributes of functions/methods should continue to
>> exist and behave the same
>>
>> * The inspect module should give the same results as now (by changing
>> the implementation of some of the functions in inspect to match the new
>> classes)
>>
>> * Everything from the documented Python/C API.
>>
>>
>> This means that I might break compatibility in the following ways:
>>
>> * Changing the classes of functions/methods (this is the whole point of
>> this PEP). So anything involving isinstance() checks might break.
>>
>> * The undocumented parts of the Python/C API, in particular the C structure.
>
> One breaking change would be to add __get__ to C functions.  This means
> e.g. the following:
>
> class MyClass:
>     my_open = open
>
> would make my_open a MyClass method, therefore you would need to spell
> it:
>
> class MyClass:
>     my_open = staticmethod(open)
>
> ... if you wanted MyClass().my_open('some file') to continue to work.
>
> Of course that might be considered a minor annoyance.

I don't really see your point in this example.  For one, why would
anyone do this?  Is this based on a real example?  For another, that's
how any function works.  If you put some arbitrary function in a class body,
and it's not able to accept an instance of that class as its first
argument, then it will always be broken unless you make it a
staticmethod.  I don't see how there should be any difference there if
the function were implemented in Python or in C.

Thanks,
E


[Python-ideas] importlib: making FileFinder easier to extend

2018-02-07 Thread Erik Bray
Hello,

Brief problem statement: Let's say I have a custom file type (say,
with extension .foo) and these .foo files are included in a package
(along with other Python modules with standard extensions like .py and
.so), and I want to make these .foo files importable like any other
module.

On its face, importlib.machinery.FileFinder makes this easy.  I make a
loader for my custom file type (say, FooSourceLoader), and I can use
the FileFinder.path_hook helper like:

sys.path_hooks.insert(0, FileFinder.path_hook((FooSourceLoader, ['.foo'])))
sys.path_importer_cache.clear()

Great--now I can import my .foo modules like any other Python module.
However, standard Python modules now cannot be imported.  The way
the PathFinder sys.meta_path hook works, sys.path_hooks entries are
first-come-first-served, and furthermore FileFinder.path_hook is very
promiscuous--it will take over module loading for *any* directory on
sys.path, regardless what the file extensions are in that directory.
So although this mechanism is provided by the stdlib, it can't really
be used for this purpose without breaking imports of normal modules
(and maybe it's not intended for that purpose, but the documentation
is unclear).

There are a number of different ways one could get around this.  One
might be to pass FileFinder.path_hook loaders/extension pairs for all
the basic file types known by the Python interpreter.  Unfortunately
there's no great way to get that information.  *I* know that I want to
support .py, .pyc, .so etc. files, and I know which loaders to use for
them.  But that's really information that should belong to the Python
interpreter, and not something that should be reverse-engineered.  In
fact, there is such a mapping provided by
importlib.machinery._get_supported_file_loaders(), but this is not a
publicly documented function.

One could probably think of other workarounds.  For example you could
implement a custom sys.meta_path hook.  But I think it shouldn't be
necessary to go to higher levels of abstraction in order to do
this--the default sys.path handler should be able to handle this use
case.

In order to support adding support for new file types to
sys.path_hooks, I ended up implementing the following hack:

#
import os
import sys

from importlib.abc import PathEntryFinder


@PathEntryFinder.register
class MetaFileFinder:
    """
    A 'middleware', if you will, between the PathFinder sys.meta_path hook,
    and sys.path_hooks hooks--particularly FileFinder.

    The hook returned by FileFinder.path_hook is rather 'promiscuous' in that
    it will handle *any* directory.  So if one wants to insert another
    FileFinder.path_hook into sys.path_hooks, that will totally take over
    importing for any directory, and previous path hooks will be ignored.

    This class provides its own sys.path_hooks hook as follows: it should be
    inserted early on sys.path_hooks so that it can supersede anything else.
    Its find_spec method then calls each hook on sys.path_hooks after itself
    and, for each hook that can handle the given sys.path entry, it calls the
    hook to create a finder, and calls that finder's find_spec.  So each
    sys.path_hooks entry is tried until a spec is found or all finders are
    exhausted.
    """

    def __init__(self, path):
        if not os.path.isdir(path):
            raise ImportError('only directories are supported', path=path)

        self.path = path
        self._finder_cache = {}

    def __repr__(self):
        return '{}({!r})'.format(self.__class__.__name__, self.path)

    def find_spec(self, fullname, target=None):
        if not sys.path_hooks:
            return None

        for hook in sys.path_hooks:
            if hook is self.__class__:
                continue

            finder = None
            try:
                if hook in self._finder_cache:
                    finder = self._finder_cache[hook]
                    if finder is None:
                        # We've tried this finder before and got an ImportError
                        continue
            except TypeError:
                # The hook is unhashable
                pass

            if finder is None:
                try:
                    finder = hook(self.path)
                except ImportError:
                    pass

            try:
                self._finder_cache[hook] = finder
            except TypeError:
                # The hook is unhashable for some reason so we don't bother
                # caching it
                pass

            if finder is not None:
                spec = finder.find_spec(fullname, target)
                if spec is not None:
                    return spec

        # Module spec not found through any of the finders
        return None

    def invalidate_caches(self):
        for finder in self._finder_cache.values():
            # Hooks that raised ImportError are cached as None; skip them.
            if finder is not None:
                finder.invalidate_caches()
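
Installing the hack then looks something like this (FooSourceLoader
being the hypothetical loader from the problem statement above):

import sys
from importlib.machinery import FileFinder

# The class itself serves as the sys.path_hooks entry: calling
# MetaFileFinder(path) constructs the finder for a given directory.
sys.path_hooks.insert(0, MetaFileFinder)
# Further FileFinder hooks can now coexist with the default machinery:
sys.path_hooks.append(FileFinder.path_hook((FooSourceLoader, ['.foo'])))
sys.path_importer_cache.clear()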


Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-29 Thread Erik Bray
On Thu, Dec 28, 2017 at 8:42 PM, Serhiy Storchaka <storch...@gmail.com> wrote:
> 28.12.17 12:10, Erik Bray wrote:
>>
>> There's no index() alternative to int().
>
>
> operator.index()

Okay, and it's broken.  That doesn't change my other point that some
functions that could previously take non-int arguments can no
longer--if we agree on that at least then I can set about making a bug
report and fixing it.


Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-28 Thread Erik Bray
On Fri, Dec 8, 2017 at 7:20 PM, Ethan Furman <et...@stoneleaf.us> wrote:
> On 12/08/2017 04:33 AM, Erik Bray wrote:
>
>> More importantly not as many objects that coerce to int actually
>> implement __index__.  They probably *should* but there seems to be
>> some confusion about how that's to be used.
>
>
> __int__ is for coercion (float, fraction, etc)
>
> __index__ is for true integers
>
> Note that if __index__ is defined, __int__ should also be defined, and
> return the same value.
>
> https://docs.python.org/3/reference/datamodel.html#object.__index__

This doesn't appear to be enforced, though I think maybe it should be.

I'll also note that because of the changes I pointed out in my
original post, it's now necessary for me to explicitly cast as int()
objects that previously "just worked" when passed as arguments in some
functions in itertools, collections, and other modules with C
implementations.  However, this is bad because if some broken code is
passing floats to these arguments, they will be quietly cast to int
and succeed, when really I should only be accepting objects that have
__index__.  There's no index() alternative to int().

I think changing all these functions to do the appropriate
PyIndex_Check is a correct and valid fix, but I think it also
stretches beyond the original purpose of __index__.  I think that
__index__ is relatively unknown, and perhaps there should be better
documentation as to when and how it should be used over the
better-known __int__.
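
To make the documented contract concrete, a toy sketch (the class is
invented purely for illustration):

import operator

class NativeInt:
    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Lossless integer conversion: enables operator.index(),
        # sequence indexing, range(), hex(), etc.
        return self._value

    def __int__(self):
        # Per the datamodel link above, this should return the same
        # value whenever __index__ is defined.
        return self._value

assert operator.index(NativeInt(3)) == 3
assert [10, 20, 30][NativeInt(1)] == 20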


Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-08 Thread Erik Bray
On Fri, Dec 8, 2017 at 1:52 PM, Antoine Pitrou  wrote:
> On Fri, 8 Dec 2017 14:30:00 +0200
> Serhiy Storchaka 
> wrote:
>>
>> NumPy integers implement __index__.
>
> That doesn't help if a function calls e.g. PyLong_AsLongAndOverflow().

Right--pointing to __index__ basically implies that there should be
more calls to PyIndex_Check and subsequent PyNumber_AsSsize_t than
there currently are.  That I could agree with, but then it becomes a
question of where those cases are.  And what to do with, e.g.,
interfaces like PyLong_AsLongAndOverflow()--add more PyNumber_
conversion functions?


Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-08 Thread Erik Bray
On Fri, Dec 8, 2017 at 12:26 PM, Serhiy Storchaka <storch...@gmail.com> wrote:
> 08.12.17 12:41, Erik Bray wrote:
>>
>> IIUC, it seems to be carry-over from Python 2's PyLong API, but I
>> don't see an obvious reason for it.  In every case there's an explicit
>> PyLong_Check first anyways, so not calling __int__ doesn't help for
>> the common case of exact int objects; adding the fallback costs
>> nothing in that case.
>
>
> There is also a case of int subclasses. It is expected that PyLong_AsLong is
> atomic, and calling __int__ can lead to crashes or similar consequences.
>
>> I ran into this because I was passing an object that implements
>> __int__ to the maxlen argument to deque().  On Python 2 this used
>> PyInt_AsSsize_t which does fall back to calling __int__, whereas
>> PyLong_AsSsize_t does not.
>
>
> PyLong_* functions provide an interface to PyLong objects. If they don't
> return the content of a PyLong object, how can it be retrieved? If you want
> to work with general numbers you should use PyNumber_* functions.

By "you " I assume you meant the generic "you".  I'm not the one who
broke things in this case :)

> In your particular case it is more reasonable to fallback to __index__
> rather than __int__. Unlikely maxlen=4.2 makes sense.

That's true, but in Python 2 that was possible:

>>> deque([], maxlen=4.2)
deque([], maxlen=4)

More importantly not as many objects that coerce to int actually
implement __index__.  They probably *should* but there seems to be
some confusion about how that's to be used.  It was mainly motivated
by slices, but it *could* be used in general cases where it definitely
wouldn't make sense to accept a float (I wonder if maybe the real
problem here is that floats can be coerced automatically to ints).

In other words, there are probably countless other cases throughout the
stdlib where it "doesn't make sense" to accept a float, but that
otherwise should accept objects that can be coerced to int without
having to manually wrap those objects with an int(o) call.

>> Currently the following functions fall back on __int__ where available:
>>
>> PyLong_AsLong
>> PyLong_AsLongAndOverflow
>> PyLong_AsLongLong
>> PyLong_AsLongLongAndOverflow
>> PyLong_AsUnsignedLongMask
>> PyLong_AsUnsignedLongLongMask
>
>
> I think this should be deprecated (and there should be an open issue for
> this). Calling __int__ is just a Python 2 legacy.

Okay, but then there are probably many cases where they should be
replaced with PyNumber_ equivalents or else who knows how much code
would break.


[Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-08 Thread Erik Bray
IIUC, it seems to be carry-over from Python 2's PyLong API, but I
don't see an obvious reason for it.  In every case there's an explicit
PyLong_Check first anyways, so not calling __int__ doesn't help for
the common case of exact int objects; adding the fallback costs
nothing in that case.

I ran into this because I was passing an object that implements
__int__ to the maxlen argument to deque().  On Python 2 this used
PyInt_AsSsize_t which does fall back to calling __int__, whereas
PyLong_AsSsize_t does not.
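
A minimal way to reproduce the change (the wrapper class is invented
for illustration):

from collections import deque

class IntLike:
    def __init__(self, n):
        self.n = n

    def __int__(self):
        return self.n

# Python 2 accepted this, via PyInt_AsSsize_t falling back to __int__;
# Python 3 raises TypeError because PyLong_AsSsize_t never calls it.
try:
    deque([], maxlen=IntLike(5))
except TypeError as exc:
    print(exc)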

Currently the following functions fall back on __int__ where available:

PyLong_AsLong
PyLong_AsLongAndOverflow
PyLong_AsLongLong
PyLong_AsLongLongAndOverflow
PyLong_AsUnsignedLongMask
PyLong_AsUnsignedLongLongMask

whereas the following (at least according to the docs--haven't checked
the code in all cases) do not:

PyLong_AsSsize_t
PyLong_AsUnsignedLong
PyLong_AsSize_t
PyLong_AsUnsignedLongLong
PyLong_AsDouble
PyLong_AsVoidPtr

I think this inconsistency should be fixed, unless there's some reason
for it I'm not seeing.

Thanks,
Erik


Re: [Python-ideas] install pip packages from Python prompt

2017-11-04 Thread Erik Bray
On Nov 4, 2017 08:31, "Stephen J. Turnbull" <turnbull.stephen...@u.tsukuba.ac.jp> wrote:

Erik Bray writes:

 > Nope.  I totally get that they don’t know what a shell or command prompt
 > is.  THEY. NEED. TO. LEARN.


Just to be clear I did not write this. Someone replying to me did.

I'm going to go over all the different proposals in this thread and see if
I can synthesize a list of options. I think, even if it's not a solution
that winds up in the stdlib, it would be good to have some user stories
about how package installation from within an interactive prompt might work
(even if not from the standard REPL, which it should be noted has had small
improvements made to it over the years).

I also have my doubts about whether this *shouldn't* be possible. I mean,
to a lot of beginners starting out the basic REPL *is* Python. They're so
new to the scene they don't even know what IPython or Jupyter is or why
they might want that. They aren't experienced enough to even know what
they're missing out on. In classrooms we can resolve that easily by
pointing our students to whatever tools we think will work best for them,
but not everyone has that privilege.

Best,
Erik

I don't want to take a position on the proposal, and I agree that we
should *strongly* encourage everyone to learn.  But "THEY. NEED. TO.
LEARN." is not obvious to me.

Anecdotally, my students are doing remarkably (to me, as a teacher)
complex modeling with graphical interfaces to statistical and
simulation packages (SPSS/AMOS, Artisoc, respectively), and collection
of large textual databases from SNS with cargo-culted Python programs.
For the past twenty years teaching social scientists, these accidental
barriers (as Fred Brooks would have called them) have dropped
dramatically, to the point where it's possible to do superficially
good-looking (= complex) but entirely meaningless :-/ empirical
research.  (In some ways I think this lowered cost has been horribly
detrimental to my work as an educator in applied social science. ;-)

The point being that "user-friendly" UI in many fields where (fairly)
advanced computing is used is more than keeping up with the perceived
needs of most computer users, while the essential (in the sense of
Brooks) non-computing modeling difficulties of their jobs remain.

By "perceived" I mean I want my students using TeX, but it's hard to
force them when all their professors (except me and a couple
mathematicians) use Word (speaking of irreproducible results).  It's
good enough for government work, and that's in fact where many of them
end up (and the great majority are either in government or in
equivalent corporate bureaucrat positions).  Yes, I meant the
deprecatory connotations of "perceived", but realistically, I admit
that maybe they *don't* *need* the more polished tech that I could
teach them.


I remember when I first started out teaching Software Carpentry I made the
embarrassing mistake (coming from Physics) of assuming that LaTeX is
the de facto standard in most other academic fields :)

 > Hiding it is not a good idea for anyone.

Agreed.  Command lines and REPLs teach humility, to me as well as my
students. :-)

Steve


--
Associate Professor  Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information
Email: turnb...@sk.tsukuba.ac.jp   University of Tsukuba
Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN


Re: [Python-ideas] install pip packages from Python prompt

2017-11-02 Thread Erik Bray
On Oct 30, 2017 8:57 PM, "Alex Walters" <tritium-l...@sdamon.com> wrote:



> -Original Message-
> From: Python-ideas [mailto:python-ideas-bounces+tritium-
> list=sdamon@python.org] On Behalf Of Erik Bray
> Sent: Monday, October 30, 2017 6:28 AM
> To: Python-Ideas <python-ideas@python.org>
> Subject: Re: [Python-ideas] install pip packages from Python prompt
>
> On Sun, Oct 29, 2017 at 8:45 PM, Alex Walters <tritium-l...@sdamon.com>
> wrote:
> > Then those users have more fundamental problems.  There is a minimum
> level
> > of computer knowledge needed to be successful in programming.
> Insulating
> > users from the reality of the situation is not preparing them to be
> > successful.  Pretending that there is no system command prompt, or
shell,
> or
> > whatever platform specific term applies, only hurts new programmers.
> Give
> > users an error message they can google, and they will be better off in
the
> > long run than they would be if we just ran pip for them.
>
> While I completely agree with this in principle, I think you
> overestimate the average beginner.

Nope.  I totally get that they don’t know what a shell or command prompt
is.  THEY. NEED. TO. LEARN.  Hiding it is not a good idea for anyone.  If
this is an insurmountable problem for the newbie, maybe they really
shouldn’t be attempting to program.  This field is not for everyone.


Reading this I get the impression, and correct me if I'm wrong, that you've
never taught beginners programming. Of course long term (heck in fact
fairly early on) they need to learn these nitty-gritty and sometimes
frustrating lessons, but not in a 2 hour intro to programming for total
beginners.

And I beg to differ--this field is for everyone, and increasingly more so
every day. Doesn't mean it's easy, but it is and can be for everyone.

Whether this specific proposal is technically feasible in a cross-platform
manner with the state of the Python interpreter and import system is
another question. But that's a discussion worth having. "Some people aren't
cut out for programming" isn't.


>  Many beginners I've taught or
> helped, even if they can manage to get to the correct command prompt,
> often don't even know how to run the correct Python.  They might often
> have multiple Pythons installed on their system--maybe they have
> Anaconda, maybe Python installed by homebrew, or a Python that came
> with an IDE like Spyder.  If they're on OSX often running "python"
> from the command prompt gives the system's crippled Python 2.6 and
> they don't know the difference.
>
> One thing that has been a step in the right direction is moving more
> documentation toward preferring running `python -m pip` over just
> `pip`, since this often has a better guarantee of running `pip` in the
> Python interpreter you intended.  But that still requires one to know
> how to run the correct Python interpreter from the command-line (which
> the newbie double-clicking on IDLE may not even have a concept of...).
>
> While I agree this is something that is important for beginners to
> learn (e.g. print(sys.executable) if in doubt), it *is* a high bar for
> many newbies just to install one or two packages from pip, which they
> often might need/want to do for whatever educational pursuit they're
> following (heck, it's pretty common even just to want to install the
> `requests` module, as I would never throw `urllib` at a beginner).
>
> So while I don't think anything proposed here will work technically, I
> am in favor of an in-interpreter pip install functionality.  Perhaps
> it could work something like this:
>
> a) Allow it *only* in interactive mode:  running `pip(...)` (or
> whatever this looks like) outside of interactive mode raises a
> `RuntimeError` with the appropriate documentation
> b) When running `pip(...)` the user is supplied with an interactive
> prompt explaining that since installing packages with `pip()` can
> result in changes to the interpreter, it is necessary to restart the
> interpreter after installation--give them an opportunity to cancel the
> action in case they have any work they need to save.  If they proceed,
> install the new package then restart the interpreter for them.  This
> avoids any ambiguity as to states of loaded modules before/after pip
> install.
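
A rough sketch of what (a) and (b) could look like in practice (this is
entirely hypothetical--no such function exists in the stdlib):

import os
import subprocess
import sys

def pip(*packages):
    # (a) Only allow this from the interactive prompt.
    if not hasattr(sys, 'ps1'):
        raise RuntimeError('pip() is only available in interactive mode; '
                           'run "python -m pip" from your shell instead')
    # (b) Warn about the restart and give a chance to cancel.
    reply = input('Installing packages will restart the interpreter, so '
                  'unsaved work will be lost. Proceed? [y/N] ')
    if reply.strip().lower() != 'y':
        return
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', *packages])
    # Restart so the new packages are importable in a clean state.
    os.execv(sys.executable, [sys.executable])
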
> > From: Stephan Houben [mailto:stephan...@gmail.com]
> > Sent: Sunday, October 29, 2017 3:43 PM
> > To: Alex Walters <tritium-l...@sdamon.com>
> > Cc: Python-Ideas <python-ideas@python.org>
> > Subject: Re: [Python-ideas] install pip packages from Python prompt
> >
> >
> >
> > Hi Alex,
> >
> >
> >
> > 2017-10-29 20:26 GMT+01:00 Alex Walters <tri

Re: [Python-ideas] install pip packages from Python prompt

2017-10-30 Thread Erik Bray
On Mon, Oct 30, 2017 at 11:27 AM, Erik Bray <erik.m.b...@gmail.com> wrote:
> On Sun, Oct 29, 2017 at 8:45 PM, Alex Walters <tritium-l...@sdamon.com> wrote:
>> Then those users have more fundamental problems.  There is a minimum level
>> of computer knowledge needed to be successful in programming.  Insulating
>> users from the reality of the situation is not preparing them to be
>> successful.  Pretending that there is no system command prompt, or shell, or
>> whatever platform specific term applies, only hurts new programmers.  Give
>> users an error message they can google, and they will be better off in the
>> long run than they would be if we just ran pip for them.
>
> While I completely agree with this in principle, I think you
> overestimate the average beginner.  Many beginners I've taught or
> helped, even if they can manage to get to the correct command prompt,
> often don't even know how to run the correct Python.  They might often
> have multiple Pythons installed on their system--maybe they have
> Anaconda, maybe Python installed by homebrew, or a Python that came
> with an IDE like Spyder.  If they're on OSX often running "python"
> from the command prompt gives the system's crippled Python 2.6 and
> they don't know the difference.


I should add--another case that is becoming extremely common is
beginners learning Python for the first time inside the
Jupyter/IPython Notebook.  And in my experience it can be very
difficult for beginners to understand the connection between what's
happening in the notebook ("it's in the web-browser--what does that
have to do with anything on my computer??") and the underlying Python
interpreter, file system, etc.  Being able to pip install from within
the Notebook would be a big win.  This is already possible since
IPython allows running system commands and it is possible to run the
pip executable from the notebook, then manually restart the Jupyter
kernel.

It's not 100% clear to me how my proposal below would work within a
Jupyter Notebook, so that would also be an angle worth looking into.

Best,
Erik


> One thing that has been a step in the right direction is moving more
> documentation toward preferring running `python -m pip` over just
> `pip`, since this often has a better guarantee of running `pip` in the
> Python interpreter you intended.  But that still requires one to know
> how to run the correct Python interpreter from the command-line (which
> the newbie double-clicking on IDLE may not even have a concept of...).
>
> While I agree this is something that is important for beginners to
> learn (e.g. print(sys.executable) if in doubt), it *is* a high bar for
> many newbies just to install one or two packages from pip, which they
> often might need/want to do for whatever educational pursuit they're
> following (heck, it's pretty common even just to want to install the
> `requests` module, as I would never throw `urllib` at a beginner).
>
> So while I don't think anything proposed here will work technically, I
> am in favor of an in-interpreter pip install functionality.  Perhaps
> it could work something like this:
>
> a) Allow it *only* in interactive mode:  running `pip(...)` (or
> whatever this looks like) outside of interactive mode raises a
> `RuntimeError` with the appropriate documentation
> b) When running `pip(...)` the user is supplied with an interactive
> prompt explaining that since installing packages with `pip()` can
> result in changes to the interpreter, it is necessary to restart the
> interpreter after installation--give them an opportunity to cancel the
> action in case they have any work they need to save.  If they proceed,
> install the new package then restart the interpreter for them.  This
> avoids any ambiguity as to states of loaded modules before/after pip
> install.
>
>
>
>> From: Stephan Houben [mailto:stephan...@gmail.com]
>> Sent: Sunday, October 29, 2017 3:43 PM
>> To: Alex Walters <tritium-l...@sdamon.com>
>> Cc: Python-Ideas <python-ideas@python.org>
>> Subject: Re: [Python-ideas] install pip packages from Python prompt
>>
>>
>>
>> Hi Alex,
>>
>>
>>
>> 2017-10-29 20:26 GMT+01:00 Alex Walters <tritium-l...@sdamon.com>:
>>
>> return “Please run pip from your system command prompt”
>>
>>
>>
>>
>>
>> The target audience for my proposal are people who do not know
>>
>> which part of the sheep the "system command prompt" is.
>>
>> Stephan
>>
>>
>>
>>
>>
>> From: Python-ideas
>> [mailto:python-ideas-bounces+tritium-list=sdamon@python.org] On Behalf
>> Of Stephan Houben
>> Sent: Sunda

Re: [Python-ideas] Asynchronous exception handling around with/try statement borders

2017-06-28 Thread Erik Bray
On Wed, Jun 28, 2017 at 3:19 PM, Greg Ewing <greg.ew...@canterbury.ac.nz> wrote:
> Erik Bray wrote:
>>
>> At this point a potentially
>> waiting SIGINT is handled, resulting in KeyboardInterrupt being raised
>> while inside the with statement's suite, and finally block, and hence
>> Lock.__exit__ are entered.
>
>
> Seems to me this is the behaviour you *want* in this case,
> otherwise the lock can be acquired and never released.
> It's disconcerting that it seems to be very difficult to
> get that behaviour with a pure Python implementation.

I think normally you're right--this is the behavior you would *want*,
but not the behavior that's consistent with how Python implements the
`with` statement, all else being equal.  Though it's still not
entirely fair either because if Lock.__enter__ were pure Python
somehow, it's possible the exception would be raised either before or
after the lock is actually marked as "acquired", whereas in the C
implementation acquisition of the lock will always succeed (assuming
the lock was free, and no other exceptional conditions) before the
signal handler is executed.

>> I think it might be possible to
>> gain more consistency between these cases if pending signals are
>> checked/handled after any direct call to PyCFunction from within the
>> ceval loop.
>
>
> IMO that would be going in the wrong direction by making
> the C case just as broken as the Python case.
>
> Instead, I would ask what needs to be done to make this
> work correctly in the Python case as well as the C case.

You have a point there, but at the same time the Python case, while
"broken" insofar as it can lead to broken code, seems correct from the
Pythonic perspective.  The other possibility would be to actually
change the semantics of the `with` statement. Or as you mention below,
a way to temporarily mask signals...

> I don't think it's even possible to write Python code that
> does this correctly at the moment. What's needed is a
> way to temporarily mask delivery of asynchronous exceptions
> for a region of code, but unless I've missed something,
> no such facility is currently provided.
>
> What would such a facility look like? One possibility
> would be to model it on the sigsetmask() system call, so
> there would be a function such as
>
>    mask_async_signals(bool)
>
> that turns delivery of async signals on or off.
>
> However, I don't think that would work. To fix the locking
> case, what we need to do is mask async signals during the
> locking operation, and only unmask them once the lock has
> been acquired. We might write a context manager with an
> __enter__ method like this:
>
>    def __enter__(self):
>        mask_async_signals(True)
>        try:
>            self.acquire()
>        finally:
>            mask_async_signals(False)
>
> But then we have the same problem again -- if a Keyboard
> Interrupt occurs after mask_async_signals(False) but
> before __enter__ returns, the lock won't get released.

Exactly.

> Another approach would be to provide a context manager
> such as
>
>    async_signals_masked(bool)
>
> Then the whole locking operation could be written as
>
>    with async_signals_masked(True):
>        lock.acquire()
>        try:
>            with async_signals_masked(False):
>                # do stuff here
>        finally:
>            lock.release()
>
> Now there's no possibility for a KeyboardInterrupt to
> be delivered until we're safely inside the body, but we've
> lost the ability to capture the pattern in the form of
> a context manager.
>
> The only way out of this I can think of at the moment is
> to make the above pattern part of the context manager
> protocol itself. In other words, async exceptions are
> always masked while the __enter__ and __exit__ methods
> are executing, and unmasked while the body is executing.

I think so too.  That's more or less in line with Nick's idea on njs's
issue (https://bugs.python.org/issue29988) of an ATOMIC_UNTIL opcode.
That's just one implementation possibility.  My question would be
whether to make that a language-level requirement of the context
manager protocol, or just something CPython does...

Thanks,
Erik


Re: [Python-ideas] Asynchronous exception handling around with/try statement borders

2017-06-28 Thread Erik Bray
On Wed, Jun 28, 2017 at 3:09 PM, Erik Bray <erik.m.b...@gmail.com> wrote:
> On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
>> On 28 June 2017 at 21:40, Erik Bray <erik.m.b...@gmail.com> wrote:
>>> My colleague's contention is that given
>>>
>>> lock = threading.Lock()
>>>
>>> this is simply *wrong*:
>>>
>>> lock.acquire()
>>> try:
>>>     do_something()
>>> finally:
>>> lock.release()
>>>
>>> whereas this is okay:
>>>
>>> with lock:
>>>     do_something()
>>
>> Technically both are slightly racy with respect to async signals (e.g.
>> KeyboardInterrupt), but the with statement form is less exposed to the
>> problem (since it does more of its work in single opcodes).
>>
>> Nathaniel Smith posted a good write-up of the technical details to the
>> issue tracker based on his work with trio:
>> https://bugs.python.org/issue29988
>
> Interesting; thanks for pointing this out.  Part of me felt like this
> has to have come up before but my searching didn't bring this up
> somehow (and even then it's only a couple months old itself).
>
> I didn't think about the possible race condition before
> WITH_CLEANUP_START, but obviously that's a possibility as well.
> Anyways since this is already acknowledged as a real bug I guess any
> further followup can happen on the issue tracker.

On second thought, maybe there is a case to be made w.r.t. making a
documentation change about the semantics of the `with` statement:

The old-style syntax cannot make any guarantees about atomicity w.r.t.
async events.  That is, there's no way syntactically in Python to
declare that no exception will be raised between "lock.acquire()" and
the setup of the "try/finally" blocks.

However, if issue-29988 were *fixed* somehow (and I'm not convinced it
can't be fixed in the limited case of `with` statements) then there
really would be a major semantic difference of the `with` statement in
that it does support this invariant.  Then the question is whether
that difference should be made a requirement of the language (probably too
onerous a requirement?), or just a feature of CPython (which should
still be documented one way or the other IMO).

Erik


Re: [Python-ideas] Asynchronous exception handling around with/try statement borders

2017-06-28 Thread Erik Bray
On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 28 June 2017 at 21:40, Erik Bray <erik.m.b...@gmail.com> wrote:
>> My colleague's contention is that given
>>
>> lock = threading.Lock()
>>
>> this is simply *wrong*:
>>
>> lock.acquire()
>> try:
>>     do_something()
>> finally:
>> lock.release()
>>
>> whereas this is okay:
>>
>> with lock:
>>     do_something()
>
> Technically both are slightly racy with respect to async signals (e.g.
> KeyboardInterrupt), but the with statement form is less exposed to the
> problem (since it does more of its work in single opcodes).
>
> Nathaniel Smith posted a good write-up of the technical details to the
> issue tracker based on his work with trio:
> https://bugs.python.org/issue29988

Interesting; thanks for pointing this out.  Part of me felt like this
has to have come up before but my searching didn't bring this up
somehow (and even then it's only a couple months old itself).

I didn't think about the possible race condition before
WITH_CLEANUP_START, but obviously that's a possibility as well.
Anyways since this is already acknowledged as a real bug I guess any
further followup can happen on the issue tracker.

Thanks,
Erik


[Python-ideas] Asynchronous exception handling around with/try statement borders

2017-06-28 Thread Erik Bray
Hi folks,

I normally wouldn't bring something like this up here, except I think
that there is possibility of something to be done--a language
documentation clarification if nothing else, though possibly an actual
code change as well.

I've been having an argument with a colleague over the last couple
days over the proper way order of statements when setting up a
try/finally to perform cleanup of some action.  On some level we're
both being stubborn I think, and I'm not looking for resolution as to
who's right/wrong or I wouldn't bring it to this list in the first
place.  The original argument was over setting and later restoring
os.environ, but we ended up arguing over
threading.Lock.acquire/release which I think is a more interesting
example of the problem, and he did raise a good point that I do want
to bring up.



My colleague's contention is that given

lock = threading.Lock()

this is simply *wrong*:

lock.acquire()
try:
    do_something()
finally:
    lock.release()

whereas this is okay:

with lock:
    do_something()


Ignoring other details of how threading.Lock is actually implemented,
assuming that Lock.__enter__ calls acquire() and Lock.__exit__ calls
release() then as far as I've known ever since Python 2.5 first came
out these two examples are semantically *equivalent*, and I can't find
any way of reading PEP 343 or the Python language reference that would
suggest otherwise.

However, there *is* a difference, and has to do with how signals are
handled, particularly w.r.t. context managers implemented in C (hence
we are talking CPython specifically):

If Lock.__enter__ is a pure Python method (even if it maybe calls some
C methods), and a SIGINT is handled during execution of that method,
then in almost all cases a KeyboardInterrupt exception will be raised
from within Lock.__enter__--this means the suite under the with:
statement is never evaluated, and Lock.__exit__ is never called.  You
can be fairly sure the KeyboardInterrupt will be raised from somewhere
within a pure Python Lock.__enter__ because there will usually be at
least one remaining opcode to be evaluated, such as RETURN_VALUE.
Because of how delayed execution of signal handlers is implemented in
the pyeval main loop, this means the signal handler for SIGINT will be
called *before* RETURN_VALUE, resulting in the KeyboardInterrupt
exception being raised.  Standard stuff.

However, if Lock.__enter__ is a PyCFunction things are quite
different.  If you look at how the SETUP_WITH opcode is implemented,
it first calls the __enter__ method with _PyObject_CallNoArg.  If this
returns NULL (i.e. an exception occurred in __enter__) then "goto
error" is executed and the exception is raised.  However if it returns
non-NULL the finally block is set up with PyFrame_BlockSetup and
execution proceeds to the next opcode.  At this point a potentially
waiting SIGINT is handled, resulting in KeyboardInterrupt being raised
while inside the with statement's suite, and finally block, and hence
Lock.__exit__ are entered.

Long story short, because Lock.__enter__ is a C function, assuming
that it succeeds normally then

with lock:
    do_something()

always guarantees that Lock.__exit__ will be called if a SIGINT was
handled inside Lock.__enter__, whereas with

lock.acquire()
try:
    ...
finally:
    lock.release()

there is at least a small possibility that the SIGINT handler is called
after the CALL_FUNCTION op but before the try/finally block is entered
(e.g. before executing POP_TOP or SETUP_FINALLY).  So the end result
is that the lock is held and never released after the
KeyboardInterrupt (whether or not it's handled somehow).
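
The window is visible in the bytecode (a sketch; exact opcode names
vary across CPython versions):

import dis

def do_something():
    pass

def critical(lock):
    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

dis.dis(critical)
# In the versions discussed here, acquire() compiles to a call opcode
# followed by POP_TOP, and only then SETUP_FINALLY; a SIGINT handled
# between those opcodes leaves the lock acquired with no finally block
# armed to release it.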

Whereas, again, if Lock.__enter__ is a pure Python function there's
less likely to be any difference (though I don't think the possibility
can be ruled out entirely).

At the very least I think this quirk of CPython should be mentioned
somewhere (since in all other cases the semantic meaning of the
"with:" statement is clear).  However, I think it might be possible to
gain more consistency between these cases if pending signals are
checked/handled after any direct call to PyCFunction from within the
ceval loop.

Sorry for the tl;dr; any thoughts?


Re: [Python-ideas] Run length encoding

2017-06-19 Thread Erik

On 19/06/17 02:47, David Mertz wrote:
As an only semi-joke, I have created a module on GH that meets the needs 
of this discussion (using the spelling I think are most elegant):


https://github.com/DavidMertz/RLE


It's a shame you have to build that list when encoding. I tried to work 
out a way to get the number of items in an iterable without having to 
capture all the values (on the understanding that if the iterable is 
already an iterator, it would be consumed).


The best I came up with so far (not general purpose, but it works in 
this scenario) is:


from itertools import groupby
from operator import countOf

def rle_encode(it):
    return ((k, countOf(g, k)) for k, g in groupby(it))

In your test code, this speeds things up quite a bit over building the 
list, but that's presumably only because both groupby() and countOf() 
will use the standard class comparison operator methods which in the 
case of ints will short-circuit with a C-level pointer comparison first.


For user-defined classes with complicated comparison methods, getting 
the length of the group by comparing the items will probably be worse.


Is there a better way of implementing a general-purpose "ilen()"? I 
tried a couple of other things, but they all required at least one 
lambda function and slowed things down by about 50% compared to the 
list-building version.
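
One approach that avoids both the intermediate list and a per-item
lambda is to let zip() drive an itertools.count at C speed (essentially
the trick more-itertools uses for its ilen()):

from collections import deque
from itertools import count

def ilen(iterable):
    counter = count()
    # zip() advances the counter once per item; deque(maxlen=0) merely
    # exhausts the pairs without storing any of them.
    deque(zip(iterable, counter), maxlen=0)
    return next(counter)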


(I agree this is sort of a joke, but it's still an interesting puzzle ...).

Regards, E.


Re: [Python-ideas] [Python-Dev] Language proposal: variable assignment in functional context

2017-06-16 Thread Erik

[cross-posted to python-ideas]

Hi Robert,

On 16/06/17 12:32, Robert Vanden Eynde wrote:
Hello, I would like to propose an idea for the language but I don't know 
where I can talk about it.


Can you please explain what the problem is that you are trying to solve?


In a nutshell, I would like to be able to write:
y = (b+2 for b = a + 1)


The above is (almost) equivalent to:

y = (a+1)+2

I realize the parentheses are not required, but I've included them 
because if your example mixed operators with different precedence then 
they might be necessary.


Other than binding 'b' (you haven't defined what you expect the scope of 
that to be, but I'll assume it's the outer scope for now), what is it 
about the form you're proposing that's different?



Or in list comprehension:
Y = [b+2 for a in L for b = a+1]

Which can already be done like this:
Y = [b+2 for a in L for b in [a+1]]


Y = [(a+1)+2 for a in L]

Which is less obvious, has a small overhead (iterating over a list) and 
get messy with multiple assignment:

Y =  [b+c+2 for a in L for b,c in [(a+1,a+2)]]

New syntax would allow to write:
Y =  [b+c+2 for a in L for b,c = (a+1,a+2)]


Y = [(a+1)+(a+2)+2 for a in L]

My first example (b+2 for b = a+1) can already be done using ugly syntax 
using lambda


y = (lambda b: b+2)(b=a+1)
y = (lambda b: b+2)(a+1)
y = (lambda b=a+1: b+2)()

Choice of syntax: "for" is good because it uses an existing keyword, and
the analogy of "for x = 5" vs "for x in [5]" is natural.


But the "for" loses the meaning of iteration.
The use of "with" would maybe sound more logical.

Python already have the "functional if", lambdas, list comprehension, 
but not simple assignment functional style.


Can you present an example that can't be re-written simply by reducing 
the expression as I have done above?


Regards, E.


Re: [Python-ideas] Dictionary destructing and unpacking.

2017-06-07 Thread Erik

On 07/06/17 23:42, C Anthony Risinger wrote:

Neither of these are really comparable to destructuring.


No, but they are comparable to the OP's suggested new built-in method 
(without requiring each mapping type - not just dicts - to implement 
it). That was what _I_ was responding to.


E.


Re: [Python-ideas] Dictionary destructing and unpacking.

2017-06-07 Thread Erik

On 07/06/17 19:14, Nick Humrich wrote:

a, b, c = mydict.unpack('a', 'b', 'c')


def retrieve(mapping, *keys):
    return (mapping[key] for key in keys)



$ python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def retrieve(mapping, *keys):
...     return (mapping[key] for key in keys)
...
>>> d = {'a': 1, 'b': None, 100: 'Foo' }
>>> a, b, c = retrieve(d, 'a', 'b', 100)
>>> a, b, c
(1, None, 'Foo')


E.


Re: [Python-ideas] π = math.pi

2017-06-02 Thread Erik Bray
On Fri, Jun 2, 2017 at 7:52 AM, Greg Ewing  wrote:
> Victor Stinner wrote:
>>
>> How do you write π (pi) with a keyboard on Windows, Linux or macOS?
>
>
> On a Mac, π is Option-p and ∑ is Option-w.

I don't have a strong opinion about it being in the stdlib, but I'd
also point out that a strong advantage to having these defined in a
module at all is that third-party interpreters (e.g. IPython, bpython,
some IDEs) that support tab-completion make these easy to type as
well, and I find them to be very readable for math-heavy code.


Re: [Python-ideas] Suggestion: push() method for lists

2017-05-22 Thread Erik

On 21/05/17 15:43, Paul Laos wrote:
 push(obj) would be 
equivalent to insert(index = -1, object), having -1 as the default index 
parameter. In fact, push() could replace both append() and insert() by 
unifying them.


I don't think list.insert() with an index of -1 does what you think it does:

$ python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> l = [0, 1, 2]
>>> l
[0, 1, 2]
>>> l.insert(-1, 99)
>>> l
[0, 1, 99, 2]
>>>

Because the indices can be thought of as referencing the spaces
_between_ the objects, having a push() in which -1 references a
different 'space' than a -1 given to insert() or a slice operation
would, I suspect, be a source of confusion (and off-by-one bugs).


E.


Re: [Python-ideas] Add an option for delimiters in bytes.hex()

2017-05-03 Thread Erik

On 04/05/17 01:24, Steven D'Aprano wrote:

On Thu, May 04, 2017 at 12:13:25AM +0100, Erik wrote:

I had a use-case where splitting an iterable into a sequence of
same-sized chunks efficiently improved the performance of my code

[...]

So I didn't propose it. I have no idea now what I spent my saved hours
doing, but I imagine that it was fun



Summary: I didn't present the argument because I'm not a masochist


I'm not sure what the point of that anecdote was, unless it was "I wrote
some useful code, and you missed out".


Then you have misunderstood me. Paul suggested that my use-case 
(chunking could be faster) was perhaps enough to propose that my patch 
may be considered. I responded with historical/empirical evidence that 
perhaps that would actually not be the case.


I was responding, honestly, to the questions raised by Paul's email.


Your comments come across as a passive-aggressive chastisment of the
core devs and the Python-Ideas community for being too quick to reject
useful code: we missed out on something good, because you don't have the
time or energy to deal with our negativity and knee-jerk rejection of
everything good. That's the way your series of posts come across to me.


I apologise if my words or my turn of phrase do not appeal to you. I am 
trying to be constructive with everything I post.


If you choose to interpret my messages in a different way then I'm not 
sure what I can do about that.


Back to the important stuff though:


- you could have offered it to the more-itertools project;


A more efficient version of more_itertools.chunked() is what we're
talking about.



- you could have published it on PyPy;


Does PyPy support C extension modules? If so, that's a possibility.


- you could have proposed it on Python-Ideas with an explicit statement


I may well do that - my current patch (because of when I did it) is 
against a Py2 codebase, but I could port it to Py3. I still have a 
nagging doubt that I'd be wasting my time though ;)




If
you care so little that you can't be bothered even to propose it, why do
you care if it is rejected?


You are mistaking not caring enough about the functionality with not 
caring enough to enter into an argument about including that 
functionality ...


I didn't propose it at the time because of the reasons I mentioned. But 
when I saw something being discussed yet again that I had a general 
solution for already written I thought I mention it in case it was 
useful. As I said, I'm _trying_ to be constructive.


E.


Re: [Python-ideas] Add an option for delimiters in bytes.hex()

2017-05-03 Thread Erik

Hi Paul,

On 03/05/17 08:57, Paul Moore wrote:
> On 3 May 2017 at 02:48, Erik <pyt...@lucidity.plus.com> wrote:
>> Anyway, I know you can't stop anyone from *proposing* something like 
this,

>> but as soon as they do you may decide to quote the recipe from
>> "https://docs.python.org/3/library/functions.html#zip; and try to block
>> their proposition. There are already threads on fora that do that.
>>
>> That was my sticking point at the time when I implemented a general
>> solution. Why bother to propose something that (although it made my code
>> significantly faster) had already been blocked as being something that
>> should be a python-level operation and not something to be included in a
>> built-in?
>
> It sounds like you have a reasonable response to the suggestion of
> using zip- that you have a use case where performance matters, and
> your proposed solution is of value in that case.

I don't think so, though.

I had a use-case where splitting an iterable into a sequence of 
same-sized chunks efficiently improved the performance of my code 
significantly (processing a LOT of 24-bit, multi-channel - 16 to 32 - 
PCM streams from a WAV file).


Having thought "I need to split this stream by a fixed number of bytes"
and then found more_itertools.chunked() (and the
zip_longest(*([iter(foo)] * num)) trick), it turned out they were not
quick enough, so I implemented itertools.chunked() in C.
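
(For reference, the zip_longest trick mentioned above is the itertools
documentation's "grouper" recipe; in Python 3 terms:)

from itertools import zip_longest

def chunked(iterable, n, fillvalue=None):
    # One shared iterator repeated n times: each output tuple pulls n
    # consecutive items, and the last chunk is padded with fillvalue.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)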


That worked well for me, so when I was done I did a search in case it 
was worth proposing as an enhancement to feed it back to the community. 
Then I came across things such as the following:


http://bugs.python.org/issue6021

I am specifically referring to the "It has been rejected before" 
comment, also mentioned here:


https://mail.python.org/pipermail/python-dev/2012-July/120885.html

See this entire thread, too:

https://mail.python.org/pipermail/python-ideas/2012-July/015671.html

This is the reason why I really just didn't care enough to go through 
the process of proposing it in the end (even though the 
more_itertools.chunked function was one of the first 3 implemented in 
V1.0 and seems to _still_ be cropping up all the time in different 
guises - so is perhaps more fundamental than people recognise).


The strong implication of the discussions linked to above is that if it 
had been mentioned before it would be immediately rejected, and that was 
supported by several members of the community in good standing.


So I didn't propose it. I have no idea now what I spent my saved hours 
doing, but I imagine that it was fun


> Whether it's a
> *sufficient* response remains to be seen, but unless you present the
> argument we won't know.

Summary: I didn't present the argument because I'm not a masochist

Regards, E.



Re: [Python-ideas] Augmented assignment syntax for objects.

2017-05-02 Thread Erik

On 26/04/17 21:50, Chris Angelico wrote:

On Thu, Apr 27, 2017 at 6:24 AM, Erik <pyt...@lucidity.plus.com> wrote:

The background is that what I find myself doing a lot of for private
projects is importing data from databases into a structured collection of
objects and then grouping and analyzing the data in different ways before
graphing the results.

So yes, I tend to have classes that accept their entire object state as
parameters to the __init__ method (from the database values) and then any
other methods in the class are generally to do with the subsequent analysis
(including dunder methods for iteration, rendering and comparison etc).


You may want to try designing your objects as namedtuples. That gives
you a lot of what you're looking for.


I did look at this. It looked promising.

What I found was that I spent a lot of time working out how to subclass 
namedtuples properly (I do need to do that to add the extra logic - and 
sometimes some state - for my analysis) and once I got that working, I 
was left with a whole different set of boilerplate and special cases and 
therefore another set of things to remember if I return to this code at 
some point.
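
For illustration, the kind of subclassing involved looks roughly like
this (class and field names invented):

from collections import namedtuple

class Sample(namedtuple('Sample', ['timestamp', 'channel', 'value'])):
    __slots__ = ()  # subclass boilerplate: avoid a per-instance __dict__

    def scaled(self, factor):
        # Analysis helpers live on the subclass; since instances are
        # immutable, "updating" means building a new tuple via _replace().
        return self._replace(value=self.value * factor)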


So I've reverted to regular classes and multiple assignments in __init__.

E.



Re: [Python-ideas] Add an option for delimiters in bytes.hex()

2017-05-02 Thread Erik

On 02/05/17 12:31, Steven D'Aprano wrote:

I disagree with this approach. There's nothing special about bytes.hex()
here, perhaps we want to format the output of hex() or bin() or oct(),
or for that matter "%x" and any of the other string templates?

In fact, this is a string operation that could apply to any character
string, including decimal digits.

Rather than duplicate the API and logic everywhere, I suggest we add a
new string method. My suggestion is str.chunk(size, delimiter=' ') and
str.rchunk() with the same arguments:

"1234ABCDEF".chunk(4)
=> returns "1234 ABCD EF"
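
A rough pure-Python sketch of what those two methods might do, written 
as plain functions (illustrative only):

def chunk(s, size, delimiter=' '):
    # Group from the left, as in the example above.
    return delimiter.join(s[i:i + size] for i in range(0, len(s), size))

def rchunk(s, size, delimiter=' '):
    # Group from the right, e.g. for thousands separators.
    head = s[:len(s) % size]
    groups = [s[i:i + size] for i in range(len(s) % size, len(s), size)]
    return delimiter.join(([head] if head else []) + groups)

chunk("1234ABCDEF", 4)   # '1234 ABCD EF'
rchunk("1234567", 3)     # '1 234 567'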


FWIW, I implemented a version of something similar as a fixed-length 
"chunk" method in itertoolsmodule.c (it was similar to izip_longest - it 
had a "fill" keyword to pad the final chunk). It was ~100 LOC including 
the structure definitions. The chunk method was an iterator (so it 
returned a sequence of "chunks" as defined by the API).
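
As a rough pure-Python model of that C iterator (hypothetical signature; 
the real thing was C code, so this is only for illustration):

from itertools import islice

_NO_FILL = object()

def chunk(iterable, size, fill=_NO_FILL):
    # Yield successive fixed-size tuples, optionally padding the last one.
    it = iter(iterable)
    while True:
        block = tuple(islice(it, size))
        if not block:
            return
        if fill is not _NO_FILL and len(block) < size:
            block += (fill,) * (size - len(block))
        yield block

list(chunk(range(7), 3))          # [(0, 1, 2), (3, 4, 5), (6,)]
list(chunk(range(7), 3, fill=0))  # [(0, 1, 2), (3, 4, 5), (6, 0, 0)]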


Then I read that "itertools" should consist of primitives only and that 
we should defer to "more_itertools" for anything that is of a higher 
level (which this is - it can be done in terms of itertools functions). 
So I didn't propose it, although the processing of my WAV files (in 
which the sample data are groups of bytes - frames - of a fixed length) 
was significantly faster with it :(


I also looked at implementing itertools.chunk as a function that would 
make use of a "__chunk__" method on the source object if it existed 
(which allowed a class to support an even more efficient version of 
chunking - things like range() etc).
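
In sketch form, that dispatch might have looked something like this (a 
hypothetical dunder; chunk() is the generic fallback sketched above):

def chunked(obj, size):
    # Hypothetical protocol: let the object provide an optimised chunker
    # (e.g. range() could return sub-ranges without iterating).
    hook = getattr(type(obj), '__chunk__', None)
    if hook is not None:
        return hook(obj, size)
    return chunk(obj, size)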



I don't see any advantage to adding this to bytes.hex(), hex(), oct(),
bin(), and I really don't think it is helpful to be grouping the
characters by the number of bits. It's a string formatting operation, not
a bit operation.


Why do you want to limit it to strings? Isn't something like this 
potentially useful for all sequences (where the result is a tuple of 
chunks that are the same type as the source sequence - be that strings 
or lists or lazy ranges or whatever)? And why aren't the chunks returned 
via an iterator?


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-28 Thread Erik

On 28/04/17 10:47, Paul Moore wrote:

On 28 April 2017 at 00:18, Erik <pyt...@lucidity.plus.com> wrote:

The semantics are very different and there's little or no connection
between importing a module and setting an attribute on self.


At the technical level of what goes on under the covers, yes. At the higher
level of what the words mean in spoken English, it's really not so different
a concept.


I disagree. If you were importing into the *class* (instance?) I might
begin to see a connection, but importing into self?


I know you already understand the following, but I'll spell it out 
anyway. Here's a module:


-
$ cat foo.py
def foo():
  global sys
  import sys

  current_namespace = set(globals().keys())
  print(initial_namespace ^ current_namespace)

def bar():
  before_import = set(locals().keys())
  import os
  after_import = set(locals().keys())
  print(before_import ^ after_import)

initial_namespace = set(globals().keys())
-

Now, what happens when I run those functions:

$ python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import foo
>>> foo.foo()
{'sys', 'initial_namespace'}
>>> foo.bar()
{'before_import', 'os'}
>>>

... so the net effect of "import" is to bind an object into a namespace 
(a dict somewhere). In the case of 'foo()' it's binding the module 
object for "sys" into the dict of the module object that represents 
'foo.py'. In the case of 'bar()' it's binding the module object for "os" 
into the dict representing the local namespace of the current instance 
of the bar() call.
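
To make that concrete (a simplification - the real statement also 
consults sys.modules and handles packages):

# "import sys" inside foo() is roughly:
sys = __import__('sys')   # bound into globals(), due to 'global sys'

# "import os" inside bar() is roughly:
os = __import__('os')     # bound into bar()'s local namespace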


Isn't binding an object to a namespace the same operation that 
assignment performs?


So it's a type of assignment, and one that doesn't require the name to 
be spelled twice in the current syntax (and that's partly why I took 
offense at a suggestion - not by you - that I was picking "random or 
arbitrary" keywords. I picked it for that specific reason).


I realize that there are other semantic differences (importing a module 
twice doesn't do anything - and specifically, repeating "from mod import 
*" will not pick up changes if the module has mutated) - and perhaps 
this is your point.



Also, if you try to make the obvious generalisations (which you'd *have* 
to be able to make due to the way Python works) things quickly get out 
of hand:

def __init__(self, a):
self import a


self.a = a



OK, but self is just a variable name, so we can reasonably use a different name:

def __init__(foo, a):
foo import a


foo.a = a


So the syntax is <object> import <name>

Presumably the following also works, because there's nothing special
about parameters?

def __init__(x, a):
calc = a**2
x import calc


x.calc = calc


And of course there's nothing special about __init__

def my_method(self, a):
self import a


self.a = a


Or indeed about methods

def standalone(a, b):
a import b


a.b = b


or statements inside functions:

if __name__ == '__main__':
a = 12
b = 13
a import b


a.b = b


Hmm, I'd hope for a type error here. But what types would be allowed
for a?


I think you're assuming I'm suggesting some sort of magic around "self" 
or some such thing. I'm not. I've written above exactly what I would 
expect the examples to be equivalent to. It's just an assignment which 
doesn't repeat the name (and in the comma-separated version allows 
several names to be assigned using compact syntax without spelling them 
twice, which is where this whole thing spawned from).



See what I mean? Things get out of hand *very* fast.


I don't see how that's getting "out of hand". The proposal is nothing 
more complicated than a slightly-different spelling of assignment. It 
could be done today with a text-based preprocessor which converts the 
proposed form to an existing valid syntax. Therefore, if it's "out of 
hand" then so is the existing assignment syntax ;)


FWIW, I should probably state for the record that I'm not actually 
pushing for _anything_ right now. I'm replying to questions asked and 
also to statements made which I think have missed the point of what I 
was trying to say earlier. So I'm just engaging in the conversation at 
this point - if it appears confrontational, it's not meant to be.



To summarise:

1. There's some serious technical issues with your proposal, which as
far as I can see can only be solved by arbitrary restrictions on how
it can be used


To be honest, I still don't understand what the serious technical issues 
are (other than the parser probably doesn't handle this sort of 
keyword/operator hybrid!). Is it just that I'm seeing the word "import" 
in this context as a type of assignment and you're seeing any reference 
to the word "import" as being a 

Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-27 Thread Erik

On 27/04/17 23:43, Steven D'Aprano wrote:

On Wed, Apr 26, 2017 at 11:29:19PM +0100, Erik wrote:

def __init__(self, a, b, c):
   self import a, b
   self.foo = c * 100


[snarky]
If we're going to randomly choose arbitrary keywords with no connection
to the operation being performed,


The keyword I chose was not random or arbitrary and it _does_ have a 
connection to the operation being performed (bind a value in the source 
namespace to the target namespace using the same name it had in the 
source namespace - or rename it using the 'as' keyword).



can we use `del` instead of `import`
because it's three characters less typing?


Comments like this just serve to dismiss or trivialize the discussion. 
We acknowledged that we're bikeshedding so it was not a serious 
suggestion, just a "synapse prodder" ...



But seriously, I hate this idea.


Good. It's not a proposal, but something that was supposed to generate 
constructive discussion.



The semantics are very different and there's little or no connection
between importing a module and setting an attribute on self.


At the technical level of what goes on under the covers, yes. At the 
higher level of what the words mean in spoken English, it's really not 
so different a concept.



If we're going to discuss pie-in-the-sky suggestions,


That is just dismissing/trivializing the conversation again.


(If you don't like "inject", I'm okay with "load" or even "push".)


No you're not, because that's a new keyword which might break existing 
code and that is even harder to justify than re-using an existing 
keyword in a different context.



the problem this solves isn't big or
important enough for the disruption of adding a new keyword.


So far, you are the only one to have suggested adding a new keyword, I 
think ;)


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-26 Thread Erik

On 26/04/17 23:28, Paul Moore wrote:

Or to put it another way, if the only
reason for the syntax proposal is performance then show me a case
where performance is so critical that it warrants a language change.


It's the other way around.

The proposal (arguably) makes the code clearer but does not impact 
performance (and is a syntax error today, so does not break existing code).


The suggestions (decorators etc) make the code (arguably) clearer today 
without a syntax change, but impact performance.


So, those who think the decorators make for clearer code have to choose 
between source code clarity and performance.


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-26 Thread Erik

On 26/04/17 19:15, Mike Miller wrote:

As the new syntax ideas piggyback on existing syntax, it doesn't feel
like that its a complete impossibility to have this solved.  Could be
another "fixed papercut" to drive Py3 adoption.  Taken individually not
a big deal but they add up.


*sigh* OK, this has occurred to me over the last couple of days but I 
didn't want to suggest it as I didn't want the discussion to fragment 
even more.


But, if we're going to bikeshed and there is some weight behind the idea 
that this "papercut" should be addressed, then given my previous 
comparisons with importing, what about having 'import' as an operator:


def __init__(self, a, b, c):
   self import a, b
   self.foo = c * 100

Also allows renaming:

def __init__(self, a, b, c):
   self import a, b, c as _c

Because people are conditioned to think the comma-separated values after 
"import" are not tuples, perhaps the use of import as an operator rides 
on that wave ...


(I do realise that blurring the lines between statements and operators 
like this is probably not going to work for technical reasons (and it 
just doesn't quite read correctly anyway), but now we're bikeshedding 
and who knows what someone else might come up with in response ...).


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-26 Thread Erik

On 26/04/17 22:28, Paul Moore wrote:

On 26 April 2017 at 21:51, Erik <pyt...@lucidity.plus.com> wrote:

It doesn't make anything more efficient, however all of the suggestions of
how to do it with current syntax (mostly decorators) _do_ make things less
efficient.


Is instance creation the performance bottleneck in your application?


No, not at all. This discussion has split into two:

1) How can I personally achieve what I want for my own personal 
use-cases. This should really be on -list, and some variation of the 
decorator thing will probably suffice for me.


2) The original proposal, which does belong on -ideas and has to take 
into account the general case, not just my specific use-case.


The post you are responding to is part of (2), and hence reduced 
performance is a consideration.


Regards, E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-26 Thread Erik

On 26/04/17 01:39, Nathaniel Smith wrote:
[snip discussion of why current augmented assignment operators are 
better for other reasons]



Are there any similar arguments for .=?


It doesn't make anything more efficient, however all of the suggestions 
of how to do it with current syntax (mostly decorators) _do_ make things 
less efficient.


So rather than a win/win as with current augmented assignment 
(compact/clearer code *and* potentially a performance improvement), it's 
now a tradeoff (wordy code *or* a performance reduction).


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-26 Thread Erik

On 26/04/17 16:10, Nick Timkovich wrote:

I was wondering that if there are so many arguments to a function that
it *looks* ugly, that it might just *be* ugly.

For one, too many required arguments to a function (constructor,
whatever) is already strange. Binding them as attributes of the object,
unmodified in a constructor also seems to be rare.


Yes, and perhaps it's more of a problem for me because of my 
possibly-atypical use of Python.


The background is that what I find myself doing a lot of for private 
projects is importing data from databases into a structured collection 
of objects and then grouping and analyzing the data in different ways 
before graphing the results.


So yes, I tend to have classes that accept their entire object state as 
parameters to the __init__ method (from the database values) and then 
any other methods in the class are generally to do with the subsequent 
analysis (including dunder methods for iteration, rendering and 
comparison etc).


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-26 Thread Erik

On 26/04/17 18:42, Mike Miller wrote:

I want to be able to say:


def __init__(self, foo, bar, baz, spam):
  self .= foo, bar, spam
  self.baz = baz * 100



I don't see ALL being set as a big problem, and it's less work than 
typing several of them out again.


Because, some of the parameters might be things that are just passed to 
another constructor to create an object that is then referenced by the 
object being created.


If one doesn't want the object's namespace to be polluted by that stuff 
(which may be large and also now can't be garbage collected while the 
object is alive) then a set of "del self.xxx" statements is required 
instead, so you've just replaced one problem with another ;)


I'd rather just explicitly say what I want to happen rather than have 
*everything* happen and then have to tidy that up instead ...


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-26 Thread Erik

On 26/04/17 08:59, Paul Moore wrote:

It should be possible to modify the decorator to take a list
of the variable names you want to assign, but I suspect you won't like
that


Now you're second-guessing me.

> class MyClass:
> @auto_args('a', 'b')
> def __init__(self, a, b, c=None):
> pass

I had forgotten that decorators could take parameters. Something like 
that pretty much ticks the boxes for me.
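
For completeness, one way such a parameterised decorator might be 
implemented (a sketch - Paul's actual version may well differ):

import functools
import inspect

def auto_args(*names):
    # Bind only the named parameters onto self before running the body.
    def decorator(init):
        sig = inspect.signature(init)
        @functools.wraps(init)
        def wrapper(self, *args, **kwargs):
            bound = sig.bind(self, *args, **kwargs)
            bound.apply_defaults()
            for name in names:
                setattr(self, name, bound.arguments[name])
            return init(self, *args, **kwargs)
        return wrapper
    return decorator

class MyClass:
    @auto_args('a', 'b')
    def __init__(self, a, b, c=None):
        pass

m = MyClass(1, 2)   # m.a == 1 and m.b == 2; no m.c attribute is set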


I'd _prefer_ something that sits inside the method body rather than just 
outside it, and I'd probably _prefer_ something that wasn't quite so 
heavyweight at runtime (which may be an irrational concern on my part 
;)), but those aren't deal breakers, depending on the project - and the 
vast majority of what I do in Python is short-lived one-off projects and 
rapid prototyping for later implementation in another language, so I do 
seem to be fleshing out a set of classes from scratch and writing a 
bunch of __init__ methods far more of the time than people with 
long-lived projects would do. Perhaps that's why it irritates me more 
than it does some others ;)


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-26 Thread Erik

On 26/04/17 13:19, Joao S. O. Bueno wrote:

On 25 April 2017 at 19:30, Erik <pyt...@lucidity.plus.com> wrote:

decorators don't cut it anyway (at least not those
proposed) because they blindly assign ALL of the arguments. I'm more than
happy to hear of something that solves both of those problems without
needing syntax changes though, as that means I can have it today ;)


Sorry -  a decorator won't "blindly assign all argments" - it will do
that just if it is written to do so.


Right, and the three or four variants suggested (and the 
vars(self).update() suggestion) all do exactly that. I was talking about 
the specific responses (though I can see my language is vague).


[FWIW I've been using Python the whole time that decorators have existed 
and I've yet to need to write one - I've _used_ some non-parameterized 
ones though - so I guess I'd forgotten that they can take parameters]


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-25 Thread Erik

On 25/04/17 22:15, Brice PARENT wrote:

it may be easier to get something like this
(I think, as there is no new operator involved) :


No new operator, but still a syntax change, so that doesn't help from 
that POV.




def __init__(self, *args, **kwargs):
  self.* = *args
  self.** = **kwargs


What is "self.* = *args" supposed to do? For each positional argument, 
what name in the object is it bound to?


E.


Re: [Python-ideas] Augmented assignment syntax for objects.

2017-04-25 Thread Erik

On 25/04/17 02:15, Chris Angelico wrote:

Bikeshedding: Your example looks a lot more like tuple assignment than
multiple assignment.


Well, originally, I thought it was just the spelling-the-same-name-twice 
thing that irritated me and I was just going to suggest a single 
assignment version like:


  self .= foo
  self .= bar

Then I thought that this is similar to importing (referencing an object 
from one namespace in another under the same name). In that scenario, 
instead of:


  from other import foo
  from other import bar

we have:

  from other import foo, bar

That's where the comma-separated idea came from, and I understand it 
looks like a tuple (which is why I explicitly mentioned that) but it 
does in the import syntax too ;)


The single argument version (though it doesn't help with vertical space) 
still reads better to me for the same reason that augmented assignment 
is clearer - there is no need to mentally parse that the same name is 
being used on both sides of the assignment because it's only spelled once.



self .= foo .= bar .= baz .= spam .= ham


Thanks for being the only person so far to understand that I don't 
necessarily want to bind ALL of the __init__ parameters to the object, 
just the ones I explicitly reference, but I'm not convinced by this 
suggestion. In chained assignment the thing on the RHS is bound to each 
name to the left of it and that is really not happening here.



The trouble is that this syntax is really only going to be used inside
__init__.


Even if that was true, who ever writes one of those? :D

E.


[Python-ideas] Augmented assignment syntax for objects.

2017-04-24 Thread Erik
Hi. I suspect that this may have been discussed to death at some point 
in the past, but I've done some searching and I didn't come up with 
much. Apologies if I'm rehashing an old argument ;)


I often find myself writing __init__ methods of the form:

def __init__(self, foo, bar, baz, spam, ham):
  self.foo = foo
  self.bar = bar
  self.baz = baz
  self.spam = spam
  self.ham = ham

This seems a little wordy and uses a lot of vertical space on the 
screen. Occasionally, I have considered something like:


def __init__(self, foo, bar, baz, spam, ham):
  self.foo, self.bar, self.baz, self.spam, self.ham = \
 foo, bar, baz, spam, ham

... just to make it a bit more compact - though in practice, I'd 
probably not do that with a list quite that long ... two or three items 
at most:


def __init__(self, foo, bar, baz):
   self.foo, self.bar, self.baz = foo, bar, baz

When I do that I'm torn because I know it has a runtime impact to create 
and unpack the implicit tuples and I'm also introducing a style 
asymmetry in my code just because of the number of parameters a method 
happens to have.


So why not have an augmented assignment operator for object attributes? 
It addresses one of the same broad issues that the other augmented 
assignment operators were introduced for (that of repeatedly spelling 
names).


The suggestion therefore is:

def __init__(self, foo, bar, baz, spam, ham):
  self .= foo, bar, baz, spam, ham

This is purely syntactic sugar for the original example:

def __init__(self, foo, bar, baz, spam, ham):
  self.foo = foo
  self.bar = bar
  self.baz = baz
  self.spam = spam
  self.ham = ham

... so if any of the attributes have setters, then they are called as 
usual. It's purely a syntactic shorthand. Any token which is not 
suitable on the RHS of the dot in a standard "obj.attr =" assignment is 
a syntax error (no "self .= 1").


The comma-separators in the example are not creating a tuple object, 
they would work at the same level in the parser as the import 
statement's comma-separated lists - in the same way that "from pkg 
import a, b, c" is the same as saying:


import pkg
a = pkg.a
b = pkg.b
c = pkg.c

... "self .= a, b, c" is the same as writing:

self.a = a
self.b = b
self.c = c

E.


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-08 Thread Erik

On 08/03/17 11:07, Steven D'Aprano wrote:

I mentioned earlier that I have code which has to track the type of list
items, and swaps to a different algorithm when the types are not all the
same.


Hmmm. Yes, I guess if the expensive version requires a lot of 
isinstance() messing or similar for each element then it could be better 
to have optimized versions for homogeneous lists of ints or strings etc.



A list.is_heterogeneous() method
could be implemented if it was necessary,


I would prefer to get the list item's type:

if mylist.__type_hint__ is float:


If you know the list is homogeneous then the item's type is 
"type(mylist[0])".


Also, having it be a function call gives an obvious place to put the 
transition from "unknown" to known state if the tri-state hint approach 
was taken. Otherwise, that would have to be hooked into the attribute 
access somehow.


That's for someone who wants to try implementing it to decide and 
propose though :)


E.


Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

2017-03-07 Thread Erik

On 07/03/17 20:46, Erik wrote:

(unless it
was acceptable that once heterogeneous, a list is always considered
heterogeneous - i.e., delete always sets the hint to NULL).


Rubbish. I meant that delete would not touch the hint at all.

E.


Re: [Python-ideas] For/in/as syntax

2017-03-04 Thread Erik

Hi Brice,

On 04/03/17 08:45, Brice PARENT wrote:

* Creating a real object at runtime for each loop which needs to be
the target of a non-inner break or continue



However, I'm not sure the object should be constructed and fed for every
loop usage. It should probably only be instantiated if explicitly asked
by the coder (by the use of "as loop_name").


That's what I meant by "needs to be the target of a non-inner break or 
continue" (OK, you are proposing something more than just a referenced 
break/continue target, but we are talking about the same thing). Only 
loops which use the syntax get a loop manager object.



* For anything "funky" (my words, not yours ;)), there needs to be a
way of creating a custom loop object - what would the syntax for that
be? A callable needs to be invoked as well as the name bound (the
current suggestion just binds a name to some magical object that
appears from somewhere).



I don't really understand what this means, as I'm not aware of how those
things work in the background.


What I mean is, in the syntax "for spam in ham as eggs:" the name "eggs" 
is bound to your loop manager object. Where is the constructor call for 
this object? what class is it? That's what I meant by "magical".


If you are proposing the ability to create user-defined loop managers 
then there must be somewhere where your custom class's constructor is 
called. Otherwise how does Python know what type of object to create?


Something like (this is not a proposal, just something plucked out of 
the air to hopefully illustrate what I mean):


  for spam in ham with MyLoop() as eggs:
  eggs.continue()


I guess it would be magical in the sense it's not
the habitual way of constructing an object. But it's what we're already
used to with "as". When we use a context manager, like "with
MyPersonalStream() as my_stream:", my_stream is not an object of type
"MyPersonalStream" that has been built using the constructor, but the
return of __enter__()


But you have to spell the constructor (MyPersonalStream()) to see what 
type of object is being created (whether or not the eventual name bound 
in your context is the result of a method call on that object, the 
constructor of your custom context manager is explicitly called).



If you are saying that the syntax always implicitly creates an instance 
of a builtin class which can not be subclassed by a custom class then 
that's a bit different.




This solution, besides having been explicitly rejected by Guido himself,


I didn't realise that. Dead in the water then probably, which is fine, I 
wasn't pushing it.



brings two functionalities that are part of the proposal, but are not
its main purpose, which is having the object itself. Allowing to break
and continue from it are just things that it could bring to us, but
there are countless things it could also bring (not all of them being
good ideas, of course), like the .skip() and the properties I mentioned,


I understand that, but I concentrated on those because they were easily 
converted into syntax (and would probably be the only things I'd find 
useful - all the other stuff is mostly doable using a custom iterator, I 
think).
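
For instance, a rough sketch of the "custom iterator" route (a 
hypothetical Loop wrapper, not part of any proposal):

class Loop:
    # Wrap an iterable and expose loop-manager-ish operations on it.
    def __init__(self, iterable):
        self._it = iter(iterable)
        self.count = -1   # incremented for every item consumed
        self._skip = 0

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            value = next(self._it)
            self.count += 1
            if self._skip:
                self._skip -= 1
                continue
            return value

    def skip(self, n=1):
        self._skip += n   # silently drop the next n items

eggs = Loop(range(10))
for spam in eggs:
    if spam == 3:
        eggs.skip(2)      # 4 and 5 never reach the loop body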


I would agree that considering syntax for all of the extra things you 
mention would be a bad idea - which your loop manager object idea gets 
around.



but we could discuss about some methods like forloop.reset(),
forloop.is_first_iteration() which is just of shortcut to (forloop.count
== 0), forloop.is_last_iteration()


Also, FWIW, if I knew that in addition to the overhead of creating a 
loop manager object I was also incurring the overhead of a loop counter 
being maintained (usually, one is not required - if it is, use 
enumerate()) I would probably not use this construct and instead find 
ways of restructuring my code to avoid it using regular for loops.



I'm not beating up on you - like I said, I think the idea is interesting.

E.


Re: [Python-ideas] More classical for-loop

2017-02-18 Thread Erik

On 18/02/17 19:35, Mikhail V wrote:

You mean what my proposal would bring
technically better than e.g.:

for i,e in enumerate(Seq)

Well, nothing, and I will simply use it,
with only difference it could be:

for i,e over enumerate(Seq)

In this case only space holes will be
smoothed out, so pure optical fix.


But you also make the language's structure not make sense. For good or 
bad, English is the language that the keywords are written in so it 
makes sense for the Python language constructs to follow English constructs.


An iterable in Python (something that can be the target of a 'for' loop) 
is a collection of objects (whether they represent a sequence of 
integers, a set of unique values, a list of random things, whatever).


It is valid English to say "for each object in my collection, I will do 
the following:".


It is not valid English to say "for each object over my collection, I 
will do the following:".


In that respect, "in" is the correct keyword for Python to use. In the 
physical world, if the "collection" is some coins in your pocket, would 
you say "for each coin over my pocket, I will take it out and look at it"?


Other than that, I also echo Stephen's comments that not all iterables' 
lengths can be known in advance, and not all iterables can be indexed, 
so looping using length and indexing is a subset of what the 'for' loop 
can do today.


Why introduce new syntax for a restricted subset of what can already be 
done? Soon, someone else will propose another syntax for a different 
subset. This is why people are talking about the "burden" of learning 
these extra syntaxes. Rather than 10 different syntaxes for 10 different 
subsets, why not just learn the one syntax for the general case?


E.


Re: [Python-ideas] Python reviewed

2017-01-09 Thread Erik

On 10/01/17 01:44, Simon Lovell wrote:

Regarding the logical inconsistency of my argument, well I am saying
that I would prefer my redundancy at the end of the loop rather than the
beginning. To say that the status quo is better is to say that you
prefer your redundancy at the beginning.


It's not really that one prefers redundancy anywhere. It's more a 
question of:


a) Does the redundancy have any (however small) benefit?
b) How "expensive" is the redundancy (in this case, that equates to 
mandatory characters typed and subsequent screen noise when reading the 
code).


I don't understand how a "redundancy" of a trailing colon in any 
statement that will introduce a new level of indentation is worse than 
having to remember to type "end" when a dedent (which is zero 
characters) does that.


Trailing colon "cost": 1 * (0.n)
Block end "cost": (len("end") + len(statement_text)) * 1.0


I still struggle to see why it should be
mandatory though?


That looks like a statement, but you've ended it with a question mark. 
Are you asking if you still struggle? I can't tell. Perhaps it's just 
the correct use of punctuation that you're objecting to ;)


> One more comment I wanted to make about end blocks, is that a
> respectable editor will add them for you,

You are now asking me to write code with what you describe as a 
"respectable" editor. I use vim, which is very respectable, thank you. 
You'd like me to use "EditPlus 2" or equivalent. I struggle to see why 
that should be mandatory.


Thanks for starting an entertaining thread, though ;)

E.


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-30 Thread Erik Bray
On Fri, Dec 30, 2016 at 5:05 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 29 December 2016 at 22:12, Erik Bray <erik.m.b...@gmail.com> wrote:
>>
>> 1) CPython's TLS: Defines -1 as an uninitialized key (by fact of the
>> implementation--that the keys are integers starting from zero)
>> 2) pthreads: Does not define an uninitialized default value for
>> keys, for reasons described at [1] under "Non-Idempotent Data Key
>> Creation".  I understand their reasoning, though I can't claim to know
>> specifically what they mean when they say that some implementations
>> would require the mutual-exclusion to be performed on
>> pthread_getspecific() as well.  I don't know that it applies here.
>
>
> That section is a little weird, as they describe two requests (one for a
> known-NULL default value, the other for implicit synchronisation of key
> creation to prevent race conditions), and only provide the justification for
> rejecting one of them (the second one).

Right, that is confusing to me as well. I'm guessing the reason for
rejecting the first is in part a way to force us to recognize the
second issue.

> If I've understood correctly, the situation they're worried about there is
> that pthread_key_create() has to be called at least once-per-process, but
> must be called before *any* call to pthread_getspecific or
> pthread_setspecific for a given key. If you do "implicit init" rather than
> requiring the use of an explicit mechanism like pthread_once (or our own
> Py_Initialize and module import locks), then you may take a small
> performance hit as either *every* thread then has to call
> pthread_key_create() to ensure the key exists before using it, or else
> pthread_getspecific() and pthread_setspecific() have to become potentially
> blocking calls. Neither of those is desirable, so it makes sense to leave
> that part of the problem to the API client.
>
> In our case, we don't want the implicit synchronisation, we just want the
> known-NULL default value so the "Is it already set?" check can be moved
> inside the library function.

Okay, we're on the same page here then.  I just wanted to make sure
there wasn't anything else I was missing in Python's case.

>> 3) windows: The return value of TlsAlloc() is a DWORD (unsigned int)
>> and [2] states that its value should be opaque.
>>
>> So in principle we can cover all cases with an opaque struct that
>> contains, as its first member, an is_initialized flag.  The tricky
>> part is how to initialize the rest of the struct (containing the
>> underlying implementation-specific key).  For 1) and 3) it doesn't
>> matter--it can just be zero.  For 2) it's trickier because there's no
>> defined constant value to initialize a pthread_key_t to.
>>
>> Per Nick's suggestion this can be worked around by relying on C99's
>> initialization semantics. Per [3] section 6.7.8, clause 21:
>>
>> """
>> If there are fewer initializers in a brace-enclosed list than there
>> are elements or members of an aggregate, or fewer characters in a
>> string literal used to initialize an array of known size than there
>> are elements in the array, the remainder of the aggregate shall be
>> initialized implicitly the same as objects that have static storage
>> duration.
>> """
>>
>> How objects with static storage are initialized is described in the
>> previous page under clause 10, but in practice it boils down to what
>> you would expect: Everything is initialized to zero, including nested
>> structs and arrays.
>>
>> So as long as we can use this feature of C99 then I think that's the
>> best approach.
>
>
>
> I checked PEP 7 to see exactly which features we've added to the approved C
> dialect, and designated initialisers are already on the list:
> https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
>
> So I believe that would allow the initializer to be declared as something
> like:
>
> #define Py_tss_NEEDS_INIT {.is_initialized = false}

Great!  One could argue about whether or not the designated
initializer syntax also incorporates omitted fields, but it would seem
strange to insist that it doesn't.
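
A throwaway program can confirm that behaviour (illustration only; 
demo_tss_t merely stands in for the proposed Py_tss_t):

#include <stdbool.h>
#include <stdio.h>

typedef struct {
    bool is_initialized;
    unsigned long key;   /* stand-in for the platform's TLS key type */
} demo_tss_t;

#define DEMO_NEEDS_INIT {.is_initialized = false}

int main(void)
{
    /* Members omitted from the designated initializer get the same
     * (zero) initialization as static-storage objects, per C99 6.7.8. */
    demo_tss_t t = DEMO_NEEDS_INIT;
    printf("%d %lu\n", (int)t.is_initialized, t.key);  /* prints: 0 0 */
    return 0;
}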

Have a happy new year,

Erik


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-29 Thread Erik Bray
On Wed, Dec 21, 2016 at 5:07 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 21 December 2016 at 20:01, Erik Bray <erik.m.b...@gmail.com> wrote:
>>
>> On Wed, Dec 21, 2016 at 2:10 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
>> > Option 2: Similar to option 1, but using a custom type alias, rather
>> > than
>> > using a C99 bool directly
>> >
>> > The closest API we have to these semantics at the moment would be
>> > PyGILState_Ensure, so the following API naming might work for option 2:
>> >
>> > Py_ensure_t
>> > Py_ENSURE_NEEDS_INIT
>> > Py_ENSURE_INITIALIZED
>> >
>> > Respectively, these would just be aliases for bool, false, and true.
>> >
>> > And then modify the proposed PyThread_tss_create and PyThread_tss_delete
>> > APIs to accept a "Py_ensure_t *init_flag" in addition to their current
>> > arguments.
>>
>> That all sounds good--between the two option 2 looks a bit more explicit.
>>
>> Though what about this?  Rather than adding another type, the original
>> proposal could be changed slightly so that Py_tss_t *is* partially
>> defined as a struct consisting of a bool, with whatever the native TLS
>> key is.   E.g.
>>
>> typedef struct {
>> bool init_flag;
>> #if defined(_POSIX_THREADS)
>> pthreat_key_t key;
>> #elif defined (NT_THREADS)
>> DWORD key;
>> /* etc... */
>> } Py_tss_t;
>>
>> Then it's just taking Masayuki's original patch, with the global bool
>> variables, and formalizing that by combining the initialized flag with
>> the key, and requiring the semantics you described above for
>> PyThread_tss_create/delete.
>>
>> For Python's purposes it seems like this might be good enough, with
>> the more general purpose pthread_once-like functionality not required.
>
>
> Aye, I also thought of that approach, but talked myself out of it since
> there's no definable default value for pthread_key_t. However, C99 partial
> initialisation may deal with that for us (by zeroing the memory without
> actually assigning a typed value to it), and if it does, I agree it would be
> better to handle the initialisation flag automatically rather than requiring
> callers to do it.

I think I understand what you're saying here...  To be clear, let me
enumerate the three currently supported cases and how they're
affected:

1) CPython's TLS: Defines -1 as an uninitialized key (by fact of the
implementation--that the keys are integers starting from zero)
2) pthreads: Does not define an uninitialized default value for
keys, for reasons described at [1] under "Non-Idempotent Data Key
Creation".  I understand their reasoning, though I can't claim to know
specifically what they mean when they say that some implementations
would require the mutual-exclusion to be performed on
pthread_getspecific() as well.  I don't know that it applies here.
3) windows: The return value of TlsAlloc() is a DWORD (unsigned int)
and [2] states that its value should be opaque.

So in principle we can cover all cases with an opaque struct that
contains, as its first member, an is_initialized flag.  The tricky
part is how to initialize the rest of the struct (containing the
underlying implementation-specific key).  For 1) and 3) it doesn't
matter--it can just be zero.  For 2) it's trickier because there's no
defined constant value to initialize a pthread_key_t to.

Per Nick's suggestion this can be worked around by relying on C99's
initialization semantics. Per [3] section 6.7.8, clause 21:

"""
If there are fewer initializers in a brace-enclosed list than there
are elements or members of an aggregate, or fewer characters in a
string literal used to initialize an array of known size than there
are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage
duration.
"""

How objects with static storage are initialized is described in the
previous page under clause 10, but in practice it boils down to what
you would expect: Everything is initialized to zero, including nested
structs and arrays.

So as long as we can use this feature of C99 then I think that's the
best approach.



[1] 
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
[2] 
https://msdn.microsoft.com/en-us/library/windows/desktop/ms686801(v=vs.85).aspx
[3] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-21 Thread Erik Bray
On Wed, Dec 21, 2016 at 11:01 AM, Erik Bray <erik.m.b...@gmail.com> wrote:
> That all sounds good--between the two option 2 looks a bit more explicit.
>
> Though what about this?  Rather than adding another type, the original
> proposal could be changed slightly so that Py_tss_t *is* partially
> defined as a struct consisting of a bool, with whatever the native TLS
> key is.   E.g.
>
> typedef struct {
> bool init_flag;
> #if defined(_POSIX_THREADS)
> pthreat_key_t key;

*pthread_key_t* of course, though I wonder if that was a Freudian slip :)

> #elif defined (NT_THREADS)
> DWORD key;
> /* etc... */
> } Py_tss_t;
>
> Then it's just taking Masayuki's original patch, with the global bool
> variables, and formalizing that by combining the initialized flag with
> the key, and requiring the semantics you described above for
> PyThread_tss_create/delete.
>
> For Python's purposes it seems like this might be good enough, with
> the more general purpose pthread_once-like functionality not required.

Of course, that's not to say it might not be useful for some other
purpose, but then it's outside the scope of this discussion as long as
it isn't needed for TLS key initialization.


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-21 Thread Erik Bray
On Wed, Dec 21, 2016 at 2:10 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 21 December 2016 at 01:35, Masayuki YAMAMOTO <ma3yuki.8mam...@gmail.com>
> wrote:
>>
>> 2016-12-20 22:30 GMT+09:00 Erik Bray <erik.m.b...@gmail.com>:
>>>
>>> This is probably an implementation detail, but ISTM that even with
>>> PyThread_call_once, it will be necessary to reset any used once_flags
>>> manually in PyOS_AfterFork, essentially for the same reason the
>>> autoTLSkey is reset there currently...
>>
>>
>> Deleting the thread keys is done in the *_Fini functions, but the
>> Py_FinalizeEx function that calls the *_Fini functions doesn't terminate the
>> CPython interpreter. Furthermore, the source comments and documentation
>> describe reinitialization after calling Py_FinalizeEx. [1] [2] That is to
>> say, there is an implicit possibility of reinitialization at the process
>> level, contrary to the name "call_once". Therefore, if the CPython
>> interpreter continues to allow reinitialization, I'd suggest renaming the
>> call_once API to avoid misleading semantics (for example, safe_init or
>> check_init).
>
>
> Ouch, I'd missed that, and I agree it's not a negligible implementation
> detail - there are definitely applications embedding CPython out there that
> rely on being able to run multiple Initialize/Finalize cycles in the same
> process and have everything "just work". It also means using the
> "PyThread_*" prefix for the initialisation tracking aspect would be
> misleading, since the life cycle details are:
>
> 1. Create the key for the first time if it has never been previously set in
> the process
> 2. Destroy and reinit if Py_Finalize gets called
> 3. Destroy and reinit if a new subprocess is forked
>
> It also means we can't use pthread_once even in the pthread TLS
> implementation, since it doesn't provide those semantics.
>
> So I see two main alternatives here.
>
> Option 1: Modify the proposed PyThread_tss_create and PyThread_tss_delete
> APIs to accept a "bool *init_flag" pointer in addition to their current
> arguments.
>
> If *init_flag is true, then PyThread_tss_create is a no-op, otherwise it
> sets the flag to true after creating the key.
> If *init_flag is false, then PyThread_tss_delete is a no-op, otherwise it
> sets the flag to false after deleting the key.
>
> Option 2: Similar to option 1, but using a custom type alias, rather than
> using a C99 bool directly
>
> The closest API we have to these semantics at the moment would be
> PyGILState_Ensure, so the following API naming might work for option 2:
>
> Py_ensure_t
> Py_ENSURE_NEEDS_INIT
> Py_ENSURE_INITIALIZED
>
> Respectively, these would just be aliases for bool, false, and true.
>
> And then modify the proposed PyThread_tss_create and PyThread_tss_delete
> APIs to accept a "Py_ensure_t *init_flag" in addition to their current
> arguments.

That all sounds good--between the two option 2 looks a bit more explicit.

Though what about this?  Rather than adding another type, the original
proposal could be changed slightly so that Py_tss_t *is* partially
defined as a struct consisting of a bool, with whatever the native TLS
key is.   E.g.

typedef struct {
bool init_flag;
#if defined(_POSIX_THREADS)
pthreat_key_t key;
#elif defined (NT_THREADS)
DWORD key;
/* etc... */
} Py_tss_t;

Then it's just taking Masayuki's original patch, with the global bool
variables, and formalizing that by combining the initialized flag with
the key, and requiring the semantics you described above for
PyThread_tss_create/delete.

For Python's purposes it seems like this might be good enough, with
the more general purpose pthread_once-like functionality not required.
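
In rough outline, the create/delete semantics being described might look 
like this (a sketch under the above assumptions, not the final API):

/* No-op if the key was already created; otherwise create it and
 * record that fact in the embedded flag. */
int PyThread_tss_create(Py_tss_t *key)
{
    if (key->init_flag)
        return 0;
#if defined(_POSIX_THREADS)
    if (pthread_key_create(&key->key, NULL) != 0)
        return -1;
#endif
    key->init_flag = true;
    return 0;
}

/* No-op if the key was never created (or was already deleted). */
void PyThread_tss_delete(Py_tss_t *key)
{
    if (!key->init_flag)
        return;
#if defined(_POSIX_THREADS)
    pthread_key_delete(key->key);
#endif
    key->init_flag = false;
}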

Best,
Erik


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-19 Thread Erik Bray
On Mon, Dec 19, 2016 at 3:45 PM, Erik Bray <erik.m.b...@gmail.com> wrote:
> On Mon, Dec 19, 2016 at 1:11 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
>> On 17 December 2016 at 03:51, Antoine Pitrou <solip...@pitrou.net> wrote:
>>>
>>> On Fri, 16 Dec 2016 13:07:46 +0100
>>> Erik Bray <erik.m.b...@gmail.com> wrote:
>>> > Greetings all,
>>> >
>>> > I wanted to bring attention to an issue that's been languishing on the
>>> > bug tracker since last year, which I think would best be addressed by
>>> > changes to CPython's C-API.  The original issue is at
>>> > http://bugs.python.org/issue25658, but I have made an effort below in
>>> > a sort of proto-PEP to summarize the problem and the proposed
>>> > solution.
>>> >
>>> > I haven't written this up in the proper PEP format because I want to
>>> > see if the idea has some broader support first, and it's also not
>>> > clear to me whether C-API changes (especially to undocumented APIs)
>>> > even require their own PEP.
>>>
>>> This is a nice detailed write-up and I'm in favour of the proposal.
>>
>>
>> Likewise - we know the status quo isn't right, and the proposed change
>> addresses that. In reviewing the patch on the tracker, the one downside I've
>> found is that due to "pthread_key_t" being an opaque type with no defined
>> sentinel, the consuming code in _tracemalloc.c and pystate.c needed to add
>> separate boolean flag variables to track whether or not the key had been
>> created. (The pthread examples at
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
>> use pthread_once for a similar effect)
>>
>> I don't see any obvious way around that either, as even using a small struct
>> for native pthread TLS keys would still face the problem of how to
>> initialise the pthread_key_t field.
>
> Hmm...fair point that it's not pretty.  One way around it, albeit
> requiring more work/complexity, would be to extend this proposal to
> add a new function analogous to pthread_once--say--PyThread_call_once,
> and an associated Py_once_flag_t

Oops--fat-fingered a 'send' command before I finished.

So a workaround would be to add a PyThread_call_once function,
analogous to pthread_once.  Yet another interface one needs to
implement for a native thread implementation, but not too hard either.
For pthreads there's already an obvious analogue that can be wrapped
directly.  For other platforms that don't have a direct analogue a
(naive) implementation is still fairly simple: All you need in
Py_once_flag_t is a boolean flag with an associated mutex, and a
sentinel value analogous to PTHREAD_ONCE_INIT.
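
A naive sketch of that idea, using pthreads directly for brevity 
(hypothetical Py_* names):

#include <pthread.h>
#include <stdbool.h>

typedef struct {
    bool done;
    pthread_mutex_t mutex;
} Py_once_flag_t;

/* Sentinel value analogous to PTHREAD_ONCE_INIT. */
#define PY_ONCE_INIT {false, PTHREAD_MUTEX_INITIALIZER}

void PyThread_call_once(Py_once_flag_t *flag, void (*func)(void))
{
    pthread_mutex_lock(&flag->mutex);
    if (!flag->done) {
        func();   /* runs at most once per flag */
        flag->done = true;
    }
    pthread_mutex_unlock(&flag->mutex);
}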

Best,
Erik


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-19 Thread Erik Bray
On Mon, Dec 19, 2016 at 1:11 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 17 December 2016 at 03:51, Antoine Pitrou <solip...@pitrou.net> wrote:
>>
>> On Fri, 16 Dec 2016 13:07:46 +0100
>> Erik Bray <erik.m.b...@gmail.com> wrote:
>> > Greetings all,
>> >
>> > I wanted to bring attention to an issue that's been languishing on the
>> > bug tracker since last year, which I think would best be addressed by
>> > changes to CPython's C-API.  The original issue is at
>> > http://bugs.python.org/issue25658, but I have made an effort below in
>> > a sort of proto-PEP to summarize the problem and the proposed
>> > solution.
>> >
>> > I haven't written this up in the proper PEP format because I want to
>> > see if the idea has some broader support first, and it's also not
>> > clear to me whether C-API changes (especially to undocumented APIs)
>> > even require their own PEP.
>>
>> This is a nice detailed write-up and I'm in favour of the proposal.
>
>
> Likewise - we know the status quo isn't right, and the proposed change
> addresses that. In reviewing the patch on the tracker, the one downside I've
> found is that due to "pthread_key_t" being an opaque type with no defined
> sentinel, the consuming code in _tracemalloc.c and pystate.c needed to add
> separate boolean flag variables to track whether or not the key had been
> created. (The pthread examples at
> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
> use pthread_once for a similar effect)
>
> I don't see any obvious way around that either, as even using a small struct
> for native pthread TLS keys would still face the problem of how to
> initialise the pthread_key_t field.

Hmm...fair point that it's not pretty.  One way around it, albeit
requiring more work/complexity, would be to extend this proposal to
add a new function analogous to pthread_once--say--PyThread_call_once,
and an associated Py_once_flag_t


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-19 Thread Erik Bray
On Sat, Dec 17, 2016 at 8:21 AM, Stephen J. Turnbull
<turnbull.stephen...@u.tsukuba.ac.jp> wrote:
> Erik Bray writes:
>
>  > Abstract
>  > 
>  >
>  > The proposal is to add a new Thread Local Storage (TLS) API to CPython
>  > which would supersede use of the existing TLS API within the CPython
>  > interpreter, while deprecating the existing API.
>
> Thank you for the analysis!

And thank *you* for the feedback!

> Question:
>
>  > Further, the old PyThread_*_key* functions will be marked as
>  > deprecated.
>
> Of course, but:
>
>  > Additionally, the pthread implementations of the old
>  > PyThread_*_key* functions will either fail or be no-ops on
>  > platforms where sizeof(pythead_t) != sizeof(int).
>
> Typo "pythead_t" in last line.

Thanks, yes, that was supposed to be pthread_key_t of course.  I think
I had a few other typos too.

> I don't understand this.  I assume that there are no such platforms
> supported at present.  I would think that when such a platform becomes
> supported, code supporting "key" functions becomes unsupportable
> without #ifdefs on that platform, at least directly.  So you should
> either (1) raise UnimplementedError, or (2) provide the API as a
> wrapper over the new API by making the integer keys indexes into a
> table of TSS'es, or some such device.  I don't understand how (3)
> "make it a no-op" can be implemented for PyThread_create_key -- return
> 0 or -1?  That would only work if there's a failure return status like
> 0 or -1, and it seems really dangerous to me since in general a lot of
> code doesn't check status even though it should.  Even for code
> checking the status, the error message will be suboptimal ("creation
> failed" vs. "unimplemented").

Masayuki already explained this downthread I think, but I could have
probably made that section more precise.  The point was that
PyThread_create_key should immediately return -1 in this case.  This
is just a subtle difference over the current situation, which is that
PyThread_create_key succeeds, but the key is corrupted by being cast
to an int, so that later calls to PyThread_set_key_value and the like
fail unexpectedly.  The point is that PyThread_create_key (and we're
only talking about the pthread implementation thereof, to be clear)
must fail immediately if it can't work correctly.

#ifdefs on the platform would not be necessary--instead, Masayuki's
patch adds a feature check in configure.ac for sizeof(int) ==
sizeof(pthread_key_t).  It should be noted that even this check is not
100% perfect, as on Linux pthread_key_t is an unsigned int, and so
technically can cause Python's signed int key to overflow, but there's
already an explicit check for that (which would be kept), and it's
also a very unlikely scenario.

> I gather from references to casting pthread_key_t to unsigned int and
> back that there's probably code that does this in ways making (2) too
> dangerous to support.  If true, perhaps that should be mentioned here.

It's not necessarily too dangerous, so much as not worth the trouble,
IMO.  Simpler to just provide, and immediately use the new API and
make the old one deprecated and explicitly not supported on those
platforms where it can't work.

Thanks,
Erik


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-19 Thread Erik Bray
On Sun, Dec 18, 2016 at 12:10 AM, Masayuki YAMAMOTO wrote:
> 2016-12-17 18:35 GMT+09:00 Stephen J. Turnbull:
>>
>> I don't understand this.  I assume that there are no such platforms
>> supported at present.  I would think that when such a platform becomes
>> supported, code supporting "key" functions becomes unsupportable
>> without #ifdefs on that platform, at least directly.  So you should
>> either (1) raise UnimplementedError, or (2) provide the API as a
>> wrapper over the new API by making the integer keys indexes into a
>> table of TSS'es, or some such device.  I don't understand how (3)
>> "make it a no-op" can be implemented for PyThread_create_key -- return
>> 0 or -1?  That would only work if there's a failure return status like
>> 0 or -1, and it seems really dangerous to me since in general a lot of
>> code doesn't check status even though it should.  Even for code
>> checking the status, the error message will be suboptimal ("creation
>> failed" vs. "unimplemented").
>
>
> PyThread_create_key has always required the user to check the return value,
> since when key creation fails it returns -1 instead of a valid key value.
> Therefore, my patch changes PyThread_create_key to always return -1 on
> platforms where the key cannot safely be cast to int, so the current API
> never returns a valid key value on those platforms.  The advantage is that
> the function specifications don't change and there is no effect on supported
> platforms.  Hence, that is why the API doesn't raise any exception.
>
> Idea (2) could enable the current API on those specific platforms.  If it
> were simple, I'd have liked to select it.  However, bringing the current API
> to those platforms on top of native TLS means duplicating the implementation
> that manages keys, and that's ugly (the same reason given in Erik's draft,
> in the last item of Rejected Ideas).  Thus, I gave up on keeping the feature
> and decided to implement a "no-op", delegating error handling to API users.

Yep--I think it speaks to the sensibleness of that decision that I
pretty much read your mind :)


[Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-16 Thread Erik Bray
with POSIX (and in
fact makes invalid assumptions about pthreads).


Rationale for Proposed Solution
===============================

The use of an opaque type (Py_tss_t) to key TLS values allows the API
to be compatible, at least in this regard, with CPython's internal TLS
implementation, as well as all present (NT and POSIX) and future
(C11?) native TLS implementations supported by CPython, as it allows
the definition of Py_tss_t to depend on the underlying implementation.

A new API must be introduced, rather than changing the function
signatures of the current API, in order to maintain backwards
compatibility.  The new API also more clearly groups together these
related functions under a single name prefix, "PyThread_tss_".  The
"tss" in the name stands for "thread-specific storage", and was
influenced by the naming and design of the "tss" API that is part of
the C11 threads API.  However, this is in no way meant to imply
compatibility with or support for the C11 threads API, or signal any
future intention of supporting C11--it's just the influence for the
naming and design.

Changing PyThread_create_key to immediately return a failure status on
systems using pthreads where sizeof(int) != sizeof(pthread_key_t) is
intended as a sanity check:  Currently, PyThread_create_key will
report initial success on such systems, but attempts to use the
returned key are likely to fail.  Although in practice this failure
occurs quickly during interpreter startup, it's better to fail
immediately at the source of failure (PyThread_create_key) rather than
sometime later when use of an invalid key is attempted.


Rejected Ideas
==============

* Do nothing: The status quo is fine because it works on Linux, and
platforms wishing to be supported by CPython should follow the
requirements of PEP-11.  As explained above, while this would be a
fair argument if CPython were being asked to make changes to
support particular quirks of a specific platform, in this case the
platforms in question are only asking to fix a quirk of CPython that
prevents it from being used to its full potential on those platforms.
The fact that the current implementation happens to work on Linux is a
happy accident, and there's no guarantee that this will never change.

* Affected platforms should just configure Python --without-threads:
This is a possible temporary workaround to the issue, but only that.
Python should not be hobbled on affected platforms despite them being
otherwise perfectly capable of running multi-threaded Python.

* Affected platforms should not define Py_HAVE_NATIVE_TLS: This is a
more acceptable alternative to the previous idea, and in fact there is
a patch to do just that [2].  However, CPython's internal TLS
implementation being "slower and clunkier" in general than native
implementations still needlessly hobbles performance on affected
platforms.  At least one other module (tracemalloc) is also broken if
Python is built without Py_HAVE_NATIVE_TLS.

* Keep the existing API, but work around the issue by providing a
mapping from pthread_key_t values to ints.  A couple of attempts were
made at this [3] [4], but this only injects needless complexity and
overhead into performance-critical code on platforms that are not
currently affected by this issue (such as Linux); a toy sketch of the
bookkeeping involved follows below.  Even if use of this workaround
were made conditional on platform compatibility, it introduces
platform-specific code to maintain, and it still has the problem of
the previous rejected ideas of needlessly hobbling performance on
affected platforms.
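
To make that concrete, here is a toy Python model of the rejected
key-table workaround (the names are hypothetical; this only sketches
the bookkeeping involved, not the actual C implementation, and locking
is omitted):

    import itertools

    _counter = itertools.count()
    _key_table = {}  # small int key -> opaque native key

    def create_key(native_key):
        # Every key creation must maintain an extra mapping...
        key = next(_counter)
        _key_table[key] = native_key
        return key

    def lookup_key(key):
        # ...and every TLS access pays an extra lookup/indirection.
        return _key_table[key]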


Implementation
==============

An initial version of a patch [5] is available on the bug tracker for
this issue.  The patch was proposed and written by Masayuki Yamamoto,
who should be considered a co-author of this proto-PEP, though I have
not consulted directly with him in writing this.  If he's reading, he
should chime in in case I've misrepresented anything.


If you've made it this far, thanks for reading and thank you for your
consideration,

Erik

[1] https://bugs.python.org/msg116292
[2] http://bugs.python.org/file45548/configure-pthread_key_t.patch
[3] http://bugs.python.org/file44269/issue25658-1.patch
[4] http://bugs.python.org/file44303/key-constant-time.diff
[5] http://bugs.python.org/file45763/pythread-tss.patch


Re: [Python-ideas] if-statement in for-loop

2016-10-03 Thread Erik

Hi,

On 11/09/16 10:36, Dominik Gresch wrote:

So I asked myself if a syntax as follows would be possible:

for i in range(10) if i != 5:
    body


I've read the thread and I understand the general issues with making the 
condition part of the expression.


However, what if this wasn't part of changing the expression syntax but 
changing the declarative syntax instead to remove the need for a newline 
and indent after the colon? I'm fairly sure this will have been 
suggested and shot down in the past, but I couldn't find any obvious 
references so I'll say it (again?).


The expression suggested could be spelled:

for i in range(10): if i != 5:
    body

So, if a colon followed by another suite is equivalent to the same
construct but without the INDENT (and the corresponding DEDENT then
unwinds up to the point of the first keyword), then we get something
that's pretty much as succinct as Dominik suggested.


Of course, we then might get:

for i in myweirdobject: if i != 5: while foobar(i) > 10: while frob(i+1) < 99:
    body

... which is hideous. But is it actually _likely_?
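
For comparison, a minimal sketch of how the filtered loop is already
spellable in current Python (print(i) standing in for the body):

    # Skip the unwanted value explicitly...
    for i in range(10):
        if i == 5:
            continue
        print(i)

    # ...or filter with a generator expression.
    for i in (x for x in range(10) if x != 5):
        print(i)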

E.


Re: [Python-ideas] if-statement in for-loop

2016-09-27 Thread Erik Bray
On Tue, Sep 27, 2016 at 5:33 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 28 September 2016 at 00:55, Erik Bray <erik.m.b...@gmail.com> wrote:
>> On Sun, Sep 11, 2016 at 12:28 PM, Bernardo Sulzbach
>> <mafagafogiga...@gmail.com> wrote:
>>> On 09/11/2016 06:36 AM, Dominik Gresch wrote:
>>>>
>>>> So I asked myself if a syntax as follows would be possible:
>>>>
>>>> for i in range(10) if i != 5:
>>>>     body
>>>>
>>>> Personally, I find this extremely intuitive since this kind of
>>>> if-statement is already present in list comprehensions.
>>>>
>>>> What is your opinion on this? Sorry if this has been discussed before --
>>>> I didn't find anything in the archives.
>>>>
>>>
>>> I find it interesting.
>>>
>>> I think that this will likely take up too many columns in more convoluted
>>> loops such as
>>>
>>> for element in collection if is_pretty_enough(element) and ...:
>>>     ...
>>>
>>> However, this "problem" is already faced by list comprehensions, so it is
>>> not a strong argument against your idea.
>>
>> Sorry to re-raise this thread--I'm inclined to agree that the case
>> doesn't really warrant new syntax.  I just wanted to add that I think
>> the very fact that this syntax is supported by list comprehensions is
>> an argument *in its favor*.
>>
>> I could easily see a Python newbie being confused that they can write
>> "for x in y if z" inside a list comprehension, but not in a bare
>> for-statement.  Sure they'd learn quickly enough that the filtering
>> syntax is unique to list comprehensions.  But to anyone who doesn't
>> know the historical progression of the Python language that would seem
>> highly arbitrary and incongruous I would think.
>>
>> Just $0.02 USD from a pedagogical perspective.
>
> This has come up before, and it's considered a teaching moment
> regarding how the comprehension syntax actually works: it's an
> *arbitrarily deep* nested chain of if statements and for statements.
>
> That is:
>
>   [f(x,y,z) for x in seq1 if p1(x) for y in seq2 if p2(y) for z in seq3 if p3(z)]
>
> can be translated mechanically to the equivalent nested statements
> (with the only difference being that the loop variables leak due to the
> missing implicit scope):
>
> result = []
> for x in seq1:
>     if p1(x):
>         for y in seq2:
>             if p2(y):
>                 for z in seq3:
>                     if p3(z):
>                         result.append(f(x, y, z))
>
> So while the *most common* cases are a single for loop (map
> equivalent), or a single for loop and a single if statement (filter
> equivalent), they're not the only forms folks may encounter in the
> wild.

Thanks for pointing this out, Nick.  Then, following my own logic, it
would be desirable to also allow the nested for loop syntax of list
comprehensions outside them as well.  That's a slippery slope to
incomprehensibility (they're bad enough in list comprehensions, though
occasionally useful).

This is a helpful way to think about list comprehensions though--I'll
remember it next time I teach them.
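
As a quick sanity check of that mechanical translation, a minimal
sketch with made-up sequences and predicates:

    seq1, seq2 = range(4), "ab"
    p1 = lambda x: x != 2

    comp = [(x, y) for x in seq1 if p1(x) for y in seq2]

    result = []
    for x in seq1:
        if p1(x):
            for y in seq2:
                result.append((x, y))

    assert comp == result  # same elements, in the same order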

Thanks,
Erik


Re: [Python-ideas] if-statement in for-loop

2016-09-27 Thread Erik Bray
On Sun, Sep 11, 2016 at 12:28 PM, Bernardo Sulzbach
<mafagafogiga...@gmail.com> wrote:
> On 09/11/2016 06:36 AM, Dominik Gresch wrote:
>>
>> So I asked myself if a syntax as follows would be possible:
>>
>> for i in range(10) if i != 5:
>>     body
>>
>> Personally, I find this extremely intuitive since this kind of
>> if-statement is already present in list comprehensions.
>>
>> What is your opinion on this? Sorry if this has been discussed before --
>> I didn't find anything in the archives.
>>
>
> I find it interesting.
>
> I think that this will likely take up too many columns in more convoluted
> loops such as
>
> for element in collection if is_pretty_enough(element) and ...:
>     ...
>
> However, this "problem" is already faced by list comprehensions, so it is
> not a strong argument against your idea.

Sorry to re-raise this thread--I'm inclined to agree that the case
doesn't really warrant new syntax.  I just wanted to add that I think
the very fact that this syntax is supported by list comprehensions is
an argument *in its favor*.

I could easily see a Python newbie being confused that they can write
"for x in y if z" inside a list comprehension, but not in a bare
for-statement.  Sure they'd learn quickly enough that the filtering
syntax is unique to list comprehensions.  But to anyone who doesn't
know the historical progression of the Python language that would seem
highly arbitrary and incongruous I would think.

Just $0.02 USD from a pedagogical perspective.

Erik


Re: [Python-ideas] real numbers with SI scale factors

2016-08-31 Thread Erik Bray
On Tue, Aug 30, 2016 at 5:48 AM, Ken Kundert
<python-id...@shalmirane.com> wrote:
> Erik,
> One aspect of astropy.units that differs significantly from what I am
> proposing is that with astropy.units a user would explicitly specify the scale
> factor along with the units, and that scale factor would not change even if
> the value became very large or very small. For example:
>
> >>> from astropy import units as u
> >>> d_andromeda = 7.8e5 * u.parsec
> >>> print(d_andromeda)
> 780000.0 pc
>
> >>> d_sun = 93e6*u.imperial.mile
> >>> print(d_sun.to(u.parsec))
> 4.850441695494146e-06 pc
>
> >>> print(d_andromeda.to(u.kpc))
> 780.0 kpc
>
> >>> print(d_sun.to(u.kpc))
> 4.850441695494146e-09 kpc
>
> I can see where this can be helpful at times, but it kind of goes against the
> spirit of SI scale factors, where you are generally expected to 'normalize' the
> scale factor (use the scale factor that results in the digits presented before
> the decimal point falling between 1 and 999). So I would expect
>
> d_andromeda = 780 kpc
> d_sun = 4.8504 upc
>
> Is the normalization available in astropy.units and I just did not find it?
> Is there some reason not to provide the normalization?
>
> It seems to me that pre-specifying the scale factor might be preferred if one
> is generating data for a table and the magnitudes of all the values are known
> in advance to within 2-3 orders of magnitude.
>
> It also seems to me that if these assumptions were not true, then normalizing
> the scale factors would generally be preferred.
>
> Do you believe that?

Hi Ken,

I see what you're getting at, and that's a good idea.  There's also
nothing in the current implementation preventing it, and I think I'll
even suggest this to Astropy (with proper attribution)!  I think there
are reasons not to always do this, but it's a nice option to have.

The point being that nothing about this particular feature requires
special support from the language, unless I'm missing something
obvious.  And given that Astropy (or any other units library) is
third-party, chances are a feature like this will land in place a lot
faster than it would have any chance of showing up in Python :)
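
To make the idea concrete, here is a minimal pure-Python sketch of
such a normalization (the helper name and details are illustrative
only, not any existing astropy API):

    import math

    _PREFIXES = {-24: 'y', -21: 'z', -18: 'a', -15: 'f', -12: 'p',
                 -9: 'n', -6: 'u', -3: 'm', 0: '', 3: 'k', 6: 'M',
                 9: 'G', 12: 'T', 15: 'P', 18: 'E', 21: 'Z', 24: 'Y'}

    def normalize_si(value, unit):
        # Pick the power-of-1000 scale factor that leaves 1-3 digits
        # before the decimal point, then attach the matching SI prefix.
        if value == 0:
            return '0 ' + unit
        exp3 = 3 * int(math.log10(abs(value)) // 3)
        exp3 = max(-24, min(24, exp3))  # clamp to the defined prefixes
        return '{:g} {}{}'.format(value / 10**exp3, _PREFIXES[exp3], unit)

    print(normalize_si(7.8e5, 'pc'))                  # 780 kpc
    print(normalize_si(4.850441695494146e-06, 'pc'))  # 4.85044 upc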

Best,
Erik

> On Mon, Aug 29, 2016 at 03:05:50PM +0200, Erik Bray wrote:
>> Astropy also has a very powerful units package--originally derived
>> from pyunit I think but long since diverged and grown:
>>
>> http://docs.astropy.org/en/stable/units/index.html
>>
>> It was originally developed especially for astronomy/astrophysics use
>> and has some pre-defined units that many other packages don't have, as
>> well as support for logarithmic units like decibel and optional (and
>> customizable) unit equivalences (e.g. frequency/wavelength or
>> flux/power).
>>
>> That said, its power extends beyond astronomy and I heard through last
>> week's EuroScipy that even some biology people have been using it.
>> There's been some (informal) talk about splitting it out from Astropy
>> into a stand-alone package.  This is tricky since almost everything in
>> Astropy has been built around it (dimensional calculations are always
>> used where possible), but not impossible.
>>
>> One of the other big advantages of astropy.units is the Quantity class
>> representing scale+dimension values.  This is deeply integrated into
>> Numpy so that units can be attached to Numpy arrays, and all Numpy
>> ufuncs can operate on them in a dimensionally meaningful way.  The
>> needs for this have driven a number of recent features in Numpy.  This
>> is work that, unfortunately, could never be integrated into the Python
>> stdlib.


Re: [Python-ideas] A proposal to rename the term "duck typing"

2016-08-29 Thread Erik Bray
On Sun, Aug 28, 2016 at 7:41 PM, Bruce Leban <br...@leban.us> wrote:
>
>
> On Sunday, August 28, 2016, ROGER GRAYDON CHRISTMAN <d...@psu.edu> wrote:
>>
>>
>> We have a term in our lexicon "duck typing" that traces its origins, in
>> part to a quote along the lines of
>> "If it walks like a duck, and talks like a duck, ..."
>>
>> ...
>>
>> In that case, it would be far more appropriate for us to call this sort
>> of type analysis "witch typing"
>
>
> I believe the duck is out of the bag on this one. First the "duck test" that
> you quote above is over 100 years old.
> https://en.m.wikipedia.org/wiki/Duck_test So that's entrenched.
>
> Second this isn't a Python-only term anymore and language is notoriously
> hard to change prescriptively.
>
> Third I think the duck test is more appropriate than the witch test which
> involves the testers faking the results.

Agreed.

It's also fairly problematic given that you're deriving the term from
a sketch about witch hunts.  While the Monty Python sketch is
hilarious, and it's the ignorant mob that's the butt of the joke
rather than the "witch", this joke doesn't necessarily play well
universally, especially given that there are places today where women
are being killed for being "witches".

Best,
Erik


Re: [Python-ideas] real numbers with SI scale factors

2016-08-29 Thread Erik Bray
On Mon, Aug 29, 2016 at 3:05 PM, Erik Bray <erik.m.b...@gmail.com> wrote:
> On Mon, Aug 29, 2016 at 9:07 AM, Ken Kundert
> <python-id...@shalmirane.com> wrote:
>> On Mon, Aug 29, 2016 at 01:45:20PM +1000, Steven D'Aprano wrote:
>>> On Sun, Aug 28, 2016 at 08:26:38PM -0700, Brendan Barnwell wrote:
>>> > On 2016-08-28 18:44, Ken Kundert wrote:
>>> > >When working with a general purpose programming language, the above
>>> > >numbers become:
>>> > >
>>> > > 780kpc -> 7.8e+05
>>> [...]
>>>
>>> For the record, I don't know what kpc might mean. "kilo pico speed of
>>> light"? So I looked it up using units, and it is kilo-parsecs. That
>>> demonstrates that unless your audience is intimately familiar with the
>>> domain you are working with, adding units (especially units that aren't
>>> actually used for anything) adds confusion.
>>>
>>> Python is not a specialist application targeted at a single domain. It
>>> is a general purpose programming language where you can expect a lot of
>>> cross-domain people (e.g. a system administrator asked to hack on a
>>> script in a domain they know nothing about).
>>
>> I talked to an astrophysicist about your comments, and what she said was:
>> 1. She would love it if Python had built-in support for real numbers with SI
>>    scale factors.
>> 2. I told her about my library for reading and writing numbers with SI scale
>>    factors, and she was much less enthusiastic because using it would require
>>    convincing the rest of the group, which would be too much effort.
>> 3. She was amused by the "kilo pico speed of light" comment, but she was
>>    adamant that the fact that you, or some system administrator, does not
>>    understand what kpc means has absolutely no effect on her desire to use SI
>>    scale factors.  Her comment: I did not write it for him.
>> 4. She pointed out that the software she writes and uses is intended either
>>    for herself or other astrophysicists.  No system administrators involved.
>
> Astropy also has a very powerful units package--originally derived
> from pyunit I think but long since diverged and grown:
>
> http://docs.astropy.org/en/stable/units/index.html
>
> It was originally developed especially for astronomy/astrophysics use
> and has some pre-defined units that many other packages don't have, as
> well as support for logarithmic units like decibel and optional (and
> customizable) unit equivalences (e.g. frequency/wavelength or
> flux/power).
>
> That said, its power extends beyond astronomy and I heard through last
> week's EuroScipy that even some biology people have been using it.
> There's been some (informal) talk about splitting it out from Astropy
> into a stand-alone package.  This is tricky since almost everything in
> Astropy has been built around it (dimensional calculations are always
> used where possible), but not impossible.
>
> One of the other big advantages of astropy.units is the Quantity class
> representing scale+dimension values.  This is deeply integrated into
> Numpy so that units can be attached to Numpy arrays, and all Numpy
> ufuncs can operate on them in a dimensionally meaningful way.  The
> needs for this have driven a number of recent features in Numpy.  This
> is work that, unfortunately, could never be integrated into the Python
> stdlib.

I'll also add that syntactic support for units has rarely been an
issue in Astropy.  The existing algebraic rules for units work fine
with Python's existing order of operations.  It can be *nice* to be
able to write "1m" instead of "1 * m", but ultimately it doesn't add
much for clarity (and if really desired it could be handled with a
preparser--something I've considered adding for Astropy sources, via
codecs).
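
For instance, a minimal sketch of how units already compose under
ordinary operator precedence in astropy.units (assumes astropy is
installed; the numbers are made up):

    from astropy import units as u

    d = 780 * u.kpc               # multiplication attaches the unit
    t = 2.5 * u.Myr               # prefixed units come predefined
    v = (d / t).to(u.km / u.s)    # unit algebra follows normal precedence
    print(v)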

Best,
Erik


Re: [Python-ideas] real numbers with SI scale factors

2016-08-29 Thread Erik Bray
On Mon, Aug 29, 2016 at 9:07 AM, Ken Kundert wrote:
> On Mon, Aug 29, 2016 at 01:45:20PM +1000, Steven D'Aprano wrote:
>> On Sun, Aug 28, 2016 at 08:26:38PM -0700, Brendan Barnwell wrote:
>> > On 2016-08-28 18:44, Ken Kundert wrote:
>> > >When working with a general purpose programming language, the above
>> > >numbers become:
>> > >
>> > > 780kpc -> 7.8e+05
>> [...]
>>
>> For the record, I don't know what kpc might mean. "kilo pico speed of
>> light"? So I looked it up using units, and it is kilo-parsecs. That
>> demonstrates that unless your audience is intimately familiar with the
>> domain you are working with, adding units (especially units that aren't
>> actually used for anything) adds confusion.
>>
>> Python is not a specialist application targeted at a single domain. It
>> is a general purpose programming language where you can expect a lot of
>> cross-domain people (e.g. a system administrator asked to hack on a
>> script in a domain they know nothing about).
>
> I talked to an astrophysicist about your comments, and what she said was:
> 1. She would love it if Python had built-in support for real numbers with SI
>    scale factors.
> 2. I told her about my library for reading and writing numbers with SI scale
>    factors, and she was much less enthusiastic because using it would require
>    convincing the rest of the group, which would be too much effort.
> 3. She was amused by the "kilo pico speed of light" comment, but she was
>    adamant that the fact that you, or some system administrator, does not
>    understand what kpc means has absolutely no effect on her desire to use SI
>    scale factors.  Her comment: I did not write it for him.
> 4. She pointed out that the software she writes and uses is intended either
>    for herself or other astrophysicists.  No system administrators involved.

Astropy also has a very powerful units package--originally derived
from pyunit I think but long since diverged and grown:

http://docs.astropy.org/en/stable/units/index.html

It was originally developed especially for astronomy/astrophysics use
and has some pre-defined units that many other packages don't have, as
well as support for logarithmic units like decibel and optional (and
customizable) unit equivalences (e.g. frequency/wavelength or
flux/power).

That said, its power extends beyond astronomy and I heard through last
week's EuroScipy that even some biology people have been using it.
There's been some (informal) talk about splitting it out from Astropy
into a stand-alone package.  This is tricky since almost everything in
Astropy has been built around it (dimensional calculations are always
used where possible), but not impossible.

One of the other big advantages of astropy.units is the Quantity class
representing scale+dimension values.  This is deeply integrated into
Numpy so that units can be attached to Numpy arrays, and all Numpy
ufuncs can operate on them in a dimensionally meaningful way.  The
needs for this have driven a number of recent features in Numpy.  This
is work that, unfortunately, could never be integrated into the Python
stdlib.