Re: [Python-ideas] Range and slice syntax

2018-11-13 Thread Vladimir Filipović
On Mon, Nov 12, 2018 at 4:43 PM Nicholas Harrison wrote:
> Only when this is called (implicitly or explicitly) do checks for valid 
> objects and bounds occur. From my experience using slices, this is how they 
> work in that context too.

On reconsideration, I've found one more argument in favour of (at
least this aspect of?) the proposal: the slice.indices method, which
takes a sequence's length and returns a (start, stop, step) tuple
pinning down exactly which indices of such a sequence would be
"selected" by the slice. Not sure if it's supposed to be documented.
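
For instance (CPython built-ins, at the REPL):

>>> s = slice(None, None, -2)
>>> s.indices(10)                 # clamped (start, stop, step) for a length-10 sequence
(9, -1, -2)
>>> list(range(*s.indices(10)))   # the indices the slice would select
[9, 7, 5, 3, 1]
>>> 'abcdefghij'[s]
'jhfdb'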

So there is definitely precedent for "though slices in general are
primarily a syntactic construct and new container-like classes can
choose any semantics for indexing with them, the semantics
specifically in the context of sequences have a bit of a privileged
place in the language with concrete expectations, including strictly
integer (or None) attributes".


Re: [Python-ideas] Range and slice syntax

2018-11-11 Thread Vladimir Filipović
On Sun, Nov 11, 2018 at 6:59 AM Nicholas Harrison wrote:

> Any of the values may be omitted and in the slice context the behavior has no 
> changes from what it already does: start and stop default to the beginning or 
> end of the list depending on direction and the step defaults to 1.

Just to point out, with slices it's a bit more complicated than that currently.

The start, stop and step values each default to None.

When slice-indexing built-in (and, probably, all standard-library)
types, None values for start and stop are interpreted consistently
with what you described as defaults.
A None value for step is interpreted as 1; it's the None values for
start and stop whose interpretation depends on the step's sign,
falling back to the beginning or the end of the sequence accordingly.
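
A quick illustration with a built-in list:

>>> nums = list(range(10))
>>> nums[2:None:None]    # None stop and step behave like the spelled-out defaults
[2, 3, 4, 5, 6, 7, 8, 9]
>>> nums[None:None:-3]   # with a negative step, the None start/stop fall back to the other ends
[9, 6, 3, 0]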

--

In real life I've found a use for non-integer slice objects, and been
happy that Python allowed me to treat the slice as a purely syntactic
construct whose semantics (outside builtins) are not fixed.

My case was an interface to an external sparse time-series store, and
it was easy to make the objects indexable with [datetime1 : datetime2
: timedelta], with Nones handled sensibly, etc.
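
To make that concrete, here's a stripped-down sketch of the kind of
interface I mean (the class and its internals are made up for
illustration; the real store client is more involved):

from datetime import datetime, timedelta

class SparseSeries:
    """Toy stand-in for a client of an external sparse time-series store."""

    def __init__(self, points):
        self._points = sorted(points)            # (datetime, value) pairs

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("index with [start : stop : step] of datetimes")
        start = key.start or self._points[0][0]
        stop = key.stop or self._points[-1][0] + timedelta(microseconds=1)
        # key.step (a timedelta) could drive resampling; ignored in this toy.
        return [(t, v) for t, v in self._points if start <= t < stop]

series = SparseSeries([(datetime(2018, 11, d), d) for d in (1, 5, 9)])
series[datetime(2018, 11, 2):datetime(2018, 11, 10)]
# -> [(datetime(2018, 11, 5), 5), (datetime(2018, 11, 9), 9)]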

(The primary intended use was in a REPL in a data-science context, so
if your first thought was a doubt about whether that syntax is neat or
abusive, please compare it to numpy or pandas idioms, not to
collection classes you use in server or application code.)

If this had not been syntactically possible, it would not have been a
great pain to have to work around it, but now it's existing code and I
can imagine other existing projects adapting the slice syntax to their
own needs. At first blush, it seems like your proposal would give
slices enough compulsory semantics to break some of such existing code
- maybe even numpy itself.

(FWIW, I've also occasionally had a need for non-integer ranges, and
chafed at having to implement or install them. I've also missed
hashable slices in real life, because functools.lru_cache.)

--

(Note I'm just a random person commenting on the mailing list, not
anybody with any authority or influence.)

I find this recurring idea of unifying slices and ranges seductive.
But it would take a lot more shaking-out to make sure the range
semantics can be vague-ified enough that they don't break non-integer
slice usage.

Also, I could imagine some disagreements about exactly how much
non-standard slice usage should be protected from breakage. Someone
could make the argument that _some_ objects as slice parameters are
just abuse and no sane person should have used them in the first
place. ("Really, slicing with [int : [[sys], ...] : __import__]? We
need to take care to not break THAT too?")


Re: [Python-ideas] Implementing a set of operation (+, /, - *) on dict consistent with linearAlgebrae

2018-10-31 Thread Vladimir Filipović
Julien, would I be correct if I summarized the changes you have in
mind like this:

for dictionaries d1 and d2,
non-Mapping ("scalar") sc,
binary operation ⊛,
and unary operation 𝓊 (such as negation or abs()):

d1 ⊛ sc == {k: (v ⊛ sc) for k, v in d1.items()}
sc ⊛ d1 == {k: (sc ⊛ v) for k, v in d1.items()}
𝓊(d1) == {k: 𝓊(v) for k, v in d1.items()}
d1 ⊛ d2 == {k: (d1[k] ⊛ d2[k]) for k in d1.keys() & d2.keys()}
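
Or, spelled out as plain Python (just to make the summary concrete,
not a proposed implementation):

import operator

def dict_scalar(d1, sc, op=operator.add):
    # d1 ⊛ sc: apply the operation value-wise against the scalar
    return {k: op(v, sc) for k, v in d1.items()}

def dict_dict(d1, d2, op=operator.add):
    # d1 ⊛ d2: operate key-wise, keeping only keys present in both dicts
    return {k: op(d1[k], d2[k]) for k in d1.keys() & d2.keys()}

def dict_unary(d1, op=operator.neg):
    # 𝓊(d1): apply the unary operation to every value
    return {k: op(v) for k, v in d1.items()}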


Re: [Python-ideas] Multi Statement Lambdas

2018-10-28 Thread Vladimir Filipović
On Tue, Oct 23, 2018 at 1:16 PM Chris Angelico wrote:

> a lambda function should be treated as a block of code
> inside another function, and not as its own entity. You don't give a
> name to the block of code between "if" and "else" ...

Very well put! Thanks.

> Currently, lambda functions are permitted
> to mutate objects in surrounding scopes, but not to rebind them. Well,
> actually, PEP 572 might well be the solution there, but then you
> reopen all that lovely controversy...

I don't think I see how PEP 572 could help solve even a part of it. I
mean, I get it enables expressions in general to rebind names in the
surrounding scope, but it seems (I'm re-reading the scoping section
right now) that lambdas are in effect excluded from that; that any
`x := 1` inside a lambda necessarily refers to a local `x`.
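
That is, as far as I can tell, under 3.8 this would leave the outer
name untouched:

x = 0
f = lambda: (x := 1)   # the := target is local to the lambda's own scope
f()                    # returns 1 ...
print(x)               # ... but the outer x is still 0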

> I mentioned abusing class syntax as an alternative solution. This
> would require a new API, but it wouldn't be hard to write a wrapper.
> ...
> In some contexts, that would be every bit as good as a lambda-based
> solution. To compete, the proposed lambda syntax would have to be
> better than all these - by no means impossible, but it's a target to
> aim at.

Okay, that's... indeed looking abusive :)

In situations where it's applicable, it's arguably no worse than a
statement lambda for points 2-5 (placement, code locality, naming,
natural code inside it).

But in terms of "pomp and ceremony" (point 1) it's a bit of a step
backwards compared to even writing plain named functions: it calls for
multiple keywords and compulsory newlines for something very simple
that shouldn't visually command that much attention.

Minor point, but understanding this construct (when reading code) also
involves a bit of a learning curve, which statement lambdas wouldn't.

And it's only applicable with keyword arguments.

I understand you expect real examples. `Thread` and
`websocket.WebSocketApp` with their keyword arguments were genuine
examples from my experience, whereas I couldn't say I've ever felt a
need to pass a statement-lambda to `map` or `reduce`. So maybe that
disqualifies the positional-arguments objection against this solution,
but the verbosity complaint stands.


Re: [Python-ideas] Multi Statement Lambdas

2018-10-23 Thread Vladimir Filipović
Chris, I'm happy to work with you to hammer out comparisons of various
solutions.

But I can't take on the role of an advocate for "multi-statement lambdas".
I don't even understand what precisely that covers, since we don't have
uni-statement lambdas.
_If that role would be needed for this discussion, the rest of what I'm
about to write may be a waste of your time!_

I got involved in this thread because examples of something were asked for,
and I had a couple. I could defend a claim like "some _limited set_ of
statements inside lambdas would be useful and pythonic", if that's a useful
conversation to have.

(My view BTW is that _indented compound statements_ in lambdas would be
_irreducibly_ unpythonic, and the only way that can change is if the
concept of pythonicity happens to drift over time.)

--

> Next step: Offer a variety of alternate
> syntaxes that *do* currently work, and then the proposed
> multi-statement lambda, and show how the current syntaxes are all
> ugly.

Good, but we may have a disconnect about which "ugliness"/problem is being
solved.

The central point of Python's lambdas, for me, is that when one needs to
refer to a _very simple_, _once-off_ function, lambdas _make the code
better._
The ways they make it better go beyond saving vertical space and can be
subtle: they improve the locality of code by allowing the function's
definition to be exactly in the place where it's used; a lot like the
difference between (in pseudocode) `output("Hello, world")` and having to
say `String hw = new String("Hello, world"); output(hw)`.


So, what does the existing lambda feature offer in those situations?

1. Virtually no pomp and ceremony (one keyword, sometimes needs
parentheses, no compulsory newlines).

2. Relieves me from having to decide on the place for the definition (right
before use? at the beginning of the using function/scope? right before the
using function/scope?). This could be resolved by just having a convention,
but I note there isn't an existing convention ("Schelling point") for the
just-define-a-named-function approach.

3. That code-locality thing. It relieves me as reader from having to
mentally connect its definition to its usage, and from having to deduce
that it's not also used elsewhere.

4. Relieves me from having to come up with a one-use name. No, `foo` isn't
as good.

5. (I shouldn't have to include this but) Expressions don't lose clarity by
being moved into a lambda. Past the keyword itself, the contents of the
lambda look exactly like (= as simple as) what I would write in non-lambda
Python code. I don't need to contort them into special lambda-specific
constructs.

None of those things are _universally_ desirable! But the existing lambda
lets me _choose_ that in some particular case (very simple, once-off
function) they do make the code better, so I can get them.


Except that sometimes I need a _very simple_, _once-off_ function that
would do something that's a statement. It seems like a very similar
situation, the concept doesn't seem antithetical to the philosophy of
Python's existing lambdas, so it's a bit frustrating that I can't choose to
use the same feature and reap those same benefits.

Other times, I want the lambda to do a couple of things in sequence. No ifs
or try-finallys, just one thing with its side-effects, followed by the
other. This is already doable as evaluation of a tuple (if the things are
expressions), and that's not _too_ ugly, but it would become more direct
(as in quality #5 above) if it was expressible as a pair of
expression-statements.
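
For example, something along these lines (both callables are made up):

# Today: sequence two side-effecting calls by evaluating a tuple
# and discarding the result.
callback = lambda event: (log_event(event), update_progress_bar())

A pair of expression-statements would say the same thing without the
tuple trick.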

--

Okay now, what does the decorator solution do in that WebSocket situation
that just defining named functions doesn't already?

I suppose it helps somewhat with property #2 above (finding a place for the
definition).
Outside of that list, it lets us shorten the constructor call.

Am I missing something? Or am I moving the goalposts - were you explicitly
trying to solve something else?

--

Let me try to anticipate one more existing solution (the best I came up
with):

Where the statement is about updating counters or similar state variables,
we can make them items in a dict (or collections.Counter) instead, and
update them in a lambda via dict.update().

It's not terrible, but it means all other references to the state variable
must change in a way that makes them less clear. ("Why are they using this
dict at all? Ohh, because lambda.")

And having to replace `c += 1` with `d.update(c = d['c'] + 1)` or
`counter.update(c=1)` is still ugly.
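
A minimal sketch of that workaround (names made up):

import collections

state = collections.Counter()                 # was: a plain `c = 0` variable

on_open = lambda *args: state.update(c=1)     # was: `c += 1`
# ...and every other read of `c` has to become state['c'] instead.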


Re: [Python-ideas] Add closing and iteration to threading.Queue

2018-10-22 Thread Vladimir Filipović
Nathaniel, thank you for the pointer to Trio.
Its approach seems very robust. I'm relieved to see that a solution so
fundamentally rebuilt has also settled on very similar semantics for
its `.close_put()`.

I think your `.clone()` idiom is very clear when the communication
objects are treated as distinct endpoints. Something with similar
effects on closing (not necessarily similar in idiom) would probably
be a neat enhancement to the standard Queue, though if I was making
that, I'd do it in an external package.

--

Antoine, regarding multiprocessing.Queue:

The similarity of meaning behind closing that I was getting at is that
mp.Q.close() means "I am done writing to this queue, and I don't care
about the rest of you", whereas the proposed meaning of q.Q.close() is
"Listen up, we are all done writing to this queue". I don't know yet
that this difference necessarily creates a true incompatibility.

That the effects (in terms of eager OS-resource cleanup) are different
shouldn't be a problem in itself - every implementation does the right
thing for itself.

--

On Mon, Oct 22, 2018 at 2:03 AM Terry Reedy wrote:
> The proposed close method would only half-close the queue: closed to
> puts, open to gets (but perhaps close completely when the last item is
> gotten.)

In other words: in this proposal, there is no such thing as "closed
for retrieval". A closed queue means exactly that it's closed for
insertion.

Retrieval becomes permanently impossible once the queue is closed and
exhausted, and that's a condition that get() must treat correctly and
usefully, but calling that condition "closed / completely closed /
closed for retrieval" would muddle up the terminology.

In the proposed implementation I've called it "exhausted", a name I've
picked up god-knows-when and from god-knows-where, but it seemed
reasonable.

--

Regarding sentinels in general: They are a limited-purpose solution,
and this proposal should make them unnecessary in 99% of the cases.

Firstly, they only naturally apply to FIFO queues. You could hack your
use of LIFO and priority queues to also rely on sentinels, but it's
very kludgey in the general cases, not a natural fit, and not
generalizable to user-created children of Queue (which Queue otherwise
explicitly aspires to support).

Secondly, they only work when the producer is the one driving the flow
and notifying the consumer that "no more is forthcoming". They don't
work when the producer is the one who needs to be notified.

Thirdly, they're a potential cause of deadlocks when the same threads
act as both producers and consumers. (E.g. in a parallelized
breadth-first-search.) I'm sure this is the circular flow that
Nathaniel was referring to, but I'll let him detail it more or correct
me.

Fourthly, they don't make it easy to query the Queue about whether
it's closed. This probably isn't a big deal admittedly.

Sure, when sentinels are adequate, they're adequate. This proposal
aims to be more general-purpose than that.


Re: [Python-ideas] Multi Statement Lambdas

2018-10-21 Thread Vladimir Filipović
On Mon, Oct 22, 2018 at 12:52 AM Steven D'Aprano wrote:
> > def inc_counter():
> > counter += 1
>
> I don't think that's a real working example.
> ...
> You need to declare counter and sum as global variables.

Good catch!
None of the examples were real, in the sense of being copied directly
from code in actual use.
They were synthesized to illustrate a point while being as brief as possible.

I'm not even really advocating for multi-statement (or statement)
lambdas BTW, I was just answering the question of when one might even
want an anonymous function that can't be expressed using the current
lambda syntax.

> It's not the naming that would make it turn kludgy, but the use of global
> variables. If you had three websockets, would you want them all to share
> the same counter?

No, the naming would turn it kludgey precisely because I'd want
different functions. If there's only one "on_open", it suggests a
natural name for itself. If there are multiple, they can either be
called "on_open1", "on_open2"... or I can keep reusing the same name
in the same scope but that would make the code less clear, not more.

> You've fallen straight into the classic eager versus late binding of
> closures Gotcha.

I did! /facepalm
Thanks. I constructed a very poor example, and it's beginning to look
like the particular point I was trying to illustrate with that doesn't
actually stand.

> If your intention was to demonstrate that multi-statement lambda would
> be a bug-magnet, you have done an excellent job :-)

I don't know why you'd say that. That was a zero-statement lambda :)


Re: [Python-ideas] Add closing and iteration to threading.Queue

2018-10-21 Thread Vladimir Filipović
On Sun, Oct 21, 2018 at 8:45 PM MRAB wrote:
> FTR, this has been discussed before:
>
> [Python-ideas] `__iter__` for queues?
> https://mail.python.org/pipermail/python-ideas/2010-January/006711.html

Thank you!

For the sake of clarity, I want to outline a few differences between
that discussion and my proposal:

1. Much of the discussion there seemed to implicitly limit itself to
consideration of FIFO queues. This proposal works cleanly for child
classes too, including any (API-compliant) user-written children.

2. Throughout that discussion, iteration is the A feature, and closing
is occasionally mentioned as a possible prerequisite. In this
proposal, the A feature is closing, which enables sensible iteration
(as a B feature) but is useful even if iteration isn't used.

3. There's naturally a lot of quick spitballing of various
mutually-incompatible ideas there, whereas this is one rounded
self-consistent proposal. Most of what I've come up with has already
been anticipated there but it's all mixed up textually.

4. This proposal sidesteps a lot of the doubts and difficulties by
just not using sentinels at all. Being closed is part of the queue's
state that can be queried at any time, and will affect put() calls
immediately, without waiting for a sentinel to float up to the front.
(With recognition that your (MRAB's) message towards that thread's end
already proposed the same approach.)


[Python-ideas] Add closing and iteration to threading.Queue

2018-10-21 Thread Vladimir Filipović
Hi!

I originally submitted this as a pull request. Raymond Hettinger
suggested it should be given a shakeout in python-ideas first.

https://github.com/python/cpython/pull/10018
https://bugs.python.org/issue35034

--

Briefly:

Add a close() method to Queue, which should simplify many common uses
of the class and reduce the space for some easy-to-make errors.

Also add an __iter__() method which in conjunction with close() would
further simplify some common use patterns.

--

At eye-watering length:

Apologies in advance for the length of this message. This isn't a PEP
in disguise, it's a proposal for a very small, simple and I dare
imagine uncontroversial feature. I'm new to contributing to Python and
after the BPO/github submission I didn't manage to come up with a
better way to present it than this.

The issue

Code using threading.Queue often needs to coordinate a "work is
finished as far as I care" state between the producing and consuming
side. Not "work" in the task_done() sense of completion of processing
of queue items, "work" in the simpler sense of just passing data
through the queue.

For example, a producer can be driving the communication by enqueuing
e.g. names of files that need to be processed, and once it's enqueued
the last filename, it can be useful to inform the consumers that no
further names will be coming, so after they've retrieved what's
in-flight currently, they don't need to bother waiting for any more.
Alternatively, a consumer can be driving the communication, and may
need to let the producers know "I'm not interested in any more, so you
can stop wasting resources on producing and enqueuing them".
Also, a third, coordinating component may need to let both sides know
that "Our collective work here is done. Start wrapping it up y'all,
but don't drop any items that are still in-flight."

In practice it's probably the exception, not the rule, when any piece
of code interacting with a Queue _doesn't_ have to either inform
another component that its interest in transferring the data has
ended, or watch for such information.

In the most common case of producer letting consumers know that it's
done, this is usually implemented (over and over again) with sentinel
objects, which is at best needlessly verbose and at worst error-prone.
A recipe for multiple consumers making sure nobody misses the sentinel
is not complicated, but neither is it obvious the first time one needs
to do it.
When a generic sentinel (None or similar) isn't adequate, some
component needs to create the sentinel object and communicate it to
the others, which complicates code, and especially complicates
interfaces between components that are not being developed together
(e.g. if one of them is part of a library and expects the library-user
code to talk to it through a Queue).

In the less common cases where the producers are the ones being
notified, there isn't even a typical solution - everything needs to be
cooked up from scratch using synchronization primitives.

--

A solution

Adding a close() method to the Queue that simply prohibits all further
put()'s (with other methods acting appropriately when the queue is
closed) would simplify a lot of this in a clean and safe way - for the
most obvious example, multi-consumer code would not have to juggle
sentinel objects.

Adding a further __iter__() method (that would block as necessary, and
stop its iteration once the queue is closed and exhausted) would
especially simplify many unsophisticated consumers.
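
Roughly, the consumer-facing behaviour could be sketched like this
(hypothetical names; the actual PR differs in details):

class QueueClosed(Exception):
    """What get() would raise once the queue is closed and exhausted."""

def iterate(q):
    # The behaviour __iter__() is meant to have, written as a helper:
    while True:
        try:
            yield q.get()       # blocks as usual while the queue is still open
        except QueueClosed:
            return              # closed and exhausted: stop the iteration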

This is currently a fairly ordinary pattern:

# Producer:
while some_condition:
    q.put(generate_item())
q.put(sentinel)

# Consumer:
while True:
    item = q.get()
    if item == sentinel:
        q.put(sentinel)
        break
    process(item)

(This consumer could be simplified a little with an assignment
expression or an iter(q.get, sentinel), but one of those is super new
and the other seems little-known in spite of being nearly old enough
to vote.)
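
For reference, those two single-consumer variants would look roughly
like this (note that both drop the re-put of the sentinel that the
multi-consumer version above needs):

# Assignment expression (3.8+):
while (item := q.get()) != sentinel:
    process(item)

# Two-argument iter():
for item in iter(q.get, sentinel):
    process(item)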

With the close() and __iter__(), this would become:

# Producer:
with closing(q):
    while some_condition:
        q.put(generate_item())

# Consumer:
for item in q:
    process(item)

Apart from it being shorter and less error-prone (e.g. if
generate_item raises), the implied interface for initializing the two
components is also simplified, because there's no sentinel to pass
around.

More complex consumers that couldn't take advantage of the __iter__()
would still benefit from being able to explicitly and readably find
out (via exception or querying) that the queue has been closed and
exhausted, without juggling the sentinel.

I honestly think this small addition would be an unqualified win. And
it would not change anything for code that doesn't want to use it.

--

I've got a sample implementation ready for Queue and its children.
(Link is at the start of this message. It includes documentation
updates too, in case those clarify anything further at this stage.)

If this is going in the right