Re: [Python-ideas] Add closing and iteration to threading.Queue

2018-10-22 Thread Nathaniel Smith
On Sun, Oct 21, 2018 at 8:31 PM, Guido van Rossum wrote:
> On Sun, Oct 21, 2018 at 6:08 PM Nathaniel Smith wrote:
>> I'm not sure if this is an issue the way Queue is used in practice, but in
>> general you have to be careful with this kind of circular flow because if
>> your queue communicates backpressure (which it should) then circular flows
>> can deadlock.
>
> Nathaniel, would you be able to elaborate more on the issue of backpressure?
> I think a lot of people here are not really familiar with the concept and
> its importance, and it changes how you have to think about queues and the
> like.

Sure.

Suppose you have some kind of producer connected to some kind of
consumer. If the producer consistently runs faster than the consumer,
what should happen? By default with queue.Queue, there's no limit on
its internal buffer, so if the producer puts, say, 10 items per
second, and the consumer only gets, say, 1 item per second, then the
internal buffer grows by 9 items per second. Basically you have a
memory leak, which will eventually crash your program. And well before
that, your latency will become terrible. How can we avoid this?

I guess we could avoid this by carefully engineering our systems to
make sure that producers always run slower than consumers, but that's
difficult and fragile. Instead, what we usually want to do is to
dynamically detect when a producer is outrunning a consumer, and apply
*backpressure*. (It's called that because it involves the consumer
"pushing back" against the producer.) The simplest way is to put a
limit on how large our Queue's buffer can grow, and make put() block
if it would exceed this limit. That way producers are automatically
slowed down, because they have to wait for the consumer to drain the
buffer before they can continue executing.
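As a rough illustration of the bounded-buffer approach, here is a minimal sketch using queue.Queue with maxsize. The DONE marker, the sleep used to simulate a slow consumer, and the depths list are assumptions made for the example, not part of the discussion above.

```python
import queue
import threading
import time

# A small maxsize makes put() block once the consumer falls behind.
q = queue.Queue(maxsize=2)
DONE = object()  # illustrative end-of-stream marker

depths = []   # buffer sizes observed by the producer after each put()
results = []  # items seen by the consumer


def producer():
    for i in range(10):
        q.put(i)  # blocks while the buffer already holds 2 items
        depths.append(q.qsize())
    q.put(DONE)


def consumer():
    while True:
        item = q.get()
        if item is DONE:
            break
        time.sleep(0.01)  # simulate a consumer slower than the producer
        results.append(item)


t_prod = threading.Thread(target=producer)
t_cons = threading.Thread(target=consumer)
t_prod.start()
t_cons.start()
t_prod.join()
t_cons.join()

print(results)      # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(max(depths))  # never more than 2
```

The producer is paced by the consumer instead of filling memory: the buffer never holds more than maxsize items.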

This simple approach also works well when you have several tasks
arranged in a pipeline like A -> B -> C, where B gets objects from A,
does some processing, and then puts new items on to C. If C is running
slow, this will eventually apply backpressure to B, which will block
in put(), and then since B is blocked and not calling get(), then A
will eventually get backpressure too. In fact, this works fine for any
acyclic network topology.
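A sketch of such a pipeline, assuming three threads and two bounded queues. The stage functions, the STOP marker, and the doubling done by B are made up for illustration.

```python
import queue
import threading

STOP = object()  # hypothetical end-of-stream marker


def stage_a(out_q):
    for i in range(20):
        out_q.put(i)  # blocks when B is not calling get()
    out_q.put(STOP)


def stage_b(in_q, out_q):
    while True:
        item = in_q.get()
        if item is STOP:
            out_q.put(STOP)
            return
        # If C is slow, this put() blocks, so B stops calling get()
        # and the backpressure propagates upstream to A.
        out_q.put(item * 2)


def stage_c(in_q, results):
    while True:
        item = in_q.get()
        if item is STOP:
            return
        results.append(item)


a_to_b = queue.Queue(maxsize=1)
b_to_c = queue.Queue(maxsize=1)
results = []

threads = [
    threading.Thread(target=stage_a, args=(a_to_b,)),
    threading.Thread(target=stage_b, args=(a_to_b, b_to_c)),
    threading.Thread(target=stage_c, args=(b_to_c, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results[:5])  # [0, 2, 4, 6, 8]
```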

If you have a cycle though, like A -> B -> C -> A, then you at least
potentially have the risk of deadlock, where every task is blocked in
put(), and can't continue until the downstream task calls get(), but
it never will because it's blocked in put() too. Sometimes it's OK and
won't deadlock, but you need to think carefully about the details to
figure that out.
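The stuck state can be shown without actually hanging a program. This sketch assumes a two-task cycle (A -> B -> A) with both buffers already full, and uses put() with a timeout so the would-be deadlock shows up as queue.Full. The try_put helper is hypothetical.

```python
import queue

a_to_b = queue.Queue(maxsize=1)
b_to_a = queue.Queue(maxsize=1)

# Both buffers are full: each task has already produced one item
# that the other has not yet consumed.
a_to_b.put("from A")
b_to_a.put("from B")


def try_put(q, item, timeout=0.05):
    """Return True if the put succeeded, False if it would have blocked."""
    try:
        q.put(item, timeout=timeout)
        return True
    except queue.Full:
        return False


# Each task wants to put() again before its next get(). Neither can,
# and neither will call get() to free space for the other.
a_stuck = not try_put(a_to_b, "more from A")
b_stuck = not try_put(b_to_a, "more from B")
print(a_stuck, b_stuck)  # True True
```

With plain blocking put() calls in two threads, both would wait forever.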

If a task gets and puts to the same queue, like someone suggested
doing for the sentinel value upthread, then that's a cycle and you
need to do some more analysis. (I guess if you have a single sentinel
value, then queue.Queue is probably OK, since the minimal buffer size
it supports is 1? So when the last thread get()s the sentinel, it
knows that there's at least 1 free space in the buffer, and can put()
it back without blocking. But if there's a risk of somehow getting
multiple sentinel values, or if Queues ever gain support for
zero-sized buffers, then this pattern could deadlock.)
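Here is a sketch of that sentinel pattern under the favourable assumptions described above: a single producer, a single sentinel, and a buffer size of at least 1. The names SENTINEL and consumer are illustrative.

```python
import queue
import threading

SENTINEL = object()
q = queue.Queue(maxsize=1)
results = []


def consumer():
    while True:
        item = q.get()
        if item is SENTINEL:
            # We just removed the sentinel, so there is a free slot and
            # nothing else is being produced: this put() does not block.
            q.put(SENTINEL)
            return
        results.append(item)


workers = [threading.Thread(target=consumer) for _ in range(3)]
for w in workers:
    w.start()

for i in range(10):
    q.put(i)
q.put(SENTINEL)  # exactly one sentinel, sent after all real items

for w in workers:
    w.join()

print(sorted(results))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Each consumer passes the sentinel on to the next, and it is left in the queue at the end. If a second producer could still be calling put() at that point, the analysis would no longer be this simple.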

There's a runnable example here:
https://trio.readthedocs.io/en/latest/reference-core.html#buffering-in-channels
And I also wrote about backpressure and asyncio here:
https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-asyncawait-world/#bug-1-backpressure

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] TypeHinting: From variable name to type

2018-10-22 Thread Anders Hovmöller
I'll just simply reply here that Mypy on our entire code base just silently 
outputs nothing. Nothing at all. I tried to look at the code to find out why 
and maybe implement the name checking feature discussed here and was terrified 
by the Byzantine complexity of the code base. I gave up. 

From what I later understood of Mypy it is supposed to be used to check all 
types or none on a file level, not just check what can be checked so you can 
gradually add typing in reasonable places. This seems like totally the wrong
approach to me. I think we agree on this point.

As for why we have consistent names, it was before my time but keeping this 
consistency isn't much work, we certainly don't go back and change code to 
enforce conventions. 

> On 23 Oct 2018, at 01:02, Steven D'Aprano  wrote:
> 
> On Fri, Oct 19, 2018 at 01:14:39PM +0200, Anders Hovmöller wrote:
> 
> [I wrote this]
>>> I've seen far too many variables called (let's say) "mylist" which 
>>> actually hold a dict or a tuple (or in one memorable case, a string!) to 
>>> unconditionally believe the name.
>> 
>> This is an even stronger argument for the proposal I think. If IDEs 
>> and static analysis tools looked at stuff like this and assumed a type 
>> _and then found that this assumption is violated_, it would be a huge 
>> win!
> 
> That's an argument for *static type checking*, not necessarily an 
> argument for *this* specific proposal. Any method of checking types, be 
> it static typing, mandatory type declarations, type inference, stub 
> files, annotations, or anything else, would help in that situation.
> 
> This specific proposal assumes that when you have a single name, it 
> always (or at least, nearly always) means the same type, across an 
> entire package. Therefore it is safe to map the name *alone* to the 
> type, rather than annotate the variable (name in a specific context) 
> with the type.
> 
> Of course I get it that you only choose *selected* names to label with 
> this name2type mapping. I just don't think this happens often enough to 
> make it a language feature, or would be valuable enough when it does 
> happen, to make up for the disadvantages.
> 
> You're using a *project-wide global* declaration, distant from the 
> variable's context, possibly even in a completely different file from 
> where the variable is used. That's just about the worst possible way to 
> annotate a variable with a type.
> 
> But if some, or all, IDEs want to support this, I have no objections. 
> Just don't use it in any code I have to read :-)
> 
> 
> [...]
>>> If your IDE doesn't do type inference, get a better IDE *wink*
>> 
>> 
>> Which IDE would this be? PyCharm doesn't do this in the general case. 
>> Not even close in the code base I work on.
> 
> I don't know whether PyCharm does type-inference, but it *claims* to, 
> and it does have a plugin which runs MyPy, and MyPy certainly does.
> 
> If PyCharm lacks this feature, have you checked out the competition 
> (e.g. Wing IDE, PyDev, Spyder)? Have you submitted a feature request? It 
> could integrate better with MyPy, or possibly even the new type-checking 
> tool from Facebook:
> 
> https://pypi.org/project/pyre-check/
> 
> or Google's:
> 
> https://github.com/google/pytype
> 
> Jedi is another project which claims to implement type-inference and 
> only require hints as a fallback:
> 
> https://jedi.readthedocs.io/en/latest/docs/features.html#type-hinting
> 
> 
>>>> And this mapping dict exists once per library.
>>> 
>>> Or more likely, doesn't exist at all.
>> 
>> You seem to argue here, and generally, that the normal case for code 
>> bases is that you have no consistent naming. This seems strange to me. 
> 
> The point of modules, and functions, is *encapsulation*. If I name a 
> variable "spam" in function 1, why *must* a similar variable use the 
> same name in function 2 let alone function 202, distant in another 
> module? That implies a level of coupling that makes me uncomfortable.
> 
> And probably an excess of generic names like "response" and too few 
> *specific* names like "client_response" and "server_response".
> 
> I am impressed by you and your team's attention to detail at requiring 
> consistent names across such a large code base. I don't know if it is a 
> good thing or a bad thing or just a thing, but I can admire the 
> discipline it requires. I am certainly not that meticulous.
> 
> But I'm also faintly horrified at how much pointless pretend- 
> productivity this book-keeping may (or not!) have involved. You know the 
> sort of time-wasting busy-work coders can get up to when they want to 
> look busy without actually thinking too hard:
> 
> - prettifying code layout and data structures;
> - pointless PEP-8-ifying code, whether it needs it or not;
> - obsessing about consistent names everywhere;
> 
> etc. I know this because I've wasted many hours doing all of these.
> 
> (I shouldn't *need* to say this, but I will anyway: I am making no 
> comment on *you 

Re: [Python-ideas] TypeHinting: From variable name to type

2018-10-22 Thread Steven D'Aprano
On Fri, Oct 19, 2018 at 01:14:39PM +0200, Anders Hovmöller wrote:

[I wrote this]
> > I've seen far too many variables called (let's say) "mylist" which 
> > actually hold a dict or a tuple (or in one memorable case, a string!) to 
> > unconditionally believe the name.
> 
> This is an even stronger argument for the proposal I think. If IDEs 
> and static analysis tools looked at stuff like this and assumed a type 
> _and then found that this assumption is violated_, it would be a huge 
> win!

That's an argument for *static type checking*, not necessarily an 
argument for *this* specific proposal. Any method of checking types, be 
it static typing, mandatory type declarations, type inference, stub 
files, annotations, or anything else, would help in that situation.

This specific proposal assumes that when you have a single name, it 
always (or at least, nearly always) means the same type, across an 
entire package. Therefore it is safe to map the name *alone* to the 
type, rather than annotate the variable (name in a specific context) 
with the type.

Of course I get it that you only choose *selected* names to label with 
this name2type mapping. I just don't think this happens often enough to 
make it a language feature, or would be valuable enough when it does 
happen, to make up for the disadvantages.
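To make the distinction concrete, here is a small sketch in which the same name holds different types in different functions, with each use annotated in its own context. The function names are made up for illustration.

```python
from typing import Dict, List, get_type_hints


def split_lines(response: str) -> List[str]:
    # Here "response" is raw text.
    return response.splitlines()


def header_count(response: Dict[str, str]) -> int:
    # Here the same name holds a mapping of header names to values.
    return len(response)


# The annotation travels with the variable in its context, so a single
# project-wide "response -> type" mapping could not describe both.
print(get_type_hints(split_lines)["response"])  # <class 'str'>
print(get_type_hints(header_count)["response"])
```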

You're using a *project-wide global* declaration, distant from the 
variable's context, possibly even in a completely different file from 
where the variable is used. That's just about the worst possible way to 
annotate a variable with a type.

But if some, or all, IDEs want to support this, I have no objections. 
Just don't use it in any code I have to read :-)


[...]
> > If your IDE doesn't do type inference, get a better IDE *wink*
> 
> 
> Which IDE would this be? PyCharm doesn't do this in the general case. 
> Not even close in the code base I work on.

I don't know whether PyCharm does type-inference, but it *claims* to, 
and it does have a plugin which runs MyPy, and MyPy certainly does.

If PyCharm lacks this feature, have you checked out the competition 
(e.g. Wing IDE, PyDev, Spyder)? Have you submitted a feature request? It 
could integrate better with MyPy, or possibly even the new type-checking 
tool from Facebook:

https://pypi.org/project/pyre-check/

or Google's:

https://github.com/google/pytype

Jedi is another project which claims to implement type-inference and 
only require hints as a fallback:

https://jedi.readthedocs.io/en/latest/docs/features.html#type-hinting


> >> And this mapping dict exists once per library.
> > 
> > Or more likely, doesn't exist at all.
> 
> You seem to argue here, and generally, that the normal case for code 
> bases is that you have no consistent naming. This seems strange to me. 

The point of modules, and functions, is *encapsulation*. If I name a 
variable "spam" in function 1, why *must* a similar variable use the 
same name in function 2 let alone function 202, distant in another 
module? That implies a level of coupling that makes me uncomfortable.

And probably an excess of generic names like "response" and too few 
*specific* names like "client_response" and "server_response".

I am impressed by you and your team's attention to detail at requiring 
consistent names across such a large code base. I don't know if it is a 
good thing or a bad thing or just a thing, but I can admire the 
discipline it requires. I am certainly not that meticulous.

But I'm also faintly horrified at how much pointless pretend- 
productivity this book-keeping may (or not!) have involved. You know the 
sort of time-wasting busy-work coders can get up to when they want to 
look busy without actually thinking too hard:

- prettifying code layout and data structures;
- pointless PEP-8-ifying code, whether it needs it or not;
- obsessing about consistent names everywhere;

etc. I know this because I've wasted many hours doing all of these.

(I shouldn't *need* to say this, but I will anyway: I am making no 
comment on *you and your team* specifically, as I don't know you. I'm 
making a general observation about the tendency of many programmers, 
myself included, to waste time in unproductive "refactoring" which 
doesn't actually help code quality.)

I don't see "inconsistent" names in different functions or modules to be 
a problem that needs to be avoided. Hence, I don't avoid it. I don't go 
out of my way to use unique names, but nor do I try to impose a 
single-name policy.

In any case, we're still up against the fact that in 2018, the state of 
the art in type checkers is that they ought to be able to do what a 
human reader does and infer the type of your variables. It shouldn't 
matter whether you call it "response" or "spam" or "zpaqxerag", if your 
type-checker can't work out that

foo = HttpResponse(*args)

is a HttpResponse object, something has gone wrong somewhere. Adding yet 
another way to annotate variables instead of improving type-inference 
seems like 

Re: [Python-ideas] [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals

2018-10-22 Thread Michael Selik
I switched this thread to the python-ideas list, since this is proposing a
new feature.


On Mon, Oct 22, 2018 at 12:13 PM Sean Harrington wrote:

> I contend that multiprocessing.Pool is used most frequently with a single
> task. I am proposing a feature that enforces this invariant, optimizes task
> memory-footprints & thus serialization time, and preserves the
> well-established interface to Pool through subclassing.
>

We've got a disagreement over interface design. That's more easily
discussed with concrete examples. In my own work, I often create a procmap
or threadmap function to make that pattern more pleasant. Since you're
making the proposal, why not share some examples of using Pool in the
manner you'd like? As real a chunk of code as you're able to share publicly
(and concisely).
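For readers unfamiliar with the idea, a threadmap helper of this kind might look roughly like the sketch below. This is an assumption about the general shape, not the actual helper from my work, and the max_workers default is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def threadmap(func, iterable, max_workers=4):
    """Apply func to each item using a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(func, iterable))


lengths = threadmap(len, ["spam", "eggs", "bacon"])
print(lengths)  # [4, 4, 5]
```

A procmap would have the same shape with a process pool, with the usual requirement that func and its arguments be picklable.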


Re: [Python-ideas] TypeHinting: From variable name to type

2018-10-22 Thread Dan Sommers

On 10/22/18 9:04 AM, Steven D'Aprano wrote:

> My IDE is Unix. (Technically, Linux.)

+1

> Or just google https://duckduckgo.com/?q=unix+as+an+ide

Thank you for not verbing DuckDuckGo!  :-)

> ... (I use an actual GUI editor, not
> Vim or Emacs) ...

[ ... must ... resist ... holy war ... ]

Dan


Re: [Python-ideas] TypeHinting: From variable name to type

2018-10-22 Thread Steven D'Aprano
On Mon, Oct 22, 2018 at 02:35:40PM +0200, Anders Hovmöller wrote:
> 
> > But the critical point here is that we should not add a language feature 
> > to make up for the limitations of a single IDE. If the free version of 
> > PyCharm is underpowered, perhaps you ought to try the paid version, or 
> > another IDE, or submit a feature request to PyCharm, *before* turning to 
> > the Python language.
> 
> Which IDE do you use that is so much greater than PyCharm? I would love to 
> try it!

My IDE is Unix. (Technically, Linux.)

https://sanctum.geek.nz/arabesque/series/unix-as-ide/

Or just google https://duckduckgo.com/?q=unix+as+an+ide

Although I'm not as hard-core as some (I use an actual GUI editor, not 
Vim or Emacs). I'm not opposed to the concept of IDEs as such, my first 
two development environments (THINK Pascal, and Hypercard) were IDEs and 
they were great, especially considering the limited resources they had 
available back in 1988 or so.

But... modern IDEs... I dunno... I don't begrudge you if you like them, 
but I don't think they're for me.


> > Of course I do. It isn't an edge-case, it is representative of the vast 
> > majority of variable names:
> > 
> > - "A single variable name is always the same type" is the edge-case.
> 
> I strongly disagree.

Okay.


-- 
Steve


Re: [Python-ideas] TypeHinting: From variable name to type. Yes: no change to language, just convention

2018-10-22 Thread Anders Hovmöller

>> This is certainly not something that requires language support.  It can 
>> easily be purely a convention, as long as different IDEs, linters, type 
>> checkers, etc. agree on what the convention is.  Maybe at some point in the 
>> future, if the convention becomes adopted, there might be some help in 
>> having a standard library module, or even minimal language recognition, of 
>> the convention.  But let's work on adopting a convention first.
> 
> Yes, this sounds good. There is no need to change the Python language; it
> should be a convention.


Might not even be a convention. I have now tried using
https://github.com/edreamleo/make-stub-files on the code base at work. I had
to fix some fatal bugs, and it's still pretty buggy even after those fixes, AND
PyCharm has some nasty bugs with stub files that are blockers, but it's clear
that generating stub files to do this is feasible.

/ Anders


Re: [Python-ideas] TypeHinting: From variable name to type

2018-10-22 Thread Anders Hovmöller


> But the critical point here is that we should not add a language feature 
> to make up for the limitations of a single IDE. If the free version of 
> PyCharm is underpowered, perhaps you ought to try the paid version, or 
> another IDE, or submit a feature request to PyCharm, *before* turning to 
> the Python language.

Which IDE do you use that is so much greater than PyCharm? I would love to try 
it!


> Of course I do. It isn't an edge-case, it is representative of the vast 
> majority of variable names:
> 
> - "A single variable name is always the same type" is the edge-case.

I strongly disagree. I also wrote a mail with my motivation and context where I 
asked you to supply your basis for believing this. You have not replied. I 
think we need to try to understand each other's perspective here, but it's 
really hard to understand yours if you won't give us any information on it.

/ Anders



Re: [Python-ideas] Multi Statement Lambdas

2018-10-22 Thread Serhiy Storchaka

22.10.18 02:16, Terry Reedy wrote:
> All functions created from lambda expressions get the same pseudo-name
> '<lambda>'.  This can make tracebacks worse.  Perhaps more importantly,
> proper testing may become harder.


See https://bugs.python.org/issue34856. But this can work only while 
lambda's body is a simple expression.



> >>> for i in map(lambda x: x **
>         2, 'abc'):
>     print(i)
>
> Traceback (most recent call last):
>   File "", line 2, in <module>
>     2, 'abc'):
>   File "", line 2, in <lambda>
>     2, 'abc'):
> TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'


In 3.8 the traceback is different (and can be even more informative with 
resolved issue34856).


Traceback (most recent call last):
  File "1.py", line 1, in <module>
    for i in map(lambda x: x **
  File "1.py", line 1, in <lambda>
    for i in map(lambda x: x **
TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

> Given that there might be a hundred functions named '<lambda>', I think
> the specific name is a bit helpful.


I think the main problem is not with tracebacks, but with reprs. If you 
have a pack of callbacks, it is not easy to figure out what they do if 
they are anonymous functions.
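A small sketch of the repr problem. The callbacks here are made up, and the memory addresses in the reprs will differ from run to run.

```python
def add_one(x):
    return x + 1


callbacks = [lambda x: x + 1, lambda x: x * 2, add_one]

# Both lambdas report the same name, so a list of them is hard to read.
names = [f.__name__ for f in callbacks]
print(names)  # ['<lambda>', '<lambda>', 'add_one']

for f in callbacks:
    print(repr(f))  # e.g. <function <lambda> at 0x...>
```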




Re: [Python-ideas] Add closing and iteration to threading.Queue

2018-10-22 Thread Vladimir Filipović
Nathaniel, thank you for the pointer to Trio.
Its approach seems very robust. I'm relieved to see that a solution so
fundamentally rebuilt has also settled on very similar semantics for
its `.close_put()`.

I think your `.clone()` idiom is very clear when the communication
objects are treated as distinct endpoints. Something with similar
effects on closing (not necessarily similar in idiom) would probably
be a neat enhancement to the standard Queue, though if I was making
that, I'd do it in an external package.

--

Antoine, regarding multiprocessing.Queue:

The similarity of meaning behind closing that I was getting at is that
mp.Q.close() means "I am done writing to this queue, and I don't care
about the rest of you", whereas the proposed meaning of q.Q.close() is
"Listen up, we are all done writing to this queue". I don't know yet
that this difference necessarily creates a true incompatibility.

That the effects (in terms of eager OS-resource cleanup) are different
shouldn't be a problem in itself - every implementation does the right
thing for itself.

--

On Mon, Oct 22, 2018 at 2:03 AM Terry Reedy wrote:
> The proposed close method would only half-close the queue: closed to
> puts, open to gets (but perhaps close completely when the last item is
> gotten).

In other words: in this proposal, there is no such thing as "closed
for retrieval". A closed queue means exactly that it's closed for
insertion.

Retrieval becomes permanently impossible once the queue is closed and
exhausted, and that's a condition that get() must treat correctly and
usefully, but calling that condition "closed / completely closed /
closed for retrieval" would muddle up the terminology.

In the proposed implementation I've called it "exhausted", a name I've
picked up god-knows-when and from god-knows-where, but it seemed
reasonable.
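To illustrate the terminology only, here is a minimal standalone sketch of those semantics. It is not the proposed implementation: the class name and the Closed and Exhausted exception names are assumptions, and it is unbounded and FIFO-only to keep it short.

```python
import collections
import threading


class Closed(Exception):
    """put() was called after close() (hypothetical name)."""


class Exhausted(Exception):
    """get() was called on a queue that is closed and empty (hypothetical name)."""


class ClosableQueue:
    def __init__(self):
        self._items = collections.deque()
        self._cond = threading.Condition()
        self._closed = False

    def put(self, item):
        with self._cond:
            if self._closed:
                raise Closed
            self._items.append(item)
            self._cond.notify()

    def close(self):
        # Closed means closed for insertion only; queued items stay retrievable.
        with self._cond:
            self._closed = True
            self._cond.notify_all()  # wake blocked getters so they can finish

    def get(self):
        with self._cond:
            while not self._items:
                if self._closed:
                    raise Exhausted  # closed and drained: nothing will ever arrive
                self._cond.wait()
            return self._items.popleft()

    def __iter__(self):
        while True:
            try:
                yield self.get()
            except Exhausted:
                return


q = ClosableQueue()
results = []
consumer = threading.Thread(target=lambda: results.extend(q))
consumer.start()
for i in range(5):
    q.put(i)
q.close()
consumer.join()
print(results)  # [0, 1, 2, 3, 4]
```

The consumer needs no sentinel: iteration ends once the queue is both closed and drained.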

--

Regarding sentinels in general: They are a limited-purpose solution,
and this proposal should make them unnecessary in 99% of the cases.

Firstly, they only naturally apply to FIFO queues. You could hack your
use of LIFO and priority queues to also rely on sentinels, but it's
very kludgey in the general cases, not a natural fit, and not
generalizable to user-created children of Queue (which Queue otherwise
explicitly aspires to support).

Secondly, they only work when the producer is the one driving the flow
and notifying the consumer that "no more is forthcoming". They don't
work when the producer is the one who needs to be notified.

Thirdly, they're a potential cause of deadlocks when the same threads
act as both producers and consumers. (E.g. in a parallelized
breadth-first-search.) I'm sure this is the circular flow that
Nathaniel was referring to, but I'll let him detail it more or correct
me.

Fourthly, they don't make it easy to query the Queue about whether
it's closed. This probably isn't a big deal admittedly.

Sure, when sentinels are adequate, they're adequate. This proposal
aims to be more general-purpose than that.


Re: [Python-ideas] Python Enhancement Proposal for List methods

2018-10-22 Thread Siva Sukumar Reddy
Sounds good to me. Thanks.
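For reference, the plain-function alternative David describes below might look something like this sketch. The names replace and remove_all and the count parameter are my own choices for illustration (the corresponding str.replace argument is called count), and both functions return a new list instead of working in place.

```python
def replace(seq, old, new, count=None):
    """Return a list copy of seq with occurrences of old replaced by new.

    At most `count` occurrences are replaced; all of them if count is None.
    Works on any iterable, not just lists.
    """
    result = []
    remaining = count
    for item in seq:
        if item == old and (remaining is None or remaining > 0):
            result.append(new)
            if remaining is not None:
                remaining -= 1
        else:
            result.append(item)
    return result


def remove_all(seq, value):
    """Return a list copy of seq without any occurrence of value."""
    return [item for item in seq if item != value]


print(replace([1, 2, 1, 3, 1], 1, 0))           # [0, 2, 0, 3, 0]
print(replace([1, 2, 1, 3, 1], 1, 0, count=2))  # [0, 2, 0, 3, 1]
print(replace((1, 2, 1), 1, 9))                 # [9, 2, 9]
print(remove_all([1, 2, 1, 3], 1))              # [2, 3]
```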

On Mon 22 Oct, 2018, 06:18 David Mertz wrote:

> I have changed my opinion to -1 on list.replace(). While max=n is a useful
> argument, a plain function could work equally well on every sequence and
> not be specific to lists.
>
> On Sun, Oct 21, 2018, 10:18 AM David Mertz wrote:
>> The list comprehensions are not very hard, and are more general. EXCEPT
>> with the limited number of occurrences. We have this for str.replace(...,
>> max=n), and it is useful fairly often.
>>
>> I'm +0.5 on .replace() with that capability. But -1 on .removeall() that
>> adds nothing to an easy listcomp.
>>
>> On Sun, Oct 21, 2018, 9:01 AM Siva Sukumar Reddy wrote:
>>
>>> Hey everyone,
>>>
>>> I am really new to the Python contribution community and want to propose the
>>> methods below for the list object. Forgive me if this is not the right format
>>> for such an email.
>>>
>>> 1. *list.replace( item_to_be_replaced, new_item )*: which replaces all
>>> the occurrences of an element in the list in place, instead of writing a
>>> new list comprehension.
>>> 2. *list.replace( item_to_be_replaced, new_item, number_of_occurrences
>>> )*: which replaces occurrences of an element in the list up to a
>>> specific number of occurrences of that element. number_of_occurrences
>>> can default to 0, which will replace all the occurrences in place.
>>> 3. *list.removeall( item_to_be_removed )*: which removes all the
>>> occurrences of an element in a list in place.
>>>
>>> What do you think about these features?
>>> Are they PEP-able? Has anyone tried to implement these features before?
>>> Please let me know.
>>>
>>> Thank you,
>>> Sukumar
>>>
>>