Re: [Python-Dev] Making Queue.Queue easier to use

2005-10-13 Thread Guido van Rossum
On 10/13/05, Gustavo J. A. M. Carneiro <[EMAIL PROTECTED]> wrote:
>   I'd just like to point out that Queue is not quite as useful as people
> seem to think in this thread.  The main problem is that I can't
> integrate Queue into a select/poll based main loop.

Well, you're mixing two incompatible paradigms there, so that's to be
expected, right? Either you're using async I/O or you're using
threads. Mixing the two causes confusion and bugs no matter what you
try.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Making Queue.Queue easier to use

2005-10-13 Thread Gustavo J. A. M. Carneiro
  I'd just like to point out that Queue is not quite as useful as people
seem to think in this thread.  The main problem is that I can't
integrate Queue into a select/poll based main loop.

  The other day I wanted to extend a Python main loop, which uses poll(),
to be thread safe, so I could queue idle functions from separate
threads.  Obviously Queue doesn't work (no file descriptor to poll), so
I just ended up creating a pipe, to which I send a single byte when I
want to "wake up" the main loop to make it realize changes in its
configuration, such as a new callback added.
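
The pipe trick described above can be sketched roughly as follows.  This is
only an illustration, not the poster's actual code: the class name
WakeableLoop and its methods are hypothetical, select.select() stands in for
poll() to keep the sketch short, and it assumes a current Python 3 on a
Unix-like system.

```python
import os
import select
import threading
from collections import deque


class WakeableLoop:
    """Hypothetical main loop that other threads can wake through a pipe."""

    def __init__(self):
        # The read end is a real file descriptor, so select()/poll() can watch it.
        self._read_fd, self._write_fd = os.pipe()
        self._callbacks = deque()  # append() and popleft() are thread-safe

    def call_from_thread(self, func):
        """Queue func for the loop thread, then wake the loop with one byte."""
        self._callbacks.append(func)
        os.write(self._write_fd, b"x")

    def run_once(self, timeout=None):
        """Wait for a wake-up (or the timeout) and run the queued callbacks."""
        readable, _, _ = select.select([self._read_fd], [], [], timeout)
        ran = 0
        if readable:
            os.read(self._read_fd, 4096)  # drain pending wake-up bytes
            while self._callbacks:
                self._callbacks.popleft()()
                ran += 1
        return ran

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)


def demo():
    """Queue one callback from a second thread and let the loop run it."""
    loop = WakeableLoop()
    seen = []
    producer = threading.Thread(
        target=loop.call_from_thread, args=(lambda: seen.append("woken"),)
    )
    producer.start()
    ran = loop.run_once(timeout=5.0)
    producer.join()
    loop.close()
    return ran, seen
```

In a real loop the pipe's read end would simply be one more descriptor
registered alongside the sockets the loop already watches.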

  I guess this is partly a Unix problem.  There's no system call that says
"wake me up when one of these descriptors has data OR when this
condition variable is set".  Windows has WaitForMultipleObjects, which I
suspect is quite a bit more powerful.

  Regards.

-- 
Gustavo J. A. M. Carneiro
<[EMAIL PROTECTED]> <[EMAIL PROTECTED]>
The universe is always one step beyond logic.



Re: [Python-Dev] Making Queue.Queue easier to use

2005-10-12 Thread Nick Coghlan
Guido van Rossum wrote:
> Apart from trying to guess the API without reading the docs (:-), what
> are the use cases for using put/get with a timeout? I have a feeling
> it's not that common.

Actually, I think wanting to use a timeout is an artifact of a history of 
dealing with too many C libraries which don't provide a proper event-based or 
select-style interface (which means the calls have to time out periodically in 
order to respond gracefully to program shutdown requests).

However, because Queues are multi-producer, that isn't a problem - I just have 
to remember to push the shutdown request in through the Queue.
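
A minimal sketch of pushing the shutdown request through the queue, so the
worker can block with no timeout at all.  The sentinel object and the
function names are hypothetical, and the module is spelled queue here as in
Python 3 (it is Queue in the 2.x releases discussed in this thread).

```python
import queue
import threading

_SHUTDOWN = object()  # hypothetical sentinel; any unique object works


def worker(q, results):
    """Process items until the shutdown sentinel arrives."""
    while True:
        item = q.get()  # plain blocking get, no timeout needed
        if item is _SHUTDOWN:
            break
        results.append(item * 2)


def run_demo():
    q = queue.Queue()
    results = []
    t = threading.Thread(target=worker, args=(q, results))
    t.start()
    for i in range(3):
        q.put(i)
    q.put(_SHUTDOWN)  # any producer thread may request the shutdown
    t.join()
    return results
```

With several workers on one queue, one sentinel per worker would be needed.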

Basically, I'd fallen into the "trying-to-write-C-in-Python" trap and I simply 
didn't notice until I read the responses in this thread :)

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] Making Queue.Queue easier to use

2005-10-11 Thread Josiah Carlson

[Guido]
> >> Apart from trying to guess the API without reading the docs (:-), what
> >> are the use cases for using put/get with a timeout? I have a feeling
> >> it's not that common.

[Josiah Carlson]
> > With timeout=0, a shared connection/resource pool (perhaps DB, etc., I
> > use one in the tuple space implementation I have for connections to the
> > tuple space).

[Tim Peters]
> Passing timeout=0 is goofy:  use {get,put}_nowait() instead.  There's
> no difference in semantics.

I understand this, as do many others who use it.  However, having both
manually and automatically tuned timeouts myself in certain applications,
I find the timeout=0 case useful.  Uncommon?  Likely; I've not yet seen any
examples of anyone using this particular timeout method at koders.com.


> > Note that technically speaking, Queue.Queue from Pythons
> > prior to 2.4 is broken: get_nowait() may not get an object even if the
> > Queue is full, this is caused by "elif not self.esema.acquire(0):" being
> > called for non-blocking requests.  Tim did more than simplify the
> > structure by rewriting it, he fixed this bug.
> 
> I don't agree it was a bug, but I did get fatally weary of arguing
> with people who insisted it was ;-)  It's certainly easier to explain
> (and the code is easier to read) now.

When getting an object from a non-empty queue fails because some other
thread already had the lock, and it is a fair assumption that the other
thread will release the lock within the next context switch...

Because I still develop on Python 2.3 (I need to support a commercial
codebase made with 2.3), I was working around it by using the timeout
parameter:
    try:
        connection = connection_queue.get(timeout=.01)
    except Queue.Empty:
        connection = make_new_connection()

With only get_nowait() calls, by the time I hit 3-4 threads, it was
failing to pick up connections even when there were hundreds in the
queue, and I quickly ran into the file handle limit for my platform, not
to mention that the server I was connecting to used asynchronous sockets
and select, which died at the 513th incoming socket.

I have since copied the implementation of 2.4's queue into certain
portions of code which make use of get_nowait() and its variants
(handling the deque reference as necessary).

Any time one needs to work around a "not buggy feature" with some
claimed "unnecessary feature", it tends to smell less than pristine to
my nose.
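
For comparison, on a Queue implementation where get_nowait() behaves as
described for 2.4, the pool lookup needs no timeout.  The sketch below is an
assumption about how such a pool might look, not the poster's code: the
acquire/release names are hypothetical, and the module is spelled queue as
in Python 3.

```python
import queue


def acquire(pool, make_new_connection):
    """Return a pooled connection, or build one if the pool is empty."""
    try:
        return pool.get_nowait()
    except queue.Empty:
        return make_new_connection()


def release(pool, connection):
    """Hand a connection back; return False if the pool had no room for it."""
    try:
        pool.put_nowait(connection)
        return True
    except queue.Full:
        return False  # a real pool would close the connection here
```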


> > With block=True, timeout=None, worker threads pulling from a work-to-do
> > queue, and even a thread which handles the output of those threads via
> > a result queue.
> 
> Guido understands use cases for blocking and non-blocking put/get, and
> Queue always supported those possibilities.  The timeout argument got
> added later, and it's not really clear _why_ it was added.  timeout=0
> isn't a sane use case (because the same effect can be gotten with
> non-blocking put/get).

    def t():
        try:
            #thread state setup...
            while not QUIT:
                try:
                    work = q.get(timeout=5)
                except Queue.Empty:
                    continue
                #handle work
        finally:
            pass  #thread state cleanup...

Could the above be daemonized?  Certainly, but then the thread state
wouldn't be cleaned up.  If you can provide me with a way of doing the
above with equivalent behavior, using only get_nowait() and get(), then
put it in the documentation.  If not, then I'd say that the timeout
argument is a necessarily useful feature.
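
A runnable variant of the loop above, as a sketch only: a threading.Event
replaces the QUIT global, the timeout is shortened so the demo finishes
quickly, and the names worker and run_demo are hypothetical.  It is written
against the Python 3 queue module.

```python
import queue
import threading


def worker(q, quit_event, handled, cleaned_up):
    """Poll the queue with a timeout so the quit flag is checked regularly."""
    try:
        while not quit_event.is_set():
            try:
                work = q.get(timeout=0.05)
            except queue.Empty:
                continue  # nothing to do; loop around and re-check the flag
            handled.append(work)
            q.task_done()
    finally:
        cleaned_up.set()  # stands in for the thread state cleanup


def run_demo():
    q = queue.Queue()
    quit_event = threading.Event()
    cleaned_up = threading.Event()
    handled = []
    t = threading.Thread(target=worker, args=(q, quit_event, handled, cleaned_up))
    t.start()
    q.put("a")
    q.put("b")
    q.join()          # wait until both items have been handled
    quit_event.set()  # the worker notices within one timeout period
    t.join()
    return handled, cleaned_up.is_set()
```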

[Guido]
> But one lesson we can learn from sockets (or perhaps the reason why
> people kept asking for timeout=0 to be "fixed" :) is that timeout=0 is
> just a different way to spell blocking=False. The socket module makes
> sure that the socket ends up in exactly the same state no matter which
> API is used; and in fact the setblocking() API is redundant.

This would suggest to me that at least for sockets, setblocking() could
be deprecated, as could the block parameter in Queue.  I wouldn't vote
for either deprecation, but it would seem to make more sense than to
remove the timeout arguments from both.


 - Josiah



Re: [Python-Dev] Making Queue.Queue easier to use

2005-10-11 Thread Guido van Rossum
On 10/11/05, Tim Peters <[EMAIL PROTECTED]> wrote:
> Guido understands use cases for blocking and non-blocking put/get, and
> Queue always supported those possibilities.  The timeout argument got
> added later, and it's not really clear _why_ it was added.  timeout=0
> isn't a sane use case (because the same effect can be gotten with
> non-blocking put/get).

In the socket world, a similar bifurcation of the API has happened
(also under my supervision, even though the idea and prototype code
were contributed by others). The API there is very different because
the blocking or timeout is an attribute of the socket, not passed in
to every call.

But one lesson we can learn from sockets (or perhaps the reason why
people kept asking for timeout=0 to be "fixed" :) is that timeout=0 is
just a different way to spell blocking=False. The socket module makes
sure that the socket ends up in exactly the same state no matter which
API is used; and in fact the setblocking() API is redundant.
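
The equivalence can be checked directly on a current Python 3.  The helper
name below is hypothetical; the sockets are never connected, so no network
access is assumed.

```python
import socket


def timeouts_after_each_spelling():
    """Put one socket in non-blocking mode with each API and compare."""
    with socket.socket() as a, socket.socket() as b:
        a.setblocking(False)  # the older spelling
        b.settimeout(0.0)     # the timeout spelling
        return a.gettimeout(), b.gettimeout()
```

Both calls leave the socket reporting a timeout of 0.0.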

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Making Queue.Queue easier to use

2005-10-11 Thread Tim Peters
[Guido]
>> Apart from trying to guess the API without reading the docs (:-), what
>> are the use cases for using put/get with a timeout? I have a feeling
>> it's not that common.

[Josiah Carlson]
> With timeout=0, a shared connection/resource pool (perhaps DB, etc., I
> use one in the tuple space implementation I have for connections to the
> tuple space).

Passing timeout=0 is goofy:  use {get,put}_nowait() instead.  There's
no difference in semantics.
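
A small sketch of the two spellings side by side, using the Python 3 queue
module (Queue in the 2.x releases discussed here).  The try_get helper is
hypothetical and exists only to make the outcomes comparable.

```python
import queue


def try_get(getter):
    """Call getter() and report either the item or the string 'Empty'."""
    try:
        return getter()
    except queue.Empty:
        return "Empty"


q = queue.Queue()

# On an empty queue both spellings raise Empty without waiting.
on_empty = (try_get(q.get_nowait), try_get(lambda: q.get(timeout=0)))

# With items available, both spellings return one immediately.
q.put("first")
q.put("second")
on_items = (try_get(q.get_nowait), try_get(lambda: q.get(timeout=0)))
```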

> Note that technically speaking, Queue.Queue from Pythons
> prior to 2.4 is broken: get_nowait() may not get an object even if the
> Queue is full, this is caused by "elif not self.esema.acquire(0):" being
> called for non-blocking requests.  Tim did more than simplify the
> structure by rewriting it, he fixed this bug.

I don't agree it was a bug, but I did get fatally weary of arguing
with people who insisted it was ;-)  It's certainly easier to explain
(and the code is easier to read) now.

> With block=True, timeout=None, worker threads pulling from a work-to-do
> queue, and even a thread which handles the output of those threads via
> a result queue.

Guido understands use cases for blocking and non-blocking put/get, and
Queue always supported those possibilities.  The timeout argument got
added later, and it's not really clear _why_ it was added.  timeout=0
isn't a sane use case (because the same effect can be gotten with
non-blocking put/get).


Re: [Python-Dev] Making Queue.Queue easier to use

2005-10-11 Thread Josiah Carlson

Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > Optionally, the existing "put" and "get" methods could be deprecated, with 
> > the
> > goal of eventually changing their signature to match the put_wait and 
> > get_wait
> > methods above.
> 
> Apart from trying to guess the API without reading the docs (:-), what
> are the use cases for using put/get with a timeout? I have a feeling
> it's not that common.

With timeout=0, a shared connection/resource pool (perhaps DB, etc., I
use one in the tuple space implementation I have for connections to the
tuple space). Note that technically speaking, Queue.Queue from Pythons
prior to 2.4 is broken: get_nowait() may not get an object even if the
Queue is full, this is caused by "elif not self.esema.acquire(0):" being
called for non-blocking requests.  Tim did more than simplify the
structure by rewriting it, he fixed this bug.

With block=True, timeout=None, worker threads pulling from a work-to-do
queue, and even a thread which handles the output of those threads via
a result queue.

 - Josiah



Re: [Python-Dev] Making Queue.Queue easier to use

2005-10-11 Thread Guido van Rossum
On 10/11/05, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> The multi-processing discussion reminded me that I have a few problems I run
> into every time I try to use Queue objects.
>
> My first problem is finding it:
>
> Py> from threading import Queue # Nope
> Traceback (most recent call last):
>File "", line 1, in ?
> ImportError: cannot import name Queue
> Py> from Queue import Queue # Ah, there it is

I don't think that's a reason to move it.

>>> from sys import Queue
ImportError: cannot import name Queue
>>> from os import Queue
ImportError: cannot import name Queue
>>> # Well where the heck is it?!

> What do people think of the idea of adding an alias to Queue into the
> threading module so that:
> a) the first line above works; and

I see no need. Code that *doesn't* need Queue but does use threading
shouldn't have to pay for loading Queue.py.

> b) Queue can be documented with all of the other threading primitives,
>    rather than being off somewhere else in its own top-level section.

Do top-level sections have to limit themselves to a single module?

Even if they do, I think it's fine to plant a prominent link to the
Queue module. You can't really expect people to learn how to use
threads wisely from reading the library reference anyway.

> My second problem is with the current signatures of the put() and get()
> methods. Specifically, the following code blocks forever instead of raising an
> Empty exception after 500 milliseconds as one might expect:
>    from Queue import Queue
>    x = Queue()
>    x.get(0.5)

I'm not sure if I have much sympathy with a bug due to refusing to
read the docs... :)

> I assume the current signature is there for backward compatibility with the
> original version that didn't support timeouts (considering the difficulty of
> telling the difference between "x.get(1)" and "True = 1; x.get(True)" from
> inside the get() method)

Huh? What a bizarre idea. Why would you do that? I guess I don't
understand where you're coming from.

> However, the need to write "x.get(True, 0.5)" seems seriously redundant, given
> that a single parameter can actually handle all the options (as is currently
> the case with Condition.wait()).

So write x.get(timeout=0.5). That's clear and unambiguous.
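
To see why the positional call blocks, it helps to look at how the argument
binds.  The sketch below assumes the Python 3 queue module, whose get() has
the signature get(block=True, timeout=None); the helper name get_or_none is
hypothetical, and a short timeout is used so the example finishes quickly.

```python
import inspect
import queue

x = queue.Queue()

# x.get(0.5) binds 0.5 to 'block' (a true value), leaving timeout=None,
# which means "block until an item arrives".
bound = inspect.signature(x.get).bind(0.5)
bound.apply_defaults()
positional_binding = dict(bound.arguments)


def get_or_none(q, seconds):
    """Keyword form: wait at most 'seconds', then give up."""
    try:
        return q.get(timeout=seconds)
    except queue.Empty:
        return None
```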

> The "put_nowait" and "get_nowait" functions are fine, because they serve a
> useful documentation purpose at the calling point (particularly given the
> current clumsy timeout signature).
>
> What do people think of the idea of adding "put_wait" and "get_wait" methods
> with the signatures:
>    put_wait(item, [timeout=None])
>    get_wait([timeout=None])

-1. I'd rather not tweak the current Queue module at all until Python
3000. Then we could force people to use keyword args.

> Optionally, the existing "put" and "get" methods could be deprecated, with the
> goal of eventually changing their signature to match the put_wait and get_wait
> methods above.

Apart from trying to guess the API without reading the docs (:-), what
are the use cases for using put/get with a timeout? I have a feeling
it's not that common.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] Making Queue.Queue easier to use

2005-10-11 Thread Nick Coghlan
The multi-processing discussion reminded me that I have a few problems I run 
into every time I try to use Queue objects.

My first problem is finding it:

Py> from threading import Queue # Nope
Traceback (most recent call last):
   File "", line 1, in ?
ImportError: cannot import name Queue
Py> from Queue import Queue # Ah, there it is

What do people think of the idea of adding an alias to Queue into the 
threading module so that:
a) the first line above works; and
b) Queue can be documented with all of the other threading primitives,
   rather than being off somewhere else in its own top-level section.

My second problem is with the current signatures of the put() and get() 
methods. Specifically, the following code blocks forever instead of raising an 
Empty exception after 500 milliseconds as one might expect:
   from Queue import Queue
   x = Queue()
   x.get(0.5)

I assume the current signature is there for backward compatibility with the 
original version that didn't support timeouts (considering the difficulty of 
telling the difference between "x.get(1)" and "True = 1; x.get(True)" from 
inside the get() method)

However, the need to write "x.get(True, 0.5)" seems seriously redundant, given 
that a single parameter can actually handle all the options (as is currently
the case with Condition.wait()).

The "put_nowait" and "get_nowait" functions are fine, because they serve a 
useful documentation purpose at the calling point (particularly given the 
current clumsy timeout signature).

What do people think of the idea of adding "put_wait" and "get_wait" methods 
with the signatures:
   put_wait(item, [timeout=None])
   get_wait([timeout=None])

Optionally, the existing "put" and "get" methods could be deprecated, with the 
goal of eventually changing their signature to match the put_wait and get_wait 
methods above.
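
As a rough illustration of the proposal, the two methods could be layered on
the existing API in a subclass.  This is a sketch under assumptions, not a
patch: the class name WaitQueue is hypothetical, put_wait and get_wait are
not part of the standard library, and the base class is the Python 3 queue
module's Queue.

```python
import queue


class WaitQueue(queue.Queue):
    """Hypothetical subclass adding the proposed single-parameter methods."""

    def put_wait(self, item, timeout=None):
        # Always blocking; timeout=None waits indefinitely for a free slot.
        self.put(item, True, timeout)

    def get_wait(self, timeout=None):
        # Always blocking; raises queue.Empty once the timeout expires.
        return self.get(True, timeout)


def get_wait_or_none(q, timeout):
    """Small helper for the demo: turn the Empty exception into None."""
    try:
        return q.get_wait(timeout)
    except queue.Empty:
        return None
```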

If people are amenable to these ideas, I should be able to work up a patch for 
them this week.

Regards,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com