Re: [Python-Dev] Issue #10348: concurrent.futures doesn't work on BSD
On Dec 29, 2010, at 2:55 PM, Victor Stinner wrote:

On Wednesday, 29 December 2010 at 21:49 +0100, Martin v. Löwis wrote:

Of course, one may wonder why test_first_completed manages to create 41 SemLock objects, when all it tries to do is two future calls.

More numbers (on Linux):
- Queue: 3 SemLock
- Condition: 4 SemLock
- Event: 5 SemLock
- Call (test_concurrent_futures): 10 SemLock (2 Event)
- ProcessPoolExecutor: 11 SemLock (2 Queue, 1 Condition)

FreeBSD 7.2 is limited to 30 semaphores, so with a ProcessPoolExecutor you can only create *one* Call object, whereas some tests use 4 Call objects or more.

Victor

Great detective work! This would suggest that ProcessPoolExecutors are usable on FreeBSD 7.2 so long as the user doesn't create more than two at once (which probably isn't a big deal for most apps). So skipping the test is probably the way to go.

Cheers,
Brian

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
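The arithmetic behind Victor's figures can be checked directly. This is just a restatement of the numbers quoted above; the constant names are mine, not from the test suite:

```python
# Back-of-the-envelope check of the SemLock budget described above.
# The counts come from Victor's measurements on Linux.
SEM_LIMIT = 30            # FreeBSD 7.2 POSIX semaphore limit
EXECUTOR_SEMLOCKS = 11    # ProcessPoolExecutor: 2 Queues + 1 Condition
CALL_SEMLOCKS = 10        # one test Call object: 2 Events

remaining = SEM_LIMIT - EXECUTOR_SEMLOCKS
calls_that_fit = remaining // CALL_SEMLOCKS
print(calls_that_fit)  # -> 1: only one Call fits alongside one executor
```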
Re: [Python-Dev] Issue #10348: concurrent.futures doesn't work on BSD
On Dec 29, 2010, at 12:49 PM, Martin v. Löwis wrote:

If the functionality is not supported then users get an import error (within multiprocessing). However, RDM's understanding is correct, and the test is creating more than supported. Hmm. The tests do the absolute minimum stuff that exercises the code; doing anything less, and they would be useless. Of course, one may wonder why test_first_completed manages to create 41 SemLock objects, when all it tries to do is two future calls.

[Brian:]
I actually think that my tests may be overdone: in order to probe for specific race conditions they use a lot of locks to force calls to complete in a specific order. I'm thinking about paring the tests down to demonstrate only basic correctness. This should fix the tests on FreeBSD and Windows. Then, when Python 3.2 is released, I can gradually introduce more comprehensive tests while ensuring that I keep the buildbots green on all supported platforms. Thoughts?

Cheers,
Brian

[Martin, continued:]
So if the minimal test case fails, I'd claim that the module doesn't work on FreeBSD, period. ISTM that POSIX IPC is just not a feasible approach to IPC synchronization on FreeBSD, so it's better to say that multiprocessing is not supported on FreeBSD (until SysV IPC is used, in the hope that it fares better).

Regards,
Martin
Re: [Python-Dev] futures API
On Dec 11, 2010, at 6:44 AM, Thomas Nagy wrote:

On Fri, 10 Dec 2010, Brian Quinlan wrote:

On Dec 10, 2010, at 10:51 AM, Thomas Nagy wrote:

On Fri, 10 Dec 2010, Brian Quinlan wrote:

On Dec 10, 2010, at 5:36 AM, Thomas Nagy wrote:

I have a process running for a long time, and which may use futures with different max_workers counts. I think it is not too far-fetched to create a new futures object each time. Yet, the execution becomes slower after each call, for example with http://freehackers.org/~tnagy/futures_test.py:

    import concurrent.futures
    from queue import Queue
    import datetime

    class counter(object):
        def __init__(self, fut):
            self.fut = fut

        def run(self):
            def look_busy(num, obj):
                tot = 0
                for x in range(num):
                    tot += x
                obj.out_q.put(tot)

            start = datetime.datetime.utcnow()
            self.count = 0
            self.out_q = Queue(0)
            for x in range(1000):
                self.count += 1
                self.fut.submit(look_busy, self.count, self)
            while self.count:
                self.count -= 1
                self.out_q.get()
            delta = datetime.datetime.utcnow() - start
            print(delta.total_seconds())

    fut = concurrent.futures.ThreadPoolExecutor(max_workers=20)
    for x in range(100):
        # comment the following line
        fut = concurrent.futures.ThreadPoolExecutor(max_workers=20)
        c = counter(fut)
        c.run()

The runtime grows after each step:

    0.216451
    0.225186
    0.223725
    0.74
    0.230964
    0.240531
    0.24137
    0.252393
    0.249948
    0.257153
    ...

Is there a mistake in this piece of code?

[Brian:]
There is no mistake that I can see, but I suspect that the circular references that you are building are causing the ThreadPoolExecutor to take a long time to be collected. Try adding:

    c = counter(fut)
    c.run()
    + fut.shutdown()

Even if that fixes your problem, I still don't fully understand this, because I would expect the runtime to fall after a while as ThreadPoolExecutors are collected. 
[Thomas:]
The shutdown call is indeed a good fix :-) Here is the time response of the calls to counter() when shutdown is not called: http://www.freehackers.org/~tnagy/runtime_futures.png

[Brian:]
FWIW, I think that you are confusing the term "future" with "executor". A future represents a single work item. An executor creates futures and schedules their underlying work.

[Thomas:]
Ah yes, sorry. I have also realized that the executor is not the killer feature I was expecting; it can only replace a little part of the code I have: controlling the exceptions and the workflow is the most complicated part. I have also observed a minor performance degradation with the executor replacement (3 seconds for 5000 work items). The amount of work items processed per unit of time does not seem to be a straight line: http://www.freehackers.org/~tnagy/runtime_futures_2.png .

[Brian:]
That looks pretty linear to me.

[Thomas:]
Out of curiosity, what is the _thread_references for?

[Brian:]
There is a big comment above it in the code:

    # Workers are created as daemon threads. This is done to allow the
    # interpreter to exit when there are still idle threads in a
    # ThreadPoolExecutor's thread pool (i.e. shutdown() was not called).
    # However, allowing workers to die with the interpreter has two
    # undesirable properties:
    #   - The workers would still be running during interpreter shutdown,
    #     meaning that they would fail in unpredictable ways.
    #   - The workers could be killed while evaluating a work item, which
    #     could be bad if the callable being evaluated has external
    #     side-effects, e.g. writing to a file.
    #
    # To work around this problem, an exit handler is installed which tells
    # the workers to exit when their work queues are empty and then waits
    # until the threads finish.

    _thread_references = set()
    _shutdown = False

    def _python_exit():
        global _shutdown
        _shutdown = True
        for thread_reference in _thread_references:
            thread = thread_reference()
            if thread is not None:
                thread.join()

Is it still unclear why it is there? 
Maybe you could propose some additional documentation.

Cheers,
Brian

[Thomas:]
The source file for the example is in: http://www.freehackers.org/~tnagy/futures_test3.py
The diagram was created by: http://www.freehackers.org/~tnagy/futures_test3.plot

Thomas
Re: [Python-Dev] futures API
On Dec 11, 2010, at 6:33 PM, Nick Coghlan wrote:

On Sun, Dec 12, 2010 at 6:53 AM, Brian Quinlan br...@sweetapp.com wrote:

Is it still unclear why it is there? Maybe you could propose some additional documentation.

Did you get my question the other day as to whether a weakref.WeakKeySet might be a better choice? I believe you would be able to get rid of the periodic sweep for dead references if you did that, and I didn't spot any obvious downsides.

[Brian:]
No I didn't, sorry! Could you resend it if it has more detail than the paragraph above?

Cheers,
Brian

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] futures API
Oops. I accidentally replied off-list:

On Dec 10, 2010, at 5:36 AM, Thomas Nagy wrote:

On Thu, 9 Dec 2010, Brian Quinlan wrote:

On Dec 9, 2010, at 4:26 AM, Thomas Nagy wrote:

I am looking forward to replacing a piece of code (http://code.google.com/p/waf/source/browse/trunk/waflib/Runner.py#86 ) with the futures module which was announced in the Python 3.2 beta. I am a bit stuck with it, so I have a few questions about the futures:

1. Is the futures API frozen?

Yes.

2. How hard would it be to return the tasks processed in an output queue to process/consume the results while they are returned? The code does not seem to be very open for monkey patching.

You can associate a callback with a submitted future. That callback could add the future to your queue.

Ok, it works. I was thinking the object was cleaned up immediately after it was used.

3. How hard would it be to add new tasks dynamically (after a task is executed) and have the futures object never complete?

I'm not sure that I understand your question. You can submit new work to an Executor at any time until it is shut down, and a work item can take as long to complete as you want. If you are contemplating tasks that don't complete then maybe you would be better off just scheduling a thread.

4. Is there a performance evaluation of the futures code (execution overhead)?

No. Scott Dial did make some performance improvements, so he might have a handle on its overhead.

Ok. I have a process running for a long time, and which may use futures with different max_workers counts. I think it is not too far-fetched to create a new futures object each time. 
Yet, the execution becomes slower after each call, for example with http://freehackers.org/~tnagy/futures_test.py :

    [quoted futures_test.py example and timings snipped; see the copy elsewhere in this thread]

Is there a mistake in this piece of code?

There is no mistake that I can see, but I suspect that the circular references that you are building are causing the ThreadPoolExecutor to take a long time to be collected. Try adding:

    c = counter(fut)
    c.run()
    + fut.shutdown()

Even if that fixes your problem, I still don't fully understand these numbers, because I would expect the runtime to fall after a while as ThreadPoolExecutors are collected.

Cheers,
Brian

Thanks,
Thomas
Re: [Python-Dev] futures API
On Dec 10, 2010, at 10:51 AM, Thomas Nagy wrote:

On Fri, 10 Dec 2010, Brian Quinlan wrote:

On Dec 10, 2010, at 5:36 AM, Thomas Nagy wrote:

I have a process running for a long time, and which may use futures with different max_workers counts. I think it is not too far-fetched to create a new futures object each time. Yet, the execution becomes slower after each call, for example with http://freehackers.org/~tnagy/futures_test.py:

    [quoted futures_test.py example and timings snipped; see the copy elsewhere in this thread]

Is there a mistake in this piece of code?

There is no mistake that I can see, but I suspect that the circular references that you are building are causing the ThreadPoolExecutor to take a long time to be collected. Try adding:

    c = counter(fut)
    c.run()
    + fut.shutdown()

Even if that fixes your problem, I still don't fully understand this, because I would expect the runtime to fall after a while as ThreadPoolExecutors are collected.

[Thomas:]
The shutdown call is indeed a good fix :-) Here is the time response of the calls to counter() when shutdown is not called: http://www.freehackers.org/~tnagy/runtime_futures.png

[Brian:]
FWIW, I think that you are confusing the term "future" with "executor". A future represents a single work item. 
An executor creates futures and schedules their underlying work.

Hmm, that is very suspicious: it looks like the ThreadPoolExecutors are not being collected. If you are feeling bored you could figure out why not :-)

[Thomas:]
After trying to stop the program by using CTRL+C, the following error may appear, after which the process cannot be interrupted:

    19:18:12 /tmp/build python3.2 futures_test.py
    0.389657
    0.417173
    0.416513
    0.421424
    0.449666
    0.482273
    ^CTraceback (most recent call last):
      File "futures_test.py", line 36, in <module>
        c.run()
      File "futures_test.py", line 22, in run
        self.fut.submit(look_busy, self.count, self)
      File "/usr/local/lib/python3.2/concurrent/futures/thread.py", line 114, in submit
        self._work_queue.put(w)
      File "/usr/local/lib/python3.2/queue.py", line 135, in put
        self.not_full.acquire()
    KeyboardInterrupt

It is not expected, is it?

[Brian:]
It isn't surprising. Python lock acquisitions are not interruptible, and any time you interrupt a program that manipulates locks you may kill the code that was going to cause the lock to be released.

Cheers,
Brian

Thomas
Re: [Python-Dev] futures API
On Dec 10, 2010, at 11:39 AM, Thomas Nagy wrote:

On Fri, 10 Dec 2010, Thomas Nagy wrote:

On Fri, 10 Dec 2010, Brian Quinlan wrote:

On Dec 10, 2010, at 5:36 AM, Thomas Nagy wrote:

I have a process running for a long time, and which may use futures with different max_workers counts. I think it is not too far-fetched to create a new futures object each time.

    [quoted futures_test.py example, timings, and the suggested fut.shutdown() fix snipped; see the copies elsewhere in this thread]

The shutdown call is indeed a good fix :-) Here is the time response of the calls to counter() when shutdown is not called: http://www.freehackers.org/~tnagy/runtime_futures.png

After trying to stop the program by using CTRL+C, the following error may appear, after which the process cannot be interrupted:

    [quoted KeyboardInterrupt traceback snipped; see the copy elsewhere in this thread]

It is not expected, is it?

The problem also occurs when using a callback: http://www.freehackers.org/~tnagy/futures_test2.py

If it is necessary to catch KeyboardInterrupt exceptions to cancel the futures execution, then how about adding this detail to the docs?

[Brian:]
AFAIK, catching KeyboardInterrupt exceptions is not sufficient.

Cheers,
Brian

Thomas
Re: [Python-Dev] futures API
On Dec 9, 2010, at 4:26 AM, Thomas Nagy wrote:

Hello,

I am looking forward to replacing a piece of code (http://code.google.com/p/waf/source/browse/trunk/waflib/Runner.py#86 ) with the futures module which was announced in the Python 3.2 beta. I am a bit stuck with it, so I have a few questions about the futures:

1. Is the futures API frozen?

Yes.

2. How hard would it be to return the tasks processed in an output queue to process/consume the results while they are returned? The code does not seem to be very open for monkey patching.

You can associate a callback with a submitted future. That callback could add the future to your queue.

3. How hard would it be to add new tasks dynamically (after a task is executed) and have the futures object never complete?

I'm not sure that I understand your question. You can submit new work to an Executor at any time until it is shut down, and a work item can take as long to complete as you want. If you are contemplating tasks that don't complete then maybe you would be better off just scheduling a thread.

4. Is there a performance evaluation of the futures code (execution overhead)?

No. Scott Dial did make some performance improvements, so he might have a handle on its overhead.

Cheers,
Brian

Thanks,
Thomas
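The callback-plus-queue approach suggested above can be sketched in a few lines: a done-callback pushes each completed future onto a queue, so results can be consumed as they finish rather than in submission order. The variable names are illustrative:

```python
import concurrent.futures
import queue

done = queue.Queue()

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as ex:
    for n in (1, 2, 3):
        f = ex.submit(pow, n, 2)
        # add_done_callback calls its argument with the finished future,
        # so Queue.put works directly as the callback.
        f.add_done_callback(done.put)

# Consume completed futures as they arrive on the queue.
results = sorted(done.get().result() for _ in range(3))
print(results)  # -> [1, 4, 9]
```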
Re: [Python-Dev] futures API
On Dec 9, 2010, at 2:39 PM, Raymond Hettinger wrote:

On Dec 9, 2010, at 9:02 AM, Brian Quinlan wrote:

On Dec 9, 2010, at 4:26 AM, Thomas Nagy wrote:

Hello, I am looking forward to replacing a piece of code (http://code.google.com/p/waf/source/browse/trunk/waflib/Runner.py#86 ) with the futures module which was announced in the Python 3.2 beta. I am a bit stuck with it, so I have a few questions about the futures:

1. Is the futures API frozen?

Yes.

[Raymond:]
Yes, unless the current API is defective in some way. A beta1 release is a chance for everyone to exercise the new API and discover whether it is problematic in any real world applications.

2. How hard would it be to return the tasks processed in an output queue to process/consume the results while they are returned? The code does not seem to be very open for monkey patching.

You can associate a callback with a submitted future. That callback could add the future to your queue.

[Raymond:]
That would be a good example for the docs.

[Brian:]
I don't know what Thomas's use case is, but I expect that taking the result of a future and asynchronously sticking it in another queue is not typical.

Cheers,
Brian

3. How hard would it be to add new tasks dynamically (after a task is executed) and have the futures object never complete?

I'm not sure that I understand your question. You can submit new work to an Executor at any time until it is shut down, and a work item can take as long to complete as you want. If you are contemplating tasks that don't complete then maybe you would be better off just scheduling a thread.

4. Is there a performance evaluation of the futures code (execution overhead)?

No. Scott Dial did make some performance improvements, so he might have a handle on its overhead.

[Raymond:]
FWIW, the source code is short and readable. From my quick read, it looks to be a relatively thin wrapper/adapter around existing tools. Most of the work still gets done by the threads or processes themselves. 
Think of this as a cleaner, more centralized API around the current toolset -- there is no deep new technology under the hood.

Raymond
Re: [Python-Dev] PEP 3148 ready for pronouncement [ACCEPTED]
On 13 Jul 2010, at 00:59, Titus von der Malsburg wrote:

On Tue, Jul 13, 2010 at 12:48:35AM +1000, Nick Coghlan wrote:

On Tue, Jul 13, 2010 at 12:19 AM, Titus von der Malsburg [wrote:]

That's what actually happens, so you can code it either way.

That's great! None of the examples I found used the pythonic exception style, which is why I assumed that checking the return value was the only possibility. Reading the PEP carefully would have helped. :-)

[Brian:]
I'd add that it would feel more natural to me to write:

      try:
          print('%r page is %d bytes' % (url, len(future.result())))
    - except FutureError:
    -     print('%r generated an exception: %s' % (url, future.exception()))
    + except FutureError as e:
    +     print('%r generated an exception: %s' % (url, e))

Cheers,
Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 28, 2010, at 1:39 PM, Scott Dial wrote:

On 5/27/2010 4:13 AM, Brian Quinlan wrote:

On 27 May 2010, at 17:53, Floris Bruynooghe wrote:

On Thu, May 27, 2010 at 01:46:07PM +1200, Greg Ewing wrote:

On 27/05/10 00:31, Brian Quinlan wrote:

You have two semantic choices here:
1. let the interpreter exit with the future still running
2. wait until the future finishes and then exit

I'd go for (1). I don't think it's unreasonable to expect a program that wants all its tasks to finish to explicitly wait for that to happen.

I'd go for (1) as well; it's no more than reasonable that if you want a result you wait for it. And I dislike libraries doing magic you can't see; I'd prefer it if I explicitly had to shut a pool down. I'm glad I'm not alone in preferring (1) though.

Keep in mind that this library magic is consistent with the library magic that the threading module does: unless the user sets Thread.daemon to True, the interpreter does *not* exit until the thread does.

[Scott:]
Given your rationale, I don't understand from the PEP:

    shutdown(wait=True)

    Signal the executor that it should free any resources that it is
    using when the currently pending futures are done executing. Calls
    to Executor.submit and Executor.map made after shutdown will raise
    RuntimeError.

    If wait is True then the executor will not return until all the
    pending futures are done executing and the resources associated
    with the executor have been freed.

Can you tell me what is the expected execution time of the following:

    executor = ThreadPoolExecutor(max_workers=1)
    executor.submit(lambda: time.sleep(1000))
    executor.shutdown(wait=False)
    sys.exit(0)

I believe it's 1000 seconds, which seems to defy my request of shutdown(wait=False) because secretly the Python exit is going to wait anyway.

[Brian:]
It would take 1000 seconds. "...then the executor will not return..." should read "...then the method will not return".

[Scott:]
ISTM it is much easier to get behavior #2 if you have behavior #1, and it would also seem rather trivial to make ThreadPoolExecutor take an optional argument specifying which behavior you want.

[Brian:]
Adding a daemon option would be reasonable. If you don't shut down your executors, you are pretty much guaranteed to get random traceback output on exit, though.

[Scott:]
Your reference implementation does not actually implement the specification given in the PEP, so it's quite impossible to check this myself. There is no wait=True option for shutdown() in the reference implementation, so I can only guess what that implementation might look like.

[Brian:]
Look at around line 129 in: http://code.google.com/p/pythonfutures/source/browse/branches/feedback/python3/futures/thread.py

Cheers,
Brian
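The distinction being drawn here ("the *method* returns immediately, but Python's exit still waits for pending work") can be demonstrated with a short timing experiment. This is a sketch of the observable behavior, not of the implementation:

```python
import time
from concurrent.futures import ThreadPoolExecutor

ex = ThreadPoolExecutor(max_workers=1)
ex.submit(time.sleep, 0.5)          # a work item that runs for ~0.5s

t0 = time.monotonic()
ex.shutdown(wait=False)             # the *method* returns right away
returned_after = time.monotonic() - t0

ex.shutdown(wait=True)              # this call blocks until the sleep ends
blocked_for = time.monotonic() - t0
print(returned_after, blocked_for)
```

The same wait would happen implicitly at interpreter exit via the module's exit handler, which is exactly Scott's point about `sys.exit(0)` not being instantaneous.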
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 27, 2010, at 1:21 PM, Greg Ewing wrote:

On 27/05/10 12:04, Jesse Noller wrote:

Namespaces are only a honking great idea if you actually let them do the job they're designed for. concurrent.* is the namespace, futures is the package within the namespace; concurrent.futures is highly descriptive of the items contained therein.

I was referring to the issue of ThreadPool vs. ThreadPoolExecutor etc. By your own argument above, concurrent.futures.ThreadPool is quite descriptive enough of what it provides. It's not a problem if some other module also provides something called a ThreadPool.

[Brian:]
I think that the Executor suffix is a good indicator of the interface being provided. Pool is not, because you could have Executor implementations that don't involve pools.

Cheers,
Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 27 May 2010, at 17:53, Floris Bruynooghe wrote:

On Thu, May 27, 2010 at 01:46:07PM +1200, Greg Ewing wrote:

On 27/05/10 00:31, Brian Quinlan wrote:

You have two semantic choices here:
1. let the interpreter exit with the future still running
2. wait until the future finishes and then exit

I'd go for (1). I don't think it's unreasonable to expect a program that wants all its tasks to finish to explicitly wait for that to happen.

I'd go for (1) as well; it's no more than reasonable that if you want a result you wait for it. And I dislike libraries doing magic you can't see; I'd prefer it if I explicitly had to shut a pool down. And yes, if you shut the interpreter down while threads are running they sometimes wake up at the wrong time to find the world around them destroyed. But that's part of programming with threads, so it's not like the futures lib suddenly makes things behave differently. I'm glad I'm not alone in preferring (1) though.

[Brian:]
Keep in mind that this library magic is consistent with the library magic that the threading module does: unless the user sets Thread.daemon to True, the interpreter does *not* exit until the thread does.

Cheers,
Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 28 May 2010, at 09:18, Greg Ewing wrote:

Brian Quinlan wrote:

I think that the Executor suffix is a good indicator of the interface being provided.

It's not usually considered necessary for the name of a type to indicate its interface. We don't have 'listsequence' and 'dictmapping', for example. I think what bothers me most about these names is their longwindedness. Two parts to a name is okay, but three or more starts to sound pedantic. And for me, Pool is a more important piece of information than Executor. The fact that it manages a pool is the main reason I'd use such a module rather than just spawning a thread myself for each task.

[Brian:]
Actually, an executor implementation that created a new thread per task would still be useful: it would save you the hassle of developing a mechanism to wait for the thread to finish and to collect the results. We actually have such an implementation at Google and it is quite popular.

Cheers,
Brian
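A thread-per-task executor of the kind mentioned above can be sketched in a few lines against the public Future API. This is a hedged illustration of the idea, not the Google implementation or anything in the stdlib; the class name is mine:

```python
import threading
from concurrent.futures import Executor, Future

class ThreadPerTaskExecutor(Executor):
    """Executor sketch: spawn one new thread per submitted task."""

    def submit(self, fn, *args, **kwargs):
        future = Future()

        def run():
            # Honor cancellation requested before the thread got going.
            if not future.set_running_or_notify_cancel():
                return
            try:
                future.set_result(fn(*args, **kwargs))
            except BaseException as e:
                future.set_exception(e)

        threading.Thread(target=run).start()
        return future

f = ThreadPerTaskExecutor().submit(pow, 2, 10)
print(f.result())  # -> 1024
```

The point Brian makes holds here: even without a pool, the Future gives callers result collection, exception propagation, and waiting for free.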
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 28, 2010, at 11:57 AM, Reid Kleckner wrote:

On Thu, May 27, 2010 at 4:13 AM, Brian Quinlan br...@sweetapp.com wrote:

Keep in mind that this library magic is consistent with the library magic that the threading module does: unless the user sets Thread.daemon to True, the interpreter does *not* exit until the thread does.

Is there a compelling reason to make the threads daemon threads? If not, perhaps they can just be normal threads, and you can rely on the threading module to wait for them to finish.

[Brian:]
Did you read my explanation of the reasoning behind my approach?

Cheers,
Brian

[Reid:]
Unrelatedly, I feel like this behavior of waiting for the thread to terminate usually manifests as deadlocks when the main thread throws an uncaught exception. The application then no longer responds properly to interrupts, since it's stuck waiting on a semaphore. I guess it's better than the alternative of random crashes when daemon threads wake up during interpreter shutdown, though.

Reid
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 26 May 2010, at 18:44, Stephen J. Turnbull wrote:

Nick Coghlan writes:

On 26/05/10 13:51, Stephen J. Turnbull wrote:

People have been asking what's special about this module, to violate the BCP principle? There's nothing special about the fact that several people would use a robust and debugged futures module if it were in the stdlib. That's true of *every* module that is worth a PEP.

The trick with futures and executor pools is that they're a *better* way of programming with threads in many cases.

and

However, given the choices of [...], I'll choose the first option every time, and my programs will be the worse for it.

Again, nothing all that special about those; lots of proposed changes satisfy similar conditions. I don't think anyone denies the truth or applicability of those arguments. But are they enough? Really, what you're arguing is "now is better than never". Indeed, that is so. But you shouldn't forget that it is immediately followed by "although never is often better than *right* now".

[Brian:]
I've been trying to stay out of the meta-discussions, but *right* now would be 6 months if it applies in this context. If that is what *right* now means to you, then I hope that I never have a heart attack in your presence and need an ambulance *right* now :-)

Cheers,
Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 26 May 2010, at 18:09, Glyph Lefkowitz wrote: On May 24, 2010, at 5:36 AM, Brian Quinlan wrote: On May 24, 2010, at 5:16 AM, Glyph Lefkowitz wrote: On May 23, 2010, at 2:37 AM, Brian Quinlan wrote: On May 23, 2010, at 2:44 PM, Glyph Lefkowitz wrote: ProcessPoolExecutor has the same serialization perils that multiprocessing does. My original plan was to link to the multiprocessing docs to explain them but I couldn't find them listed. Linking to the pickle documentation might be a good start. Will do. Yes, the execution context is Executor-dependent. The section under ProcessPoolExecutor and ThreadPoolExecutor spells this out, I think. I suppose so. I guess I'm just looking for more precise usage of terminology. (This is a PEP, after all. It's a specification that multiple VMs may have to follow, not just some user documentation for a package, even if they'll *probably* be using your code in all cases.) I'd be happier if there were a clearer term than calls for the things being scheduled (submissions?), since the done callbacks aren't called in the subprocess for ProcessPoolExecutor, as we just discussed. Sure. Really, almost any contract would work, it just needs to be spelled out. It might be nice to know whether the thread invoking the callbacks is a daemon thread or not, but I suppose it's not strictly necessary. Your concern is that the thread will be killed when the interpreter exits? It won't be. Good to know. Tell it to the PEP though, not me ;). Will do. No reaction on [invoker vs. future]? I think you'll wish you did this in a couple of years when you start bumping into application code that calls set_result :). My reactions are mixed ;-) Well, you are not obliged to take my advice, as long as I am not obliged to refrain from mocking you mercilessly if it happens that I was right in a couple of years ;-). I was looking for your reasoning rather than trying to negotiate the circumstances under which you would mock me.
Your proposal is to add a level of indirection to make it harder for people to call implementation methods. The downside is that it makes it a bit harder to write tests and Executors. Both tests and executors will still create and invoke methods directly on one object; the only additional difficulty seems to be the need to type '.future' every so often on the executor/testing side of things, and that seems a cost well worth paying to avoid confusion over who is allowed to call those methods and when. I also can't see a big problem in letting people call set_result in client code though it is documented as being only for Executor implementations and tests. On the implementation side, I don't see why an Invoker needs a reference to the future. Well, uh...

    class Invoker(object):
        def __init__(self):
            """Should only be called by Executor implementations."""
            self.future = Future()

^ this is what I'd call a reference to the future I said exactly the opposite of what I meant: futures don't need a reference to the invoker. Cheers, Brian
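For readers following the invoker-vs.-future debate, here is a minimal, self-contained sketch of the indirection Glyph proposes. The names Invoker, .future, and set_result come from the thread itself, but the method bodies are illustrative guesses, not the PEP's reference implementation: the executor holds the Invoker (the only object exposing set_result), while client code sees only the Future.

```python
import threading


class Future:
    """Client-facing result holder; no public way to set the result."""

    def __init__(self):
        self._done = threading.Event()
        self._result = None

    def result(self, timeout=None):
        # Block until the executor completes the future via its Invoker.
        if not self._done.wait(timeout):
            raise TimeoutError("future did not complete in time")
        return self._result


class Invoker:
    """Executor-facing setter; holds the only path to complete the Future."""

    def __init__(self):
        self.future = Future()

    def set_result(self, result):
        self.future._result = result
        self.future._done.set()


# Executor side: create an invoker, hand out invoker.future to the client.
invoker = Invoker()
invoker.set_result(42)
print(invoker.future.result())  # 42
```

With this split, client code that holds only the Future has no set_result to call by accident, which is the confusion the indirection is meant to prevent.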
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 26, 2010, at 8:57 PM, Greg Ewing wrote: Having read through the PEP again, here are my thoughts. * I'm bothered by the term future. To my mind, it's too long on cleverness and too short on explanativeness. I think that the standard library is no place for cuteness of naming. The name of a stdlib module should reflect its functionality in some straightforward and obvious way. If I were looking for a thread pool or process pool implementation, the word future is not something that would spring readily to mind. The stated purpose of the module is to execute computations asynchronously, so perhaps a name such as asyntask would be appropriate, following the pattern of existing modules dealing with asynchronous matters, asyncore and asynchat. For the Future object itself, I'd suggest something like Task or Job. future is a computing science term of art, like thread. Anyway, this has been discussed in the past and Guido was happy with the name. * It seems unnecessarily verbose to tack Executor onto the end of every Executor subclass. They could simply be called ThreadPool and ProcessPool without losing anything. You could have general thread pools that aren't related to executors (actually, it would be great if Python had a good built-in thread pool implementation) and I'd like to avoid using an overly generic name. * I don't see a strong reason to put this module inside a newly-created namespace. If there were a namespace called concurrent, I would expect to find other existing concurrency-related modules there as well, such as threading and multiprocessing. But we can't move them there without breaking existing code. I think that Jesse was planning to add some functionality to this namespace. I don't really have an opinion on this.
(More generally, I'm inclined to think that introducing a namespace package for a category of modules having existing members in the stdlib is an anti-pattern, unless it's done during the kind of namespace refactoring that we won't get another chance to perform until Py4k.) Concerning the structure of the PEP: * A section titled 'Specification' should *not* start with a bunch of examples. It may be appropriate to include short examples *following* items in the specification in order to illustrate the features concerned. Extended examples such as these belong in a section of their own. I thought that the specification would be difficult to follow without examples to pave the way. Anyone else have an opinion on this? * I found the examples included to be rather difficult to follow, and they served more to confuse than elucidate. I think this is partly because they are written in a fairly compressed style, burying important things being illustrated inside complicated expressions. Rewriting them in a more straightforward style might help. Do you think starting with a simpler example would help? I think that idiomatic future use will end up looking similar to my examples. If that is too complex for most users then we have a problem. Concerning details of the specification: * Is it possible to have more than one Executor active at a time? Of course. The fact that as_completed() is a module-level function rather than an Executor method suggests that it is, but it would be good to have this spelled out one way or the other in the PEP. I'll add a note to the global functions that they can accept futures from different executors in the same call.
Cheers, Brian -- Greg
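As a concrete illustration of the point above (module-level functions accepting futures from more than one executor), here is a short example written against the concurrent.futures API as it eventually shipped; as_completed() happily mixes futures created by two distinct executors:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(x):
    return x * x


# Two independent executors; their futures can be mixed in one
# as_completed() call because it is a module-level function.
with ThreadPoolExecutor(max_workers=2) as a, ThreadPoolExecutor(max_workers=2) as b:
    futures = [a.submit(square, 1), b.submit(square, 2)]
    results = sorted(f.result() for f in as_completed(futures))

print(results)  # [1, 4]
```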
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 26 May 2010, at 22:06, Floris Bruynooghe wrote: Hi On Sun, May 23, 2010 at 10:47:08AM +1000, Brian Quinlan wrote: Jesse, the designated pronouncer for this PEP, has decided to keep discussion open for a few more days. So fire away! In thread.py the module automatically registers a handler with atexit. I don't think I'm alone in thinking libraries should not be doing this sort of thing unconditionally behind a user's back. I'm also not so sure how comfortable I am with the module-level globals. Would it not be possible to have an exit handler on each thread pool which the documentation recommends you register with atexit if it suits your application? I think that would get rid of the global singletons and hidden atexit in a fairly elegant way. First let me explain why I install an atexit handler. Imagine that you write a script like this:

    t = ThreadPoolExecutor(1)
    t.submit(lambda url: print(urllib.request.urlopen(url).read()), 'http://www.apple.com/')

You have two semantic choices here: 1. let the interpreter exit with the future still running 2. wait until the future finishes and then exit I chose (2) but can be convinced otherwise. The obvious way to accomplish this is to make the worker thread non-daemon so the interpreter won't exit while it is running. But since the worker thread is part of a pool, it won't stop while its executor is alive. So my approach was to make worker threads daemon and install an atexit handler that sets a global indicating that the interpreter is exiting so any workers should exit when their work queues are empty. It then calls join on each worker thread so the interpreter will not exit until they are finished. I think that this approach is reasonable assuming that you want (2). I also don't have the aversion to globals that you do :-) Lastly _base.py creates a LOGGER (connected to sys.stderr if I understand correctly) and only logs a critical message to it at the same time as a RuntimeError is raised.
While I don't necessarily dislike that it uses a logger, I don't like that it's wired up to sys.stderr; I rather think it's the application's duty to create a handler if it wants one. But given that it's only used at the same time as a RuntimeError it does seem redundant. The LOGGER is only used for impossible exceptions (exceptions in the machinery of the module itself) that won't be propagated because they occur in worker threads. Cheers, Brian Regards Floris PS: I've only looked at the threading part of the implementation. -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 26 May 2010, at 22:42, Nick Coghlan wrote: On 26/05/10 20:57, Greg Ewing wrote: Having read through the PEP again, here are my thoughts. * It seems unnecessarily verbose to tack Executor onto the end of every Executor subclass. They could simply be called ThreadPool and ProcessPool without losing anything. We would lose the ability to add general purpose thread and process pools under the obvious names later. * I don't see a strong reason to put this module inside a newly-created namespace. If there were a namespace called concurrent, I would expect to find other existing concurrency-related modules there as well, such as threading and multiprocessing. But we can't move them there without breaking existing code. (More generally, I'm inclined to think that introducing a namespace package for a category of modules having existing members in the stdlib is an anti-pattern, unless it's done during the kind of namespace refactoring that we won't get another chance to perform until Py4k.) _thread, threading, Queue and multiprocessing do likely belong here, but moving them isn't likely to be worth the pain. Does it help to know that at least Jesse and I (and probably others) would like to see concurrent.pool added eventually with robust general purpose ThreadPool and ProcessPool implementations? The specific reason the new package namespace was added was to help avoid confusion with stock market futures without using an unduly cumbersome module name, but I don't know how well the PEP explains that. It doesn't at all. Are these plans formalized anywhere that I can link to? Cheers, Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 26 May 2010, at 22:50, Antoine Pitrou wrote: On Wed, 26 May 2010 22:32:33 +1000 Nick Coghlan ncogh...@gmail.com wrote: Ha, I'm a bit surprised. Isn't it what futures already provides? (except that for some reason it insists on the SomeExecutor naming scheme) http://www.python.org/dev/peps/pep-3148/#processpoolexecutor Not really - a general purpose pool would be a lot more agnostic about how you give the pooled threads/processes work to do and get the results back. Executors are the kind of thing you would build on top of one though. If concurrent.pool was added, then the existing processing pools in multiprocessing and the executors in concurrent.futures would be the first use cases for it. I think I'm a bit ignorant, but how is the Executor abstraction (and its proposed implementations) not generic enough? You have a pool, submit one or several tasks, and can either repeatedly poll for completion or do a blocking wait. (after all, Glyph pointed out that it should be quite easy to wrap the resulting Futures into Deferred objects) Interesting. Executor.submit() returns a Future, which might not be useful in some ThreadPool fire-and-forget use cases but having them doesn't seem harmful. Java does take this approach and it gives you a lot more ways to customize the Executor thread pool i.e. the minimum number of threads running, the maximum number, the amount of time that a thread can be idle before it is killed, the queueing strategy to use (e.g. LIFO, FIFO, priority). Cheers, Brian
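To make Antoine's point concrete, submit-then-poll and submit-then-block are both expressible with the Executor API alone. The snippet below uses the concurrent.futures interface described in the PEP, with wait() and FIRST_COMPLETED for the blocking case and Future.done() for polling:

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def work(delay):
    time.sleep(delay)
    return delay


pool = ThreadPoolExecutor(max_workers=2)
fast = pool.submit(work, 0.01)
slow = pool.submit(work, 0.2)

# Blocking wait: returns as soon as at least one future finishes.
done, pending = wait([fast, slow], return_when=FIRST_COMPLETED)

# Polling: repeatedly check for completion instead of blocking.
while not slow.done():
    time.sleep(0.01)

pool.shutdown()
```

Both styles work on any Executor implementation, which is what makes the abstraction a plausible substrate for general-purpose pools.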
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 24, 2010, at 5:16 AM, Glyph Lefkowitz wrote: On May 23, 2010, at 2:37 AM, Brian Quinlan wrote: On May 23, 2010, at 2:44 PM, Glyph Lefkowitz wrote: On May 22, 2010, at 8:47 PM, Brian Quinlan wrote: Jesse, the designated pronouncer for this PEP, has decided to keep discussion open for a few more days. So fire away! As you wish! I retract my request ;-) May you get what you wish for, may you find what you are seeking :). The PEP should be consistent in its usage of terminology about callables. It alternately calls them callables, functions, and functions or methods. It would be nice to clean this up and be consistent about what can be called where. I personally like callables. Did you find the terminology confusing? If not then I propose not changing it. Yes, actually. Whenever I see references to the multiprocessing module, I picture a giant HERE BE (serialization) DRAGONS sign. When I saw that some things were documented as being functions, I thought that maybe there was intended to be a restriction like the these can only be top-level functions so they're easy for different executors to locate and serialize. I didn't realize that the intent was arbitrary callables until I carefully re-read the document and noticed that the terminology was inconsistent. ProcessPoolExecutor has the same serialization perils that multiprocessing does. My original plan was to link to the multiprocessing docs to explain them but I couldn't find them listed. But changing it in the user docs is probably a good idea. I like callables too. Great. Still, users will inevitably find the PEP and use it as documentation too. The execution context of callable code is not made clear. Implicitly, submit() or map() would run the code in threads or processes as defined by the executor, but that's not spelled out clearly. Any response to this bit? Did I miss something in the PEP? Yes, the execution context is Executor-dependent. 
The section under ProcessPoolExecutor and ThreadPoolExecutor spells this out, I think. More relevant to my own interests, the execution context of the callables passed to add_done_callback and remove_done_callback is left almost completely to the imagination. If I'm reading the sample implementation correctly, http://code.google.com/p/pythonfutures/source/browse/branches/feedback/python3/futures/process.py#241 , it looks like in the multiprocessing implementation, the done callbacks are invoked in a random local thread. The fact that they are passed the future itself *sort* of implies that this is the case, but the multiprocessing module plays fast and loose with object identity all over the place, so it would be good to be explicit and say that it's *not* a pickled copy of the future sitting in some arbitrary process (or even on some arbitrary machine). The callbacks will always be called in a thread other than the main thread in the process that created the executor. Is that a strong enough contract? Sure. Really, almost any contract would work, it just needs to be spelled out. It might be nice to know whether the thread invoking the callbacks is a daemon thread or not, but I suppose it's not strictly necessary. Your concern is that the thread will be killed when the interpreter exits? It won't be. This is really minor, I know, but why does it say NOTE: This method can be used to create adapters from Futures to Twisted Deferreds? First of all, what's the deal with NOTE; it's the only NOTE in the whole PEP, and it doesn't seem to add anything. This sentence would read exactly the same if that word were deleted. Without more clarity on the required execution context of the callbacks, this claim might not actually be true anyway; Deferred callbacks can only be invoked in the main reactor thread in Twisted. But even if it is perfectly possible, why leave so much of the adapter implementation up to the imagination?
If it's important enough to mention, why not have a reference to such an adapter in the reference Futures implementation, since it *should* be fairly trivial to write? I'm a bit surprised that this doesn't allow for better interoperability with Deferreds given this discussion: discussion snipped I did not communicate that well. As implemented, it's quite possible to implement a translation layer which turns a Future into a Deferred. What I meant by that comment was, the specification in the PEP was too loose to be sure that such a layer would work with arbitrary executors. For what it's worth, the Deferred translator would look like this, if you want to include it in the PEP (untested though, you may want to run it first):

    from twisted.internet.defer import Deferred
    from twisted.internet.reactor import callFromThread

    def future2deferred(future):
        d = Deferred()
        def invoke_deferred
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 23, 2010, at 2:44 PM, Glyph Lefkowitz wrote: On May 22, 2010, at 8:47 PM, Brian Quinlan wrote: Jesse, the designated pronouncer for this PEP, has decided to keep discussion open for a few more days. So fire away! As you wish! I retract my request ;-) The PEP should be consistent in its usage of terminology about callables. It alternately calls them callables, functions, and functions or methods. It would be nice to clean this up and be consistent about what can be called where. I personally like callables. Did you find the terminology confusing? If not then I propose not changing it. But changing it in the user docs is probably a good idea. I like callables too. The execution context of callable code is not made clear. Implicitly, submit() or map() would run the code in threads or processes as defined by the executor, but that's not spelled out clearly. More relevant to my own interests, the execution context of the callables passed to add_done_callback and remove_done_callback is left almost completely to the imagination. If I'm reading the sample implementation correctly, http://code.google.com/p/pythonfutures/source/browse/branches/feedback/python3/futures/process.py#241 , it looks like in the multiprocessing implementation, the done callbacks are invoked in a random local thread. The fact that they are passed the future itself *sort* of implies that this is the case, but the multiprocessing module plays fast and loose with object identity all over the place, so it would be good to be explicit and say that it's *not* a pickled copy of the future sitting in some arbitrary process (or even on some arbitrary machine). The callbacks will always be called in a thread other than the main thread in the process that created the executor. Is that a strong enough contract? This is really minor, I know, but why does it say NOTE: This method can be used to create adapters from Futures to Twisted Deferreds? 
First of all, what's the deal with NOTE; it's the only NOTE in the whole PEP, and it doesn't seem to add anything. This sentence would read exactly the same if that word were deleted. Without more clarity on the required execution context of the callbacks, this claim might not actually be true anyway; Deferred callbacks can only be invoked in the main reactor thread in Twisted. But even if it is perfectly possible, why leave so much of the adapter implementation up to the imagination? If it's important enough to mention, why not have a reference to such an adapter in the reference Futures implementation, since it *should* be fairly trivial to write? I'm a bit surprised that this doesn't allow for better interoperability with Deferreds given this discussion: At 02:36 PM 3/16/2010 -0700, Brian Quinlan wrote: From P.J Eby: On Mar 7, 2010, at 11:56 AM, P.J. Eby wrote: At 10:59 AM 3/7/2010 -0800, Jeffrey Yasskin wrote: Given a way to register on-done callbacks with the future, it would be straightforward to wait for a future without blocking, too. Yes, and with a few more additions besides that one, you might be on the way to an actual competitor for Deferreds. For example: retry support, chaining, logging, API for transparent result processing, coroutine support, co-ordination tools like locks, semaphores and queues, etc. OK, but let's just think about making the APIs compatible e.g. you have some code that uses Futures and now you want to integrate it with some code that uses Deferreds. I think Jeff's suggestion of having a completion callback on Futures would make it possible to write a Future-to-Deferred adapter. Is that correct? As long as the callback signature included a way to pass in an error, then yes, that'd probably be sufficient. If add_done_callback doesn't help with twisted interoperability then I'd suggest removing it to allow for something that may be more useful to be added later.
The fact that add_done_callback is implemented using a set is weird, since it means you can't add the same callback more than once. The set implementation also means that the callbacks get called in a semi-random order, potentially creating even _more_ hard-to-debug order of execution issues than you'd normally have with futures. And I think that this documentation will be unclear to a lot of novice developers: many people have trouble with the idea that a = Foo(); b = Foo(); a.bar_method != b.bar_method, but import foo_module; foo_module.bar_function == foo_module.bar_function. It's also weird that you can remove callbacks - what's the use case? Deferreds have no callback-removal mechanism and nobody has ever complained of the need for one, as far as I know. (But lots of people do add the same callback multiple times.) I suggest having add_done_callback, implementing it with a list so that callbacks are always invoked
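A sketch of the list-based alternative Glyph is suggesting, in which duplicate callbacks are allowed and callbacks run in the order they were added. This is an illustration of the proposed semantics only, not the PEP's implementation; the class name is made up and thread-safety is omitted for brevity.

```python
class CallbackFuture:
    """Done-callbacks kept in a list: duplicates allowed, FIFO invocation."""

    def __init__(self):
        self._callbacks = []
        self._result = None
        self._completed = False

    def add_done_callback(self, fn):
        if self._completed:
            fn(self)  # already done: invoke immediately in the caller's thread
        else:
            self._callbacks.append(fn)

    def set_result(self, result):
        self._result = result
        self._completed = True
        for fn in self._callbacks:  # in registration order, duplicates included
            fn(self)
```

Contrast with a set-based registry, where adding the same function twice is silently collapsed to one call and iteration order is unspecified.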
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 23, 2010, at 7:15 PM, geremy condra wrote: On Sun, May 23, 2010 at 2:37 AM, Brian Quinlan br...@sweetapp.com wrote: snip Finally, why isn't this just a module on PyPI? It doesn't seem like there's any particular benefit to making this a stdlib module and going through the whole PEP process - except maybe to prompt feedback like this :). We've already had this discussion before. Could you explain why this module should *not* be in the stdlib e.g. does it have significantly less utility than other modules in stdlib? Is it significantly higher risk? etc? Inclusion in the stdlib is the exception, not the rule, and every exception should be issued for a good reason. I'd like to know what that reason is in this case, This package eliminates the need to construct the boilerplate present in many Python applications i.e. a thread or process pool, a work queue and result queue. It also makes it easy to take an existing Python application that performs operations (e.g. IO operations) in sequence and execute them in parallel. The package provides common idioms for two existing modules i.e. multiprocessing offers map functionality while threading doesn't. Those idioms are well understood and already present in Java and C++. if only to get a clearer understanding of why the PEP was accepted. It hasn't been accepted. Issues like the ones I'm bringing up could be fixed pretty straightforwardly if it were just a matter of filing a bug on a small package, but fixing a stdlib module is a major undertaking. True but I don't think that is a convincing argument. A subset of the functionality provided by this module is already available in Java and C++ and (at least in Java) it is used extensively and without too much trouble. If there are implementation bugs then we can fix them just like we would with any other module. Guido made exactly the opposite argument during his keynote at PyCon. It seemed fairly reasonable at the time- why do you think it doesn't apply here?
Could you be a little more specific about Guido's argument at PyCon? Cheers, Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 23, 2010, at 7:54 PM, Lennart Regebro wrote: On Sun, May 23, 2010 at 11:39, Brian Quinlan br...@sweetapp.com wrote: This package eliminates the need to construct the boilerplate present in many Python applications i.e. a thread or process pool, a work queue and result queue. It also makes it easy to take an existing Python application that performs operations (e.g. IO operations) in sequence and execute them in parallel. The package provides common idioms for two existing modules i.e. multiprocessing offers map functionality while threading doesn't. Those idioms are well understood and already present in Java and C++. It can do that as a separate package as well. You could make the same argument about any module in the stdlib. And not only that, it could then be available on PyPI for earlier versions of Python as well, making it much more likely to gain widespread acceptance. I doubt it. Simple modules are unlikely to develop a following because it is too easy to partially replicate their functionality. urlparse and os.path are very useful modules but I doubt that they would have been successful on PyPI. Could you be a little more specific about Guido's argument at PyCon? A module in stdlib has to be dead. After it's included in the stdlib it can not go through any major changes since that would mean loss of backwards compatibility. The good news in this case is that the same API has been used successfully in Java and C++ for years so it is unlikely that any major changes will need to be made. Also, you can't fix bugs except by releasing new versions of Python. Therefore the API must be completely stable, and the product virtually bugfree before it should be in stdlib. The best way of ensuring that is to release it as a separate module on PyPI, and let it stabilize for a couple of years. Yeah but that model isn't likely to work with this package.
Cheers, Brian
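To illustrate the "sequential to parallel" idiom Brian describes above, here is a minimal example using Executor.map(); fetch() is a hypothetical stand-in for an IO-bound call such as urllib.request.urlopen, so the example stays self-contained:

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    # Stand-in for an IO-bound operation, e.g. urllib.request.urlopen(url).
    return len(url)


urls = ['http://a.example', 'http://bb.example']

# Sequential version:
sizes = [fetch(u) for u in urls]

# Parallel version: only the iteration strategy changes.
with ThreadPoolExecutor(max_workers=5) as pool:
    parallel_sizes = list(pool.map(fetch, urls))

assert sizes == parallel_sizes
```

The point is that no hand-rolled work queue, result queue, or pool plumbing is needed to make the sequential loop concurrent.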
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 23, 2010, at 8:43 PM, Robert Collins wrote: On Sun, May 23, 2010 at 10:15 PM, Brian Quinlan br...@sweetapp.com wrote: Also, you can't fix bugs except by releasing new versions of Python. Therefore the API must be completely stable, and the product virtually bugfree before it should be in stdlib. The best way of ensuring that is to release it as a separate module on PyPI, and let it stabilize for a couple of years. Yeah but that model isn't likely to work with this package. Cheers, Brian Forgive my ignorance, but why do you say that that model won't work with this package? As I said in my last message: Simple modules are unlikely to develop a following because it is too easy to partially replicate their functionality. urlparse and os.path are very useful modules but I doubt that they would have been successful on PyPI. Cheers, Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 23, 2010, at 8:43 PM, Dirkjan Ochtman wrote: On Sun, May 23, 2010 at 12:15, Brian Quinlan br...@sweetapp.com wrote: I doubt it. Simple modules are unlikely to develop a following because it is too easy to partially replicate their functionality. urlparse and os.path are very useful modules but I doubt that they would have been successful on PyPI. simplejson was also fairly simple, but still developed a following. The API is simple but writing a JSON parser is hard enough that people will check to see if someone has already done the work for them (especially since JSON is fairly topical). If you are familiar with threads then writing a good enough solution without futures probably won't take you very long. Also, unless you are familiar with another futures implementation, you aren't likely to know where to look. The good news in this case is that the same API has been used successfully in Java and C++ for years so it is unlikely that any major changes will need to be made. I would agree that having prior versions in other languages should make the API more stable, but I wouldn't agree that it doesn't need changes (and even minor changes can be a PITA in the stdlib). Some changes are hard (i.e. changing the semantics of an existing method) but some are pretty easy (i.e. adding new methods). Cheers, Brian Also, you can't fix bugs except by releasing new versions of Python. Therefore the API must be completely stable, and the product virtually bugfree before it should be in stdlib. The best way of ensuring that is to release it as a separate module on PyPI, and let it stabilize for a couple of years. Yeah but that model isn't likely to work with this package. Okay, I'll bite: why is your package different? In general, this reminds me of the ipaddr discussions.
I read through the thread from March real quick to see if there was reasoning there why this package should be an exception from the normal standards track (that is, ripen on PyPI, then moving it in the stdlib when it's mature -- where mature is another word for dead, really). But then this is just another instance of the fat-stdlib vs lean-stdlib discussion, I guess, so we can go on at length.
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 23 May 2010, at 21:17, Lennart Regebro wrote: On Sun, May 23, 2010 at 12:15, Brian Quinlan br...@sweetapp.com wrote: You could make the same argument about any module in the stdlib. Yeah, and that's exactly what I did. I doubt it. Simple modules are unlikely to develop a following because it is too easy to partially replicate their functionality. urlparse and os.path are very useful modules but I doubt that they would have been successful on PyPI. Are you saying your proposed module is so simple that anyone can easily replicate it with just a couple of lines of code? Parts of it, yes. Just like I can replace most operations in os.path and urlparse with a few lines of code. The good news in this case is that the same API has been used successfully in Java and C++ for years so it is unlikely that any major changes will need to be made. Good. Then the time it takes to mature on PyPI would be very short. How would you define very short? I've had the project on PyPI for about a year now: http://pypi.python.org/pypi/futures3 Cheers, Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 22, 2010, at 5:30 AM, Dj Gilcrease wrote: On Fri, May 21, 2010 at 8:26 AM, Brian Quinlan br...@sweetapp.com wrote: Except that now isn't the time for that discussion. This PEP has been discussed on-and-off for several months on both stdlib-sig and python-dev. I think any time till the PEP is accepted is a good time to discuss changes to the API. I disagree. If a PEP is being updated continuously then there is nothing stable to pronounce on. Issues with the PEP: 1) Examples as written fail on Windows. Patch to fix @ http://code.google.com/p/pythonfutures/issues/detail?id=5 Updated, thanks! Issues with the implementation: 1) Globals are used for tracking running threads (but not processes) and shutdown state, but are not accessed as globals everywhere they are modified, so the state could become inconsistent. 2) The atexit handler randomly throws an exception on exit on Windows when running the tests or examples: Error in atexit._run_exitfuncs: TypeError: print_exception(): Exception expected for value, str found Let's take this off-list. Cheers, Brian Issues 1 and 2 would be solved by moving thread tracking back into the executor responsible for the threads, or making a singleton that tracked threads/processes for all executors. http://code.google.com/p/pythonfutures/issues/detail?id=6 is one such implementation.
Re: [Python-Dev] PEP 3148 ready for pronouncement
On 22 May 2010, at 23:59, R. David Murray wrote: On Sat, 22 May 2010 19:12:05 +1000, Brian Quinlan br...@sweetapp.com wrote: On May 22, 2010, at 5:30 AM, Dj Gilcrease wrote: On Fri, May 21, 2010 at 8:26 AM, Brian Quinlan br...@sweetapp.com wrote: Except that now isn't the time for that discussion. This PEP has been discussed on-and-off for several months on both stdlib-sig and python-dev. I think any time till the PEP is accepted is a good time to discuss changes to the API. I disagree. If a PEP is being updated continuously then there is nothing stable to pronounce on. Well, you've been making updates as a result of this round of discussion. Yes, I've been making documentation and PEP updates to clarify points that people found confusing and will continue to do so. If there is still discussion then perhaps the PEP isn't ready for pronouncement yet. At some point someone can decide it is all bikeshedding and ask for pronouncement on that basis, but I don't think it is appropriate to cut off discussion by saying it's ready for pronouncement unless you want to increase the chances of its getting rejected. Here are the new proposed non-documentation changes that I've collected (let me know if I've missed any):
- Rename executor to executer
- Rename submit to apply
- Rename done to finished
- Rename not_finished to pending
- Rename FIRST_COMPLETED to ONE_COMPLETED
We can discuss naming for all eternity and never reach a point where even half of the participants are satisfied. Since naming has been discussed extensively here and in stdlib-sig, I think that we have to decide that it is good enough and move on. Or decide that it isn't good enough and reject the PEP. Cheers, Brian The usual way of doing this (at least so far as I have observed, which granted hasn't been too many cases) is to say something like "I think this PEP is ready for pronouncement" and then wait for feedback on that assertion or for the pronouncement. 
It's especially good if you can answer any concerns that are raised with "that was discussed already and we concluded X". Bonus points for finding a thread reference and adding it to the PEP :) -- R. David Murray www.bitdance.com
Re: [Python-Dev] PEP 3148 ready for pronouncement
Hey all, Jesse, the designated pronouncer for this PEP, has decided to keep discussion open for a few more days. So fire away! Cheers, Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 23, 2010, at 9:43 AM, Jeffrey Yasskin wrote: I think the PEP's overall API is good to go. On Sat, May 22, 2010 at 4:12 PM, Brian Quinlan br...@sweetapp.com wrote: On 22 May 2010, at 23:59, R. David Murray wrote: If there is still discussion then perhaps the PEP isn't ready for pronouncement yet. At some point someone can decide it is all bikeshedding and ask for pronouncement on that basis, but I don't think it is appropriate to cut off discussion by saying it's ready for pronouncement unless you want to increase the chances of its getting rejected. Here are the new proposed non-documentation changes that I've collected (let me know if I've missed any): ... I propose to rename the Future.result method to Future.get. get is what Java (http://java.sun.com/javase/7/docs/api/java/util/concurrent/Future.html) and C++ (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3092.pdf section 30.6.6 para 12) use, and the word result doesn't seem particularly better or worse than get for our purposes, which inclines me to stay consistent. In C++ and Java, there is only one result-retrieving method so get seems like a reasonable name. My implementation has a second method .exception(), which returns the exception raised by the submitted function (or None if no exception was raised). I thought that having multiple getter methods, where one is called .get(), would be a bit confusing. But I don't really care so I'm -0. Cheers, Brian
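For readers following the .get()-vs-.result() discussion, here is a minimal sketch of how the two getters Brian describes behave, written against the module's eventual concurrent.futures name (the helper functions ok and boom are invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def ok():
    return 42

def boom():
    raise ValueError("failed")

with ThreadPoolExecutor(max_workers=2) as executor:
    good = executor.submit(ok)
    bad = executor.submit(boom)

    # .result() blocks and returns the value (or re-raises the exception).
    value = good.result()

    # .exception() blocks and returns the raised exception object,
    # or None if the call succeeded -- the second getter that makes a
    # single ".get()" name awkward.
    error = bad.exception()
    no_error = good.exception()
```

Having both getters means callers can inspect a failure without a try/except around the retrieval, which is the usage pattern the later examples in this thread rely on.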
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 23, 2010, at 10:06 AM, Jeffrey Yasskin wrote: On Sat, May 22, 2010 at 4:12 PM, Brian Quinlan br...@sweetapp.com wrote: Rename executor to executer -1 for consistency with Java. -1 pending an explanation of why executer is better. Rename submit to apply apply focuses attention on the function object, while submit focuses attention, properly I think, on the fact that you're handing something to the executor to run. So -1. -1 Rename done to finished done is nice and short, and I don't think finished or completed will be any less prone to people thinking the task actually ran. So -1. -0 Rename not_finished to pending +0.5. Doesn't matter that much, but pending is used elsewhere in the proposal for this concept. On the other hand, pending could be thought to refer to the state before running. Possibly finished should be renamed to done here, since it's described as 'finished, contains the futures that completed (finished or were cancelled)', which uses finished for two different concepts. I think that using finished is bad terminology here. So +1 to renaming finished to done. I don't have a preference for not_done vs. pending. Rename FIRST_COMPLETED to ONE_COMPLETED ONE_COMPLETED could imply that the first result set must contain exactly one element, but in fact, if multiple tasks finish before the waiting thread has a chance to wake up, multiple futures could be returned as done. So -1. A logician would probably call it SOME_COMPLETED. What about ANY_COMPLETED? Though I think that FIRST_COMPLETED still reads better. Cheers, Brian
Re: [Python-Dev] PEP 3148 ready for pronouncement
Hey Mark, This really isn't the time to propose changes. The PEP has been discussed extensively on stdlib-sig and python-dev. On May 21, 2010, at 9:29 PM, Mark Summerfield wrote: On 2010-05-21, Brian Quinlan wrote: The PEP is here: http://www.python.org/dev/peps/pep-3148/ [snip] Hi Brian, Could I suggest a small, subtle change in naming: replace executor with executer? I guess this suggestion is doomed though since Java uses executor :-( I'd also be tempted to rename submit() to apply() in view of Python's history. Also, maybe change done() to finished() since the function returns True if the call was cancelled (so the job can't have been done), as well as if the call was finished. Actually, having read further, maybe the best name would be completed() since that's a term used throughout. Perhaps call the not_finished set pending since presumably these are still in progress? (My understanding is that if they were cancelled or finished they'd be in the finished set. I'd also rename finished to completed if you have a completed() method.) I think FIRST_COMPLETED is misleading since it implies (to me anyway) the first one passed. How about ONE_COMPLETED; and similarly ONE_EXCEPTION? I think it would be helpful to clarify whether the timeout value (which you specify as being in seconds) can meaningfully accept a float, e.g., 0.5? I've updated the docs to clarify that float args are acceptable. Cheers, Brian Anyway, it looks like it will be a really nice addition to the standard library :-) -- Mark Summerfield, Qtrac Ltd, www.qtrac.eu C++, Python, Qt, PyQt - training and consultancy C++ GUI Programming with Qt 4 - ISBN 0132354160
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 21, 2010, at 9:44 PM, John Arbash Meinel wrote: Brian Quinlan wrote: The PEP is here: http://www.python.org/dev/peps/pep-3148/ I think the PEP is ready for pronouncement, and the code is pretty much ready for submission into py3k (I will have to make some minor changes in the patch like changing the copyright assignment): http://code.google.com/p/pythonfutures/source/browse/#svn/branches/feedback/python3/futures%3Fstate%3Dclosed Your example here:

for number, is_prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
    print('%d is prime: %s' % (number, is_prime))

overwrites the 'is_prime' function with the return value of the function. Probably better to use a different variable name. Good catch. I've updated the example. Cheers, Brian
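The updated example presumably binds the mapped value to a fresh name instead of shadowing the function. A runnable sketch of that pattern, where a ThreadPoolExecutor and small numbers stand in for the PEP's ProcessPoolExecutor and large primes:

```python
import math
from concurrent.futures import ThreadPoolExecutor

NUMBERS = [13, 15, 17]

def is_prime(n):
    # Same trial-division check as the PEP example.
    if n % 2 == 0:
        return False
    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

results = []
with ThreadPoolExecutor() as executor:
    # Bind the mapped value to "prime" so the is_prime function is
    # not overwritten by its own return value.
    for number, prime in zip(NUMBERS, executor.map(is_prime, NUMBERS)):
        results.append((number, prime))
```

After the loop, is_prime is still callable, which is exactly what the original shadowing version broke.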
Re: [Python-Dev] PEP 3148 ready for pronouncement
On May 21, 2010, at 9:47 PM, John Arbash Meinel wrote: Brian Quinlan wrote: The PEP is here: http://www.python.org/dev/peps/pep-3148/ I think the PEP is ready for pronouncement, and the code is pretty much ready for submission into py3k (I will have to make some minor changes in the patch like changing the copyright assignment): http://code.google.com/p/pythonfutures/source/browse/#svn/branches/feedback/python3/futures%3Fstate%3Dclosed The tests are here and pass on W2K, Mac OS X and Linux: http://code.google.com/p/pythonfutures/source/browse/branches/feedback/python3/test_futures.py The docs (which also need some minor changes) are here: http://code.google.com/p/pythonfutures/source/browse/branches/feedback/docs/index.rst Cheers, Brian I also just noticed that your example uses: zip(PRIMES, executor.map(is_prime, PRIMES)) But your doc explicitly says: map(func, *iterables, timeout=None) Equivalent to map(func, *iterables) but executed asynchronously and possibly out-of-order. So it isn't safe to zip() against something that can return out of order. The docs don't say that the return value can be out-of-order, just that execution can be out-of-order. But I agree that the phrasing is confusing so I've changed it to: Equivalent to ``map(func, *iterables)`` but *func* is executed asynchronously and several calls to *func* may be made concurrently. Which opens up a discussion about how these things should be used. Except that now isn't the time for that discussion. This PEP has been discussed on-and-off for several months on both stdlib-sig and python-dev. If you think that storing the args (e.g. with the future) is a good idea then you can propose a patch after the PEP is integrated (if it is rejected then it probably isn't worth discussing ;-)). 
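Brian's point — execution may be out of order, but the results come back in argument order, so zip() is safe — can be demonstrated with a small sketch (sleep durations are chosen so later inputs finish first):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_identity(x):
    # Later inputs sleep less, so they *finish* first even though
    # map() will still yield results in input order.
    time.sleep(0.3 - 0.1 * x)
    return x

with ThreadPoolExecutor(max_workers=3) as executor:
    # All three calls run concurrently; x=2 completes first.
    ordered = list(executor.map(slow_identity, [0, 1, 2]))
```

ordered matches the input order regardless of completion order, which is why zip(PRIMES, executor.map(is_prime, PRIMES)) in the PEP example is correct.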
Cheers, Brian Given that your other example uses a dict to get back to the original arguments, and this example uses zip() [incorrectly], it seems that the Futures object should have the arguments easily accessible. It certainly seems like a common use case that if things are going to be returned in arbitrary order, you'll want an easy way to distinguish which one you have. Having to write a dict map before each call can be done, but seems unoptimal. John
[Python-Dev] PEP 3148 ready for pronouncement
The PEP is here: http://www.python.org/dev/peps/pep-3148/ I think the PEP is ready for pronouncement, and the code is pretty much ready for submission into py3k (I will have to make some minor changes in the patch like changing the copyright assignment): http://code.google.com/p/pythonfutures/source/browse/#svn/branches/feedback/python3/futures%3Fstate%3Dclosed The tests are here and pass on W2K, Mac OS X and Linux: http://code.google.com/p/pythonfutures/source/browse/branches/feedback/python3/test_futures.py The docs (which also need some minor changes) are here: http://code.google.com/p/pythonfutures/source/browse/branches/feedback/docs/index.rst Cheers, Brian
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
I've updated the PEP to include:
- completion callbacks (for interoperability with Twisted Deferreds)
- a pointer to the discussion on stdlib-sig
See: http://svn.python.org/view/peps/trunk/pep-3148.txt?r1=78618&r2=80679
Rejected ideas:
- Having a registration system for executors
Not yet addressed:
- where the package should live (somewhere in a concurrent package seems fine)
- having global executors with unbounded worker counts as a convenience [1]
[1] There are a few issues with global executors that need to be thought through, i.e. when should workers be created and when should they be terminated. I'd be happy to defer this idea unless someone is passionate about it (in which case it would be great if they'd step in with concrete ideas). Cheers, Brian
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
Nick Coghlan wrote: You may want to consider providing global thread and process executors in the futures module itself. Code which just wants to say do this in the background without having to manage the lifecycle of its own executor instance is then free to do so. I've had a lot of experience with a framework that provides this and it is *very* convenient (it's also a good way to avoid deadlocks due to synchronous notification APIs). This seems like a reasonable idea to me. I take it that the thread/process pool should be unlimited in size. Should every thread/process exit when it finishes its job or should there be a smarter collection strategy? Cheers, Brian
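A hedged sketch of what such a module-level convenience could look like — the name run_in_background, the worker count, and the lazy-creation strategy are all assumptions for illustration, not part of the PEP:

```python
import atexit
from concurrent.futures import ThreadPoolExecutor

_default_executor = None

def run_in_background(fn, *args, **kwargs):
    """Submit fn to a lazily created, process-wide thread pool.

    Hypothetical convenience wrapper: callers get a Future back without
    managing an executor's lifecycle themselves.
    """
    global _default_executor
    if _default_executor is None:
        _default_executor = ThreadPoolExecutor(max_workers=8)
        # Make sure pending work finishes before the interpreter exits;
        # this is one answer to the collection-strategy question.
        atexit.register(_default_executor.shutdown)
    return _default_executor.submit(fn, *args, **kwargs)

answer = run_in_background(pow, 2, 10).result()
```

Whether the pool should instead grow without bound, or reap idle workers, is exactly the open design question Brian raises above.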
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 10 Mar 2010, at 23:32, Nick Coghlan wrote: Brian Quinlan wrote: Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state. I'm not sure what you mean, could you clarify? Assuming your question refers to the second sentence, Jean-Paul is referring to a trick of the CPython interpreter when it terminates. To maximise the chances of objects being deleted properly rather than just dumped from memory when the process exits, module dictionaries are filled with None values before the interpreter shuts down. This can cause weirdness (usually intermittent name errors during shutdown) when __del__ methods directly or indirectly reference module globals. Ah. I'm familiar with this problem. My approach was to install an exit handler that ensures that all pending futures are complete and all threads and processes exit before allowing the interpreter to exit. Cheers, Brian One of the easiest ways to avoid that is to put the state on a singleton object, then give the affected classes a reference to that object. Cheers, Nick. P.S. This problem is actually the reason we don't have a context manager for temporary directories yet. Something that should have been simple became a twisty journey down the rabbit hole: http://bugs.python.org/issue5178 -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
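Nick's singleton suggestion might look roughly like this — the class and attribute names are invented for illustration. The point is that __del__ reaches shared state through an instance attribute rather than a module-global lookup, and instance attributes stay valid even after the interpreter has rebound module globals to None during shutdown:

```python
class _ExecutorState:
    """Holds what would otherwise be module-global bookkeeping."""
    def __init__(self):
        self.shutdown = False
        self.pending_workers = set()

_state = _ExecutorState()

class Worker:
    def __init__(self, state):
        # Keep a direct reference to the shared state object; unlike a
        # module-global name, this reference is still usable when
        # __del__ runs during interpreter shutdown.
        self._state = state
        state.pending_workers.add(id(self))

    def __del__(self):
        # No module-global lookups here, so no intermittent NameError
        # at shutdown.
        self._state.pending_workers.discard(id(self))

w = Worker(_state)
registered = id(w) in _state.pending_workers
del w  # CPython refcounting runs __del__ immediately here
cleaned_up = len(_state.pending_workers) == 0
```

This is complementary to Brian's exit-handler approach: the exit handler drains the work, while the singleton reference makes any remaining finalizers safe.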
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 10 Mar 2010, at 08:32, Dj Gilcrease wrote: On Mon, Mar 8, 2010 at 2:11 PM, exar...@twistedmatrix.com wrote: Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state. Ok the new patch is submitted @ http://code.google.com/p/pythonfutures/issues/detail?id=1 Cool, thanks. *note there are 2 tests that fail and 1 test that deadlocks on Windows even without this patch; the deadlock test I am skipping in the patch and the two that fail do so for a reason that does not make sense to me. I'll investigate but I don't have convenient access to a Windows machine. Cheers, Brian
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 9 Mar 2010, at 08:39, Greg Ewing wrote: Terry Reedy wrote: Looking more closely, I gather that the prime results will be printed 'in order' (waiting on each even if others are done) while the url results will be printed 'as available'. Seems to me that if you care about the order of the results, you should be able to just wait for each result separately in the order you want them. Something like:

task1 = start_task(proc1)
task2 = start_task(proc2)
task3 = start_task(proc3)
result1 = task1.wait_for_result()
result2 = task2.wait_for_result()
result3 = task3.wait_for_result()

You can write this as:

executor = ...
future1 = executor.submit(proc1)
future2 = executor.submit(proc2)
future3 = executor.submit(proc3)
result1 = future1.result()
result2 = future2.result()
result3 = future3.result()

This would also be a natural way to write things even if you don't care about the order, but you need all the results before proceeding. You're going to be held up until the longest-running task completes anyway, so it doesn't matter if some of them finish earlier and have to sit around waiting for you to collect the result. Often you don't want to continue if there is a failure. In the example that you gave, if proc3 raises an exception immediately, you still wait for proc1 and proc2 to complete even though you will end up discarding their results. Cheers, Brian
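Brian's failure argument can be made concrete with as_completed: a quick failure surfaces while the slow task is still running, rather than after it. The helper function names here are invented for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def slow_ok():
    time.sleep(0.5)
    return "ok"

def fast_fail():
    raise RuntimeError("failed early")

with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(slow_ok), executor.submit(fast_fail)]
    first_error = None
    # as_completed yields futures as they finish, so the failure is
    # observed long before the slow task completes -- unlike calling
    # result() on each future in submission order.
    for future in as_completed(futures):
        if future.exception() is not None:
            first_error = future.exception()
            break
```

With the sequential result() style, the RuntimeError would not be seen until after slow_ok's half-second of wasted work.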
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 9 Mar 2010, at 08:11, exar...@twistedmatrix.com wrote: On 08:56 pm, digitalx...@gmail.com wrote: On Mon, Mar 8, 2010 at 12:04 PM, Dj Gilcrease digitalx...@gmail.com wrote: A style I have used in my own code in the past is a Singleton class with register and create methods, where the register takes a name(string) and the class and the create method takes the name and *args, **kwargs and acts as a factory. So I decided to play with this design a little and since I made it a singleton I decided to place all the thread/process tracking and exit handle code in it instead of having the odd semi-global scoped _shutdown, _thread_references, _remove_dead_thread_references and _python_exit objects floating around in each executor file, seems to work well. The API would be from concurrent.futures import executors executor = executors.create(NAME, *args, **kwargs) # NAME is 'process' or 'thread' by default To create your own executor you create your executor class and add the following at the end Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state. I'm not sure what you mean, could you clarify? Cheers, Brian
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 9 Mar 2010, at 03:21, Terry Reedy wrote: On 3/6/2010 4:20 AM, Brian Quinlan wrote: On 6 Mar 2010, at 03:21, Daniel Stutzbach wrote: On Fri, Mar 5, 2010 at 12:03 AM, Brian Quinlan br...@sweetapp.com mailto:br...@sweetapp.com wrote: import futures +1 on the idea, -1 on the name. It's too similar to from __future__ import Also, the PEP should probably link to the discussions on stdlib-sig? I thought about that but this discussion is spread over many threads and many months. This is pretty typical. I would say just that, and link to the first. This PEP was discussed over many months in many threads in the stdlib-sig list. The first was . Python-dev discussion occurred in this thread. I'll add that. Cheers, Brian
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 6 Mar 2010, at 07:38, Brett Cannon wrote: The PEP says that futures.wait() should only use keyword arguments past its first positional argument, but the PEP has the function signature as ``wait(fs, timeout=None, return_when=ALL_COMPLETED)``. Should it be ``wait(fs, *, timeout=None, return_when=ALL_COMPLETED)``? Hi Brett, That recommendation was designed to make it easy to change the API without breaking code. I don't think that recommendation makes sense anymore and I'll update the PEP. Cheers, Brian On Thu, Mar 4, 2010 at 22:03, Brian Quinlan br...@sweetapp.com wrote: Hi all, I recently submitted a draft PEP for a package designed to make it easier to execute Python functions asynchronously using threads and processes. It lets the user focus on their computational problem without having to build explicit thread/process pools and work queues. The package has been discussed on stdlib-sig but now I'd like this group's feedback. The PEP lives here: http://python.org/dev/peps/pep-3148/ Here are two examples to whet your appetites: Determine if several numbers are prime:

import futures
import math

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    if n % 2 == 0:
        return False

    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

# Uses as many CPUs as your machine has.
with futures.ProcessPoolExecutor() as executor:
    for number, is_prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
        print('%d is prime: %s' % (number, is_prime))

Print out the size of the home pages of various news sites (and Fox News). 
import futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

def load_url(url, timeout):
    return urllib.request.urlopen(url, timeout=timeout).read()

with futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Create a future for each URL load.
    future_to_url = dict((executor.submit(load_url, url, 60), url)
                         for url in URLS)

    # Iterate over the futures in the order that they complete.
    for future in futures.as_completed(future_to_url):
        url = future_to_url[future]
        if future.exception() is not None:
            print('%r generated an exception: %s' % (url, future.exception()))
        else:
            print('%r page is %d bytes' % (url, len(future.result())))

Cheers, Brian
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 6 Mar 2010, at 08:42, Jesse Noller wrote: If people agree with this; do you feel the proposal of said namespace should be a separate PEP, or piggy back on this? I don't want to piggy back on Brian's hard work. It doesn't really matter to me. We can either update this PEP to propose the concurrent.futures name or you can draft a more complete PEP that describes what other functionality should live in the concurrent package. Cheers, Brian
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 6 Mar 2010, at 09:54, Antoine Pitrou wrote: On Fri, 5 Mar 2010 17:03:02 +1100, Brian Quinlan br...@sweetapp.com wrote: The PEP lives here: http://python.org/dev/peps/pep-3148/ Ok, here is my take on it: cancel() Attempt to cancel the call. If the call is currently being executed then it cannot be cancelled and the method will return False, otherwise the call will be cancelled and the method will return True. I think it shouldn't return anything, and raise an exception if cancelling failed. It is really an error condition, and ignoring the result doesn't seem right. In my experience with futures, cancelling them is a best-effort optimization that people use when another future fails. For example:

futures = [executor.submit(CopyDirectory, src, dest) for dest in ...]
finished, unfinished = wait(futures, return_when=FIRST_EXCEPTION)

# If there are unfinished futures then there must have been a failure.
for f in unfinished:
    # No reason to waste bandwidth copying files if the operation
    # has already failed.
    f.cancel()

for f in finished:
    if f.exception():
        raise f.exception()

Future.running() Return True if the call is currently being executed and cannot be cancelled. Future.done() Return True if the call was successfully cancelled or finished running. These don't really make sense since the future is executing concurrently. By the time the result is returned, it can already be wrong. I advocate removing those two methods. These methods are useful for logging - by displaying the count of pending, running and completed futures you can estimate the progress of the system. The following Future methods are meant for use in unit tests and Executor implementations. Their names should then be preceded by an underscore '_'. We don't want people to think they are public APIs and start relying on them. Actually, as discussed on the stdlib-sig, these methods are designed to make it possible for users to implement their own Executors so we'll have to keep the interface stable. 
wait(fs, timeout=None, return_when=ALL_COMPLETED) [...] This method should always be called using keyword arguments I don't think this is right. Keyword arguments are nice, but mandating them too often is IMO a nuisance (after all, it makes things longer to type and requires you to remember the exact parameter names). Especially when the method only takes at most 3 arguments. IMO, keyword-only arguments are mostly useful when there are a lot of positional arguments before, and you want to help the user use the right calling signature. I agree, I'll change this. Cheers, Brian
Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously
On 7 Mar 2010, at 03:04, Phillip J. Eby wrote: At 05:32 AM 3/6/2010, Brian Quinlan wrote: Using twisted (or any other asynchronous I/O framework) forces you to rewrite your I/O code. Futures do not. Twisted's Deferred API has nothing to do with I/O. I see, you just mean the API and not the underlying model. We discussed the Deferred API on the stdlib-sig and I don't think that anyone expressed a preference for it over the one described in the PEP. Do you have any concrete criticism? Cheers, Brian
[Python-Dev] [PEP 3148] futures - execute computations asynchronously
Hi all, I recently submitted a draft PEP for a package designed to make it easier to execute Python functions asynchronously using threads and processes. It lets the user focus on their computational problem without having to build explicit thread/process pools and work queues. The package has been discussed on stdlib-sig but now I'd like this group's feedback. The PEP lives here: http://python.org/dev/peps/pep-3148/ Here are two examples to whet your appetites: Determine if several numbers are prime:

import futures
import math

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    if n % 2 == 0:
        return False

    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

# Uses as many CPUs as your machine has.
with futures.ProcessPoolExecutor() as executor:
    for number, is_prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
        print('%d is prime: %s' % (number, is_prime))

Print out the size of the home pages of various news sites (and Fox News):

import futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

def load_url(url, timeout):
    return urllib.request.urlopen(url, timeout=timeout).read()

with futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Create a future for each URL load.
    future_to_url = dict((executor.submit(load_url, url, 60), url)
                         for url in URLS)

    # Iterate over the futures in the order that they complete. 
for future in futures.as_completed(future_to_url): url = future_to_url[future] if future.exception() is not None: print('%r generated an exception: %s' % (url, future.exception())) else: print('%r page is %d bytes' % (url, len(future.result( Cheers, Brian ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
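Beyond map() and as_completed(), the PEP also specifies a wait() function with a FIRST_COMPLETED option. The sketch below uses the stdlib spelling concurrent.futures (the PEP's standalone `futures` package landed under that name in Python 3.2); `square` is an illustrative function, not from the PEP.

```python
# Minimal sketch of the PEP's wait()/FIRST_COMPLETED interface, using the
# stdlib name concurrent.futures rather than the PEP's `futures` package.
import concurrent.futures

def square(n):  # illustrative workload, not from the PEP
    return n * n

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    fs = [executor.submit(square, n) for n in range(4)]
    # Block until at least one future finishes.
    done, not_done = concurrent.futures.wait(
        fs, return_when=concurrent.futures.FIRST_COMPLETED)
    assert len(done) >= 1

# Exiting the with-block waits for all outstanding work.
results = sorted(f.result() for f in fs)
print(results)  # [0, 1, 4, 9]
```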
Re: [Python-Dev] Possible py3k io wierdness
I've added a new proposed patch to: http://bugs.python.org/issue5700

The idea is:
- only IOBase implements close() (though a subclass can override close without causing problems so long as it calls super().close() or calls .flush() and ._close() directly)
- change IOBase.close to call .flush() and then ._close()
- .flush() invokes super().flush() in every class except IOBase
- ._close() invokes super()._close() in every class except IOBase
- FileIO is implemented in Python in _pyio.py so that it can have the same base class as the other Python-implemented file classes
- tests verify that .flush() is not called after the file is closed
- tests verify that ._close()/.flush() calls are propagated correctly

One nice side effect is that inheritance is a lot easier and MI works as expected i.e.

    class DebugClass(IOBase):
        def flush(self):
            print('some debug info')
            super().flush()

        def _close(self):
            print('some debug info')
            super()._close()

    class MyClass(FileIO, DebugClass):  # whatever order makes sense
        ...

    m = MyClass(...)
    m.close()
    # Will call:
    # IOBase.close()
    #   DebugClass.flush()  # FileIO has no .flush method
    #     IOBase.flush()
    #   FileIO._close()
    #     DebugClass._close()
    #       IOBase._close()

Cheers, Brian
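The call order in the comment above follows directly from Python's MRO plus cooperative super() calls. Here is a self-contained mock of the proposed design (the Mock* classes echo the patch's structure but do not touch the real io module):

```python
# Mock of the proposed design: only the base class defines close(), and
# flush()/_close() cooperate via super(). None of this uses the real io
# module; the class names just mirror the patch.
calls = []

class MockIOBase:
    def close(self):
        self.flush()
        self._close()

    def flush(self):
        calls.append('IOBase.flush')

    def _close(self):
        calls.append('IOBase._close')

class DebugMixin(MockIOBase):
    def flush(self):
        calls.append('Debug.flush')
        super().flush()

    def _close(self):
        calls.append('Debug._close')
        super()._close()

class MockFileIO(MockIOBase):
    # No flush() override, matching FileIO in the example above.
    def _close(self):
        calls.append('FileIO._close')
        super()._close()

class MyFile(MockFileIO, DebugMixin):
    pass

MyFile().close()
print(calls)
# ['Debug.flush', 'IOBase.flush', 'FileIO._close', 'Debug._close', 'IOBase._close']
```

The MRO of MyFile is (MyFile, MockFileIO, DebugMixin, MockIOBase), so each super() call walks one step along that chain, which is exactly the ordering sketched in the message.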
Re: [Python-Dev] Possible py3k io wierdness
Nick Coghlan wrote:

> Brian Quinlan wrote:
>> - you need the cooperation of your subclasses i.e. they must call super().flush() in .flush() to get correct close behavior (and this represents a backwards-incompatible semantic change)
>
> Are you sure about that? Going by the current _pyio semantics that Antoine posted, it looks to me that it is already the case that subclasses need to invoke the parent flush() call correctly to avoid breaking the base class semantics (which really isn't an uncommon problem when it comes to writing correct subclasses).

As it is now, if you didn't call super().flush() in your flush override, then a buffer won't be flushed at the time that you expected. With the proposed change, if you don't call super().flush() in your flush override, then the buffer will never get flushed and you will lose data when you close the file.

I'm not saying that it is a big deal, but it is a difference.

Cheers, Brian
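The data-loss scenario Brian describes can be shown with a small mock (BufferedBase/BadSubclass are illustrative names, not io classes): under close-calls-flush semantics, an override that forgets super().flush() silently drops pending data at close time.

```python
# Mock of the pitfall: with close() responsible for flushing, a flush()
# override that skips super().flush() means the buffer never drains.
class BufferedBase:
    def __init__(self):
        self.buffer = []    # pending writes
        self.written = []   # what actually reached the "file"

    def flush(self):
        self.written.extend(self.buffer)
        self.buffer.clear()

    def close(self):
        self.flush()        # proposed semantics: close() flushes

class BadSubclass(BufferedBase):
    def flush(self):
        pass                # forgot super().flush()

f = BadSubclass()
f.buffer.append(b'data')
f.close()
print(f.written)  # [] -- the pending data was lost at close time
```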
Re: [Python-Dev] Possible py3k io wierdness
Hey Antoine,

Thanks for the clarification!

I see that the C implementation matches the Python implementation but I don't see how the semantics of either are useful in this case. If a subclass implements flush then, as you say, it must also implement close and call flush itself before calling its superclass' close method. But then _RawIOBase will pointlessly call the subclass' flush method a second time. This second call should raise (because the file is closed) and the exception will be caught and suppressed.

I don't see why this is helpful. Could you explain why _RawIOBase.close() calling self.flush() is useful?

Cheers, Brian

Antoine Pitrou wrote:

> Hi!
>
> brian at sweetapp.com writes:
>> class _RawIOBase(_FileIO):
>
> FileIO is a subclass of _RawIOBase, not the reverse:
>
>     >>> issubclass(_io._RawIOBase, _io.FileIO)
>     False
>     >>> issubclass(_io.FileIO, _io._RawIOBase)
>     True
>
> I do understand your surprise, but the Python implementation of IOBase.close() in _pyio.py does the same thing:
>
>     def close(self) -> None:
>         """Flush and close the IO object.
>
>         This method has no effect if the file is already closed.
>         """
>         if not self.__closed:
>             try:
>                 self.flush()
>             except IOError:
>                 pass  # If flush() fails, just give up
>             self.__closed = True
>
> Note how it calls `self.flush()` and not `IOBase.flush(self)`. When writing the C version of the I/O stack, we tried to keep the semantics the same as in the Python version, although there are a couple of subtleties.
>
> Your problem here is that it's IOBase.close() which calls your flush() method, but FileIO.close() has already done its job before and the internal file descriptor has been closed (hence `self.closed` is True). In this particular case, I advocate overriding close() as well and call your flush() method manually from there.
>
> Thanks for your feedback!
>
> Regards
>
> Antoine.
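The sequence Brian objects to can be re-created without the io module at all. The sketch below mirrors the quoted _pyio close() logic (the Base/Raw/MyFile names are illustrative): the descriptor is "closed" first, then the late-bound self.flush() hits the subclass override a second time, and its error is swallowed.

```python
# Minimal re-creation of the quoted _pyio.IOBase.close() behaviour, showing
# why a subclass flush() runs again after the file is already closed and
# why the resulting error is silently suppressed.
events = []

class Base:
    closed = False

    def close(self):
        if not self.closed:
            try:
                self.flush()        # late-bound: hits the subclass override
            except IOError:
                pass                # if flush() fails, just give up
            self.closed = True

    def flush(self):
        pass

class Raw(Base):
    def close(self):
        self.fd_open = False        # FileIO closes the descriptor first...
        super().close()             # ...then the base close() calls self.flush()

class MyFile(Raw):
    def __init__(self):
        self.fd_open = True

    def flush(self):
        if not self.fd_open:
            events.append('flush-on-closed-file')
            raise IOError('flush of closed file')
        events.append('flush')

f = MyFile()
f.flush()   # normal flush while open
f.close()   # second flush happens after the "descriptor" is closed
print(events)  # ['flush', 'flush-on-closed-file']
```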
Re: [Python-Dev] Possible py3k io wierdness
Antoine Pitrou wrote:

> Brian Quinlan <brian at sweetapp.com> writes:
>> I don't see why this is helpful. Could you explain why _RawIOBase.close() calling self.flush() is useful?
>
> I could not explain it for sure since I didn't write the Python version. I suppose it's so that people who only override flush() automatically get the flush-on-close behaviour.

But the way that the code is currently written, flush only gets called *after* the file has been closed (see my original example). It seems very unlikely that this is the behavior that the subclass would want/expect.

So any objections to me changing IOBase (and the C implementation) to:

    def close(self):
        """Flush and close the IO object.

        This method has no effect if the file is already closed.
        """
        if not self.__closed:
            try:
    -           self.flush()
    +           IOBase.flush(self)
            except IOError:
                pass  # If flush() fails, just give up
            self.__closed = True

Cheers, Brian
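The effect of that one-line change is the standard difference between late-bound and explicit-class dispatch, shown here with illustrative stand-in classes:

```python
# Sketch of the proposed change's effect: Base.flush(self) bypasses any
# subclass override, whereas self.flush() dispatches to it.
class Base:
    def flush(self):
        return 'base'

class Sub(Base):
    def flush(self):
        return 'sub'

s = Sub()
assert s.flush() == 'sub'        # self.flush(): late-bound, override wins
assert Base.flush(s) == 'base'   # explicit class call: override bypassed
```

With IOBase.flush(self), close() would still flush IOBase's own state but would no longer re-enter a subclass's flush() on an already-closed file.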
Re: [Python-Dev] dict(keys, values)
George Sakkis wrote:

> Perhaps this has been brought up in the past but I couldn't find it in the archives: far too often I use the idiom dict(zip(keys,values)), or the same with izip. How does letting dict take two positional arguments sound?
>
> Pros:
> - Pretty obvious semantics, no mental overhead to learn and remember it.
> - More concise (especially if one imports itertools just to use izip).
> - At least as efficient as the current alternatives.
> - Backwards compatible.
>
> Cons:
> - Yet Another Way To Do It
> - Marginal benefit

Also note that the keyword variant is longer than the zip variant e.g.

    dict(zip(keys, values))
    dict(keys=keys, values=values)

and the relationship between the keys and values seems far less obvious to me in the keyword variant.

Cheers, Brian
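For reference, the two spellings being compared do very different things, which is the point George makes in the follow-up:

```python
keys = ['a', 'b', 'c']
values = [1, 2, 3]

# The idiom under discussion: pair keys with values.
assert dict(zip(keys, values)) == {'a': 1, 'b': 2, 'c': 3}

# The keyword spelling is already valid and builds something else entirely:
# a two-entry dict whose keys are the literal strings 'keys' and 'values'.
assert dict(keys=keys, values=values) == {'keys': keys, 'values': values}
```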
Re: [Python-Dev] dict(keys, values)
George Sakkis wrote:

> Um, you do realize that dict(keys=keys, values=values) is already valid and quite different from dict(zip(keys, values)), don't you? :)

Sorry, minor misreading on my part. Like that time in Sunday school when I missed the "not" in "Thou shalt not kill". That was a rough week for everyone involved.

OK, the non-zip variant saves you 5 characters i.e.

    dict(zip(keys, values))

vs.

    dict(keys, values)

I still don't like it :-)

Cheers, Brian
Re: [Python-Dev] Any reason that any()/all() do not take a predicateargument?
> > seq = [1,2,3,4,5]
> > if any(seq, lambda x: x==5):
> >     ...
> >
> > which is clearly more readable than
> >
> > reduce(seq, lambda x,y: x or y==5, False)
>
> How about this?
>
> if any(x==5 for x in seq):

Aren't all of these equivalent to:

    if 5 in seq:
        ...

?

Cheers, Brian
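The equivalence Brian points out holds because the predicate is plain equality; the genexp form only earns its keep once the predicate is something `in` cannot express. A quick check (the `x > 4` predicate is just an illustrative counterexample):

```python
seq = [1, 2, 3, 4, 5]

# For an equality predicate, all three spellings agree:
assert any(x == 5 for x in seq) == (5 in seq) == True

# But a genuine predicate has no `in` equivalent:
assert any(x > 4 for x in seq)
assert not any(x > 5 for x in seq)
```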