Re: [Python-Dev] Addition of pyprocessing module to standard lib.
Jesse Noller wrote:
> I am looking for any questions, concerns or benchmarks python-dev has regarding the possible inclusion of the pyprocessing module to the standard library - preferably in the 2.6 timeline.

I think for inclusion in 2.6 it's too late. For 3.0, it's definitely too late - the PEP acceptance deadline was a year ago (IIRC).

> As I am trying to finish up the PEP, I want to see if I can address any questions or include any other useful data (including benchmarks) in the PEP prior to publishing it. I am also intending to include basic benchmarks of the processing module against the threading module as a comparison.

I'm worried whether it's stable, what user base it has, and whether users (other than the authors) are lobbying for inclusion. Statistically, it seems not to be ready yet: it is not even a year old, and has not reached version 1.0 yet.

Regards,
Martin

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
Martin v. Löwis schrieb:
> I'm worried whether it's stable, what user base it has, whether users (other than the authors) are lobbying for inclusion. Statistically, it seems to be not ready yet: it is not even a year old, and has not reached version 1.0 yet.

I'm on Martin's side here. Although I'd like to see some sort of multiprocessing mechanism in Python 'cause I need it for lots of projects, I'm against the inclusion of pyprocessing in 2.6 and 3.0. The project isn't old and mature enough, and it has competitors like pp (parallel processing). On the one hand, the inclusion of a package gives it an unfair advantage over similar packages. On the other hand, it slows down future development, because a new feature release must be synced with Python releases about every 1.5 years.

-0.5 from me

Christian
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
Christian Heimes wrote:
> Martin v. Löwis schrieb:
>> I'm worried whether it's stable, what user base it has, whether users (other than the authors) are lobbying for inclusion. Statistically, it seems to be not ready yet: it is not even a year old, and has not reached version 1.0 yet.
>
> I'm on Martin's side here. Although I like to see some sort of multi processing mechanism in Python 'cause I need it for lots of projects I'm against the inclusion of pyprocessing in 2.6 and 3.0. The project isn't old and mature enough and it has some competitors like pp (parallel processing). On the one hand the inclusion of a package gives it an unfair advantage over similar packages. On the other hand it slows down future development because a new feature release must be synced with Python releases about every 1.5 years. -0.5 from me

It also isn't something which needs to be done *right now*. Leaving this for 3.x/2.7 seems like a much better idea to me. With the continuing rise of multi-processor desktop machines, the parallel processing approaches that are out there should see a fair amount of use and active development over the next 18 months.

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
http://www.boredomandlaziness.org
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
On Wed, May 14, 2008 at 6:58 AM, Nick Craig-Wood [EMAIL PROTECTED] wrote:
> Jesse Noller [EMAIL PROTECTED] wrote:
>> I am looking for any questions, concerns or benchmarks python-dev has regarding the possible inclusion of the pyprocessing module to the standard library - preferably in the 2.6 timeline. In March, I began working on the PEP for the inclusion of the pyprocessing (processing) module into the python standard library[1]. The original email to the stdlib-sig can be found here; it includes a basic overview of the module: http://mail.python.org/pipermail/stdlib-sig/2008-March/000129.html
>>
>> The processing module mirrors/mimics the API of the threading module - and with simple import/subclassing changes, depending on the code, allows you to leverage multi core machines via an underlying forking mechanism. The module also supports the sharing of data across groups of networked machines - a feature obviously not part of the core threading module, but useful in a distributed environment.
>
> I think processing looks interesting and useful, especially since it works on Windows as well as Un*x. However I'd like to see a review of the security - anything which can run across networks of machines has security implications and I didn't see these spelt out in the documentation. Networked running should certainly be disabled by default and need explicit enabling by the user - I'd hate for a new version of python to come with a remote exploit by default...
>
> --
> Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick

See the Manager documentation:
http://pyprocessing.berlios.de/doc/manager-objects.html

And the Listener/Client documentation:
http://pyprocessing.berlios.de/doc/connection-ref.html

Remote access is not implicit - it is explicit - you must spawn a Manager/Client instance and tell it to use the network, instead of it being always networked. I'll add a security audit to the list of open issues though - that's a good point.
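To make the explicit opt-in concrete, here is a minimal sketch of the Listener/Client pattern, written against `multiprocessing.connection` (the stdlib descendant of the pyprocessing API linked above): nothing listens on the network unless you create the Listener yourself, and both ends must present a shared authkey before any data is exchanged. The `serve` and `demo` names are illustrative only.

```python
from multiprocessing.connection import Client, Listener
import threading

def serve(listener):
    # Accept exactly one connection and send back twice the number received.
    with listener.accept() as conn:
        conn.send(conn.recv() * 2)

def demo():
    # Port 0 lets the OS pick a free port; authkey is the shared secret
    # both ends must prove knowledge of (via an HMAC challenge) on connect.
    with Listener(("localhost", 0), authkey=b"secret") as listener:
        t = threading.Thread(target=serve, args=(listener,))
        t.start()
        with Client(listener.address, authkey=b"secret") as conn:
            conn.send(21)
            reply = conn.recv()
        t.join()
    return reply

if __name__ == "__main__":
    print(demo())  # 42
```

A client that presents the wrong authkey is rejected before any application data flows, which is the "explicit, not implicit" property being described.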
-jesse
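The threading-mirroring API that motivates the proposal can be sketched as follows. This is written against the module under its eventual stdlib name, `multiprocessing` (it was distributed as `processing` at the time of this thread), and the `fill`/`run_with` names are hypothetical; the point is that one driver runs unchanged under both `threading` and `multiprocessing`.

```python
import multiprocessing
import queue
import threading

def fill(q, n):
    # Worker: push the squares of 0..n-1 onto the shared queue.
    for i in range(n):
        q.put(i * i)

def run_with(worker_cls, q, n=5):
    # The same driver code works for threading.Thread and
    # multiprocessing.Process, since they expose the same interface.
    w = worker_cls(target=fill, args=(q, n))
    w.start()
    w.join()
    return sorted(q.get() for _ in range(n))

if __name__ == "__main__":
    print(run_with(threading.Thread, queue.Queue()))
    print(run_with(multiprocessing.Process, multiprocessing.Queue()))
```

Both calls print `[0, 1, 4, 9, 16]`; only the worker class and queue type differ, which is the "simple import/subclassing changes" claim made above.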
Re: [Python-Dev] availability of httplib.HTTPResponse.close
Thomas, I think this is related to issue 1348.

Bill
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
On 2008-05-14 14:15, Jesse Noller wrote:
> On Wed, May 14, 2008 at 5:45 AM, Christian Heimes [EMAIL PROTECTED] wrote:
>> [earlier discussion snipped]
>
> I said this in reply to Martin - but the competitors (in my mind) are not as compelling due to the alternative paradigm for application construction they propose. The processing module is an easy win for us if included. Personally - I don't see how inclusion in the stdlib would slow down development - yes, you have to stick with the same release cycle as python-core, but if the module is feature complete and provides a stable API as it stands, I don't see following python-core timelines as overly onerous. The module itself doesn't change that frequently - the last release in April was a bugfix release and API consistency change (the API would have to be locked for inclusion obviously - targeting a 2.7/3.1 release may be advantageous to achieve this).

Why don't you start a parallel-sig and then hash this out with other distributed computing users? You could then reach a decision by the time 2.7 is scheduled for release, and then add the chosen module to the stdlib.
The API of the processing module does look simple and nice, but parallel processing is a minefield - esp. when it comes to handling error situations (e.g. a worker failing, the network going down, fail-over, etc.).

What I'm missing with the processing module is a way to spawn processes on clusters (rather than just on a single machine).

In the scientific world, MPI is the standard API of choice for doing parallel processing, so if we're after standards, supporting MPI would seem to be more attractive than the processing module: http://pypi.python.org/pypi/mpi4py

In the enterprise world, you often find CORBA based solutions: http://omniorb.sourceforge.net/

And then, of course, you have a gazillion specialized solutions such as PyRO: http://pyro.sourceforge.net/

OTOH, perhaps the stdlib should just include entry-level support for some form of parallel processing, in which case processing does look attractive.

--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
On Wed, May 14, 2008 at 11:46 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:
> [earlier discussion snipped]
> The API of the processing module does look simple and nice, but parallel processing is a minefield - esp. when it comes to handling error situations (e.g. a worker failing, network going down, fail-over, etc.). What I'm missing with the processing module is a way to spawn processes on clusters (rather than just on a single machine). [links to mpi4py, omniORB and PyRO snipped] OTOH, perhaps the stdlib should just include entry-level support for some form of parallel processing, in which case processing does look attractive.

Thanks for bringing up something I was going to mention - I am not attempting to solve the distributed computing problem with this proposal. You are right in mentioning that there's a variety of technologies out there for achieving truly loosely-coupled distributed computing, including all of those which you pointed out. I am proposing exactly what you mentioned: entry-level parallel processing. The fact that the processing module does have remote capabilities is a bonus, not core to the proposal.

While in a perfect world a system might exist which truly insulates programmers from the difference between local concurrency and distributed systems, the two are really different problems. My concern is taking advantage of the 8 core machine sitting under my desk (or the 10 or so I have in the lab) - the processing module allows me to do that, easily.
The module is basic enough to be flexible for use with other technologies in highly distributed systems, but it is also simple enough to act as an entry point for people just starting out in the domain. Think of it like the difference between asyncore and Twisted. I could easily see more loosely-coupled, highly-distributed tools being built on top of the basics it provides.

-jesse
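The "entry level" use case Jesse describes, soaking up local cores with no distributed machinery at all, might look like this sketch (again written against `multiprocessing`, the name `processing` took on entering the stdlib; `busy` is a hypothetical stand-in for a CPU-bound task):

```python
from multiprocessing import Pool, cpu_count

def busy(n):
    # Stand-in for a CPU-bound task: sum of squares below n.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # One worker process per CPU; map() splits the work across them
    # and blocks until every result is back, in input order.
    with Pool(processes=cpu_count()) as pool:
        print(pool.map(busy, [10, 100, 1000]))
```

Because the workers are separate processes, each has its own interpreter and GIL, so the CPU-bound calls genuinely run in parallel on a multi-core box.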
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
On Wed, May 14, 2008 at 05:46:25PM +0200, M.-A. Lemburg wrote:
> What I'm missing with the processing module is a way to spawn processes on clusters (rather than just on a single machine). In the scientific world, MPI is the standard API of choice for doing parallel processing, so if we're after standards, supporting MPI would seem to be more attractive than the processing module.

Think of the processing module as an alternative to the threading module, not as an alternative to MPI. In Python, multi-threading can be extremely slow. The processing module gives you a way to convert from using multiple threads to using multiple processes. If it made people feel better, maybe it should be called threading2 instead of multiprocessing. The word processing seems to make people think of parallel processing and clusters, which is missing the point.

Anyway, I would love to see the processing module included in the standard library.

--
Andrew McNabb
http://www.mcnabbs.org/andrew/
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
On Wed, May 14, 2008 at 08:06:15AM -0400, Jesse Noller wrote:
> Overwhelmingly, many of the python programmers I spoke to are looking for a solution that does not require the alteration of a known programming paradigm (i.e: threads) to allow them to take advantage of systems which are not getting faster - instead, they are getting wider. Simply put, due to the GIL, pure python applications can not take advantage of these machines, which are now more common than not, without switching to an alternative interpreter - which for many, myself included, is not an attractive option.

On Newegg, there are currently 15 single core processors listed, but there are 57 dual core processors and 52 quad core processors. By the time Python 2.6 comes out, it will be hard to buy a new computer without multiple cores. In my opinion, the sooner Python has a nice simple library for inter-process communication, the better.

--
Andrew McNabb
http://www.mcnabbs.org/andrew/
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
> In the scientific world, MPI is the standard API of choice for doing parallel processing, so if we're after standards, supporting MPI would seem to be more attractive than the processing module. http://pypi.python.org/pypi/mpi4py

Of course, for MPI, pyprocessing's main functionality (starting new activities) isn't needed - you use the vendor's mpirun binary, which will create as many processes as you wish, following a policy that was set up by the cluster administration, or that you chose in a product-specific manner (e.g. what nodes to involve in the job). If my task was high-performance computing, I would indeed use MPI and ignore pyprocessing.

> In the enterprise world, you often find CORBA based solutions. http://omniorb.sourceforge.net/

Same here: I would prefer CORBA over pyprocessing when I want a componentized, distributed application.

> And then, of course, you have a gazillion specialized solutions such as PyRO: http://pyro.sourceforge.net/

I personally would not use that library, although I know others are very fond of it.

Regards,
Martin
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
> I really do feel that inclusion of this library offers us the best of both worlds - it gives us (as a community) an easy answer to those people who would dismiss python due to the GIL and it also allows users to easily implement their applications.

I really feel that you can get the best of both worlds even without inclusion of the module in the standard library. It should be fairly easy to install, assuming pre-compiled packages are available.

For inclusion into Python, a number of things need to be changed, in particular with respect to the C code. It duplicates a lot of code from the socket and os modules, and should (IMO) be merged into these, rather than being separate - or perhaps merged into the subprocess module. Why isn't it possible to implement all this in pure Python, on top of the subprocess module? If there are limitations in Python that make it impossible to implement such functionality in pure Python - then *those* limitations should be overcome, rather than including the code wholesale.

Regards,
Martin
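Martin's question can be made concrete with a toy sketch built on nothing but the subprocess module: a child interpreter receives work over a pipe and writes the result back. This illustrates the pure-Python approach he asks about; it is not a claim about how pyprocessing is actually implemented, and `square_in_child` is a hypothetical name.

```python
import subprocess
import sys

# One-line child program: read a number on stdin, print its square.
CHILD = "import sys; print(int(sys.stdin.read()) ** 2)"

def square_in_child(n):
    # Spawn a fresh interpreter, feed it input over a pipe, read the result.
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=str(n), capture_output=True, text=True, check=True,
    )
    return int(result.stdout)

if __name__ == "__main__":
    print(square_in_child(7))  # 49
```

What a real library adds on top of this skeleton is exactly the hard part: pickling arbitrary callables and arguments, pooling workers instead of paying one interpreter startup per task, and propagating errors, which is where the C-level plumbing Martin mentions comes in.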
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
On May 14, 2008, at 12:32 PM, Andrew McNabb wrote:
> Think of the processing module as an alternative to the threading module, not as an alternative to MPI. In Python, multi-threading can be extremely slow. The processing module gives you a way to convert from using multiple threads to using multiple processes. If it made people feel better, maybe it should be called threading2 instead of multiprocessing. The word processing seems to make people think of parallel processing and clusters, which is missing the point. Anyway, I would love to see the processing module included in the standard library.

Is the goal of the pyprocessing module to be exactly drop-in compatible with threading, Queue and friends? I guess the idea would be that if my problem is compute bound I'd use pyprocessing, and if it was I/O bound I might just use the existing threading library? Can I write a program using only threading and Queue interfaces for inter-thread communication, and just change my import statements and have my program work? Currently, it looks like the pyprocessing.Queue interface is slightly different from Queue.Queue, for example (no task_done() etc).

Perhaps a stdlib version of pyprocessing could be simplified down to not try to be a cross-machine computing environment, and just be a same-machine threading replacement? This would make the maintenance easier and reduce confusion among users about how they should do cross-machine multiprocessing.

By the way, a thread-safe simple dict in the style of Queue would be extremely helpful in writing multi-threaded programs, whether using the threading or pyprocessing modules.
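No such class exists in the stdlib; a minimal sketch of the requested thread-safe dict, in the style of Queue.Queue (every operation guarded by a single internal lock; `LockedDict` is a hypothetical name), might look like:

```python
import threading

class LockedDict:
    """A simple dict wrapper whose operations each hold an internal lock,
    the same pattern Queue.Queue uses to guard its underlying deque."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def pop(self, key, default=None):
        # Atomic read-and-remove, useful for handing off work items.
        with self._lock:
            return self._data.pop(key, default)
```

Note that in CPython single dict operations are already atomic under the GIL; the lock matters for compound operations like the read-and-remove in `pop`, and it makes the thread-safety contract explicit rather than an implementation accident.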
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
Andrew McNabb [EMAIL PROTECTED] wrote:
> Think of the processing module as an alternative to the threading module, not as an alternative to MPI. In Python, multi-threading can be extremely slow. The processing module gives you a way to convert from using multiple threads to using multiple processes.

Indeed; pyprocessing was exactly what I wanted, and worked exactly as it said it did. It had essentially no learning curve at all because of its implementation of essentially the same interface as the threading module. Its remoting capabilities are almost surplus to requirements; if I wanted that, I might use MPI or something else.

> If it made people feel better, maybe it should be called threading2 instead of multiprocessing. The word processing seems to make people think of parallel processing and clusters, which is missing the point.

threading is to threads as processing is to processes; that's why it was named processing. But the choice of name shouldn't affect the decision as to whether it should be included or not.

> Anyway, I would love to see the processing module included in the standard library.

I would as well; I'm using it in a current project, and can see opportunities for it to be useful in other projects. My only note is that, based on my experience, the code needs a little cleanup for portability reasons before it is included with the Python standard library - it worked fine as-is on Linux, but needed some hackery to get running on a slightly elderly Solaris 8 box. I didn't test it on anything else.

Charles
--
Charles Cazabon
GPL'ed software available at: http://pyropus.ca/software/
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
M.-A. Lemburg wrote:
> The API of the processing module does look simple and nice, but parallel processing is a minefield - esp. when it comes to handling error situations (e.g. a worker failing, network going down, fail-over, etc.). What I'm missing with the processing module is a way to spawn processes on clusters (rather than just on a single machine).

Perhaps one-size-fits-all isn't the right approach here. I think there's room for more than one module -- a simple one for people who just want to spawn some extra processes on the same CPU to take advantage of multiple cores, and a fancier one (maybe based on MPI) for people who want grid-computing style distribution with error handling, fault tolerance, etc.

(I didn't set out to justify that paragraph, btw -- it just happened!)

--
Greg
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
Andrew McNabb wrote:
> If it made people feel better, maybe it should be called threading2 instead of multiprocessing.

I think that errs in the other direction, making it sound like just another way of doing single-process threading, which it's not. Maybe multicore would help give the right impression? (Still allows for networking -- nobody says all the cores have to be on the same machine. :-)

--
Greg
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
Charles Cazabon wrote:
> threading is to threads as processing is to processes; that's why it was named processing.

Unfortunately, the word processing is already used in the field of computing with a very general meaning -- any kind of data transformation at all can be, and is, referred to as processing. So the intended meaning fails to come across.

--
Greg
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
Greg> Maybe multicore would help give the right impression?

multiproc, multiprocess?

Skip
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
At 12:19 PM 5/15/2008 +1200, Greg Ewing wrote:
> Andrew McNabb wrote:
>> If it made people feel better, maybe it should be called threading2 instead of multiprocessing.
>
> I think that errs in the other direction, making it sound like just another way of doing single-process threading, which it's not. Maybe multicore would help give the right impression?

Sounds like a marketing win to me, since it directly addresses the "python doesn't do multicore" meme.
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
On Wed, May 14, 2008 at 6:48 PM, Phillip J. Eby [EMAIL PROTECTED] wrote:
> Sounds like a marketing win to me, since it directly addresses the "python doesn't do multicore" meme.

-1 on multicore - multiprocess or multiprocessing are fine names. Cores are irrelevant: systems have multiple cpus, real or virtual, regardless of how many dies, sockets and cores there are.

+0.5 on inclusion. That means I am happy if it makes it in, but don't think it needs to make it into 2.6/3.0; leave inclusion for 2.7/3.1. It's easy for people to install from an external source for now if they want it.

-gps