Re: Parallel python in the cloud

2014-05-24 Thread Robert Kern

On 2014-05-24 07:46, Charles Gagnon wrote:

We were happily using PiCloud for several long calculations and we were very 
happy with it. With their realtime cores, we could take a really large 
calculation set and run through it fairly quickly.

Now that PiCloud is going away, we ran a few tests on Multyvac but so far, we 
are struggling to accomplish the same thing we had on PiCloud.

I have several "pieces" of my puzzle but can't seem to put them together. 
I've seen and tried StarCluster and also various parallel python options, but 
all of them seem challenging to put together.

The goal is to mimic PiCloud, i.e. loop through a function:

def some_NP_func(x, y):
    ...
    return z

some_cloud.call(some_NP_func, a1, a2)

Which computes the function on the cloud. We use this often in for loops with 
arrays of arguments. The other scenario is:

some_cloud.map(some_NP_intense_func, [...], [...])

Which iterates through and returns results. We need to run a lot of this in 
batch from a scheduler, so I always try to avoid interactive environments (how 
does IPython parallel work in batch?).


IPython parallel works just fine "in batch". As far as your client code (i.e. 
what you wrote above) is concerned, it's just another library. E.g.


https://github.com/ipython/ipython/blob/master/examples/Parallel%20Computing/nwmerge.py
https://github.com/ipython/ipython/blob/master/examples/Parallel%20Computing/itermapresult.py

etc.
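
For illustration, here is a minimal batch-style sketch of the call/map
pattern above using a load-balanced view (not one of the linked examples; it
assumes a cluster already started with, e.g., 'ipcluster start -n 4', and the
function is a placeholder):

from IPython.parallel import Client

client = Client()                   # connect to the running cluster
view = client.load_balanced_view()

def some_NP_func(x, y):
    return x + y                    # stand-in for the real computation

# the equivalent of some_cloud.call(...): one asynchronous call
ar = view.apply_async(some_NP_func, 1, 2)
print(ar.get())                     # blocks until the result arrives

# the equivalent of some_cloud.map(...): map over arrays of arguments
print(view.map_sync(some_NP_func, range(10), range(10)))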

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

--
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-27 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 14:42, Jurko Gospodnetić wrote:

   So far all tests seem to indicate that things work out fine if we
install to some dummy target folder, copy the target folder to some
version specific location & uninstall. That leaves us with a working
Python folder sans the start menu and registry items, both of which we
do not need for this. Everything I've played around with so far seems to
use the correct Python data depending on the interpreter executable
invoked, whether or not there is a regular Windows installation
somewhere on the same machine.

   We can use the script suggested by Ned Batchelder to temporarily
change the 'current installation' if needed for some external installer
package to correctly recognize where to install its content.

   I'm still playing around with this, and will let you know how it goes.


  Just wanted to let you know that the usage I described above seems to 
work in all the cases I tried out.


  I added some batch scripts for running a specific Python interpreter 
as a convenience and everything works 'naturally' in our development 
environment.


  Packages can easily be installed into a specific target environment 
using, for example:

  py243 -m easy_install pip
  py332 -m pip install pytest
[not mentioning tweaks needed for specific ancient Python versions]

  Thank you all for all the suggestions.

  Best regards,
Jurko Gospodnetić


--
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 17:38, Terry Reedy wrote:

   So far all tests seem to indicate that things work out fine if we
install to some dummy target folder, copy the target folder to some
version specific location & uninstall.


If the dummy folder had 3.3.0, you should not need to uninstall to
install 3.3.1 on top of it. But it is easy and probably safest.


  Without the uninstall step you get stuck with invalid registry and 
start menu items referring to an invalid path until you install another 
matching major.minor.X version.




Just a reminder: you can run one file or set of files with multiple
Pythons by putting 'project.pth' containing the same 'path-to-project'
in the Lib/site-packages of each Python directory. I do this to test one
file with 2.7 and 3.3 (and just added 3.4) without copying the file.


  Thanks for the tip. That might come in useful. At the moment I just 
run the pytest framework using different python interpreters, without 
having to install the package at all (possibly first running 'setup.py 
build' to get the sources converted to Python 3 format).


  Best regards,
Jurko Gospodnetić


--
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Terry Reedy

On 11/25/2013 8:42 AM, Jurko Gospodnetić wrote:


   So far all tests seem to indicate that things work out fine if we
install to some dummy target folder, copy the target folder to some
version specific location & uninstall.


If the dummy folder had 3.3.0, you should not need to uninstall to 
install 3.3.1 on top of it. But it is easy and probably safest.



That leaves us with a working
Python folder sans the start menu and registry items, both of which we
do not need for this. Everything I've played around with so far seems to
use the correct Python data depending on the interpreter executable
invoked, whether or not there is a regular Windows installation
somewhere on the same machine.


Just a reminder: you can run one file or set of files with multiple 
Pythons by putting 'project.pth' containing the same 'path-to-project' 
in the Lib/site-packages of each Python directory. I do this to test one 
file with 2.7 and 3.3 (and just added 3.4) without copying the file.
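
For illustration, a concrete (hypothetical) instance of the trick: a file
named project.pth containing the single line

    C:\work\myproject

placed in Lib\site-packages of each installation (say C:\Python27 and
C:\Python33) makes every one of those interpreters add that directory to
sys.path at startup, so each of them can import and run the same files
without copies.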


--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 15:15, Albert-Jan Roskam wrote:

  > Are you sure? 
http://stackoverflow.com/questions/1534210/use-different-python-version-with-virtualenv


  Yup, I'm pretty sure by now (based on reading the docs, not trying it 
out though).


  Virtualenv allows you to set up different environments, each of them 
having a separate Python folder structure and each possibly connected to 
a separate Python interpreter executable. However, it does not solve the 
problem of getting those separate Python interpreter executables 
installed in the first place, which is the problem I was attacking. :-)


  Still playing around with my multiple installations setup here. Will 
let you know how it goes...


  So far, one important thing I noticed is that you need to run all 
your installations 'for the current user only'; otherwise the installer 
moves at least one DLL file (python24.dll) into a Windows system folder, 
and the next installation then deletes it from there and overwrites it 
with its own. :-( But I can live with that... :-)


  Best regards,
Jurko Gospodnetić


--
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Chris Angelico
On Tue, Nov 26, 2013 at 1:15 AM, Albert-Jan Roskam  wrote:
> Below is a little terminal session.  I often switch between python 3.3 and 
> python 2.7. My virtualenv for python 3.3 is called "python33". "workon" is a 
> virtualenv wrapper command. And check out the envlist in tox.ini on 
> http://tox.readthedocs.org/en/latest/example/basic.html

That's two different minor versions, though. Can you have 3.3.1 and
3.3.2 installed by that method?

Incidentally, if this were on Linux, I would just build the different
versions in different directories, and then run them without
installing. But the OP seems to have a solution that works, and I
think it'll be the simplest.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Albert-Jan Roskam

On Mon, 11/25/13, Jurko Gospodnetić  wrote:

 Subject: Re: Parallel Python x.y.A and x.y.B installations on a single Windows 
machine
 To: python-list@python.org
 Date: Monday, November 25, 2013, 2:57 PM
 
   Hi.
 
 On 25.11.2013. 14:20, Albert-Jan Roskam wrote:
 > Check out the following packages: virtualenv, virtualenvwrapper, tox
 > virtualenv + wrapper make it very easy to switch from one python
 > version to another. Strictly speaking you don't need virtualenvwrapper,
 > but it makes working with virtualenv a whole lot easier. Tox also uses
 > virtualenv. You can configure it to sdist your package under different
 > python versions. Also, you can make it run nosetests for each python
 > version and/or implementation (pypy and jython are supported)
 
   I'll look into using virtualenv and possibly tox once I get into
 issues with mismatched installed Python package versions, but for now
 I'm dealing with installing different Python interpreter versions and,
 unless I'm overlooking something here, virtualenv does not help with
 that. :-(
 
 > Are you sure? 
http://stackoverflow.com/questions/1534210/use-different-python-version-with-virtualenv

Below is a little terminal session.  I often switch between python 3.3 and 
python 2.7. My virtualenv for python 3.3 is called "python33". "workon" is a 
virtualenv wrapper command. And check out the envlist in tox.ini on 
http://tox.readthedocs.org/en/latest/example/basic.html

antonia@antonia-HP-2133 ~ $ workon python3.3
ERROR: Environment 'python3.3' does not exist. Create it with 'mkvirtualenv 
python3.3'.
antonia@antonia-HP-2133 ~ $ workon python33
(python33)antonia@antonia-HP-2133 ~ $ python
Python 3.3.2 (default, Sep  1 2013, 22:59:57) 
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()
(python33)antonia@antonia-HP-2133 ~ $ deactivate
antonia@antonia-HP-2133 ~ $ python
Python 2.7.3 (default, Sep 26 2013, 16:38:10) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()
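
(For reference, a minimal tox.ini along the lines of the basic example
linked above; the interpreter list is illustrative:)

[tox]
envlist = py27,py33

[testenv]
deps = pytest
commands = py.test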



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 14:20, Albert-Jan Roskam wrote:

Check out the following packages: virtualenv, virtualenvwrapper, tox
virtualenv + wrapper make it very easy to switch from one python
version to another. Strictly speaking you don't need
virtualenvwrapper, but it makes working with virtualenv a whole lot
easier. Tox also uses virtualenv. You can configure it to sdist your
package under different python versions. Also, you can make it run
nosetests for each python version and/or implementation (pypy and
jython are supported)


  I'll look into using virtualenv and possibly tox once I get into 
issues with mismatched installed Python package versions, but for now 
I'm dealing with installing different Python interpreter versions and, 
unless I'm overlooking something here, virtualenv does not help with 
that. :-(


  Thanks for the suggestion though, I'm definitely going to read up on 
those packages soon. :-)


  Best regards,
Jurko Gospodnetić


--
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 13:46, Ned Batchelder wrote:

IIRC, Python itself doesn't read those registry entries, except when installing 
pre-compiled .msi or .exe kits.  Once you have Python installed, you can move 
the directory someplace else, then install another version of Python.

If you need to use many different Pythons of the same version, this script 
helps manage the registry: 
http://nedbatchelder.com/blog/201007/installing_python_packages_from_windows_installers_into.html


  Thank you for the information!

  As I mentioned in another reply, so far I think we can use this 
script to temporarily change the 'current installation' if needed for 
some external installer package to correctly recognize where to install 
its content.


  If we do use it, I'll most likely modify it to first 
make a backup copy of the original registry key and use that later on to 
restore the original registry state instead of reconstructing its 
content to what the script assumes it should be.


  Best regards,
Jurko Gospodnetić


--
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Chris Angelico
On Tue, Nov 26, 2013 at 12:42 AM, Jurko Gospodnetić
 wrote:
>   Yup, we could do that, but at first glance it really smells like
> overkill. Not to mention the potential licensing issues with Windows and an
> unlimited number of Windows installations. :-)

Ah, heh... didn't think of that. When I spin up arbitrary numbers of
VMs, they're always Linux, so licensing doesn't come into it :)

>   So far all tests seem to indicate that things work out fine if we install
> to some dummy target folder, copy the target folder to some version specific
> location & uninstall. That leaves us with a working Python folder sans the
> start menu and registry items, both of which we do not need for this.
> Everything I've played around with so far seems to use the correct Python
> data depending on the interpreter executable invoked, whether or not there
> is a regular Windows installation somewhere on the same machine.

Okay! That sounds good. Underkill is better than overkill if you can
get away with it!

Good luck. You'll need it, if you're trying to support Python 2.4 and
all newer versions AND manage issues across patch releases...

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Jurko Gospodnetić

  Hi.

On 25.11.2013. 14:04, Chris Angelico wrote:

Is it possible to set up virtualization to help you out? Create a
virtual machine in something like VirtualBox, then clone it for every
Python patch you want to support (you could have one VM that handles
all the .0 releases and another that handles all the .1 releases, or
you could have a separate VM for every Python you want to test).
...


  Thank you for the suggestion.

  Yup, we could do that, but at first glance it really smells like 
overkill. Not to mention the potential licensing issues with Windows and 
an unlimited number of Windows installations. :-)


  So far all tests seem to indicate that things work out fine if we 
install to some dummy target folder, copy the target folder to some 
version specific location & uninstall. That leaves us with a working 
Python folder sans the start menu and registry items, both of which we 
do not need for this. Everything I've played around with so far seems to 
use the correct Python data depending on the interpreter executable 
invoked, whether or not there is a regular Windows installation 
somewhere on the same machine.


  We can use the script suggested by Ned Batchelder to temporarily 
change the 'current installation' if needed for some external installer 
package to correctly recognize where to install its content.


  I'm still playing around with this, and will let you know how it goes.

  Thank you again for replying!

  Best regards,
Jurko Gospodnetić


--
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Albert-Jan Roskam

On Mon, 11/25/13, Jurko Gospodnetić  wrote:

 Subject: Parallel Python x.y.A and x.y.B installations on a single Windows 
machine
 To: python-list@python.org
 Date: Monday, November 25, 2013, 1:32 PM
 
   Hi all.
 
   I was wondering what is the best way to install
 multiple Python installations on a single Windows machine.
 
   Regular Windows installer works great as long as all
 your installations have a separate major.minor version
 identifier. However, if you want to have let's say 2.4.3
 & 2.4.4 installed at the same time it does not seem to
 work.
 
   I have not been able to find any prepackaged Python
 installation or really any solution to this. Most of the
 advice seems to boil down to 'do not use such versions
 together, use only the latest'.
 
   We would like to run automated tests on one of our
 projects (packaged as a Python library) with different
 Python versions, and since our code contains workarounds for
 several problems with specific Python patch versions, we'd
 really like to be able to run the tests with those specific
 versions and with as little fuss as possible.
 
   Looking at what the Python installer does, the only
 problematic part for working around this manually seems to
 be the registry entries under
 'Software\Python\PythonCore\M.m' where 'M.m' is the
 major.minor version identifier. If the Python interpreter
 expects to always find its entries there, then I guess there
 is no way to do what we need without building customized
 Python executables. Is there a way to force a specific
 Python interpreter to not read in this information, or to read it
 from an .ini file or something similar?
 
HI Jurko,

Check out the following packages: virtualenv, virtualenvwrapper, tox
virtualenv + wrapper make it very easy to switch from one python version to 
another. Strictly speaking you don't need virtualenvwrapper, but it makes 
working with virtualenv a whole lot easier. Tox also uses virtualenv. You can 
configure it to sdist your package under different python versions. Also, you 
can make it run nosetests for each python version and/or implementation (pypy 
and jython are supported)

Albert-Jan 


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Chris Angelico
On Mon, Nov 25, 2013 at 11:32 PM, Jurko Gospodnetić
 wrote:
> Most of the advice seems to boil down to 'do not use such versions together,
> use only the latest'.
>
>   We would like to run automated tests on one of our projects (packaged as a
> Python library) with different Python versions, and since our code contains
> workarounds for several problems with specific Python patch versions, we'd
> really like to be able to run the tests with those specific versions and
> with as little fuss as possible.

What this says to me is that you're doing something very unusual here
- most people won't be doing that. So maybe you need an unusual
solution.

Is it possible to set up virtualization to help you out? Create a
virtual machine in something like VirtualBox, then clone it for every
Python patch you want to support (you could have one VM that handles
all the .0 releases and another that handles all the .1 releases, or
you could have a separate VM for every Python you want to test). You
could then have a centralized master that each VM registers itself
with, and it feeds out jobs to them. Assuming your tests can be fully
automated, this could work out fairly efficiently - each VM has a
script that establishes a socket connection to the master, the master
hands out a job, the VMs run the test suite, the master collects up a
series of Pass/Fail reports. You could run everything on a single
physical computer, even.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python x.y.A and x.y.B installations on a single Windows machine

2013-11-25 Thread Ned Batchelder
On Monday, November 25, 2013 7:32:30 AM UTC-5, Jurko Gospodnetić wrote:
> Hi all.
> 
>I was wondering what is the best way to install multiple Python 
> installations on a single Windows machine.
> 
>Regular Windows installer works great as long as all your 
> installations have a separate major.minor version identifier. However, 
> if you want to have let's say 2.4.3 & 2.4.4 installed at the same time 
> it does not seem to work.
> 
>I have not been able to find any prepackaged Python installation or 
> really any solution to this. Most of the advice seems to boil down to 
> 'do not use such versions together, use only the latest'.
> 
>We would like to run automated tests on one of our projects (packaged 
> as a Python library) with different Python versions, and since our code 
> contains workarounds for several problems with specific Python patch 
> versions, we'd really like to be able to run the tests with those 
> specific versions and with as little fuss as possible.
> 
>Looking at what the Python installer does, the only problematic part 
> for working around this manually seems to be the registry entries under 
> 'Software\Python\PythonCore\M.m' where 'M.m' is the major.minor version 
> identifier. If the Python interpreter expects to always find its entries 
> there, then I guess there is no way to do what we need without building 
> customized Python executables. Is there a way to force a specific Python 
> interpreter to not read in this information, or to read it from an .ini file 
> or something similar?
> 
>Many thanks.
> 
>Best regards,
>  Jurko Gospodnetić

IIRC, Python itself doesn't read those registry entries, except when installing 
pre-compiled .msi or .exe kits.  Once you have Python installed, you can move 
the directory someplace else, then install another version of Python.

If you need to use many different Pythons of the same version, this script 
helps manage the registry: 
http://nedbatchelder.com/blog/201007/installing_python_packages_from_windows_installers_into.html
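
(Not Ned's script itself - just a minimal sketch of the underlying idea,
assuming the 'Software\Python\PythonCore\M.m' key layout mentioned in the
original post; the install path is illustrative, and an all-users install
would use HKEY_LOCAL_MACHINE instead of HKEY_CURRENT_USER:)

import _winreg

def point_registry_at(version, install_path):
    # Set the default value of
    # HKCU\Software\Python\PythonCore\<version>\InstallPath so that
    # binary installers treat that directory as the current install.
    _winreg.SetValue(_winreg.HKEY_CURRENT_USER,
                     r"Software\Python\PythonCore\%s\InstallPath" % version,
                     _winreg.REG_SZ, install_path)

point_registry_at("2.4", r"C:\Pythons\Python243")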

--Ned.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallel python + ??

2008-06-11 Thread Thor
Gerhard Häring wrote:

> This is of course OS-specific. On Linux, you can parse the proc
> filesystem:
> 
 >>> open("/proc/%i/stat" % os.getpid()).read().split()[38]
> 
> You can use the "taskset" utility to query or set CPU affinity on Linux.
> 
It is going to be on Linux (mainly). I was thinking about something like
this:

import Module

def process(self):
  print "I am running on processor", Module.cpu, "core", Module.core


Checking the taskset right now... :) Thanks.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel python + ??

2008-06-11 Thread Gerhard Häring

Thor wrote:

Hi,

I am running a program using Parallel Python and I wonder if there is a
way/module to know which CPU/core the process is running on. Is that
possible?


This is of course OS-specific. On Linux, you can parse the proc filesystem:

>>> open("/proc/%i/stat" % os.getpid()).read().split()[38]

You can use the "taskset" utility to query or set CPU affinity on Linux.
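
For completeness, a self-contained sketch of that technique (note: per
proc(5) the 'processor' field is the 39th in /proc/<pid>/stat, i.e. index 38
after split(), assuming the process name contains no spaces):

import os

def current_cpu():
    # CPU number this process last executed on (Linux only)
    fields = open("/proc/%d/stat" % os.getpid()).read().split()
    return int(fields[38])  # field 39 in proc(5) is 'processor'

print "running on CPU", current_cpu()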

-- Gerhard

--
http://mail.python.org/mailman/listinfo/python-list


RE: Parallel Python environments..

2007-11-07 Thread Thorsten Kampe
* bruce (Tue, 6 Nov 2007 13:43:10 -0800)
> if i have python 2.4.3 installed, it gets placed in the python2.4 dir.. if i
> don't do anything different, and install python 2.4.2, it too will get
> placed in the python2.4 tree... which is not what i want.
> 
> i'm running rhel4/5...

So you're using rpm as a package manager. I suggest you RTFM to see if 
there are options for slots or dual installations.
 
> so.. i still need to know what to do/change in order to be able to run
> multiple versions of python, and to switch back/forth between the versions.

Unpack the Python rpm to ~/bin or compile Python yourself. And be more 
specific about what you mean by "switching back/forth". 

On the other hand - as Gabriel pointed out: there is almost a 100% 
certainty that the problem you want to solve by having Python 2.4.2 
*and* 2.4.3 simultaneously exists only in your head or cannot be 
solved this way.

Thorsten
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Parallel Python environments..

2007-11-06 Thread bruce
hi gabriel...

i have my reasons, for some testing that i'm doing on a project.

that said, i'm still trying to figure out how to make this occur...

thanks



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of Gabriel Genellina
Sent: Tuesday, November 06, 2007 2:07 PM
To: python-list@python.org
Subject: Re: Parallel Python environments..


On Tue, 06 Nov 2007 18:43:10 -0300, bruce <[EMAIL PROTECTED]>
wrote:

> if i have python 2.4.3 installed, it gets placed in the python2.4 dir..
> if i
> don't do anything different, and install python 2.4.2, it too will get
> placed in the python2.4 tree... which is not what i want.

Any reason you want to keep 2.4.2 *and* 2.4.3 separate? The latter is only
a bugfix over the 2.4 version - anything working on 2.4.2 should work on
2.4.3, and on 2.4.4, the latest bugfix in that series. Binaries, shared
libraries, extensions, etc. targeted at 2.4.x should work with 2.4.4.
You may want to have separate directories for 2.4 and 2.5, yes; binaries,
shared libraries and extensions do NOT work across versions changing the
SECOND digit. But changes in the THIRD digit should not have compatibility
problems.

--
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python environments..

2007-11-06 Thread Gabriel Genellina
On Tue, 06 Nov 2007 18:43:10 -0300, bruce <[EMAIL PROTECTED]>  
wrote:

> if i have python 2.4.3 installed, it gets placed in the python2.4 dir..  
> if i
> don't do anything different, and install python 2.4.2, it too will get
> placed in the python2.4 tree... which is not what i want.

Any reason you want to keep 2.4.2 *and* 2.4.3 separate? The latter is only  
a bugfix over the 2.4 version - anything working on 2.4.2 should work on  
2.4.3, and on 2.4.4, the latest bugfix in that series. Binaries, shared  
libraries, extensions, etc. targeted at 2.4.x should work with 2.4.4.
You may want to have separate directories for 2.4 and 2.5, yes; binaries,  
shared libraries and extensions do NOT work across versions changing the  
SECOND digit. But changes in the THIRD digit should not have compatibility  
problems.

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Parallel Python environments..

2007-11-06 Thread bruce
thorsten...

if i have python 2.4.3 installed, it gets placed in the python2.4 dir.. if i
don't do anything different, and install python 2.4.2, it too will get
placed in the python2.4 tree... which is not what i want.

i'm running rhel4/5...

so.. i still need to know what to do/change in order to be able to run
multiple versions of python, and to switch back/forth between the versions.

thanks


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of Thorsten Kampe
Sent: Tuesday, November 06, 2007 8:19 AM
To: python-list@python.org
Subject: Re: Parallel Python environments..


* bruce (Tue, 6 Nov 2007 07:13:43 -0800)
> If I wanted to be able to build/test/use parallel python versions, what
> would I need to do/set (paths/libs/etc...)

nothing

> and where would I need to place the 2nd python version, so as not to
> screw up my initial python dev env.

Anywhere you like (probably ~/bin would be best)

> Any sites/pointers describing the process would be helpful. In particular,
> any changes to the bashrc/profile/etc... files to allow me to accomplish
> this would be helpful.

Nothing like that. Just change the shebang.

Thorsten
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Parallel Python environments..

2007-11-06 Thread bruce
i'm running rhel...

so there isn't a python-config script as far as i know..


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf
Of [EMAIL PROTECTED]
Sent: Tuesday, November 06, 2007 8:26 AM
To: python-list@python.org
Subject: Re: Parallel Python environments..


In Gentoo Linux you can select between installed python versions using the
python-config script.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python environments..

2007-11-06 Thread Diez B. Roggisch
bruce wrote:

> Hi..
> 
> If I wanted to be able to build/test/use parallel python versions, what
> would I need to do/set (paths/libs/etc...) and where would I need to place
> the 2nd python version, so as not to screw up my initial python dev env.
> 
> I'd like to be able to switch back/forth between the different versions if
> possible. I know it should be, but I haven't been able to find what I'm
> looking for via the 'net...
> 
> Any sites/pointers describing the process would be helpful. In
> particular, any changes to the bashrc/profile/etc... files to allow me to
> accomplish this would be helpful.

Installation of several python versions is easy on at least Windows &
unixish platforms. For the latter, usually your package management offers
several versions. If yours doesn't, or you use Windows, just install each
version as required by Python itself.

Only the default Python version (e.g. the one used when double-clicking
*.py files in Windows Explorer) depends on which version was installed
last; it must be changed through the OS's own means if other behavior is
desired.

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python environments..

2007-11-06 Thread [EMAIL PROTECTED]
In Gentoo Linux you can select between installed python versions using the
python-config script.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python environments..

2007-11-06 Thread Thorsten Kampe
* bruce (Tue, 6 Nov 2007 07:13:43 -0800)
> If I wanted to be able to build/test/use parallel python versions, what
> would I need to do/set (paths/libs/etc...)

nothing

> and where would I need to place the 2nd python version, so as not to
> screw up my initial python dev env.

Anywhere you like (probably ~/bin would be best)
 
> Any sites/pointers describing the process would be helpful. In particular,
> any changes to the bashrc/profile/etc... files to allow me to accomplish
> this would be helpful.

Nothing like that. Just change the shebang.
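
For example (paths illustrative), with a second interpreter built into
~/bin, a script is pinned to it by its first line:

#!/home/you/bin/python2.4.2
# myscript.py - runs under the interpreter named in the shebang line,
# no matter which python comes first in $PATH
import sys
print sys.version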

Thorsten
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-02-04 Thread parallelpython
On Jan 12, 11:52 am, Neal Becker <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > Has anybody tried to run parallel python applications?
> > It appears that if your application is computation-bound, using 'thread'
> > or 'threading' modules will not get you any speedup. That is because the
> > python interpreter uses the GIL (Global Interpreter Lock) for internal
> > bookkeeping. The latter allows only one python byte-code instruction to
> > be executed at a time even if you have a multiprocessor computer.
> > To overcome this limitation, I've created the ppsmp module:
> > http://www.parallelpython.com
> > It provides an easy way to run parallel python applications on smp
> > computers.
> > I would appreciate any comments/suggestions regarding it.
> > Thank you!
>
> Looks interesting, but is there any way to use this for a cluster of
> machines over a network (not smp)?

There are 2 major updates regarding Parallel Python:
http://www.parallelpython.com

1) Now (since version 1.2) the parallel python software can be used for
cluster-wide parallelization (or even Internet-wide). It's also
renamed accordingly: pp (the module is backward compatible with ppsmp)

2) Parallel Python became open source (under a BSD license):
http://www.parallelpython.com/content/view/18/32/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-13 Thread parallelpython
> Looks interesting, but is there any way to use this for a cluster of
> machines over a network (not smp)?

Networking capabilities will be included in the next release of
Parallel Python software (http://www.parallelpython.com), which is
coming soon.


> Couldn't you just provide similar conveniences on top of MPI? Searching
> for "Python MPI" yields a lot of existing work (as does "Python PVM"),
> so perhaps someone has already done so.

Yes, it's possible to do it on top of any environment which
supports IPC.

> That's one more project... It seems that there is significant
> interest in parallel computing in Python. Perhaps we should start a
> special interest group? Not so much in order to work on a single
> project; I believe that at the current state of parallel computing we
> still need many different approaches to be tried. But an exchange of
> experience could well be useful for all of us.
Well, I may just add that everybody is welcome to start a discussion
regarding any parallel python project or idea in this forum:
http://www.parallelpython.com/component/option,com_smf/Itemid,29/board,2.0

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread mheslep

Konrad Hinsen wrote:
> Perhaps we should start a
> special interest group? Not so much in order to work on a single
> project; I believe that at the current state of parallel computing we
> still need many different approaches to be tried. But an exchange of
> experience could well be useful for all of us.
> 
+ 1

-Mark

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Konrad Hinsen
On Jan 12, 2007, at 15:08, Paul Boddie wrote:

> It seems to me that a more useful first step would be to create an
> overview of the different modules and put it on the python.org Wiki:
>
> http://wiki.python.org/moin/FrontPage
> http://wiki.python.org/moin/UsefulModules (a reasonable entry point)
>
> If no-one beats me to it, I may write something up over the weekend.

That sounds like a good idea. I won't beat you to it, but I'll have a  
look next week and perhaps add information that I have.

Konrad.
--
-
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: [EMAIL PROTECTED]
-


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Paul Boddie
Konrad Hinsen wrote:
>
> That's one more project... It seems that there is significant
> interest in parallel computing in Python. Perhaps we should start a
> special interest group? Not so much in order to work on a single
> project; I believe that at the current state of parallel computing we
> still need many different approaches to be tried. But an exchange of
> experience could well be useful for all of us.

I think a special interest group might be productive, but I've seen
varying levels of special interest in the different mailing lists
associated with such groups: the Web-SIG list started with enthusiasm,
produced a cascade of messages around WSGI, then dried up; the XML-SIG
list seems to be a sorry indication of how Python's XML scene has
drifted onto other matters; other such groups have also lost their
momentum.

It seems to me that a more useful first step would be to create an
overview of the different modules and put it on the python.org Wiki:

http://wiki.python.org/moin/FrontPage
http://wiki.python.org/moin/UsefulModules (a reasonable entry point)

If no-one beats me to it, I may write something up over the weekend.

Paul

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Konrad Hinsen
On Jan 12, 2007, at 11:21, Paul Boddie wrote:

> done. My own experience with regard to releasing software is that even
> with an open source licence, most people are more likely to ignore your
> projects than to suddenly jump on board and take control, and even if

My experience is exactly the same. And looking into the big world of  
Open Source programs, the only case I ever heard of in which a  
project was forked by someone else is the Emacs/XEmacs split. I'd be  
happy if any of my projects ever reached that level of interest.

> Related to your work, I've released a parallel execution solution
> called parallel/pprocess [1] under the LGPL and haven't really heard
> about anyone really doing anything with it, let alone forking it and

That's one more project... It seems that there is significant  
interest in parallel computing in Python. Perhaps we should start a  
special interest group? Not so much in order to work on a single  
project; I believe that at the current state of parallel computing we  
still need many different approaches to be tried. But an exchange of  
experience could well be useful for all of us.

Konrad.
--
-
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: [EMAIL PROTECTED]
-


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Paul Boddie
robert wrote:
> Paul Boddie wrote:
> >
> > [1] http://www.python.org/pypi/parallel
>
> I'd be interested in an overview.

I think we've briefly discussed the above solution before, and I don't
think you're too enthusiastic about anything using interprocess
communication, which is what the above solution uses. Moreover, it's
intended as a threading replacement for SMP/multicore architectures
where one actually gets parallel execution (since it uses processes).

> For ease of use a major criterion for me would be a pure python
> solution, which also does the job of starting and controlling the
> other process(es) automatically right (by default) on common
> platforms.
> Which of the existing (RPC) solutions are that nice?

Many people have nice things to say about Pyro, and there seem to be
various modules attempting parallel processing, or at least some kind
of job control, using that technology. See Konrad Hinsen's
ScientificPython solution for an example of this - I'm sure I've seen
others, too.

Paul

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Neal Becker
[EMAIL PROTECTED] wrote:

> Has anybody tried to run parallel python applications?
> It appears that if your application is computation-bound, using 'thread'
> or 'threading' modules will not get you any speedup. That is because the
> python interpreter uses the GIL (Global Interpreter Lock) for internal
> bookkeeping. The latter allows only one python byte-code instruction to
> be executed at a time even if you have a multiprocessor computer.
> To overcome this limitation, I've created ppsmp module:
> http://www.parallelpython.com
> It provides an easy way to run parallel python applications on smp
> computers.
> I would appreciate any comments/suggestions regarding it.
> Thank you!
> 

Looks interesting, but is there any way to use this for a cluster of
machines over a network (not smp)?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread robert
Paul Boddie wrote:
> [EMAIL PROTECTED] wrote:
>> The main difference between MPI python solutions and ppsmp is that with
>> MPI you have to organize both computations
>> {MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else } and
>> data distribution (MPI_Send / MPI_Recv) by yourself. While with ppsmp
>> you just submit a function with arguments to the execution server and
>> retrieve the results later.
> 
> Couldn't you just provide similar conveniences on top of MPI? Searching
> for "Python MPI" yields a lot of existing work (as does "Python PVM"),
> so perhaps someone has already done so. Also, what about various grid
> toolkits?
> 
> [...]
> 
>> Overall ppsmp is still work in progress and there are other interesting
>> features which I would like to implement. This is the main reason why I
>> do not open the source of ppsmp - to have better control of its future
>> development, as advised here: http://en.wikipedia.org/wiki/Freeware :-)
> 
> Despite various probable reactions from people who will claim that
> they're comfortable with binary-only products from a single vendor, I
> think more people would be inclined to look at your software if you did
> distribute the source code, even if they then disregarded what you've
> done. My own experience with regard to releasing software is that even
> with an open source licence, most people are likely to ignore your
> projects than to suddenly jump on board and take control, and even if
> your project somehow struck a chord and attracted a lot of interested
> developers, would it really be such a bad thing? Many developers have
> different experiences and insights which can only make your project
> better, anyway.
> 
> Related to your work, I've released a parallel execution solution
> called parallel/pprocess [1] under the LGPL and haven't really heard
> about anyone really doing anything with it, let alone forking it and
> showing my original efforts in a bad light. Perhaps most of the
> downloaders believe me to be barking up the wrong tree (or just
> barking) with the approach I've taken, but I think the best thing is to
> abandon any fears of not doing things the best possible way and just be
> open to improvements and suggestions.
> 
> Paul
> 
> [1] http://www.python.org/pypi/parallel

I'd be interested in an overview.
For ease of use a major criterion for me would be a pure python 
solution, which also does the job of starting and controlling the 
other process(es) automatically right (by default) on common 
platforms.
Which of the existing (RPC) solutions are that nice?


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
"Paul Boddie" <[EMAIL PROTECTED]> writes:
|> [EMAIL PROTECTED] wrote:
|> >
|> > The main difference between MPI python solutions and ppsmp is that with
|> > MPI you have to organize both computations
|> > {MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else } and
|> > data distribution (MPI_Send / MPI_Recv) by yourself. While with ppsmp
|> > you just submit a function with arguments to the execution server and
|> > retrieve the results later.
|> 
|> Couldn't you just provide similar conveniences on top of MPI? Searching
|> for "Python MPI" yields a lot of existing work (as does "Python PVM"),
|> so perhaps someone has already done so. 

Yes.  No problem.

|> Also, what about various grid toolkits?

If you can find one that is robust enough for real work by someone who
is not deeply into developing Grid software, I will be amazed.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-12 Thread Paul Boddie
[EMAIL PROTECTED] wrote:
>
> The main difference between MPI python solutions and ppsmp is that with
> MPI you have to organize both computations
> {MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else } and
> data distribution (MPI_Send / MPI_Recv) by yourself. While with ppsmp
> you just submit a function with arguments to the execution server and
> retrieve the results later.

Couldn't you just provide similar conveniences on top of MPI? Searching
for "Python MPI" yields a lot of existing work (as does "Python PVM"),
so perhaps someone has already done so. Also, what about various grid
toolkits?

[...]

> Overall ppsmp is still work in progress and there are other interesting
> features which I would like to implement. This is the main reason why I
> do not open the source of ppsmp - to have better control of its future
> development, as advised here: http://en.wikipedia.org/wiki/Freeware :-)

Despite various probable reactions from people who will claim that
they're comfortable with binary-only products from a single vendor, I
think more people would be inclined to look at your software if you did
distribute the source code, even if they then disregarded what you've
done. My own experience with regard to releasing software is that even
with an open source licence, most people are likely to ignore your
projects than to suddenly jump on board and take control, and even if
your project somehow struck a chord and attracted a lot of interested
developers, would it really be such a bad thing? Many developers have
different experiences and insights which can only make your project
better, anyway.

Related to your work, I've released a parallel execution solution
called parallel/pprocess [1] under the LGPL and haven't really heard
about anyone really doing anything with it, let alone forking it and
showing my original efforts in a bad light. Perhaps most of the
downloaders believe me to be barking up the wrong tree (or just
barking) with the approach I've taken, but I think the best thing is to
abandon any fears of not doing things the best possible way and just be
open to improvements and suggestions.

Paul

[1] http://www.python.org/pypi/parallel

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread jairodsl
Hi,

You guys forgot pyMPI, http://pympi.sourceforge.net/  It works fine !!!
Installation and configuration are a little hard, but finally it works !!!

Cordially,

Jairo Serrano
Bucaramanga, Colombia

[EMAIL PROTECTED] wrote:
> >
> > Thus there are different levels of parallelization:
> >
> > 1 file/database based; multiple batch jobs
> > 2 Message Passing, IPC, RPC, ...
> > 3 Object Sharing
> > 4 Sharing of global data space (Threads)
> > 5 Local parallelism / Vector computing, MMX, 3DNow,...
> >
> > There are good reasons for all of these levels.
> > Yet "parallel python" to me fakes to be on level 3 or 4 (or even 5 :-) ), 
> > while its just a level 2
> > system, where "passing", "remote", "inter-process" ... are the right 
> > vocables.
> In one of the previous posts I've mentioned that ppsmp is based on
> processes + IPC, which makes it a system with level 2 parallelization,
> the same level where MPI is.
> Also it's obvious from the fact that it's written completely in python,
> as python objects cannot be shared due to GIL (POSH can do sharing
> because it's an extension written in C).

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread parallelpython
>
> Thus there are different levels of parallelization:
>
> 1 file/database based; multiple batch jobs
> 2 Message Passing, IPC, RPC, ...
> 3 Object Sharing
> 4 Sharing of global data space (Threads)
> 5 Local parallelism / Vector computing, MMX, 3DNow,...
>
> There are good reasons for all of these levels.
> Yet "parallel python" to me fakes to be on level 3 or 4 (or even 5 :-) ), 
> while its just a level 2
> system, where "passing", "remote", "inter-process" ... are the right vocables.
In one of the previous posts I've mentioned that ppsmp is based on
processes + IPC, which makes it a system with level 2 parallelization,
the same level where MPI is.
Also it's obvious from the fact that it's written completely in python,
as python objects cannot be shared due to GIL (POSH can do sharing
because it's an extension written in C).

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread parallelpython
sturlamolden wrote:
> [EMAIL PROTECTED] wrote:
>
> >That's right. ppsmp starts multiple interpreters in separate
> > processes and organize communication between them through IPC.
>
> Thus you are basically reinventing MPI.
>
> http://mpi4py.scipy.org/
> http://en.wikipedia.org/wiki/Message_Passing_Interface

Thanks for bringing that into consideration.

I am well aware of MPI and have written several programs in C/C++ and
Fortran which use it.
I would agree that MPI is the most common solution to run software on a
cluster (computers connected by network). Although there is another
parallelization approach: PVM (Parallel Virtual Machine)
http://www.csm.ornl.gov/pvm/pvm_home.html. I would say ppsmp is more
similar to the latter.

By the way there are links to different python parallelization
techniques (including MPI) from PP site:
http://www.parallelpython.com/component/option,com_weblinks/catid,14/Itemid,23/

The main difference between MPI python solutions and ppsmp is that with
MPI you have to organize both computations
{MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else } and
data distribution (MPI_Send / MPI_Recv) by yourself. While with ppsmp
you just submit a function with arguments to the execution server and
retrieve the results later.
That makes transition from serial python software to parallel much
simpler with ppsmp than with MPI.
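
(For comparison, a minimal sketch of that organize-it-yourself style using
the mpi4py bindings linked above - run with e.g. 'mpiexec -n 2 python
script.py'; the payload here is illustrative:)

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # rank 0 distributes the data and collects the result itself
    comm.send([1, 2, 3], dest=1, tag=1)
    print "result:", comm.recv(source=1, tag=2)
elif rank == 1:
    data = comm.recv(source=0, tag=1)
    comm.send(sum(data), dest=0, tag=2)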

To make this point clearer here is a short example:
serial code (2 lines) ------------------------------------------
for input in inputs:
    print "Sum of primes below", input, "is", sum_primes(input)
parallel code (3 lines) ----------------------------------------
jobs = [(input, job_server.submit(sum_primes, (input,), (isprime,),
        ("math",))) for input in inputs]
for input, job in jobs:
    print "Sum of primes below", input, "is", job()
----------------------------------------------------------------
In this example parallel execution was added at the cost of 1 line of
code!
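
(For completeness, a self-contained version of the above, along the lines
of pp's documented example; the helper bodies here are simplified:)

import math, pp

def isprime(n):
    # return True if n is prime
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

def sum_primes(n):
    # sum of all primes below n
    return sum([x for x in xrange(2, n) if isprime(x)])

job_server = pp.Server()  # autodetects the number of local CPUs
inputs = (100000, 100100, 100200)
jobs = [(input, job_server.submit(sum_primes, (input,), (isprime,),
        ("math",))) for input in inputs]
for input, job in jobs:
    print "Sum of primes below", input, "is", job()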

The other difference with MPI is that ppsmp dynamically decides where
to run each given job. For example, if there are other active processes
running in the system, ppsmp will make greater use of the processors
which are free. Since with MPI the whole task is usually divided
equally between processors at the beginning, the overall runtime will
be determined by the slowest-running process (the one which shares a
processor with another running program). In this particular case ppsmp
will outperform MPI.

The third, probably less important, difference is that with MPI-based
parallel python code you must have MPI installed on the system.

Overall ppsmp is still work in progress and there are other interesting
features which I would like to implement. This is the main reason why I
do not open the source of ppsmp - to have better control of its future
development, as advised here: http://en.wikipedia.org/wiki/Freeware :-)

Best regards,
Vitalii

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Konrad Hinsen
On Jan 8, 2007, at 11:33, Duncan Booth wrote:

> The 'parallel python' site seems very sparse on the details of how  
> it is
> implemented but it looks like all it is doing is spawning some  
> subprocesses
> and using some simple ipc to pass details of the calls and results.  
> I can't
> tell from reading it what it is supposed to add over any of the other
> systems which do the same.
>
> Combined with the closed source 'no redistribution' license I can't  
> really
> see anyone using it.

I'd also like to see more details - even though I'd probably never  
use any Python module distributed in .pyc form only.

From the bit of information there is on the Web site, the  
distribution strategy looks quite similar to my own master-slave  
distribution model (based on Pyro) which is part of ScientificPython.  
There is an example at

http://dirac.cnrs-orleans.fr/hg/ScientificPython/main/?f=08361040f00a;file=Examples/master_slave_demo.py

and the code itself can be consulted at

http://dirac.cnrs-orleans.fr/hg/ScientificPython/main/?f=bce321680116;file=Scientific/DistributedComputing/MasterSlave.py


The main difference seems to be that my implementation doesn't start  
compute jobs itself; it leaves it to the user to start any number he  
wants by any means that works for his setup, but it allows a lot of  
flexibility. In particular, it can work with a variable number of  
slave jobs and even handles disappearing slave jobs gracefully.

Konrad.
--
-
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: [EMAIL PROTECTED]
-


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
robert <[EMAIL PROTECTED]> writes:
|> 
|> Thus there are different levels of parallelization:
|> 
|> 1 file/database based; multiple batch jobs
|> 2 Message Passing, IPC, RPC, ...
|> 3 Object Sharing 
|> 4 Sharing of global data space (Threads)
|> 5 Local parallelism / Vector computing, MMX, 3DNow,...
|> 
|> There are good reasons for all of these levels.

Well, yes, but to call them "levels" is misleading, as they are closer
to communication methods of a comparable level.

|> > This does not mean that MPI is inherently slower than threads however,
|> > as there is overhead associated with thread synchronization as well.
|> 
|> level 2 communication is slower. Just for selected apps it won't matter a lot.

That is false.  It used to be true, but that was a long time ago.  The
reasons why what seems to be a more heavyweight mechanism (message
passing) can be faster than an apparently lightweight one (data sharing)
are both subtle and complicated.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread robert
sturlamolden wrote:
> robert wrote:
> 
>> Thus communicated data is "serialized" - not directly used as with threads 
>> or with custom shared memory techniques like POSH object sharing.
> 
> Correct, and that is precisely why MPI code is a lot easier to write
> and debug than thread code. The OP used a similar technique in his
> 'parallel python' project.

Thus there are different levels of parallelization:

1 file/database based; multiple batch jobs
2 Message Passing, IPC, RPC, ...
3 Object Sharing 
4 Sharing of global data space (Threads)
5 Local parallelism / Vector computing, MMX, 3DNow,...

There are good reasons for all of these levels.
Yet "parallel python" to me fakes to be on level 3 or 4 (or even 5 :-) ), while 
its just a level 2 system, where "passing", "remote", "inter-process" ... are 
the right vocables.

With all this fakes popping up - a GIL free CPython is a major feature request 
for Py3K - a name at least promising to run 3rd millenium CPU's ...


> This does not mean that MPI is inherently slower than threads however,
> as there is overhead associated with thread synchronization as well.

level 2 communication is slower. Just for selected apps it won't matter a lot.

> With 'shared memory' between threads, a lot more fine-grained
> synchronization and scheduling is needed, which impairs performance and
> often introduces obscure bugs.

It's a question of chances and costs and the nature of the application.
Yet one can easily restrict inter-thread communication to be as simple and 
modular as IPC, or even simpler. Search e.g. "Python CallQueue" and 
"BackgroundCall" on Google.
Thread programming is less complicated than it seems. (It's just that 
Python's stdlib offers cumbersome 'non-functional' classes.)
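
(To illustrate - not the actual recipes named above, just a minimal sketch
of the pattern: a single background thread serving submitted calls from a
queue:)

import threading, Queue

class CallQueue:
    # runs submitted callables one after another in one worker thread
    def __init__(self):
        self._q = Queue.Queue()
        t = threading.Thread(target=self._loop)
        t.setDaemon(True)
        t.start()

    def _loop(self):
        while True:
            func, args, result = self._q.get()
            result.put(func(*args))

    def call(self, func, *args):
        result = Queue.Queue(1)
        self._q.put((func, args, result))
        return result.get()  # block until the worker has finished

cq = CallQueue()
print cq.call(sum, [1, 2, 3])  # prints 6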


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Sergei Organov
[EMAIL PROTECTED] (Nick Maclaren) writes:
[...]
> I mean precisely the first.
>
> The C99 standard uses a bizarre consistency model, which requires serial
> execution, and its consistency is defined in terms of only volatile
> objects and external I/O.  Any form of memory access, signalling or
> whatever is outside that, and is undefined behaviour.
>
> POSIX uses a different but equally bizarre one, based on some function
> calls being "thread-safe" and others forcing "consistency" (which is
> not actually defined, and there are many possible, incompatible,
> interpretations).  It leaves all language aspects (including allowed
> code movement) to C.
>
> There are no concepts in common between C's and POSIX's consistency
> specifications (even when they are precise enough to use), and so no
> way of mapping the two standards together.

Ah, now I see what you mean. Even though I only partly agree with what
you've said above, I'll stop arguing as it gets too off-topic for this
group.

Thank you for explanations.

-- Sergei.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread sturlamolden

robert wrote:

> Thus communicated data is "serialized" - not directly used as with threads or 
> with custom shared memory techniques like POSH object sharing.

Correct, and that is precisely why MPI code is a lot easier to write
and debug than thread code. The OP used a similar technique in his
'parallel python' project.

This does not mean that MPI is inherently slower than threads however,
as there is overhead associated with thread synchronization as well.
With 'shared memory' between threads, a lot more fine-grained
synchronization and scheduling is needed, which impairs performance and
often introduces obscure bugs.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
robert <[EMAIL PROTECTED]> writes:
|> 
|> Most threads on this planet are not used for number crunching jobs,
|> but for "organization of execution".

That is true, and it is effectively what POSIX and Microsoft threads
are suitable for.  With reservations, even there.

|> Things like MPI and IPC are just for the area of "small message, big job"
|> - typically scientific number crunching, where you collect the results "at
|> the end of the day". It's more a slow-network technique.

That is completely false.  Most dedicated HPC systems use MPI for high
levels of message passing over high-speed networks.

|> > They use it for the communication, but don't expose it to the
|> > programmer.  It is therefore easy to put the processes on different
|> > CPUs, and get the memory consistency right.
|> 
|> Thus communicated data is "serialized" - not directly used as with
|> threads or with custom shared memory techniques like POSH object
|> sharing.

It is not used as directly with threads as you might think.  Even
POSIX and Microsoft threads require synchronisation primitives, and
threading models like OpenMP and BSP have explicit control.

Also, MPI has asynchronous (non-blocking) communication.
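
For Python readers: mpi4py (mentioned elsewhere in this thread) exposes
exactly that. A minimal sketch, assuming mpi4py is installed and the script
is launched with something like "mpiexec -n 2 python script.py"; the payload
and the tag value are arbitrary illustrations:

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Non-blocking send: rank 0 may overlap computation with communication.
    req = comm.isend(list(range(1000)), dest=1, tag=7)
    # ... useful work could happen here while the message is in flight ...
    req.wait()                 # complete the send
elif rank == 1:
    req = comm.irecv(source=0, tag=7)
    # ... rank 1 can also compute before it actually needs the data ...
    data = req.wait()          # block only now, when the data is required
    print(len(data))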


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
Sergei Organov <[EMAIL PROTECTED]> writes:
|> 
|> OK, then I don't think the POSIX threads were "perpetrated" to be idle
|> most of time.

Perhaps I was being unclear.  I should have added "In the case where
there are more threads per system than CPUs per system".  The reasons
are extremely obscure and are to do with the scheduling, memory access
and communication.

I am in full agreement that the above effect was not INTENDED.

|> > That is why many POSIX threads programs work until the genuinely
|> > shared memory accesses become frequent enough that you get some to the
|> > same location in a single machine cycle.
|> 
|> Sorry, I don't understand. Are you saying that it's inherently
|> impossible to write an application that uses POSIX threads and that
|> doesn't have bugs accessing shared state? I thought that pthreads
|> mutexes guarantee sequential access to shared data. Or do you mean
|> something entirely different? Lock-free algorithms maybe?

I mean precisely the first.

The C99 standard uses a bizarre consistency model, which requires serial
execution, and its consistency is defined in terms of only volatile
objects and external I/O.  Any form of memory access, signalling or
whatever is outside that, and is undefined behaviour.

POSIX uses a different but equally bizarre one, based on some function
calls being "thread-safe" and others forcing "consistency" (which is
not actually defined, and there are many possible, incompatible,
interpretations).  It leaves all language aspects (including allowed
code movement) to C.

There are no concepts in common between C's and POSIX's consistency
specifications (even when they are precise enough to use), and so no
way of mapping the two standards together.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread Sergei Organov
[EMAIL PROTECTED] (Nick Maclaren) writes:

> In article <[EMAIL PROTECTED]>,
> Sergei Organov <[EMAIL PROTECTED]> writes:
> |> 
> |> Do you mean that POSIX threads are inherently designed and implemented
> |> to stay idle most of the time?! If so, I'm afraid those guys that
> |> designed POSIX threads won't agree with you. In particular, as far as I
> |> remember, David R. Butenhof said a few times in comp.programming.threads
> |> that POSIX threads were primarily designed to meet parallel programming
> |> needs on SMP, or at least that was how I understood him.
>
> I do mean that, and I know that they don't agree.  However, the word
> "designed" doesn't really make a lot of sense for POSIX threads - the
> one I tend to use is "perpetrated".

OK, then I don't think the POSIX threads were "perpetrated" to be idle
most of time.

> The people who put the specification together were either unaware of
> most of the experience of the previous 30 years, or chose to ignore it.
> In particular, in this context, the importance of being able to control
> the scheduling was well-known, as was the fact that it is NOT possible
> to mix processes with different scheduling models on the same set of
> CPUs.  POSIX's facilities are completely hopeless for that purpose, and
> most of the systems I have used effectively ignore them.

I won't argue that. On the other hand, POSIX threads' capabilities in the
field of I/O-bound and real-time threads are also limited, and that's
where the "threads that are idle most of the time" idiom comes from, I
think. What I argue is that POSIX threads weren't "perpetrated" to support
I/O-bound or real-time apps any more than to support parallel-calculation
apps. Besides, pthreads real-time extensions came later than pthreads
themselves.

What I do see is that Microsoft designed their system so that it's
almost impossible to implement an interactive application without using
threads, and that fact leads to the current situation where threads are
considered to be beasts that sleep most of the time.

> I could go on at great length, and the performance aspects are not even
> the worst aspect of POSIX threads.  The fact that there is no usable
> memory model, and the synchronisation depends on C to handle the
> low-level consistency, but there are no CONCEPTS in common between
> POSIX and C's memory consistency 'specifications' is perhaps the worst.

I won't argue that either. However, I don't see how does it make POSIX
threads to be "perpetrated" to be idle most of time.

> That is why many POSIX threads programs work until the genuinely
> shared memory accesses become frequent enough that you get some to the
> same location in a single machine cycle.

Sorry, I don't understand. Are you saying that it's inherently
impossible to write an application that uses POSIX threads and that
doesn't have bugs accessing shared state? I thought that pthreads
mutexes guarantee sequential access to shared data. Or do you mean
something entirely different? Lock-free algorithms maybe?
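
For the Python readers of the thread, the stdlib analogue of that mutex
discipline, as a minimal sketch - an illustration of the concept only, not a
claim about C99/POSIX semantics:

import threading

count = 0
lock = threading.Lock()

def bump(n):
    global count
    for _ in range(n):
        with lock:        # serialize every access to the shared state;
            count += 1    # without the lock this read-modify-write races

threads = [threading.Thread(target=bump, args=(100000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(count)              # reliably 400000 only because the lock is held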

-- Sergei.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread robert
Nick Maclaren wrote:
> In article <[EMAIL PROTECTED]>,
> Paul Rubin  writes:
> |>
> |> > Yes, I know that it is a bit Irish for the best way to use a shared
> |> > memory system to be to not share memory, but that's how it is.
> |> 
> |> But I thought serious MPI implementations use shared memory if they
> |> can.  That's the beauty of it, you can run your application on SMP
> |> processors getting the benefit of shared memory, or split it across
> |> multiple machines using ethernet or infiniband or whatever, without
> |> having to change the app code.
> 
> They use it for the communication, but don't expose it to the
> programmer.  It is therefore easy to put the processes on different
> CPUs, and get the memory consistency right.
> 

Thus communicated data is "serialized" - not directly used as with threads or 
with custom shared memory techniques like POSH object sharing.


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-11 Thread robert
sturlamolden wrote:
> Nick Maclaren wrote:
> 
> I wonder if too much emphasis is put on thread programming these days.
> Threads may be nice for programming web servers and the like, but not
> for numerical computing. Reading books about thread programming, one
> can easily get the impression that it is 'the' way to parallelize
> numerical tasks on computers with multiple CPUs (or multiple CPU


Most threads on this planet are not used for number crunching jobs, but for 
"organization of execution".

Also, if one wants to exploit the speed of upcoming multi-core CPUs for all 
kinds of fine-grained programs, things need fast fine-grained communication - 
and most important: huge data trees in memory have to be shared effectively.
CPU frequencies will not grow much further, but we will see multi-cores/SMP. 
How do we exploit them as if we really had faster CPUs? Threads and 
thread-like techniques.

Things like MPI and IPC are just for the area of "small message, big job" - 
typically scientific number crunching, where you collect the results "at the 
end of the day". It's more a slow-network technique.

The most challenging example here is probably games - not to discuss gaming 
as such, but as a technical example to the point: would you do MPI, RPC, etc. 
while 30 fps 3D rendering and real-time physics simulation are going on?


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
"Carl J. Van Arsdall" <[EMAIL PROTECTED]> writes:
|> 
|> Just as something to note, but many HPC applications will use a 
|> combination of both MPI and threading (OpenMP usually; as for the 
|> underlying thread implementation I don't have much to say).  It's 
|> interesting to see on this message board this huge "anti-threading" 
|> mindset, but the HPC community seems to be happy using a little of both 
|> depending on their application and the topology of their parallel 
|> machine.  Although if I were doing HPC applications, I probably would not 
|> choose to use Python but would write things in C or FORTRAN. 

That is a commonly quoted myth.

Some of the ASCI community did that, but even they have backed off
to a great extent.  Such code is damn near impossible to debug, let
alone tune.  To the best of my knowledge, no non-ASCI application
has ever done that, except for virtuosity.  I have several times
asked claimants to name some examples of code that does that and is
used in the general research community, and have so far never had a
response.

I managed the second-largest HPC system in UK academia for a decade,
ending less than a year ago, incidentally, and was and am fairly well
in touch with what is going on in HPC world-wide.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread Carl J. Van Arsdall
Just as something to note, but many HPC applications will use a 
combination of both MPI and threading (OpenMP usually; as for the 
underlying thread implementation I don't have much to say).  It's 
interesting to see on this message board this huge "anti-threading" 
mindset, but the HPC community seems to be happy using a little of both 
depending on their application and the topology of their parallel 
machine.  Although if I were doing HPC applications, I probably would not 
choose to use Python but would write things in C or FORTRAN. 

What I liked about Python threads was that they were easy, whereas using 
processes and IPC is a real pain in the butt sometimes.  I don't 
necessarily think this module is the end-all solution to all of our 
problems, but I do think it's a good thing and I will toy with it 
some in my spare time.  I think that any effort toward making Python 
threading better is a good thing, and I'm happy to see the community 
attempt to make improvements.  It would also be cool if this were 
open-sourced, and I'm not quite sure why it's not.

-carl
 

-- 

Carl J. Van Arsdall
[EMAIL PROTECTED]
Build and Release
MontaVista Software

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
Sergei Organov <[EMAIL PROTECTED]> writes:
|> 
|> Do you mean that POSIX threads are inherently designed and implemented
|> to stay idle most of the time?! If so, I'm afraid those guys that
|> designed POSIX threads won't agree with you. In particular, as far as I
|> remember, David R. Butenhof said a few times in comp.programming.threads
|> that POSIX threads were primarily designed to meet parallel programming
|> needs on SMP, or at least that was how I understood him.

I do mean that, and I know that they don't agree.  However, the word
"designed" doesn't really make a lot of sense for POSIX threads - the
one I tend to use is "perpetrated".

The people who put the specification together were either unaware of
most of the experience of the previous 30 years, or chose to ignore it.
In particular, in this context, the importance of being able to control
the scheduling was well-known, as was the fact that it is NOT possible
to mix processes with different scheduling models on the same set of
CPUs.  POSIX's facilities are completely hopeless for that purpose, and
most of the systems I have used effectively ignore them.

I could go on at great length, and the performance aspects are not even
the worst aspect of POSIX threads.  The fact that there is no usable
memory model, and the synchronisation depends on C to handle the
low-level consistency, but there are no CONCEPTS in common between
POSIX and C's memory consistency 'specifications' is perhaps the worst.
That is why many POSIX threads programs work until the genuinely
shared memory accesses become frequent enough that you get some to the
same location in a single machine cycle.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread Sergei Organov
[EMAIL PROTECTED] (Nick Maclaren) writes:
> In article <[EMAIL PROTECTED]>,
> "sturlamolden" <[EMAIL PROTECTED]> writes:
[...]
> |> I wonder if too much emphasis is put on thread programming these days.
> |> Threads may be nice for programming web servers and the like, but not
> |> for numerical computing. Reading books about thread programming, one
> |> can easily get the impression that it is 'the' way to parallelize
> |> numerical tasks on computers with multiple CPUs (or multiple CPU
> |> cores). But if threads are inherently designed and implemented to stay
> |> idle most of the time, that is obviously not the case.
>
> You have to distinguish "lightweight processes" from "POSIX threads"
> from the generic concept.  It is POSIX and Microsoft threads that are
> inherently like that,

Do you mean that POSIX threads are inherently designed and implemented
to stay idle most of the time?! If so, I'm afraid those guys that
designed POSIX threads won't agree with you. In particular, as far as I
remember, David R. Butenhof said a few times in comp.programming.threads
that POSIX threads were primarily designed to meet parallel programming
needs on SMP, or at least that was how I understood him.

-- Sergei.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
Paul Rubin  writes:
|>
|> > Yes, I know that it is a bit Irish for the best way to use a shared
|> > memory system to be to not share memory, but that's how it is.
|> 
|> But I thought serious MPI implementations use shared memory if they
|> can.  That's the beauty of it, you can run your application on SMP
|> processors getting the benefit of shared memory, or split it across
|> multiple machines using ethernet or infiniband or whatever, without
|> having to change the app code.

They use it for the communication, but don't expose it to the
programmer.  It is therefore easy to put the processes on different
CPUs, and get the memory consistency right.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
"sturlamolden" <[EMAIL PROTECTED]> writes:
|> 
|> In any case, this means that Python can happily keep its GIL, as the
|> CPU bound 'HPC' tasks for which the GIL does matter should be done
|> using multiple processes (not threads) anyway. That leaves threads as a
|> tool for programming certain i/o tasks and maintaining 'responsive'
|> user interfaces, for which the GIL incidentally does not matter.

Yes.  That is the approach being taken at present by almost everyone.

|> I wonder if too much emphasis is put on thread programming these days.
|> Threads may be nice for programming web servers and the like, but not
|> for numerical computing. Reading books about thread programming, one
|> can easily get the impression that it is 'the' way to parallelize
|> numerical tasks on computers with multiple CPUs (or multiple CPU
|> cores). But if threads are inherently designed and implemented to stay
|> idle most of the time, that is obviously not the case.

You have to distinguish "lightweight processes" from "POSIX threads"
from the generic concept.  It is POSIX and Microsoft threads that are
inherently like that, and another kind of thread model might be very
different.  Don't expect to see one provided any time soon, even by
Linux.

OpenMP is the current leader for SMP parallelism, and it would be
murder to produce a Python binding that had any hope of delivering
useful performance.  I think that it could be done, but implementing
the result would be a massive task.  The Spruce Goose and Project
Habbakuk (sic) spring to my mind, by comparison[*] :-)

|> I like MPI. Although it is a huge API with lots of esoteric functions,
|> I only need to know a handful to cover my needs. Not to mention the
|> fact that I can use MPI with Fortran, which is frowned upon by computer
|> scientists but loved by scientists and engineers specialized in any
|> other field.

Yup.  MPI is also debuggable and tunable (with difficulty).  Debugging
and tuning OpenMP and POSIX threads are beyond anyone except the most
extreme experts; I am only on the borderline of being able to.

The ASCI bunch favour Co-array Fortran, and its model matches Python
like a steam turbine is a match for a heart transplant.


[*] They are worth looking up, if you don't know about them.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread Paul Rubin
[EMAIL PROTECTED] (Nick Maclaren) writes:
> Yes, I know that it is a bit Irish for the best way to use a shared
> memory system to be to not share memory, but that's how it is.

But I thought serious MPI implementations use shared memory if they
can.  That's the beauty of it, you can run your application on SMP
processors getting the benefit of shared memory, or split it across
multiple machines using ethernet or infiniband or whatever, without
having to change the app code.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread sturlamolden

Nick Maclaren wrote:

> as the ones that you have to play for threaded programs.  Yes, I know
> that it is a bit Irish for the best way to use a shared memory system
> to be to not share memory, but that's how it is.

Thank you for clearing that up.

In any case, this means that Python can happily keep its GIL, as the
CPU bound 'HPC' tasks for which the GIL does matter should be done
using multiple processes (not threads) anyway. That leaves threads as a
tool for programming certain i/o tasks and maintaining 'responsive'
user interfaces, for which the GIL incidentally does not matter.

I wonder if too much emphasis is put on thread programming these days.
Threads may be nice for programming web servers and the like, but not
for numerical computing. Reading books about thread programming, one
can easily get the impression that it is 'the' way to parallelize
numerical tasks on computers with multiple CPUs (or multiple CPU
cores). But if threads are inherently designed and implemented to stay
idle most of the time, that is obviously not the case.

I like MPI. Although it is a huge API with lots of esoteric functions,
I only need to know a handful to cover my needs. Not to mention the
fact that I can use MPI with Fortran, which is frowned upon by computer
scientists but loved by scientists and engineers specialized in any
other field.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread Nick Maclaren

In article <[EMAIL PROTECTED]>,
"sturlamolden" <[EMAIL PROTECTED]> writes:
|> 
|> MPI is becoming the de facto standard for high-performance parallel
|> computing, both on shared memory systems (SMPs) and clusters.

It has been for some time, and is still gaining ground.

|> Spawning
|> threads or processes is not the recommended way to do numerical parallel
|> computing.

Er, MPI works by getting SOMETHING to spawn processes, which then
communicate with each other.

|> Threading makes programming certain tasks more convenient
|> (particularly GUI and I/O, for which the GIL does not matter anyway),
|> but is not a good paradigm for dividing CPU-bound computations between
|> multiple processors. MPI is a high-level API based on the concept of
|> "message passing", which allows the programmer to focus on solving the
|> problem instead of on irrelevant distractions such as thread management
|> and synchronization.

Grrk.  That's not quite it.

The problem is that the current threading models (POSIX threads and
Microsoft's equivalent) were intended for running large numbers of
semi-independent, mostly idle, threads: Web servers and similar.
Everything about them, including their design (such as it is), their
interfaces and their implementations, are unsuitable for parallel HPC
applications.  One can argue whether that is insoluble, but let's not,
at least not here.

Now, Unix and Microsoft processes are little better but, because they
are more separate (and, especially, because they don't share memory)
are MUCH easier to run effectively on shared memory multi-CPU systems.
You still have to play administrator tricks, but they aren't as foul
as the ones that you have to play for threaded programs.  Yes, I know
that it is a bit Irish for the best way to use a shared memory system
to be to not share memory, but that's how it is.


Regards,
Nick Maclaren.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread sturlamolden

[EMAIL PROTECTED] wrote:

> That's right. ppsmp starts multiple interpreters in separate
> processes and organizes communication between them through IPC.

Thus you are basically reinventing MPI.


http://mpi4py.scipy.org/
http://en.wikipedia.org/wiki/Message_Passing_Interface

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread sturlamolden

robert wrote:

> That's true. IPC through sockets or (somewhat faster) shared memory - with
> cPickle at least - is usually the maximum of such approaches.
> See 
> http://groups.google.de/group/comp.lang.python/browse_frm/thread/f822ec289f30b26a
>
> For tasks really requiring threading one can consider IronPython.
> The most advanced technique I've seen for CPython is posh : 
> http://poshmodule.sourceforge.net/


In SciPy there is an MPI-binding project, mpi4py.

MPI is becoming the de facto standard for high-performance parallel
computing, both on shared-memory systems (SMPs) and clusters. Spawning
threads or processes is not the recommended way to do numerical parallel
computing. Threading makes programming certain tasks more convenient
(particularly GUI and I/O, for which the GIL does not matter anyway),
but is not a good paradigm for dividing CPU-bound computations between
multiple processors. MPI is a high-level API based on the concept of
"message passing", which allows the programmer to focus on solving the
problem instead of on irrelevant distractions such as thread management
and synchronization.

Although MPI has standard APIs for C and Fortran, it may be used with
any programming language. For Python, an additional advantage of using
MPI is that the GIL has no practical consequence for performance. The
GIL can lock a process but cannot prevent MPI from using multiple
processors, as MPI always uses multiple processes. For IPC, MPI will
use e.g. shared-memory segments on SMPs and TCP/IP on clusters, but all
these details are hidden.
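
To make the message-passing style concrete for Python, a minimal mpi4py
sketch - the chunking scheme and the sum-of-squares workload are arbitrary
illustrations, assuming mpi4py is installed and the script is started under
an MPI launcher (e.g. "mpiexec -n 4 python script.py"):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Rank 0 prepares one chunk of work per process; scatter hands each
# process (rank 0 included) its chunk, gather collects the results.
if rank == 0:
    chunks = [list(range(i * 10, (i + 1) * 10)) for i in range(size)]
else:
    chunks = None

chunk = comm.scatter(chunks, root=0)
partial = sum(x * x for x in chunk)        # the per-process workload
totals = comm.gather(partial, root=0)

if rank == 0:
    print(sum(totals))                     # sum of squares 0..size*10-1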

It seems like 'ppsmp' of parallelpython.com is just a reinvention of a
small portion of MPI.


http://mpi4py.scipy.org/
http://en.wikipedia.org/wiki/Message_Passing_Interface

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-10 Thread parallelpython
> I always thought that if you use multiple processes (e.g. os.fork) then
> Python can take advantage of multiple processors. I think the GIL locks
> one processor only. The problem is that one interpreter can run on
> one processor only. Am I not right? Does your ppm module run the same
> interpreter on multiple processors? That would be very interesting, and
> something new.
>
>
> Or does it start multiple interpreters? Another way to do this is to
> start multiple processes and let them communicate through IPC or a local
> network.

   That's right. ppsmp starts multiple interpreters in separate
processes and organizes communication between them through IPC.

   Originally ppsmp was designed to speed up an existing application
which is written in pure Python but is quite computationally expensive
(other ways to optimize it were used too). It was also required that
the application run out of the box on the most standard Linux
distributions (they all contain CPython).

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-08 Thread Parallel Python The team

On 1/8/07, Laszlo Nagy <[EMAIL PROTECTED]> wrote:


I always thought that if you use multiple processes (e.g. os.fork) then
Python can take advantage of multiple processors. I think the GIL locks
one processor only. The problem is that one interpreter can run on
one processor only. Am I not right? Does your ppm module run the same
interpreter on multiple processors? That would be very interesting, and
something new.


Or does it start multiple interpreters? Another way to do this is to
start multiple processes and let them communicate through IPC or a local
network.


  Laszlo

You are right. ppsmp starts multiple interpreters in separate processes and
organizes communication between them through IPC.

So far ppsmp features load balancing (it distributes the workload evenly
between worker processes) and low overhead (the example at
http://www.parallelpython.com/content/view/17/31/#REVERSE_MD5 submits 100
jobs to the system with no noticeable overhead).
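
For list readers who have not looked at the site: usage looks roughly like
the sum_primes example published at parallelpython.com. The sketch below is
modeled on those examples, so treat the exact signatures as approximate
rather than authoritative:

import pp

def isprime(k):
    """Naive primality test - a toy CPU-bound workload."""
    if k < 2:
        return False
    return all(k % d for d in range(2, int(k ** 0.5) + 1))

def sum_primes(n):
    """Sum of all primes below n."""
    return sum(k for k in range(2, n) if isprime(k))

job_server = pp.Server()                 # autodetects the number of CPUs
inputs = (100000, 100100, 100200)
jobs = [job_server.submit(sum_primes, (n,), depfuncs=(isprime,))
        for n in inputs]
for n, job in zip(inputs, jobs):
    print("sum_primes(%d) = %d" % (n, job()))  # job() blocks for the result
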
Of course there is always room for growth, and I am considering adding new
features/functionality.
Do you have any functionality in mind which you want to see in this system?

Best regards,
Vitalii
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Parallel Python

2007-01-08 Thread robert
Duncan Booth wrote:
> Laszlo Nagy <[EMAIL PROTECTED]> wrote:
> 
> 
> The 'parallel python' site seems very sparse on the details of how it is 
> implemented but it looks like all it is doing is spawning some subprocesses 
> and using some simple ipc to pass details of the calls and results. I can't 
> tell from reading it what it is supposed to add over any of the other 
> systems which do the same.
> 
> Combined with the closed source 'no redistribution' license I can't really 
> see anyone using it.


That's true. IPC through sockets or (somewhat faster) shared memory - with
cPickle at least - is usually the maximum of such approaches.
See 
http://groups.google.de/group/comp.lang.python/browse_frm/thread/f822ec289f30b26a

For tasks really requiring threading one can consider IronPython.
The most advanced technique I've seen for CPython is posh : 
http://poshmodule.sourceforge.net/ 

I'd say Py3K should just do the locking job for dicts / collections, obmalloc 
and refcount (or drop the refcount mechanism) and do the other minor things in 
order to enable free threading. Or at least enable careful sharing of 
Py-Objects between multiple separate interpreter instances of one process.
.NET and Java have shown that the speed costs of this technique are not so 
extreme. I guess less than 10%. 
And Python is a VHLL with less focus on speed anyway.
Also see discussions in 
http://groups.google.de/group/comp.lang.python/browse_frm/thread/f822ec289f30b26a
 .


Robert
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-08 Thread Duncan Booth
Laszlo Nagy <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote:
>> Has anybody tried to run parallel python applications?
>> It appears that if your application is computation-bound using 'thread'
>> or 'threading' modules will not get you any speedup. That is because
>> the Python interpreter uses the GIL (Global Interpreter Lock) for internal
>> bookkeeping. The latter allows only one Python byte-code instruction to
>> be executed at a time even if you have a multiprocessor computer.
>> To overcome this limitation, I've created ppsmp module:
>> http://www.parallelpython.com
>> It provides an easy way to run parallel python applications on smp
>> computers.
>> I would appreciate any comments/suggestions regarding it.
>>   
> I always thought that if you use multiple processes (e.g. os.fork) then 
> Python can take advantage of multiple processors. I think the GIL locks 
> one processor only. The problem is that one interpreter can run on 
> one processor only. Am I not right? Does your ppm module run the same 
> interpreter on multiple processors? That would be very interesting, and 
> something new.
> 
The GIL locks all processors, but just for one process. So, yes, if you 
spawn off multiple processes then Python will take advantage of this. For 
example we run Zope on a couple of dual processor dual core systems, so we 
use squid and pound to ensure that the requests are spread across 4 
instances of Zope on each machine. That way we do get a fairly even cpu 
usage.

For some applications it is much harder to split the tasks across separate 
processes rather than just separate threads, but there is a benefit once 
you've done it since you can then distribute the processing across cpus on 
separate machines.

The 'parallel python' site seems very sparse on the details of how it is 
implemented but it looks like all it is doing is spawning some subprocesses 
and using some simple ipc to pass details of the calls and results. I can't 
tell from reading it what it is supposed to add over any of the other 
systems which do the same.

Combined with the closed source 'no redistribution' license I can't really 
see anyone using it.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python

2007-01-08 Thread Laszlo Nagy
[EMAIL PROTECTED] wrote:
> Has anybody tried to run parallel python applications?
> It appears that if your application is computation-bound using 'thread'
> or 'threading' modules will not get you any speedup. That is because
> the Python interpreter uses the GIL (Global Interpreter Lock) for internal
> bookkeeping. The latter allows only one Python byte-code instruction to
> be executed at a time even if you have a multiprocessor computer.
> To overcome this limitation, I've created ppsmp module:
> http://www.parallelpython.com
> It provides an easy way to run parallel python applications on smp
> computers.
> I would appreciate any comments/suggestions regarding it.
>   
I always thought that if you use multiple processes (e.g. os.fork) then 
Python can take advantage of multiple processors. I think the GIL locks 
one processor only. The problem is that one interpreter can run on 
one processor only. Am I not right? Does your ppm module run the same 
interpreter on multiple processors? That would be very interesting, and 
something new.


Or does it start multiple interpreters? Another way to do this is to 
start multiple processes and let them communicate through IPC or a local 
network.
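
A bare-bones sketch of that second approach - fork a worker interpreter and
move the result back as a pickle over a pipe (POSIX-only; the helper name
in_worker is made up for this illustration):

import os
import pickle

def in_worker(func, args):
    """Run func(*args) in a forked child; return the unpickled result."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                          # child: compute, pickle, exit
        os.close(r)
        with os.fdopen(w, 'wb') as out:
            pickle.dump(func(*args), out)
        os._exit(0)
    os.close(w)                           # parent: read the child's result
    with os.fdopen(r, 'rb') as inp:
        result = pickle.load(inp)
    os.waitpid(pid, 0)
    return result

# Each call runs in a fresh interpreter process, so the parent's GIL
# does not serialize the work; several children could run concurrently.
print(in_worker(sum, (range(10**6),)))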


  Laszlo

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python on PowerMac?

2004-11-29 Thread Robert Kern
Alan Kennedy wrote:
Although, iff your prospective machine supports System V IPC, you might 
want to check out PoSH.

http://poshmodule.sourceforge.net
It uses inline assembly, so that's a no-go on the PPC unless someone 
ports the assembly code.

--
Robert Kern
[EMAIL PROTECTED]
"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter
--
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python on PowerMac?

2004-11-29 Thread Wolfgang Keller
Hello,

and thanks for your reply.

> But I would venture to say that, *in the general case*, the "most 
> efficient" way to benefit from a second cpu, both in terms of coding 
> time and execution efficiency, is to use either jython

*cough* *choke*

Err, no, sorry, not for me.

> Although, iff your prospective machine supports System V IPC,

No clue whether MacOS X does so. AFAIK it's basically FreeBSD based on
Mach (from CMU) with a proprietary layer above.

> you might want to check out PoSH.

Where's the binary installer for MacOS X?

Not having to use compilers and linkers and makefiles and the like was one
of the major reasons which made Python interesting for me...

Best regards,

Wolfgang Keller
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parallel Python on PowerMac?

2004-11-29 Thread Alan Kennedy
[Wolfgang Keller]
> as I might get a dual-G5 PowerMac someday in the not to distant
> future, I was wondering what options are available for making Python
> benefit from the second CPU? Running two interpreters and using Pyro
> would not be the most efficient (and easiest) way, I guess?
Qualifier: obviously efficiency is relative to the application.
But I would venture to say that, *in the general case*, the "most 
efficient" way to benefit from a second cpu, both in terms of coding 
time and execution efficiency, is to use either jython on a suitable jvm 
or ironpython on mono (when it catches up with the .net CLR in efficiency).

I say "most efficient in execution efficiency" because all of the 
de/serialization involved with communicating between two independent 
cpython interpreters, using something like pyro, would outweigh whatever 
performance advantage cpython might have over jython or ironpython. This 
becomes more pronounced as you add more and more processors into the 
picture.

I say "most efficient in coding time" because cpython would require you 
to specially write code for inter-interpreter communications, and 
possibly restructure your application accordingly, whereas jython and 
ironpython won't: the same interpreter can have threads on multiple 
processors, all executing simultaneously.
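
As a concrete illustration, the thread code below is the same on CPython, 
Jython and IronPython, but only on the latter two can the two workers truly 
execute simultaneously on two processors (a sketch; the workload is an 
arbitrary CPU-bound loop):

import threading

def crunch(n, results, slot):
    """An arbitrary CPU-bound loop standing in for real work."""
    total = 0
    for k in range(n):
        total += k * k
    results[slot] = total

results = [0, 0]
workers = [threading.Thread(target=crunch, args=(5 * 10**6, results, i))
           for i in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()                 # on Jython/IronPython the two loops overlap;
print(sum(results))          # on CPython the GIL interleaves them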

Although, iff your prospective machine supports System V IPC, you might 
want to check out PoSH.

http://poshmodule.sourceforge.net
running-to-find-my-flame-retardant-suit-ly'yrs
--
alan kennedy
--
email alan:  http://xhaus.com/contact/alan
--
http://mail.python.org/mailman/listinfo/python-list