Thanks for the advice.
I am in no position to decide what software the cluster runs, though; I
have to work with what is already there. Still, I can suggest other
possibilities.

2009/3/4, Vincent Schut <[email protected]>:
> John Barham wrote:
>
> > On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera <[email protected]> wrote:
> >
> >
> > > I have to launch many tasks running in parallel (~5000) in a
> > > cluster running linux. Each of the task performs some astronomical
> > > calculations and I am not pretty sure if using fork is the best answer
> > > here.
> > > First of all, all the programming is done in python and c...
> > >
> >
> > Take a look at the multiprocessing package
> > (http://docs.python.org/library/multiprocessing.html), newly
> > introduced with Python 2.6 and 3.0:
> >
> > "multiprocessing is a package that supports spawning processes using
> > an API similar to the threading module. The multiprocessing package
> > offers both local and remote concurrency, effectively side-stepping
> > the Global Interpreter Lock by using subprocesses instead of threads."
> >
> > It should be a quick and easy way to set up a cluster-wide job
> > processing system (provided all your jobs are driven by Python).
> >
>
>  Better: use parallelpython (www.parallelpython.org). AFAIK multiprocessing
> is geared towards multi-core systems (one machine), while pp is also
> suitable for real clusters with multiple PCs. No special cluster software
> is needed. It will start (here's your fork) one or more Python interpreters
> on each node, and then you can submit jobs to those 'workers'. The
> interpreters are kept alive between jobs, so the startup penalty becomes
> negligible when the number of jobs is large enough.
>  We use it here to process massive amounts of satellite data; it works
> like a charm.
>
>  Vincent.
>
>
> >
> > It also looks like it's been (partially?) back-ported to Python 2.4
> > and 2.5: http://pypi.python.org/pypi/processing.
> >
> >  John
> >


-- 
Hugo
