Re: best parallelisation strategy on python

raymond . nusbaum Sat, 21 Apr 2018 13:08:33 -0700

On Wednesday, April 18, 2018 at 7:16:19 PM UTC-6, simona bellavista wrote:
> I have a Fortran 90 code that is parallelised with MPI. I would like to 
> translate it to Python, but I am not sure about the parallelisation strategy 
> and libraries. I work on clusters where each node has 5GB of memory and 12 
> or 24 processors (depending on the cluster I am using). Ideally I would like 
> to split the computation across several nodes.
> 
> Let me explain what this code does: it reads ~100GB of data, divided into 
> HDF5 files of ~25GB each. The code should read the data, go through it, 
> select a fraction of it (~1GB), do some CPU-intensive work on that 
> fraction, and repeat this process many times, say 1000 times, then write 
> the results to a single final file.
> 
> I was thinking that the CPU intensive part would be written as a shared 
> object in C.
> 
> Do you have suggestions about which library to use?


I would suggest the Python multiprocessing package. In CPython, the global 
interpreter lock (GIL) allows only one thread at a time to execute Python 
bytecode, so threads do not give you parallelism for CPU-bound work; you have 
to use separate processes instead. The multiprocessing package supports this 
computing model.
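
As a rough illustration, here is a minimal sketch of how the repeated
select-and-compute step might be spread over worker processes with
multiprocessing.Pool. The function names, the fake data and the iteration and
process counts are all made up for the example: select_subset stands in for
reading the HDF5 files, and the sum of square roots stands in for the
CPU-intensive part that you plan to write in C.

```python
import math
from multiprocessing import Pool


def select_subset(iteration, size=1000):
    """Hypothetical stand-in for reading the HDF5 files and selecting ~1GB."""
    # Deterministic fake data so the sketch runs without any input files.
    return [(iteration + 1) * k for k in range(size)]


def process_iteration(iteration):
    """One independent unit of work: select data, then do the CPU-heavy step."""
    subset = select_subset(iteration)
    # Placeholder for the CPU-intensive part (which could call into C).
    return iteration, math.fsum(math.sqrt(x) for x in subset)


def run_all(n_iterations, processes):
    """Spread the iterations over worker processes, one result per iteration."""
    with Pool(processes=processes) as pool:
        # imap_unordered yields each result as soon as a worker finishes it.
        return dict(pool.imap_unordered(process_iteration, range(n_iterations)))


if __name__ == "__main__":
    results = run_all(n_iterations=8, processes=4)
    # Only the parent process writes the single final output.
    for iteration in sorted(results):
        print(iteration, results[iteration])
```

Two caveats. A Pool only uses the cores of one machine, so by itself it does
not split the work across several nodes; for that you would still need
something MPI-like (mpi4py, for instance) or a batch scheduler that launches
one such script per node. Also, with 5GB per node and ~1GB selected per
iteration, you probably cannot run one worker per core, so the number of
processes would have to be chosen with memory in mind.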
