Hi

If it's of any use: I have commonly used C/C++ and Java threaded code to create
parallel processes.  Python threading is hamstrung by the global interpreter
lock and best avoided; use sub-processes instead, as these are forked.
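
As a minimal sketch of the sub-process route (assuming only the
standard-library multiprocessing module; cpu_bound and its inputs are made up
for illustration):

    from multiprocessing import Pool

    def cpu_bound(n):
        # Stand-in for a real CPU-intensive task: sum of squares up to n.
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        # Each worker is a separate (forked) process, so the GIL
        # does not serialise the computation.
        with Pool(processes=4) as pool:
            results = pool.map(cpu_bound, [10**6] * 8)
        print(results)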

1) CPU-intensive code with no dependencies scales close to N, where N < cores
available, for everything I have tried, even up to 64 threads on a 128-core
machine (but see 2).
2) Be careful with IO, as you can exceed the buffer space of the machine:
each thread will assign its own buffer space, and Linux does not recover from
this (the machine crashes).
3) This is, of course, coarse-grain parallel processing; fine-grain processing
takes a little more work, as you have to be careful with semaphores and yields
(see the sketch after this list).
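
For the fine-grain case, here is a minimal Python sketch (all names are
illustrative) in which workers make many small updates to a shared counter
guarded by a semaphore; drop the acquire/release and the read-modify-write
races:

    from multiprocessing import Process, Semaphore, Value

    def worker(sem, counter):
        # Hypothetical fine-grain task: many small updates to shared state.
        for _ in range(10000):
            with sem:                  # critical section: read-modify-write
                counter.value += 1

    if __name__ == "__main__":
        sem = Semaphore(1)             # binary semaphore used as a mutex
        counter = Value("i", 0)        # shared integer in shared memory
        procs = [Process(target=worker, args=(sem, counter)) for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(counter.value)           # 40000 reliably; racy without the semaphore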

If your entire process is amenable to slicing up the data, this might be much
easier to implement: just run your program on multiple machines, using redis to
allow processes on each machine to fetch work.
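
A minimal sketch of that pattern, assuming the redis-py client, a shared redis
host, and a made-up queue name "work" (the producer slices the data; workers on
any machine block until an item is available):

    import redis

    r = redis.Redis(host="redis-host", port=6379)   # hypothetical shared server

    def process_chunk(chunk_id):
        # Stand-in for the real per-slice computation.
        print("processing slice", chunk_id)

    # Producer (run once): push one item per data slice.
    for chunk_id in range(100):
        r.rpush("work", chunk_id)

    # Worker loop (run on each machine): BLPOP is atomic, so no two
    # workers ever fetch the same item.
    while True:
        _key, item = r.blpop("work")
        process_chunk(int(item))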

Regards
Tom


On 20/12/13 16:46, Adam Ralph wrote:

I agree with George and Marcin that getting significant speedups is a
non-trivial business. Unfortunately, as CPU clock speeds drop to
become more energy efficient, the only way to keep performance up
is to work in parallel. Accelerator technology and Hadoop make
many-core environments more accessible.

OpenMP and OpenACC directives provide a relatively simple way
of exploiting modest gains. Compilers are already performing unseen
optimizations on your code and in the future they may use any
accelerators you have without any changes.

I have not benchmarked any CCP4 progs, so any gains may be
disappointing. All I know is that it is possible.

Adam
