That's correct, everything else is single threaded.

Peace,
Matt.
From: "Timothy B. Brown" <[email protected]>
Reply-To: "Brown, Tim" <[email protected]>
Date: Tuesday, June 14, 2016 at 10:44 AM
To: Gengyan Zhao <[email protected]>, Timothy Coalson <[email protected]>
Cc: Matt Glasser <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly

Hi Gengyan,

As far as I know, there is only one binary run as part of the Diffusion Preprocessing Pipeline that is parallelized: the eddy binary. The latest versions of that binary are written to do their parallel processing on a GPU.

We have broken the Diffusion Preprocessing Pipeline out into three separate phases (Pre Eddy, Eddy, and Post Eddy), with a separate script for each phase, so that 3 sequential jobs can be scheduled in such a way that the 2nd job, the Eddy job, runs on a processing node that has a GPU available. (The Eddy job is scheduled to run only upon successful completion of the Pre Eddy job, and the Post Eddy job is scheduled to run only upon successful completion of the Eddy job.) This significantly speeds up the Eddy phase of the processing. However, the rest of the processing (both before and after the Eddy phase) is, as far as I know, single threaded.

The other significant parallelization we get is by doing just as Tim Coalson described and scheduling the runs of many subjects simultaneously.

Tim
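For anyone wiring this up on an SGE cluster, here is a minimal sketch of that three-phase chaining, assuming the per-phase scripts shipped in the HCP Pipelines DiffusionPreprocessing directory. The queue names, the GPU resource request, and the omitted per-phase arguments are all site-specific placeholders, and plain -hold_jid only waits for the earlier job to finish (enforcing "successful completion" may need extra checking at your site), so treat this as an illustration of job chaining rather than an official submission recipe.

# Chain Pre Eddy -> Eddy (on a GPU node) -> Post Eddy with SGE job holds.
# Queue names ("long.q", "gpu.q") and the GPU request ("-l gpus=1") are
# placeholders; phase-specific arguments are omitted for brevity.
pre_id=$(qsub -terse -q long.q \
    "${HCPPIPEDIR}/DiffusionPreprocessing/DiffPreprocPipeline_PreEddy.sh")

# Eddy is held until Pre Eddy finishes and runs on a node with a GPU.
eddy_id=$(qsub -terse -q gpu.q -l gpus=1 -hold_jid "${pre_id}" \
    "${HCPPIPEDIR}/DiffusionPreprocessing/DiffPreprocPipeline_Eddy.sh")

# Post Eddy is held until Eddy finishes.
qsub -q long.q -hold_jid "${eddy_id}" \
    "${HCPPIPEDIR}/DiffusionPreprocessing/DiffPreprocPipeline_PostEddy.sh"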
On Tue, Jun 14, 2016, at 10:24, Gengyan Zhao wrote:

Hi Tim,

Thank you. However, the code I ran was "DiffusionPreprocessingBatch.sh", which should be able to use multiple cores for parallel computation within one subject, right? Because there is one line in the code saying:

#Assume that submission nodes have OPENMP enabled (needed for eddy - at least 8 cores suggested for HCP data)

Thanks,
Gengyan

________________________________
From: Timothy Coalson <[email protected]>
Sent: Monday, June 13, 2016 9:13:54 PM
To: Gengyan Zhao
Cc: Glasser, Matthew; [email protected]; [email protected]
Subject: Re: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly

Many of the executables and operations in the pipelines do not use parallelization across cores within a single subject. You may be better off setting up a queue with fewer cores available per slot and submitting many subjects at once to the queue. Make sure you set the queue name correctly in each "Batch" script.

Note that some pipelines take much more memory than others, so some may allow you to run 32 subjects in parallel, while others may only be able to handle 2 or 3 in parallel. Someone else will have to comment on the peak memory requirements per subject in each of the pipelines.

Tim

On Mon, Jun 13, 2016 at 8:53 PM, Gengyan Zhao <[email protected]> wrote:

Hi Matt, Tim Coalson and Tim Brown,

Thank you very much for all your answers, and I'm sorry for my delayed response.

I ran "DiffusionPreprocessingBatch.sh" to process the example data for subject 100307 on a 32-core Ubuntu 14.04 server. The OMP_NUM_THREADS variable was not set, and the task was submitted through an SGE queue configured with 16 cores and 16 slots. However, when I used the "top" command to monitor CPU usage, only one core was occupied.

Is there anything wrong? What should I do to get multiple cores (OpenMP) involved?

Thank you very much.

Best,
Gengyan

________________________________
From: Timothy B. Brown <[email protected]>
Sent: Monday, May 23, 2016 3:45 PM
To: Glasser, Matthew; Gengyan Zhao; [email protected]
Subject: Re: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly

Gengyan,

My understanding is as follows. (Any OpenMP expert who sees holes in my understanding should feel free to correct me...please.)

If the compiled program/binary in use (e.g. eddy or wb_command) has been compiled using the correct OpenMP-related switches, then by default that program will use multi-threading in the places where multi-threading was called for in the source code. It will use at most as many threads as there are processing cores on the system on which the program is running. So, if the machine you are using has 8 cores, then a properly compiled OpenMP program will use up to 8 threads (parallel executions). But this assumes that the code has been written with that many potential threads of independent execution and compiled and linked with the correct OpenMP switches and OpenMP libraries. For programs like eddy and wb_command, this proper compiling and linking to use OpenMP should already have been done for you.

The only other thing that I know of that can limit the number of threads (besides the actual source code) is the setting of the environment variable OMP_NUM_THREADS. If this variable is set to a numeric value (e.g. 4), then the number of threads is limited to that maximum, regardless of how many threads the code is written to support.

In reality, I believe the default behavior when OMP_NUM_THREADS is not set (running as many threads as there are available cores) depends on the compiler used. But the GNU Compiler Collection and (I believe) the Intel compiler family have this behavior. The Visual Studio 2015 compilers have a similar behavior. So, if the machine you are running on has multiple cores and OMP_NUM_THREADS is not set, the code should automatically be using multi-threading for you.

There is another caveat here. If you are submitting jobs to a cluster with a job scheduler (like a Sun Grid Engine or a PBS scheduler), you should be careful to request multiple cores for your job. If the "hardware" you request for running your job is 1 node with 1 processor (i.e. core), then even if the actual machine has multiple cores, only 1 of those cores will be allocated to your job. So the running environment will "look like" a single-core processor. This would mean that only 1 thread at a time could run.

As an example for PBS, if you were to specify the following in the PBS header for your job:

#PBS -l nodes=1:ppn=1

then you would only get single threading, because Processors Per Node (ppn) is specified as 1. Whereas specifying

#PBS -l nodes=1:ppn=4

would allow up to 4 threads to run simultaneously.

I'm not as familiar with specifying the number of cores for an SGE cluster job, but I think the -pe smp <num_slots> option for the qsub command is how the number of cores is specified.

Tim

On Mon, May 23, 2016, at 14:15, Glasser, Matthew wrote:

You could try running a wb_command operation, like smoothing, on a random dense timeseries dataset. It should use multiple cores if everything is working correctly with that.

Peace,
Matt.
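As a concrete version of Matt's suggestion, something along these lines can serve as a quick multi-threading sanity check: smooth an existing dense timeseries and watch per-core load with top (press 1 for the per-core view) in another terminal. The file names are placeholders and the argument order is written from memory, so confirm it against the help text of wb_command -cifti-smoothing before relying on it.

# Hypothetical check: smooth a dense timeseries (kernel sizes are sigma in mm)
# while monitoring core usage with "top" in a second terminal.
wb_command -cifti-smoothing \
    subject.dtseries.nii 4 4 COLUMN smoothed.dtseries.nii \
    -left-surface subject.L.midthickness.32k_fs_LR.surf.gii \
    -right-surface subject.R.midthickness.32k_fs_LR.surf.gii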
From: <[email protected]<mailto:[email protected]>> on behalf of Gengyan Zhao <[email protected]<mailto:[email protected]>> Date: Monday, May 23, 2016 at 10:57 AM To: "[email protected]<mailto:[email protected]>" <[email protected]<mailto:[email protected]>> Subject: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly Hello HCP Masters, I have a question about the "DiffusionPreprocessingBatch.sh". I want to run it in a multi-thread manner and involve as many cores as possible. There is a line in the script saying: #Assume that submission nodes have OPENMP enabled (needed for eddy - at least 8 cores suggested for HCP data) What shall I do to enable OPENMP? or OPENMP is ready to go? My current state is that the pipeline has just been run with SGE on a Ubuntu 14.04 machine having 32 cores. Thank you very much. Best, Gengyan Research Assistant Medical Physics, UW-Madison _______________________________________________ HCP-Users mailing list [email protected]<mailto:[email protected]> http://lists.humanconnectome.org/mailman/listinfo/hcp-users ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. _______________________________________________ HCP-Users mailing list [email protected]<mailto:[email protected]> http://lists.humanconnectome.org/mailman/listinfo/hcp-users ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. -- Timothy B. Brown Business & Technology Application Analyst III Pipeline Developer (Human Connectome Project) tbbrown(at)wustl.edu<http://wustl.edu> ________________________________________ The material in this message is private and may contain Protected Healthcare Information (PHI). If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. -- Timothy B. Brown Business & Technology Application Analyst III Pipeline Developer (Human Connectome Project) tbbrown(at)wustl.edu ________________________________________ The material in this message is private and may contain Protected Healthcare Information (PHI). If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. 
_______________________________________________
HCP-Users mailing list
[email protected]
http://lists.humanconnectome.org/mailman/listinfo/hcp-users
