That's correct, everything else is single threaded.

Peace,
Matt.
From: "Timothy B. Brown" <[email protected]>
Reply-To: "Brown, Tim" <[email protected]>
Date: Tuesday, June 14, 2016 at 10:44 AM
To: Gengyan Zhao <[email protected]>, Timothy Coalson <[email protected]>
Cc: Matt Glasser <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly

Hi Gengyan,

As far as I know, there is only one binary run as part of the Diffusion Preprocessing Pipeline that is parallelized: the eddy binary. The latest versions of that binary are written to do their parallel processing on a GPU.

We have broken the Diffusion Preprocessing Pipeline out into three separate phases (Pre Eddy, Eddy, and Post Eddy), with a separate script for each phase, so that 3 sequential jobs can be scheduled in such a way that the 2nd job, the Eddy job, runs on a processing node that has a GPU available. (The Eddy job is scheduled to run only upon successful completion of the Pre Eddy job, and the Post Eddy job is scheduled to run only upon successful completion of the Eddy job.) This significantly speeds up the Eddy phase of the processing. However, the rest of the processing (both before and after the Eddy phase) is, as far as I know, single threaded.

The other significant parallelization we get is by doing just as Tim Coalson described and scheduling the runs of many subjects simultaneously.

Tim
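For anyone wiring this up on an SGE cluster, here is a minimal sketch of that three-phase chaining, assuming the per-phase scripts shipped in the HCP Pipelines DiffusionPreprocessing directory. The queue names, the GPU resource request, and the omitted per-phase arguments are all site-specific placeholders, and plain -hold_jid only waits for the earlier job to finish (enforcing "successful completion" may need extra checking at your site), so treat this as an illustration of job chaining rather than an official submission recipe.

# Chain Pre Eddy -> Eddy (on a GPU node) -> Post Eddy with SGE job holds.
# Queue names ("long.q", "gpu.q") and the GPU request ("-l gpus=1") are
# placeholders; phase-specific arguments are omitted for brevity.
pre_id=$(qsub -terse -q long.q \
    "${HCPPIPEDIR}/DiffusionPreprocessing/DiffPreprocPipeline_PreEddy.sh")

# Eddy is held until Pre Eddy finishes and runs on a node with a GPU.
eddy_id=$(qsub -terse -q gpu.q -l gpus=1 -hold_jid "${pre_id}" \
    "${HCPPIPEDIR}/DiffusionPreprocessing/DiffPreprocPipeline_Eddy.sh")

# Post Eddy is held until Eddy finishes.
qsub -q long.q -hold_jid "${eddy_id}" \
    "${HCPPIPEDIR}/DiffusionPreprocessing/DiffPreprocPipeline_PostEddy.sh"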
On Tue, Jun 14, 2016, at 10:24, Gengyan Zhao wrote:

Hi Tim,

Thank you. However, the code I ran was "DiffusionPreprocessingBatch.sh", which should be able to use multiple cores for parallel computation within one subject, right? Because there is one line in the code saying:

#Assume that submission nodes have OPENMP enabled (needed for eddy - at least 8 cores suggested for HCP data)

Thanks,
Gengyan

________________________________
From: Timothy Coalson <[email protected]>
Sent: Monday, June 13, 2016 9:13:54 PM
To: Gengyan Zhao
Cc: Glasser, Matthew; [email protected]; [email protected]
Subject: Re: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly

Many of the executables and operations in the pipelines do not use parallelization across cores within a single subject. You may be better off setting up a queue with fewer cores available per slot and submitting many subjects at once to the queue. Make sure you set the queue name correctly in each "Batch" script.

Note that some pipelines take much more memory than others, so some may allow you to run 32 subjects in parallel, while others may only be able to handle 2 or 3 in parallel. Someone else will have to comment on the peak memory requirements per subject in each of the pipelines.

Tim

On Mon, Jun 13, 2016 at 8:53 PM, Gengyan Zhao <[email protected]> wrote:

Hi Matt, Tim Coalson and Tim Brown,

Thank you very much for all your answers, and I'm sorry for my delayed response.

I ran "DiffusionPreprocessingBatch.sh" to process the example data for subject 100307 on a 32-core Ubuntu 14.04 server. The OMP_NUM_THREADS variable was not set, and the task was submitted through an SGE queue configured with 16 cores and 16 slots. However, when I used the "top" command to monitor CPU usage, only one core was occupied.

Is there anything wrong? What should I do to get multiple cores (OpenMP) involved?

Thank you very much.

Best,
Gengyan

________________________________
From: Timothy B. Brown <[email protected]>
Sent: Monday, May 23, 2016 3:45 PM
To: Glasser, Matthew; Gengyan Zhao; [email protected]
Subject: Re: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly

Gengyan,

My understanding is as follows. (Any OpenMP expert who sees holes in my understanding should feel free to correct me...please.)

If the compiled program/binary in use (e.g. eddy or wb_command) has been compiled using the correct OpenMP-related switches, then by default that program will use multi-threading in the places where multi-threading was called for in the source code. It will use at most as many threads as there are processing cores on the system on which the program is running. So, if the machine you are using has 8 cores, then a properly compiled OpenMP program will use up to 8 threads (parallel executions). But this assumes that the code has been written with that many potential threads of independent execution and compiled and linked with the correct OpenMP switches and OpenMP libraries. For programs like eddy and wb_command, this proper compiling and linking to use OpenMP should already have been done for you.

The only other thing that I know of that can limit the number of threads (besides the actual source code) is the setting of the environment variable OMP_NUM_THREADS. If this variable is set to a numeric value (e.g. 4), then the number of threads is limited to that maximum, regardless of how many threads the code is written to support.

In reality, I believe the default behavior when OMP_NUM_THREADS is not set (running as many threads as there are available cores) depends on the compiler used. But the GNU Compiler Collection and (I believe) the Intel compiler family have this behavior. The Visual Studio 2015 compilers have a similar behavior. So, if the machine you are running on has multiple cores and OMP_NUM_THREADS is not set, the code should automatically be using multi-threading for you.

There is another caveat here. If you are submitting jobs to a cluster with a job scheduler (like a Sun Grid Engine or a PBS scheduler), you should be careful to request multiple cores for your job. If the "hardware" you request for running your job is 1 node with 1 processor (i.e. core), then even if the actual machine has multiple cores, only 1 of those cores will be allocated to your job. So the running environment will "look like" a single-core processor. This would mean that only 1 thread at a time could run.

As an example for PBS, if you were to specify the following in the PBS header for your job:

#PBS -l nodes=1:ppn=1

then you would only get single threading, because Processors Per Node (ppn) is specified as 1. Whereas specifying

#PBS -l nodes=1:ppn=4

would allow up to 4 threads to run simultaneously.

I'm not as familiar with specifying the number of cores for an SGE cluster job, but I think the -pe smp <num_slots> option for the qsub command is how the number of cores is specified.

Tim

On Mon, May 23, 2016, at 14:15, Glasser, Matthew wrote:

You could try running a wb_command operation, like smoothing, on a random dense timeseries dataset. It should use multiple cores if everything is working correctly with that.

Peace,
Matt.
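As a concrete version of Matt's suggestion, something along these lines can serve as a quick multi-threading sanity check: smooth an existing dense timeseries and watch per-core load with top (press 1 for the per-core view) in another terminal. The file names are placeholders and the argument order is written from memory, so confirm it against the help text of wb_command -cifti-smoothing before relying on it.

# Hypothetical check: smooth a dense timeseries (kernel sizes are sigma in mm)
# while monitoring core usage with "top" in a second terminal.
wb_command -cifti-smoothing \
    subject.dtseries.nii 4 4 COLUMN smoothed.dtseries.nii \
    -left-surface subject.L.midthickness.32k_fs_LR.surf.gii \
    -right-surface subject.R.midthickness.32k_fs_LR.surf.gii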
From: <[email protected]<mailto:[email protected]>> on behalf of Gengyan Zhao <[email protected]<mailto:[email protected]>> Date: Monday, May 23, 2016 at 10:57 AM To: "[email protected]<mailto:[email protected]>" <[email protected]<mailto:[email protected]>> Subject: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly Hello HCP Masters, I have a question about the "DiffusionPreprocessingBatch.sh". I want to run it in a multi-thread manner and involve as many cores as possible. There is a line in the script saying: #Assume that submission nodes have OPENMP enabled (needed for eddy - at least 8 cores suggested for HCP data) What shall I do to enable OPENMP? or OPENMP is ready to go? My current state is that the pipeline has just been run with SGE on a Ubuntu 14.04 machine having 32 cores. Thank you very much. Best, Gengyan Research Assistant Medical Physics, UW-Madison _______________________________________________ HCP-Users mailing list [email protected]<mailto:[email protected]> http://lists.humanconnectome.org/mailman/listinfo/hcp-users ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. _______________________________________________ HCP-Users mailing list [email protected]<mailto:[email protected]> http://lists.humanconnectome.org/mailman/listinfo/hcp-users ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. -- Timothy B. Brown Business & Technology Application Analyst III Pipeline Developer (Human Connectome Project) tbbrown(at)wustl.edu<http://wustl.edu> ________________________________________ The material in this message is private and may contain Protected Healthcare Information (PHI). If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. -- Timothy B. Brown Business & Technology Application Analyst III Pipeline Developer (Human Connectome Project) tbbrown(at)wustl.edu ________________________________________ The material in this message is private and may contain Protected Healthcare Information (PHI). If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. 
_______________________________________________
HCP-Users mailing list
[email protected]
http://lists.humanconnectome.org/mailman/listinfo/hcp-users
