Hi Gengyan,

As far as I know, there is only one binary run as part of the Diffusion
Preprocessing Pipeline that is parallelized. That is the eddy binary, and
the latest versions of that binary are written to do their parallel
processing on a GPU.

We have broken the Diffusion Preprocessing Pipeline out into three
separate phases (Pre Eddy, Eddy, and Post Eddy), with a separate script
for each phase, so that 3 sequential jobs can be scheduled in such a way
that the 2nd job, the Eddy job, runs on a processing node that has a GPU
available. (The Eddy job is scheduled to run only upon successful
completion of the Pre Eddy job, and the Post Eddy job is scheduled to run
only upon successful completion of the Eddy job.) This significantly
speeds up the Eddy phase of the processing. However, the rest of the
processing (both before and after the Eddy phase) is, as far as I know,
single threaded.

The other significant parallelization we get is by doing just as Tim
Coalson described and scheduling the runs of many subjects
simultaneously.

Tim
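A minimal sketch of the three-phase dependency scheduling described
above, assuming an SGE cluster: qsub -terse prints just the job ID, and
-hold_jid holds a job until the listed job has finished. The phase
script names follow the HCP Pipelines naming convention but should be
treated as assumptions here, as should the "gpu" resource name, which is
site-specific:

    # Chain the three diffusion phases with SGE job dependencies.
    # Script names and the "gpu" resource are assumptions; adjust to your site.
    pre_id=$(qsub -terse DiffPreprocPipeline_PreEddy.sh)
    # The Eddy phase asks for a GPU node and is held until Pre Eddy finishes:
    eddy_id=$(qsub -terse -l gpu=1 -hold_jid "$pre_id" DiffPreprocPipeline_Eddy.sh)
    # The Post Eddy phase is held until Eddy finishes:
    qsub -hold_jid "$eddy_id" DiffPreprocPipeline_PostEddy.sh

One caveat: plain -hold_jid releases the dependent job when the
predecessor finishes, not necessarily when it succeeds; on a PBS system
the equivalent chain would use -W depend=afterok:<jobid> to require
successful completion.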
On Tue, Jun 14, 2016, at 10:24, Gengyan Zhao wrote:
> Hi Tim,
>
> Thank you. However, the code I ran was "DiffusionPreprocessingBatch.sh",
> which should be able to involve multi-core parallel computation within
> one subject, right? Because there is one line in the code saying:
>
> #Assume that submission nodes have OPENMP enabled (needed for eddy -
> at least 8 cores suggested for HCP data)
>
> Thanks,
> Gengyan
>
> From: Timothy Coalson <[email protected]>
> Sent: Monday, June 13, 2016 9:13:54 PM
> To: Gengyan Zhao
> Cc: Glasser, Matthew; [email protected]; [email protected]
> Subject: Re: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly
>
> Many of the executables and operations in the pipelines do not use
> parallelization across cores within a single subject. You may be better
> off setting up a queue with fewer cores available per slot, and
> submitting many subjects at once to the queue. Make sure you set the
> queue name correctly in each "Batch" script.
>
> Note that some pipelines take much more memory than others, so some may
> allow you to run 32 subjects in parallel, while others may only be able
> to handle 2 or 3 in parallel. Someone else will have to comment on the
> peak memory requirements per subject in each of the pipelines.
>
> Tim
>
> On Mon, Jun 13, 2016 at 8:53 PM, Gengyan Zhao <[email protected]> wrote:
>> Hi Matt, Tim Coalson and Tim Brown,
>>
>> Thank you very much for all your answers, and I'm sorry for my delayed
>> response.
>>
>> I ran "DiffusionPreprocessingBatch.sh" to process the example data of
>> subject 100307 on a 32-core Ubuntu 14.04 server. The OMP_NUM_THREADS
>> variable was not set, and the task was submitted through an SGE queue
>> configured with 16 cores and 16 slots. However, when I used the command
>> "top" to monitor CPU usage, only one core was occupied. Is there
>> anything wrong? What shall I do to involve multiple cores (OpenMP)?
>>
>> Thank you very much.
>>
>> Best,
>> Gengyan
>>
>> From: Timothy B. Brown <[email protected]>
>> Sent: Monday, May 23, 2016 3:45 PM
>> To: Glasser, Matthew; Gengyan Zhao; [email protected]
>> Subject: Re: [HCP-Users] Questions about Run Diffusion Preprocessing Parallelly
>>
>> Gengyan,
>>
>> My understanding is as follows. (Any OpenMP expert who sees holes in my
>> understanding should feel free to correct me...please.)
>>
>> If the compiled program/binary in use (e.g. eddy or wb_command) has been
>> compiled with the correct OpenMP-related switches, then by default that
>> program will use multi-threading in the places where multi-threading was
>> called for in the source code. It will use at most as many threads as
>> there are processing cores on the system on which the program is
>> running.
>>
>> So, if the machine you are using has 8 cores, then a properly compiled
>> OpenMP program will use up to 8 threads (parallel executions). But this
>> assumes that the code has been written with that many potential threads
>> of independent execution and compiled and linked with the correct OpenMP
>> switches and OpenMP libraries.
>>
>> For programs like eddy and wb_command, this proper compiling and linking
>> to use OpenMP should already have been done for you.
>>
>> The only other thing that I know of that can limit the number of threads
>> (besides the actual source code) is the setting of the environment
>> variable OMP_NUM_THREADS. If this variable is set to a numeric value
>> (e.g. 4), then the number of threads is limited to that maximum
>> regardless of how many threads the code is written to support.
>>
>> In reality, I believe the behavior when the OMP_NUM_THREADS variable is
>> not set (running as many threads as there are available cores) is
>> dependent upon the compiler used. But the GNU compiler collection and (I
>> believe) the Intel compiler family have this behavior. The Visual Studio
>> 2015 compilers behave similarly.
>>
>> So...if the machine you are running on has multiple cores, and
>> OMP_NUM_THREADS is not set, the code should automatically be using
>> multi-threading for you.
>>
>> There is another caveat here. If you are submitting jobs to a cluster
>> with a job scheduler (like Sun Grid Engine or a PBS scheduler), you
>> should be careful to request multiple cores for your job. If the
>> "hardware" you request for running your job is 1 node with 1 processor
>> (i.e. core), then even if the actual machine has multiple cores, only 1
>> of those cores will be allocated to your job. So the running environment
>> will "look like" a single-core processor. This would mean that only 1
>> thread at a time could run.
>>
>> As an example for PBS, if you were to specify the following in the PBS
>> header for your job:
>>
>> #PBS -l nodes=1:ppn=1
>>
>> then you would get only single threading, because Processors Per Node
>> (ppn) is specified as 1. Whereas specifying
>>
>> #PBS -l nodes=1:ppn=4
>>
>> would allow up to 4 threads to run simultaneously.
>>
>> I'm not as familiar with specifying the number of cores for an SGE
>> cluster job, but I *think* the -pe smp <num_slots> option for the qsub
>> command is how the number of cores is specified.
>>
>> Tim
>>
>> On Mon, May 23, 2016, at 14:15, Glasser, Matthew wrote:
>>> You could try running a wb_command operation like smoothing on a
>>> random dense timeseries dataset. It should use multiple cores if
>>> everything is working correctly.
>>>
>>> Peace,
>>>
>>> Matt.
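Putting the scheduler advice and Matt's suggested test together: a
minimal sketch of an SGE job script that requests multiple slots, pins
OpenMP to them, and runs a smoothing operation as the multi-core check.
The parallel environment name "smp" and all file names are assumptions:

    #!/bin/bash
    # Request 8 slots on a single host; the PE name "smp" is site-specific.
    #$ -pe smp 8
    #$ -cwd
    # SGE exports NSLOTS as the number of slots actually granted; capping
    # OMP_NUM_THREADS at NSLOTS keeps OpenMP from oversubscribing them.
    export OMP_NUM_THREADS=$NSLOTS
    # Multi-core check: smooth a dense timeseries (2 mm sigma on the surface
    # and in the volume) and watch "top" on the execution host; wb_command
    # should occupy several cores if OpenMP is active.
    wb_command -cifti-smoothing input.dtseries.nii 2 2 COLUMN smoothed.dtseries.nii \
        -left-surface L.midthickness.surf.gii \
        -right-surface R.midthickness.surf.gii

If top still shows only one busy core under a script like this, the
limit is most likely the queue's slot configuration (or a
single-threaded stage of the pipeline) rather than a missing OpenMP
build.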
>>> >>> From: <[email protected]> on behalf of Gengyan >>> Zhao <[email protected]> Date: Monday, May 23, 2016 at 10:57 AM To: >>> "[email protected]" <[email protected]> >>> Subject: [HCP-Users] Questions about Run Diffusion Preprocessing >>> Parallelly >>> >>> Hello HCP Masters, >>> >>> I have a question about the "DiffusionPreprocessingBatch.sh". I want >>> to run it in a multi-thread manner and involve as many cores as >>> possible. There is a line in the script saying: >>> >>> #Assume that submission nodes have OPENMP enabled (needed for eddy - >>> at least 8 cores suggested for HCP data) >>> >>> What shall I do to enable OPENMP? or OPENMP is ready to go? My >>> current state is that the pipeline has just been run with SGE on a >>> Ubuntu 14.04 machine having 32 cores. Thank you very much. >>> >>> Best, >>> Gengyan >>> >>> Research Assistant >>> Medical Physics, UW-Madison >>> >>> _______________________________________________ >>> HCP-Users mailing list [email protected] >>> http://lists.humanconnectome.org/mailman/listinfo/hcp-users >>> >>> >>> The materials in this message are private and may contain Protected >>> Healthcare Information or other information of a sensitive nature. >>> If you are not the intended recipient, be advised that any >>> unauthorized use, disclosure, copying or the taking of any action in >>> reliance on the contents of this information is strictly prohibited. >>> If you have received this email in error, please immediately notify >>> the sender via telephone or return mail. >>> _______________________________________________ >>> HCP-Users mailing list [email protected] >>> http://lists.humanconnectome.org/mailman/listinfo/hcp-users >>> >>> >>> The materials in this message are private and may contain Protected >>> Healthcare Information or other information of a sensitive nature. >>> If you are not the intended recipient, be advised that any >>> unauthorized use, disclosure, copying or the taking of any action in >>> reliance on the contents of this information is strictly prohibited. >>> If you have received this email in error, please immediately notify >>> the sender via telephone or return mail. >> -- >> Timothy B. Brown >> Business & Technology Application Analyst III >> Pipeline Developer (Human Connectome Project) >> tbbrown(at)wustl.edu >> ________________________________________ >> The material in this message is private and may contain Protected >> Healthcare Information (PHI). >> If you are not the intended recipient, be advised that any >> unauthorized use, disclosure, copying >> or the taking of any action in reliance on the contents of this >> information is strictly prohibited. >> If you have received this email in error, please immediately notify >> the sender via telephone or >> return mail. -- Timothy B. Brown Business & Technology Application Analyst III Pipeline Developer (Human Connectome Project) tbbrown(at)wustl.edu ________________________________________ The material in this message is private and may contain Protected Healthcare Information (PHI). If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. _______________________________________________ HCP-Users mailing list [email protected] http://lists.humanconnectome.org/mailman/listinfo/hcp-users
