Hi Szilárd,

Our OS is RHEL 7.6.
Thank you for your test results. It's nice to see consistent results on a POWER9 system.

Your suggestion of allocating the whole node was a good one. I did this in two ways: first by bypassing the Slurm scheduler entirely (ssh-ing to an empty node and running the benchmark there), and second through Slurm using the --exclusive directive, which allocates the entire node independent of the job size. In both cases, using 32 hardware threads and one V100 GPU for ADH (PME, cubic, 40k steps), the performance was about 132 ns/day, significantly better than the 90 ns/day from before (without --exclusive). Links to the md.log files are below.

Here is the Slurm script with --exclusive:

--------------------------------------------------------------------------------------------------
#!/bin/bash
#SBATCH --job-name=gmx           # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=1               # total number of tasks across all nodes
#SBATCH --cpus-per-task=32       # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem=8G                 # memory per node (4G per cpu-core is default)
#SBATCH --time=00:10:00          # total run time limit (HH:MM:SS)
#SBATCH --gres=gpu:1             # number of gpus per node
#SBATCH --exclusive              # TASK AFFINITIES SET CORRECTLY BUT ENTIRE NODE ALLOCATED TO JOB

module purge
module load cudatoolkit/10.2

BCH=../adh_cubic
gmx grompp -f $BCH/pme_verlet.mdp -c $BCH/conf.gro -p $BCH/topol.top -o bench.tpr
srun gmx mdrun -nsteps 40000 -pin on -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -s bench.tpr
--------------------------------------------------------------------------------------------------

Here are the log files:

md.log with --exclusive:
https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log.with-exclusive

md.log without --exclusive:
https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log.without-exclusive

Szilárd, what is your reading of these two files?

This is a shared cluster, so I can't use --exclusive for all jobs. Our nodes have four GPUs and 128 hardware threads (32 cores over two sockets, SMT4). Any thoughts on how to make a job behave as if it were run with --exclusive? The task affinities are apparently not being set properly in that case. To address this I experimented with the --cpu-bind settings. When --exclusive is not used, I see a slight performance gain from --cpu-bind=cores:

srun --cpu-bind=cores gmx mdrun -nsteps 40000 -pin on -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -s bench.tpr

Even in this case mdrun still reports "NOTE: Thread affinity was not set" and performance is still poor. The --exclusive result suggests, I believe, that the failed hardware unit test can be ignored.

Here's a bit about our Slurm configuration:

$ grep -i affinity /etc/slurm/slurm.conf
TaskPlugin=affinity,cgroup

ldd shows that gmx is linked against libhwloc.so.5.

I have not heard from my contact at ORNL. All I can find online is that they offer GROMACS 5.1 (https://www.olcf.ornl.gov/software_package/gromacs/) and apparently nothing special is done about thread affinities.
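In case it helps the diagnosis, below is the kind of check I'm planning to run inside a non-exclusive allocation (before mdrun) to see what CPUs the job is actually confined to. taskset, hwloc-bind and lstopo-no-graphics are just the standard utilities, nothing GROMACS-specific, and the cgroup path is only a guess based on the usual cgroup v1 layout on RHEL 7:

--------------------------------------------------------------------------------------------------
# Run inside the same Slurm allocation, before mdrun.

# Report the CPU list each task is actually bound to; "verbose" makes
# srun print the binding it applied, taskset/hwloc-bind show what the
# task process itself sees.
srun --cpu-bind=verbose,cores bash -c 'echo "task $SLURM_PROCID on $(hostname)"; taskset -cp $$; hwloc-bind --get'

# The cpuset cgroup the job landed in (path is a guess and may differ
# depending on the cgroup setup).
cat /sys/fs/cgroup/cpuset/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}/cpuset.cpus

# Dump the hardware topology hwloc sees, to check for the missing NUMA
# node you mentioned.
lstopo-no-graphics --no-io
--------------------------------------------------------------------------------------------------

If the masks look sane there but mdrun still refuses to pin, that would at least narrow it down to the GROMACS/hwloc side rather than Slurm.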
Jon

________________________________
From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <gromacs.org_gmx-users-boun...@maillist.sys.kth.se> on behalf of Szilárd Páll <pall.szil...@gmail.com>
Sent: Friday, April 24, 2020 6:06 PM
To: Discussion list for GROMACS users <gmx-us...@gromacs.org>
Cc: gromacs.org_gmx-users@maillist.sys.kth.se <gromacs.org_gmx-users@maillist.sys.kth.se>
Subject: Re: [gmx-users] GROMACS performance issues on POWER9/V100 node

Hi,

Affinity settings on the Talos II with Ubuntu 18.04, kernel 5.0, work fine. I get threads pinned where they should be (hwloc confirmed) and consistent results. I also get reasonable thread placement even without pinning (i.e. the kernel scatters first until #threads <= #hwthreads). I see only a minor penalty to not pinning -- not too surprising given that I have a single NUMA node and the kernel is doing its job.

Here are my quick test results, run on an 8-core Talos II POWER9 + a GPU, using the adh_cubic input:

$ grep Perf *.log
test_1x1_rep1.log:Performance: 16.617
test_1x1_rep2.log:Performance: 16.479
test_1x1_rep3.log:Performance: 16.520
test_1x2_rep1.log:Performance: 32.034
test_1x2_rep2.log:Performance: 32.389
test_1x2_rep3.log:Performance: 32.340
test_1x4_rep1.log:Performance: 62.341
test_1x4_rep2.log:Performance: 62.569
test_1x4_rep3.log:Performance: 62.476
test_1x8_rep1.log:Performance: 97.049
test_1x8_rep2.log:Performance: 96.653
test_1x8_rep3.log:Performance: 96.889

This seems to point towards some issue with the OS or setup on the IBM machines you have -- and the unit test error may be one of the symptoms of it (as it suggests something is off with the hardware topology and a NUMA node is missing from it).

I'd still suggest checking whether a full node allocation, with all threads, memory, etc. passed to the job, results in successful affinity settings i) in mdrun and ii) in some other tool.

Please update this thread if you have further findings.

Cheers,
--
Szilárd

On Fri, Apr 24, 2020 at 10:52 PM Szilárd Páll <pall.szil...@gmail.com> wrote:
>
>> The following lines are found in md.log for the POWER9/V100 run:
>>
>> Overriding thread affinity set outside gmx mdrun
>> Pinning threads with an auto-selected logical core stride of 128
>> NOTE: Thread affinity was not set.
>>
>> The full md.log is available here:
>> https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log
>
> I glanced over that at first, will see if I can reproduce it, though I
> only have access to a Raptor Talos, not an IBM machine with Ubuntu.
>
> What OS are you using?

--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-requ...@gromacs.org.