Re: [gmx-users] simulation on 2 gpus
Hi,

I've done a lot of research and experimentation on this, so I can maybe get you started. If anyone has questions about the essay that follows, feel free to email me personally, and I'll link it to the email thread if it ends up being pertinent.

First, there are some more internet resources to check out. See Mark's talk at https://bioexcel.eu/webinar-performance-tuning-and-optimization-of-gromacs/ . Gromacs development moves fast, but a lot of it is still relevant.

I'll expand a bit here, with the caveat that Gromacs GPU development is moving very fast, so the correct commands for optimal performance are both system-dependent and a moving target between versions. This is a good thing - GPUs have revolutionized the field, and with each iteration we make better use of them. The downside is that it's unclear exactly what sort of CPU-GPU balance you should look to purchase to take advantage of future developments, though the trend is certainly that more and more computation is being offloaded to the GPUs.

The most important consideration is that to get maximum total throughput, you should be running not one but multiple simulations simultaneously. You can do this through the -multidir option, but I don't recommend that in this case, as it requires compiling with MPI and limits some of your options. My run scripts usually use "gmx mdrun ... &" to initiate subprocesses, with combinations of -ntomp, -ntmpi, -pin, -pinoffset, and -gputasks. I can give specific examples if you're interested.

Another important point is that you can run more simulations than the number of GPUs you have. Depending on CPU-GPU balance and quality, you won't double your throughput by e.g. putting 4 simulations on 2 GPUs, but you might increase it by up to 1.5x. This would involve targeting the same GPU with -gputasks.
Within a simulation, you should set up a benchmarking script to figure out the best combination of thread-MPI ranks and OpenMP threads - this can have pretty drastic effects on performance. For example, if you want to use your entire machine for one simulation (not recommended for maximal efficiency), you have a lot of decomposition options (ignoring PME - which is important, see below):

-ntmpi 2 -ntomp 32 -gputasks 01
-ntmpi 4 -ntomp 16 -gputasks 0011
-ntmpi 8 -ntomp 8 -gputasks
-ntmpi 16 -ntomp 4 -gputasks 111

(and a few others - note that ntmpi * ntomp = total threads available)

In my experience, you need to scan the options in a benchmarking script for each simulation size/content you want to simulate, and the difference between the best and the worst can be up to a factor of 2-4 in terms of performance. If you're splitting your machine among multiple simulations, I suggest running 1 thread-MPI rank (-ntmpi 1) per simulation, unless your benchmarking suggests that the optimal performance lies elsewhere.

Things get more complicated when you start putting PME on the GPUs. For the machines I work on, putting PME on GPUs absolutely improves performance, but I'm not fully confident in that assessment without testing your specific machine - you have a lot of cores with that Threadripper, and this is another area where I expect Gromacs 2020 might shift the optimal GPU-CPU balance. The issue with PME on GPUs is that we can (currently) only have one rank doing GPU PME work. So, if we have a machine with, say, 20 cores and 2 GPUs, and I run the following:

gmx mdrun -ntomp 10 -ntmpi 2 -pme gpu -npme 1 -gputasks 01

two ranks will be started: one, with cores 0-9, will work on the short-range interactions, offloading where it can to GPU 0, and the PME rank (cores 10-19) will offload to GPU 1. There is one significant problem (and one minor problem) with this setup. First, it is massively inefficient in terms of load balance.
In a typical system (there are exceptions), PME takes up ~1/3 of the computation that short-range interactions take. So, we are offloading 1/4 of our interactions to one GPU and 3/4 to the other, which leads to imbalance. In this specific case (2 GPUs and sufficient cores), the best solution is often (but not always) to run with -ntmpi 4 (in this example, then, -ntomp 5), as the PME rank then gets 1/4 of the GPU instructions, proportional to the computation needed.

The second problem (less critical - don't worry about this unless you're CPU-limited) is that PME-GPU MPI ranks only use 1 CPU core in their calculations. So, with a node of 20 cores and 2 GPUs, if I run a simulation with -ntmpi 4 -ntomp 5 -pme gpu -npme 1, each one of those ranks will have 5 CPUs, but the PME rank will only use one of them. You can specify the number of PME cores per rank with -ntomp_pme. This is useful in restricted cases. For example, given the above architecture setup (20 cores, 2 GPUs), I could maximally exploit my CPUs with the following commands:

gmx mdrun -ntmpi 4 -ntomp 3 -ntomp_pme 1 -pme gpu -npme 1 -gputasks -pin on -pinoffset 0 &
gmx mdrun -ntmpi 4
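
The scan over thread-MPI ranks and OpenMP threads recommended in this message could be generated along the following lines. This is only a sketch under stated assumptions: it takes the 64 hardware threads of the workstation from the original question, bench.tpr is a hypothetical input file, and the commands are printed rather than executed. The -gputasks, -pme and -npme settings are left out on purpose because they have to be chosen per decomposition.

```shell
#!/bin/bash
# Print every "-ntmpi N -ntomp M" pair with N * M equal to the total
# number of hardware threads given as the first argument.
list_decompositions() {
    local total=$1 ntmpi
    for ((ntmpi = 1; ntmpi <= total; ntmpi++)); do
        if (( total % ntmpi == 0 )); then
            echo "-ntmpi $ntmpi -ntomp $(( total / ntmpi ))"
        fi
    done
}

# Dry run: print one short benchmark command per decomposition.
# bench.tpr is a hypothetical file name; add -gputasks, -pme and -npme
# to each line as appropriate before actually running anything.
list_decompositions 64 | while read -r opts; do
    echo "gmx mdrun -s bench.tpr $opts -nsteps 10000 -resethway -noconfout"
done
```

Comparing the ns/day reported at the end of each log then shows which decomposition suits a given system on a given machine.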
Re: [gmx-users] gmx insert-molecules question
Thanks! (and sorry about all the typos in my original email -- just re-read it and saw it was barely intelligible!)

M

On Thu, Jul 25, 2019 at 4:25 PM Justin Lemkul wrote:
>
> On 7/25/19 4:11 PM, Mala L Radhakrishnan wrote:
> > Hi,
> >
> > I am trying to crowd snapshots with multiple copies of a molecule. When I
> > run gmx insert-molecules and I have a box of a certain size, does it make
> > sure that there is no overlap of molecules even considering pbc? What I
> > mean by this is if I do a trjconv on the resulting crowded snapshot, using
> > -pbc atom, will any of the atoms/molecules overlap? Or maybe another way
> > of saying it is, if a molecule, when placed, falls partially outside the
> > box does it ensure that another molecule wouldn't overlap with the periodic
> > image of that first molecule?
>
> Yes, the insert-molecules code uses PBC in its neighbor searching.
>
> -Justin
>
> --
> ==
>
> Justin A. Lemkul, Ph.D.
> Assistant Professor
> Office: 301 Fralin Hall
> Lab: 303 Engel Hall
>
> Virginia Tech Department of Biochemistry
> 340 West Campus Dr.
> Blacksburg, VA 24061
>
> jalem...@vt.edu | (540) 231-3129
> http://www.thelemkullab.com
>
> ==

--
Mala L. Radhakrishnan
Associate Professor of Chemistry
Director, Biochemistry Program
Wellesley College
106 Central Street
Wellesley, MA 02481
(781)283-2981

--
Gromacs Users mailing list
* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
* For (un)subscribe requests visit https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-requ...@gromacs.org.
Re: [gmx-users] gmx insert-molecules question
On 7/25/19 4:11 PM, Mala L Radhakrishnan wrote:
> Hi,
>
> I am trying to crowd snapshots with multiple copies of a molecule. When I
> run gmx insert-molecules and I have a box of a certain size, does it make
> sure that there is no overlap of molecules even considering pbc? What I
> mean by this is if I do a trjconv on the resulting crowded snapshot, using
> -pbc atom, will any of the atoms/molecules overlap? Or maybe another way
> of saying it is, if a molecule, when placed, falls partially outside the
> box does it ensure that another molecule wouldn't overlap with the periodic
> image of that first molecule?

Yes, the insert-molecules code uses PBC in its neighbor searching.

-Justin

--
==

Justin A. Lemkul, Ph.D.
Assistant Professor
Office: 301 Fralin Hall
Lab: 303 Engel Hall

Virginia Tech Department of Biochemistry
340 West Campus Dr.
Blacksburg, VA 24061

jalem...@vt.edu | (540) 231-3129
http://www.thelemkullab.com

==
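
To make concrete what a PBC-aware overlap check means, here is a small illustration of the minimum-image convention in one dimension. It is not the insert-molecules source code, only a sketch of the idea: two atoms near opposite faces of the box are treated as close neighbours because one of them is compared against the nearest periodic image of the other. The function name and the example coordinates are made up for illustration.

```shell
#!/bin/bash
# Minimum-image separation along one axis of a periodic box.
# Arguments: coordinate 1, coordinate 2, box length (all in nm).
min_image_dist() {
    awk -v a="$1" -v b="$2" -v L="$3" 'BEGIN {
        d = a - b
        # Shift by whole box lengths so that d ends up within half a box.
        d -= L * int(d / L + (d >= 0 ? 0.5 : -0.5))
        if (d < 0) d = -d
        printf "%.3f\n", d
    }'
}

# Atoms at 0.1 nm and 4.9 nm in a 5 nm box: 4.8 nm apart inside the box,
# but only 0.2 nm apart through the periodic boundary.
min_image_dist 0.1 4.9 5.0
```

A check that ignored periodicity would see these two atoms as far apart and could place overlapping molecules across the box boundary.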
Re: [gmx-users] Angular distribution
On 7/25/19 8:05 AM, Omkar Singh wrote:
> Hi everyone,
> I have a simulated protein-water system and I want to calculate the
> angular distribution function. How can I do this? Can anyone help me?

What angles do you want to calculate?

-Justin

--
==

Justin A. Lemkul, Ph.D.
Assistant Professor
Office: 301 Fralin Hall
Lab: 303 Engel Hall

Virginia Tech Department of Biochemistry
340 West Campus Dr.
Blacksburg, VA 24061

jalem...@vt.edu | (540) 231-3129
http://www.thelemkullab.com

==
[gmx-users] gmx insert-molecules question
Hi,

I am trying to crowd snapshots with multiple copies of a molecule. When I run gmx insert-molecules and I have a box of a certain size, does it make sure that there is no overlap of molecules even considering pbc? What I mean by this is if I do a trjconv on the resulting crowded snapshot, using -pbc atom, will any of the atoms/molecules overlap? Or maybe another way of saying it is, if a molecule, when placed, falls partially outside the box does it ensure that another molecule wouldn't overlap with the periodic image of that first molecule?

Hope this question makes sense -- thanks!

Mala

--
Mala L. Radhakrishnan
Associate Professor of Chemistry
Director, Biochemistry Program
Wellesley College
106 Central Street
Wellesley, MA 02481
(781)283-2981
[gmx-users] simulation on 2 gpus
Dear all,

I am trying to run simulations with Gromacs 2019.2 on a workstation with an AMD Threadripper CPU (32 cores, 64 threads), 128 GB RAM, and two RTX 2080 Ti cards with an NVLink bridge. I read the user guide section on performance and I am exploring some possible combinations of CPU/GPU work to run as fast as possible. I was wondering if any of you has experience running on more than one GPU with several cores and can give some hints as a starting point.

Thanks
Stefano

--
Stefano GUGLIELMO PhD
Assistant Professor of Medicinal Chemistry
Department of Drug Science and Technology
Via P. Giuria 9
10125 Turin, ITALY
ph. +39 (0)11 6707178
[gmx-users] Angular distribution
Hi everyone,

I have a simulated protein-water system and I want to calculate the angular distribution function. How can I do this? Can anyone help me?

Thanks
Re: [gmx-users] Sun Solaris
On Thu, Jul 25, 2019 at 11:31 AM amitabh jayaswal wrote:
>
> Dear All,
> *Namaskar!*
> Can GROMACS be installed and run on a Sun Solaris system?

Hi,

As long as you have a modern C++ compiler and toolchain, you should be able to do so.

> We have a robust IBM Desktop which we intend to dedicatedly use for
> GROMACS; however we are facing difficulties in installing it.

Without specifics of your difficulties, I do not think we can help out.

--
Szilárd

> The machine specifications are:
> PRODUCT NAME: IBM System x3400
> MACHINE TYPE: 7973
> SERIAL NO.: 99A8370
> PRODUCT ID:7973PAA
>
> Is there a way to progress ahead?
> Kindly provide a solution asap.
> Best
>
> *Dr. Amitabh Jayaswal*
> *PhD Bioinformatics*
> *IIT(BHU), Varanasi, India*
> *M: +91 9868 33 00 88 *(also on WhatsApp), and
> * +91 7376 019 155*
[gmx-users] Sun Solaris
Dear All,
*Namaskar!*
Can GROMACS be installed and run on a Sun Solaris system?
We have a robust IBM Desktop which we intend to dedicatedly use for GROMACS; however we are facing difficulties in installing it.
The machine specifications are:
PRODUCT NAME: IBM System x3400
MACHINE TYPE: 7973
SERIAL NO.: 99A8370
PRODUCT ID:7973PAA

Is there a way to progress ahead?
Kindly provide a solution asap.
Best

*Dr. Amitabh Jayaswal*
*PhD Bioinformatics*
*IIT(BHU), Varanasi, India*
*M: +91 9868 33 00 88 *(also on WhatsApp), and
* +91 7376 019 155*
Re: [gmx-users] remd error
This is an MPI / job scheduler error: you are requesting 2 nodes with 20 processes per node (=40 total), but starting 80 ranks.

--
Szilárd

On Thu, Jul 18, 2019 at 8:33 AM Bratin Kumar Das <177cy500.bra...@nitk.edu.in> wrote:
>
> Hi,
> I am running a REMD simulation in gromacs-2016.5. After generating the
> multiple .tpr files in each directory with the following command
> *for i in {0..7}; do cd equil$i; gmx grompp -f equil${i}.mdp -c em.gro -p topol.top -o remd$i.tpr -maxwarn 1; cd ..; done*
> I run *mpirun -np 80 gmx_mpi mdrun -s remd.tpr -multi 8 -replex 1000 -reseed 175320 -deffnm remd_equil*
> It gives the following error
> There are not enough slots available in the system to satisfy the 40 slots
> that were requested by the application:
> gmx_mpi
>
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> --
> There are not enough slots available in the system to satisfy the 40 slots
> that were requested by the application:
> gmx_mpi
>
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> I do not understand the error. Any suggestion will be highly
> appreciated. The mdp file and the qsub.sh file are attached below
>
> qsub.sh...
> #! /bin/bash
> #PBS -V
> #PBS -l nodes=2:ppn=20
> #PBS -l walltime=48:00:00
> #PBS -N mdrun-serial
> #PBS -j oe
> #PBS -o output.log
> #PBS -e error.log
> #cd /home/bratin/Downloads/GROMACS/Gromacs_fibril
> cd $PBS_O_WORKDIR
> module load openmpi3.0.0
> module load gromacs-2016.5
> NP='cat $PBS_NODEFILE | wc -1'
> # mpirun --machinefile $PBS_PBS_NODEFILE -np $NP 'which gmx_mpi' mdrun -v -s nvt.tpr -deffnm nvt
> #/apps/gromacs-2016.5/bin/mpirun -np 8 gmx_mpi mdrun -v -s remd.tpr -multi 8 -replex 1000 -deffnm remd_out
> for i in {0..7}; do cd equil$i; gmx grompp -f equil${i}.mdp -c em.gro -r em.gro -p topol.top -o remd$i.tpr -maxwarn 1; cd ..; done
>
> for i in {0..7}; do cd equil${i}; mpirun -np 40 gmx_mpi mdrun -v -s remd.tpr -multi 8 -replex 1000 -deffnm remd$i_out ; cd ..; done
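
The arithmetic behind this diagnosis can be checked before submitting a job. The sketch below is an illustration, not part of the thread: it takes the node and ppn values from the quoted PBS header, computes the slots the scheduler grants, and shows how many ranks each of the 8 replicas would get if the rank count passed to mpirun matched that number. The function name is made up; it relies on the mdrun requirement that the number of ranks be a multiple of the number of simulations.

```shell
#!/bin/bash
# Ranks each replica gets when SLOTS MPI ranks are shared by REPLICAS simulations.
# Prints the per-replica count, or fails if the split is uneven.
ranks_per_replica() {
    local slots=$1 replicas=$2
    if (( slots % replicas != 0 )); then
        echo "error: $slots ranks cannot be split evenly over $replicas replicas" >&2
        return 1
    fi
    echo $(( slots / replicas ))
}

# Values from the quoted job script: "#PBS -l nodes=2:ppn=20" and "-multi 8".
nodes=2; ppn=20; replicas=8
slots=$(( nodes * ppn ))
echo "slots granted by the scheduler: $slots"
echo "ranks per replica: $(ranks_per_replica "$slots" "$replicas")"
```

With 40 slots and 8 replicas this gives 5 ranks per replica, so a single mpirun with at most 40 ranks in total is what the allocation can satisfy.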
Re: [gmx-users] performance issues running gromacs with more than 1 gpu card in slurm
Hi,

It is not clear to me how you are trying to set up your runs, so please provide some details:
- are you trying to run multiple simulations concurrently on the same node or are you trying to strong-scale?
- what are you simulating?
- can you provide log files of the runs?

Cheers,
--
Szilárd

On Tue, Jul 23, 2019 at 1:34 AM Carlos Navarro wrote:
>
> No one can give me an idea of what can be happening? Or how I can solve it?
> Best regards,
> Carlos
>
> --
> Carlos Navarro Retamal
> Bioinformatic Engineering. PhD.
> Postdoctoral Researcher in Center of Bioinformatics and Molecular
> Simulations
> Universidad de Talca
> Av. Lircay S/N, Talca, Chile
> E: carlos.navarr...@gmail.com or cnava...@utalca.cl
>
> On July 19, 2019 at 2:20:41 PM, Carlos Navarro (carlos.navarr...@gmail.com) wrote:
>
> Dear gmx-users,
> I’m currently working on a server where each node has 40 physical cores
> (40 threads) and 4 Nvidia V100 cards.
> When I launch a single job (1 simulation using a single GPU card) I get a
> performance of about ~35 ns/day in a system of about 300k atoms. Looking
> at the usage of the video card during the simulation, I notice that the
> card is being used at about ~80%.
> The problems arise when I increase the number of jobs running at the same
> time. If, for instance, 2 jobs are running at the same time, the performance
> drops to ~25 ns/day each and the usage of the video cards also drops during
> the simulation to about ~30-40% (sometimes dropping to less than 5%).
> Clearly there is a communication problem between the GPU cards and the CPU
> during the simulations, but I don’t know how to solve this.
> Here is the script I use to run the simulations:
>
> #!/bin/bash -x
> #SBATCH --job-name=testAtTPC1
> #SBATCH --ntasks-per-node=4
> #SBATCH --cpus-per-task=20
> #SBATCH --account=hdd22
> #SBATCH --nodes=1
> #SBATCH --mem=0
> #SBATCH --output=sout.%j
> #SBATCH --error=s4err.%j
> #SBATCH --time=00:10:00
> #SBATCH --partition=develgpus
> #SBATCH --gres=gpu:4
>
> module use /gpfs/software/juwels/otherstages
> module load Stages/2018b
> module load Intel/2019.0.117-GCC-7.3.0
> module load IntelMPI/2019.0.117
> module load GROMACS/2018.3
>
> WORKDIR1=/p/project/chdd22/gromacs/benchmark/AtTPC1/singlegpu/1
> WORKDIR2=/p/project/chdd22/gromacs/benchmark/AtTPC1/singlegpu/2
> WORKDIR3=/p/project/chdd22/gromacs/benchmark/AtTPC1/singlegpu/3
> WORKDIR4=/p/project/chdd22/gromacs/benchmark/AtTPC1/singlegpu/4
>
> DO_PARALLEL=" srun --exclusive -n 1 --gres=gpu:1 "
> EXE=" gmx mdrun "
>
> cd $WORKDIR1
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 0 -ntomp 20 &>log &
> cd $WORKDIR2
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 10 -ntomp 20 &>log &
> cd $WORKDIR3
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 20 -ntomp 20 &>log &
> cd $WORKDIR4
> $DO_PARALLEL $EXE -s eq6.tpr -deffnm eq6-20 -nmpi 1 -pin on -pinoffset 30 -ntomp 20 &>log &
>
>
> Regarding pinoffset, I first tried using 20 cores for each job but then
> also tried with 8 cores (so pinoffset 0 for job 1, pinoffset 4 for job 2,
> pinoffset 8 for job 3 and pinoffset 12 for job 4), but in the end the problem
> persists.
>
> Currently on this machine I’m not able to use more than 1 GPU per job, so
> this is my only way to use the whole node properly.
> If you need more information please just let me know.
> Best regards.
> Carlos
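
One thing that can be read off the quoted script is that each job asks for 20 OpenMP threads while the pin offsets are spaced 10 cores apart, which on a 40-core node amounts to 4 x 20 = 80 pinned threads. Whether this explains the slowdown is not established in the thread. As a sketch only, a non-overlapping layout for several single-GPU jobs could be derived as below; it assumes one hardware thread per core (as stated in the post), the default pin stride, and a made-up helper name.

```shell
#!/bin/bash
# Print thread count and pin offset for each of JOBS simulations sharing
# CORES cores, giving every job its own contiguous block of cores.
pin_layout() {
    local cores=$1 jobs=$2 per_job job
    per_job=$(( cores / jobs ))
    for ((job = 0; job < jobs; job++)); do
        echo "job $job: -ntomp $per_job -pin on -pinoffset $(( job * per_job ))"
    done
}

# 40 cores shared by 4 concurrent jobs, as on the node described above.
pin_layout 40 4
```

For 40 cores and 4 jobs this yields 10 threads per job at offsets 0, 10, 20 and 30, so no core is claimed by two simulations.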
[gmx-users] older server CPUs with recent GPUs for GROMACS
Hi Mike,

Forking the discussion to have a consistent topic that is more discoverable.

On Thu, Jul 18, 2019 at 4:21 PM Michael Williams wrote:
>
> Hi Szilárd,
>
> Thanks for the interesting observations on recent hardware. I was wondering
> if you could comment on the use of somewhat older server cpus and
> motherboards (versus more cutting edge consumer parts). I recently noticed
> that Haswell era Xeon cpus (E5 v3) are quite affordable now (~$400 for 12
> core models with 40 pcie lanes) and so are the corresponding 2 cpu socket
> server motherboards. Of course the RAM is slower than what can be used with
> the latest Ryzen or i7/i9 cpus.

When it comes to GPU-accelerated runs, given that most of the arithmetically intensive computation is offloaded, major features of more modern processors / CPU instruction sets (like AVX512) don't help much. As most bio-MD (unless running huge systems) fits in the CPU cache, RAM performance and more memory channels also have little to no impact (with some exceptions, such as the first-gen AMD Zen arch, but that's another topic). What dominates the CPU's contribution to performance is cache size (and speed/efficiency) and the number/speed of the CPU cores. This is somewhat non-trivial to assess, as the clock speed specs don't always reflect the stable clocks these CPUs run at, but roughly you can use (#cores x frequency) as a metric to gauge the performance of a CPU *in such a scenario*.

You can find more on this in our recent paper, where we do in fact compare the performance of the best bang-for-buck modern servers (spoiler alert: AMD EPYC was already and will especially be the champion with the Rome arch) with upgraded older Xeon v2 nodes; see: https://doi.org/10.1002/jcc.26011

> Are there any other bottlenecks with this somewhat older server hardware that
> I might not be aware of?

There can be: PCI topology can be an issue; you want a symmetric layout, e.g. two x16 buses connected directly to each socket (for dual-socket systems) rather than e.g. many lanes connected to a PCI switch all connected to the same socket. You can also have significant GPU-to-GPU communication issues on older-gen hardware (like v2/v3 Xeon); GROMACS does not make use of that yet (partly for that very reason), but with near-future releases that may also become a slight concern if you want to scale across many GPUs.

I hope that helps, let me know if you have any other questions!

Cheers,
--
Szilárd

> Thanks again for the interesting information and practical advice on this
> topic.
>
> Mike
>
>
> > On Jul 18, 2019, at 2:21 AM, Szilárd Páll wrote:
> >
> > PS: You will get more PCIe lanes without motherboard trickery -- and note
> > that consumer motherboards with PCIe switches can sometimes cause
> > instabilities when under heavy compute load -- if you buy the aging and
> > quite overpriced i9 X-series like the i9-7920 with 12 cores or the
> > Threadripper 2950x 16 cores and 60 PCIe lanes.
> >
> > Also note that more cores always win when CPU performance matters,
> > and while 8 cores are generally sufficient, in some use-cases they may not be
> > (like runs with free energy).
> >
> > --
> > Szilárd
> >
> >
> > On Thu, Jul 18, 2019 at 10:08 AM Szilárd Páll
> > wrote:
> >
> >> On Wed, Jul 17, 2019 at 7:00 PM Moir, Michael (MMoir)
> >> wrote:
> >>
> >>> This is not quite true. I certainly observed this degradation in
> >>> performance using the 9900K with two GPUs as Szilárd states using a
> >>> motherboard with one PCIe controller, but the limitation is from the
> >>> motherboard not from the CPU.
> >>
> >>
> >> Sorry, but that's not the case. PCIe controllers have been integrated into
> >> CPUs for many years; see
> >>
> >> https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-introduction-basics-paper.pdf
> >>
> >> https://www.microway.com/hpc-tech-tips/common-pci-express-myths-gpu-computing/
> >>
> >> So no, the limitation is the CPU itself. Consumer CPUs these days have 24
> >> lanes total, some of which are used to connect the CPU to the chipset, and
> >> effectively you get 16-20 lanes (BTW here too the new AMD CPUs win as they
> >> provide 16 lanes for GPUs and similar devices and 4 lanes for NVMe, all on
> >> PCIe 4.0).
> >>
> >>
> >>> It is possible to obtain a motherboard that contains two PCIe
> >>> controllers which overcomes this obstacle for not a whole lot more money.
> >>>
> >>
> >> It is possible to buy motherboards with PCIe switches. These don't
> >> increase the number of lanes; they just do what a switch does: as long as not all
> >> connected devices try to use the full capacity of the CPU (!) at the same
> >> time, you can get full speed on all connected devices.
> >> e.g.:
> >> https://techreport.com/r.x/2015_11_19_Gigabytes_Z170XGaming_G1_motherboard_reviewed/05-diagram_pcie_routing.gif
> >>
> >> Cheers,
> >> --
> >> Szilárd
> >>
> >> Mike
> >>>
> >>> -Original Message-
> >>> From: