Hi Ingo,

I don't use gnubg for analysis. I just happened to notice that gnubg seems to use just one core on my machine with eight physical cores. No matter how many threads I set, they all run on one core (or maybe two). When I use the chess program crafty to analyze positions and have it use 8 threads for evaluation, the top command shows CPU%=800%. If I have it use 16 threads, top shows 1600% (which must mean that 16 virtual cores are being used, via hyperthreading).

My point is simply that threads are not being distributed among cores on my new Nehalem Mac Pro. I'm hoping someone will help me understand why, just so I know.

Louis

----- Original Message -----
From: "Ingo Macherius" <[email protected]>
To: "Louis Zulli" <[email protected]>
Cc: "Michael Petch" <[email protected]>, [email protected]
Sent: Thursday, August 6, 2009 2:43:31 PM GMT -05:00 US/Canada Eastern
Subject: RE: [Bug-gnubg] Re: Getting gnubg to use all available cores

I've done two experiments on Debian 5.0.2 on my 2xXeon 5130 box (4 cores in 2 chips).

A) Run 4 instances of gnubg analyzing 5 matches with 7pt each (real fibs matches)
B) Run 1 instance of gnubg analyzing the same matches 4 times each (with caches flushed in between) with 4 threads

=> The amount of work done is equal: each experiment analyzes 20 7pt matches.

Run A takes about 10.3 secs real time (wallclock), Run B takes about 13.2 secs. These are averaged values over several runs; there were no significant outliers.

I had the "top" and "mpstat" commands running while both experiments took place. I noticed that both smooth the % CPU usage they display over a sliding window. When you start either A or B, the displayed % usage remains low for some secs, gradually goes up, peaks shortly before the run ends, and then falls to zero again. In other words: the % displayed is an average over the last n seconds, not the current usage.

So gnubg is utilizing the cores well in both cases; what you see is the "beautifying" effect of top. Run a longer batch, one that takes say several minutes, and you will see the 100% CPU usage (Case A) or near 90% CPU usage (Case B) you expect.

That threaded gnubg cannot utilize the cores as effectively as 4 parallel non-threaded gnubgs can be explained by these assumptions of mine:

- The system scheduler does a better job than gnubg's.
- There is overhead for threading even in multithreaded binaries: a gnubg compiled without threading is some % faster than one compiled with threading but running only one thread.
- Thread synchronization causes some amount of idle time.

So if you want to utilize your CPU the best, run as many gnubg instances in parallel as there are cores. Compile them without threading support. Split the work to be analyzed into batches manually. If you prefer the more comfortable way of letting gnubg do the scheduling of your batches, accept the reasonable penalty for that.

Ingo

P.S.
The batch I've used to run 4 in parallel is

#!/bin/bash
TMPBATCH=/tmp/gnubgbatch
echo > ${TMPBATCH} set cache 131072
echo >> ${TMPBATCH} clear cache
echo >> ${TMPBATCH} clear hint
echo >> ${TMPBATCH} analysis clear
echo >> ${TMPBATCH} import mat You_vs_silent_greek_20090802162313580.mat
echo >> ${TMPBATCH} analyze match
echo >> ${TMPBATCH} import mat You_vs_silent_greek_20090521010117727.mat
echo >> ${TMPBATCH} analyze match
echo >> ${TMPBATCH} import mat You_vs_sale_20090803190848830.mat
echo >> ${TMPBATCH} analyze mat
echo >> ${TMPBATCH} import mat You_vs_fortuna_20090802225906909.mat
echo >> ${TMPBATCH} analyze mat
echo >> ${TMPBATCH} import mat You_vs_VaGrant_20090103173459437.mat
echo >> ${TMPBATCH} analyze mat
for t in 1 2 3
do
./gnubg-nt < ${TMPBATCH} > /dev/null &
done
./gnubg-nt < ${TMPBATCH} > /dev/null

> -----Original Message-----
> From: Louis Zulli [mailto:[email protected]]
> Sent: Thursday, August 06, 2009 6:29 PM
> To: Ingo Macherius
> Cc: 'Michael Petch'; [email protected]
> Subject: Re: [Bug-gnubg] Re: Getting gnubg to use all available cores
>
>
> Hi,
>
> I put
>
> #define MAX_NUMTHREADS 64
>
> in multithread.h and rebuilt.
>
> In Settings-->Options-->Other, I put Eval Threads to 64.
>
> I then let gnubg analyze a game using 4-ply analysis.
>
> According to my unix top command, gnubg had 69 threads and was using
> 188%CPU. So apparently all the threads were running (into each other!)
> in one physical core.
>
> In any case, increasing the max number of threads above 16 seems
> trivial to do, unless I'm missing something.
>
> Louis
>
>
> On Aug 6, 2009, at 11:34 AM, Ingo Macherius wrote:
>
> > Do you use the calibrate command or a batch analysis of matchfiles?
> > The former was shown to be of no value for benchmarks, see here:
> > http://lists.gnu.org/archive/html/bug-gnubg/2009-08/msg00006.html
> >
> > With calibrate I had the very same effect of high idle times during
> > benchmarks, unless I used at least 8 threads per physical core.
> >
> > I am doing a benchmark on a 4 core machine which iterates over #thread (1..6)
> > and cache size (2^1 .. 2^27). Should be posted in say 3 hours, it literally
> > is still running :)
> >
> > Ingo
> >
> >> -----Original Message-----
> >> From: [email protected]
> >> [mailto:[email protected]] On Behalf Of
> >> Louis Zulli
> >> Sent: Thursday, August 06, 2009 3:21 PM
> >> To: Michael Petch
> >> Cc: [email protected]
> >> Subject: [Bug-gnubg] Re: Getting gnubg to use all available cores
> >>
> >>
> >>
> >> On Aug 5, 2009, at 4:02 PM, Michael Petch wrote:
> >>
> >>> I'm unsure how the architecture is deployed and how OS/X handles the
> >>> physical cores, but it almost sounds like one Physical core is being
> >>> used (Using Hyperthreads to run 2 threads simultaneously). I wonder
> >>> if the memory is shared across all the cores? A friend of mine was
> >>> suggesting that people may have to wait for Snow Leopard to come out
> >>> before OS/X properly utilizes the Nehalem architecture (whether that
> >>> is true or not, I don't know).
> >>>
> >>> Anyway, as an experiment. If you run 2 copies of Gnubg at the same
> >>> time (using multiple threads) do you get 400% CPU usage?
> >>>
> >>
> >>
> >> Hi Mike,
> >>
> >> Sorry for the delay. I just had two copies of gnubg analyze the same
> >> game, using 3 ply analysis. Each instance of gnubg used 200% CPU. Each
> >> copy was set to use 4 evaluation threads.
> >>
> >> So what's the verdict here? Is Leopard simply not directing threads
> >> correctly?
> >>
> >> Louis
> >>
> >> _______________________________________________
> >> Bug-gnubg mailing list
> >> [email protected]
> >> http://lists.gnu.org/mailman/listinfo/bug-gnubg

_______________________________________________
Bug-gnubg mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnubg
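
Ingo's advice in this thread to split the work manually and run one non-threaded gnubg per core could be sketched roughly as follows. This is only an illustration under assumptions: the function names write_batches and run_batches, the GNUBG variable, and the batch.N file naming are made up for this example, the binary name ./gnubg-nt is taken from Ingo's script, and the batches contain only the "import mat" and "analyze match" commands seen there.

```shell
#!/bin/bash
# Sketch: split match files round-robin into one gnubg command batch per core.
# write_batches, run_batches, GNUBG and the batch.N naming are hypothetical
# helpers for this example; they are not part of gnubg.

# write_batches NCORES OUTDIR FILE...
# Creates OUTDIR/batch.0 .. OUTDIR/batch.(NCORES-1); each file gets an
# "import mat" / "analyze match" command pair for its share of the files.
write_batches() {
    local ncores=$1 outdir=$2
    shift 2
    local i f
    for ((i = 0; i < ncores; i++)); do
        : > "${outdir}/batch.${i}"      # start every batch file empty
    done
    i=0
    for f in "$@"; do
        printf 'import mat %s\nanalyze match\n' "$f" \
            >> "${outdir}/batch.$((i % ncores))"
        i=$((i + 1))
    done
}

# run_batches NCORES OUTDIR
# Starts one gnubg process per batch file in the background, then waits
# until all of them have finished.
run_batches() {
    local ncores=$1 outdir=$2 i
    for ((i = 0; i < ncores; i++)); do
        "${GNUBG:-./gnubg-nt}" < "${outdir}/batch.${i}" > /dev/null &
    done
    wait
}
```

With an existing output directory, something like `write_batches 4 /tmp/gnubgbatch.d *.mat` followed by `run_batches 4 /tmp/gnubgbatch.d` would mirror experiment A, except that each instance analyzes a different subset of the matches. Round-robin assignment does not balance by match length, so the instances may finish at different times.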
