Hi Ingo, 

I don't use gnubg for analysis. I just happened to notice that gnubg seems to 
use just one core on my machine, which has eight physical cores. No matter how 
many threads I configure, they all appear to run on one core (or maybe two). 


When I use the chess program crafty to analyze positions, and I have it use 8 
threads for evaluation, the top command shows CPU%=800%. If I have it use 16 
threads, the top command shows 1600% (which must mean that 16 virtual cores are 
being used, via hyperthreading). 
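
As a rough aid for reading those figures, here is a minimal sketch that turns a 
top-style %CPU value into an approximate count of fully busy logical CPUs. It 
assumes top reports the sum over all logical CPUs (100% per CPU); the helper 
name busy_cpus is hypothetical and not part of any tool. 

```shell
#!/bin/bash
# Hypothetical helper: convert a top-style %CPU value into the approximate
# number of fully busy logical CPUs, assuming 100% corresponds to one CPU.
busy_cpus() {
    awk -v p="$1" 'BEGIN { printf "%.1f\n", p / 100 }'
}

busy_cpus 800     # crafty with 8 threads
busy_cpus 1600    # crafty with 16 threads
busy_cpus 188     # the figure reported for gnubg with 64 eval threads
```

Read this way, 188% is a little under two logical CPUs' worth of work, whatever 
the number of threads. 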


My point is simply that threads are not being distributed among cores on my new 
Nehalem Mac Pro. 


I'm hoping someone will help me understand why, just so I know. 


Louis 





----- Original Message ----- 
From: "Ingo Macherius" <[email protected]> 
To: "Louis Zulli" <[email protected]> 
Cc: "Michael Petch" <[email protected]>, [email protected] 
Sent: Thursday, August 6, 2009 2:43:31 PM GMT -05:00 US/Canada Eastern 
Subject: RE: [Bug-gnubg] Re: Getting gnubg to use all available cores 

I've done two experiments on Debian 5.0.2 on my 2xXeon 5130 box (4 cores in 
2 chips). 

A) Run 4 instances of gnubg analyzing 5 matches with 7pt each (real fibs 
matches) 
B) Run 1 instance of gnubg analyzing the same matches 4 times each (with 
caches flushed in between) with 4 threads 

=> The amount of work done is equal: each experiment analyzes 20 7pt 
matches. 

Run A takes about 10.3 secs real time (wallclock), Run B takes about 13.2 
secs. These are averaged values over several runs; there were no significant 
outliers. 
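
To put the two timings in relation, here is a small sketch of the arithmetic. 
The helper name slowdown_pct is made up for illustration; the inputs are the 
two averages quoted above. 

```shell
#!/bin/bash
# Hypothetical helper: by how many percent is run B slower than run A?
# Arguments: wallclock seconds of A, wallclock seconds of B.
slowdown_pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b / a - 1) * 100 }'
}

slowdown_pct 10.3 13.2
```

With these numbers the threaded run comes out roughly 28% slower than the four 
independent instances. 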

I've used the "top" and "mpstat" commands running while both experiments 
took place. I've noticed that both sort of smooth the % CPU usage they 
display over a sliding window. When you start either A or B, the displayed % 
usage remains low for some secs, gradually goes up, and peaks shortly before 
the run ends and then falls to zero again. In other words: the % displayed 
is an average over the last n seconds, not the current usage. 
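
The lag can be illustrated with a toy model. This sketch assumes the display is 
a plain trailing average over the last n samples; that is only my assumption 
for illustration, not the documented algorithm of top or mpstat, and the 
function name moving_avg is hypothetical. 

```shell
#!/bin/bash
# Toy model: trailing average over the last n samples.
# Usage: moving_avg N SAMPLE...
# Samples before the start of the series count as 0, so the displayed value
# ramps up slowly even if the real usage jumps to 100% at once.
moving_avg() {
    local n="$1"
    shift
    printf '%s\n' "$@" | awk -v n="$n" '
        {
            buf[NR % n] = $1                 # ring buffer holding the last n samples
            sum = 0
            for (i = 0; i < n; i++) sum += buf[i]
            printf "%s%.0f", (NR > 1 ? " " : ""), sum / n
        }
        END { printf "\n" }'
}

# Real usage: idle, then four samples at 100%, then idle again.
moving_avg 4 0 100 100 100 100 0 0
```

In this model the displayed value only reaches 100 on the last busy sample and 
is still at 50 two samples after the work has stopped. 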

So gnubg is utilizing the cores well in both cases; what you see is the 
"beautifying" effect of top. Run a longer batch that takes, say, several 
minutes and you will see the 100% CPU usage (Case A) or near 90% CPU usage 
(Case B) you expect. 

The fact that threaded gnubg cannot utilize the cores as effectively as 4 
parallel non-threaded gnubgs can be explained by these assumptions of mine: 
- The system scheduler does a better job than gnubg's 
- There is overhead for threading even in multithreaded binaries: a gnubg 
compiled without threading is some % faster than one compiled with threading 
but running in one thread only. 
- Thread synchronization causes some amount of idle time 
- Thread synchronization causes some amount of idle time 

So if you want to utilize your CPU the best, run as many gnubg instances in 
parallel as there are cores. Compile them without threading support. Split 
the work to be analyzed into batches manually. If you prefer the more 
comfortable way of letting gnubg do the scheduling of your batches, accept 
the reasonable penalty for that. 
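
One possible way to do the manual split is sketched below. The function name 
make_batches and the batch file naming are hypothetical; the "import mat" and 
"analyze match" lines are the same gnubg commands used in my batch file. Match 
files are dealt out round-robin, one command batch per instance. 

```shell
#!/bin/bash
# Sketch: write one gnubg command batch per instance.
# Usage: make_batches PREFIX N FILE...
# Creates PREFIX.0 .. PREFIX.(N-1), dealing the match files out round-robin.
make_batches() {
    local prefix="$1" n="$2" i f
    shift 2
    for (( i = 0; i < n; i++ )); do
        : > "${prefix}.${i}"              # create or truncate each batch file
    done
    i=0
    for f in "$@"; do
        printf 'import mat %s\nanalyze match\n' "$f" >> "${prefix}.$(( i % n ))"
        i=$(( i + 1 ))
    done
}
```

Each resulting file can then be fed to its own non-threaded instance, e.g. 
"./gnubg-nt < /tmp/gnubgbatch.0 > /dev/null &" and so on, followed by "wait". 
Round-robin ignores match length, so the batches may be uneven if the matches 
differ a lot in size. 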

Ingo 

P.S. The batch I've used to run 4 in parallel is 

#!/bin/bash 

TMPBATCH=/tmp/gnubgbatch 

# Write the gnubg command batch: reset state, then import and analyze 5 matches. 
cat > "${TMPBATCH}" <<'EOF'
set cache 131072
clear cache
clear hint
analysis clear
import mat You_vs_silent_greek_20090802162313580.mat
analyze match
import mat You_vs_silent_greek_20090521010117727.mat
analyze match
import mat You_vs_sale_20090803190848830.mat
analyze match
import mat You_vs_fortuna_20090802225906909.mat
analyze match
import mat You_vs_VaGrant_20090103173459437.mat
analyze match
EOF

# Start three instances in the background and a fourth in the foreground. 
for t in 1 2 3 
do 
    ./gnubg-nt < "${TMPBATCH}" > /dev/null & 
done 
./gnubg-nt < "${TMPBATCH}" > /dev/null 
wait    # do not exit until the background instances have finished too 

> -----Original Message----- 
> From: Louis Zulli [mailto:[email protected]] 
> Sent: Thursday, August 06, 2009 6:29 PM 
> To: Ingo Macherius 
> Cc: 'Michael Petch'; [email protected] 
> Subject: Re: [Bug-gnubg] Re: Getting gnubg to use all available cores 
> 
> 
> Hi, 
> 
> I put 
> 
> #define MAX_NUMTHREADS 64 
> 
> in multithread.h and rebuilt. 
> 
> In Settings-->Options-->Other, I put Eval Threads to 64. 
> 
> I then let gnubg analyze a game using 4-ply analysis. 
> 
> According to my unix top command, gnubg had 69 threads and was using 
> 188% CPU. So apparently all the threads were running (into each other!) 
> in one physical core. 
> 
> In any case, increasing the max number of threads above 16 seems 
> trivial to do, unless I'm missing something. 
> 
> Louis 
> 
> 
> On Aug 6, 2009, at 11:34 AM, Ingo Macherius wrote: 
> 
> > Do you use the calibrate command or a batch analysis of matchfiles? The 
> > former was shown to be of no value for benchmarks, see here: 
> > http://lists.gnu.org/archive/html/bug-gnubg/2009-08/msg00006.html 
> > 
> > With calibrate I had the very same effect of high idle times during 
> > benchmarks, unless I used at least 8 threads per physical core. 
> > 
> > I am running a benchmark on a 4 core machine which iterates over #threads 
> > (1..6) and cache size (2^1 .. 2^27). It should be posted in, say, 3 hours; 
> > it literally is still running :) 
> > 
> > Ingo 
> > 
> >> -----Original Message----- 
> >> From: [email protected] 
> >> [mailto:[email protected]] On Behalf Of 
> >> Louis Zulli 
> >> Sent: Thursday, August 06, 2009 3:21 PM 
> >> To: Michael Petch 
> >> Cc: [email protected] 
> >> Subject: [Bug-gnubg] Re: Getting gnubg to use all available cores 
> >> 
> >> 
> >> 
> >> On Aug 5, 2009, at 4:02 PM, Michael Petch wrote: 
> >> 
> >>> I'm unsure how the architecture is deployed and how OS/X handles the 
> >>> physical cores, but it almost sounds like one physical core is being 
> >>> used (using Hyperthreads to run 2 threads simultaneously). I wonder 
> >>> if the memory is shared across all the cores? A friend of mine was 
> >>> suggesting that people may have to wait for Snow Leopard to come out 
> >>> before OS/X properly utilizes the Nehalem architecture (whether that 
> >>> is true or not, I don't know). 
> >>> 
> >>> Anyway, as an experiment: if you run 2 copies of Gnubg at the same 
> >>> time (using multiple threads), do you get 400% CPU usage? 
> >>> 
> >> 
> >> 
> >> Hi Mike, 
> >> 
> >> Sorry for the delay. I just had two copies of gnubg analyze the same 
> >> game, using 3 ply analysis. Each instance of gnubg used 200% CPU. Each 
> >> copy was set to use 4 evaluation threads. 
> >> 
> >> So what's the verdict here? Is Leopard simply not directing threads 
> >> correctly? 
> >> 
> >> Louis 
> >> 
> >> 
> >> 
> >> 
> >> 
> >> _______________________________________________ 
> >> Bug-gnubg mailing list 
> >> [email protected] http://lists.gnu.org/mailman/listinfo/bug-gnubg 
> > 
> 

