I've got access to a single-threaded machine and just ran the same quick test
with a few different builds (all on Windows with gcc 3.4.5 -O3):
set gnubgid 4HPwATDgc/ABMA:cAkNAAAAAAAA
eval
And I got these results:
          st      mt      mt(1)   mt(2)   mt(3)
test 1    54.254  55.259  54.000  55.094  53.379
test 2    54.098  54.972  53.974  54.755  53.096
          ------  ------  ------  ------  ------
average   54.176  55.116  53.987  54.925  53.238
% diff            1.73%   -0.35%  1.38%   -1.73%
Notes
1 Cache locking code commented out to establish whether this is the cause of
  the slower times
2 Cache locking switched off by function pointer (attempt to speed up mt times)
3 Same as 2 but with experimental position key code
The first two rows are separate test runs and the row below them is their
average; the % diff row compares each average with the st average. A quick look
shows that the runs aren't particularly accurate, but they give a rough idea.
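
For anyone wanting to re-check the arithmetic, here is a small sketch that
recomputes the averages and the % diff row from the timings in the table above.
The variable and function names are made up for illustration; only the numbers
come from the table.

```python
# Timings copied from the table above; column order: st, mt, mt(1), mt(2), mt(3).
labels = ["st", "mt", "mt(1)", "mt(2)", "mt(3)"]
run1 = [54.254, 55.259, 54.000, 55.094, 53.379]
run2 = [54.098, 54.972, 53.974, 54.755, 53.096]

# Average of the two runs for each build.
averages = [(a + b) / 2 for a, b in zip(run1, run2)]
baseline = averages[0]  # the single-threaded (st) build


def percent_diff(value, base):
    """Relative difference of value against base, in percent."""
    return (value / base - 1.0) * 100.0


# Percentage difference of each mt build against the st average.
diffs = dict(
    (label, percent_diff(avg, baseline))
    for label, avg in zip(labels[1:], averages[1:])
)

for label in labels[1:]:
    print("%-6s %+.2f%%" % (label, diffs[label]))
```

A positive value means the build took longer than st, a negative value means it
was quicker.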
The problem with (2) is that this would slow down multiple thread runs. I think
we should be optimising for multiple core use. We could duplicate even more
code and then do an evaluation either with or without cache locking (depending
on the number of eval threads), and this should give the same performance as
the single-threaded builds. Maybe some clever use of the preprocessor could
minimise the amount of duplicated source.
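
As a rough illustration of picking the locked or unlocked path from the number
of eval threads, here is a minimal sketch. It is written in Python rather than
gnubg's C, and every name in it (EvalCache, lookup, add, num_eval_threads) is
hypothetical; it shows only the dispatch idea, not the real cache code.

```python
import threading


class EvalCache(object):
    """Toy stand-in for an evaluation cache; all names are hypothetical."""

    def __init__(self, num_eval_threads):
        self._store = {}
        self._lock = threading.Lock()
        # Choose the access functions once, much like setting a function
        # pointer: locking is only paid for when several threads evaluate.
        if num_eval_threads > 1:
            self.lookup = self._lookup_locked
            self.add = self._add_locked
        else:
            self.lookup = self._lookup_unlocked
            self.add = self._add_unlocked

    def _lookup_unlocked(self, key):
        return self._store.get(key)

    def _add_unlocked(self, key, value):
        self._store[key] = value

    def _lookup_locked(self, key):
        with self._lock:
            return self._lookup_unlocked(key)

    def _add_locked(self, key, value):
        with self._lock:
            self._add_unlocked(key, value)


single = EvalCache(num_eval_threads=1)  # no locking on this path
multi = EvalCache(num_eval_threads=2)   # every access takes the lock
single.add("4HPwATDgc/ABMA", 0.5)
multi.add("4HPwATDgc/ABMA", 0.5)
```

The locked variants simply wrap the unlocked ones, which is one way to keep the
duplicated source small.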
My rewrite of the PositionKey functions seems to give about a 3% speed
increase, so with the new sigmoid function we might have a compelling reason
for people to upgrade to the latest version.
Jon
Christian Anthon wrote:
> I have timed some simple evaluations of the opening positions using
> various compile settings. The following times are reported for each of
> the compile settings.
>
> A. 3x 4ply evaluation (clearing the cache in between with a command that
> is not in the present code)
> B. 3x clearing the cache without any evaluation
> C. 1000x 2ply evaluation (clearing the cache in between)
> D. 1000x clearing the cache
>
> The lost time is from locking and unlocking the cache, I believe.
>
> threaded
> 146.307531
> 0.011090
> 104.297596
> 3.803742
>
> non-threaded
> 138.310104
> 0.010516
> 92.876412
> 3.614214
>
> threaded-sigmoidSSE
> 139.664481
> 0.011588
> 95.686871
> 3.824007
>
> non-threaded-sigmoidSSE
> 131.947215
> 0.010806
> 87.237141
> 3.605156
>
> from timeit import *
>
> gnubg.command("set gnubgid 4HPwATDgc/ABMA:cAkNAAAAAAAA")
>
> gnubg.command("set evaluation cube evaluation plies 4")
> t = Timer('gnubg.command("clear cache"); gnubg.command("eval")',
>           'import gnubg')
> print "%f" % t.timeit(3)
>
> t = Timer('gnubg.command("clear cache")', 'import gnubg')
> print "%f" % t.timeit(3)
>
> gnubg.command("set evaluation cube evaluation plies 2")
> t = Timer('gnubg.command("clear cache"); gnubg.command("eval")',
>           'import gnubg')
> print "%f" % t.timeit(1000)
>
> t = Timer('gnubg.command("clear cache")', 'import gnubg')
> print "%f" % t.timeit(1000)
>
>
>
> On Wed, Apr 29, 2009 at 2:38 PM, Massimiliano Maini wrote:
>
>
>
> Jonathan Kinsey wrote on 29/04/2009 12:54:26:
>
>
> > Massimiliano Maini wrote:
> >>
> >> Christian Anthon wrote on 29/04/2009 10:23:59:
> >>
> >>> On Wed, Apr 29, 2009 at 10:04 AM, Massimiliano Maini wrote:
> >>>
> >>> bug-gnubg-bounces+massimiliano.maini=amadeus.com@gnu.org wrote on
> >>> 28/04/2009 22:01:23:
> >>>
> >>> MaX build with single thread : ~32400 eval/s
> >>> MaX build with MT code, 1 thread : ~24800 eval/s
> >>> MaX build with MT code, 2 threads : ~34600 eval/s
> >>>
> >>> However, a quick rollout (648 trials, expert, full, 2 top moves of
> >>> position t60BYCButycAAA:cAnnAWAASAAA) has shown the following:
> >>>
> >>> MaX build with single thread : 2m04s
> >>> MaX build with MT code, 1 thread : 2m04s
> >>> MaX build with MT code, 2 threads : 1m48s
> >>>
> >>> I'm much more worried about the last two numbers here. MT code
> >>> should give close to twice the speed, or we are doing something wrong.
> >>
> >> Here at office the PC is single core, don't know if this explains the
> >> "poor" result. I'll check at home (dual core).
> >
> > You did say the pc was "1 core, 2 threads", does this mean it's a
> > hyper-threaded machine? That would match a small increase for 2 threads,
>
> Yes, 1 core with hyper-thread. I wasn't really surprised by the
> small increase.
>
> > note also that the 1
> > thread test will be using 2 threads (one for the gui and one for the
> > evaluations - the gui thread will only be redrawing the screen).
>
> I run the calibrate on the command line version and the rollout in the gui
> one. Not sure it's a big deal however ... just a progress bar and a few
> numbers updated from time to time ...
>
> > The best test would be on a simple single core/processor machine, these are
> > getting quite rare, all the pcs I see are multi-core now.
>
> MaX.
>
>
_______________________________________________
Bug-gnubg mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnubg