Quoting Bojun Huang <[email protected]>:
It seems to me that there is a line of work that tries to improve
the playing strength of Go bots by dramatically increasing
playouts/sec. We now know that FPGAs, GPUs, and SIMD can run many
more playouts per second than a single-core CPU, but all these results
are based on "light" playout schemes. So every time results of this
kind come out, people doubt whether these designs can really
produce strong programs.
So my question is: is there a "widely accepted" baseline performance
against which all of this work can be compared?
For example, we could pick a known program with the "lightest" playout
scheme among those that regularly take part in the KGS monthly. Then,
if a high-performance design implements a playout scheme similar to
that program's but achieves much higher playouts/sec, we could
reasonably expect a stronger program based on this design.
Another question ... do more playouts really provide a
*consistent* improvement in Elo rating, especially for the
strongest programs? I remember that some programs running on a laptop
ranked very high in the Olympiads, which seems to imply that speed
simply doesn't matter here ...
Valkyria is very heavy, but speed still matters. If Valkyria is
allowed to search twice as deep, it plays much better. And I think this
is true for all the strong and heavy programs.
If an MCTS program does not play better when it is allowed to search
deeper (either by being faster or by having longer thinking times),
there is a bug in the program that prevents progress.
Best
Magnus
Thanks,
Bojun Huang
Date: Wed, 25 May 2011 22:23:29 +0200
From: Antoine de Maricourt <[email protected]>
To: [email protected]
Subject: Re: [Computer-go] Direct DX11 and graphics cards for cheaper
simulation hardware?
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Despite the challenges using it in a tree, and the contentious issue of
whether light playouts can make a really strong program, I think this is
interesting research. By 1.6 times quicker than libego, do you mean as
it runs on the CPU? Or is this a simulated speed as if it was running on
the GPU? I think libego was the clear leader in light playout speed, so
working out a way to do playouts even faster (if that is what you have
done) is amazing.
I just emulated, in C++ on a CPU, data structures and algorithms that
target the GPU. The CPU's 128-bit SIMD instruction set simply emulates
4 GPU-like threads working on 32-bit registers. After several attempts
to test various ideas, the first complete implementation performed
about as well as libego, without a single CPU-specific
optimization. I then put back some CPU-specific optimizations (not
likely to be effective on a GPU), did some tuning, and easily improved
performance. This is really how it runs on the CPU. The same data
structures and algorithms are likely to achieve an even better ratio
against libego on an AVX-enabled processor.
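
As a rough illustration of the "4 GPU-like threads on 32-bit registers"
idea (a sketch only, not the code discussed in this message), the
fragment below advances four independent random number generators in
lockstep. A plain 4-element array stands in for one 128-bit SIMD
register so that the example stays portable; the names Lanes4,
xorshift32, and step_lanes are hypothetical, and the choice of the
xorshift32 generator is an assumption made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Four 32-bit lanes: a portable stand-in for one 128-bit SIMD register.
using Lanes4 = std::array<std::uint32_t, 4>;

// Scalar reference: one step of the xorshift32 generator (shifts 13, 17, 5).
// The state must be nonzero, otherwise the generator stays at zero forever.
std::uint32_t xorshift32(std::uint32_t x) {
    assert(x != 0);
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

// Advance all four lanes in lockstep: every lane executes the same
// operation at the same time, as GPU-like threads would. Each loop
// corresponds to what would be a single SIMD instruction sequence.
void step_lanes(Lanes4& s) {
    for (auto x : s) assert(x != 0);
    for (auto& x : s) x ^= x << 13;
    for (auto& x : s) x ^= x >> 17;
    for (auto& x : s) x ^= x << 5;
}
```

Each lane gives the same result as the scalar function applied to that
lane's seed, so the four "threads" stay independent while sharing one
instruction stream.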
Light playouts were only a starting point. The random move generator
has been designed to take into account a probability distribution (at
the cost of a small slowdown) that can be derived from local pattern
matching.
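
One common way to draw a move from such a probability distribution is
to walk the cumulative sum of per-move weights. The sketch below is an
assumption about how this could look, not the generator described
above: pick_weighted is a hypothetical helper, the weights are taken to
be non-negative integers (for instance, scores from a pattern table),
and r is any 32-bit random number. Reducing r with a modulo introduces
a small bias, which is ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the chosen move, or -1 if no move has a
// positive weight. A move with weight w is chosen with probability
// roughly w / (sum of all weights).
int pick_weighted(const std::vector<std::uint32_t>& weights, std::uint32_t r) {
    std::uint64_t total = 0;
    for (std::uint32_t w : weights) total += w;
    if (total == 0) return -1;

    // Map the random number into [0, total) and find the move whose
    // cumulative weight interval contains it.
    std::uint64_t target = r % total;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        if (target < weights[i]) return static_cast<int>(i);
        target -= weights[i];
    }
    assert(false);  // not reached: target < total guarantees a hit above
    return -1;
}
```

With uniform weights this reduces to a light playout policy; the extra
pass over the weights is one source of the slowdown relative to picking
a move uniformly at random.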
Regards,
Antoine
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go