> I just wondered if this new Fermi GPU solves the issues for go playouts, or don't really make any difference?
My first impression of Fermi is very positive. Fermi contains a lot of features that make general purpose computing on a GPU much easier and better performing.

However, it remains the case that all threads in a warp (a group of 32 threads scheduled together on a multiprocessor) must execute the same instruction on each cycle. When executing if/then/else logic, this implies that if *any* thread needs to execute a branch, then *all* threads in the warp must wait for those instructions to complete.

Playout policies have a lot of if/then/else logic. Sequential processors handle such code quite well, because most of it isn't executed. But when you have 32 playouts executing in parallel, there is a high chance that both branches will be needed. This really cuts into the potential gain.

Amdahl's law is a factor as well. Amdahl's law says that the gain from parallelization is limited when some aspects of the solution execute sequentially. For example, the CPU has to generate positions and transfer them to the GPU for playout. Generation and transfer are sequential. Because of such overhead, massively parallel programs generally need very large speedups in the parallel portion to come out ahead.

Clock speed is also a factor. CPUs execute at over 3 GHz, and because of superscalar and speculative execution they often execute more than one instruction per clock. The GPU generally has a clock rate of about 1 GHz, and most general purpose instructions require multiple clocks. So you must have a large parallel speedup just to break even. (Unless you can exploit some of the specialized graphics instructions, such as texture mapping, that equate to dozens of sequential instructions yet execute as a single instruction on the GPU. I don't think computer Go has that possibility.)

So I am not convinced yet, but Fermi is a big step (really many small steps) in the right direction.
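
To put rough numbers on the branch divergence and Amdahl's law points above, here is a minimal Python sketch. It is only a back-of-the-envelope model, not a measurement: the function names are made up for illustration, the 512-worker count and the parallel fractions are arbitrary example values, and the divergence estimate assumes each of the 32 playouts takes a given branch independently with the same probability, which real playout policies will not satisfy exactly.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Overall speedup when only `parallel_fraction` of the work scales across `workers`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def prob_both_branches(p_taken: float, lanes: int = 32) -> float:
    """Chance that at least one lane takes the branch and at least one does not.

    Assumption (for illustration only): each lane takes the branch
    independently with probability `p_taken`.
    """
    return 1.0 - p_taken ** lanes - (1.0 - p_taken) ** lanes


if __name__ == "__main__":
    # Illustrative values only; 512 workers is an arbitrary example count.
    for fraction in (0.90, 0.99):
        speedup = amdahl_speedup(fraction, 512)
        print(f"parallel fraction {fraction:.2f}: speedup on 512 workers = {speedup:.1f}x")

    for p in (0.01, 0.10, 0.50):
        both = prob_both_branches(p)
        print(f"branch taken with p={p:.2f}: both paths needed with prob {both:.3f}")
```

Under these assumptions, even a 99% parallel workload cannot exceed a 100x overall speedup no matter how many workers are added, and a branch that a single playout takes only 10% of the time ends up needing both paths in more than 96% of 32-wide groups.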
