> Now it seems to me that this is related to the way playouts are done
> and it will be difficult to improve with Mogo style (rule-based)
> playouts above certain strength, without using larger patterns and next
> move choice based on probability distribution. Currently, playing out
> a simple joseki in a sensible way in simulations will just never happen.
> This is a bit frustrating since all my attempts at successfully
> implementing probdist-based playouts have failed so far, but I guess
> I will just have to try again...
To implement softmax, you can refer to my thesis, where I describe the framework of the move generator for the playout. Detecting forbidden moves and replacing useless moves with better alternatives are both very useful; you can gain a lot there by applying plenty of Go knowledge.

Two good candidate algorithms for training the feature weights are MM and SB (Simulation Balancing). I tried hard but failed to measure any improvement from SB gammas (trained on 9x9) on 19x19. You can use CLOP to tune the MM gammas, which are far from optimal in our experience.

Also, my regression tests for seki and L&D, in which Pachi has participated, could be helpful for improving your program's tactical strength. In my opinion, that is the most crucial factor in reaching high-dan level.

Cheers,
Aja
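
For illustration, gamma-weighted (softmax) move selection in a playout could be sketched roughly as below. This is not the framework from the thesis, only a minimal sketch under assumptions: the names choose_playout_move, candidates, gammas and is_forbidden are hypothetical, each candidate move is assumed to come with a list of matched feature names, and each feature is assumed to have a positive gamma (as trained by MM or SB, say). A move's weight is the product of its gammas, which equals a softmax over summed log-gammas once normalized.

```python
import random


def choose_playout_move(candidates, gammas, is_forbidden, rng=random):
    """Sample one move with probability proportional to the product of
    the gammas of its features (hypothetical sketch, not a real API).

    candidates   -- dict: move -> list of feature names matched by that move
    gammas       -- dict: feature name -> positive weight (assumed trained elsewhere)
    is_forbidden -- callable(move) -> bool, e.g. filling one's own eye
    rng          -- anything with a random() method returning a float in [0, 1)
    """
    weights = {}
    for move, features in candidates.items():
        if is_forbidden(move):
            continue  # forbidden moves never enter the distribution
        w = 1.0
        for f in features:
            w *= gammas.get(f, 1.0)  # unknown features are treated as neutral
        if w > 0.0:
            weights[move] = w

    if not weights:
        return None  # nothing playable: the caller may treat this as a pass

    # Roulette-wheel selection over the unnormalized weights.
    r = rng.random() * sum(weights.values())
    chosen = None
    for move, w in weights.items():
        chosen = move
        r -= w
        if r < 0.0:
            break
    return chosen  # falls back to the last move if rounding left r >= 0
```

With gammas = {"atari": 9.0} and two candidates, one matching "atari" and one matching nothing, the first move should be picked about nine times out of ten. Replacing useless moves with better alternatives would be a further step after sampling and is not shown here.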
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
