I can't help but feel we are missing something.   With UCT we miss those
wonderful beta cutoff's that you get with straightforward alpha beta
pruning - but with alpha beta pruning you are still exploring a lot more
useless nodes.    It seems the 2 are just not very compatible.

Currently UCT does seem like the best choice by far.
Yes, in the current state of the art. What would be great is to understand in
what extend this would continue to be true if we change the evaluation
function. What I believe (but this is only intuition, this have to be
proved !), is that UCT is very efficient in Go because our evaluation
function is bad. With a good/(very)^+ good evaluation function, alpha-beta
would become much better than UCT. I think this is clearly the case if you
can reach the end of the game in the tree for example.
I don't know if someone tried UCT in chess to see the results. As we often
assume that the evaluation function in chess is much better than in Go, it
would be a good test case?

I have not tried UCT for chess, but i have tried to combine UCT with the Suzie-Eval. Not to the end of the game, but up to 11 plies deep. I compared this with the standard 4 to 5 Plies Alpha-Beta search. All experiments were done on 19x19 (Suzie runs only at that size). First of all I had better results with iterative deepening. Searching 1,3,5,7,9,11 Plies. Each depth was given a fixed node count. Only when the best move was clearly better the next iteration depth was started earlier. It producted reasonable strategic play, some good moves, but the number of really bad moves was also higher than with Alpha-Beta. The increased search depth and the random nature of the search finds better the stupidities of the eval. I stopped this experiments, because I had/have the feeling that the playing behaviour is dominated in all cases by the eval and that I should spent effort there. Additionally with a shallow Alpha-Beta search it is easier to find the reason of bad moves. One can go down the criticial line and can look what went wrong. With UCT this is almost impossible. I tried out also some sort of Alpha-Cutoff. As long as the current best move in a node was good enough to refute the line, no new move was tried out. The whole search was a (strange) mixture of the UCT idea and Alpha-Beta.
The Suzie eval is also too slow for statistic sampling on 19x19.


In hindsight I am also from the theoretic point of view sceptical about the idea. One probably gets the Austrian-Opinion-Poll problem. It seems to be very difficult to create an unbiased eval. It is also a problem of weighting. One extreme value can dominate the other more realistic ones. One has certainly to introduce some form of robust statistics. E.g. using the median instead of the mean or ignoring the minimum and the maximum value.


Chrilly

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to