Hi,

The strength of an MC program depends a lot on playout knowledge. If you select legal moves uniformly at random, excluding only eye-filling moves, then it is normal that your program is weak. You should at least reply to atari by extending or capturing a contiguous string. Avoiding self-atari of large strings and playing good patterns with higher probability help a lot, too.
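As a rough illustration of the kind of playout knowledge described above, here is a minimal sketch of weighted move selection. Everything in it is an assumption made for the example: the `Candidate` fields, the weight values, and the size threshold for "large" strings are hypothetical, and a real program would compute these features from its own board representation and tune the weights by testing.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A legal move plus hypothetical features computed by the board code."""
    move: str
    fills_own_eye: bool = False       # move fills one of our own eyes
    saves_atari_string: bool = False  # extends out of atari or captures the attacker
    self_atari_size: int = 0          # stones we would put into atari (0 = none)
    good_pattern: bool = False        # matches a known good local pattern


def move_weight(c):
    """Relative playout probability of a move (illustrative values only)."""
    if c.fills_own_eye:
        return 0.0        # never fill our own eyes
    w = 1.0               # baseline: uniform random
    if c.saves_atari_string:
        w *= 20.0         # strongly prefer answering atari
    if c.good_pattern:
        w *= 5.0          # prefer good patterns
    if c.self_atari_size >= 3:
        w *= 0.05         # mostly avoid self-atari of large strings
    return w


def choose_playout_move(cands, rng):
    """Sample a move in proportion to its weight; None means pass."""
    weights = [move_weight(c) for c in cands]
    if not any(w > 0 for w in weights):
        return None
    return rng.choices(cands, weights=weights, k=1)[0].move
```

With all weights equal to 1 this reduces to the uniform playout with eye-filling excluded, so the heuristics can be switched on one at a time and measured separately.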
Rémi

On 26 March 2011, at 12:43, Daniel Shawul wrote:

> Hello,
>
> I am very new to UCT and implemented basic UCT for Go only yesterday,
> so far with no success. I think this is mostly because it does not
> search very deep (depth = 3 on a 5 sec search with the values below).
> I am using the following values as UCT parameters:
>
> UCTK = sqrt(1/5) = 0.44
> UCTN = 10 (visits after which the best move is expanded)
>
> Even if I lower UCTK down to 7 I get a maximum depth of d=7 at the
> start position for a 5 sec search.
> How deep a search should I tune these parameters for?
> Before UCT, I had an alpha-beta searcher which sometimes plays on CGOS.
> It reached a level of ~1500, and this engine seems to be too strong
> for the UCT version.
> The UCT version just gets outsearched in some tactical positions, and
> I think also loses out in evaluation.
> For example, I have an evaluation term which gives big bonuses for
> connected strings, which seems to give an edge in a lot of games.
> How do you introduce such eval terms in UCT?
>
> But for my checkers program, to my big surprise, UCT made a
> significant impact. The regular alpha-beta searcher averages
> depth = 25, but the UCT version is, I think, equally strong judging
> from the games I saw. That was a surprise for me because I thought
> UCT would work better for bushy trees and when the eval has a lot of
> strategy. It also reached good depths, averaging 16 plies.
> My checkers eval has only material in it, so I don't know whether UCT
> is bringing strategy (distant information) to the game which the
> other one doesn't have. The games are not really played out to the
> end, but rather to MAX_PLY = 96, after which the material is counted
> and a WDL score is assigned (I call it a partial playout).
> Also, the fact that captures are forced seems to help a lot, because
> it doesn't make too many mistakes.
>
> I also found some positions where it encounters problems similar to
> ladders in Go. But in the checkers case, these problems are still
> solved correctly. The only problem is that it doesn't report
> correct-looking winning rates.
> For example, take a position with two kings where one king is chasing
> the other to the side to mate it, but the losing king can draw by
> making a series of correct moves to get itself to one of the safe
> corners. The program displays winning rates of 0.01 (when it should
> have been more like 0.5), but it still manages the draw!
>
> Thanks, and apologies for the verbose email.
> Daniel
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
