Re: [Computer-go] UCT parameters and application to other games

Rémi Coulom Sat, 26 Mar 2011 05:47:59 -0700

Hi,

The strength of an MC program depends a lot on playout knowledge. If you select 
legal moves uniformly at random, excluding only eye-filling moves, then it is 
normal that your program is weak. You should at least reply to atari by 
extending or by capturing a contiguous string. Avoiding self-atari of large 
strings and playing good patterns with higher probability help a lot, too.
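
As a rough illustration of such a playout policy, here is a minimal Python sketch that samples moves in proportion to a weight instead of uniformly. The feature names, the weight values, and the idea of passing precomputed feature sets are all assumptions made for this example; computing the features from a real board is left out, and a strong program would tune the weights.

```python
import random

# Hypothetical feature weights; the values are illustrative, not tuned.
WEIGHTS = {
    "captures_contiguous_string": 20.0,  # reply to atari by capturing
    "extends_from_atari": 10.0,          # reply to atari by extending
    "good_pattern": 3.0,                 # move matches a good local pattern
}
SELF_ATARI_PENALTY = 0.05  # multiplier for self-atari of a large string


def move_weight(features):
    """Return the sampling weight of one candidate move.

    `features` is a set of feature names describing the move.
    """
    weight = 1.0  # baseline: a legal move that does not fill an eye
    for name, bonus in WEIGHTS.items():
        if name in features:
            weight *= bonus
    if "self_atari_large_string" in features:
        weight *= SELF_ATARI_PENALTY
    return weight


def choose_playout_move(candidates, rng=random):
    """Pick a move with probability proportional to its weight.

    `candidates` maps each legal, non-eye-filling move to its feature set.
    """
    moves = list(candidates)
    weights = [move_weight(candidates[m]) for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]
```

With all weights equal to 1 this reduces to the uniform policy described above, so the knowledge can be added one feature at a time.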


Rémi

On 26 March 2011, at 12:43, Daniel Shawul wrote:

> Hello,
> 
> I am very new to UCT; I implemented basic UCT for Go just yesterday.
> So far it has not been successful for Go, I think mostly because it does not
> search very deep (depth = 3 in a 5-second search with the values below).
> I am using the following values as UCT parameters:
> 
> UCTK = sqrt(1/5) ≈ 0.45     UCTN = 10 (visits after which the best move is 
> expanded)
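
For reference, here is a minimal Python sketch of how two such parameters are commonly used. It assumes the standard UCB1 form of the selection rule, which may differ from the formula actually used in the program above; the `Node` class and the function names are hypothetical, and `wins` is assumed to be counted from the point of view of the player choosing at the parent.

```python
import math

UCTK = math.sqrt(1 / 5)  # exploration constant quoted in the message
UCTN = 10                # visits a leaf needs before it is expanded


class Node:
    """Hypothetical tree node holding playout statistics."""

    def __init__(self, move=None):
        self.move = move
        self.visits = 0
        self.wins = 0.0
        self.children = []


def uct_value(parent, child, uctk=UCTK):
    """Mean reward plus an exploration bonus (standard UCB1 form)."""
    if child.visits == 0:
        return math.inf  # try every child at least once
    mean = child.wins / child.visits
    return mean + uctk * math.sqrt(math.log(parent.visits) / child.visits)


def select_child(parent, uctk=UCTK):
    """Descend to the child with the highest UCT value."""
    return max(parent.children, key=lambda c: uct_value(parent, c, uctk))


def should_expand(node, uctn=UCTN):
    """Expand a leaf only once it has been visited at least UCTN times."""
    return not node.children and node.visits >= uctn
```

In this form, a smaller UCTK makes the search more selective and therefore deeper along the preferred line, while a larger UCTN delays expansion and keeps the tree shallower.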
> 
> Even if I lower UCTK down to 7 I get a maximum depth of d=7 at the start 
> position for a 5 sec search.
> What search depth should I tune these parameters for?
> Before UCT, I had an alpha-beta searcher that sometimes plays on CGOS. 
> It reached a level of ~1500, and this engine seems to be too strong for the 
> UCT version.
> The UCT version just gets outsearched in some tactical positions, and I think 
> it is also weaker in evaluation.
> For example, I have an evaluation term that gives big bonuses for connected 
> strings, which seems
> to give an edge in a lot of games. How do you introduce such eval terms in 
> UCT?
> 
> But for my checkers program, to my big surprise, UCT made a significant 
> impact. The regular
> alpha-beta searcher averages a depth of 25, but I think the UCT version is 
> equally strong, judging from the games
> I saw. That surprised me because I thought UCT would work 
> better for bushy trees and
> when the eval has a lot of strategy. It also reached good depths, averaging 16 
> plies.
> My checkers eval has only material in it, so I don't know whether UCT is 
> bringing strategy (distant information) to the game
> that the other one doesn't have. The games are not really played out to the 
> end but rather to MAX_PLY = 96,
> after which the material is counted and a WDL score is assigned (I call it 
> a partial playout).
> Also, the fact that captures are forced seems to help a lot because it doesn't 
> make too many mistakes.
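
The partial playout described above could be sketched in Python roughly as follows. The state interface (`is_terminal`, `result`, `legal_moves`, `play`, `material_balance`) is hypothetical, uniform random move choice is an assumption, and both `result()` and `material_balance()` are assumed to be reported from one fixed player's point of view.

```python
import random

MAX_PLY = 96  # playout length cap quoted in the message


def wdl_from_material(balance):
    """Map a material balance to a win/draw/loss score of 1.0, 0.5 or 0.0."""
    if balance > 0:
        return 1.0
    if balance < 0:
        return 0.0
    return 0.5


def partial_playout(state, max_ply=MAX_PLY, rng=random):
    """Play random moves for at most max_ply plies, then score by material.

    If the game ends earlier, the real game result is returned instead.
    """
    for _ in range(max_ply):
        if state.is_terminal():
            return state.result()
        state.play(rng.choice(state.legal_moves()))
    if state.is_terminal():
        return state.result()
    return wdl_from_material(state.material_balance())
```

Because forced captures are part of move generation in checkers, `legal_moves()` would return only capturing moves whenever a capture is available, so the random playout cannot ignore them.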
> 
> I also found some positions where it encounters problems similar to 
> ladders in Go. But in the checkers case,
> these problems are still solved correctly. The only problem is that it doesn't 
> report correct-looking winning rates.
> For example, take a position with two kings where one king is chasing 
> the other to the side to mate it, but
> the losing king can draw by making a series of correct moves to get itself 
> to one of the safe corners. The program
> displays winning rates of 0.01 (when it should have been more like 0.5), but 
> it still manages the draw!
> 
> Thanks, and apologies for the verbose email.
> Daniel
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

