Thank you Sylvain, your explanation is enough for me.

--
Yamato
>Hi Yamato,
>
>> However I cannot find any explanation for it.
>> Does anyone know what Discounted UCB is?
>
>"Discounted" means you somehow forget the past. More precisely, if "w"
>is your count of wins, "t" your total playouts, and "r" the
>result of the current simulation, instead of doing:
>
>w <- w + r
>t <- t + 1
>
>you do, with gamma < 1:
>
>w <- gamma * w + r
>t <- gamma * t + 1
>
>So it is as if you kept a memory of the order of 1/(1-gamma) playouts.
>
>> Is it useful for MC Go?
>The idea is appealing for UCT, as the distribution of the arms is not
>stationary, and discounting is the simplest idea to deal with
>non-stationarity.
>However, all my trials in this direction have been unsuccessful. Maybe
>some have succeeded; I don't know.
>
>I hope that makes things clearer,
>Sylvain
>_______________________________________________
>computer-go mailing list
>[email protected]
>http://www.computer-go.org/mailman/listinfo/computer-go/
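
For anyone reading the archive later, here is a minimal Python sketch of the discounted update quoted above. It only illustrates the two update lines and the 1/(1-gamma) memory claim; the class name DiscountedArm, the helper effective_memory, and the value gamma = 0.9 are my own assumptions for illustration, not something taken from Sylvain's program or from any library. It does not include the UCB exploration term.

```python
class DiscountedArm:
    """Win/playout statistics for one arm (move), with optional forgetting.

    Hypothetical helper class, written only to illustrate the update rule.
    """

    def __init__(self):
        self.wins = 0.0    # "w" in the message
        self.visits = 0.0  # "t" in the message

    def update(self, result, gamma=1.0):
        # gamma = 1 gives the usual update; gamma < 1 discounts the past.
        self.wins = gamma * self.wins + result
        self.visits = gamma * self.visits + 1.0

    def mean(self):
        # Estimated win rate; 0.0 before any playout (arbitrary choice).
        return self.wins / self.visits if self.visits > 0 else 0.0


def effective_memory(gamma):
    """Limit of the discounted count t: 1 + gamma + gamma^2 + ... = 1/(1-gamma)."""
    return 1.0 / (1.0 - gamma)


def run(results, gamma):
    """Feed a sequence of playout results (0 or 1) into a fresh arm."""
    arm = DiscountedArm()
    for r in results:
        arm.update(r, gamma)
    return arm


if __name__ == "__main__":
    # A non-stationary arm: 200 losses followed by 200 wins.
    results = [0.0] * 200 + [1.0] * 200
    plain = run(results, gamma=1.0)
    discounted = run(results, gamma=0.9)
    print("plain mean:", plain.mean(), "count:", plain.visits)
    print("discounted mean:", discounted.mean(), "count:", discounted.visits)
    print("1/(1-gamma):", effective_memory(0.9))
```

With gamma = 1 the arm averages over all 400 playouts, while with gamma = 0.9 the discounted count settles near 1/(1-gamma) = 10, so the estimate follows roughly the last ten results. That is the sense in which discounting tracks a non-stationary arm, at the price of a noisier estimate.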
