Re: [Computer-go] Exploration formulas for UCT

Aja Sat, 01 Jan 2011 19:41:27 -0800

Hi Hiroshi,

(1 - beta) * (win_rate + 0.31 * sqrt(ln(parent_visits) / child_visits)) + beta * (rave_win_rate + 0.31 * sqrt(ln(rave_parent_visits) / rave_child_visits))

I suggest removing the exploration term from the RAVE part, just as Silver suggested in his PhD thesis. An exploration term for RAVE is somewhat meaningless, since normally all moves in a node are updated at the same time.
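
A minimal sketch of this suggestion, assuming the same variable names as the formula quoted above. The function name uct_rave_value and the parameter c are hypothetical, and the code assumes child_visits > 0 and parent_visits >= 1. It keeps the exploration term on the UCT part only and uses the plain RAVE win rate; it is not taken from Erica or Aya.

```python
import math

def uct_rave_value(win_rate, child_visits, parent_visits,
                   rave_win_rate, beta, c=0.31):
    """Blend the UCT and RAVE estimates, exploring only on the UCT side.

    Assumes child_visits > 0 and parent_visits >= 1 (hypothetical
    preconditions; a real program needs a rule for unvisited children).
    """
    # Exploration bonus computed from the real (non-RAVE) visit counts.
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    # The RAVE part contributes only its win rate, with no bonus.
    return (1.0 - beta) * (win_rate + exploration) + beta * rave_win_rate
```

With beta = 1 the value is just rave_win_rate, and with beta = 0 it reduces to the usual UCT value.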

UCT searches B(E5),W(D3),B(C5),W(F7), and in this position, playout searches
B(E7),W(E8),B(D8),W(F8),B(D7)...Black win.

In W(D3) positions, Aya updates RAVE and UCT,
Updates  C5(UCT)
Updates  C5(RAVE)
Updates  E7(RAVE)
Updates  D8(RAVE)
Updates  D7(RAVE)
I think "Updates C5(RAVE)" is strange, but I could not get good result without this.

I can't see why it is strange, and I wonder why you think so. In Erica, I update C5(RAVE) as well.
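
The update at the W(D3) position can be sketched as follows. This is only an illustration under stated assumptions: the Node class, its field names, and update_node are hypothetical, moves are plain strings, and each move is counted once per playout. It is not the actual code of Aya or Erica.

```python
from collections import defaultdict

class Node:
    """Per-move statistics stored at one tree position (hypothetical layout)."""
    def __init__(self):
        self.uct_visits = defaultdict(int)
        self.uct_wins = defaultdict(float)
        self.rave_visits = defaultdict(int)
        self.rave_wins = defaultdict(float)

def update_node(node, moves_after, result):
    """Update UCT and RAVE statistics at one node after a playout.

    moves_after: moves played after this node (tree moves, then playout
        moves), alternating colours, starting with the side to move here.
    result: 1.0 if the side to move at this node won, else 0.0.
    """
    # UCT: only the move actually chosen from this node.
    first = moves_after[0]
    node.uct_visits[first] += 1
    node.uct_wins[first] += result
    # RAVE: every move by the same colour, including the tree move itself.
    # Assumption: a move repeated later in the playout is counted once.
    for move in set(moves_after[0::2]):
        node.rave_visits[move] += 1
        node.rave_wins[move] += result

# The example from this thread: position after W(D3), Black wins.
node = Node()
update_node(node, ["C5", "F7", "E7", "E8", "D8", "F8", "D7"], 1.0)
```

Here C5 gets both a UCT and a RAVE update, while E7, D8 and D7 get RAVE updates only; the White moves F7, E8 and F8 are not touched at this node.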


 Aja


_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
