Hi Hiroshi,
(1 - beta) * (win_rate + 0.31 * sqrt( ln(parent_visits) / child_visits)) +
beta * (rave_win_rate + 0.31 * sqrt( ln(rave_parent_visits) /
rave_child_visits))
I suggest dropping the exploration term from the RAVE part, just as Silver
suggested in his PhD thesis. An exploration term for RAVE is not very
meaningful, since in a node normally all moves are updated at the same
time.
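
As a rough sketch of what I mean (not Aya's or Erica's actual code), the
value could be computed as below. The names uct_value and blended_value
are made up for illustration, and beta is simply passed in, since how it
is scheduled is a separate question. Only the UCT part keeps the 0.31
exploration term; the RAVE part contributes its plain win rate.

```python
import math

C = 0.31  # exploration constant taken from the formula quoted above


def uct_value(win_rate, parent_visits, child_visits, c=C):
    """UCT part: win rate plus the usual exploration bonus."""
    return win_rate + c * math.sqrt(math.log(parent_visits) / child_visits)


def blended_value(beta, win_rate, parent_visits, child_visits, rave_win_rate):
    """Mix UCT and RAVE; the RAVE side has no exploration term.

    beta is assumed to be in [0, 1] and child_visits >= 1.
    """
    uct = uct_value(win_rate, parent_visits, child_visits)
    return (1.0 - beta) * uct + beta * rave_win_rate


if __name__ == "__main__":
    # Hypothetical numbers, only to show the call.
    print(blended_value(0.5, 0.55, 100, 10, 0.60))
```

With beta = 0 this reduces to plain UCT, and with beta = 1 it is just the
RAVE win rate.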
UCT searches B(E5),W(D3),B(C5),W(F7), and from this position the playout
plays
B(E7),W(E8),B(D8),W(F8),B(D7)... Black wins.
In the position after W(D3), Aya updates RAVE and UCT:
Updates C5(UCT)
Updates C5(RAVE)
Updates E7(RAVE)
Updates D8(RAVE)
Updates D7(RAVE)
I think "Updates C5(RAVE)" is strange, but I could not get good results
without this.
I can't see why it is strange, and I wonder why you think so. In Erica, I
update C5(RAVE) as well.
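
To make the bookkeeping concrete, here is a small sketch of which moves
get updated at a node. It is only an illustration, not the real Aya or
Erica code: the function name updates_at_node and the include_first flag
are made up, and the playout is cut off at the moves listed above. It only
collects the move names; a real engine would also add the playout result
to the win and visit counters.

```python
def updates_at_node(tree_moves, playout_moves, node_depth, include_first=True):
    """Return (uct_move, rave_moves) for the node reached after
    node_depth tree moves.

    tree_moves and playout_moves are lists of (color, move) pairs.
    RAVE (all-moves-as-first) credits every later move played by the
    side to move at this node. include_first controls whether the move
    actually chosen in the tree is also credited in RAVE.
    """
    later = tree_moves[node_depth:] + playout_moves
    to_move, uct_move = later[0]
    candidates = later if include_first else later[1:]
    rave_moves = []
    for color, move in candidates:
        # Credit each move once, and only for the side to move here.
        if color == to_move and move not in rave_moves:
            rave_moves.append(move)
    return uct_move, rave_moves


tree = [("B", "E5"), ("W", "D3"), ("B", "C5"), ("W", "F7")]
playout = [("B", "E7"), ("W", "E8"), ("B", "D8"), ("W", "F8"), ("B", "D7")]

if __name__ == "__main__":
    # Node after W(D3): two tree moves have been played, Black to move.
    print(updates_at_node(tree, playout, 2))
    print(updates_at_node(tree, playout, 2, include_first=False))
```

With include_first=True the RAVE list is C5, E7, D8, D7, which matches the
updates listed above. With include_first=False, C5 only gets the UCT
update.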
Aja
_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go