With the help of Michi <https://github.com/pasky/michi> (thank you Petr!)
I’m currently working on adding RAVE to my UCT tree search. Before I get
too deep into it I’d like to make sure I actually understand it correctly.
It would be great if you could have a quick look at my pseudo code (mostly
stolen from michi).

Given a Node with the fields

* color
* visits
* wins
* amaf_visits
* amaf_wins

The tree is updated after a playout in the following way:

We traverse the tree according to the moves played. visits gets incremented
unconditionally, and wins gets incremented if the playout was a win for
color. That is the same as UCT.

Then we look at the children of the node and increment amaf_visits for
each child whose color was the first to play on that child's intersection
in the playout. If the playout was also a win for the child's color, we
increment that child's amaf_wins as well.
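
A minimal Python sketch of the update described above, in case it helps to
check the idea. It is not Michi's code. The fields move and children, the
function update_path, and the first_player dict (intersection -> color that
played there first in the playout) are names made up for this illustration.
It also assumes that color means the color that played the move leading to
the node.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    color: str                      # assumed: color that played the move into this node
    move: Optional[int] = None      # hypothetical: intersection of that move
    visits: int = 0
    wins: int = 0
    amaf_visits: int = 0
    amaf_wins: int = 0
    children: List["Node"] = field(default_factory=list)  # hypothetical field


def update_path(path: List[Node], first_player: Dict[int, str], winner: str) -> None:
    """Update every node on the traversed path after one playout."""
    for node in path:
        # Plain UCT statistics.
        node.visits += 1
        if node.color == winner:
            node.wins += 1
        # AMAF statistics for all children, whether or not they were visited.
        for child in node.children:
            if first_player.get(child.move) == child.color:
                child.amaf_visits += 1
                if child.color == winner:
                    child.amaf_wins += 1


# Example: Black to move at the root, two candidate moves (3 and 5).
root = Node(color="W")
a = Node(color="B", move=3)
b = Node(color="B", move=5)
root.children = [a, b]
# The tree descent chose move 3; Black later also played 5 and won.
update_path([root, a], {3: "B", 5: "B", 7: "W"}, winner="B")
```

Note that the unvisited sibling (move 5) still collects AMAF statistics,
which is the point of RAVE.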


Then we also need to change the formula used to select the next node. I
must admit I just copied the one from Michi, including the constant
RAVE_EQUIV = 3500:

win_rate = wins / visits   // assumes visits will never be 0
if amaf_visits == 0 {
  return win_rate
} else {
  rave_win_rate = amaf_wins / amaf_visits
  beta = amaf_visits / (amaf_visits + visits + visits * amaf_visits / RAVE_EQUIV)
  return beta * rave_win_rate + (1 - beta) * win_rate
}
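
The same formula as a small runnable Python sketch, using the Node field
names from above. The function name rave_value is made up for this example.
Like the pseudo code, it assumes visits is never 0 (for instance because
the node is initialised with priors or is always visited once before being
scored).

```python
RAVE_EQUIV = 3500  # value copied from Michi, as stated above


def rave_value(visits, wins, amaf_visits, amaf_wins, rave_equiv=RAVE_EQUIV):
    """Blend the plain win rate with the AMAF win rate."""
    win_rate = wins / visits  # assumes visits > 0
    if amaf_visits == 0:
        return win_rate
    rave_win_rate = amaf_wins / amaf_visits
    # beta is close to 1 while there are few real visits and shrinks
    # towards 0 as visits grows, so the real win rate takes over.
    beta = amaf_visits / (amaf_visits + visits + visits * amaf_visits / rave_equiv)
    return beta * rave_win_rate + (1 - beta) * win_rate


few_visits = rave_value(visits=10, wins=5, amaf_visits=10, amaf_wins=10)
many_visits = rave_value(visits=100000, wins=50000,
                         amaf_visits=100000, amaf_wins=100000)
```

With few visits the result sits roughly halfway between the two rates, and
with many visits it stays close to the plain win rate of 0.5.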

Obviously I’m not expecting anyone to actually check the formula, but it
would be great if I could get a thumbs up (or down) on the general idea.

Cheers,

Urban
-- 
Blog: http://bettong.net/
Twitter: https://twitter.com/ujh
Homepage: http://www.urbanhafner.com/
_______________________________________________
Computer-go mailing list
[email protected]
http://computer-go.org/mailman/listinfo/computer-go
