Re: [Computer-go] A Linear Classifier Outperforms UCT on 9x9 Go

Peter Drake Wed, 29 Jun 2011 13:05:07 -0700

On Jun 29, 2011, at 10:17 AM, Brian Sheppard wrote:

Why is a classifier better than having a lookup table indexed by OurLastMove, OppLastMove, ProposedNextMove that returns the Wins / Trials experienced when ProposedNextMove is played after the sequence OurLastMove, OppLastMove?


The advantage here is that we combine information from several piles:

- All times this move was played.
- All times this move was played in response to previous move X.
- All times this move was played in response to penultimate move Y.

The scheme you propose only gathers:

- All times this move was played in response to previous move X and penultimate move Y.

This information is more accurate, but accumulates more slowly. (See the Power of Forgetting paper for more discussion on this.)
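
To make the trade-off concrete, here is a minimal sketch of the bookkeeping only, not of the classifier itself. The Pile class, the record function, the move labels, and the sample observations are all made up for illustration. It counts how many trials land in each pile when one move is seen in four different contexts.

```python
from collections import defaultdict


class Pile:
    """Wins/trials counters keyed by a context tuple (hypothetical helper)."""

    def __init__(self):
        self.wins = defaultdict(int)
        self.trials = defaultdict(int)

    def update(self, key, won):
        self.trials[key] += 1
        if won:
            self.wins[key] += 1


def record(piles, penultimate, previous, move, won):
    """Credit one played move to the three separate piles and to the joint table."""
    piles["move"].update((move,), won)                       # all times move was played
    piles["prev"].update((previous, move), won)              # in response to previous move X
    piles["penult"].update((penultimate, move), won)         # in response to penultimate move Y
    piles["joint"].update((penultimate, previous, move), won)  # lookup-table scheme (X and Y)


piles = {name: Pile() for name in ("move", "prev", "penult", "joint")}

# Invented observations: (penultimate, previous, move, won)
observations = [
    ("A1", "B2", "C3", True),
    ("D4", "B2", "C3", False),
    ("A1", "E5", "C3", True),
    ("D4", "E5", "C3", True),
]
for penultimate, previous, move, won in observations:
    record(piles, penultimate, previous, move, won)

print(piles["move"].trials[("C3",)])               # trials for the move alone
print(piles["prev"].trials[("B2", "C3")])          # trials given previous move
print(piles["penult"].trials[("A1", "C3")])        # trials given penultimate move
print(piles["joint"].trials[("A1", "B2", "C3")])   # trials given both
```

After four observations, the move-only pile has seen every one of them, each conditional pile has seen half, and each joint entry has seen only one. That is the sense in which the joint table accumulates more slowly.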

Are the training cases for your classifier selected from only the UCT nodes, or also from playout nodes?


From the entire playout.

Is the output of your classifier used to initialize the Wins / Trials values for legal moves in new UCT nodes? Is that done by assuming a fixed number of trials (how many?) and setting Wins = ClassifierOutput * Trials?

There is no tree in this system. The primary policy (used for the first 10 moves of each playout) is to choose the (legal) move that the classifier rates highest.
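
As a rough sketch of that policy: the function name choose_move, the score callable standing in for the classifier, and the uniform-random fallback after move 10 are all assumptions made for illustration. Only the first-10-moves rule and the pick-the-highest-rated-legal-move rule come from the description above.

```python
import random

PRIMARY_POLICY_MOVES = 10  # the primary policy covers the first 10 moves of each playout


def choose_move(move_index, legal_moves, score, rng=random):
    """Pick a playout move.

    move_index  -- 0-based index of the move within the current playout
    legal_moves -- sequence of legal moves in the current position
    score       -- callable mapping a move to the classifier's rating
                   (hypothetical stand-in for the real classifier)
    """
    if not legal_moves:
        return None  # nothing to play
    if move_index < PRIMARY_POLICY_MOVES:
        # Primary policy: the legal move the classifier rates highest.
        return max(legal_moves, key=score)
    # Assumption: what happens after move 10 is not stated here,
    # so this sketch falls back to a uniformly random legal move.
    return rng.choice(legal_moves)


# Invented ratings for three candidate moves.
ratings = {"A1": 0.2, "B2": 0.9, "C3": 0.5}
print(choose_move(0, list(ratings), ratings.get))
```

Note that no Wins / Trials values are seeded anywhere; the rating is used directly to pick the move.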

Is that the only use of the classifier in the system?


The above is the only use of the classifier.

Peter Drake
http://www.lclark.edu/~drake/
