On Jun 29, 2011, at 5:09 PM, Imran Hendley wrote:
> Thanks for the detailed explanation of the paper.
>
> Would it make sense to vary the number of moves generated by the
> classifier as you run more playouts? Have you tried this? It seems
> like the classifier would return garbage initially and slowly give
> better moves deeper down the sequence, analogous to descending the
> tree in MCTS.
We tried this briefly, setting the "cutoff" (number of moves generated
by the classifier) to 1 + (growth * #playouts), where growth is a
parameter such as 0.002. This didn't help, but it's conceivable that
some other schedule might.
> You mentioned that adding more than two previous moves as (linearly
> independent) input terms does worse. What happens when you start
> combining moves into a single feature? Have you tried just one
> feature with a 1 at each of the two previous move locations? Or a 1
> and a c < 1? Or what about using this as a third term, like y[i] =
> w1[i]*m1 + w2[i]*m2 + w12[i]*m12 + b[i]?
We haven't really tried this; that table would be very large (board
area squared), but it could be done.
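To make the size concern concrete, here is a sketch of what the joint term might look like; the table names and shapes are our own assumptions, not code from the paper. The joint table needs board-area-squared entries per output point:

```python
import numpy as np

BOARD_AREA = 81  # 9x9 board, points indexed 0..80

# Hypothetical weight tables for the suggested model:
#   y[i] = w1[i]*m1 + w2[i]*m2 + w12[i]*m12 + b[i]
w1 = np.zeros((BOARD_AREA, BOARD_AREA))               # effect of last move m1 on point i
w2 = np.zeros((BOARD_AREA, BOARD_AREA))               # effect of second-to-last move m2
w12 = np.zeros((BOARD_AREA, BOARD_AREA, BOARD_AREA))  # joint (m1, m2) term: 81*81 entries per point
b = np.zeros(BOARD_AREA)                              # per-point bias

def scores(m1, m2):
    """Score every board point given the locations of the two previous moves."""
    return w1[:, m1] + w2[:, m2] + w12[:, m1, m2] + b
```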
> In the paper you say you only consider local moves, which is natural
> because your input vectors represent the last two moves, which we
> already know are very important for predicting local moves.
I can't find the word "local" in the paper. Can you find the statement
you're referring to?
> What steps can we take to try to learn from other features of the
> game? One way to add patterns to the classifier might be to have
> input vectors for 3x3 patterns. Instead of a 1, you could have some
> small value at the location of each stone in the 3x3 pattern, and
> zero elsewhere. So the output for some square would look like
> y[i] = w1[i]*m1 + w2[i]*m2 + w3[i]*p[i]. Or maybe you don't even
> need the m1 and m2 terms for non-local moves. You could add other
> types of features too (atari, capture, extend, etc.) by putting
> small values in input vectors.
We tried looking at local patterns and at board locations in 3x3 or
large-knight's-move neighborhoods. Disappointingly, neither of these
things helped.
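For anyone who wants to replicate the local-pattern experiment, one common way to index 3x3 patterns is to hash the neighborhood into a small integer; this encoding is our own illustration, not necessarily what we used:

```python
SIZE = 9
EMPTY, BLACK, WHITE, OFF_BOARD = 0, 1, 2, 3

def pattern_index(board, row, col):
    """Encode the 3x3 neighborhood around (row, col) as a base-4 integer,
    one digit per surrounding point, giving 4**8 = 65536 possible patterns."""
    index = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the center point itself
            r, c = row + dr, col + dc
            color = board[r][c] if 0 <= r < SIZE and 0 <= c < SIZE else OFF_BOARD
            index = index * 4 + color
    return index
```

A weight table indexed by such pattern numbers is how the pattern term could feed into the classifier's score for each point.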
> And this is where offline learning from game records could come in
> handy, for initializing the p[i]'s, etc.
We tried pre-initializing the weights to bias the system in favor of
playing (a) near the center (in 9x9 games) and (b) near the two recent
moves. Again, no improvement.
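The center bias can take many forms; one plausible sketch (assumed form, not our exact code) is a per-point bias that falls off linearly with Manhattan distance from the center:

```python
SIZE = 9
CENTER = (SIZE - 1) / 2.0
MAX_DIST = 2 * CENTER  # largest Manhattan distance from the center (a corner)

def center_biased_init(strength=0.1):
    """Initial bias vector b[i]: largest at the center of the board,
    falling off linearly toward the edges and corners."""
    b = [0.0] * (SIZE * SIZE)
    for r in range(SIZE):
        for c in range(SIZE):
            dist = abs(r - CENTER) + abs(c - CENTER)
            b[r * SIZE + c] = strength * (MAX_DIST - dist)
    return b
```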
Of course, it's possible that one of these ideas is valid and we just
did it wrong. We welcome experiments by others!
Thanks,
Peter Drake
http://www.lclark.edu/~drake/
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go