I've tried several types of self-learning programs, and I think the key point is that Go is not a single-player game (quite apart from the degrees of freedom the game has). Your algorithm has to learn to model the opponent's behaviour during the game and try to figure out what the next moves will be.
In any case, I will continue along this line, because what I want is not an algorithm to play Go (something quite simple, isn't it? ;-)) but an algorithm able to learn Go by itself.

2010/4/27 Mark Boon <[email protected]>

> For a bit of a secondary project I've been doing the past few weeks, I
> made a self-learning Tetris program. To my own surprise this worked
> extremely well. It 'grows' an expert player in just a few thousand
> generations (a few days of real time), starting from zero knowledge. And
> it doesn't have any 'a priori' knowledge about the game, so it adjusts
> when the (scoring) rules or playing conditions change. It works so
> well, I was thinking this could very well apply to computer Go
> somehow.
>
> Tetris is not Go, of course. It is also a single-player game, whereas
> Go is a two-player game, so some adjustments to determining the
> survivors and procreation may have to be made. It's a bit early for me
> to divulge more details at this point. Also, this is research which my
> current employer may not want to become public in great detail just
> yet.
>
> But based on my findings so far, I can only encourage anyone who is
> thinking about making a self-learning Go program to go ahead and try.
>
> Mark
>
> On Mon, Apr 26, 2010 at 11:53 AM, Hendrik Baier
> <[email protected]> wrote:
> > Hello list,
> >
> > I am a Master's student currently working on my thesis about certain
> > aspects of Monte-Carlo Go. I would like to pose a question concerning
> > the literature - I hope some of you can help me out!
> >
> > My problem is that I can't find many papers about learning MC playout
> > policies, in particular patterns. A lot of programs seem to be using
> > Mogo's 3x3 patterns, which were hand-coded, or some variation thereof.
> > A lot of people have tried some form of pattern learning, but mostly to
> > directly predict expert moves, it seems, not to explicitly optimize the
> > patterns for their function in an MC playout policy.
> > Actually, I am only aware of "Computing Elo Ratings of Move Patterns
> > in the Game of Go", where patterns were learned from pro moves but then
> > also successfully used in an MC playout policy; and "Monte-Carlo
> > Simulation Balancing".
> >
> > Considering the huge impact local patterns have had on the success of
> > MC programs, I would have expected more attention towards automatically
> > learning and weighting them specifically for MC playouts. There is no
> > reason why patterns which are good for predicting experts should also
> > be good for guaranteeing diverse, balanced playout distributions. Have
> > I missed something?
> >
> > Or how did your program come by its patterns? I'd be interested. Did
> > you maybe even try learning something other than patterns for your
> > playout policy?
> >
> > cheers,
> > Hendrik
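
To make the pattern-weighting idea concrete: in the Elo-rating style of playout policy Hendrik mentions, each candidate move is scored by the learned strength ("gamma") of the 3x3 pattern surrounding it, and the playout move is sampled in proportion to those strengths. Below is a minimal, hypothetical Python sketch of that sampling step - the board representation, function names, and the default weight of 1.0 for unseen patterns are all assumptions for illustration, not anyone's actual implementation:

```python
import random

def pattern_key(board, move):
    """Return the 3x3 neighbourhood around `move` as a hashable key.
    `board` maps (x, y) -> 'b', 'w', or '.'; off-board points read '#'.
    (Real programs would also canonicalize over rotations/reflections.)"""
    x, y = move
    return tuple(board.get((x + dx, y + dy), '#')
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1))

def choose_playout_move(board, legal_moves, gammas, rng=random):
    """Pick a move with probability proportional to its pattern's gamma.
    `gammas` maps pattern keys to positive learned strengths; patterns
    never seen during training fall back to a neutral weight of 1.0."""
    weights = [gammas.get(pattern_key(board, m), 1.0) for m in legal_moves]
    total = sum(weights)
    r = rng.random() * total
    for move, w in zip(legal_moves, weights):
        r -= w
        if r <= 0:
            return move
    return legal_moves[-1]  # guard against floating-point drift
```

The point of the sketch is that the *same* sampling machinery works whether the gammas were fit to predict expert moves or tuned directly for playout quality (as in simulation balancing) - only the training objective for `gammas` changes.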
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
