Re: [Computer-go] learning patterns for mc go

Jason House Wed, 19 May 2010 20:13:42 -0700

I mostly skimmed it, but here's what I got from it: In a simulation, pick moves based off the leaf node's RAVE values, but discount moves whose follow-up moves have already been taken.

The tiling is simply a way of tracking how effective a move is when combined with a specific follow-up move. Near the start of a simulation, this would match RAVE values. Deep in a simulation, it's highly situational and based on which follow-up moves remain open.
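
To make that reading concrete, here is a minimal Python sketch. It is only my interpretation, not the paper's algorithm: the names PairStats, update, score and pick_move are hypothetical, the 0.5 prior is an arbitrary assumption, and moves are plain strings rather than board points. It keeps win/visit counts per (move, follow-up) pair and scores a candidate using only the pairs whose follow-up is still open.

```python
class PairStats:
    """Win/visit counts for (move, follow_up) pairs (hypothetical structure)."""

    def __init__(self):
        self.wins = {}
        self.visits = {}

    def update(self, moves, won):
        # moves: the moves one player made in a finished simulation, in order.
        # Every later move by that player counts as a follow-up to an earlier one.
        for i, move in enumerate(moves):
            for follow_up in moves[i + 1:]:
                key = (move, follow_up)
                self.visits[key] = self.visits.get(key, 0) + 1
                if won:
                    self.wins[key] = self.wins.get(key, 0) + 1

    def score(self, move, open_points, prior=0.5):
        # Pooled win rate over pairs whose follow-up can still be played.
        # Pairs whose follow-up is already taken are ignored, which is the
        # "discount" described above. The prior is an assumed default.
        wins = visits = 0
        for follow_up in open_points:
            if follow_up == move:
                continue
            key = (move, follow_up)
            wins += self.wins.get(key, 0)
            visits += self.visits.get(key, 0)
        return wins / visits if visits else prior


def pick_move(stats, open_points):
    # Greedy choice for simplicity; a real playout policy would more likely
    # sample moves in proportion to their scores.
    candidates = sorted(open_points)
    return max(candidates, key=lambda m: stats.score(m, open_points))
```

With every follow-up still open, score pools all of a move's pairs and so behaves roughly like a RAVE-style value for that move. As points get taken, fewer pairs contribute and the score depends on which follow-ups remain.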


I hope that helps!


On May 19, 2010, at 7:50 PM, Darren Cook <[email protected]> wrote:

My problem is that I can't find many papers about learning of MC playout
policies, in particular patterns.


A just published paper about learning MC policies:
http://hal.inria.fr/inria-00456422/fr/
It works quite well for Havannah (not tested on hex I think).

I struggled with this paper ("Multiple Overlapping Tiles for Contextual
Monte Carlo Tree Search"), as it wasn't clear to me what a "tile" was.
Specifically I couldn't work out if they were 2d patterns of
black/white/empty, or whether they were sequences of moves (e.g. joseki,
forcing moves, endgame sente/gote sequences, etc. in go)? Or perhaps
something else altogether?

While I wear the dunce's cap and stand in the corner, is some kind soul
able to explain the idea in go terms?

Thanks,
Darren


--
Darren Cook, Software Researcher/Developer

http://dcook.org/gobet/  (Shodan Go Bet - who will win?)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
