Hi!

On Mon, Apr 26, 2010 at 11:53:48PM +0200, Hendrik Baier wrote:
> My problem is that I can't find many papers about learning of MC
> playout policies, in particular patterns. A lot of programs seem to
> be using Mogo's 3x3 patterns, which have been handcoded, or some
> variation thereof. A lot of people have tried some form of pattern
> learning, but mostly to directly predict expert moves it seems, not
> explicitly optimizing the patterns for their function in an MC
> playout policy. Actually, I am only aware of "Computing Elo Ratings
> of Move Patterns in the Game of Go", where patterns have been
> learned from pro moves, but then also successfully used in an MC
> playout policy; and "Monte Carlo Simulation Balancing".

  I'm not aware of any research in this direction other than the
(Silver and Tesauro, 2009) paper:

        http://conflate.net/icml/paper/2009/500

  I think David Silver implied he is continuing research in this
direction, but I'm not sure at all.

> Considering the huge impact local patterns have had on the success
> of MC programs, I would have expected more attention towards
> automatically learning and weighting them specifically for MC
> playouts. There is no reason why patterns which are good for
> predicting experts should also be good for guaranteeing diverse,
> balanced playout distributions. Have I missed something?
> 
> Or how did your program come to its patterns? I'd be interested. Did
> you maybe even try learning something else than patterns for your
> playout policy?

  An important point is that move selection in playouts is (I think
in all implementations?) always randomized; that is, even when a pattern
matches, there is a small chance it won't be played. This usually
manages to spread out simulations reasonably evenly. My gut feeling is
that MC-specific patterns won't give a large boost, but of course I
could be wrong.
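
  To make the randomization concrete, here is a minimal Python sketch.
It is only an illustration under assumptions: the function name, the
move representation (plain strings) and the p_pattern value of 0.9 are
all hypothetical and not taken from any particular program.

```python
import random

def choose_playout_move(legal_moves, pattern_moves, p_pattern=0.9, rng=random):
    """Pick one move for a playout step.

    legal_moves   -- list of all legal moves in the position
    pattern_moves -- moves suggested by a matching pattern (may be empty)
    p_pattern     -- hypothetical probability of following a pattern match
    """
    if not legal_moves:
        return None  # nothing to play; the caller would pass

    # Only keep pattern suggestions that are actually legal.
    candidates = [m for m in pattern_moves if m in legal_moves]

    # Even with a matching pattern, skip it with probability 1 - p_pattern.
    if candidates and rng.random() < p_pattern:
        return rng.choice(candidates)

    # Fallback: uniformly random legal move.
    return rng.choice(legal_moves)
```

  With p_pattern below 1, every legal move keeps a nonzero chance of
being played, which is what keeps the simulations spread out.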

  I think most researchers are focused on integrating MCTS with other
evaluation functions, or adding more expert knowledge. I personally
think the best direction is simulation learning from the tree search,
among other things.

-- 
                                Petr "Pasky" Baudis
When I feel like exercising, I just lie down until the feeling
goes away.  -- xed_over
