On Fri, Jun 24, 2011 at 1:40 AM, Darren Cook <[email protected]> wrote:

> On 2011-06-21 02:53, Álvaro Begué wrote:
> > I think when you have explored a node enough times, there is no point
> > in considering the score of the node to be the average of all the
> > tries, but it should really be just the score of the best child (i.e.,
> > the minimax rule). UCT does converge to that value, but it does so by
> > reducing exploration of inferior moves, which results in the
> > long-lines behavior I just described.
> >
> > Perhaps there is a role for alpha-beta near the root of the tree when
> > we have enough CPU, and that might scale better.
>

Referring to what Álvaro Begué wrote, I think the idea is sound. MCTS
really is minimax, and considering the scores of the other nodes is just
a noise-reduction technique that is not strictly necessary. With few
samples it is surely a benefit, but maybe not when there are many.
Perhaps the influence of sibling scores should be gradually removed? I
guess in practice that is what happens when one move is so popular that
the others rarely get played.
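
To make the "gradually removed" idea concrete, here is a minimal sketch
of one possible backup rule. It is not taken from any particular
program: the function name, the weighting formula and the constant k
are all assumptions chosen for illustration. The node value is a blend
of the plain average and the best child's value, and the weight shifts
toward the best child as the visit count grows.

```python
def blended_value(mean_value, child_values, visits, k=1000.0):
    """Blend a node's running average with its best child's value.

    mean_value   -- average playout result at the node, from the point of
                    view of the player to move there
    child_values -- values of the children, from that same point of view
    visits       -- number of times the node has been visited
    k            -- hypothetical constant: the visit count at which the
                    average and the best child get equal weight
    """
    if not child_values:
        return mean_value  # nothing to take a maximum over yet
    # Weight is 0 with no visits and approaches 1 as visits grow.
    w = visits / (visits + k)
    return (1.0 - w) * mean_value + w * max(child_values)
```

With few visits this stays close to the average (the noise reduction);
with many visits it approaches the minimax value of the best child.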


>
> One of my favourite subjects. I noticed, about 3-4 years ago now, from
> my sm9 (human-computer team 9x9 go) experiments that an MCTS program
> would usually have chosen a strong move after say 50,000 playouts, but
> when it was wrong it typically would still be wrong after a million
> playouts. (Very subjective, sorry Don ;-)
>

I have no problem with empirical and subjective observations when they
are used for hypothesis building. But once you treat something as a
fact, you need to be correct, because new ideas are then built upon it
and you could end up with a mess! For example, once you accept as fact
that the earth is the center of the universe, you cannot make further
progress in understanding the universe.



>
> Hence the proposal to use alpha-beta as the top-level search, using MCTS
> with about 50K playouts at the nodes. I've done a few experiments in
> this direction, and I still think it is very promising. Technically the
> current state of sm9 automation is minimax on top of 4 MCTS and one
> traditional go program. (But very few nodes in the minimax tree as I
> give each program a few minutes of CPU time for every move.)
>
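
For readers who want to picture the arrangement, here is a minimal
sketch of alpha-beta on top with an MCTS search used as the leaf
evaluator. It is only an illustration under stated assumptions, not a
description of sm9: children() and evaluate() are hypothetical
callbacks, and evaluate() stands in for an MCTS search of roughly 50K
playouts that returns a win rate in [0, 1] for the side to move.

```python
def alphabeta(state, depth, alpha, beta, children, evaluate):
    """Negamax alpha-beta over win rates in [0, 1].

    Values are always from the point of view of the side to move.
    children(state) -- hypothetical helper returning successor states
    evaluate(state) -- hypothetical leaf evaluator; stands in for an
                       MCTS search run at that node
    """
    successors = children(state)
    if depth == 0 or not successors:
        return evaluate(state)
    best = 0.0  # lowest possible win rate
    for child in successors:
        # A win rate of v for the opponent is 1 - v for us, so the
        # search window is mirrored the same way.
        value = 1.0 - alphabeta(child, depth - 1,
                                1.0 - beta, 1.0 - alpha,
                                children, evaluate)
        best = max(best, value)
        alpha = max(alpha, value)
        if alpha >= beta:
            break  # cutoff: the opponent will avoid this line
    return best
```

A root call would look like alphabeta(root, depth, 0.0, 1.0, children,
evaluate). Since each leaf costs a full MCTS search, only a shallow
tree with few nodes is affordable, which matches the "very few nodes"
remark above.
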
> Darren
>
>
> --
> Darren Cook, Software Researcher/Developer
>
> http://dcook.org/work/ (About me and my work)
> http://dcook.org/blogs.html (My blogs and articles)
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>