On 23-07-17 18:24, David Wu wrote:
> Has anyone tried this sort of idea before?

I haven't tried it, but (with my computer chess hat on) these kinds of
proposals behave pretty badly when you get into situations where your
evaluation is off and there are horizon effects. The top move's score
drops, and now every alternative that has had less search looks better
(because it hasn't seen the disaster yet). You do not want discounting in
this situation.
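
As a rough illustration of that failure mode, here is a minimal sketch.
It assumes one particular form of discounting (an exponentially weighted
mean with a made-up factor gamma = 0.9), which is not necessarily the
scheme proposed in this thread, and the result sequences are invented
for the example.

```python
def plain_mean(results):
    """Ordinary average of simulation results (1.0 = win, 0.0 = loss)."""
    return sum(results) / len(results)


def discounted_mean(results, gamma=0.9):
    """Exponentially discounted average: the newest result has weight 1,
    the one before it gamma, the one before that gamma**2, and so on.
    gamma is a hypothetical parameter chosen for illustration."""
    num = 0.0
    den = 0.0
    for x in results:
        num = gamma * num + x
        den = gamma * den + 1.0
    return num / den


# Invented data: the top move looked winning for 20 simulations, then the
# search got deep enough to see the disaster and the next 10 were losses.
top_move = [1.0] * 20 + [0.0] * 10

# An alternative with only 5 simulations, too shallow to have hit the
# same disaster yet.
alternative = [1.0, 1.0, 1.0, 0.0, 0.0]

for name, results in (("top move", top_move), ("alternative", alternative)):
    print(name, round(plain_mean(results), 3), round(discounted_mean(results), 3))
```

With these numbers the plain average still ranks the top move first,
while the discounted average forgets its early wins and ranks the barely
searched alternative above it, even though nothing suggests the
alternative is actually better.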

It's true that a move with a higher winrate than the move with the
most simulations is a good candidate to be the better move. Some
engines will extend the thinking time when this happens. Leela will
play it, under certain conditions.
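
A minimal sketch of that kind of final move choice follows. The names
(RootMove, choose_move), the visit-ratio threshold and the "extend"
action are all assumptions made up for illustration; this is not
Leela's actual rule or any engine's real code.

```python
from dataclasses import dataclass


@dataclass
class RootMove:
    name: str
    visits: int      # number of simulations through this move
    winrate: float   # mean simulation result, in [0, 1]


def choose_move(moves, min_visit_ratio=0.5, can_extend=False):
    """Return (move, action), where action is "play" or "extend".

    Default choice is the most-simulated move. A move with a higher
    winrate replaces it only if it has at least min_visit_ratio times
    as many simulations (a hypothetical trust threshold). If a
    higher-winrate move exists but is not trusted yet, optionally ask
    for more thinking time instead of committing.
    """
    most_visited = max(moves, key=lambda m: m.visits)
    higher = [m for m in moves if m.winrate > most_visited.winrate]
    if not higher:
        return most_visited, "play"

    trusted = [m for m in higher
               if m.visits >= min_visit_ratio * most_visited.visits]
    if trusted:
        return max(trusted, key=lambda m: m.winrate), "play"

    if can_extend:
        return most_visited, "extend"
    return most_visited, "play"


# Invented example position.
moves = [
    RootMove("D4", visits=1000, winrate=0.48),
    RootMove("Q16", visits=700, winrate=0.53),
    RootMove("K10", visits=5, winrate=0.90),   # too few simulations to trust
]
move, action = choose_move(moves)
print(move.name, action)
```

The visit threshold is there so that a move with a handful of lucky
simulations cannot displace the main line, which is the same concern as
above: a high winrate from very little search may just mean the move
hasn't been refuted yet.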

> I recall a paper published on this basis. A paper presumably about 
> CrazyStone: Efficient Selectivity and Backup Operators in
> Monte-Carlo Tree Search.

I'm reasonably sure this did not include forgetting/discounting, only
shifting between average and maximum by focusing simulations near the
maximum. It's the predecessor of UCT.

-- 
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
