Hi Michael,
I agree with your views. Playout policy is rather important. I will try
Professor Drake's "last good reply" in the playout and report the result
next week.
Aja
-----Original Message-----
From: Michael Williams
Sent: Thursday, January 20, 2011 10:44 AM
To: [email protected]
Subject: Re: [Computer-go] semeai example of winning rate
On Wed, Jan 19, 2011 at 9:12 AM, Brian Sheppard <[email protected]> wrote:
The risk to scalability is that we will bias the search by focusing on
variations that a blitz program cannot discover, but a massively scalable
system could.
Another possible instance: Pachi's playout policy. Pachi has conditioned
each generator in the playout policy on a probability weight. For example,
the rule that says "play around the last point if you see this pattern" is
now only executed with a certain probability. IIRC, they report that
executing each rule with 90% probability is marginally better than using
100%. I am pretty sure that deep and shallow searchers can differ on this. A
deep search can afford to explore because the MCTS tree is large and will
sort things out, whereas a shallow tree is better off gambling that its rule
is correct.
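To make the probability-gated rule idea concrete, here is a minimal Python
sketch. It is not Pachi's actual code: the function name, the
(probability, generator) list, and the fallback are all hypothetical, and
the 0.9 in the usage note is just the figure recalled above.

```python
import random

def choose_playout_move(generators, state, rng, fallback):
    """Pick a playout move from an ordered list of (probability, generator) pairs.

    Each generator is a hypothetical callable returning a move or None.
    A generator is only consulted when it passes its probability gate,
    so a rule with weight 0.9 is skipped in roughly 10% of calls.
    """
    for prob, gen in generators:
        if rng.random() >= prob:
            continue  # gate failed: skip this rule for this move
        move = gen(state)
        if move is not None:
            return move  # first gated rule that proposes a move wins
    # No rule fired (or none proposed a move): fall back, e.g. to a random move.
    return fallback(state, rng)
```

With a weight of 1.0 a rule always runs, with 0.0 it never does, and with
0.9 the fallback gets a chance about one call in ten even when the rule
would have matched.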
A deep and long search has many short and shallow searches near the
fringe of the tree.
It seems to me that improvements in playout policy would apply to any
time control. But perhaps in-tree heuristics should be dependent on
the number of visits to the node.
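One way to read "dependent on the number of visits" is a heuristic bonus
that fades as a node accumulates playouts. The Python sketch below is only
an illustration under that assumption: the k / (k + visits) schedule, the
constant k = 50, and both function names are made up for the example and
do not come from any particular program.

```python
def visit_weight(visits, k=50.0):
    """Weight given to the in-tree heuristic at a node with `visits` playouts.

    Hypothetical schedule: 1.0 for an unvisited node, 0.5 at `k` visits,
    approaching 0 as the node is visited more.
    """
    return k / (k + visits)

def node_score(mean_winrate, heuristic_bonus, visits, k=50.0):
    """Combine the observed win rate with a heuristic bonus that decays with visits."""
    return mean_winrate + heuristic_bonus * visit_weight(visits, k)
```

Under this schedule the heuristic dominates in the thinly visited nodes
near the fringe, while heavily visited nodes near the root are ranked
almost entirely by their observed win rate.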
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go