I'm just digging through tons of posts I didn't have time to read.
This one is particularly interesting for me.
Thanks for sharing this idea.
I don't have a slow tactical reader, but it helps me understand why
heavy playouts work while
direct optimization of the strength of the playout doesn't help.
I wonder what would happen if we limit capture and atari heuristics to
only prehistoric stones.
Lukasz
On Mon, Feb 2, 2009 at 13:36, Mark Boon tesujisoftw...@gmail.com wrote:
I haven't gotten very far yet in incorporating many of the suggestions
published on this mailing-list into the MCTS ref-bot. As such I feel I still
have a lot of catching up to do when it comes to MC programs, mostly due to
lack of time.
But I had an idea I wanted to share as I haven't seen anything like it
described here. It comes from an observation I made when looking at
playouts and what effects some of the patterns had on them. So far it's my
opinion that guiding playouts is mostly useful in order to maintain certain
features of the original position and prevent the random walk from stumbling
into an unreasonable result. As an example I'm going to use the simple case
of a stone in atari that cannot escape. When random play tries an escaping
move, I make the program automatically play the capturing move to maintain
the status of the stone(s) (now more than one) in atari. When implementing
something like that in the playouts however, more often than not this
'pattern' arises not based on the original position but purely from the
random play. I figured it doesn't help the program at all trying to maintain
the captured status of a stone or stones that weren't even on the board at
the start of the playout.
So I tried a simple experiment: whenever a single stone is placed on the
board I record the time (move-number really) it was created in an array I
call stoneAge. When more stones are connected to the original they get the
same age. When two chains merge I pick an arbitrary age of the two (I could
have picked the smallest number, but it doesn't really matter). So for each
chain of stones the array marks the earliest time of creation. Next, when a
playout starts, I mark the starting time in a variable I call 'playoutStart'
and there's a simple function:
boolean isPrehistoric(int chain)
{
    // A chain is prehistoric if it was created no later than the start of the playout.
    return stoneAge[chain] <= playoutStart;
}
During playout, I only apply the tactical reader to chains for which the
isPrehistoric() function returns true. Tests show that using this method
doesn't affect the strength of the program at all. But the amount of time
spent in the tactical reader is cut to less than half.
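
The bookkeeping described above could be sketched roughly as follows. This is only an illustration under assumptions, not the ref-bot's actual code: the class name StoneAgeTracker, its method names, and the idea of indexing stoneAge by an integer chain id are all hypothetical. It also assumes the comparison in isPrehistoric() is "<=", and on a merge it keeps the smaller age rather than an arbitrary one, which is the alternative the post mentions.

```java
/** Hypothetical sketch of per-chain stone-age bookkeeping; names are illustrative. */
class StoneAgeTracker {
    private final int[] stoneAge;  // creation move number, indexed by chain id (assumed layout)
    private int moveNumber = 0;    // incremented for every stone placed on the board
    private int playoutStart = 0;  // move number at which the current playout began

    StoneAgeTracker(int maxChains) {
        stoneAge = new int[maxChains];
    }

    /** A stone forms a new single-stone chain: record the current move number as its age. */
    void placeSingleStone(int chain) {
        moveNumber++;
        stoneAge[chain] = moveNumber;
    }

    /** A stone is connected to an existing chain: the chain keeps the age it already has. */
    void extendChain(int chain) {
        moveNumber++;
    }

    /** Two chains merge into 'survivor'; this sketch keeps the smaller (older) age. */
    void mergeChains(int survivor, int absorbed) {
        stoneAge[survivor] = Math.min(stoneAge[survivor], stoneAge[absorbed]);
    }

    /** Call once when a playout begins. */
    void startPlayout() {
        playoutStart = moveNumber;
    }

    /** True if the chain already existed when the playout started. */
    boolean isPrehistoric(int chain) {
        return stoneAge[chain] <= playoutStart;
    }
}
```

In a playout loop, the tactical reader would then be called only when isPrehistoric(chain) returns true for the chain in question. Chains created by the random play itself are left alone.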
I'm suspecting the same holds true to a large degree for other patterns, but
I haven't had the time yet to test that. Other cases may not provide as much
gain because they are cheaper to compute. But I think in general it's better
to let the random play run its course as much as possible and restrict moves
guided by patterns as much as possible to situations relevant to the
original position. The stone-age information is very cheap to maintain so
it's hardly a burden to use.
Hope this helps anyone, especially those with slow tactical readers :) If
anyone manages to use this successfully in other situations than tactical
readers I'd be interested to hear it, as so far it's only a hunch that this
has wider applicability than just tactics. I was going to hold off on posting
this until I had time to try it out for myself, but lately I haven't had the
time.
Mark
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/