Hi Joel,
I did a lot of testing and for me option A seems a little better. The
problem with just pondering the latest known position is that there is
very little tree under moves the program doesn't think the opponent
will play. It appears to be better to get the full benefit of a
correct prediction. But by all means experiment if you are in the
mood.
Here is a brief overview of how I do it; that doesn't mean it's the
best way:
1. Assume the opponent will play the move you think is best.
2. Set your thinking goal time as if you were not pondering.
3. Apply the expected opponent move to the board, start thinking.
4. If the opponent takes a long time to move, continue
   thinking, but as soon as the opponent actually makes
   his move, stick with the goal time you set.
   This means that if the opponent takes a long think,
   you might have an instant response. In either case
   you will save some time, and that extra time benefits
   future moves. (A rough sketch of this loop follows.)
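
Roughly, the loop looks like this - a minimal sketch only, assuming a
UCT engine that can run playouts in the background. The names
(engine.best_move, start_background_search, wait_for_opponent_move)
and the 3% figure are stand-ins, not Lazarus's actual code:

    import time

    def ponder_and_respond(engine, board, clock):
        predicted = engine.best_move(board)     # 1. assume our best guess
        goal_time = 0.03 * clock.remaining()    # 2. as if not pondering
        board.play(predicted)                   # 3. apply the guess
        start = time.time()
        engine.start_background_search(board)   #    UCT playouts run here

        actual = wait_for_opponent_move()       # blocks until "play" arrives
        if actual != predicted:
            engine.stop_background_search()
            board.undo()
            board.play(actual)
            engine.start_background_search(board)  # wrong guess: start over
            start = time.time()                    # full goal time from here
        # 4. stick with the goal time set earlier; if the opponent thought
        #    longer than goal_time we answer (almost) instantly.
        time.sleep(max(0.0, goal_time - (time.time() - start)))
        engine.stop_background_search()
        return engine.best_move(board)
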
Once you have implemented this, you will want to change how you set
your thinking "goal time." Lazarus sets the goal time of each search
as a percentage of the remaining time. Currently my only modification
to this is that Lazarus will play much more quickly if the game is
nearly won. This is useful because it enables you to allocate your
time more aggressively.
It's fairly important to allocate your time aggressively near the
beginning of the game and get progressively faster. This is even MORE
true if you are pondering: you have to assume that you will get some
future benefit from the ponder moves and plan accordingly. Pondering
will be of little benefit against an opponent who plays instantly, but
those players are usually weak.
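
Just to make the idea concrete, an allocation rule along these lines
might look as follows - the numbers are invented for illustration and
are not what Lazarus actually uses:

    def goal_time(remaining, move_number, winrate,
                  prediction_rate, avg_opponent_time):
        # Spend a larger fraction of the clock early, tapering off later.
        fraction = max(0.02, 0.05 - 0.0005 * move_number)
        t = fraction * remaining
        if winrate > 0.95:        # game looks nearly won: play much faster
            t *= 0.25
        # With pondering on, assume some average payback per move and let
        # that justify a slightly more aggressive budget.
        return t + prediction_rate * avg_opponent_time
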
When your opponent also ponders, it gets to be interesting, because if
you can play instantly, you are depriving him of thinking time -
neutralizing his pondering ability. You can get into situations where
one player is taking a long time to think while the other answers
instantly, essentially without charge to his clock. To combat this,
chess programs sometimes implement a feature to play obvious moves
more quickly. But this is a double-edged sword - it has to be done
right or you may blunder. Of course obvious moves are generally
predicted by the opponent, which means he gets to think for free -
thus the desire to deal with this. (Lazarus doesn't worry about this
now.)
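
If you do experiment with the obvious-move idea, one cautious way to
phrase the test in a UCT program might be something like this - the
thresholds are invented, and this is not something Lazarus does:

    def is_obvious(root, elapsed, goal_time):
        # Require a minimum amount of thought, then cut the search short
        # only if one child has absorbed nearly all of the playouts.
        if elapsed < 0.2 * goal_time or len(root.children) < 2:
            return False
        ranked = sorted(root.children, key=lambda c: c.visits, reverse=True)
        return ranked[0].visits > 20 * max(1, ranked[1].visits)
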
One might argue that your time is better spent on algorithmic
improvements that make your program play better moves. That argument
assumes this sort of feature belongs in a polished program being made
ready for competition, and that it is easier to find improvements by
working on other things.
However, pondering is worth about 50 ELO rating points and qualifies
as a substantial improvement in its own right. You might as well have
it in your program, since it represents a substantial improvement for
very little effort. I call this "low-hanging fruit," and you should
pick the low-hanging fruit before climbing a ladder to get the pieces
at the top of the tree.
Still, there are good reasons to delay doing this until your program
is more advanced - in most of my testing I turn pondering off. I
would suggest having a solid, relatively bug-free UCT program in place
before getting into this. If you are not a fairly experienced
programmer you might find implementing it messy: you will have to get
into threading or co-routine programming to make it work. It's
actually very easy with UCT, since the search is incremental and you
don't have to worry about saving state in order to abort the search
gracefully when a move is predicted wrongly.
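
For example, with an incremental UCT core the whole threading side can
be about as small as this Python-flavored sketch, where do_one_playout
is a stand-in for whatever runs a single playout in your program:

    import threading

    class Ponderer:
        def __init__(self, engine):
            self.engine = engine
            self.stop = threading.Event()
            self.thread = None

        def start(self, board):
            self.stop.clear()
            self.thread = threading.Thread(target=self._loop, args=(board,))
            self.thread.start()

        def _loop(self, board):
            while not self.stop.is_set():
                self.engine.do_one_playout(board)  # one iteration, then re-check

        def abort(self):            # e.g. the prediction turned out to be wrong
            self.stop.set()
            if self.thread:
                self.thread.join()  # the tree is left in a consistent state
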
Lazarus also saves subtrees from the previous search. I tried
pondering based on just continuously thinking on the latest branch
available from the game and the existing tree, but got much less
benefit than I expected.
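
The subtree reuse itself is simple. Roughly, after any move is played
you promote the matching child to be the new root; the node layout
here (children keyed by move) is just one hypothetical way to do it:

    def advance_root(root, move):
        child = root.children.get(move)
        if child is not None:
            child.parent = None       # the saved subtree becomes the new root
            return child
        return new_root_node(move)    # nothing saved under this move: fresh tree
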
By the way, Lazarus predicts approximately 21% of the opponent's
moves. However, I believe that in reality it is more effective than
this would indicate. When playing strong opponents it predicts more
moves, and near the end of a game the prediction rate becomes
meaningless because the program is just filling the board (more or
less) randomly. My 21% is based on all opponents and all moves of the
game.
Maybe I will run a test later and measure the prediction rate when
playing strong opponents and while the expected result of the game is
still relatively uncertain.
I suspect that on bigger boards, pondering will be less beneficial, at
least until programs get substantially stronger and thus prediction
rates get higher.
- Don
On Tue, 2007-05-01 at 17:24 +1000, Joel Veness wrote:
> Hello,
>
> I have lurked for a while now on this list, and now that I finally
> have a weak monte-carlo/UCT program (goanna) up on CGOS, I decided
> that I should say hello. I am going to be spending some time on this
> project in the next few months, hopefully learning something about
> monte-carlo based tree search in the process.
>
> I have a couple of questions regarding pondering and GTP. Since the
> protocol unfortunately does not support it in a clean way, how have
> people tackled this problem?
>
> I was planning to invoke a search after a "genmove" command and then
> try and use the results of the speculative search if the next command
> is a "play" command. Is there any obvious problems with this approach?
>
> Also, I can see two different ways to ponder, and at the moment I am
> undecided as to what makes more sense. I should say that I am only
> interested in the case where two reasonably strong programs play each
> other. Here is a summary of my thoughts:
>
> Suppose your opponent thinks for X seconds on average.
>
> You can:
>
> a) Guess your opponent's next response, and assume they will make this
> move. Fire off a search from the resultant position. If you guess
> correctly, then you save X seconds. But if you only guess correctly p
> % of the time, you expect to gain pX seconds of extra thinking time
> per move.
>
> b) Think as if you were your opponent. Once your opponent makes a
> move, you keep the relevant sub-tree. This means that you will always
> get useful information from each ponder, but (assuming that you don't
> use the transposition information) you have wasted time searching
> moves the opponent didn't choose. I think a crude way of estimating
> the amount of time gained by this form of pondering would be to
> determine the expected value of:
>
> 'the size of subtree left after opponent move' / 'size of ponder subtree' * X
>
>
> My intuition suggests that b) is the better approach, but I know that
> a) works much better in computer chess.
>
> Any comments would be most appreciated.
>
> Joel Veness
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/