On Wed, Oct 6, 2010 at 5:00 AM, Heikki Levanto <[email protected]> wrote:
> On Tue, Oct 05, 2010 at 04:28:40PM -0400, Álvaro Begué wrote:
>>
>> The correct way to evaluate an action is the expected value (yet
>> another name for the mean) of the utility function (there is a theorem
>> by von Neumann and Morgenstern about this
>> http://en.wikipedia.org/wiki/Von_Neumann%E2%80%93Morgenstern_utility_theorem
>> ). Since we want to win the game, it makes sense to use a utility of 1
>> for winning and 0 for losing, and this means that the probability of
>> winning should be maximized. As far as I can tell, the theory ends
>> there.
>
> I agree that we should play the move that is most likely to win, given
> reasonable play from both sides (say, play that approximates theoretically
> best, with some errors thrown in). I do not necessarily agree that the theory
> quoted above says much about how to estimate this probability using more or
> less random playouts. Calculating the winning rate after each move, using a
> selective tree search and random simulations sounds like a reasonable
> assumption, and it seems to work quite well, but that seems well outside the
> VNM theory.
>

Certainly, VNM says nothing about how to estimate the expected utility
of each action. A complicated process follows this decision (the rest
of the game), which we replace with a simplistic playout when we use
MC to estimate it. I guess one could think of this process
as resulting in a score that is somewhat different from the score you
would get in the actual game, and then you would need to use some
sigmoid function on the score as your estimate of the result of the
game. So I guess there could be some theoretical justification for
doing funny things to the score at the end. If we expect our process
to have some bias with respect to the actual game (e.g., because one
of the players is known to be weaker), it would even make sense to use
dynamic komi. I have to think about it a bit more.
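
To make the idea concrete, here is a rough Python sketch of the two
utilities being compared: the plain win/loss step function and a sigmoid
of the playout score, each with an optional komi shift. Everything in it
is an assumption made for illustration: the function names, the `scale`
and `komi_shift` parameters, and the synthetic Gaussian margins that
stand in for real playout results. It is not taken from any existing
program.

```python
import math
import random

def step_utility(margin, komi_shift=0.0):
    """Plain win/loss utility: 1 if the (shifted) margin is positive, else 0."""
    return 1.0 if margin > komi_shift else 0.0

def sigmoid_utility(margin, komi_shift=0.0, scale=5.0):
    """Logistic squashing of the playout margin into (0, 1).

    Written with tanh, which equals 1 / (1 + exp(-x)) but cannot overflow.
    `scale` (in points) is a hypothetical tuning parameter.
    """
    x = (margin - komi_shift) / scale
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def estimate(margins, utility):
    """Monte Carlo estimate of expected utility: the mean over playouts."""
    return sum(utility(m) for m in margins) / len(margins)

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic margins (points, from our point of view); a real engine
    # would collect these from its own playouts.
    margins = [rng.gauss(3.0, 10.0) for _ in range(1000)]
    print("win rate:            ", estimate(margins, step_utility))
    print("sigmoid of score:    ", estimate(margins, sigmoid_utility))
    # Dynamic komi modeled as a shift of the winning threshold.
    print("win rate, komi +5:   ",
          estimate(margins, lambda m: step_utility(m, komi_shift=5.0)))
```

As `scale` goes to zero the sigmoid approaches the step function, so the
plain winning rate is the limiting case; a nonzero `komi_shift` is one
simple way to model the bias correction mentioned above.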

Álvaro.
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
