Re: [computer-go] Definition for monte carlo

Don Dailey Wed, 31 Oct 2007 09:00:39 -0800

This type of behavior of course is unnatural for human players and even
offends our eyes so to speak,  but it's perfectly logical.


Keep in mind that for a program to do this,  there has to be NO
difference between the winning odds of playing G1 or K1.   When a
program does this it's not being recklessly foolish.   It's actually
being conservative and pragmatic.  
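
To make that concrete, here is a minimal Python sketch. The margins and the win_rate helper are invented for illustration and are not taken from any engine. When a playout is scored only as a win or a loss, two moves whose playouts all win get exactly the same value, however different the margins are:

```python
def win_rate(margins):
    """Fraction of playouts won; a playout is a win if its margin is > 0."""
    return sum(1 for m in margins if m > 0) / len(margins)

# Hypothetical final margins (in points) of the playouts after each move.
playouts = {
    "G1": [12.5, 10.5, 11.5, 12.5],  # larger margins, all wins
    "K1": [7.5, 6.5, 5.5, 7.5],      # smaller margins, still all wins
}

# The evaluator only sees win/loss, so the margin information is discarded.
values = {move: win_rate(m) for move, m in playouts.items()}
print(values)
```

Both moves evaluate to 1.0 here, so the program has no reason to prefer one over the other.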

In your example, I can imagine that if you could make it favor the
greater win when it clearly makes no difference whatsoever, you might
increase the playing strength in a very minor way. You would be
making it slightly more robust in the face of miscalculations.

I think the right approach was mentioned just recently: add a very tiny
incentive that cannot affect any bits of the main calculation, as a kind
of tie-breaker. I doubt this will buy much, but it might help just
slightly and make it appear to play more "naturally" by our standards.
But I know that trying to fix this any other way reduces the playing
strength significantly.
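
One way to sketch such a tie-breaker in Python is below. It is an assumption on my part, not code from any program: instead of literally adding a small number to the win rate, it compares moves lexicographically, so the mean margin is consulted only when two win rates are exactly equal. The stats layout, the choose_move name, and the numbers are hypothetical.

```python
from fractions import Fraction

def choose_move(stats):
    """Pick the move with the best win rate; mean margin only breaks exact ties.

    stats maps move -> (wins, margin_sum, playouts); this layout is hypothetical.
    """
    def key(move):
        wins, margin_sum, playouts = stats[move]
        # Tuples compare element by element, so the margin can never
        # override a difference in win rate, however small.
        return (Fraction(wins, playouts), margin_sum / playouts)
    return max(stats, key=key)

# Hypothetical playout statistics for three candidate moves.
stats = {
    "G1": (1000, 11500.0, 1000),  # always wins, larger mean margin
    "K1": (1000, 6500.0, 1000),   # always wins, smaller mean margin
    "A5": (990, 30000.0, 1000),   # huge margin, but loses 1% of playouts
}
best = choose_move(stats)
print(best)
```

With these numbers G1 is chosen over K1 on margin alone, while A5 is rejected despite its much larger margin because its win rate is lower.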


- Don



Lavergne Thomas wrote:
> On Tue, Oct 30, 2007 at 08:07:49PM +0100, Heikki Levanto wrote:
>   
>> On Tue, Oct 30, 2007 at 02:45:56PM -0400, Jason House wrote:
>>     
>>> Similarly, I've been in won games and gotten bitten by a tesuji by the
>>> opponent.  If I had been just a bit safer in my play, I could have had a
>>> comfortable win.  Similarly, reasonable MC bots solidify the core to win
>>> rather than try to keep everything.
>>>       
>> I would like to think so too, but that is not what I am seeing. In my own
>> small experiments, I saw this kind of thing often enough (bottom edge of the
>> board):
>>
>>  3 O O O O O O O + + + + + +
>>  2 O X X X X X O O O O O O O 
>>  1 O X + X + X + X X + X X O
>>    A B C D E F G H J K L M N
>>
>> Any "sane" human player would connect at G1 first, and K1 later. But to a MC
>> player, those two are very close to equal, as long as the game seems to be
>> decided by five points or more, or there is another similar situation on the
>> board, and the program can be sure of getting one of them.
>>
>>     
>
> This is expected for a Monte Carlo evaluator. Let us assume that no other
> play on the board can change the result, and Black is more than 5
> points ahead: you can play either move and the result will be the same, a
> victory. The only thing that will change is the score of the game, but
> in most MC engines the score is not used, only the result. So you can
> run all the simulations you want, and the score for G1 and K1 will be the
> same: 1.0
>
> MC tends to choose the safer path in the tree. At a given node, if you
> have only 2 possible moves:
>   - the first one is a clever tesuji which leads to a 50pt win but needs
>     you to follow a strict sequence of 10 moves or you lose,
>   - the second one assures you a small victory of 0.5pt even if you make
>     some mistakes later in the game.
> Your program will choose the second move because more random simulations
> lead to a win in the second case.
>
> The first type of move is the kind you see in professional
> games, where pros are always on the line and any mistake can make you
> lose the game. The second one reflects the safest way of playing when you
> are ahead and just want to win the game:
> you make your group fully alive and solid, and you prevent any
> invasion by your opponent.
>
> Tom
>
>   
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
