Re: [computer-go] Depth dependent evaluation effects on monte carlo searches

Jason House Sat, 09 Jun 2007 17:07:09 -0700

Magnus Persson wrote:

Quoting Jason House <[EMAIL PROTECTED]>:
Does anyone have any data on just how optimistic or pessimistic the results would be? I'd like to use some heuristics that inherit winning percentages from a parent node to bias the expected winning percentage of the children nodes... and maybe pruning away portions of a search in a fixed depth monte carlo search with iterative deepening.
This is confusing to me. Are you asking how UCT behaves in order to implement something which is not UCT?

I'd say that I want to understand the nature of monte carlo simulations... not specifically UCT. You are absolutely correct that I'm trying to cook up something which is not UCT.

Anyway, UCT scores have the property that the score of a node changes very slowly when it is searched deep.

For my theoretical analysis, one unfortunate property of UCT is that many playouts from a set starting position means that the search tree gets expanded and that the measured winning rate is some kind of average of elements deeper in the tree. My original post asked about "monte carlo" in an attempt to avoid that issue.

But does this mean that sibling scores can be compared to each other? I would hesitate here because the scores change as a function of the search. In difficult positions the scores for all good moves are often very similar. The scores often move up or down slowly together as a function of whether the position is good or not.

Actually, "the scores moving up or down slowly together as a function of whether the position is good or not" is exactly the type of thing I'm trying to gauge. Essentially, I'd like to predict the performance of an unexplored branch based on available data. Not an attempt to prune it away, but rather to figure out when exploration of those branches should occur.
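
One way to make that prediction concrete is to treat the parent's winning rate as a prior worth a few virtual playouts, so that an unexplored branch starts at the inherited value and drifts toward its own measured rate as playouts accumulate. The following is only a sketch under that assumption: `biased_win_rate` and `prior_weight` are hypothetical names, the weight of 20 is arbitrary, and this is neither UCT nor anything Valkyria is known to do.

```python
def biased_win_rate(wins, playouts, parent_rate, prior_weight=20.0):
    """Blend a branch's measured win rate with a rate inherited from its parent.

    parent_rate must be expressed from the same player's point of view as
    wins. prior_weight is the number of virtual playouts the inherited
    value is worth (an illustrative choice, not a tuned constant).
    """
    if playouts < 0 or wins < 0 or wins > playouts:
        raise ValueError("need 0 <= wins <= playouts")
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")
    # With zero playouts this returns parent_rate exactly; with many
    # playouts the measured rate wins / playouts dominates.
    return (wins + prior_weight * parent_rate) / (playouts + prior_weight)


if __name__ == "__main__":
    # A branch whose true rate is 60% under a parent measured at 54%.
    for n in (0, 10, 100, 10000):
        print(n, round(biased_win_rate(0.6 * n, n, parent_rate=0.54), 4))
```

A branch with no playouts is ranked by the inherited 54%, which gives a rough answer to "when should exploration of it occur": whenever that inherited estimate is competitive with the measured rates of its siblings.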

I attach an sgf file where I added the principal variation of Valkyria after the black opening move on the center point, using 5 minutes of thinking time. For each move I give the winning percentage and the number of playouts that passed through this node. For the second move of black I also give the winning percentage of all siblings. The important thing here is that many of those moves are searched only 1000 times whereas the few best moves were searched 1000000 times, but there is still not much of a difference in the actual scores.

Thanks! Taking a quick look, I see the following winning percentages for the principal moves:

B: 54.3%
W:         46%
B: 54.6%
W:         46.3%
B: 54.7%
W:         47.6%
B: 54.9%
W:         47.0%

Looking at a single color, the winning percentage seems to shift by 0.2 to 0.4%... About what I'd expect to see. What confuses me though is how to interpret the jump back and forth as the color changes (about 8%). Are the percentages always the winning percentage for black? Or is it the winning percentage for the color to move?

If it's the winning percentage for the color to move, it seems really strange that it'd go up for both colors as the principal variation went on.


If it's the winning percentage for black, why does it vary so drastically?
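
The two readings can be compared numerically. The sketch below assumes that each figure is the winning rate of the color it is labelled with, and converts everything to Black's point of view. That is an assumption made for illustration, not something the sgf file confirms, and `as_black_rate` is a hypothetical helper.

```python
# Winning percentages along the principal variation, as listed above.
pv = [("B", 54.3), ("W", 46.0), ("B", 54.6), ("W", 46.3),
      ("B", 54.7), ("W", 47.6), ("B", 54.9), ("W", 47.0)]


def as_black_rate(color, pct):
    """Assume pct is the win rate of `color`; return Black's win rate."""
    return pct if color == "B" else 100.0 - pct


# Reading 1: every number is already Black's winning percentage.
raw = [pct for _, pct in pv]
# Reading 2: every number belongs to the labelled color, so flip White's.
black_view = [round(as_black_rate(c, p), 1) for c, p in pv]

raw_spread = round(max(raw) - min(raw), 1)
black_spread = round(max(black_view) - min(black_view), 1)

if __name__ == "__main__":
    print("black_view:", black_view)
    print("spread if all numbers are Black's rate:", raw_spread)
    print("spread if numbers belong to the labelled color:", black_spread)
```

Under the second reading Black's rate stays between 52.4% and 54.9%, so the large back-and-forth jump disappears, while a rise in both colors' figures along the variation would still need explaining.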
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
