The following code did not hurt playing strength in over 2000 self-play
games at board size 8x8 (faster games) with 10k playouts per move:
moveEval[m] = (float)wins[m]/nGames[m] +
points[m]/(nGames[m]*Board-Spaces*100)
where points[m] is accumulated only for wins.
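A minimal sketch of the evaluation above in Python, for illustration only. The function name move_eval and the choice to score unvisited moves as 0 are assumptions, not part of the original post; board_spaces stands for the number of points on the board (64 for 8x8). Assuming the winning margin of a single game is at most about board_spaces, the bonus term stays below roughly 0.01, so it acts mainly as a tie-breaker between moves with nearly equal win rates.

```python
def move_eval(wins, n_games, points, board_spaces):
    """Win rate plus a small bonus for the average winning margin.

    wins[m]    -- number of playouts won after move m
    n_games[m] -- number of playouts tried for move m
    points[m]  -- sum of winning margins, accumulated only for wins
    """
    evals = []
    for w, n, p in zip(wins, n_games, points):
        if n == 0:
            evals.append(0.0)  # assumption: unvisited moves score 0
            continue
        win_rate = w / n
        bonus = p / (n * board_spaces * 100)
        evals.append(win_rate + bonus)
    return evals


# Two moves with the same win rate: the larger average margin ranks higher.
tied = move_eval([50, 50], [100, 100], [100.0, 400.0], 64)
```

With equal win rates the second move is preferred, while a move with a win rate one percentage point higher still outranks a move that wins every game by the full board.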
On 2/9/07, Weston Markham [EMAIL PROTECTED] wrote:
I don't seem to have any numbers on this anymore, but I should be able
to try some experiments this weekend. I do have some code that does
what I describe below. It is also using an all moves as first
heuristic. According to my notes, I made
I think that you are essentially correct. However, this is only going
to affect a small number of games where two different moves are
exactly tied for the best winning percentage, after many playouts.
Even if the underlying probabilities are exactly the same, you can't
really expect this to
On Mon, Feb 12, 2007 at 11:20:43AM -0500, Weston Markham wrote:
I think that you are essentially correct. However, this is only going
to affect a small number of games where two different moves are
exactly tied for the best winning percentage, after many playouts.
Even if the underlying
On Wed, Feb 07, 2007 at 02:41:22PM -0600, Nick Apperson wrote:
If it only did one playout you would be right, but imagine the following
cases:
case 1: White wins by .5 x 100, Black wins by .5 x 100
case 2: White wins by 100.5 x 91, Black wins by .5 x 109
the method that takes into account
I don't seem to have any numbers on this anymore, but I should be able
to try some experiments this weekend. I do have some code that does
what I describe below. It is also using an all moves as first
heuristic. According to my notes, I made this change in an attempt to
avoid severely
Quoting Heikki Levanto [EMAIL PROTECTED]:
On Wed, Feb 07, 2007 at 04:42:01PM -0500, Don Dailey wrote:
In truth the only thing that matters is to increase your winning
percentage - not your score. There seems to be no point in tampering
with this.
I guess I must accept the wisdom of those
The average score can contain a very large proportion of losses if it is
compensated by bigger wins.
yes, it is easy to see how this might cripple the play of an MC player.
that 90% territory win that requires 3 opponent blunders is tempting enough
to ignore the fact that all other
I think there are 15 first moves in 9x9 go if you factor out the
symmetries.
UCT isn't good at evaluating all the moves; it will pick one of them and
spend most of its time on it. But you could search each one at a time.
The UCT programs are memory bound, so you could search each of these 15
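The count of 15 distinct first moves can be checked with a short Python sketch. The helper name canonical is hypothetical; it maps each point to the smallest of its eight images under the rotations and reflections of the square board, so points that are equivalent by symmetry share one representative.

```python
def canonical(x, y, size=9):
    """Smallest image of (x, y) under the 8 symmetries of a square board."""
    m = size - 1
    images = [
        (x, y), (y, x),                  # identity, diagonal reflection
        (m - x, y), (y, m - x),          # horizontal flip, rotation
        (x, m - y), (m - y, x),          # vertical flip, rotation
        (m - x, m - y), (m - y, m - x),  # 180-degree turn, anti-diagonal
    ]
    return min(images)


distinct_first_moves = {canonical(x, y) for x in range(9) for y in range(9)}
print(len(distinct_first_moves))
```

The set holds one representative per symmetry class, which is the list of opening moves that would have to be searched separately.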
I thought that the memory boundedness was completely fixed by not
expanding a UCT node until it has been visited X number of times.
Just increase X until you are no longer memory bound. I don't recall
anyone reporting a loss in playing strength by doing this.
On 2/8/07, Don Dailey [EMAIL
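The delayed-expansion idea described above might look like the following Python sketch. The Node class, the maybe_expand function and the threshold parameter (the "X" in the message) are illustrative names, not taken from any particular program. Children are allocated only once a node has been visited threshold times, so raising the threshold trades tree detail for memory.

```python
class Node:
    def __init__(self):
        self.visits = 0
        self.wins = 0
        self.children = None  # not allocated until the node is expanded


def maybe_expand(node, legal_moves, threshold):
    """Allocate children only after `threshold` visits; report if expanded."""
    if node.children is None and node.visits >= threshold:
        node.children = {move: Node() for move in legal_moves}
    return node.children is not None


leaf = Node()
leaf.visits = 3
expanded_early = maybe_expand(leaf, ["a1", "b2"], threshold=5)  # too few visits
leaf.visits = 5
expanded_late = maybe_expand(leaf, ["a1", "b2"], threshold=5)   # now expands
```

Until a node is expanded, playouts through it would simply continue with the random playout policy, which is the usual arrangement assumed here.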
Quoting Don Dailey [EMAIL PROTECTED]:
I think there are 15 first moves in 9x9 go if you factor out the
symmetries.
UCT isn't good at evaluating all the moves; it will pick one of them and
spend most of its time on it. But you could search each one at a time.
The UCT programs are memory bound,
On Thu, 2007-02-08 at 08:59 -0500, Chris Fant wrote:
I thought that the memory boundedness was completely fixed by not
expanding a UCT node until it has been visited X number of times.
Just increase X until you are no longer memory bound. I don't recall
anyone reporting a loss in playing
On 2/8/07, steve uurtamo [EMAIL PROTECTED] wrote:
i wonder if this kind of greediness might, however, be useful for selecting,
say, the first move or two in a 9x9 game. the thinking here is that since the
endgame is essentially noise at this point, you might as well be greedy
before tactics
-Original Message-
From: [EMAIL PROTECTED]
To: computer-go@computer-go.org
Sent: Wed, 7 Feb 2007 5:34 AM
Subject: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo
(QMC))
On Wed, Feb 07, 2007 at 12:06:40PM +0200, Tapani Raiko wrote:
Let me try again using the
I should have mentioned that I have only tested on 9x9. For larger boards,
I don't know.
- Dave Hillis
Intuitively, it seems like this should work. You only give the winning
margin a small weight, or only use it to break ties, or only apply it after the
game
If I recall correctly, someone spoke of constraining the opening moves to the
3rd, 4th, and 5th lines in the absence of nearby stones, or something to that
effect. What was the impact of this experiment? I notice the recent discussion
of the need for a lot of thinking time to find good opening
What sort of sampling was used for the playouts? For this variable (
incorporating some information about the score vs only the win-loss variable ),
does it make a difference whether playouts are totally random or incorporate
varying degrees of similitude to good play?
From: [EMAIL PROTECTED]
terry mcintyre wrote:
If I recall correctly, someone spoke of constraining the opening moves
to the 3rd, 4th, and 5th lines in the absence of nearby stones, or
something to that effect. What was the impact of this experiment?
For what it's worth, I tried a number of experiments along these
Subject: Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte
Carlo (QMC))
What sort of sampling was used for the playouts? For this variable (
incorporating some information about the score vs only the win-loss variable ),
does it make a difference whether playouts are totally
Don Dailey wrote:
On Wed, 2007-02-07 at 11:34 +0100, Heikki Levanto wrote:
All this could be avoided by a simple rule: Instead of using +1 and -1
as the results, use +1000 and -1000, and add the final score to this.
Heikki,
I've tried ideas such as this in the past and it's quite
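The rule proposed in the quoted message can be written as a small Python sketch. The function name playout_result and the sign convention (a positive final_score means a win for the side being evaluated) are assumptions made for illustration, and a non-integer komi is assumed so that the score is never exactly zero.

```python
def playout_result(final_score, win_value=1000):
    """Return +win_value or -win_value for the result, plus the final score."""
    if final_score > 0:
        return win_value + final_score
    return -win_value + final_score


def mean_result(final_scores):
    """Average of the combined win/score values over a set of playouts."""
    return sum(playout_result(s) for s in final_scores) / len(final_scores)


narrow_wins = mean_result([0.5, 0.5, 0.5, -0.5])      # three wins, one loss
big_swings = mean_result([60.5, 60.5, -0.5, -0.5])    # two wins, two losses
```

With these values the win count dominates the average: the set with three narrow wins averages higher than the set with two large wins.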
If it only did one playout you would be right, but imagine the following
cases:
case 1: White wins by .5 x 100, Black wins by .5 x 100
case 2: White wins by 100.5 x 91, Black wins by .5 x 109
the method that takes into account score would prefer the second case even
though it has a lower
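The two cases can be worked through numerically. The Python sketch below reads "White wins by .5 x 100" as 100 playouts that White wins by half a point each; that reading, and the function name summarize, are assumptions made for illustration.

```python
def summarize(white_margin, white_wins, black_margin, black_wins):
    """Return (White's win rate, mean score from White's point of view)."""
    total = white_wins + black_wins
    win_rate = white_wins / total
    mean_score = (white_margin * white_wins - black_margin * black_wins) / total
    return win_rate, mean_score


case1 = summarize(0.5, 100, 0.5, 100)    # even split of half-point games
case2 = summarize(100.5, 91, 0.5, 109)   # fewer wins, but huge margins
print(case1, case2)
```

Under this reading White wins 50% of the playouts in case 1 with a mean score of 0, and 45.5% in case 2 with a mean score of about 45.5 points, so an evaluation based on mean score ranks case 2 higher while an evaluation based on win rate ranks case 1 higher.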
That drives me nuts! Minimax search would eliminate bad lines of play whenever
a refutation is found. A good opponent would not play badly, and the quantity
of possible bad moves should not affect the evaluation of good moves - but that
seems to be what MC does, averaging out all moves
On Wed, 2007-02-07 at 14:08 -0600, Matt Gokey wrote:
Don, do you have any theories or information about why this is the
case?
Not really. In truth the only thing that matters is to increase your
winning percentage - not your score. There seems to be no point in
tampering with this.
- Don
Sent: Wed, 7 Feb 2007 4:31 PM
Subject: Re: [computer-go] MC approach
That drives me nuts! Minimax search would eliminate bad lines of play whenever
a refutation is found. A good opponent would not play badly, and the quantity
of possible bad moves should not affect the evaluation of good moves
as often as good ones, so they don't contribute as much to the
estimation of the worth of a node.
- Dave Hillis
-Original Message-
From: [EMAIL PROTECTED]
To: computer-go@computer-go.org
Sent: Wed, 7 Feb 2007 4:31 PM
Subject: Re: [computer-go] MC approach
That drives me nuts
On Wed, Feb 07, 2007 at 04:42:01PM -0500, Don Dailey wrote:
In truth the only thing that matters is to increase your winning
percentage - not your score. There seems to be no point in tampering
with this.
I guess I must accept the wisdom of those who have tried these things.
Still, it hurts
On Thu, 2007-02-08 at 00:46 +0100, Heikki Levanto wrote:
On Wed, Feb 07, 2007 at 04:42:01PM -0500, Don Dailey wrote:
In truth the only thing that matters is to increase your winning
percentage - not your score. There seems to be no point in tampering
with this.
I guess I must accept
But of course, it's not the size of the win that counts, it is rather
the confidence that it really is a win. In random playouts that
continue from a position from a close game, the ones that result in a
large victory are generally only ones where the opponent made a severe
blunder. (Put
Matt Gokey: [EMAIL PROTECTED]:
Weston Markham wrote:
But of course, it's not the size of the win that counts, it is rather
the confidence that it really is a win.
Yes, and my reasoning was that a larger average win implied a higher
confidence since there is more room for error. That intuition
29 matches