Maybe I should ask first, for clarity's sake: is MCTS performance in
handicap games currently a problem?
Mark
Yes, it's a big problem. And that's not a matter of opinion.
MC bots, leading a game by a large margin, will give away their advantage
lightly, keeping only the last half point.
Even on a
I admit I had trouble understanding the details of the paper. What I
think is the biggest problem for applying this to bigger (up to 19x19)
games is that you somehow need access to the true value of a move,
i.e. whether it is a win or a loss. On the 5x5 board they used, this might be
approximated
I don't think the komi should be adjusted.
Instead:
Wouldn't random passing by black during the playouts model black making
mistakes much more accurately? The number of random passes should be
adjusted such that the playouts are close to 50/50. Adjusting the komi
would make black play greedily,
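A minimal sketch of the random-pass idea, using an invented toy score model (the function names, the `black_edge` parameter, and the 0.01-per-pass erosion are all assumptions for illustration, not from any engine):

```python
import random

random.seed(0)

def playout_black_wins(pass_prob, black_edge=0.8, moves=100):
    """Toy playout: black randomly passes with probability pass_prob per
    move, and each pass erodes its winning chances a little. black_edge
    models a handicap game where black starts far ahead."""
    passes = sum(random.random() < pass_prob for _ in range(moves))
    return random.random() < max(0.0, black_edge - 0.01 * passes)

def tune_pass_prob(trials=2000):
    """Grid-search the pass probability that brings playouts closest to
    the 50/50 balance suggested above."""
    best_p, best_gap = 0.0, 1.0
    for p in [i / 20 for i in range(21)]:
        wins = sum(playout_black_wins(p) for _ in range(trials)) / trials
        if abs(wins - 0.5) < best_gap:
            best_p, best_gap = p, abs(wins - 0.5)
    return best_p
```

With `black_edge = 0.8`, the search settles near `pass_prob ≈ 0.3`, i.e. the rate at which the toy black must blunder for the playouts to come out roughly even.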
In future papers they should avoid using a strong reference engine like Fuego
for the training, and instead force the method to learn from a naive uniform
random playout policy (with 100x or 1000x more playouts) and then build on
that with an iterative approach (as was suggested in the paper).
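A toy sketch of such an iterative scheme, with no strong teacher: start from a uniform policy and reinforce feature weights purely from playout outcomes. The "game" here is a stand-in, and the feature names, win rates, and update factors are all invented for illustration.

```python
import random

random.seed(0)

def learn_playout_weights(features, episodes=500):
    """Start from a uniform random policy and multiplicatively reinforce
    feature weights from self-play outcomes alone. In this stand-in game,
    the feature named 'good' secretly wins more often."""
    w = {f: 1.0 for f in features}
    for _ in range(episodes):
        # Sample a feature proportionally to its current weight.
        r, pick = random.random() * sum(w.values()), features[-1]
        for f in features:
            r -= w[f]
            if r <= 0:
                pick = f
                break
        won = random.random() < (0.7 if pick == "good" else 0.4)
        w[pick] *= 1.1 if won else 0.9  # learn from the outcome only
    return w
```

After a few hundred episodes the weight of the genuinely better feature dominates, which is the whole point: improvement without ever consulting a stronger engine.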
I also had
I'd like to say that the problem comes from the fact that the model of
the opponent in the simulations is not accurate enough in the MCTS
framework. So the solution would be to make the model more precise,
but in practice that is not feasible.
What is komi or handicap for? Since W is stronger than B, W must
A web search turned up a 2 page and an 8 page version. I read the
short one. I agree that it's promising work that requires some follow-
up research.
Now that you've read it so many times, what excites you about it? Can
you envision a way to scale it to larger patterns and boards on modern
Is anyone (besides the authors) doing research based on this?
Well, Pebbles does apply reinforcement learning (RL) to improve
its playout policy. But not in the manner described in that paper.
There are practical obstacles to directly applying that paper.
To directly apply that paper, you must
This idea makes much more sense to me than adjusting komi does. At least
it's an attempt at opponent modeling, which is the actual problem that
should be addressed. Whether it will actually work is something that
could be tested.
Another similar idea is not to pass but to play some
On Thu, Aug 13, 2009 at 1:39 AM, Christoph Birk b...@ociw.edu wrote:
On Aug 12, 2009, at 3:43 PM, Don Dailey wrote:
I believe the only thing wrong with the current MCTS strategy is that you
cannot get a statistically meaningful number of samples when almost all games
are won or lost. You
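The saturation effect is easy to demonstrate numerically; the win rates below are made up for illustration:

```python
import random

random.seed(1)

def est_winrate(true_p, n=1000):
    """Estimate a move's win rate from n Bernoulli playout results."""
    return sum(random.random() < true_p for _ in range(n)) / n

# In a close game, a 5-point difference in true win rate shows up clearly:
close_gap = est_winrate(0.55) - est_winrate(0.50)

# In a nearly-won game, every move wins almost every playout, so the
# estimates coincide and the search has no signal to discriminate on:
won_gap = est_winrate(0.999) - est_winrate(0.995)
```

The estimated gap between near-certain wins is buried in sampling noise, which is why a program that is far ahead cannot tell a solid move from a sloppy one.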
One reason dynamic komi seems a bit odd is that the numbers are pulled out of
thin air. Why should the komi be X instead of Y? When should the value be
changed?
Going back to the original thought experiment: the komi at the start of the
game should reflect the expert assessment of how far
With CrazyStone-like playouts, you can just put noise over the
possibilities. The more noise, the more random the playout is, which is
weaker. The best move in the tree is then the one that requires the
least amount of noise for the other player to reach 50% win chance if
behind, or the one
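One way to read "noise over the possibilities" is a softmax over pattern scores with a temperature; here is a sketch under that assumption (not CrazyStone's actual scheme):

```python
import math
import random

random.seed(0)

def noisy_move(scores, noise):
    """Sample a move index from pattern scores. noise -> 0 plays greedily;
    large noise flattens the choice toward uniform random (weaker play)."""
    t = max(noise, 1e-6)  # temperature; avoid division by zero
    weights = [math.exp(s / t) for s in scores]
    r = random.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(scores) - 1
```

With scores `[5, 1, 1]`, a low noise setting picks move 0 almost always, while `noise = 100` behaves close to a uniform die roll, matching the intuition that more noise means a weaker, more random playout.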
Modeling the opponent's mistakes is indeed an alternative to introducing komi.
But it would have to be a lot more exact than simply rolling the dice or
skipping a move here and there.
Successful opponent modeling would implement the overplay school of thought -
playing tactically refutable
There is one crude way to measure goal compatibility. See if you can make
the same move work with different komi. If I'm on the east coast of the
US traveling to the west coast, I will probably start off on the same road
regardless of whether I'm going to Seattle or San Diego. If the same
2009/8/13 Stefan Kaitschick stefan.kaitsch...@hamburg.de
Modeling the opponent's mistakes is indeed an alternative to introducing
komi.
But it would have to be a lot more exact than simply rolling the dice or
skipping a move here and there.
Successful opponent modeling would implement the
The dynamic komi is perhaps a misnomer; it's by accident that changing komi
reflects something which we do want to measure, namely the predicted score.
An algorithm which does not make use of the predicted score would not make use
of all available information.
On a 19x19 board, it is common
2009/8/13 terry mcintyre terrymcint...@yahoo.com
The dynamic komi is perhaps a misnomer; it's by accident that changing
komi reflects something which we do want to measure, namely the predicted
score.
An algorithm which does not make use of the predicted score would not make
use of all
I have never heard a pro say "I estimate my chances of winning this game to be
50.3%", but you will hear "black is ahead by 3 points" or "white wins by 1/2
point" -- they'll make this evaluation based on the alternation of equally
competent play.
Pebbles has a Mogo playout design, where you check
for patterns only around the last move (or two).
In MoGo, it's not only around the last move (at least with some probability
and when there are empty spaces on the board); this is the fill-board
modification.
(this provides a big
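A rough sketch of the fill-board idea as described above; the board representation and the probe count are assumptions, loosely following MoGo's published description:

```python
import random

random.seed(0)

def fill_board_move(board, size, p_fill=0.1, probes=4):
    """With probability p_fill, probe a few random points and play the
    first one whose 3x3 neighborhood is entirely empty; otherwise signal
    the caller to fall back to patterns around the last move.
    `board` maps (x, y) -> stone for occupied points."""
    if random.random() < p_fill:
        for _ in range(probes):
            x, y = random.randrange(size), random.randrange(size)
            nbhd = [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
            if all(q not in board for q in nbhd):
                return (x, y)
    return None  # fall through to the local-pattern policy
```

The point of the modification is exactly this fallthrough structure: most of the time the playout stays local, but occasionally it seeds a move in a big empty area that local patterns would never reach.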
Just to clarify: I was not saying that Mogo's policy consisted
*solely* of looking for patterns around the last move. Merely that
it does not look for patterns around *every* point, which other
playout policies (e.g., CrazyStone, if I understand Remi's papers
correctly) appear to do. The RL
A couple of weeks ago I made the playouts slightly heavier by adding a few
2-liberty local rules. It made a big difference in the program's strength
(from strong 3 kyu to weak 1 kyu).
www.gokgs.com/servlet/graph/ManyFaces-en_US.png
David
David Fotland wrote:
made the playouts slightly heavier by adding a few
2-liberty local rules.
What does heavier mean here and could you please give an example of
such a rule? Do you have an understanding why they make your program
stronger?
--
robert jasiek
Heavier means more analysis in the playouts about what move to make -- less
pure random. I don't understand why it's stronger, but I'm happy with the
result. Playouts are pretty much "try something and test it".
David
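One common way to structure such "heavier" playouts is a priority list of cheap rules with a pure-random fallback. A sketch under that assumption; the rule names and move tags are invented, not Many Faces of Go's actual rules:

```python
import random

random.seed(0)

def two_liberty_rule(candidates):
    """Toy 2-liberty rule: prefer moves tagged as saving a group that is
    down to two liberties (the tag is invented for illustration)."""
    return [m for m in candidates if m.get("saves_2lib")]

def heavy_playout_move(candidates, rules):
    """Try each cheap analysis rule in priority order; fall back to pure
    random only when no rule fires -- 'try something and test it'."""
    for rule in rules:
        moves = rule(candidates)
        if moves:
            return random.choice(moves)
    return random.choice(candidates)
```

Adding a rule to the front of the list makes the playout "heavier" (more analysis per move); whether that helps overall strength is exactly the kind of thing that has to be tested empirically, as the message above says.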
David Fotland wrote:
A couple of weeks ago I made the playouts slightly heavier by adding a few
2-liberty local rules. It made a big difference in the program's strength
(from strong 3 kyu to weak 1 kyu).
www.gokgs.com/servlet/graph/ManyFaces-en_US.png
Is this
Works for me. It's the rank graph. You can also get it on KGS, user info
for ManyFaces
On Thu, Aug 13, 2009 at 9:42 PM, Hideki Kato wrote: