On Aug 9, 2008, at 9:45 PM, Don Dailey wrote:
I'm curious what you guys think about the scalability of monte carlo
with UCT.
The MCTS technique appears to be extremely scalable. The theoretical
papers about it claim that it converges to perfect play in the limit.
We agree here that this is not true in practice, of course.
You can argue that random forms of search basically search based upon
mobility. The insight for that is really simple: if you control more
area, you have more possibilities, so the statistical odds of those
lines backtracking favorably to the root are bigger than from
positions where you control little area.
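That mobility bias is easy to demonstrate with a toy model (my own sketch, not taken from any actual go program): treat the board as a fixed number of cells, hand out the undecided cells by coin flips as a stand-in for random playouts, and score by majority ownership. The side that already controls more area wins far more of the random games:

```python
import random

def playout_winrate(my_cells, opp_cells, free_cells,
                    n_playouts=20000, seed=1):
    """Estimate the random-playout win rate for a side that already
    controls `my_cells` of the board versus `opp_cells` for the
    opponent, with `free_cells` still undecided. Each free cell is
    assigned by a fair coin flip; strict majority ownership wins."""
    rng = random.Random(seed)
    total = my_cells + opp_cells + free_cells
    wins = 0
    for _ in range(n_playouts):
        mine = my_cells + sum(rng.randint(0, 1) for _ in range(free_cells))
        if mine * 2 > total:
            wins += 1
    return wins / n_playouts

# The side controlling more area scores far higher under purely
# random playouts, before any "real" evaluation is involved.
ahead = playout_winrate(my_cells=12, opp_cells=8, free_cells=20)
behind = playout_winrate(my_cells=8, opp_cells=12, free_cells=20)
print(ahead, behind)
```

With these numbers the side ahead in area wins roughly three quarters of the random playouts, the side behind only a small fraction: the playout statistics are effectively an area/mobility count.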
So a very selective searcher equipped with a good mobility function
will beat it, as its worst case is better. As you correctly notice,
at a certain point, when programs get really strong in certain areas,
the randomness of the search tree will backfire, as it creates a
worst case in program-versus-program play.
There are however two problems to solve to make it happen, maybe a third:
a) making a good mobility evaluation function. Not easy in chess nor in go.
   I must admit that in chess it took me until somewhere in 2004-2005,
   so more than 11 years after starting with chess, to make one. And
   that as a good chess player who has put in a lot of time there.
   In go, where the literature is even more inaccessible to
   programmers, it will be harder still.
b) selective searching is not so easy. How many are there who can
   search ultra-selectively and still improve big-time in playing
   strength when given more CPU time? Well, other than Leela, that is...
Historic note: we had randomness in chess too. If you use just
mobility, nothing else, not even material values,
you soon end up above 2000 Elo.
Go programs are far from that yet.
So until then you will still have to deal with the busloads of guys
who claim neural networks
work very well as a replacement for search and/or evaluation.
It took until 1998 or so to find great, well-tuned parameters for
material in chess (done by Chrilly Donninger;
note that he publicly posted that the tuning didn't matter, just the
patterns themselves, an interesting statement at the time).
Now the question is: what will happen when someone finds those for
the most dominant positional components in computer go?
Note, Don, that the obvious thing in go, which a lot of humans who
are weak at go miss, is that finding the best
moves in a position, as a subset of all possible moves, is a LOT easier
than in chess.
So selective search in go is far easier to do well than in chess.
It's relatively easy to select, each time, just 30 to 40 moves
that are really relevant. If you look at it like that, then you can
easily achieve far deeper relevant selective search depths
in go than in chess, given the same hardware.
In chess you simply cannot afford not to look at the most stupid line
possible, simply because sometimes sacrificing the entire board works.
The goal is just conquering the enemy king.
In go, however, if you have a won position with some degree of
certainty, there is NEAR ZERO need to look further.
Hard pruning works far better there.
Effectively, top go programs therefore get selective search depths
similar to the search depths chess programs reach at the start of the
game, given the same hardware and time controls. This despite
chess programs getting far higher NPS.
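The arithmetic behind that depth claim can be sketched as follows (the branching factors and the node budget below are illustrative round numbers, not measurements from any engine): with a fixed node budget N and a constant branching factor b, a uniform search reaches about log(N)/log(b) plies.

```python
import math

def reachable_depth(node_budget, branching_factor):
    """Plies of uniform (full-width) search affordable within a node
    budget, assuming a constant branching factor per ply."""
    return int(math.log(node_budget) / math.log(branching_factor))

budget = 10**9  # nodes per move, purely illustrative

# All ~250 legal moves on an empty 19x19 board: only 3 plies.
print(reachable_depth(budget, 250))
# Hard-pruned to ~30 relevant candidates: 6 plies from the same budget.
print(reachable_depth(budget, 30))
```

So cutting the candidate set from a few hundred moves to a few dozen roughly doubles the reachable depth for the same node count, which is why hard pruning pays off so much in go.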
When today's top chess programs run on hardware from 1999, the quad
Xeon 500 MHz that showed up at the world championship,
at 3 minutes a move, they totally destroy the field from back then
with 100% scores. Not a single draw.
So we can argue a lot about search here. Considering the selective
search depths today's go programs reach, I would argue that
the most important improvement now is the evaluation function. That
doesn't mean it needs to be implemented in the same manner,
nor that it needs to be some sort of huge function.
Considering the go strength of the top authors in computer go, one
would argue they are better off implementing simple knowledge
and tuning it very well.
That'll kick more butt than a small improvement in search.
People who believe religiously in search depth as the absolute truth
are interesting people.
If I played today's Diep on hardware from 1999 against the Fritz of
those days, one of the
stronger programs at the time, Fritz would get 17 plies every move,
versus maybe 12 for my Diep.
Hell, I'll even give 7 plies odds when needed. Just let me turn on a
few extensions so as not to be TOO weak tactically.
It won't help the Fritz of those days. It loses big-time.
On paper, 7 plies of extra search depth are worth 700 Elo or so, so
today's Diep should stand 0% chance, just on paper.
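For the record, "on paper" here means the standard Elo expected-score formula; the roughly 100 Elo per ply behind the 700 figure is a rule of thumb, not a measured constant:

```python
def expected_score(elo_diff):
    """Standard Elo expected score for the stronger side, given a
    rating difference in its favor."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# 7 plies at ~100 Elo per ply = 700 Elo difference: the deeper
# searcher "should" score about 98% of the points.
print(round(expected_score(700), 3))  # -> 0.983
```

Which is exactly why the result of such an odds match says something: the paper prediction assumes the rating model holds, while a big evaluation gap breaks it.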
In chess the center is dominant. Real soon everyone figured out that
pieces and pawns in the center are worth more than
those outside it. In go all this is more complicated than in
chess. When today's top chess programs do not know what to do,
they just put their minor pieces and pawns in the center. Even at the
highest levels, that simply wins games.
That's how dominant such a simple knowledge rule is. There are just 4
center squares in chess; you'll realize how easy chess is,
seen from a knowledge viewpoint: material plus central placement of
the minor pieces. Because the goal is to mate a king,
search cannot abort easily based upon the above positional knowledge.
In go, termination is far easier; what's hard to discover is HOW TO
EVALUATE in a simple yet GENERIC and really effective manner.
Secondly: how to TUNE it?
Whether you do that with one of today's random searches or not won't
matter nearly as much as the improvement possible
in the evaluation function.
A 16-core Power6 node should then be more than enough to beat the
majority of all professional players.
Of course it is very well possible that a really good evaluation
function of the future requires a better
search to go with it. For now I'd say: try to find that evaluation function.
If I could put today's Diep evaluation back into 1999 Diep's search,
which sucks real major league compared to today's
searching, it still would have won any event from 2004 and earlier,
given the hardware I used at each championship,
which was never the fastest hardware except at the 2003 world
championship (so the supercomputer of 2003 I would replace
by my old dual K7, which even in 2003 was really slow compared to the
quad Xeon MP 2.8 GHz that most participants
showed up with).
Note that I would argue my evaluation as it is today won't win the
2008 world championship. I really need to do a lot in the coming
month to do well there. Most likely in 2009 I'll tell you that my
2008 evaluation function was SO WEAK that
any type of non-totally-bugged search could beat it.
So I would argue that all the discussion above about search is totally
irrelevant to the future of computer go.
Vincent
My feeling is that as you scale up in power, certain things will
improve relative to humans faster than other things, as you imply.
This happened in computer chess, and to this day computers are
inferior to human players in many ways, and yet the computers are
superior at playing the game overall.
At some point computer go programs will start doing some things better
than the top players, while other things will remain behind. So they
will still lose. The things the computers do not do well will still
continue to slowly improve, helping the situation. The things
computers do better will become overwhelming, and it's only a question
of when the combined effect is enough to beat the top players.
I don't believe there are any solid barriers, as you imply. There are
just some things that are harder than others, and improvements in
those areas will come more slowly (when compared to humans).
I personally believe that the things computers do better will give
humans a harder time than the reverse. It may be that once computers
start doing a few things better, humans will have a difficult time.
In chess this manifested itself in two areas: tactics and consistency.
Computers do "consistency" better than any human will. A human will
make a silly oversight that is "uncharacteristic" of him. A computer
may also make errors, but not "uncharacteristically." A chess computer
will never miss a 3-move checkmate, for instance. This is a wonderful
strength that should not be underestimated. In tennis it is hard to
beat even a mediocre player whose only strength is that he doesn't
make errors. If the ball comes near him, it will always come back, and
you will never get a free point. It puts the pressure on you to
deliver, and you must never miss.
Because of their ability to calculate accurately, chess programs
quickly became known for their tactical ability. Even though they made
positional errors frequently, it got to the point where you still had
to work hard to beat them, and they never let down. So players then
had to actually change their style in order to win, which in turn is a
constraint on the way you play and makes you weaker.
It will be harder in GO than in Chess. The weaknesses are more glaring
and the strengths are less useful in GO, but I think it will
eventually happen.
- Don
On Sat, 2008-08-09 at 14:54 -0400, Robert Waite wrote:
I'm curious what you guys think about the scalability of Monte Carlo
with UCT. Let's say we took a cluster like the one used for the
MoGo vs. Kim game. Then let's say we made 128 of these clusters and
connected them together efficiently. Putting aside implementation and
latency issues: what kind of stones-strength increase would you
imagine?
It's a pretty arbitrary guess, but do you think one stone of
improvement... or would this alone be enough to beat a pro?
I am wondering because there could be a weakness or limit in MC with
UCT. I am only now learning about the UCT addition, but there are
vast numbers of possible games that are never visited during the
Monte Carlo simulations. The random stone simulations are pretty
random, aren't they? I am reading some of the papers on the UCT
addition, and it does seem to mark certain branches as "better" and
worth more time. Pro players may develop a counter-strategy as MoGo
is tested at higher levels of play. Perhaps there will be a need to
combine MC with UCT with heuristics or more knowledge-based play.
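From what I've read so far, the "certain branches are better and worth more time" behavior is the UCB1 rule that UCT applies at every tree node. A minimal sketch of the selection step (the move names, playout counts, and exploration constant below are made up for illustration):

```python
import math

def ucb1_select(children, exploration=1.4):
    """Pick the child maximizing win rate plus an exploration bonus.
    `children` maps move -> (wins, visits). Rarely-visited moves get
    a large bonus, so simulation time concentrates on promising
    branches without ever abandoning the others entirely."""
    parent_visits = sum(visits for _, visits in children.values())

    def ucb(stats):
        wins, visits = stats
        return wins / visits + exploration * math.sqrt(
            math.log(parent_visits) / visits)

    return max(children, key=lambda move: ucb(children[move]))

# Invented playout statistics for three candidate moves: K10 has a
# decent win rate but very few visits, so its exploration bonus
# dominates and it gets the next simulation.
stats = {"D4": (60, 100), "Q16": (30, 100), "K10": (4, 5)}
print(ucb1_select(stats))  # -> K10
```

So the tree is not sampled uniformly at random: only the playouts below the leaves are, and even those are usually biased by patterns in the stronger programs.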
Going the route of heuristics seems unpleasant, and the promise of
using a more computational method would be great. However, if MC
techniques alone have diminishing returns, the route of heuristics
might come back (or perhaps a whole new paradigm for game
algorithms).
I am still secretly on the side of human go beating the machine, but
the recent match really changed my view on the topic and really showed
the value of statistical analysis. I am just wondering what kind of
roadblocks might show up for the Monte Carlo techniques.
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/