On Aug 9, 2008, at 9:45 PM, Don Dailey wrote:
I'm curious what you guys think about the scalability of monte carlo
with UCT.
The MCTS technique appears to be extremely scalable. The theoretical
papers about it claim that it converges to perfect play in the limit.
We agree here that this is not true in practice, of course.
You can argue that random forms of search basically search based upon
mobility. The insight for that is really simple: if you control more
area, you have more possibilities, so the statistical odds of those
lines backtracking favorably to the root are bigger than from
positions where you control little area.
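That mobility bias is easy to demonstrate with a toy model (my own sketch, not taken from any actual go program): treat the board as a fixed number of cells, hand out the undecided cells by coin flips as a stand-in for random playouts, and score by majority ownership. The side that already controls more area wins far more of the random games:

```python
import random

def playout_winrate(my_cells, opp_cells, free_cells,
                    n_playouts=20000, seed=1):
    """Estimate the random-playout win rate for a side that already
    controls `my_cells` of the board versus `opp_cells` for the
    opponent, with `free_cells` still undecided. Each free cell is
    assigned by a fair coin flip; strict majority ownership wins."""
    rng = random.Random(seed)
    total = my_cells + opp_cells + free_cells
    wins = 0
    for _ in range(n_playouts):
        mine = my_cells + sum(rng.randint(0, 1) for _ in range(free_cells))
        if mine * 2 > total:
            wins += 1
    return wins / n_playouts

# The side controlling more area scores far higher under purely
# random playouts, before any "real" evaluation is involved.
ahead = playout_winrate(my_cells=12, opp_cells=8, free_cells=20)
behind = playout_winrate(my_cells=8, opp_cells=12, free_cells=20)
print(ahead, behind)
```

With these numbers the side ahead in area wins roughly three quarters of the random playouts, the side behind only a small fraction: the playout statistics are effectively an area/mobility count.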
So a very selective searcher equipped with a good mobility function
will beat it, as its worst case is better. As you correctly notice,
at a certain point, when programs get really strong in certain areas,
the randomness of the search tree will backfire, as it creates a
worst case in program-versus-program play.
There are however two problems to solve to make it happen, maybe a third:
a) making a good mobility evaluation function. Not easy in chess nor in go.
   I must admit that in chess it took me until somewhere in 2004-2005,
   so more than 11 years after starting with chess, to make one. And
   that as a good chess player who has put in a lot of time there.
   In go, where the literature is even more inaccessible to
   programmers, it will be harder still.
b) selective searching is not so easy. How many are there who can
   search ultra-selectively and still improve big-time in playing
   strength when given more CPU time? Well, other than Leela, that is...
Historic note: we had randomness in chess too. If you use just
mobility, nothing else, not even material values,
you soon end up above 2000 Elo.
Go programs are far from that yet.
So until then you will still have to deal with the busloads of guys
who claim neural networks
work very well as a replacement for search and/or evaluation.
It took until 1998 or so to find great, well-tuned parameters for
material in chess (done by Chrilly Donninger;
note that he publicly posted that the tuning didn't matter, just the
patterns themselves, an interesting statement at the time).
Now the question is: what will happen when someone finds those for
the most dominant positional components in computer go?
Note, Don, that the obvious thing in go, which a lot of humans who
are weak at go miss, is that finding the best
moves in a position, as a subset of all possible moves, is a LOT easier
than in chess.
So selective search in go is far easier to do well than in chess.
It's relatively easy to select, each time, just 30 to 40 moves
that are really relevant. If you look at it like that, then you can
easily achieve far deeper relevant selective search depths
in go than in chess, given the same hardware.
In chess you simply cannot afford not to look at the most stupid line
possible, simply because sometimes sacrificing the entire board works.
The goal is just conquering the enemy king.
In go, however, if you have a won position with some degree of
certainty, there is NEAR ZERO need to look further.
Hard pruning works far better there.
Effectively, top go programs therefore get selective search depths
similar to the search depths chess programs reach at the start of the
game, given the same hardware and time controls. This despite
chess programs getting far higher NPS.
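The arithmetic behind that depth claim can be sketched as follows (the branching factors and the node budget below are illustrative round numbers, not measurements from any engine): with a fixed node budget N and a constant branching factor b, a uniform search reaches about log(N)/log(b) plies.

```python
import math

def reachable_depth(node_budget, branching_factor):
    """Plies of uniform (full-width) search affordable within a node
    budget, assuming a constant branching factor per ply."""
    return int(math.log(node_budget) / math.log(branching_factor))

budget = 10**9  # nodes per move, purely illustrative

# All ~250 legal moves on an empty 19x19 board: only 3 plies.
print(reachable_depth(budget, 250))
# Hard-pruned to ~30 relevant candidates: 6 plies from the same budget.
print(reachable_depth(budget, 30))
```

So cutting the candidate set from a few hundred moves to a few dozen roughly doubles the reachable depth for the same node count, which is why hard pruning pays off so much in go.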
When today's top chess programs run on hardware from 1999, the quad
Xeon 500 MHz that showed up at the world championship,
at 3 minutes a move, they totally destroy the field from back then
with 100% scores. Not a single draw.
So we can argue a lot about search here. Considering the selective
search depths today's go programs reach, I would argue that
the most important improvement now is the evaluation function. That
doesn't mean it needs to be implemented in the same manner,
nor that it needs to be some sort of huge function.
Considering the go strength of the top authors in computer go, one
would argue they are better off implementing simple knowledge
and tuning it very well.
That'll kick more butt than a small improvement in search.
People who believe religiously in search depth as the absolute truth
are interesting people.
If I played today's Diep on hardware from 1999 against the Fritz of
those days, one of the
stronger programs at the time, Fritz would get 17 plies every move,
versus maybe 12 for my Diep.
Hell, I'll even give 7 plies odds when needed. Just let me turn on a
few extensions so as not to be TOO weak tactically.
It won't help the Fritz of those days. It loses big-time.
On paper, 7 plies of extra search depth are worth 700 Elo or so, so
today's Diep should stand 0% chance, just on paper.
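For the record, "on paper" here means the standard Elo expected-score formula; the roughly 100 Elo per ply behind the 700 figure is a rule of thumb, not a measured constant:

```python
def expected_score(elo_diff):
    """Standard Elo expected score for the stronger side, given a
    rating difference in its favor."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# 7 plies at ~100 Elo per ply = 700 Elo difference: the deeper
# searcher "should" score about 98% of the points.
print(round(expected_score(700), 3))  # -> 0.983
```

Which is exactly why the result of such an odds match says something: the paper prediction assumes the rating model holds, while a big evaluation gap breaks it.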
In chess the center is dominant. Real soon everyone figured out that
pieces and pawns in the center are worth more than
those outside it. In go all this is more complicated than in
chess. When today's top chess programs do not know what to do,
they just put their minor pieces and pawns in the center. Even at the
highest levels, that simply wins games.
That's how dominant such a simple knowledge rule is. There are just 4
center squares in chess; you'll realize how easy chess is,
seen from a knowledge viewpoint: material plus central placement of
the minor pieces. Because the goal is to mate a king,
search cannot abort easily based upon the above positional knowledge.
In go, termination is far easier; what's hard to discover is HOW TO
EVALUATE in a simple yet GENERIC and really effective manner.
Secondly: how to TUNE it?
Whether you do that with one of today's random searches or not won't
matter nearly as much as the improvement possible
in the evaluation function.
A 16-core Power6 node should then be more than enough to beat the
majority of all professional players.
Of course it is very well possible that a really good evaluation
function of the future requires a better
search to go with it. For now I'd say: try to find that evaluation function.
If I could put today's Diep evaluation back into 1999 Diep's search,
which sucks real major league compared to today's
searching, it still would have won any event from 2004 and earlier,
given the hardware I used at each championship,
which was never the fastest hardware except at the 2003 world
championship (so the supercomputer of 2003 I would replace
by my old dual K7, which even in 2003 was really slow compared to the
quad Xeon MP 2.8 GHz that most participants
showed up with).
Note that I would argue my evaluation as it is today won't win the
2008 world championship. I really need to do a lot in the coming
month to do well there. Most likely in 2009 I'll tell you that my
2008 evaluation function was SO WEAK that
any type of non-totally-bugged search could beat it.
So I would argue that all the discussion above about search is totally
irrelevant to the future of computer go.
Vincent
My feeling is that as you scale up in power, certain things will
improve relative to humans faster than other things, as you imply.
This happened in computer chess, and to this day computers are
inferior to human players in many ways, and yet the computers are
superior at playing the game overall.
At some point computer go programs will start doing some things better
than the top players, while other things will remain behind. So they
will still lose. The things the computers do not do well will still
continue to slowly improve, helping the situation. The things
computers do better will become overwhelming, and it's only a question
of when the combined effect is enough to beat the top players.
I don't believe there are any solid barriers, as you imply. There are
just some things that are harder than others, and improvements in
those areas will come more slowly (when compared to humans).
I personally believe that the things computers do better will give
humans a harder time than the reverse. It may be that once computers
start doing a few things better, humans will have a difficult time.
In chess this manifested itself in two areas: tactics and consistency.
Computers do "consistency" better than any human will. A human will
make a silly oversight that is "uncharacteristic" of him. A computer
may also make errors, but not "uncharacteristically." A chess computer
will never miss a 3-move checkmate, for instance. This is a wonderful
strength that should not be underestimated. In tennis it is hard to
beat even a mediocre player whose only strength is that he doesn't
make errors. If the ball comes near him, it will always come back, and
you will never get a free point. It puts the pressure on you to
deliver, and you must never miss.
Because of their ability to calculate accurately, chess programs
quickly became known for their tactical ability. Even though they made
positional errors frequently, it got to the point where you still had
to work hard to beat them, and they never let down. So players then
had to actually change their style in order to win, which in turn is a
constraint on the way you play and makes you weaker.
It will be harder in GO than in Chess. The weaknesses are more glaring
and the strengths are less useful in GO, but I think it will
eventually happen.
- Don
On Sat, 2008-08-09 at 14:54 -0400, Robert Waite wrote:
I'm curious what you guys think about the scalability of Monte Carlo
with UCT. Let's say we took a cluster like the one used for the
MoGo vs. Kim game. Then let's say we made 128 of these clusters and
connected them together efficiently. Putting aside implementation and
latency issues: what kind of stones-strength increase would you
imagine?
It's a pretty arbitrary guess, but do you think one stone of
improvement... or would this alone be enough to beat a pro?
I am wondering because there could be a weakness or limit in MC with
UCT. I am only now learning about the UCT addition, but there are
vast numbers of possible games that are never visited during the
Monte Carlo simulations. The random stone simulations are pretty
random, aren't they? I am reading some of the papers on the UCT
addition, and it does seem to mark certain branches as "better" and
worth more time. Pro players may develop a counter-strategy as MoGo
is tested at higher levels of play. Perhaps there will be a need to
combine MC with UCT with heuristics or more knowledge-based play.
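From what I've read so far, the "certain branches are better and worth more time" behavior is the UCB1 rule that UCT applies at every tree node. A minimal sketch of the selection step (the move names, playout counts, and exploration constant below are made up for illustration):

```python
import math

def ucb1_select(children, exploration=1.4):
    """Pick the child maximizing win rate plus an exploration bonus.
    `children` maps move -> (wins, visits). Rarely-visited moves get
    a large bonus, so simulation time concentrates on promising
    branches without ever abandoning the others entirely."""
    parent_visits = sum(visits for _, visits in children.values())

    def ucb(stats):
        wins, visits = stats
        return wins / visits + exploration * math.sqrt(
            math.log(parent_visits) / visits)

    return max(children, key=lambda move: ucb(children[move]))

# Invented playout statistics for three candidate moves: K10 has a
# decent win rate but very few visits, so its exploration bonus
# dominates and it gets the next simulation.
stats = {"D4": (60, 100), "Q16": (30, 100), "K10": (4, 5)}
print(ucb1_select(stats))  # -> K10
```

So the tree is not sampled uniformly at random: only the playouts below the leaves are, and even those are usually biased by patterns in the stronger programs.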
Going the route of heuristics seems unpleasant, and the promise of
using a more computational method would be great. However, if MC
techniques alone have diminishing returns, the route of heuristics
might come back (or perhaps a whole new paradigm for game
algorithms).
I am still secretly on the side of human go beating the machine, but
the recent match really changed my view on the topic and really showed
the value of statistical analysis. I am just wondering what kind of
roadblocks might show up for the Monte Carlo techniques.
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/