From: Petri Pitkanen <[email protected]>


2011/6/17 Jean-loup Gailly <[email protected]>

> I have done precisely this. The reports of scalability death are greatly
> exaggerated, as you can see from the attached graph. To avoid self-play
> benchmarks, which are misleading, I tested Pachi against Fuego 1.1.
>
> Jean-loup
Well, this gives a biased result, a wrong sample so to speak. Fuego will not
create complex semeais or hard-to-read ishi-no-shita and nakade shapes; in
other words, it is an opponent that puts no pressure on the known problem
areas. So you have shown that against an opponent who does not play like a
human, you do scale. But as you climb the ladder of human players, these
small issues tend to pop up more often.

Measuring scaling against strong humans is obviously a bit hard. Just about
the only way is to let machines with different CPU counts play on KGS.

Yes, I do believe that Pachi/Fuego will play better given more time. But they
would scale better if a better algorithm were in place and part of that extra
CPU were spent on it. Just what exactly to use that CPU for is a bit murky.

So I don't think we will get to 6 dan EGF (8-9 dan KGS?) with current
programs just by adding memory and CPU.

Petri

I believe Petri is correct. An automatic tournament amongst a few similar
MCTS programs, which tend to have similar weak points, is not as useful as
playing against a strong, adaptive human community. The humans will discover
and exploit entire categories of bugs - such as failure to understand nakade,
insensitivity to capturing races, failure to value a big eye in a capturing
race, weak borders of central moyos, and poor yose skills - all of which may
be shared by both programs in any given match. And even when the programs
don't share the same weakness, it is rare for one program to exploit such a
weakness in another.


Humans, on the other hand, tend to observe and adapt. When a weakness becomes 
known, it will be exploited. 


That said, I can donate four cores for a few weeks or months for a study, 
however it is organized. 


I'd like to suggest looking into the costs of setting up a farm on Rackspace or 
Amazon, and a "pay me" button so that people could toss virtual coins into the 
meter to keep the experiment running. 


Another possibility: by now there must be a large database of known
situations where strong programs managed to snatch defeat from the jaws of
certain victory. (If not, there should be; examples certainly abound.) How
about a scalability study which asks "how many playouts must program X use
to handle situation Y correctly, for a large set of Y?" Ideally, such a test
should follow through: if a bad move is made, it should be punished; if the
first move is correct, the test should respond with one of several replies
and determine whether the program continues to play correctly.
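To make that concrete, here is a minimal sketch in Python of such a harness,
driving an engine over GTP. The test positions, expected answers, and the
--playouts flag are made-up placeholders (fixing the playout budget is
engine-specific; Pachi, for instance, does it differently), and loadsgf is a
widely supported GTP extension rather than a core command.

import subprocess

# Sketch of the proposed scalability study: for each known-failure
# position, find the smallest playout budget at which the engine plays
# the expected move. Test cases and flags below are hypothetical.

TEST_CASES = [
    # (SGF file with the position, move the program must find)
    ("semeai-01.sgf", "C3"),
    ("nakade-07.sgf", "E5"),
]

BUDGETS = [1000, 10000, 100000, 1000000]  # playouts per move

def gtp(proc, command):
    """Send one GTP command; return the first line of the response."""
    proc.stdin.write(command + "\n")
    proc.stdin.flush()
    lines = []
    while True:
        line = proc.stdout.readline()
        if line.strip() == "":      # a GTP response ends with a blank line
            break
        lines.append(line.strip())
    return lines[0].lstrip("=? ") if lines else ""

def minimal_budget(sgf, expected_move):
    """Smallest playout budget at which the engine finds expected_move."""
    for budget in BUDGETS:
        # "--playouts" is a hypothetical flag; substitute whatever your
        # engine uses to fix the number of simulations per move.
        proc = subprocess.Popen(["pachi", "--playouts", str(budget)],
                                stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE, text=True)
        gtp(proc, "loadsgf " + sgf)
        move = gtp(proc, "genmove b")   # assumes black to move in each test
        gtp(proc, "quit")
        proc.wait()
        if move.upper() == expected_move:
            # A full harness would now play one of several scripted replies
            # and check that the engine keeps handling the position, per
            # the "follow through" idea above.
            return budget
    return None

if __name__ == "__main__":
    for sgf, answer in TEST_CASES:
        n = minimal_budget(sgf, answer)
        if n is None:
            print(sgf + ": not solved within " + str(BUDGETS[-1]) + " playouts")
        else:
            print(sgf + ": solved with " + str(n) + " playouts")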
