Re: [computer-go] Scalability problem of play-out policies

Don Dailey Wed, 23 Jan 2008 11:36:34 -0800


Joel Veness wrote:
> Hi Ivan,
>
> I like to view game tree search methods as a systematic ways to
> correct static evaluation errors. I don't find it surprising that UCT
> scales well with increasing time and space. I claim that the accuracy
> of the monte-carlo method increases as we get closer to the end of a
> game. UCT helps propagate monte-carlo evaluations from the best
> looking lines deep inside the tree. If the best line of play is deep
> enough and sampled well enough, then positions that might have caused
> problems for the monte-carlo evaluation will eventually be avoided.
> Remember that with UCT, no line of play is avoided forever, no matter
> what the monte-carlo evaluation says. Extra time and space just
> increases the chances of the UCT component correcting any errors in
> the monte-carlo evaluation, which is why I think it scales so well.
>
> I certainly think it is reasonable to suggest that certain flaws in a
> monte-carlo evaluation function could create scaling issues. This
> could happen if there ever reaches a point that a program is losing
> game after game due to the evaluation function not understanding some
> particular dynamic of the game quick enough to give the UCT search a
> chance of correcting the misconception. However, I actually don't
> think this to be the case in practice.
>   
Yes,  you can entertain any unreasonable possibility but it's a waste of
time.   People get some idea in their head that in their mind they can
see as happening,  and perhaps can even construct some corner case that
is possible or plausible, and then it can be endlessly debated but has
nothing to do with anything actually happening.


- Don
 

> I think that why the scaling results are so consistent and smooth is
> due to the imperfection that comes with using a monte-carlo evaluation
> function. We have to deal with stochastic noise, in addition to
> misconceptions due to the method. Since there are so many errors to
> correct (many of them small errors), any extra processing time
> provides ample opportunity to help the program get a better idea of
> the board situation and avoid that game losing blunder. In my opinion,
> this plays a big factor in creating these almost linear scaling
> results that we see regularly.
>
> Joel
>
>
> On Jan 23, 2008 4:16 PM, ivan dubois <[EMAIL PROTECTED]> wrote:
>   
>> Sorry, this may very well be nitpicking, but it is not nonsense.
>> I just dont like when people are making illegitimate deductions, using the 
>> fact that the same word can have two distinct meanings. Anyway all this is 
>> just nitpicking from my part, I have to admit.
>>
>> What is not nitpicking however, is wether some play-out policy (wich might 
>> appear to be working well) can compromise the scalability of UCT at some 
>> point. I read that there seemed to be a problem with Fatman's scalability. I 
>> dont know what are the latest news about this, but I think you should at 
>> least consider the possibility that it may come from some persistent 
>> missconception from the play-out policy.
>> After all, I may very well be right.
>> After thinking a little about it, I came to the conclusion that there can be 
>> a scalability problem when :
>>     There is a bias in the MC evaluation AND there is no move to be made in 
>> the UCT part that can partialy correct the missconception of the MC 
>> evaluation.
>> This happens each time there is a situation that just cannot be settled 
>> immediatly, but has to wait the end of the game.
>> Now I think I can give you a clean example : The bent four in the corner 
>> shape : http://senseis.xmp.net/?BentFourInTheCornerIsDead
>>
>> Tell me if I am wrong, but I bet nor your program nor Mogo does scale at all 
>> on this situation. That means whatever computing power you can give to it, 
>> If i can create such a situation in my game against him, I will probably 
>> win. And notice that more computing power will not help Mogo to avoid this 
>> situation because it doesnt know it to be a problem.
>> Tell me if I am wrong, but I think it is a scalability problem that puts a 
>> real limit to the strength it can reach in theory.
>>
>> But please, do not miss understand me, I am NOT saying that this is a 
>> crucial problem to MC that cannot be fixed. I am sure it CAN be fixed. I am 
>> just saying that currently MC programs do not scale infinitely, but I have 
>> no doubt that these problems will be corrected as soon as they become 
>> crucial to the playing strength.
>>
>> Ivan
>>
>>
>>  Message d'origine ----
>> De : Don Dailey <[EMAIL PROTECTED]>
>> À : computer-go <computer-go@computer-go.org>
>> Envoyé le : Mercredi, 23 Janvier 2008, 4h44mn 17s
>> Objet : Re: Re : Re : [computer-go] Is MC-UCT really scalable ... is a troll
>>
>>
>>
>> ivan dubois wrote:
>>     
>>> Thanks !
>>> I think I agree with everything you said.
>>> After some reasoning, I changed my opinion about the practical scalability 
>>> of Mogo. It does have some missconceptions about eyes, but as you said, it 
>>> can be corrected by the UCT part. I tried to find a situation where the UCT 
>>> part does not help until the situation comes to a conclusion, with no 
>>> success. So you are probably right, maybe I will give some more thought 
>>> about it, or not :)
>>>
>>> But I stay on my position that the theoritical "infinite scalability of 
>>> UCT" doesnt give any hint about whether or not it is scalable in practice. 
>>> I think you made my point clear : From a pure theoritical point of view, it 
>>> has no advantage over a simple mini-max solver. Given that a mini-max 
>>> solver is NOT considered scalable in practice, I see no reason why people 
>>> would see this property of UCT as an indication of its scalability.
>>>
>>>       
>> This is still nonsense.  UCT in actual real world "PRACTICE" responds
>> dramatically to more hardware,  how can you say it's not clear whether
>> it's scalable in practice?    What is practice to you?
>>
>> As far as mini-max, that is also scalable in the real world that I live
>> in.  Aya is an example of a reasonably good mini-max player that
>> scales.    The theoretical properties of scalability hold in actual
>> practice.
>>
>> If you mean that we will never see in our lifetimes real world hardware
>> fast enough to see it play perfectly,  everyone already knows this, it's
>> not any kind of revelation.  (However  I'm surprised at how many people
>> get hung up on this point - thinking it is somehow relevant as a proof
>> that the idea of search is totally useless.)
>>
>> The property we are interested in is that we can always expect an
>> increase of playing strength by doing one or more of the following things:
>>
>>     1. let it think longer.
>>     2. let it think faster in the same amount of time.
>>     3. continue to improve the knowledge.
>>     4. continue to improve the efficiency of the whole system
>>
>> "infinitely scalable" is a poor term since there is a finite limit.
>> The finite limit is perfect play and as I said, we don't actually expect
>> to achieve that.
>>
>> - Don
>>
>>
>>     
>>> Ivan
>>>
>>>
>>> ----- Message d'origine ----
>>> De : Weston Markham <[EMAIL PROTECTED]>
>>> À : computer-go <computer-go@computer-go.org>
>>> Envoyé le : Mercredi, 23 Janvier 2008, 0h41mn 08s
>>> Objet : Re: Re : [computer-go] Is MC-UCT really scalable ... is a troll
>>>
>>> (I typed the following up earlier today, before other people cast some
>>> doubts about what "infinite scalability" means.  Perhaps it is
>>> helpful.)
>>>
>>> I think that Alain was specifically referring to a property of UCT,
>>> whereby given that a winning line of play exists, the probability that
>>> the algorithm has discovered one such line & settled upon it
>>> approaches 1 as time approaches infinity.  I believe that this has
>>> been proven.  And no, this theoretical result does not represent any
>>> intrinsic advantage over a calculation of the minimax value.   (Someone
>>> should feel free to correct me if there is a stronger theoretical
>>> result that I am not aware of, though.)
>>>
>>> I think it is important to understand that unless you restrict UCT
>>> from exploring some correct lines of play, this "infinite scalability"
>>> will still be true.  And, it is true regardless of what moves have
>>> been preferred/pruned/whatever within the MC playouts.  (Or whatever
>>> one does at the leaf nodes; it need not be MC at all, as long as the
>>> UCT tree eventually gets expanded as the node is revisited.)
>>>
>>> This is not to say that this particular property of UCT is sufficient
>>> in order for us to claim that UCT is scalable in practice.  My point
>>> is rather that noone is claiming "infinite scalability of MC", but
>>> rather scalability of UCT.  When you combine that with the fact that
>>> UCT has been very successful at 9x9 go, (and scalable across a wide
>>> range of time limits) it seems to be reasonable to expect more of the
>>> same in 19x19.  (Which is what concerned the original poster.)  Also,
>>> one should expect that the UCT portion of MC-UCT will tend to
>>> eventually "fix up" any systematic errors that are made by the MC
>>> playouts.
>>>
>>> I have one other point I'd like to make, in regard to "light" versus
>>> "heavy" playouts:  Even "light" playouts do not actually use a uniform
>>> distribution, due to the quasi-eye rule that is generally used.  I
>>> think that there is every reason to expect that other "nonuniform"
>>> playout policies further outperform "light" playouts in every
>>> practical way.  Granted, it is possible to introduce "severe
>>> misconceptions" when one incorporates knowledge into one's playouts.
>>> But even in that case, one can still fall back on the fact that UCT is
>>> cleaning up after one's mistakes:  the eventual behavior, given enough
>>> time, is still perfect play.  (And of course it is not as if people
>>> blindly adjust the monte carlo policies without checking the revised
>>> versions for "severe misconceptions".)
>>>
>>> Weston
>>>
>>> On 1/22/08, ivan dubois <[EMAIL PROTECTED]> wrote:
>>>
>>>       
>>>> I think there is some missconception about this "infinite scalability of 
>>>> MC" stuff.
>>>> Theory is only usefull when the model it is based on shares important 
>>>> aspects with reality.
>>>> In theory, 19*19 Go is a finite game, wich can be analysed in a finite 
>>>> amount of time. So ANY algorithm that solves the game at some point is, in 
>>>> theory, infinitely scalable. For example, the folowing algorithm is 
>>>> infinitely scalable :
>>>>    Analyse the complete mini-max tree of the game. If enough time to 
>>>> finish, returns the correct move, if not, return a random move.
>>>>
>>>> Now, is this algorithm really scalable, in practive ? Of course not.
>>>>
>>>> I support the idea that MC is infinitely scalable (in the reality) ONLY if 
>>>> the play-out part does not have severe misconceptions. So i think that 
>>>> currently, only MC based on uniform playouts is infinitely scalable.
>>>> It is well know that even Mogo has troubles with big eyes (he thinks a big 
>>>> eye gives life, wich is false). This problem can not be solved with more 
>>>> computing power (excep absurdly big, of course you can always mini-max to 
>>>> the end).
>>>>
>>>> ----- Message d'origine ----
>>>> De : Alain Baeckeroot <[EMAIL PROTECTED]>
>>>> À : computer-go <computer-go@computer-go.org>
>>>> Envoyé le : Mardi, 22 Janvier 2008, 22h13mn 26s
>>>> Objet : Re: [computer-go] Is MC-UCT really scalable ... is a troll
>>>>
>>>> Le mardi 22 janvier 2008, David Fotland a écrit :
>>>>
>>>>         
>>>>> The UCT-MC programs do particularly well against traditional programs
>>>>> because they expose the brittleness inherent in the pattern databases they
>>>>> use.  Strong humans are not so easily beaten by playing unconventional and
>>>>> somewhat inferior moves.
>>>>>
>>>>>
>>>>>           
>>>> I think Remi posted a game of CrazyStone on 19x19 commented by one pro
>>>> who said "this move is 2 dan level".
>>>> So far no go program got such analyse, and the also the huge novelty
>>>> provided by MC/UCT programs is they have a _real_ strenght with their
>>>> own style, like humans:
>>>> play solid when winning, and play hamete (tricky moves) when losing.
>>>>
>>>> On kgs MC programs played hundreds (if not thousands) games against humans,
>>>> and no doubt they fully merit their rank, and no doubt they are improving.
>>>>
>>>> Infinite scalability is a theoricaly proved property, so please
>>>> don't feed the troll :-)
>>>>
>>>> Alain
>>>>
>>>> _______________________________________________
>>>> computer-go mailing list
>>>> computer-go@computer-go.org
>>>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>>>
>>>>
>>>>      
>>>> _____________________________________________________________________________
>>>> Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! 
>>>> Mail http://mail.yahoo.fr
>>>> _______________________________________________
>>>> computer-go mailing list
>>>> computer-go@computer-go.org
>>>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>>>
>>>>
>>>>         
>>> _______________________________________________
>>> computer-go mailing list
>>> computer-go@computer-go.org
>>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>>
>>>
>>>      
>>> _____________________________________________________________________________
>>> Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! 
>>> Mail http://mail.yahoo.fr
>>> _______________________________________________
>>> computer-go mailing list
>>> computer-go@computer-go.org
>>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>>
>>>
>>>       
>> _______________________________________________
>> computer-go mailing list
>> computer-go@computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>
>>
>>       
>> _____________________________________________________________________________
>> Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! Mail 
>> http://mail.yahoo.fr
>> _______________________________________________
>> computer-go mailing list
>> computer-go@computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>
>>     
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
>   
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Scalability problem of play-out policies

Reply via email to