Re: [computer-go] A problem with selectivity and AMAF bias
magnus,

I hate to ask the obvious, but how much time does each simulation take? If 100 simulations would be enough information to get it out of its rut, why not just check every legal move for (say) 100 simulations before doing anything else?

On another note, I think it's cool that you have a board situation that exhibits such borderline behavior (i.e. it takes relatively few simulations to fix the problem, but the code as it stood would never find the problem on its own).

steve.

On Fri, Apr 11, 2008 at 4:25 AM, Magnus Persson [EMAIL PROTECTED] wrote:
> Quoting Michael Williams [EMAIL PROTECTED]:
>> Magnus Persson wrote:
>>> Yesterday I noticed an odd phenomenon in Valkyria which was caused by high selectivity and AMAF. Since there is nothing in the AMAF scores to indicate that move 2 is any good, it is never searched; the search is so selective that the bounds will not grow large enough to compensate for the bias in a reasonable time.
>>
>> Isn't this fixed by never straying from a move in the tree until it loses, and then trying an untried move? Or something like that. It wasn't my idea and I don't remember the details, but it seems like it fixes what you describe.
>
> No, it does not, because the AMAF score for move 2 is strictly lower than the evaluation for move 1 and all other moves for some reason. It will try other moves deeper in the tree instead, and the position is sufficiently complex to generate a very large tree before it gives up on the move at the root level.
>
> The thing is that if it searches move 2 for at least 100 simulations, it will discover it is a good move. But because the AMAF score is so low, and all other moves are indeed losing moves, it sticks to move 1, because that at least makes it into a fight, although at bad odds.
> Otherwise I am quite happy with the current implementation, since it is strong in testing. This only happens when there are two hot candidates, the first one is searched first because of a limitation in move ordering, and a particularly strong bias works against the second best. The annoying thing is that it can suddenly lose a game it was winning.
>
> But I found a better fix. I also tried to enter my pattern priorities as the priors of the AMAF scores, and I now strongly believe that this is better in general, not just for this particular situation. In the position I wrote about yesterday, this version gets the right move almost immediately. I tested it as Valkyria 3.2.0 overnight on CGOS and it seems to be just as strong as the previous version, and I can still tune its parameters.
>
> This means that the search of a position will be guided, in order, by:
>
> 1) My pattern priorities, disguised as priors of the AMAF scores.
> 2) Then the AMAF scores take over.
> 3) If a move is searched, the true winrates for those moves are used, and there is no bias from the pattern priorities except very weakly through the AMAF scores.
>
> -Magnus
>
> --
> Magnus Persson
> Berlin, Germany

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
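The three-stage ordering Magnus describes (pattern priors seeded into AMAF, AMAF dominating until a move is searched, true winrates taking over) can be sketched as follows. This is a minimal illustration, not Valkyria's code: the prior weight, the RAVE-style blending formula, and the constants are all assumptions, since the thread does not give the actual formulas.

```python
# Sketch: seeding AMAF statistics with a pattern-based prior, and letting
# true winrates take over as real visits accumulate. EQUIV_PRIOR and
# RAVE_EQUIV are made-up constants; pattern_priority is assumed to be a
# winrate-like value in [0, 1].

EQUIV_PRIOR = 20   # virtual AMAF playouts granted by the pattern prior (assumed)
RAVE_EQUIV = 1000  # controls how fast true winrates override AMAF (assumed)

class MoveStats:
    def __init__(self, pattern_priority):
        # 1) The pattern priority enters "disguised" as an AMAF prior:
        #    pretend we already saw EQUIV_PRIOR playouts at that winrate.
        self.amaf_wins = pattern_priority * EQUIV_PRIOR
        self.amaf_visits = EQUIV_PRIOR
        self.wins = 0.0
        self.visits = 0

    def value(self):
        # 2) The AMAF mean dominates while the move is unsearched;
        # 3) as real visits accumulate, the true winrate takes over.
        amaf_mean = self.amaf_wins / self.amaf_visits
        if self.visits == 0:
            return amaf_mean
        beta = self.amaf_visits / (self.amaf_visits + self.visits +
                                   self.amaf_visits * self.visits / RAVE_EQUIV)
        return beta * amaf_mean + (1 - beta) * self.wins / self.visits
```

With this kind of blending, a move whose prior is good gets selected early even if raw AMAF statistics are against it, which matches the behavior Magnus reports: the right move is found almost immediately.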
Re: [computer-go] A problem with selectivity and AMAF bias
Quoting steve uurtamo [EMAIL PROTECTED]:
> magnus,
>
> I hate to ask the obvious, but how much time does each simulation take? If 100 simulations would be enough information to get it out of its rut, why not just check every legal move for (say) 100 simulations before doing anything else?

My program is doing perhaps 2000 simulations per second in the endgame. Your suggestion would kill performance completely, since it would apply to all nodes in the tree. If there are 16 candidate moves in a node, it would spend 1600 simulations just to search the root, and one would get only 2-3 ply with normal thinking times; this almost reverts to simple MC evaluation without tree search. This is how my first MC program worked before UCT: it did alpha-beta search, evaluating each move with a constant number of simulations unless a cut occurred early.

Here is an example of how wonderfully selective the search is right now. After 500 simulations it has switched to the correct move 2, which is searched 292 times, whereas move 1 was searched 184 times. With my first fix the switch would have taken at least 2 seconds. The program was allowed to search for 15 seconds but stopped after 2.8 seconds because of the superiority of the best move.

Here is what the search tree looks like:

Root: move 1 is searched 6163 times (WR=67%), move 2 575 times (WR=48%); the 16 moves that were not pruned at move generation were searched 2-14 times each.
1 ply: 3568/1149/785/272/181/156/... and then 1-7 times
2 ply: 2432/641/297/59/48/39/16/... and then 1-3 times
3 ply: 1016/799/501/98/... and then 1-2 times
4 ply: 766/166/25/16/... and then 1-5 times

For each ply I listed the number of simulations spent on all candidate moves that got any attention. The principal variation is longer than this, but my program only prints those positions that have been searched above some threshold. Also the move ordering along this principal variation is very strong. As you can see, the tree, which is something like 16x15x14x13x12x...
has been reduced to something like 2x3x3x2, which is highly selective, and the quality of the moves is still excellent. This is of course only possible in the endgame; in the opening the width of the search is only reduced from about 20-25 candidate moves to 3-7 moves.

I probably still have some strong biases in AMAF in my program, and I guess there will be more positions with bad behavior, but for now this will do for me: as long as the search does not lock into a single move, the bias is probably mixed up and cannot become that extreme. But any ideas on how to remove bad bias from AMAF are welcome. The only thing I do right now is to use only the first 70% of the moves played in each simulation for AMAF, following the idea that moves at the end of a simulation are probably just noise anyway.

-Magnus

--
Magnus Persson
Berlin, Germany
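The 70% cutoff Magnus mentions can be sketched as an AMAF update that simply ignores the tail of each playout. The data structures here are illustrative (a per-node dictionary keyed by colored move), not Valkyria's actual representation.

```python
# Sketch: AMAF update that only credits the first 70% of the moves in a
# playout, on the idea that late moves are mostly noise.

AMAF_FRACTION = 0.7

def update_amaf(stats, playout_moves, winner):
    """stats: {(color, point): [wins, visits]} AMAF table for one node.
    playout_moves: [(color, point), ...] in the order they were played.
    winner: the color that won the playout."""
    cutoff = int(len(playout_moves) * AMAF_FRACTION)
    seen = set()
    for color, point in playout_moves[:cutoff]:
        # All-Moves-As-First: each (color, point) pair counts once,
        # at its first occurrence in the playout.
        if (color, point) in seen:
            continue
        seen.add((color, point))
        entry = stats.setdefault((color, point), [0, 0])
        entry[1] += 1
        if color == winner:
            entry[0] += 1
```

A harder cutoff (smaller AMAF_FRACTION) trades away AMAF sample size for less end-of-playout noise; the thread gives no evidence for what the best fraction is beyond Magnus's choice of 70%.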
[computer-go] Scalability study request
Hi all,

The result of the scalability study at http://cgos.boardspace.net/study/13/index.html looks a lot like two parallel lines over the entire range, which I find very surprising, since I'd have expected at least some differences caused by the different playout strategies.

I am now wondering if scalability could be unaffected by the playouts (which would just add a constant offset) and depend only on the UCT/search implementation. From the publications of the MoGo team it seems likely that the programs are very similar there.

I am able to provide a Leela version that is identical to the one currently used in the study, but with light (uniform random) playouts (and about 3 times as fast). I think it would be very interesting to see the behaviour of this; IMHO it would provide deeper understanding of how the MC/UCT combination works.

The critical question is whether we would get another parallel line, indicating that the search is the key to further progress, or a line with a different slope, indicating that the playouts are the key. Or maybe the two are more strongly linked and both contribute. Either way, I believe more data would be useful.

--
GCP
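The "parallel lines vs. different slopes" question can be made concrete by fitting Elo = a * log2(simulations) + b for each program and comparing the slopes a. The sketch below uses made-up illustrative numbers, not results from the actual study.

```python
# Sketch: fit elo = a*log2(sims) + b per program by least squares and
# compare slopes. Identical slopes with different intercepts would mean
# "parallel lines", i.e. the playouts contribute only a constant offset.

import math

def fit_elo_per_doubling(points):
    """Least-squares fit of elo = a*log2(sims) + b; returns (a, b)."""
    xs = [math.log2(s) for s, _ in points]
    ys = [e for _, e in points]
    n = len(points)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Hypothetical doubling series for two programs: (simulations, Elo).
prog_a = [(1024, 1500), (2048, 1590), (4096, 1680), (8192, 1770)]
prog_b = [(1024, 1400), (2048, 1490), (4096, 1580), (8192, 1670)]

slope_a, _ = fit_elo_per_doubling(prog_a)
slope_b, _ = fit_elo_per_doubling(prog_b)
```

In this toy data both programs gain 90 Elo per doubling, so the lines are parallel; a light-playout variant that produced a genuinely different slope would argue that playouts, not search, dominate scaling.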
Re: [computer-go] Scalability study request
> I am now wondering if scalability could be unaffected by playouts (just adding a constant offset) and only depend on the UCT/search implementation. From the publications of the MoGo team it seems likely that the programs are very similar there.

Leela and MoGo are probably quite similar. On the other hand, we know that some playouts scale differently than others:
- the nakade patch on the playouts does not change anything for small numbers of simulations per move;
- it is very efficient for high numbers of simulations per move.

(By the way, this implies that optimizing parameters at small numbers of simulations is perhaps not always a good idea.)
Re: [computer-go] Scalability study request
If we start up another scalability test, I'd be delighted to offer up a few computer cores. It would be real nice to have not only the light-playout variant of Leela, but perhaps the nakade-patch version of MoGo and maybe even some third program. (If wishes were horses...)

Terry McIntyre <[EMAIL PROTECTED]>

"Wherever is found what is called a paternal government, there is found state education. It has been discovered that the best way to insure implicit obedience is to commence tyranny in the nursery." -- Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]
Re: [computer-go] Scalability study request
Olivier Teytaud wrote:
>> I am now wondering if scalability could be unaffected by playouts (just adding a constant offset) and only depend on the UCT/search implementation.
>
> Leela and mogo are probably quite similar. On the other hand, we know that some playouts have different scalings than others:
> - the nakade patch on the playouts does not change anything for small numbers of simulations per move
> - it is very efficient for high numbers of simulations per move.
> (By the way, this implies that optimizing parameters on small numbers of simulations is perhaps not always a good idea.)

Is the patch in some way parameterized by the number of simulations?

--
GCP
Re: [computer-go] Scalability study request
> Is the patch in some way parameterized by the number of simulations?

No. Perhaps it should be :-)
Re: [computer-go] Scalability study request
--- Olivier Teytaud [EMAIL PROTECTED] wrote:
>> light-playout variant of leela, but perhaps the nakade-patch version of mogo and maybe even some third
>
> no problem for the nakade-patch version of mogo, but results are only known in 9x9, no idea for 13x13. Maybe it is better, maybe it is worse :-)

Good to find out, no?

Terry McIntyre <[EMAIL PROTECTED]>
Re: [computer-go] Scalability study request
> light-playout variant of leela, but perhaps the nakade-patch version of mogo and maybe even some third program

No problem for the nakade-patch version of MoGo, but results are only known in 9x9; no idea for 13x13. Maybe it is better, maybe it is worse :-)
Re: [computer-go] Scalability study request
> Good to find out, no?

We have validated that:
- it is good in 9x9;
- it is bad in 19x19 (unless perhaps for very large numbers of simulations).

We did not have a look at 13x13.
Re: [computer-go] Scalability study request
Olivier Teytaud wrote:
> no problem for the nakade-patch version of mogo, but results are only known in 9x9, no idea for 13x13. Maybe it is better, maybe it is worse :-)

At 9x9 you already see diminishing returns in the previous study, perhaps because you search so deep without the UCT exploration term. So in that case I'm not surprised that improving the playouts changes the slope.

It's not so clear to me what happens in the region where the search itself is still scaling well (as is the case in the 13x13 study range). My playouts should be quite different from yours, so the parallel lines in the study were very surprising to me.

--
GCP
Re: [computer-go] Scalability study request
If we can get some consensus on what to test, we can do more. Or we can add one program version to the current study. Any ideas? (Or we could do a 19x19 study.)

- Don

terry mcintyre wrote:
> If we start up another scalability test, I'd be delighted to offer up a few computer cores. It would be real nice to not only have the light-playout variant of leela, but perhaps the nakade-patch version of mogo and maybe even some third program. ( if wishes were horses ... )
Re: [computer-go] Scalability study request
Gian-Carlo,

We could probably add this new version to the mix and extend the study. But what kind of data has your own testing produced? Do you have an indication that it is roughly as strong at the same basic time setting (because of its being 3x faster or so)?

Even if it isn't, it would still be interesting to see if the line is parallel. It might indicate, for instance, that some simplified hardware implementation of playouts could be competitive.

- Don

Gian-Carlo Pascutto wrote:
> Hi all, the result of the scalability study at http://cgos.boardspace.net/study/13/index.html seems to look a lot like 2 parallel lines over the entire range, which I find very surprising, since I'd have expected at least some differences caused by different playout strategies. [...]