Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Hi, I really put all my reading skills into case1.sgf of two_safe_groups. The test file says Black to move, and I really cannot find an answer for W to win against B J7 (and oakfoam cannot either :). (By the way, the sgf says W to move.) Could anybody please give me a hint? Detlef (In reply to Aja Huang, Wednesday, 20.03.2013, 12:39.) ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Are you looking at the initial state or the state after W J8 (after which B J7 is self-atari, and W lives)? Perhaps just show the position here to make sure we are talking about the same one. (In reply to ds d...@physik.de, Tue, Jun 25, 2013 at 4:01 PM.)
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
You are right, I did not see that I had missed the W move. Sorry for the noise :( Detlef (In reply to Erik van der Werf, Tuesday, 25.06.2013, 16:21 +0200.)
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Interesting that Aya and ManyFaces scored the same. (In reply to Hiroshi Yamashita, Tuesday, March 26, 2013 4:28 AM.)
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Hi Aja,

Thanks for posting this result. It seems seki is easy for MC, but semeai and life and death are big problems. Almost no program understands them except Zen. Zen's result is awesome. I think it is one reason Zen is 5d or 6d and the others are 2d or 3d.

About the test, this post is helpful:
A Regression test set for exploring some limitations of current MCTS programs in Go
http://www.mail-archive.com/computer-go@dvandva.org/msg04954.html

Seki, 128k playouts
  Aya           32/33    96%
  Crazy Stone   29/33    87%
  GNU Go L10    29/33    87%
  GNU Go MC     24/33    72%
  Gomorra       26/33    78%
  Many Faces    32/33    96%
  Pachi         26/33    78%
  StoneGrid     31/33    93%
  Steenvreter   33/33   100%
  Fuego         17/33    51%
  Zen           33/33   100%

two_safe_groups_0.3, 128k playouts (W's two groups are alive.)
  Aya            2/15    13%
  Gomorra        0/15     0%
  Many Faces     2/15    13%
  Pachi          0/15     0%
  Steenvreter    4/15    26%
  Fuego          0/15     0%
  Zen           13/15    86%

My anti-semeai version of Aya gets
  Aya            7/15    46%  (anti-semeai)
  Aya            1/15     6%  (normal)

But I could not get a good result on KGS or in self-play from anti-semeai. Its strength is almost the same. Maybe side effects?

My anti-semeai strategy is:
1. Each node searches its first 100 playouts. Then
2. Semeai analysis runs.
3. Recognize two adjacent groups that have similar living percentages, like 25% and 28%.
4. Count their true liberties (for example, when one has 2 libs and one eye, 1 shared lib, 5 nakade in the corner including 2 dead stones, etc.).
5. In a playout, if a player reduces his own true libs, undo the move and play a move that kills the other group instead.

Regards,
Hiroshi Yamashita

(In reply to Aja Huang, Wednesday, March 20, 2013 9:39 PM.)
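Step 3 of the strategy above could be sketched roughly as follows. This is only an illustration and not Aya's code: the `groups` and `adjacent` data structures, the function name, the restriction to opposite-colored groups, and the 5-point `max_gap` threshold are all assumptions made for the example.

```python
from itertools import combinations


def semeai_candidates(groups, adjacent, max_gap=0.05):
    """Return pairs of adjacent, opposite-colored groups with similar survival rates.

    groups:   dict mapping a group id to (color, survival_rate), where the
              rate is the fraction of the first playouts in which it lived.
    adjacent: set of frozenset({id_a, id_b}) for groups that touch each other.
    max_gap:  hypothetical threshold for "similar" living percentages.
    """
    pairs = []
    for a, b in combinations(sorted(groups), 2):
        color_a, rate_a = groups[a]
        color_b, rate_b = groups[b]
        if color_a == color_b:
            continue  # a semeai is a race between opposing groups
        if frozenset((a, b)) not in adjacent:
            continue
        if abs(rate_a - rate_b) <= max_gap:
            pairs.append((a, b))
    return pairs


# Hypothetical statistics: 25% vs 28% is flagged, 25% vs 90% is not.
example_groups = {"b1": ("B", 0.25), "w1": ("W", 0.28), "w2": ("W", 0.90)}
example_adjacent = {frozenset(("b1", "w1")), frozenset(("b1", "w2"))}
print(semeai_candidates(example_groups, example_adjacent))
```

Only the flagged pairs would then go on to the liberty counting of step 4.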
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
2013/3/26 Hiroshi Yamashita y...@bd.mbn.or.jp:

My anti-semeai version Aya gets Aya 7/15 46% (anti-semeai) Aya 1/15 6% (normal) But I could not get good result on KGS and selfplay from anti-semeai. Its strength is almost same. Maybe side-effects?

One possibility is that your anti-semeai strategy in some situations helps the weaker side in a semeai. For example, suppose B wins a semeai by 1 liberty, and the living percentages of B and W are 80% and 20%. Your anti-semeai strategy might bias the percentages to 70% and 30%. Although both B and W become *stronger* in semeai under your new rules, the weaker side (W) gains more, so the *balance* is broken. To handle semeais in playouts, we might need to introduce on-line learning rather than relying on static rules.

Aja
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
The attached example shows that a *good rule* might break the balance of playouts and produce a worse evaluation. Suppose we add a new playout rule that forbids B's self-atari at D1. This rule makes sense, since D1 is a completely meaningless suicide in terms of Go knowledge. But, in fact, this rule might break the balance of the playouts, in the sense that B now has a higher probability of living in seki (if W doesn't capture at D1). And B's *live by seki* is a *wrong* result here. The correct result of the playouts should be that W kills B with 100% probability. So, along with the rule prohibiting B's self-atari at D1, there should be a new rule that makes W capture at D1. Then the playouts can be balanced toward the correct evaluation. Rules for semeai are usually much more complicated and might break the balance in this way.

Aja

Attachment: PlayoutBalance.sgf
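The balance argument can be illustrated with a toy calculation. This is a deliberately crude model and not the position in PlayoutBalance.sgf: it assumes a playout has a fixed number of rounds, that in each round B and W independently play D1 with some fixed probability, and that any play at D1 by either side ends with W killing B, while no play at D1 ends in seki. All probabilities and names below are invented for illustration.

```python
def seki_probability(p_black_d1, p_white_d1, rounds):
    """Chance that neither side ever plays D1, so the toy playout ends in seki.

    p_black_d1: per-round probability that B plays the self-atari at D1.
    p_white_d1: per-round probability that W captures at D1.
    rounds:     number of rounds in the toy playout.
    """
    return ((1.0 - p_black_d1) * (1.0 - p_white_d1)) ** rounds


# No extra rules: both sides stumble onto D1 now and then.
baseline = seki_probability(0.2, 0.2, 5)
# Only the "good" rule: B never plays the self-atari.
only_forbid_black = seki_probability(0.0, 0.2, 5)
# The good rule plus its counterpart: W always captures at D1.
forbid_and_force = seki_probability(0.0, 1.0, 5)

print(baseline, only_forbid_black, forbid_and_force)
```

In this model the wrong result (seki) becomes more frequent when only B's self-atari is forbidden, and disappears once W is also made to capture, which mirrors the point about adding the two rules together.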
[Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Dear all, If you are interested, you can download the newest version of our regression test set (seki and two-safe-groups) at http://webdocs.cs.ualberta.ca/~shihchie/seki-and-two-safe-groups-regression-test.zip or in Fuego svn http://fuego.svn.sourceforge.net/viewvc/fuego/trunk/regression/name which contains the results of all participating programs including Crazy Stone, Zen, Steenvreter, pachi, ManyFaces, Gomorra and Fuego, etc. Kind regards, Aja
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Hi Aja,

Thanks for the interesting seki problems. Aya's results are at
http://www.yss-aya.com/g_seki_moves_1k.html
http://www.yss-aya.com/g_seki_moves_2k.html
http://www.yss-aya.com/g_seki_moves_4k.html
http://www.yss-aya.com/g_seki_moves_8k.html
http://www.yss-aya.com/g_seki_moves_16k.html
http://www.yss-aya.com/g_seki_moves_32k.html
http://www.yss-aya.com/g_seki_moves_64k.html
http://www.yss-aya.com/g_seki_moves_128k.html

I used the latest case11.sgf.

Regards,
Hiroshi Yamashita

(In reply to Aja Huang, Saturday, May 19, 2012 7:14 AM.)
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Dear all,

If you are interested, you can download our latest regression test set at
http://webdocs.cs.ualberta.ca/~shihchie/seki-and-two-safe-groups-regression-test.zip
which was updated with
1. The newest, bug-free gogui-adapter.jar.
2. A fixed case11.sgf in the seki test set.
3. A genmove version of the test files, prefixed by g_.

We appreciate that quite a few authors are interested in participating in our test. Thanks to Erik, Yamato and Remi for helping us check the test set and for pointing out errors in it. We will release the final, bug-free version as soon as possible.

Best regards,
Aja
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Now with the correct e-mail address.

On 17 May 2012, at 16:43, Rémi Coulom wrote:

I took a closer look at the games. 19 is hanezeki: http://senseis.xmp.net/?Hanezeki I don't worry too much about that. Did this ever occur in a real game?

I would recommend using non-integer komi for your tests, because the current tests measure the ability of the program to deal with jigo at the same time as they test seki. Dealing with jigo in the search is not an easy job: it is much more difficult to get a consistent search, with proved convergence to optimal play, when the outcome of the game is not binary. Completely greedy search will solve any position with non-integer komi, but it is likely to fail with integer komi (i.e., get stuck on jigo when a stronger move can win but has a low evaluation at the beginning of the search).

Crazy Stone evaluates hanezeki correctly if komi is set to 7.5 instead of 7.0.

Sorry, that should be 6.5. With 6.5, Crazy Stone still fails. So hanezeki is still difficult.

Rémi

case11 is strange. In the variation contained in the sgf, W loses by two points. Aja, are you sure case11 is correct?

Rémi
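The point about integer komi can be made concrete with a small scoring sketch. It assumes whole-number scores for both sides and an arbitrary choice of 0.5 as the playout value of a jigo; neither the function name nor that convention comes from any of the programs discussed here.

```python
def playout_outcome(black_score, white_score, komi):
    """Result from Black's point of view: 1.0 win, 0.0 loss, 0.5 jigo.

    The 0.5 value for a drawn game is an assumption made for this example.
    """
    margin = black_score - white_score - komi
    if margin > 0:
        return 1.0
    if margin < 0:
        return 0.0
    return 0.5


# A 9x9 position where Black leads by exactly 7 points on the board.
print(playout_outcome(44, 37, 7.0))  # three-valued outcome: jigo is possible
print(playout_outcome(44, 37, 6.5))  # binary outcome: Black wins
print(playout_outcome(44, 37, 7.5))  # binary outcome: Black loses
```

With whole-number scores and a komi ending in .5, the margin can never be zero, so every playout is a win or a loss and the search only has to deal with a binary outcome.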
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Hi Olivier,

Yes, that's our plan. We would very much appreciate it if you could participate in our regression test and contribute Mogo's results. It will be interesting to see Mogo's performance on these test cases with large numbers of simulations like 1M, 2M, 4M or even 32M on a mega cluster or strong machine. The version of Mogo I ran on the test was downloaded from http://www.lri.fr/~teytaud/mogor It's probably not a current version, and I couldn't figure out how to get Mogo's evaluation of a position.

Best regards,
Aja

2012/5/17 Olivier Teytaud olivier.teyt...@lri.fr:

If you run the tests twice, do you get nearly the same results? Aja: will you publish results with varying numbers of simulations for MC bots?

Olivier
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Hi Rémi,

Yes, you are right. Case11 is not correct. I have fixed it. Case19 is a hanezeki that might never occur in real games. The purpose of this research is to explore some limitations of current MC Go programs, so Martin asked me to design the most difficult seki cases on earth. Then I just did it. As for komi 7.0, thanks for your suggestion. We will discuss it and announce our decision.

Best regards,
Aja

(In reply to Rémi Coulom remi.cou...@free.fr, 2012/5/17.)
[Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Hi Aja,

The testing program codes different problems in the same sgf file, like in:

    loadsgf sgf/seki/case1.sgf 4
    14 genmove w
    #? [B2|J3]

    loadsgf sgf/seki/case1.sgf 6
    16 genmove w
    #? [B2]

If you ignore the move numbers, J3 is not even a legal move. Unfortunately, move numbers hardly mean anything since the sgf file is not a game, but a list of stones. Each program will translate that in its own way and get different move numbers, possibly alternating B,W,B,W.. or whatever. I also don't know what the numbers 4 and 6 at the end of the loadsgf command mean.

Can you please provide a list of the last moves played before the genmove so we can verify that we are all analyzing the same position? Ideally, I would prefer a simple sgf file without tricks representing the tested position, but assuming that this position is reachable by just removing the last move a number of times, I can produce the SGF file myself.

I would be happy to participate in your test.

Jacques.
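For readers unfamiliar with the test-file lines quoted above, the following sketch shows one way a harness might compare an engine's reply with an expectation line such as `#? [B2|J3]`. It assumes the bracketed text is simply a `|`-separated list of acceptable vertices and that the engine replies in the usual GTP form (`=`, an optional command id, then the move). The real test scripts may treat the pattern differently, and the function names are made up for the example.

```python
import re


def expected_moves(expectation_line):
    """Parse a line like '#? [B2|J3]' into the set {'B2', 'J3'}."""
    match = re.fullmatch(r"#\?\s*\[(.*)\]\*?", expectation_line.strip())
    if match is None:
        raise ValueError(f"not an expectation line: {expectation_line!r}")
    return {alt.strip().upper() for alt in match.group(1).split("|")}


def passes(expectation_line, gtp_response):
    """True if a GTP success response such as '=14 B2' names an expected move."""
    match = re.match(r"=\d*\s*(\S+)", gtp_response.strip())
    if match is None:
        return False  # error response ('?...') or empty reply
    return match.group(1).upper() in expected_moves(expectation_line)


print(passes("#? [B2|J3]", "=14 j3"))
print(passes("#? [B2]", "=16 J3"))
```

Such a check only makes sense if every program has loaded the same position first, which is exactly the concern raised in the message above.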
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Hi Jacques,

We would very much appreciate it if you could participate in our test. In the GTP specification, the description of the command 'loadsgf' says:

Board size and komi are set to the values given in the sgf file. Board configuration, number of captured stones, and move history are found by replaying the game record up to the position *before move_number* or until the end if omitted.

So for the command

    loadsgf sgf/seki/case1.sgf 4

the program should load the position of case1.sgf BEFORE move 4, not AFTER. Just today an author found a bug in gogui-adapter and kindly reported it to me: gogui-adapter incorrectly loads the position AFTER move_number. Markus has already fixed the bug for us, see
https://sourceforge.net/tracker/?func=detail&aid=3527339&group_id=59117&atid=489964

If you use gogui-adapter to translate 'loadsgf' for your program, please download the newest version of gogui, which is available at
https://sourceforge.net/scm/?type=git&group_id=59117

Best regards,
Aja
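The BEFORE/AFTER distinction can be sketched as a small translation from a move list into GTP `play` commands, in the spirit of what an adapter does for engines without `loadsgf`. This is not gogui-adapter's code: the function name and the `(color, vertex)` list format are assumptions, and the board-size, clear-board and komi handling that a real adapter also needs is left out.

```python
def loadsgf_as_play_commands(moves, move_number=None):
    """Translate a game record into GTP 'play' commands.

    moves:       list of (color, vertex) tuples in game-record order.
    move_number: 1-based; the position BEFORE this move is set up, so only
                 moves 1 .. move_number - 1 are replayed. If omitted, the
                 whole record is replayed.
    """
    if move_number is None:
        replay = moves
    else:
        replay = moves[:max(move_number - 1, 0)]
    return [f"play {color} {vertex}" for color, vertex in replay]


# Hypothetical four-move record; 'loadsgf file 4' must stop after move 3.
record = [("b", "D4"), ("w", "F6"), ("b", "C7"), ("w", "G3")]
print(loadsgf_as_play_commands(record, 4))
```

The bug described above corresponds to slicing with `moves[:move_number]`, which would also replay move 4 and hand the engine a different position.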
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
By the way, to use gogui-adapter to translate 'loadsgf', the command is something like

    ./run.sh -p "java -jar gogui-adapter.jar \"PATH_TO_PROGRAM\"" -t g_seki_moves.tst

(use the backslash character (\) to escape the quotes in the string). I used gogui-adapter to run pachi and Mogo as well, because neither of them supports 'loadsgf'. Please don't hesitate to let me know if it doesn't work for you.

Best regards,
Aja
Re: [Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
I used gogui-adapter too because Many Faces doesn't have loadsgf, but gogui doesn't send the komi, so I had to adjust it by hand. (In reply to Aja Huang, Wednesday, May 16, 2012 8:53 PM.)
[Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Dear all,

Martin Mueller and I are writing a paper about exploring some limitations of current MCTS programs in Go. For this purpose we have carefully designed a regression test set which consists of 20 seki and 15 two-safe-groups cases on a 9x9 board. If you are interested, it is available at
http://webdocs.cs.ualberta.ca/~mmueller/ps/seki-and-two-safe-groups-regression-test.zip

We would appreciate it if you would run your program on our regression test and send us the results for our publication. It's easy to run your program through these positions (.sgf). Mainly, the script run.sh under /utility is able to run a given program on a given regression test file (.tst) and produce the result in a related html file. For example, for the seki test you can simply type

    ./run.sh -p PATH_TO_PROGRAM -t g_seki_moves.tst

Some notes:
1. Your program must support the command sg_compare_float for the two-safe-groups test. If it doesn't support reg_genmove, then the test file g_seki_moves.tst is good to use, since it performs genmove instead.
2. On the Windows platform, you will be able to execute 'run.sh' directly at the command prompt after cygwin is installed.
3. If your program doesn't support the GTP command 'loadsgf', gogui-adapter is able to translate 'loadsgf' into a sequence of 'play' commands. The file gogui-adapter.jar under /utility is good to use because Markus has fixed some bugs for us, see
   https://sourceforge.net/tracker/?func=detail&aid=3522401&group_id=59117&atid=489964
   https://sourceforge.net/tracker/?func=detail&aid=3519829&group_id=59117&atid=489964

Under /experimental results, there are results of several programs such as Fuego (tilburg version), pachi, ManyFaces and GnuGo. We thank David for providing us with the valuable results of ManyFaces. The test set is really not easy, because these programs all failed in many cases.

Questions are very welcome. If you find any error in the test set, please inform us. Thanks.

Best regards,
Aja