[Computer-go] Recent paper on Learning Playouts in IEEE TCIAIG
Hi all, It has been a long time since my last post here! I have been away from computer go for about a year, but the publishing process is really slow. This paper describes research I was doing in 2012; since the reviewers were not quite convinced, I had to do more experiments in 2013, which improved the paper a lot. Now it is finally published! Since learning playouts is a field I have always considered a candidate for the next breakthrough (I believe all programs can benefit from some form of learned playouts), I hope you will find it interesting.

http://www.dybot.com/research/doku.php?id=tciaig_paper

Thanks again to Peter Drake, Sam Stewart and Marcos Moreno for their contributions.

PS. The site also has older research and my PhD dissertation (the dissertation includes the learning playouts research as it was at the end of 2012; the paper includes more recent results and insight).

___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
[Computer-go] A Regression test set for exploring some limitations of current MCTS programs in Go
Hi Aja,

The testing program codes different problems in the same sgf file, as in:

loadsgf sgf/seki/case1.sgf 4
14 genmove w
#? [B2|J3]

loadsgf sgf/seki/case1.sgf 6
16 genmove w
#? [B2]

If you ignore the move numbers, J3 is not even a legal move. Unfortunately, move numbers hardly mean anything, since the sgf file is not a game but a list of stones. Each program will translate that its own way and get different move numbers, possibly alternating B, W, B, W... or whatever. I also don't know what the numbers 4 and 6 mean at the end of the loadsgf command. Can you please provide a list of the last moves played before the genmove, so we can verify that we are all analyzing the same position? Ideally, I would prefer a simple sgf file without tricks representing the tested position, but assuming that the position is reachable just by removing the last move a number of times, I can produce the SGF file myself. I would be happy to participate in your test. Jacques.
[Computer-go] useless ko threats
Rémi Coulom wrote: Accelerated UCT does this: https://www.conftool.net/acg13/index.php/Hashimoto-Accelerated_UCT_and_Its_Application_to_Two-Player_Games-111.pdf?page=downloadPaper

This idea was mentioned, circa 2009, on this list. It seemed intuitively right that giving more weight to the most recent results should improve play. I implemented it pretty much like the authors of the paper back in 2009, and it is still a configurable option in my program. I also used simulated sources of results. In simulation it became clear that it was working fine and the learning evaluation was a much better estimator of the final value than the non-learning one (in my implementation it is called "estimate trend"). When playing games, not only did it not work, it made the program clearly weaker. Even constants resulting in very slow learning can lose 10 to 20 Elo points. No value has ever made it stronger; at best I can fade it out completely, making it irrelevant. And I have tested it more than once, because I believed in it, as the program evolved from double-digit kyu to 3-4 kyu, always with negative results. Has someone else tried it? I am still interested in understanding why it doesn't work (for me), as it seems a good idea. Jacques.
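The recency-weighting idea discussed above is easy to sketch. Below is a minimal illustration (a hedged sketch of the general idea, not the paper's or my exact scheme; the function and parameter names are mine): each new playout result is mixed in while older results decay exponentially, so `decay = 1.0` recovers the plain average.

```python
def decayed_mean(results, decay=0.99):
    """Exponentially discounted running mean of playout results.

    Recent results count more than old ones; decay=1.0 gives the
    ordinary average, smaller values forget the past faster.
    """
    value = weight = 0.0
    for r in results:
        value = decay * value + r      # discount old sum, add new result
        weight = decay * weight + 1.0  # discount old visit count
    return value / weight
```

With decay below 1 the estimate tracks a shifting win rate faster, which is exactly the effect that, in my games, cost Elo rather than gained it.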
[Computer-go] fuego and fixed number of playouts
Hi,

To set the number of playouts, you need:

uct_param_player ignore_clock 1
uct_param_player max_games *number*

For a typical testing environment, you probably also need:

uct_param_player ponder 0
uct_param_player reuse_subtree 0

Also:

uct_param_search max_nodes *number* (1400 is about 1 Gb)
uct_param_search number_threads *number*

And book_clear if you don't want the opening book to bypass the calculation.

Jacques
[Computer-go] CrazyStone in the 5-dan footsteps of Zen
Maybe a tournament is not the best way to see quality computer/human games. There are better ways to measure computer/computer performance, and human/human performance is not interesting here. We could simply schedule computer/human games on KGS (e.g., 3 times a year, one in each time zone's afternoon) with around 4 KGS 5d+ humans and the 2 bots, Zen and CrazyStone. Obvious human candidates are: Aja Huang, Robert Jasiek, Stefan Kaitschick, BotHater (don't know his name). Humans could use this list to subscribe, and the pairings could be listed in advance. I don't think there are masses of KGS 5d+ players. It would be fun to watch. Jacques.
[Computer-go] Thoughts about bugs and scalability
Darren Cook wrote: Hence the proposal to use alpha-beta as the top-level search, using MCTS with about 50K playouts at the nodes.

This is nowhere close to what I am researching, so take this as just a thought. But alpha-beta is a method that finds outliers. When a program is happy with a losing position because it misevaluates a safe opponent group that dies in the simulations, you will take its word rather than the word of a more pessimistic (in this case realistic) program. You are not even finding an average, just taking the word of the most optimistic. I guess there is headroom for improvement using consensus methods. Somehow that is what all the clusters are doing: BlueFuego, Deep Zen and the distributed Pachi are just ways to find a consensus between different instances of the same program by forcing them to synchronize their root evaluation, the principal variation up to a fixed depth, or whatever they synchronize (I don't know the details). Why not do the same with different instances of different programs? I think this is the way to go in distributed efforts, but to do it well you need the source code. You are lucky that both Fuego and Pachi have distributed versions, so it should not be too difficult, at least for these two. Jacques.
[Computer-go] [Computer-go ] Congratulations to Zen!
Ingo Althöfer wrote: ... don't let that fool you, Zen with over a minute per move will be a hell of an opponent. From where do you have this knowledge? Or is it just your opinion?

It is my opinion, based on my own experience. I can't tell about Zen, but I know how hard it is to code a program until its first win against gnugo in 19x19. Because people here are used to seeing programs play 4k with just 2500 sims (like Aya), you may not realize how hard it is until you write your own. With long time settings you start seeing single-digit kyu programs play really good moves. Magnus also commented in the first slow bot tournament that he was very happy with how well Valkyria was playing. I don't see why Zen would be an exception. Jacques.

PS. Sorry, I didn't change the title of my previous message and ended up changing the thread name.
[Computer-go] [Computer-go ] Congratulations to Zen!
Nick, thanks again for organizing, and for the new table. A conclusion that can be drawn from the table is that, except for the 2nd and 3rd places of Pachi and MF, which were hotly disputed and either could have won, the different ranks give very consistent results. Almost all games against lower-ranked opponents are wins. And the 6th is Mogo! We really have a pool of strong programs. Congratulations to StoneGrid; I didn't know it was so strong, consistently beating Mogo on these time settings. Points are harder to win than in Formula 1! But I really expect to finish this year above zero. ;-)

PS. Thank you very much Aja! I am eager to try the Erica binary. I still use Sylvain's Mogo despite its 15M node limit, and Fuego of course. Erica will be welcome. I hope it can be set to a fixed number of simulations and that it can run many sims within reasonable (say 2 Gb) memory limits. Jacques.
[Computer-go] Direct DX11 and graphics cards for cheaper simulation hardware?
Don Dailey wrote: I don't understand your concept of wrong direction. When you expand the tree for ANY move you get more information about how good or how bad that move is, so it's always helpful (on average) to have more time.

I think Hideki's argument is along these lines: more simulations won't reach the right move in real-world circumstances. If there is a skillful move that requires a precise sequence and the playouts get it wrong, the tree will not explore that direction frequently enough to find the skillful sequence by itself. It may take a million years and more RAM than is available on the planet. That is a fact, and it can be shown by examples, but it does not contradict your point. These positions exist, but there will also always be positions that fewer simulations get wrong and more simulations get right. So scaling will always hold, unless there are bugs or RAM limits. When the tree stops growing, its leaves are basically tree-less MC evaluations, and it is known that those do have an asymptotic behavior. Jacques.
[Computer-go] Direct DX11 and graphics cards for cheaper simulation hardware?
Don Dailey wrote: Are you trying to say that heavy playouts are better? Who is going to argue with that? I agree completely. Are you trying to make the point that there are very simple to understand positions that computers cannot easily solve? I agree with that. Are you trying to say that heavy playouts can solve many types of common positions orders of magnitude faster than light playouts? I agree with that. Are you trying to say uniformly random playouts suck? I agree with that.

I do not pretend to argue, just to clarify ideas and read what others have to say. And of course I agree with all that. In self play, all MCTS programs scale. Everybody agrees, and it has been tested empirically. Intuitively: if we admit that 2000 sims are better than 1000, then, since nodes in the tree are trees themselves, it is clear that no matter how many million simulations we play, there will always be nodes with 1000 visits, and they would be better evaluated if they had 2000. The entire tree relies on correct evaluation at the nodes, so the entire tree benefits from more sims. A different question is: can a really weak program, say vanilla MCTS with uniform random playouts and just no eye filling (no RAVE, no progressive widening), reach the strength of, say, Aya with 2500 sims (KGS 4 kyu) in 19x19? The answer is: theoretically, yes; in practice, no. Not with a trillion sims per move. You probably don't disagree, since that is implicit in "heavy playouts can solve many types of common positions orders of magnitude faster than light playouts". Note that this question is equivalent to: would the current version of Zen become a pro just with hardware evolution? Jacques.
[Computer-go] Best low-cost hardware for MCTS go
Hi. I know this subject comes to the list from time to time, but since hardware and prices change, it is interesting to read what people say. I purchased an i7 920 two years ago. I was very happy with its overclocked performance (with a Gigabyte GA-EX58-UD5 mobo), but it is starting to have stability issues. I am considering a new purchase. I find the following options:

INTEL:

1. i7-960 3.2 GHz, 251 Euro for the CPU (+ mobo ~200). This is similar to my i7-920, but with 3.2 GHz nominal instead of overclocked. Similar price to 2 years ago.

2. i7-970 at 488 Euro and i7-990X at 871 Euro (same mobo). Why are 6-core Intel processors so much more expensive than 4-core ones? Are these prices right?

3. Intel 2x Xeon: 2x6 cores (x2 hyperthreading) = 24 threads. I don't even have a price for this; most providers don't sell such things. (This is not California.) Does anyone know if this is available at affordable prices?

AMD:

4. Phenom II 1075T x6 3.0 GHz, 158 Euro (the 1100T 3.3 GHz only 177 Euro). Also the mobo is cheaper for those (~80 Euro).
* Does anyone have experience with this?
* Can this 6-core be compared with a 488 Euro i7 6-core?
* Does it support hyperthreading?
* Can anyone test an MCTS program like pachi or fuego on both AMD and Intel and report the number of Ksims/sec/GHz?

5. Dual chip: do low-cost dual-processor AMD mobos exist? Does anyone know how to build a low-cost 24-thread machine with AMD?

Jacques.
[Computer-go] Two papers
Rémi Coulom wrote: Also, I think we did not yet advertise that paper Lukasz wrote with me last year: http://www.mimuw.edu.pl/~lew/files/Simulation-based%20Search%20of%20Combinatorial%20Games.pdf

This one doesn't work for me. I get: "Access forbidden! Error 403". Jacques.
[Computer-go] cgos down?
Don Dailey wrote: Sometimes nobody is using it, and if there is only 1 player it cannot play games of course. You can test that by putting up a second client to see what happens. I'll check it out.

At the moment it is not working. When I test, I run a 19x19 test pack downloaded from: http://computergo.wikispaces.com/CGOS+Standard+Engine+Packs It has 7 engines for 19x19. I run it on a different box (1 core is enough) and my program on another box, 8 engines in all. Right now they have been logged in for more than 20 minutes and no games have started; they are just waiting for a round to start. Jacques.
[Computer-go] cgos down?
Don Dailey wrote: Ok, I gave it a kick start. Should be coming up shortly.

It's working fine now. Thanks. Jacques.
[Computer-go] reconnecting
David Doshay wrote: I can invoke cgosview-darwin from the command line in a terminal window only if I do not include the: -server cgos.boardspace.net -port 6819

Hi, This happens on all OSs, not just Mac OS. The correct syntax is:

cgosview cgos.boardspace.net 6819

_without_ -server or -port. It should be corrected on the website. Jacques.
[Computer-go] Pachi Mailing List
Pachi won a 7-handicap game against Zhou Junxun 9p, who is an active 9p player and has won at least one world title. I would say this puts Pachi at top amateur or beginning pro level. Simplifying: all pros are 9d. The handicap system is not fine-grained enough to measure differences at pro level; any handicap is too much handicap between competitive-level pros. I insist on competitive-level, because the p in pro measures the achievements of a whole career. Just like a Wimbledon winner is a Wimbledon winner forever, even if he/she is 80 years old. Therefore, the p you get depends on strength, but also on luck, the available opponents when you were at the peak of your career, the psychological strength to keep focused for many years, etc. So some 9p may be weaker than some 1p, especially if big age differences apply. So handicap 7 (wins and losses) is consistent with 2d strength, which is the current level of top MCTS programs. Again, oversimplifying: handicap stones do not add linearly, etc. Jacques.
[Computer-go] Congratulations to Zen!
Hiroshi Yamashita wrote: wow! cross-table is great! Thanks.

I agree. Another advantage of that table is that it gives insight into how well the pairing algorithm is doing. The ideal table should have empty spaces in the northeast and southwest corners, with the games along the main diagonal. Of course, a perfect result would require foreseeing the results ;-) So I guess the current pairing was pretty good. Seems like WMS did a good job in the last revision. Thanks to both. Jacques.
[Computer-go] Computing CFG distance in practice
Hi, I have compiled different collections from over 100K sgf files, filtering out all repeated, short, or handicapped games, wrong board sizes, wrong player levels, blitz time settings, etc. The last one I made, which is used in many of my learned patterns and in my paper (http://www.dybot.com/papers/meval.htm), has 55,271 high-quality games. The entire collection is stored in a single binary file, which is an array of identical records. The format is documented inside the zip file. It can be found in the supplementary materials at: http://www.dybot.com/meval/ And the specific file containing the database is: http://www.dybot.com/godbase/masters.zip Games were compiled from different sources, and since only the moves are stored, I guess there are no copyright issues. I certainly do not claim any copyright on this file and think it is correct to use it for research as public domain. Jacques.
[Computer-go] semeai example of winning rate
Ingo Althöfer wrote: How difficult would it be to design a program that generates such bot-difficult semeais (or at least building blocks for such) automatically?

Computer-generated problems are not new. Thomas Wolf, the author of GoTools, automatically generated a set of tsumego problems that were very interesting and an important part of the development of GoTools. He also shared the problem set with other researchers. I used it around 2005, before I went MCTS, when I planned to implement something with strong local searches linked by deterministic logic that resolved redundancies (territories counted more than once by different cluster searches). The idea was abandoned a long time ago. Jacques.
[Computer-go] Semeais
There was another nice example in the last tournament. http://files.gokgs.com/games/2011/1/9/PueGo-pachi2.sgf Fuego was leading easily (70% according to its own winrate) until move 189. IMO, move 190 White T7 is a big mistake, allowing B to play 191 Black M13. After this move the semeai is won for B and pachi easily wins the game. I ran fuego up to 3M sims and it stubbornly insists on 190 T7. Even gnugo plays this correctly. It plays Q14, which is nothing but is at least kikashi. After R14 (which was forced by Q14) it plays M13 correctly, keeping W's advantage and an easy win in the game.

Gnugo 3.7:
190 White .. Q14 (Cap. by B: 4, by W: 6)
191 Black R14 .. (Cap. by B: 4, by W: 6)
192 White .. M13 (Cap. by B: 4, by W: 6)

Of course, stronger players (I am 2k) may improve this analysis or find a different spot where the game turned from an easy W win to an easy B win. Jacques.
[Computer-go] Orego 7.08 released : Power of Forgetting paper
The discussion changed the subject to joseki, but the paper is not about joseki at all. (There is a poster about joseki on the same website.) The Power of Forgetting is an improvement on the previous Last Good Reply idea. The results are spectacular and the implementation is super simple. It looks like the RAVE of playouts: the simple heuristic that beats more ambitious ideas. It improves a MoGo-like policy (capture, escape, 3x3) from ~10% to ~35% with 8K playouts, and from ~25% to ~65% with 32K playouts (winrate against GnuGo), in 19x19! Something, I guess, everybody will want to try. I will. Jacques.
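For anyone wanting to try it, here is a minimal sketch of a Last Good Reply table with forgetting, as I understand the idea (class and method names are mine, and a real playout policy would wrap this; it is an illustration, not the paper's code): on a won playout, remember the reply played after each opponent move; on a lost playout, forget any stored reply that was used.

```python
import random

class LGRF1:
    """Last Good Reply with Forgetting, level 1 (sketch)."""

    def __init__(self):
        self.reply = {}  # opponent move -> remembered reply

    def pick(self, last_move, legal_moves, fallback=random.choice):
        """Play the stored reply to last_move if it is legal,
        otherwise fall back to the base playout policy."""
        r = self.reply.get(last_move)
        if r in legal_moves:
            return r
        return fallback(list(legal_moves))

    def update(self, move_pairs, won):
        """move_pairs: [(opponent_move, our_reply), ...] from one playout."""
        for prev, reply in move_pairs:
            if won:
                self.reply[prev] = reply            # remember good reply
            elif self.reply.get(prev) == reply:
                del self.reply[prev]                # forget failed reply
```

The appeal is the tiny cost: one dictionary lookup per playout move, with the forgetting rule keeping stale replies from accumulating.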
[Computer-go] Special Offer: Towards perfect play of Scrabble
I typed "Towards perfect play of Scrabble" Sheppard (with quotes as shown) into Google Scholar, followed the links from the first URL until http://arno.unimaas.nl/show.cgi?did=10724 then file 2, which is: http://arno.unimaas.nl/show.cgi?fid=7134 and was able to download a .pdf. To my surprise it is a scanned copy (not very normal for a 2002 document), 16.7 Mb, with the complete 281 pages. I have not found the time to read it yet, but it is on my to-do list. Jacques.
[Computer-go] Oakfoam and ELO Features
Hi Francois,

"You mean all the 3x3 patterns? I'm only using 3x3 patterns that occur a number of times in my training collection."

If 3x3 is the only pattern size you use, that probably does not apply. But if your pattern system supports other sizes, then you must have implemented some kind of "not found" mechanism, and all not-found patterns will have identical urgency: the urgency of "not found". The number of 3x3 patterns is small, at most 64K, minus the ill-formed patterns you can never find, like patterns whose off-the-board points do not form a side or two sides sharing a corner. So my advice is: regardless of whether you have found them or not, include all legal patterns, i.e., all patterns where the off-the-board points are as in (1,1), (2,1) or (2,2). In playouts, sooner or later you will find them all.

"I'm not following you here. What sort of classes? Degrees of liberty? R^40? Would you mind explaining a bit for me?"

Once you remove the ill-formed patterns, and even after considering mirror and rotation, you still have tens of thousands of patterns. Each one has the right to have its own gamma value, but you cannot reasonably optimize so many gamma values; therefore the idea is to group them. Some categories are just standard go relations, as you can see in the Mogo papers: threatens a cut, bad keima, contact move, etc. Others are related to height (isolated stone on the 1st row vs. isolated stone at least on the 2nd), and the ones I have tried most are life-preserving. For instance, in a 3x1 shape:

OOO
OXO
OXabaXO
---

Clearly, a is much worse than b for both players. a for X kills its own group before it has a 1-point eye. a for O does nothing good; it probably forces X to play at b and save the group. So it is clear that the shape at b should have more urgency than that at a.

Other shapes are related to connecting your stones, and finally there are catch-all classes where you put all the patterns that do not fit any of the previous ones but may be on a side, in a corner, with 1 liberty, with 2 liberties, etc. My classification was done by trying to make the most of 3x3 patterns and to identify as a class all the ideas I found in the literature, plus the life-preserving ideas. But I cannot say it works as well as I would like; something better can be done for sure. When you have done such a classification, all the patterns in the same class have the same gamma value: the gamma of the class. With 40 classes (in my case) you only have to tune 40 gamma values, and not all at a time. Say you let 8 change randomly (x0.25, x0.5, x1, x2, x4) and keep the other 32 constant. Then you look at the results of around 2000 games and find 1, 2 or at most 3 of these values working consistently better high than low (or vice versa) across the 2000 games. These are the candidates to be changed. The initial set of gammas is uniform (all equal). But please feel free to find better ideas and share them ;-) Jacques.
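To make the class idea concrete, here is a hedged sketch of how one shared gamma per pattern class can drive playout move selection (the function names and the roulette-wheel mechanics are my own illustration; only the idea of a per-class urgency comes from the text above):

```python
import random

def pick_move(candidates, pattern_class, gammas, rng=random.random):
    """Choose a playout move with probability proportional to the
    gamma of the pattern class each candidate matches.

    pattern_class: maps a move to a class index (e.g., one of ~40).
    gammas: the per-class urgency vector (what gets tuned).
    """
    weights = [gammas[pattern_class(m)] for m in candidates]
    total = sum(weights)
    x = rng() * total                 # roulette-wheel selection
    for move, w in zip(candidates, weights):
        x -= w
        if x <= 0:
            return move
    return candidates[-1]             # guard against rounding
```

With this scheme, tuning 40 class gammas reshapes the whole playout distribution without touching tens of thousands of individual pattern values.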
[Computer-go] Oakfoam and ELO Features
Hi Francois, welcome.

"For reference I need about 100k playouts with RAVE to get 50% winrate against GnuGo 3.8 L10."

Yes, that's more or less expected, at least before the big improvements (yet to come ;-). In my case I do a lot of testing at 4K x 1, because the games are around 15 seconds long and I get fast Elo confidence intervals. At that rate (4K playouts/move) I get about 40-42% wins against gnugo. This is more or less consistent with a debugged barebones program without particular smarts (but with RAVE, without progressive widening). I guess 40% at 4K scales to 50% near 100K, but the exact point where I reach 50% has not been studied, as I expect it to be much lower in the near future. (Optimistic.)

"The next step is obviously to apply these to the playouts. I am currently testing my program with the ELO features in the playouts, but unfortunately the preliminary results don't look good."

That's exactly my experience! Although you do get improvement with the extend-from-atari, capture, and distance-to-previous-move heuristics. The Mogo and CrazyStone papers report improvement with all features included, which is true because the other ideas produce improvement, but they don't give results for the patterns in isolation. I got a lot of improvement from Rémi's Bradley-Terry ideas in move prediction (although with some overlearning, which I didn't care much about, as predicting moves is not my interest). But neither the naive values (times played / times seen) nor the improved Bradley-Terry values are better in playouts than uniform random: they are 158 CI(114..202) Elo points worse! That is good and bad news. Why should uniform random be the best? Obviously it is not. But what humans play lacks all the information about what they don't play because it is obvious to them, while it is not obvious to a silly playout policy.

How to find good values for the patterns? (What I have tried.)

a. Use small patterns (3x3) with all non-ill-formed patterns in the database. (Other databases have a value for "unknown"; this one shouldn't.)

b. Classify patterns. I have done that with 40 classes. This way you reduce the number of degrees of freedom, so your vector of gamma values is in R^40.

c. Then what? I really don't know. I have a sort of genetic algorithm. I like the idea that everything changes at random, because the gammas are not independent, and this way the expected value of the correlation is zero even under stochastic dependence. Then I select the best winners, move my center of gravity one little step in one or two classes of patterns, and repeat the entire process. Then test to see if there was improvement. A long process. I only won a little in the first iterations; after that, fake improvement that wasn't verified against uniform random. In all, about 100 Elo points, less than what the human patterns lose. I guess playout policies are a research area where there is room for improvement. Jacques.
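The random-perturbation step described in (c) above, and earlier in this thread (multiply a few class gammas by factors x0.25 to x4 and keep the rest fixed), can be sketched like this. The function name and the use of Python's random module are my own illustration:

```python
import random

FACTORS = (0.25, 0.5, 1.0, 2.0, 4.0)

def perturb(gammas, subset, factors=FACTORS, rng=random):
    """Return a copy of `gammas` where each class index in `subset`
    is multiplied by a randomly chosen factor; all other classes
    keep their current value."""
    out = list(gammas)
    for i in subset:
        out[i] = gammas[i] * rng.choice(factors)
    return out
```

Each perturbed vector is then evaluated over a batch of games, and only gammas that win consistently at high (or low) values across the batch become candidates for a permanent move.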
[Computer-go] News on Tromp-Cook ?
I don't agree that the first two games were that easy. In the second game the bot was ahead most of the game and failed at life and death in the top right corner. It probably needed more CPU power. It is a pity that you couldn't rent a better computer. The Amazon cloud machine was 26 ECU (1 ECU = 1 core at 1 GHz), iirc. MF can make use of parallelism with a shared tree (up to 32 threads), but not across multiple nodes via MPI (which is worse than one node, according to David Fotland). The 26 ECU spec lacks the most important information to me: how many nodes, threads, cycles, Kplayouts/sec... Can the MF binary you own log this kind of information? A good CPU for crunching (but still i7, not Xeon with a multi-socket mobo) such as an i7-980 3.33 overclocked at 4 GHz would count as 6x4 = 24 ECU, but run 12 threads. I guess the Amazon machine was less performant than that. I agree with Petr that the censoring was not just strange but probably lowered the level (computer-go wise) of the remarks compared to what we are used to in KGS tournaments. Maybe it's the price of having near 500 observers, which is a great achievement we all must be happy with. Thank you to the organizers and congrats to John, since it wasn't that easy. Great job. Jacques.
[Computer-go] visualize patterns
Olivier Teytaud wrote: whereas I am looking for (2) input = a database D, output = frequently matched patterns

This is what I do: I have thousands of separate SGF files and a program to merge them all into a single binary file. This program filters out: handicap games, insufficient player rank, fast time settings, duplicate games, wrong board sizes, games with fewer than n moves, etc. Some of these binary files I posted here previously, and the one I use most is in the supplementary materials of my IWCG 2010 paper.

Paper: http://www.dybot.com/papers/meval.htm
Supplementary materials: http://www.dybot.com/meval/

Then I scan this binary file very fast for:
* Full-board patterns (fuseki)
* Corner/side patterns (joseki)
* Around-the-last-move patterns (urgency)
* Statistics: e.g., how frequent is tenuki at move n, how frequently tenuki goes back to the previous last move, how frequent is extension from atari at move n, or whatever I can figure out.

Now the bad news: all the extraction is in thousands of undocumented Pascal (Delphi) and x86 assembly source lines. It compiles for Windows. My new engine has new parts in C++ only, but still a lot of assembly and Pascal, and the offline learning tools are mainly in Pascal. It is not free software, but if you think it is interesting, I would be very happy to cooperate. Jacques.
[Computer-go] Who knows about the 2011 Olympiad?
In recent years, by this time of year, it was known where and when the Olympiad and the ACG conference would be held. The call for papers usually ends at the end of February or the beginning of March. Does anyone (Rémi? Hideki?) know where and when it will be? Jacques.
[Computer-go] MCTS vs Minimax
Petr Baudis wrote: I don't quite understand. I think most programs do not disallow eye-filling in the tree stage, only in the MC simulation stage. So your simulations will get misevaluated, but given enough time to reach the final position, the tree will always consider evaluating the eye-filling move too.

Of course. I was thinking upwards from the end towards the root, as in Álvaro's induction reasoning, and thought that if the nodes are never evaluated correctly by the simulations, they will never propagate correct info. But as you state, if you grow the tree to infinity, why play simulations at all? So infinite MCTS does not need simulations and gets the result correct. Now, why still call it MC if the stochastic part is unnecessary?

Hideki Kato wrote: If you are talking about the article (Subject: "Re: 19x19 Study. Nakade is difficult", Message-Id: 47A2240E.2090508 at dybot.com), some strong programs have already managed that, though I don't know the details.

Hi Hideki. If I remember correctly, the nakade problem was about playing or not playing self-atari moves. If played, the program understood nakade but was blind to seki detection; if not, the nakade problem emerged. Please, anyone correct me if I am wrong on this, since I think it is a very important aspect of MCTS. Also feel free to describe what you do to decide when self-atari has to be played and when not. Finding the correct policy about self-atari solves the nakade problem while still understanding seki. AFAIK nobody has experimented with filling own eyes, as the position was totally artificial and I don't think a program can be forced into that mistake. That was just hypothetical arguing. Jacques.
[Computer-go] MCTS vs Minimax
Hideki Kato wrote:

| . . . . . . .
| . # # . # . .
| . O # . # . .
| O O O . # # .
| # # O O O # .
| . # # . . . .
 ---

Here, the problem is filling own eye at 1-1 for B, right? Yes, that is the figure I meant, but it is not related to nakade, only to filling your own eye. Self-atari (of a small group) is already allowed by almost all MCTS programs, I believe. I don't understand how this relates to seki. The size of the group can be one criterion, but there are others. The relation with seki is: if a group that lives in seki plays the suicidal self-atari move, it dies and the playouts do not evaluate the position correctly. Whether or not you avoid the move in the tree, you must avoid it in the playouts, or else the group dies. But that is tricky, as you cannot simply forbid self-atari moves, because some self-atari moves are tesuji. Therefore, there is black magic to decide when to play self-atari and when not. The size of the group is important, but I guess strong programs must have found better ideas. Jacques.
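The "black magic" around self-atari can be made concrete with a toy playout move filter. This is a minimal sketch under invented assumptions: the predicates (`fills_own_eye`, `is_self_atari`, the resulting group size) would be computed from the board by a real program, and the three-stone threshold is an arbitrary illustration, not any particular program's rule.

```python
def allow_in_playout(group_size_after_move, is_self_atari, fills_own_eye,
                     is_nakade_shape):
    """Decide whether a candidate playout move may be played.

    - Filling one's own real eye is always forbidden; this is what
      guarantees playouts terminate, at the cost of the counterexample
      positions discussed in this thread.
    - Self-atari is allowed only for small groups (a crude proxy for
      nakade throw-ins); large-group self-atari is rejected because it
      is almost always the seki-destroying blunder described above.
    """
    if fills_own_eye:
        return False
    if is_self_atari:
        # Small throw-ins (<= 3 stones here) cover most nakade tesuji;
        # a recognized nakade shape is also let through.
        return group_size_after_move <= 3 or is_nakade_shape
    return True
```

The point of the sketch is only that the policy must be conditional: a blanket ban on self-atari loses nakade tesuji, and no ban at all loses seki.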
[Computer-go] new predictions?
Don Dailey wrote: To the best I can estimate it is something like 300 ELO now, which means in a short match there is almost no chance for the human. It is more than that, around 500. Only a match against a handheld device could be seen as similar strength. Top humans are near 2800 (ratings.fide.com):

Rank  Name                Title  Country  Rating  Games  B-Year
1     Carlsen, Magnus     g      NOR      2826    0      1990
2     Topalov, Veselin    g      BUL      2803    0      1975
3     Anand, Viswanathan  g      IND      2800    0

Google for Rybka Elo. On a phone: "Pocket Rybka 3: 2869 ELO on PocketPC?" (23 Aug 2008). On desktop hardware: Deep Rybka 3 x64, 2 GB, Q6600 2.4 GHz, rated 3227 (17.01.2009). I read somewhere (ICGA Journal, I think) that it is over 3300 now. So about 500 points over the top human. Jacques.
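The 300-vs-500 point argument can be checked with the standard logistic Elo expectation (a textbook formula, not something specific to this thread):

```python
def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model:
    E = 1 / (1 + 10^(-d/400)), where d is the rating difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# A 300-point gap gives the stronger side roughly an 85% expected score;
# at 500 points it is about 95%, which is why a short match leaves the
# weaker side almost no chance.
```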
[Computer-go] MCTS vs Minimax
About the convergence proof for go: As far as I can understand, Álvaro's proof sounds correct, but it possibly applies to most games, not to go as humans play it, because what the programs play is "MCTS go", an almost identical game where filling your own eyes is not allowed. All MCTS programs finish their simulations because they don't fill their own eyes. You can create a position (I did it years ago and posted it on this list) where filling your own eye is the only good move. Such a position is clearly a counterexample to this convergence. If you don't fill your own eye, you get the evaluation wrong, but if you do, when does the game end? It probably never does. Nobody has proposed anything that works other than not filling own eyes (there is probably no need anyway), and that is enough to find counterexamples. Of course you can find other mechanisms too. Paradoxically, the better it plays, the less it converges to the minimax evaluation. That's the trouble with the minimax value: you can prove it exists, but if you can compute it, the game is not worth playing. ;-) Jacques.
[Computer-go] ManyFaces learning
Intriguing! Do you think it actually improves its strength, or is that just an experiment? How does it know where the losing variation starts? Is it updating some pattern weights, or updating some persistent game tree? I guess it is just an opening book that keeps win/loss info. When a branch has losses and 0 wins, you just censor that move. No need to know where the blunder was. Maybe it is something more interesting. I am curious too. Jacques.
[Computer-go] ManyFaces learning
If you censor at the root, you only have 361 losses that you can endure before you are out of ideas. I guess you implicitly meant censoring at move 3 or something. Of course not. You start with a standard book. In my case, it is learned from 50K games and has about 25K nodes. You are just adding learned knowledge to it. (I don't do it, btw. Just guessing what David is doing from what he posted.) Jacques.
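The censoring idea guessed at in these messages can be sketched in a few lines. This is a toy illustration only (the real Many Faces mechanism is unknown); here a hypothetical `book` maps a position key to per-move `(wins, losses)` records accumulated from the program's own games.

```python
def playable_moves(book, position_key):
    """Return the book moves at a position that have not been censored.

    A move is censored once it has recorded losses and zero wins --
    there is no need to know where in the variation the blunder was,
    the whole branch simply stops being played.
    """
    record = book.get(position_key, {})
    return [move for move, (wins, losses) in record.items()
            if not (wins == 0 and losses > 0)]
```

For example, with `{"D4": (3, 1), "Q16": (0, 2), "K10": (0, 0)}` at the root, only Q16 is censored: it has losses and no wins, while K10 is merely untried.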
[Computer-go] ManyFaces learning
You said: When a branch has losses and 0 wins, you just censor that move. Apparently you meant leaf and not branch. Well, I did not specify how you grow the tree. Obviously, the initial offline-learned tree has all nodes with more than one visit, as the number of visits is the criterion used to build it. Then, you may stop growing the tree when you reach a branch that is unique. At that moment the last node is a leaf, so you may call it what you want. I wasn't that precise, but it makes sense. I may have explained it badly, but it can be done. Jacques.
[Computer-go] Series Champion of the KGS bot tournaments
Thank you Nick! I like the idea very much and look forward to participating in 2011. My suggestions are: 1. Use separate board sizes and also a combined ranking. So you could award three different winners: a. 19x19, b. smaller boards, and c. combined. Combined would be the sum of all points awarded on all sizes. As it is today, the winner in 19x19 and 9x9 will probably be a different program. The games have different approaches for both humans and computers. 13x13 is different from both, but there may not be enough tournaments to separate it from 9x9. Three winners mean more opportunities for everyone. 2. I would simplify the system to a fixed number of points, like 10, 5, 3, 2, 1 for the first five. This way it is much easier for everybody to grasp and predict. A tournament is a tournament, no matter if it has more time or more rounds; that's irrelevant in my opinion. The winner will need to participate in all or nearly all and do its best, so everybody has the same difficulties and the result is fair. Some may prefer fast games with more rounds, others slow games, but in the end everyone had the opportunity to participate in all tournaments. My 2 cents. Jacques.
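The proposed fixed scoring is simple enough to sketch directly. This assumes the 10, 5, 3, 2, 1 scheme from the suggestion above; the function and player names are invented for illustration.

```python
POINTS = [10, 5, 3, 2, 1]  # awarded to the top five finishers

def series_standings(tournaments):
    """Sum series points over a list of tournament finishing orders.

    Each element of `tournaments` is a list of player names, best first.
    Every tournament is worth the same, regardless of time settings or
    number of rounds.
    """
    totals = {}
    for order in tournaments:
        for rank, player in enumerate(order[:5]):
            totals[player] = totals.get(player, 0) + POINTS[rank]
    return totals
```

With two tournaments finishing A-B-C and B-A-C, both A and B end on 15 points and C on 6, which shows how easy the standings are to compute and predict under a flat scheme.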
[Computer-go] anti-pondering
On anti-MCTS bot strategy: I don't know of a strategy, but there sure are principles. I can state one as a proverb: We you clarify, you are helping the bot. E.g., if a connection works but is not obvious, or if a semeai can be won but is not obvious, the bot has to discover it for each visited tree path and, until it does, it is reading wrong results and merging them with correct results. When a 40-point group should live 100% of the time and lives only 80%, its weight is as if it were (around) 32 points. That is systematic bias. And for handicap games: first catch up, then clarify. Most strong humans who lost handicap games to bots did the opposite: when they feel they have caught up enough, from say -60 points to -10, they clarify, thinking they will get the other 10 points easily in the yose. That surely works with their students, but it doesn't work with MCTS. The bot is extremely strong at counting, and if you clarify the position it is very hard to convert a losing game into a winning one. If you do it properly, leaving the bot in its blurred misevaluation state until you have caught up, then when you clarify the bot will resign. Of course, this is easier said than done. Applying this requires a lot of skill, but not applying it may be the reason why stronger players lose their first games. Jacques.
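The 40-point example is a simple expectation: a group worth p points that should always live but survives the playouts only a fraction r of the time contributes roughly p * r to the evaluation. A trivial sketch of that arithmetic (function name invented for illustration):

```python
def effective_value(points, playout_survival_rate):
    """Approximate weight a playout-based evaluation gives a group that
    should live with certainty but survives the simulations only part
    of the time -- the systematic bias described above."""
    return points * playout_survival_rate

# A 40-point group surviving 80% of playouts is weighted like a
# 32-point group, so the bot undervalues the unclear position.
```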
[Computer-go] anti-pondering
Oops. Should be: When you clarify, .. Jacques.
[Computer-go] time settings
About the previous question, in which phase should more time be given: Don Dailey wrote: In 9x9 go the first few moves are extremely critical and if you start badly you cannot recover. Petr Baudis wrote: This depends a lot, e.g. on 9x9 it is probably worth spending a lot of time in the opening, while in 19x19 stashing it for the middlegame likely makes a lot more sense. This looks contradictory, but it is not. It is consistent with the fact that computers play near pro level in 9x9. At this level you simply cannot afford to go wrong; you pay for it with a loss. Human pros claim (In the Beginning, Ikuro Ishigure) that when games are 2 days long, you have to dedicate an entire day to the first 50 moves. I agree with Petr that investing in the middlegame and endgame is more profitable at the current strength, but that is only (my guess) because 19x19 computer go is so far away from pro level, unlike 9x9, where investing in the first moves is more profitable, as Don states. In 19x19, the moves between 10 and, say, 60 are still the most important (assuming the first 10 moves are "canonical"). They decide whether you gain a superior position and your opponent has to play out of necessity for the rest of the game, or whether it is you who plays the rest of the game from behind. The problem is just that current MCTS programs don't have a clue what to do with this extra time in the opening. This happens because the tree does not reach endgame or near-endgame positions; it is floating in nodes evaluated by naïve playouts. Their kankaku (Yamato used this word for "feeling", in the sense of the value to which playouts converge) is kyu level, no matter what playout policy you choose. Therefore, why give them time to solve a problem they cannot solve? Jacques.