[Computer-go] Why do Facebook use DNN next-move predictors? Shouldn't they use next-move generators?
When I read about Facebook's DCNN-based Go program, I remembered another paper I'd come across on arXiv, namely "How (not) to train your generative model: scheduled sampling, likelihood, adversary?" by Ferenc Huszar (http://arxiv.org/pdf/1511.05101.pdf).

A lot of that paper went over my head (I am a "half-studied scoundrel", as we say in Norway), but I think I sort of got the speculation at the end, and it made a lot of sense to me. He argues that which side you approach the K-L divergence from matters for what kind of errors you get when the model falls short, and that when you're generating, as opposed to predicting, the goal should be to minimize the K-L divergence the "other" way around.

When you're using a DCNN in a Go program, you are really generating, not predicting, right? You want to generate a good move. A model that generates "flashy" moves that LOOK really strong, but could potentially be very bad, would be a good predictor but a bad generator. The ideal probability distribution is the distribution of moves a pro would make. But to the degree your model falls short, you want to minimize the chance of making a wildly "un-pro" move, rather than maximize the chance of making a "pro" move. Since these are probability distributions, those two things are not the same unless your model is perfect (right?).

If my understanding is correct (and it's quite possible I'm way off course, I'm an amateur! sorry for wasting your time if so!), then rather than training a move predictor, they should use the adversarial methods which are also in the wind now to train a generative model.

-- Harald Korneliussen

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
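To make the asymmetry concrete, here's a toy numeric sketch (the three-move position and both distributions are made up for illustration): the "forward" KL(p||q) stays finite even when the model q leaks a little probability onto a move the pro distribution p would never play, while the "reverse" KL(q||p) blows up exactly in that case — so training against the reverse direction punishes "un-pro" moves much harder.

```python
import math

def kl(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), with 0 * log(0/q) taken as 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue            # term contributes nothing
        if qi == 0:
            return math.inf     # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total

# Hypothetical 3-move position: "pro" distribution p, and a model q that
# leaks 10% probability onto a move a pro would never play (move 3).
p = [0.7, 0.3, 0.0]
q = [0.6, 0.3, 0.1]

forward = kl(p, q)   # finite: mild penalty for the leaked mass
reverse = kl(q, p)   # infinite: q plays a move p assigns zero probability
```

A predictor trained the usual way minimizes something like the forward direction; the post's point is that a move *generator* should care about the reverse one.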
[computer-go] Selling programs on Linux
Gian-Carlo Pascutto wrote:
> Unfortunately I don't know how to sell support for my Go program, but I
> am open to ideas.

Well, since you ask, there is one neat way you could sell your program on Linux in an open source manner: offer a so-called assurance contract, such as the ones on fundable.org. The basic idea is that you tell the world what you want for your program - not from each user, but in total - and each of your potential customers says how much they're willing to pay. If the sum of offers exceeds what you asked for, you take the money and release the program. If not, nobody pays anything.

The advantage of this approach is that you can actually sell the source (if you want), objectively the most valuable part of your program. In the Windows world you almost have to keep the source to yourself and sell the binary. Linux users would probably want the source anyway - after all, if all they're after is a game that can defeat them, they can just use GnuGo or the Mogo binaries. It's the source which is really valuable.

The disadvantage is that you probably won't get much from users who aren't into comparison shopping and just randomly walk onto your site one day thinking "hey, cool!". With a fundable-style contract, you have to sell to all interested parties simultaneously, and this requires a little marketing. Still, assurance contracts are new enough that it should be possible to catch a little attention.

I'd love to see a source code release of Leela, even just a snapshot, and I would pay for it. Thanks for Sjeng, by the way.
[computer-go] The odd 13x13 graph
Don Dailey wrote:
> > It's also interesting how the graph up to level 11 seems to form 2
> > very straight lines, almost as if they were connected at an angle.
> This must be a by-product of how we started the test. We played only
> the first 4 levels as we were testing the system and that is where the
> bend point is. Then I added levels gradually.

I cannot figure out why this would cause such strange behavior. As we're closing in on 2 games, that trend seems just as apparent: it's not so much a curve as two lines, one below mogo4 and one above... but are we looking at the right axis? I wonder if it may have something to do with the anchor.
[computer-go] Bent four in the corner (was: Scalability problem of play-out policies)
Ivan Dubois mentioned the bent-four-in-the-corner shape as a scalability killer: a situation where more playouts don't help (much), because the playouts systematically misevaluate it. As I understand it, it could be corrected in the tree, but this is very unlikely to happen until the very end of a game, by which time it may be too late (Mogo having worked the entire game for that solid 0.5 win, which turns out to be a solid loss instead because of the life-and-death misevaluation).

I recalled a KGS game of Mogo's I'd looked at, where something very similar happened, and with a little digging I found it again: http://files.gokgs.com/games/2007/12/1/Ardalan-MoGoBot3.sgf

It turns out it's not the bent four shape, but I suspect it's another such shape, where more playouts only confirm that these moves aren't worth including in the tree, so that UCT catches them very late, if ever. If these situations can be reliably created by a human, then they do indeed put an upper limit on the real-world scalability of a program.

If I should propose a hackish heuristic to deal with such situations, it is this: at some point, when the problematic shape appeared, the human must have played a move that seemed horribly bad to the computer ("Why did he do that? Doesn't he see that my shape is alive?"). When such situations occur, there are two possibilities:

1. The bot is playing a weaker human player, and the move is indeed bad.
2. The bot is playing a stronger human, and the move is actually good.

I think it may be a good idea to adjust the weighting in these situations, so that the relevant moves are added to the tree. In the worst case, a lot of effort is wasted in proving a bad move bad - but this should not be so serious, as the bad move likely means the opponent has poor chances of winning anyway. In the best case, the program's blunder is revealed after the fact.

This may still leave little chance of winning (if the life-and-death error was severe), but at least the program's counting won't be off for the rest of the game. Since today's programs don't care about winning margins, a counting error of even a single point can spell disaster. I believe this heuristic would be cheap in terms of computational cost, but hard to evaluate/tune. Self-play would not be very effective...
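The heuristic could be sketched roughly like this (everything here - the Node shape, the threshold, the bonus value - is a made-up illustration, not code from any actual program):

```python
SURPRISE_THRESHOLD = 0.02   # assumed tuning constant: below this prior, a move "surprises" us

class Node:
    """Bare-bones tree node: just what the heuristic needs."""
    def __init__(self, prior=0.0):
        self.prior = prior              # weight the tree gave this move
        self.exploration_bonus = 0.0    # extra weight granted by the heuristic
        self.children = {}

def on_opponent_move(root, move, bonus=0.5):
    """When the opponent plays a move our tree considered near-worthless,
    force it into the tree with an exploration bonus. Either the move is
    bad (we waste some effort refuting it) or our life-and-death reading
    is wrong - and we find that out now, instead of at the end of the game."""
    child = root.children.setdefault(move, Node(prior=0.0))
    if child.prior < SURPRISE_THRESHOLD:
        child.exploration_bonus += bonus
    return child
```

How the bonus feeds back into the selection formula, and how fast it should decay, would be exactly the hard-to-tune part the post mentions.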
[computer-go] The intelligence can of worms reopened
In the thread "On average how many board updates/sec can top Go programs do these days?", mingwu said of the way MC/UCT programs work that he'd hardly call it intelligent. I've thought (and argued elsewhere) that the MC/UCT approach is fundamentally more intelligent - in the sense of working more like human intelligence appears to do - than traditional game tree search.

How did humans learn to play Go? Well, they tried things, and figured out what worked and what didn't. And that is what a UCT search does as well... the main differences are that it's much worse at generalization, and (for that reason) throws away most of what it's learned from game to game. Still, it plays well with far fewer a priori assumptions and far less knowledge than a traditional minimax-type search. There is some, of course, but humans also rely on some a priori knowledge, according to the philosophers :-)

To illustrate, I know of a game in which UCT's advantages are even more striking. The game designer Nicholas Bentley recently won an award for a game he calls Mind Ninja (silly name, I know, but bear with me: it's a very interesting game). It's a generalized connection/shape-building game which begins with a negotiation phase, where one player proposes the pattern to be built, and the other decides whether he will be the builder or the blocker. There are extremely many possible patterns: on the regular board there are 127 points, each of which can be empty or coloured - the number of colours is also part of the pattern rule. A pattern rule must classify each possible position as either a win for the builder or not (the blocker wins if there are no more legal moves and the builder hasn't won). Any partitioning of positions into wins for the builder versus undecided positions/wins for the blocker can constitute a pattern rule.

Now, I believe MC/UCT can play this game, and play it well. (I'm working on an implementation.) It will adapt to pattern rules it has never seen before, just like a human. It doesn't need any hints in the form of evaluation functions, like chess programs use - in fact, the nature of the game makes that approach pretty much impossible to apply. Like a human, it can consider a pattern rule and try to figure out which side has the advantage. Humans can certainly choose rules which would give them an advantage (I imagine nim-like pattern rules could work), but with trial and error, so could the program - and the game has a way of balancing that advantage, too.

Would a Mind Ninja-playing program be intelligent? I'm pretty sure they would have said so ten years ago, at least :-) But regardless, I think it could illustrate the potential of this new algorithm.
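For what it's worth, the trial-and-error loop described above is usually driven by the UCB1 formula applied at every tree node; a minimal sketch (the exploration constant, here 1.4, is a tuning parameter that varies between programs):

```python
import math

def ucb1(wins, visits, parent_visits, c=1.4):
    """UCB1 score driving UCT's move selection: the observed win rate
    (exploitation) plus a bonus that shrinks as a move is tried more
    often (exploration). Untried moves score infinity, so every move
    gets tried at least once."""
    if visits == 0:
        return math.inf
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)
```

Nothing in the formula is Go-specific, which is exactly why the same search can adapt to a pattern rule it has never seen before.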
[computer-go] Re: language efficiency
Some thinking out loud here on the topic of languages and efficiency.

I'd like to know how well MoGo would play if you let it think for a week for every move. Only it seems to me that is not possible, because I don't think MoGo would run for a week without crashing. Crazystone also crashes quite a lot, if I understand the comments in the KGS logs correctly. You have to wonder: one thing is move quality reduced by traditional bugs, but when a program is periodically forced to restart and throw away its calculations because of a crash, it seems to me there's a lot to gain from using a more reliable programming language.

Also, do we really know all we need to know about algorithmic approaches? If yes, then it's about implementing them as efficiently as possible, to maximise the number of playouts per second... C or C++ is probably a good choice. But if not, what probably matters a lot more is how flexible the language is - how easy it makes trying out different approaches. If a program runs ten times slower than the C version, why not just let it think ten times as long? That is affordable when experimenting, right?

But I have to admit, I don't know exactly how I'd go about implementing a transposition table in Haskell :-/ Perhaps I'll try for a libEGO binding; then at least if there are stability problems, I can blame someone else ;-)
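The mechanism itself is simple, for what it's worth - the hard part alluded to above is expressing the mutable state purely in Haskell. Here is a minimal Zobrist-hash transposition table, sketched in Python rather than Haskell (board size and the stored statistics are arbitrary choices for the sketch):

```python
import random

random.seed(1)                  # reproducible keys for the sketch
BOARD_POINTS, COLORS = 81, 2    # 9x9 board, black/white

# One random 64-bit key per (point, color). XOR-ing a key in or out
# updates the position hash incrementally as stones appear and vanish.
ZOBRIST = [[random.getrandbits(64) for _ in range(COLORS)]
           for _ in range(BOARD_POINTS)]

def toggle(h, point, color):
    """Place or remove a stone at 'point'; the same XOR does both."""
    return h ^ ZOBRIST[point][color]

table = {}  # hash -> accumulated (visits, wins): the transposition table

# Two different move orders reaching the same position get the same hash,
# so their statistics are shared:
h1 = toggle(toggle(0, 40, 0), 60, 1)   # black at 40, then white at 60
h2 = toggle(toggle(0, 60, 1), 40, 0)   # white at 60, then black at 40
table[h1] = (10, 6)
```

Since XOR is commutative and self-inverse, captures are handled by the same `toggle` call that placed the stone.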
Re: [computer-go] Lisp time
Don Dailey wrote:
> By the way, I am no fan of C. I don't like C and have tried some of the
> languages on your list of languages that are supposedly faster than C.
> I think you must be getting your information from the web pages for
> those languages. As a general rule any reasonably fast language is
> going to claim to be faster than C, but shame on you if you believe it.
> In theory all languages are equal. They can all be transformed to
> optimized machine code. I am not talking about the theoretical, I'm
> talking about the reality. And the reality is that right now C is the
> choice. Whether I like it or not, I accept it and hope something better
> will come along.

You are right, but there are some cases where alternatives may make sense.

For instance, look at the paper Ian Osgood linked to: they used Haskell to spit out highly optimized code for Monte Carlo simulations. There is also FFTW, which generates very fast FFT code with the help of OCaml. Code generation is one area where using another language might be considered. It doesn't have to be a functional one, or even C: I know of a cryptographic algorithm (Serpent) that has an implementation based on Perl-generated Ada code.

Prototyping may also be an option. Especially if one is experimenting with novel approaches, wouldn't it make sense to make a prototype in Haskell, for instance? Then you can reimplement it in C, very carefully, regularly comparing outputs to check for the playing-strength-killing bugs chess programmers always talk about...

Occasionally there may be some language feature that makes up for the performance cost - like the distributed nature of Erlang, or the software transactional memory libraries that GHC Haskell comes with. If you believe the hype, STM scales much better than lock-based shared memory.
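As a toy illustration of the code-generation idea (nothing like what FFTW or the Haskell paper actually do, just the flavor): a few lines of a high-level language can emit specialized, fully unrolled C for a fixed problem size.

```python
def gen_dot_product(n):
    """Emit a fully unrolled C dot-product for a fixed size n - the kind
    of specialization a code generator can do and a generic loop cannot."""
    terms = " + ".join(f"a[{i}]*b[{i}]" for i in range(n))
    return (f"double dot{n}(const double *a, const double *b) "
            f"{{ return {terms}; }}")
```

The generator itself never needs to be fast; only the C it emits does.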
Re: [computer-go] low-hanging fruit - yose
Mark Boon wrote:
> Let me therefore change the discussion a bit to see if this will make
> things more clear. Consider a chess-playing program with an unorthodox
> search method. When playing a human, after a while it announces
> check-mate in thirty-four moves. Yet the human can clearly see it's
> check-mate in four. Ah, one could say, but the computer is 100%
> confident of winning either way, so it doesn't care which one it
> chooses. It doesn't matter whether a human thinks mate in four is more
> beautiful. Now it so happens that with chess we're pretty confident
> that when a chess program announces check-mate it is in fact correct.
> But what if there could be a sliver of doubt? Maybe the program has no
> doubt, but the programmer might. Bugs happen. Wouldn't you say it's
> better to choose the shorter mate sequence? Regardless of whether
> humans may find it more beautiful?

Any probabilistic algorithm should of course prefer a quick win to a tedious one: the fewer assumptions it has to make, the less likely it is that one of them is disastrously wrong. But wouldn't they actually take that into account in their win estimates? Anyway, turns-to-win is a completely different measure from score at the end.

But I found something that may interest you. Three researchers at a Paris university, Julien Kloetzer, Hiroyuki Iida and Bruno Bouzy, implemented a UCT program for playing Amazons. They found that for that game, performance increased when the algorithm took the winning margin into account in addition to win/loss. I wonder what that means. Perhaps greedy strategies work better in Amazons than in Go? The program lost big in the olympiad, though, against a more traditional program, so it might be premature to draw conclusions... Here's the research paper: www.math-info.univ-paris5.fr/~bouzy/publications/KIB-MCAmazons-CGW07.pdf
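In the spirit of that Amazons result, here is one way to back up a margin-aware value from a playout instead of a pure 0/1 result (the blend and both constants are my own made-up illustration, not what the paper's program does):

```python
import math

def playout_value(score_margin, alpha=0.75, k=0.1):
    """Back up a blend of win/loss and winning margin from one playout.
    alpha (win/loss weight) and k (margin scale) are assumed tuning
    constants; tanh keeps huge margins from dominating the estimate."""
    win = 1.0 if score_margin > 0 else 0.0
    squashed_margin = 0.5 * (1.0 + math.tanh(k * score_margin))  # -> (0, 1)
    return alpha * win + (1.0 - alpha) * squashed_margin
```

With alpha = 1 this degenerates to today's pure win/loss backup, so the mix is easy to experiment with.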
Re: [computer-go] How does MC do with ladders?
Raymond Wold wrote:
> I can code an algorithm that evaluates simple ladders correctly. I'll
> repeat that. I can code a program that reads ladders better than a pure
> MC program without knowledge of ladders. I can beat it. Human knowledge
> programmed into a computer that does that one thing, that basic go
> skill, better than the MC program. Are you saying that there is
> absolutely no way to combine such with an MC program to make it better?
> Not just that no one has done it (I don't know if anyone has) but that
> it is impossible? Are you saying that attempts to do so are wasted? If
> you are, I'd appreciate it if you did so clearly.

Complaining that MC programs don't read ladders well is a bit like complaining that Forrest Gump can't tie his shoelaces, it seems to me. There will be many things an MC program will be good at that we won't be, and vice versa. It's not unreasonable to believe that progress with MC/UCT programs will come through making the most of their strengths, rather than trying to patch their weaknesses in ways incompatible with how they work.
Re: [computer-go] How does MC do with ladders?
On Wed, 12 Dec 2007 07:14:48 -0800 (PST), terry mcintyre wrote:
> Heading back to the central idea, of tuning the predicted winning rates
> and evaluations: it might be useful to examine lost games, look for
> divergence between expectations and reality, repair the predictor, and
> test the new predictor against a large database of such blunders.

Sounds a little like temporal difference learning to me. I understand both MoGo and Crazystone use patterns - does anyone know whether they use such machine learning techniques to assign weights to them?
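For reference, the core of temporal difference learning is a one-line weight update. A minimal TD(0) sketch for a linear pattern evaluator (the pattern names and the learning rate are made-up illustrations, not what MoGo or Crazystone do):

```python
def td0_update(weights, active_patterns, v_current, v_next, alpha=0.01):
    """One TD(0) step for a linear evaluator v(s) = sum of the weights of
    the patterns matched in s: nudge the weights of the patterns active
    in the current position toward the next position's value."""
    delta = v_next - v_current               # the temporal-difference error
    for p in active_patterns:
        weights[p] = weights.get(p, 0.0) + alpha * delta
    return weights
```

Run over lost games, this is roughly the "look for divergence between expectations and reality, repair the predictor" loop the quoted post describes.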