Re: [computer-go] Explanation to MoGo paper wanted.
Hi, Don

I can find arguments to disagree. I think what makes humans homo sapiens is reasoning, not the ability to compute numerical simulations. As a human (I am one), I feel disappointed when the explanation I get for a best move is "after x million simulated matches it proved to be the best". If it is well done, I believe it, but I don't like it because I cannot verify it by myself. Such a program will not improve my play, because it will not tell me what I do wrong or how I can improve. As a user, I would rather pay for a program that makes me improve my play than for the best program, one that only hands down moves as simulation gospel.

On the other hand, I can imagine an old go sensei (I have never met one; I imagine him from Kawabata's novel) like a virtuoso piano player - imagine Arturo Benedetti Michelangeli or Tete Montoliú (I met him). These men had their brains and even their bodies transformed by a whole life of study and improvement towards perfection. What they finally gained was _gesture_. The way they put their hands on a piano keyboard, or the way they read a go board, was the result of a lifetime of training. What you call a dirty hack: patterns deeply implemented in their brains. Gesture is what I expect from my programs. And a lifetime of training may be 100 hours of computing.

I call it jeito. It sounds Japanese, which is appropriate for go. Here in the Canary Islands the word is used in the sense of a skillful trick, and most people believe it is a Canarian word. The truth is that it is a Portuguese word, and it means gesture.

Of course, this is just chatting. In the end the strongest program decides who is right and who is wrong. ;-)

Jacques.

Don Dailey wrote: What really irks me is that in most people's minds, that is considered the elegant approach - hard coding tons of fixed knowledge into the program. Nothing could be farther from the truth; this is a dirty shortcut, a hack.
There is nothing elegant whatsoever about having a huge database of patterns that tell the program what and what not to do - then calling it smart. The reason UCT/MC is elegant is that it has a built-in mechanism to discover truth for itself. It may not be the best way, and perhaps there is something better, but at least it is a try. In a lot of ways it mimics the human thinking process. The MC play-outs are a crude substitute for human experience (the experience is gained on the fly) and the UCT tree is a substitute for reasoning about the characteristics of the position, what does and doesn't work. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
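The "built-in mechanism to discover truth" Don describes is, concretely, the UCB1 selection rule applied at each tree node: keep exploiting the child with the best observed win rate while still exploring under-sampled ones. A minimal sketch in Python; the exploration constant and the (wins, visits) tuple layout are illustrative, not taken from any particular program:

```python
import math

def ucb1_select(children, exploration=1.4):
    """Pick the index of the child maximizing mean value plus an
    exploration bonus.  `children` is a list of (wins, visits) pairs;
    unvisited children are always tried first.  The constant 1.4
    (roughly sqrt(2)) is a common default, nothing more."""
    total = sum(v for _, v in children)
    best, best_score = None, -1.0
    for i, (wins, visits) in enumerate(children):
        if visits == 0:
            return i  # expand unvisited moves before comparing scores
        score = wins / visits + exploration * math.sqrt(
            math.log(total) / visits)
        if score > best_score:
            best, best_score = i, score
    return best
```

As the visit counts grow, the logarithmic bonus shrinks relative to the observed win rate, so the search "discovers" good moves by sampling rather than by being told.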
Re: [computer-go] Explanation to MoGo paper wanted.
On Wed, 2007-07-11 at 11:47 +0100, Jacques Basaldúa wrote: What you call a dirty hack, patterns deeply implemented in their brains. The dirty hack I'm referring to is the robotic way this is implemented in programs, not how it's done in humans. With a pattern-based program you essentially specify everything, and the program is not a participant in the process. It comes down to a list of do's and don'ts, and if we can claim that knowledge was imparted, that might be true, but no wisdom or understanding was. UCT simulates understanding and wisdom; patterns just simulate knowledge. Again, this is largely philosophical, because even UCT programs are robots just following instructions. It's all about what you are trying to simulate and why it's called AI. I think UCT tries to simulate understanding to a much greater extent than raw patterns in a conventional program. - Don
Re: [computer-go] Explanation to MoGo paper wanted.
On 7/11/07, Don Dailey [EMAIL PROTECTED] wrote: The dirty hack I'm referring to is the robotic way this is implemented in programs, not how it's done in humans. With a pattern-based program you essentially specify everything, and the program is not a participant in the process. It comes down to a list of do's and don'ts, and if we can claim that knowledge was imparted, that might be true, but no wisdom or understanding was. I'm compelled to point out that neural nets, _trained_ on patterns, which patterns themselves are then discarded, have the ability to recognize novel patterns, ones which have never been previously seen, let alone stored. The list of do's and don'ts has been discarded, and what to do or not do, in a situation that may never have been seen before, is inferred, not looked up in a library of rules. So, it is not true that with a pattern-based program you essentially specify everything. At least, not if you have thrown the patterns away, and have substituted multilayer feedforward networks for that _training_data_. UCT simulates understanding and wisdom; patterns just simulate knowledge. This is a very strong assertion. We eagerly await the proof. :-) I can just as easily assert: Trained neural nets simulate understanding and wisdom. (A static pattern library merely simulates knowledge, I agree.) Again, this is largely philosophical, because even UCT programs are robots just following instructions. It's all about what you are trying to simulate and why it's called AI. I think UCT tries to simulate understanding to a much greater extent than raw patterns in a conventional program. Than raw patterns, yes. Trained neural nets, too, try to simulate understanding to a much greater extent than do raw patterns. Of course Don is right, it boils down to philosophy. And while we're on that topic... I regret some of the terms that have come into use with regard to AI, due to the (misguided, in my humble opinion) philosophy of some.
The very name "artificial intelligence" bothers me; AI programs are neither. When humans run certain computer programs, the programs may seem intelligent enough to perform other tasks. By the implied reasoning, taken to its logical conclusion, a hammer is intelligent enough to drive a nail. The military has its so-called smart bombs, but in truth machines and algorithms are no more intelligent than hammers. By a similar token, "pattern recognition" bothers me. Machines and algorithms don't recognize anything, ever. That's anthropomorphism. A somewhat better term is "pattern classification", but machines don't really classify anything, either. It is we humans who classify, _using_ the machines. It's like saying that the hammer drives the nail, when in fact it is the human who does so, _using_ the hammer. And there is nothing particularly neural about neural networks, other than their origins. (True, they were first invented -- discovered, really -- by someone who was trying to simulate a neuron, but they are much more general than that.) I prefer the term "multilayer feedforward network" for the type of neural net commonly used in many domains. (And now in go!) This sort of semantic nitpicking may seem too severe. However, it keeps me from falling into the camp of those who believe that machines will one day literally become intelligent, develop self-awareness, and achieve consciousness. Ain't gonna happen. -- Rich P.S. -- I hated the movie AI.
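Rich's earlier point, that a net trained on patterns can answer correctly for an input it has never seen, can be made concrete with a toy example. This uses a single-layer perceptron, far simpler than the multilayer feedforward networks he mentions and purely illustrative: trained on three of the four cases of logical OR, it infers the unseen fourth case rather than looking it up.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Classic perceptron learning rule; returns (weights, bias)."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Train on three of the four input patterns of logical OR; the
# pattern (1, 1) is deliberately withheld.
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1)]
w, b = train_perceptron(samples)
```

After training, `predict(w, b, (1, 1))` returns 1: the do's-and-don'ts list (the sample set) has been discarded into weights, and the novel input is classified by inference, which is the distinction Rich draws.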
Re: [computer-go] Explanation to MoGo paper wanted.
Perhaps some day a mad Dr. Frankenstein will implement massively parallel supercomputing using an array of brains in petri dishes. But it will still be the meat that is intelligent. It's the only substance capable of that. I read an article several months back where a researcher used mouse neurons in a petri dish to create basic logic gates.
Re: [computer-go] Explanation to MoGo paper wanted.
On Wed, 2007-07-11 at 09:06 -0500, Richard Brown wrote: I'm compelled to point out that neural nets, _trained_ on patterns, which patterns themselves are then discarded, have the ability to recognize novel patterns, ones which have never been previously seen, let alone stored. The list of do's and don'ts has been discarded, and what to do or not do, in a situation that may never have been seen before, is inferred, not looked up in a library of rules. There is a blurry line, and much of it is semantics. Most programs have hand-coded knowledge in them, and that's pretty much the extent of it. Neural nets and other ideas I of course welcome, as I believe they are better simulators of what a brain should be than simple patterns. I'm really in favor of making the attempt to produce a program that has very little, if any, domain-specific knowledge other than the rules of the game. I'm not claiming it will be better - but it will be more elegant, more aesthetically pleasing, than rote knowledge blindly applied. In computer chess there is a big deal made about the opening book. This is as ugly as it gets - there is ZERO understanding of anything other than doing a database lookup. But it's pretty much a necessity, and there is something to be said for benefiting from the knowledge obtained by others - even we humans do it in a big way. So don't think I'm lambasting the idea of using knowledge in crude ways - I'm foremost an engineer and will do whatever it takes. But I'm recommending an approach not based on brute force if possible. - Don
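The opening-book lookup Don calls "as ugly as it gets" really is just a table: position in, pre-stored reply out, with no evaluation involved at all. A sketch; the keys, moves, and weights below are placeholders, not entries from any real book:

```python
import random

# A chess-style opening book is literally a mapping from a position
# key (in practice a FEN string or Zobrist hash; "start" here is a
# stand-in) to pre-approved replies with selection weights.
book = {
    "start": [("e2e4", 60), ("d2d4", 35), ("c2c4", 5)],
}

def book_move(position_key, rng=random.Random(42)):
    """Return a weighted-random book reply, or None when out of book,
    at which point a real engine falls back to its search."""
    entries = book.get(position_key)
    if entries is None:
        return None
    moves, weights = zip(*entries)
    return rng.choices(moves, weights=weights)[0]
```

Nothing in this code understands chess; that is exactly Don's point about a database lookup, and also why it is cheap, fast, and hard to give up.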
Re: [computer-go] Explanation to MoGo paper wanted.
Hi David, (...) I cannot imagine that progress will be made without a great deal of domain knowledge. Depending on what exactly you mean, I disagree. I mean progress by the standard usually applied to computer Go: programs that can beat 1D humans on a full board, and then get better. For me progress meant an improvement, not a goal :-). Anyway, only a few months ago almost everyone on this list said that UCT/MC was only suitable for 9x9 Go, which was said not to represent the Real Game Of Go, and would never (in the near future, or even in the far future) do well on 19x19. Today, some UCT/MC based programs, e.g. MoGo, can give 5 stones to gnugo on 19x19 in a 30-minute game, without any additional go knowledge. For me it is evidence that progress can be made very quickly without a great deal of domain knowledge. Why could we (by "we" I mean the whole community) not imagine other improvements? 1D humans on a full board is not so far away, contrary to what you seem to say... I am less sure that knowledge representation in the classical programs is the right expertise for its representation in the MC playouts. Yes, maybe you are right, I don't know. But I think it is not a bad expertise :-). And also it is a very expensive expertise (in the sense that it takes a lot of time to get). So many things yet to do! To me it sounds like good news, don't you think? :-) Cheers, Sylvain
Re: [computer-go] Explanation to MoGo paper wanted.
Nonetheless, a program that could not only play a decent game of go, but somehow emulate the _style_ of a given professional would be of interest, would it not? Is this the case in chess? If so, I've never heard of it.
Re: [computer-go] Explanation to MoGo paper wanted.
On 7/10/07, Chris Fant [EMAIL PROTECTED] wrote: Nonetheless, a program that could not only play a decent game of go, but somehow emulate the _style_ of a given professional would be of interest, would it not? Is this the case in chess? If so, I've never heard of it. I don't think that it is (but I don't know much about computer chess). For a machine to learn the _style_ of anything whatsoever, by my reckoning, is a rather difficult task. As an example, I was once privileged to attend a talk by Donald Knuth in which he described a somewhat difficult task that he was working on, and challenged us to think about, namely: to teach a machine to recognize the _style_ of some arbitrary font. Far more difficult than mere OCR (optical character recognition), wherein one already possesses the entire set of alphanumeric characters and symbols of a particular font, this task was something like the following: Given only an uppercase 'B', the numeral '4', and a lowercase 'a': reproduce the entire font. As you might well imagine, doing that could prove a bit trickier than OCR. It's almost akin to reading the mind of a calligrapher: What strokes would be used to create a '7', an 'f', or an ampersand ('&'), given that we know only the three characters above? At what point do we think we have the right answer to such questions? If we think that we are finished, and then compare the font we have created against the actual font, have we failed if it turns out that there are differences? That is, to what degree must our created font match _exactly_ the actual font? Pixel for pixel? Or is there a degree of leeway, within which we may be satisfied that we have succeeded? In a similar way, being able to recognize the _style_ of some particular pro go player is a bit trickier than merely creating a program that plays. It's a different problem altogether.
Just as Knuth's problem is harder than OCR, so too is capturing a pro's style a greater challenge than creating a go program. [Disclaimer: I've forgotten the exact details of Knuth's challenge. He had determined that there were three or four characters that had the necessary and sufficient details (loops, serifs, horizontals, verticals, diagonals, etc.) to permit recreating the entire font, for most fonts anyway. I don't remember which characters, nor how many, although I'm sure it was either three or four.] -- Rich
Re: [computer-go] Explanation to MoGo paper wanted.
Don wrote: Of course now we just had to go and spoil it all by imposing domain-specific rules. I have done the same and I admit it. It would be fun to see how far we could go if domain-specific knowledge was forbidden as an experiment. Once patterns are introduced along with other direct Go knowledge, it's still fun but it feels a bit wrong, kind of like cheating. It's clear that when we do this, we introduce strengths and weaknesses to the program, making it a bit more fragile, less universal or robust. Stronger too, but more susceptible to intransitivity. I'm on the other side of this issue. In my opinion all kinds of go knowledge are fair game, and I'm rather disappointed that such small amounts of domain-specific knowledge have been merged with the UCT search approaches. /Gunnar
Re: [computer-go] Explanation to MoGo paper wanted.
Benjamin wrote: I have built just for fun a simple BackGammon engine. [...] Interesting - did you also try it for chess, or do you think there's no point in this? This is a bit of speculation since I don't know enough about chess, but I suspect that uniform random simulation in go is about as reliable an evaluation as a plain counting of piece values in chess, except that the latter comes without noise. So doing random simulations in chess would only make life more difficult, unless perhaps the simulation policy was really good. Doing UCT search instead of alpha-beta with some deterministic evaluation function might be an interesting experiment, but I suspect alpha-beta is more efficient in that case. Othello seems like a better fit for UCT/MC and I suppose that has already been tested. /Gunnar
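Gunnar's "comes without noise" remark can be quantified. A batch of uniform random playouts is a Bernoulli estimator of the position's win rate, so its standard error is sqrt(p(1-p)/N): quadrupling the playouts only halves the noise, whereas a material count in chess returns the same number every time. A sketch with playouts modeled simply as Bernoulli trials:

```python
import random

def mc_estimate(true_winrate, n_playouts, rng):
    """Estimate a position's value as the average of n random playouts,
    modeled here as coin flips at the position's true win rate."""
    wins = sum(rng.random() < true_winrate for _ in range(n_playouts))
    return wins / n_playouts

# With p = 0.6 and N = 10000 the standard error is about 0.005, so
# estimates cluster tightly around 0.6; with N = 100 it is ten times
# larger, which is the noise the search must average away.
```

This is why playout quality matters so much: a better policy does not just raise the mean, it also shrinks the variance the tree has to fight through.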
Re: [computer-go] Explanation to MoGo paper wanted.
Quoting Yamato [EMAIL PROTECTED]: In other words UCT works well when evaluation/playouts is/are strong. I believe there are still improvements possible to the UCT algorithm, as shown by the recent papers by the MoGo and Crazystone authors, but what really will make a difference is the quality of the playouts. Sylvain said that good moves in the playouts do not always improve the performance of UCT. What do you think about this claim? I think it is true to some extent. When a new pattern is added it introduces a bias towards certain shapes. When this happens the UCT search will encounter new types of positions which it cannot handle, because it does not have those patterns yet. So it sometimes happens that the program gets worse with pattern A alone, but with patterns A and B it gains in strength. A second way of formulating this is that there are no rules without exceptions in go. Thus for every pattern added one creates problems, and one might need to discover those exceptions to get the benefits. And then there are of course exceptions to the exceptions... So while I think it is possible to improve the playouts infinitely, I would also say it is really, really difficult... at least by hand, which is what I am doing now. Here is a problem I have experienced both with Viking5 (my first MC program) and Valkyria. When ladder-reading code is added, premature moves on the second line become more popular, because every time a random move of the other color is played on the first line, just below the move on the second line, the program can capture that stone in a ladder and get stronger. The solution is to prune this kind of move on the first line. With Viking I also had problems with code for defending eye shape, which I know made the program slightly weaker overall, although it played life-and-death situations better. Here I think that for most such defensive moves in a playout, the group is already dead or alive anyway, and tenuki would be correct.
Thus using this code would also require knowledge of what is alive, dead and unstable, and this was too complicated for Viking, so in the end I disabled that code. -Magnus -- Magnus Persson Berlin, Germany
Re: [computer-go] Explanation to MoGo paper wanted.
Hi, Magnus Magnus Persson wrote: Weak tactics is a problem of the playouts in my opinion. UCT as a general search method has thus little to do with ladders and other game-specific details. If there are no tactical mistakes in the playouts the problems disappear. Also tactics has a large impact on where territory appears in the playout, since much of the noise in the playouts comes from tactical mistakes near the end of the game. I meant the specific problems already mentioned on this list where you should not start something (e.g. a ladder) unless you are sure to win it. The more you play (if you don't win) the more you lose. The best move is hidden by the increasingly negative evaluation of continuing the ladder another step and losing it. On a 9x9 board, ladders may not be very long, but on 19x19 they can be. Of course, the case of ladders has simple solutions, such as forcing the playout to follow the atari lines, but in more complex situations there is no solution known (to me). Of course, UCT would find the optimal solution with infinite time, but that is not the question. In fact, it is harder to find the solution with UCT than with non-stochastic methods. Jacques.
Re: [computer-go] Explanation to MoGo paper wanted.
Quoting Jacques Basaldúa [EMAIL PROTECTED]: Hi, Magnus Magnus Persson wrote: Weak tactics is a problem of the playouts in my opinion. UCT as a general search method has thus little to do with ladders and other game-specific details. If there are no tactical mistakes in the playouts the problems disappear. Also tactics has a large impact on where territory appears in the playout, since much of the noise in the playouts comes from tactical mistakes near the end of the game. I meant the specific problems already mentioned on this list where you should not start something (e.g. a ladder) unless you are sure to win it. The more you play (if you don't win) the more you lose. The best move is hidden by the increasingly negative evaluation of continuing the ladder another step and losing it. On a 9x9 board, ladders may not be very long, but on 19x19 they can be. Of course, the case of ladders has simple solutions, such as forcing the playout to follow the atari lines, but in more complex situations there is no solution known (to me). Of course, UCT would find the optimal solution with infinite time, but that is not the question. In fact, it is harder to find the solution with UCT than with non-stochastic methods. But what I tried to say is that this is easy to fix, if one adds the following rules to the playouts. If a move can escape from atari (ladder broken): play it. If a move with 2 liberties can be captured in a ladder: play the ladder. If a stone is in a ladder that is broken: capture it before it escapes. Essentially the playouts will not play out broken ladders. Now the nice thing about UCT is that when these rules are added, the program will play much better in almost all tactical situations, even those that are not captured explicitly by these rules. The reason is that a lot of noise is removed from the playouts. So my point is that the ladder problem is not much of a problem for UCT itself (although writing fast and bug-free ladder code may be tricky).
The code that fixes the problem can also be completely independent of the tree search. (But I do use it in the search part too, to improve move ordering for progressive widening). -Magnus
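Magnus's fix amounts to a priority list of tactical rules consulted before the uniform random fallback. A structural sketch of that playout policy; the rules here are stand-in callables, since real ladder reading on a real board is well beyond a few lines:

```python
import random

def playout_move(rules, legal_moves, rng=random):
    """Pick a playout move by trying an ordered list of tactical rules,
    falling back to a uniformly random legal move.

    Each rule is a callable returning a move or None.  For the ladder
    fixes described above, the list would be, in priority order:
      1. escape from atari when the ladder is broken,
      2. capture a two-liberty chain in a working ladder,
      3. capture a broken-ladder stone before it escapes.
    The rule implementations themselves are omitted here.
    """
    for rule in rules:
        move = rule()
        if move is not None:
            return move
    return rng.choice(legal_moves)
```

Keeping the rules as a plain ordered list is what lets the same code serve both the playouts and, as Magnus notes, move ordering in the tree.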
Re: [computer-go] Explanation to MoGo paper wanted.
Don Dailey wrote: I have posted before about the evils of trying to extract knowledge from human games. I don't think it is very effective compared to generating that knowledge from computer games, for several reasons. I would agree if we could have quality games played by computers. In 19x19, of course. But computer moves are so vulgar: when they are tesuji, it is because a search has found a killer move, not because the program has good style. The killer moves only apply to that position exactly (or to a local subgame whose limits are not trivial to determine). There is not much to learn from killer moves. What programs need to learn is style, and from programs you only learn bad habits. Jacques.
Re: [computer-go] Explanation to MoGo paper wanted.
But you can improve the prior probabilities of your search function by remembering shapes (hopefully more abstract ones in the future, including more knowledge about the neighbourhood) that seemed like good moves before, so I don't share your opinion. Whether or not this knowledge should also be strongly employed deeper in the search tree (corresponding to the playout part) is another question to me. Benjamin I think trying to learn from human games is usually bad too, for similar reasons. I had at least 3 reasons why I think it's bad; one of them is simply what I call the omission problem: you don't really see (or sample) the reasons certain moves are or are not played. - Don
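One common way to use remembered shapes as priors, roughly in the spirit Benjamin suggests, is to blend a heuristic prior into the UCB1 value as a number of fake visits, so the shape knowledge dominates move selection early and fades as real playout evidence accumulates. A sketch; the blending formula and the weight of 10 are one possible scheme, not the exact method of MoGo or any other program:

```python
import math

def ucb1_with_prior(children, exploration=1.4, prior_weight=10):
    """UCB1 selection where each child carries a heuristic prior in
    [0, 1], e.g. from a learned shape table.  `children` is a list of
    (wins, visits, prior) triples.  The prior acts like `prior_weight`
    fake wins out of `prior_weight` fake visits."""
    total = max(1, sum(v for _, v, _ in children))

    def score(child):
        wins, visits, prior = child
        value = (wins + prior_weight * prior) / (visits + prior_weight)
        return value + exploration * math.sqrt(
            math.log(total) / (visits + 1))

    return max(range(len(children)), key=lambda i: score(children[i]))
```

With no visits the prior decides; after a few hundred visits per child the observed win rate swamps it, which answers the worry about shape knowledge being "strongly employed" too deep: it naturally self-limits.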
Re: [computer-go] Explanation to MoGo paper wanted.
On Thu, 2007-07-05 at 09:47 +0100, Jacques Basaldúa wrote: Don Dailey wrote: I have posted before about the evils of trying to extract knowledge from human games. I don't think it is very effective compared to generating that knowledge from computer games, for several reasons. I would agree if we could have quality games played by computers. In 19x19, of course. But computer moves are so vulgar: when they are tesuji, it is because a search has found a killer move, not because the program has good style. The killer moves only apply to that position exactly (or to a local subgame whose limits are not trivial to determine). There is not much to learn from killer moves. What programs need to learn is style, and from programs you only learn bad habits. I'm really advocating self-generated knowledge, where the computer itself has an active part in the process, not a passive observer trying to learn by example only. I see very little value in this method of trying to extract knowledge from a collection of games. What you can expect to learn from human games is pretty limited. A chain is as strong as its weakest link - and computers have too many weak links that must be fixed first. Learning sophisticated and profound concepts from high Dan players by looking at their games is shooting for the moon. A master friend of mine, who was a good chess teacher, once told me that if you want to learn to play chess well, it is far more important to have a good teacher teaching you than a strong player. His conclusion was that teaching skill is the overwhelming consideration, and his strength was a relatively minor consideration as long as he was stronger than you. (Since then, I've come to believe that even a weaker player can teach, as long as he has something you don't have - we see that with athletic coaches who teach despite the fact that their student is far better than they are.)
With machine learning, it's all about teaching skill, but the teacher is some kind of learning algorithm. A passive set of games is not a teacher. The quality of the games is just not very important at the current levels of Go play compared to the teaching algorithm. You can almost (but not quite) ignore this factor. That being said, if you can generate useful data by using computers instead, you have far more control over the consistency of the data and all the variables that are important. For instance, in a set of master games, what feedback do I have about each move other than that it was chosen? How do I get the opinion of the master player concerning the moves played and not played? The learning signal is almost non-existent, and it's generally assumed that move X is good because the master chose it, and the others are bad because he didn't. Because of this, and other things too, I don't have much belief in computers learning from huge sets of games in this way. The computer has too passive a role in this kind of learning. There is no two-way interaction between master and student. Far better, if you want to involve human players, is some kind of human-assisted learning where games are played and learning takes place by trial and error and direct interaction with the teacher. But this isn't very practical for machine learning, which likes thousands of examples to work from. - Don
Re: [computer-go] Explanation to MoGo paper wanted.
On Wed, 2007-07-04 at 11:34 +0200, Magnus Persson wrote: but what really will make a difference is in the quality in the playouts. I would like to suggest a more abstract view of things. In the purest form of the algorithm there isn't an artificial distinction between the tree and the play-outs. The algorithm is applied as if the whole tree already exists (conceptually) and nodes are updated to the end of the game. We had to impose end nodes and a tree that grows in depth due to the fact that it's impractical to store the whole tree in memory. So we have a tree phase on the one hand, and on the other hand we have a play-out phase that simulates an unexplored tree (but without updates, which out of necessity introduces a small inefficiency). This makes everything a bit of a compromise, but a well-advised one due to hardware limitations. But then we started imposing our will on the play-outs in order to make them smarter. But we didn't do the same to the tree portion, because we now believe they are 2 separate things (even though they really are not). So I prefer to think of the play-outs and the tree as the same thing. I think whatever is done can be applied to both. For instance, Lazarus does a lot of pruning, and the pruning rules are the same for the tree portion and the play-out portion. Actually, Lazarus saw most of the improvement from the tree pruning when I tested each without the other. But I notice that we are now looking at the tree as the search portion and the play-outs as the evaluation function. I think that is incredible, because I have always believed that tree search and evaluation are the same thing, just different forms or states. Like water and ice, or matter and energy. It's interesting that chess has this too. Traditionally programs have always had these 3 very crude phases: search, quiescence, evaluation. Modern programs have somewhat blurred these distinctions, but it hasn't changed very much.
UCT comes along and finally does away with the distinction altogether. Now you can call it all evaluation or search, whatever pleases you. But in its purest form, UCT with totally random play-outs is a beautiful thing - a recursive evaluation function with (almost) zero domain-specific knowledge. Of course now we just had to go and spoil it all by imposing domain-specific rules. I have done the same and I admit it. It would be fun to see how far we could go if domain-specific knowledge was forbidden as an experiment. Once patterns are introduced along with other direct Go knowledge, it's still fun but it feels a bit wrong, kind of like cheating. It's clear that when we do this, we introduce strengths and weaknesses to the program, making it a bit more fragile, less universal or robust. Stronger too, but more susceptible to intransitivity. Of course we do this in chess programs in a big way. We very tediously tell the program what is good and what is bad. It has no choice; it must accept our definition of right and wrong, our morality. However, in our great wisdom we provide a search mechanism in order to correct our bad judgments. The search mechanism is an admission that we know we are wrong about many things. Of course you are right - if the play-outs are improved, the quality of the moves will also improve. - Don
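Don's purest form, UCT with totally random play-outs and no domain knowledge beyond the rules, can be shown whole on a toy game. The sketch below uses one-pile Nim (take 1 to 3 stones; whoever takes the last stone wins) purely so it stays self-contained and checkable; names and constants are illustrative. Note how one statistics table serves as both "tree" and "play-out" bookkeeping: a node is simulated randomly only until its first visit, exactly the memory compromise described above.

```python
import math, random

def moves(pile):
    return [m for m in (1, 2, 3) if m <= pile]

def random_playout(pile, to_move, rng):
    """Play uniformly at random to the end; return the winner (0 or 1)."""
    while pile > 0:
        pile -= rng.choice(moves(pile))
        to_move = 1 - to_move
    return 1 - to_move  # the player who just took the last stone wins

def uct_move(pile, player, iters=4000, c=1.4, rng=None):
    """Minimal UCT with zero game knowledge beyond the move generator."""
    rng = rng or random.Random(0)
    wins, visits = {}, {}  # statistics keyed by (pile, to_move)

    def search(p, tm):
        if p == 0:
            return 1 - tm  # previous mover took the last stone
        key = (p, tm)
        if key not in visits:          # unexplored: expand and play out
            wins[key], visits[key] = 0.0, 0
            winner = random_playout(p, tm, rng)
        else:                          # explored: descend by UCB1
            best, best_score = None, -1.0
            for m in moves(p):
                child = (p - m, 1 - tm)
                if child not in visits:
                    best = m
                    break
                score = (1 - wins[child] / visits[child]) + c * math.sqrt(
                    math.log(visits[key]) / visits[child])
                if score > best_score:
                    best, best_score = m, score
            winner = search(p - best, 1 - tm)
        visits[key] += 1
        wins[key] += (winner == tm)
        return winner

    for _ in range(iters):
        search(pile, player)
    # the most-visited child is the answer
    return max(moves(pile),
               key=lambda m: visits.get((pile - m, 1 - player), 0))
```

From a pile of 5 the optimal move is to take 1 (leaving a multiple of 4), and the search converges on it with nothing hand-coded about piles or parity: the recursive evaluation discovers it.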
Re: [computer-go] Explanation to MoGo paper wanted.
On Wed, 2007-07-04 at 16:57 -0400, George Dahl wrote: Of course now we just had to go and spoil it all by imposing domain-specific rules. I have done the same and I admit it. It would be fun to see how far we could go if domain-specific knowledge was forbidden as an experiment. Once patterns are introduced along with other direct Go knowledge, it's still fun but it feels a bit wrong, kind of like cheating. Is it still cheating if the program learns and discovers the patterns itself? Then isn't it just precomputing a few things? Of course it isn't cheating really, but it seems more elegant to me if the computer is doing the figuring out, not the programmer. Of course the programmer has to figure out how to write the program in the first place. But the idea of writing a Go program without any hand-coded Go knowledge is very appealing to me. Of course, there HAS to be Go knowledge, even if it's figured out by the software. In Lazarus, I use several patterns for pruning moves. But those patterns were not generated by ME. Lazarus knows more about Go than I do, and so Lazarus generated those patterns (off-line). Ultimately, I would like programs to figure out on the fly what to do. It's fun to imagine how a program would work if God wrote it. Would there be tons of hard-coded knowledge built into it, or would it be a learning meta-system that had facilities for quickly finding out things for itself that it needed to know? - Don
Re: [computer-go] Explanation to MoGo paper wanted.
And how much would generating patterns from pro games be cheating? How about a system that gives a reward to shapes it actually played in a game; the pro games are then used as a seed to start the system.
Re: [computer-go] Explanation to MoGo paper wanted.
Pro games are cheating unless the program is one of the players. :) You are right though, sometimes compromises must be made when seeding an algorithm. My ideas on using domain knowledge from humans are sort of about maximizing a ratio: the ratio of program performance to domain knowledge added (by humans, directly). Obviously it is hard to quantify these sorts of things, but if program A is 3 times as good (whatever that means) as program B and uses only twice the human-given Go knowledge, I would rather have program A. - George On 7/4/07, Benjamin Teuber [EMAIL PROTECTED] wrote: And how much would generating patterns from pro games be cheating? How about a system that gives a reward to shapes it actually played in a game; the pro games are then used as a seed to start the system.
Re: [computer-go] Explanation to MoGo paper wanted.
On Thu, 2007-07-05 at 00:53 +0200, Benjamin Teuber wrote: And how much would generating patterns from pro games be cheating? How about a system that gives a reward to shapes it actually played in a game; the pro games are then used as a seed to start the system. I have posted before about the evils of trying to extract knowledge from human games. I don't think it is very effective compared to generating that knowledge from computer games, for several reasons. Of course I realize this is not a popular point of view! - Don
Re: [computer-go] Explanation to MoGo paper wanted.
On Wed, 2007-07-04 at 19:23 -0400, Don Dailey wrote: On Thu, 2007-07-05 at 01:09 +0200, Magnus Persson wrote: Just to disturb the vision of a strong go program without hardwired go knowledge: I currently think that there are some really important things in Go that are really hard or even impossible to learn with, for example, patterns. The ideal program would need to learn procedural skills (algorithms). I'm not saying a program can be as good without hardwired knowledge, I'm just saying it would be a cool thing! And even if you could, it would still require hard-coded meta-skills - skills programmed explicitly to enable it to LEARN or discover what it needed. So even if it wasn't direct go knowledge, it would be indirect go knowledge. Kind of like: give a man a fish, or teach him to fish. - Don
Re: [computer-go] Explanation to MoGo paper wanted.
On Thu, 2007-07-05 at 01:09 +0200, Magnus Persson wrote: Just to disturb the vision of a strong go program without hardwired go knowledge: I currently think that there are some really important things in Go that are really hard or even impossible to learn with, for example, patterns. The ideal program would need to learn procedural skills (algorithms). I'm not saying a program can be as good without hardwired knowledge, I'm just saying it would be a cool thing! - Don
Re: [computer-go] Explanation to MoGo paper wanted.
In other words, UCT works well when the evaluation/playouts is/are strong. I believe there are still improvements possible to the UCT algorithm, as shown by the recent papers by the MoGo and CrazyStone authors, but what really will make a difference is the quality of the playouts. Sylvain said that good moves in the playouts do not always improve the performance of UCT. What do you think about this claim? -- Yamato
Re: [computer-go] Explanation to MoGo paper wanted.
I believe this claim is true in two senses: 1) If the computation necessary to find better moves is too expensive, performing many dumb playouts may be a better investment. 2) If the playouts are too deterministic, and the moves are merely pretty good, the program may avoid an important move and thus misjudge the value of a position. Peter Drake http://www.lclark.edu/~drake/ On Jul 4, 2007, at 5:52 PM, Yamato wrote: In other words, UCT works well when the evaluation/playouts is/are strong. I believe there are still improvements possible to the UCT algorithm, as shown by the recent papers by the MoGo and CrazyStone authors, but what really will make a difference is the quality of the playouts. Sylvain said that good moves in the playouts do not always improve the performance of UCT. What do you think about this claim? -- Yamato
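Peter's second point is why playout policies usually keep some residual randomness rather than always playing the heuristically best move. A minimal epsilon-greedy sketch (the `score` heuristic here is a stand-in for whatever cheap move knowledge a program has, and the function name is made up):

```python
import random

def playout_move(moves, score, epsilon=0.1, rng=random):
    """Pick a playout move: usually the move a cheap heuristic
    `score` likes best, but with probability `epsilon` a uniformly
    random one, so no legal move is ever excluded entirely."""
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=score)
```

With epsilon at 0 the playouts are fully deterministic (Peter's failure case); with epsilon at 1 they are plain random playouts; the interesting regime is in between.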
Re: [computer-go] Explanation to MoGo paper wanted.
2) If the playouts are too deterministic, and the moves are merely pretty good, the program may avoid an important move and thus misjudge the value of a position. IMO, this is the most interesting part of Computer Go today. How can one possibly design an optimal playout agent, when making a playout agent that plays strongly is not the solution? The only known method seems to be trial and error.
[computer-go] Explanation to MoGo paper wanted.
Hello all, We just presented our paper describing MoGo's improvements at ICML, and we thought we would pass on some of the feedback and corrections we have received. (http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf) I have the feeling that the paper is important, but it is completely obfuscated by the strange reinforcement-learning notation and jargon. Can anyone explain it in Go-programming words? Is the RLGO evaluation function used for evaluation, or just for selecting the best move (by doing a 1-ply search)? Can anyone explain to me why it is necessary to obfuscate things at all? Why is a move an action and not just a move, a game an episode and not a game? Is it less scientific if coders like myself can understand it? It was pointed out by Donald Knuth in his paper on Alpha-Beta that the - simple - algorithm was not understood for a long time because of the inappropriate mathematical notation. For recursive functions, (pseudo-)code is much better suited than mathematical notation. Actually it's pseudo-mathematical notation. Why is this inappropriate notation still used? I have built, just for fun, a simple backgammon engine. I think it does what the paper proposes for the Monte-Carlo part. It uses a simple evaluation function to select the next move in the rollout, aka Monte-Carlo simulation. The engine does not build up a UCT tree. It uses UCT only at the root; the rollout always starts at the first ply. The 1-ply engine has not the slightest chance against a sophisticated backgammon program, but the simple-minded UCT version is already a serious opponent. By building up a UCT tree one could probably reach top backgammon level (the effort to do this does not pay; the backgammon market is saturated). The simple engine behaves deterministically in a given position with given dice, but the roll of the dice generates sufficient randomness. Chrilly
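Chrilly's "UCT only at the root" amounts to running UCB1 over the root moves, with one rollout per trial. A hedged sketch, where the toy `rollout` with made-up win rates stands in for the engine's actual simulation:

```python
import math, random

def ucb1_root(root_moves, rollout, n_trials=1000, c=1.4):
    """UCT at the root only: treat each root move as a bandit arm,
    pick the arm maximizing mean + c*sqrt(ln(N)/n), play one rollout
    per trial, and finally return the most-visited move.
    `rollout(move)` returns 1 for a win, 0 for a loss."""
    wins = {m: 0.0 for m in root_moves}
    visits = {m: 0 for m in root_moves}
    for t in range(1, n_trials + 1):
        untried = [m for m in root_moves if visits[m] == 0]
        if untried:
            m = untried[0]  # try every arm once before using the formula
        else:
            m = max(root_moves, key=lambda a: wins[a] / visits[a]
                    + c * math.sqrt(math.log(t) / visits[a]))
        wins[m] += rollout(m)
        visits[m] += 1
    return max(root_moves, key=lambda a: visits[a])

# usage: three hypothetical root moves with hidden win rates
rng = random.Random(0)
p = {'pass': 0.2, 'center': 0.7, 'corner': 0.5}
best = ucb1_root(list(p), lambda m: int(rng.random() < p[m]), n_trials=2000)
```

The exploration term keeps even bad root moves alive, but the visit counts concentrate on the best one, which is exactly why the "simple-minded UCT version" beats the 1-ply engine.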
Re: [computer-go] Explanation to MoGo paper wanted.
I have the feeling that the paper is important, but it is completely obfuscated by the strange reinforcement-learning notation and jargon. Can anyone explain it in Go-programming words? The most important thing in the paper is how to combine RAVE (AMAF) information with normal UCT. Like this:

uct_value = child->GetUctValue();
rave_value = child->GetRaveValue();
beta = sqrt(K / (3 * node->visits + K));
uct_rave = beta * rave_value + (1 - beta) * uct_value;

You do not always have to understand RLGO - they don't use it in the online version of MoGo. It was pointed out by Donald Knuth in his paper on Alpha-Beta that the - simple - algorithm was not understood for a long time because of the inappropriate mathematical notation. For recursive functions, (pseudo-)code is much better suited than mathematical notation. Actually it's pseudo-mathematical notation. Why is this inappropriate notation still used? I agree that pseudo-code is easy to understand. -- Yamato
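Restoring the arrows that the mail archive stripped (`child->GetUctValue()` and so on), Yamato's four lines are easy to make runnable. A sketch, with K treated as a free "equivalence" parameter (the value 1000 is an arbitrary assumption for illustration, not MoGo's actual setting):

```python
import math

def blended_value(uct_value, rave_value, visits, k=1000.0):
    """MoGo-style mix of the RAVE and UCT estimates for a node:
    beta starts at 1 (trust RAVE while the node is new) and decays
    toward 0 as the node accumulates real visits."""
    beta = math.sqrt(k / (3.0 * visits + k))
    return beta * rave_value + (1.0 - beta) * uct_value
```

At zero visits beta is 1, so a fresh node is judged purely by its RAVE value; as visits grow, beta shrinks and the ordinary UCT mean takes over.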
Re: [computer-go] Explanation to MoGo paper wanted.
I have built, just for fun, a simple backgammon engine. [...] Interesting - did you also try it for chess, or do you think there's no point in this? Regards, Benjamin
Re: [computer-go] Explanation to MoGo paper wanted.
I have built, just for fun, a simple backgammon engine. [...] Interesting - did you also try it for chess, or do you think there's no point in this? The Hydra team has thought about this, especially the Hydra chess expert GM Lutz. Some endgames are difficult to understand, but the moves are more or less forced. One could play down the line and evaluate once a clear position has been reached. One problem is the definition of a clear position. The even more difficult problem is how to incorporate this in a normal Alpha-Beta framework: how to mix the result of the normal eval with the rollout. The results in Go are spectacular because the quality of conventional evaluations is low. In chess it's at least not that bad. But one could argue that in backgammon the quality of the eval is even higher. The simple rollout program is not as strong as the best ones, but relative to its eval it is very strong. It also has a remarkable programming-effort/playing-strength ratio. These things are also done in FPGA, and the FPGA code is already much too complicated. FPGA programming is easier than ASIC design, but it's still much more cumbersome than conventional software development. Just trying out things is not possible. We felt also that even if it worked, the improvement measured in Elo would not be very spectacular. The Elo/effort ratio is low. I was simply too lazy (or too professional) to give it a try. Chrilly
Re: [computer-go] Explanation to MoGo paper wanted.
The most important thing in the paper is how to combine RAVE (AMAF) information with normal UCT. Like this:

uct_value = child->GetUctValue();
rave_value = child->GetRaveValue();
beta = sqrt(K / (3 * node->visits + K));
uct_rave = beta * rave_value + (1 - beta) * uct_value;

Thanks for the translation. The only point I am still missing: what is RAVE (AMAF)? Chrilly
Re: [computer-go] Explanation to MoGo paper wanted.
The most important thing in the paper is how to combine RAVE (AMAF) information with normal UCT. Like this:

uct_value = child->GetUctValue();
rave_value = child->GetRaveValue();
beta = sqrt(K / (3 * node->visits + K));
uct_rave = beta * rave_value + (1 - beta) * uct_value;

Thanks for the translation. The only point I am still missing: what is RAVE (AMAF)? RAVE is another name for the AMAF (all-moves-as-first) heuristic. The details of AMAF are explained in this paper: http://www.ai.univ-paris8.fr/~bh/articles/acg10-mcgo.pdf -- Yamato
Re: [computer-go] Explanation to MoGo paper wanted.
We felt also that even if it worked, the improvement measured in Elo would not be very spectacular. The Elo/effort ratio is low. I was simply too lazy (or too professional) to give it a try. it might be fun (even from a non-FPGA point of view) to try it just to see where it lies versus a conventional piece of code on equivalent hardware. the game length is roughly the same, or smaller, and the number of move choices is quite a bit more limited than a 19x19 go board (although larger than a 9x9 board in the sense that in the endgame the board is often fairly empty rather than full), so it might be surprisingly successful. s.
Re: [computer-go] Explanation to MoGo paper wanted.
We felt also that even if it worked, the improvement measured in Elo would not be very spectacular. The Elo/effort ratio is low. I was simply too lazy (or too professional) to give it a try. it might be fun (even from a non-FPGA point of view) to try it just to see where it lies versus a conventional piece of code on equivalent hardware. the game length is roughly the same, or smaller, and the number of move choices is quite a bit more limited than a 19x19 go board (although larger than a 9x9 board in the sense that in the endgame the board is often fairly empty rather than full), so it might be surprisingly successful. Backgammon has the big advantage that the dice generate the randomness. It's not fully clear how to do this in chess. GM Lutz had more forced variations in mind. It's another matter how to determine forcedness. In spite of all the nasty details, the idea sounds interesting. Chrilly
Re: [computer-go] Explanation to MoGo paper wanted.
I actually have a working chess program at a fairly primitive stage which would be appropriate for testing UCT on chess. My intuition (which is of course subject to great error) tells me that it won't pay off. However, I'm still quite curious about this and will probably give it a try at some point. To give it a fair chance, you couldn't just quickly whip something together; I think you would have to spend a lot of time experimenting, tweaking, debugging, etc. to get a good sense of whether it would do the job. Random moves in Chess are problematic. In a Go position, you can play 10,000 random games and this by itself is a rather reasonable evaluation function. Of course the better Go programs impose some bias and control over those games, and I believe this would be absolutely crucial in chess. What Chrilly calls mutual stupidity doesn't apply as strongly in Chess (in my opinion). In other words, if you have a somewhat better position, sampling a number of random games will not reliably measure this. - Don On Tue, 2007-07-03 at 05:09 -0700, steve uurtamo wrote: it might be fun (even from a non-FPGA point of view) to try it just to see where it lies versus a conventional piece of code on equivalent hardware. the game length is roughly the same, or smaller, and the number of move choices is quite a bit more limited than a 19x19 go board (although larger than a 9x9 board in the sense that in the endgame the board is often fairly empty rather than full), so it might be surprisingly successful. s.
Re: [computer-go] Explanation to MoGo paper wanted.
A long time ago I spent a few hours writing a simple chess program doing UCT search. I got to the point where it actually played better than random, but not very much. It sort of reminded me of the strength of plain MC in 19x19 Go. The problem is that many games become very long in chess, as in 19x19. My implementation was also very buggy, and some rules of chess were omitted for simplicity. I think one needs to use code similar to what is used for quiescence search to make good heavy playouts. Also, using endgame databases (or other techniques) to terminate playouts early (saving time) with a correct result may have a large impact. I think a good chess programmer should be able to improve the strength of the playouts a lot compared to uniform random moves, and such a good chess UCT program might be positionally strong at the expense of tactical accuracy. But those things would be too complicated for me, so I stopped working on it. Quoting Don Dailey [EMAIL PROTECTED]: I actually have a working chess program at a fairly primitive stage which would be appropriate for testing UCT on chess. My intuition (which is of course subject to great error) tells me that it won't pay off. However, I'm still quite curious about this and will probably give it a try at some point. To give it a fair chance, you couldn't just quickly whip something together; I think you would have to spend a lot of time experimenting, tweaking, debugging, etc. to get a good sense of whether it would do the job. Random moves in Chess are problematic. In a Go position, you can play 10,000 random games and this by itself is a rather reasonable evaluation function. Of course the better Go programs impose some bias and control over those games, and I believe this would be absolutely crucial in chess. What Chrilly calls mutual stupidity doesn't apply as strongly in Chess (in my opinion). In other words, if you have a somewhat better position, sampling a number of random games will not reliably measure this. - Don -- Magnus Persson Berlin, Germany
Re: [computer-go] Explanation to MoGo paper wanted. (BackGammon Code)
On 7/3/07, chrilly [EMAIL PROTECTED] wrote: We just presented our paper describing MoGo's improvements at ICML, and we thought we would pass on some of the feedback and corrections we have received. (http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf) They are probably referring to this paper: http://www.cs.ualberta.ca/~mmueller/ps/silver-ijcai2007.pdf No, I am referring to the icml2007 paper. What I said is that the ICML paper mentions the RLGO evaluation function, which is described in this paper: http://www.cs.ualberta.ca/~mmueller/ps/silver-ijcai2007.pdf It's because Go is not the only game in the world, and certainly not the only reinforcement-learning problem. They are using a widely accepted terminology. But a very inappropriate one. I have read Sutton's book, and all the things I know (e.g. TD-Gammon) are completely obfuscated. It is maybe suitable for presenting general concepts, but it is extremely complicated to formulate an algorithm in this framework. But the main point is: I think game programmers should be more proud of their work and should present their results in the language of game programming. We are the ones who make progress, not these paper tigers. I agree that game programmers make most of the progress in the game domain. When I wrote the Null-Move article, some referees pointed out that the article was not scientific enough. The Null Move was already known and even published before, but only after this non-scientific article did it become a must-have in chess programming. The pseudo-code had a bug, and I see till today this bug in open-source chess programs. Can you share the source? Yes. See attached archive. The interface and the UCT part are rather primitive. The move generator is better/faster than the ones I have seen in other public code (but can certainly be improved). The evaluation was published by G. Tesauro, but the implementation is more efficient. It is - according to G. Tesauro - a baseline for an evaluation. Every useful evaluation function should clearly beat this one. The purpose of the code was to study the effect of Monte-Carlo sampling. I was deeply impressed how much better the Rollout/Monte-Carlo version plays. Thanks, I asked for the source mostly because I expect to learn something from you. Best regards, Lukasz Chrilly