Re: [computer-go] Hash tables
On Jul 6, 2009, at 10:09 PM, Peter Drake wrote:

> I suppose I could also include first child / next sibling pointers. I
> wouldn't actually use them when performing playouts, but they would
> greatly speed up the mark phase of marking and sweeping.

Hmm. That would work for a tree, but this is a DAG. Consider this situation:

  A
 / \
B   C
 \ / \
  D   E

(For simplicity, ignore the fact that an actual transposition must be at least three moves long.) Here D can be reached from either B or C, but E can only be reached from C (perhaps due to ko). With first-child / next-sibling pointers, that might look like this:

A
|
B-C
|/
D-E

If we then actually move to B and try to keep the tree, we'll end up keeping E, even though it's really an unreachable node. That's probably acceptable: we can keep a few unreachable nodes, as long as we keep all of the real ones.

There might be another problem, though. Suppose, from the situation above, I add F and G as children of B:

A
|
B-C
|  \
F-G-D-E

Later, I discover that F is also a child of C. I look through C's apparent children (D and E) and discover that F is not among them. How do I add F to the list of C's children without some serious list traversal (and surgery)?

Perhaps a better solution is for each node to point to a linked list of its children. The nodes in these lists (allocated from a pool at load time, naturally) would each point to a search tree node and the next list node.

Peter Drake
http://www.lclark.edu/~drake/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
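Peter's child-list idea can be sketched in a few lines. In this toy version (Python; the class and field names are my own, and the allocation pool is omitted), each DAG node keeps its children in a singly linked list of small list cells that point to shared search nodes, so discovering a new child later -- like F under C -- is a constant-time prepend, with no traversal or surgery:

```python
class ChildLink:
    """A pool-allocatable list cell: points to one shared search-tree
    node and to the next cell in this parent's child list."""
    def __init__(self, node, next_link=None):
        self.node = node
        self.next = next_link

class DagNode:
    def __init__(self, name):
        self.name = name
        self.children = None           # head of this node's child list

    def add_child(self, node):
        # O(1) prepend -- no list traversal or surgery needed
        self.children = ChildLink(node, self.children)

    def child_names(self):
        names, link = [], self.children
        while link is not None:
            names.append(link.node.name)
            link = link.next
        return names

# Build Peter's example: D is shared by B and C, then F turns out
# to also be a child of C after the fact.
A, B, C, D, E, F = (DagNode(n) for n in "ABCDEF")
A.add_child(B); A.add_child(C)
B.add_child(D)
C.add_child(D); C.add_child(E)
C.add_child(F)                         # the late discovery: still O(1)
```

Because the list cells are separate from the search nodes, the same node (here D) can appear in several parents' lists without any node being traversed or rewired.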
Re: [computer-go] Really basic question
Although the number of games explored is very limited relative to the total number of possible games, those games are in some sense representative of what happens if you start with a particular move. That's why they can help to create a ranking that tells you something about which moves are better than others. The move to heavy playouts is about making the sample games even more representative, so that they yield more useful information.

Others on this list have reported in the past that the randomness is actually very important. Playouts that are very heavy, no matter how clever they are, actually reduce the performance because they narrow the number of games too much.

You should also read up on the "all moves as first" (AMAF) technique. This is even more surprising, because it attributes the outcome of a random game to every move of that colour during the random game, as if that was the move that had been played first. This generates information to help rank the moves even more quickly.

Oliver

On Tue, Jul 7, 2009 at 2:15 AM, Fred Hapgood hapg...@pobox.com wrote:

> I have a really basic question about how MC works in the context of Go.
> Suppose the problem is to make the first move in a game, and suppose we
> have accepted as a constraint that we will abstain from just copying some
> joseki out of a book -- we are going to use MC to figure out the first
> move de novo. We turn on the software and it begins to play out games. My
> question is: how does the software pick its first move? Does it move
> entirely at random? Sometimes it sounds that way -- that MC works by
> picking each move at random, from the first to the last, for a million
> games or so. The trouble is that the number of possible Go games is so
> large that a million games would not even begin to explore the
> possibilities. It is hard to imagine anything useful emerging from
> examining such a small number. So I'm guessing that the moves are not
> chosen at random.
> But even if you reduce the possibilities to two options per move, which
> would be pretty impressive, you'd still run out of your million games in
> only twenty moves, after which you would be back to picking at random
> again. What am I missing??
>
> http://www.BostonScienceAndEngineeringLectures.com
> http://www.pobox.com/~fhapgood
Re: [computer-go] Really basic question
Quoting Oliver Lewis ojfle...@gmail.com:

> Others on this list have reported in the past that the randomness is
> actually very important. Playouts that are very heavy, no matter how
> clever they are, actually reduce the performance because they narrow the
> number of games too much.

I would like to disagree with this statement. I think it is difficult to make good heavy playouts, but it is not impossible. Failing to make a playout stronger through heaviness does not prove anything; it just means one has failed. If I could make a heavy playout of 1 dan strength and then run it in MC tree search, I am sure it would be stronger than the playout itself.

The problem, I think, is to find a good tradeoff between heaviness and speed. In my tests of Valkyria vs Fuego, Valkyria is superior when the number of playouts is the same. But Fuego can play 5 times more playouts per second on the same hardware, which results in Fuego being slightly stronger than Valkyria at the moment.

It just cannot be that having strong playouts leads to worse play in general. It is just a matter of not slowing down the code too much, and adding just those heavy elements that do increase playing strength.

-Magnus

--
Magnus Persson
Berlin, Germany
Re: [computer-go] Really basic question
On Tue, Jul 07, 2009 at 09:16:25AM +0200, Oliver Lewis wrote:

> You should also read up on the "all moves as first" (AMAF) technique.
> This is even more surprising, because it attributes the outcome of a
> random game to every move of that colour during the random game, as if
> that was the move that had been played first. This generates information
> to help rank the moves even more quickly.

Actually, it is not so surprising. Since the playouts are random, they have no idea of urgency -- they leave large groups in atari for many moves, and it is a toss of a coin who plays to the crucial point first. Most likely it does not happen on the first move. But if that is the game-deciding point, it makes sense to give credit to whoever got to play it.

-H

--
Heikki Levanto   "In Murphy We Turst"   heikki (at) lsd (dot) dk
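The AMAF bookkeeping described above is compact enough to sketch. This is a minimal illustration in Python -- the coordinate names and the (wins, visits) layout are my own, not any particular program's implementation. The key point is that one playout updates the statistics of every (colour, point) pair that occurred anywhere in it, not just the first move:

```python
def amaf_update(stats, playout_moves, black_won):
    """Credit the playout result to every (colour, point) pair that
    occurred anywhere in the playout, as if each had been played first.
    stats maps (colour, point) -> (wins, visits)."""
    seen = set()
    for colour, point in playout_moves:
        if (colour, point) in seen:      # credit each pair only once
            continue
        seen.add((colour, point))
        wins, visits = stats.get((colour, point), (0, 0))
        won = black_won if colour == "B" else not black_won
        stats[(colour, point)] = (wins + (1 if won else 0), visits + 1)

# A made-up three-move playout that Black wins: both of Black's moves
# get credit, and White's move is scored as a loss for White.
stats = {}
amaf_update(stats, [("B", "E5"), ("W", "C3"), ("B", "G7")], black_won=True)
```

After this single playout, `stats[("B", "E5")]` and `stats[("B", "G7")]` each record one win in one visit, while `stats[("W", "C3")]` records one visit and no wins -- three moves' worth of ranking information from one game, which is exactly why AMAF accumulates useful statistics so quickly.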
Re: [computer-go] Really basic question
Fred, others have already indicated that you are missing the tree part of Monte Carlo Tree Search, but there is something else to add.

If you run uniform random playouts on an empty 9x9 board, let's say a million of them for each point, you will see a probability distribution emerging that roughly shows the centre 3x3 or 4x4 points as having a high probability of winning (around 50%, depending on komi), and you will see the edges and first line having a much lower probability. This is not going to help you choose between the points in the centre. Every time you run the simulation, a different point will be selected as best -- because the state space will be inadequately explored, as you say -- but you will be able to choose a good move.

On 19x19, the space is so inadequately explored that running a uniform Monte Carlo simulation on an empty board is useless (i.e. you will get completely different results every time you run it), and further heuristics, either during playouts or during the tree search phase, are needed.

Christian

Fred Hapgood wrote:

> I have a really basic question about how MC works in the context of Go.
> [...]  What am I missing??
>
> http://www.BostonScienceAndEngineeringLectures.com
> http://www.pobox.com/~fhapgood

--
Christian Nentwich
Director, Model Two Zero Ltd.
+44-(0)7747-061302
http://www.modeltwozero.com
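Fred's puzzle -- how can so few random games say anything useful about the first move? -- can be seen even in a toy game. The sketch below uses tic-tac-toe as a tiny stand-in for Go (all function names are my own): it ranks first moves purely by uniform random playouts, and the centre square reliably comes out ahead of an edge square even though only a minuscule fraction of possible games is sampled. The ranking emerges from the distribution of outcomes, not from exhaustive exploration.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def random_playout(board, to_move):
    """Finish the game with uniformly random moves; return 'X', 'O' or None."""
    board = board[:]
    while True:
        w = winner(board)
        if w is not None or None not in board:
            return w
        empties = [i for i in range(9) if board[i] is None]
        board[random.choice(empties)] = to_move
        to_move = "O" if to_move == "X" else "X"

def winrate(first_move, n=3000):
    """Estimate X's win rate after opening on square first_move (0-8)."""
    wins = 0
    for _ in range(n):
        board = [None] * 9
        board[first_move] = "X"
        wins += random_playout(board, "O") == "X"
    return wins / n

random.seed(1)
center, edge = winrate(4), winrate(1)
```

With only a few thousand playouts per candidate, `center` comes out clearly above `edge` -- the same effect Christian describes on the empty 9x9 board, where the centre region separates from the edges long before the state space is meaningfully explored.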
[computer-go] Experimentation (was: Really basic question)
> The problem, I think, is to find a good tradeoff between heaviness and
> speed. In my tests of Valkyria vs Fuego, Valkyria is superior when the
> number of playouts is the same. But Fuego can play 5 times more playouts
> per second on the same hardware, which results in Fuego being slightly
> stronger than Valkyria at the moment.

Indeed -- this makes me wonder why I keep seeing papers where different versions of algorithms are compared with the same number of playouts, rather than under the same time limit. What is the motivation for this? I cannot conceive of any good reason for running an experiment this way, so I would be interested in opinions.

It seems to me that making algorithms heavier and then demonstrating that they are stronger with the same number of playouts misses the point -- why would one not run an experiment under the same time conditions instead?

Christian
Re: [computer-go] Experimentation
> What is the motivation for this? I cannot conceive of any good reason for
> running an experiment this way, so I would be interested in opinions. It
> seems to me that making algorithms heavier and then demonstrating that
> they are stronger with the same number of playouts misses the point --
> why would one not run an experiment under the same time conditions
> instead?

* The heavier algorithm might be unoptimized.

* The heavier algorithm might be easier to parallelize (or put into hardware).

* The scaling behaviour might be different. E.g. if Fuego and Valkyria are both run with 10 times more playouts, the win rate might change. Just to dismiss an algorithm that loses at time limits that happen to suit rapid testing on today's hardware could mean we miss out on the ideal algorithm for tomorrow's hardware. (*)

* Analyzing why the win rate of the super-heavy algorithm is better might give other people ideas for lighter but still effective playouts.

Darren

*: As an example, Monte Carlo itself was ignored for the first 10 years of its life because traditional programs were stronger on the same hardware.

--
Darren Cook, Software Researcher/Developer
http://dcook.org/gobet/  (Shodan Go Bet - who will win?)
http://dcook.org/mlsn/  (Multilingual open source semantic network)
http://dcook.org/work/  (About me and my work)
http://dcook.org/blogs.html  (My blogs and articles)
Re: [computer-go] Experimentation
Darren Cook wrote:

>> It seems to me that making algorithms heavier and then demonstrating
>> that they are stronger with the same number of playouts misses the point
>> -- why would one not run an experiment under the same time conditions
>> instead?
>
> * The heavier algorithm might be unoptimized.

That is probably a good pragmatic point, but it does not make the experiment any more valid. You are already designing the algorithm so that it performs better at the same number of playouts, by trading off speed. I don't think it is then valid to also use that as proof that it performs better -- the experiment is slanted against the lighter, faster algorithm. Unfortunately, "it is unoptimized", while usually a good argument, interferes with one of the main dimensions of the experiment, which is speed.

> * The heavier algorithm might be easier to parallelize (or put into
>   hardware).

That seems unintuitive, but I confess I have no experience whatsoever in that respect.

> * The scaling behaviour might be different. E.g. if Fuego and Valkyria
>   are both run with 10 times more playouts, the win rate might change.
>   Just to dismiss an algorithm that loses at time limits that happen to
>   suit rapid testing on today's hardware could mean we miss out on the
>   ideal algorithm for tomorrow's hardware. (*)

I am not sure I follow this argument. How do you intend to prove that, unless you run the algorithms with 10 times more playouts? In that case, I would still argue that you should run them with X times longer time limits, not with 10 times more playouts -- unless you can assume (with proof or good evidence, so you can set differential numbers of playouts between the two) that tomorrow's hardware will favour one algorithm more than the other.

> * Analyzing why the win rate of the super-heavy algorithm is better might
>   give other people ideas for lighter but still effective playouts.

This I can accept.

In general, I do think it is interesting to see the win rate at the same number of playouts as background analysis, but I don't see it as convincing evidence of advancement over other algorithms.

Christian
Re: [computer-go] Experimentation
>> * The scaling behaviour might be different. E.g. if Fuego and Valkyria
>>   are both run with 10 times more playouts, the win rate might change.
>
> I am not sure I follow this argument. How do you intend to prove that,
> unless you run the algorithms with 10 times more playouts?

I think showing it is similar or better with the same number of playouts is a good first step -- the second experiment takes 10 times as long to run :-)

Darren
Re: [computer-go] Experimentation
Quoting Darren Cook dar...@dcook.org:

> * The scaling behavior might be different. E.g. if Fuego and Valkyria are
>   both run with 10 times more playouts the win rate might change. Just to
>   dismiss an algorithm that loses at time limits that happen to suit
>   rapid testing on today's hardware could mean we miss out on the ideal
>   algorithm for tomorrow's hardware. (*)

I just happened to have experimental data on exactly this topic. This is Valkyria vs Fuego, where I scale the number of playouts (Sims) x4 in each row.

   Sims  Winrate  Err    N     WR   EloDiff
      2    99.2   0.4   500  0.992    837
      8    98.2   0.6   500  0.982    696
     32    94.2   1.0   500  0.942    484
    128    88.8   1.4   500  0.888    360
    512    82.0   1.7   500  0.820    263
   2048    83.2   1.7   499  0.832    278
   8192    81.3   1.7   497  0.813    255
  32768    75.5   3.6   139  0.755    196

The data shows clearly that the 0.3.2 version of Fuego I use probably plays really bad moves with a high frequency in the playouts. With more playouts a lot of these blunders can be avoided, I guess, and the win rate goes down from 99% towards 80%. The question here is whether it goes asymptotically towards 80%, or perhaps towards 50%, with more simulations. Unfortunately I cannot continue this plot, because I run out of memory and it takes ages to finish the games.

So the question is then: is there a fixed gain from the heavy playouts beyond 512 simulations, or does the effect of the heavy playouts get less and less important with larger tree size?

Note also that this is not only a matter of having heavy playouts or not. There is also a difference in tree search, since Valkyria and Fuego probably search their trees differently, and it could be that Valkyria searches deep trees inefficiently. Maybe I should run a similar test against a light version of Valkyria to control for the search algorithm.

-Magnus
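The EloDiff column in Magnus's table follows from the win rate via the standard logistic Elo model. A quick sketch (Python; the function name is my own) reproduces his numbers:

```python
import math

def elo_diff(win_rate):
    """Elo difference implied by a win rate, per the logistic Elo model:
    wr = 1 / (1 + 10**(-d/400))  =>  d = 400 * log10(wr / (1 - wr))."""
    return 400 * math.log10(win_rate / (1 - win_rate))

# Reproduce the first and last rows of the table:
# WR 0.992 -> 837 Elo, WR 0.755 -> 196 Elo (rounded).
print(round(elo_diff(0.992)), round(elo_diff(0.755)))  # prints: 837 196
```

This also shows why the top rows are so sensitive: near 99% the logistic curve is nearly flat, so a fraction of a percent of win rate corresponds to well over 100 Elo.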
Re: [computer-go] Scoring - step function or sigmoid function?
> My conclusion was that the winning percentage is more than just an
> estimate of how likely the player is to win. It is in fact a crude
> estimator of the final score. Going back to your original comment, when
> choosing between move A that leads to a 0.5pt win, and move B that leads
> to a 100pt win, you should be seeing move B has a higher winning
> percentage.
>
> Darren

Good point. Wins will occur in clusters, so win rate and score go hand in hand. MC algorithms seem to be treacherous ground when it comes to anticipating consequences.

Here's another shot: MC bots play handicap games poorly. When they take White they go into maniac mode, because they can't find any wins. When they take Black they dither around until the game gets close. What is entirely missing are the concepts of catching up, or of preserving a lead.

So my suggestion would be to pretend that there is a komi that almost compensates the handicap. A numerical example: a 4-stone handicap is worth about 45 points. Start with a virtual komi of about 35 points, and decrease that value to 0 during the course of the game. This works in both directions: if the bot has White it will be pessimistic, but not suicidal; if it has Black it will be optimistic, but not so lazy. The rate of giving rope should depend on the number of moves played and on the winning percentage. If the winning percentage drops early, reduce the virtual komi more sharply. One advantage of this approach is that it wouldn't tinker with even-game strategies.

A more elaborate scheme would be to make several preliminary searches at different komi levels. The goal would not be to find the best move; the search would only try to find the win rate for each komi. Depending on how far the game has progressed, giving a virtual komi will be worth some win-rate reduction. (Or taking a virtual komi will increase the win rate, making the bot play less maniacally than playing out a string of ko threats in an unfavorable position.) After a decision on the komi is reached, a search is done at this komi to find the move to be played. Towards the end of the game the virtual komi must always be 0.

This strategy might even be useful for even games, when the win rate strongly favors one side before the late endgame. That would revert to the handicap game situation (except that a disparity of strength is not presumed).

Stefan
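Stefan's schedule -- start near the handicap's value and shrink to zero -- might be sketched like this. This is only an illustration in Python: the linear decay, the move horizon, and the win-rate-triggered extra reduction ("give rope more sharply") are all my own arbitrary choices, not a tested scheme.

```python
def virtual_komi(move_number, initial_komi=35.0, horizon=180, win_rate=0.5):
    """Virtual komi granted to the stronger side's opponent, fading
    linearly to 0 by move `horizon`. If the win rate drops early,
    shed the virtual komi faster, as Stefan suggests."""
    frac_left = max(0.0, 1.0 - move_number / horizon)
    komi = initial_komi * frac_left
    if win_rate < 0.4:                 # struggling early: give rope faster
        komi *= win_rate / 0.4
    return komi

# Full virtual komi at the start, half at mid-game, none by the endgame:
print(virtual_komi(0), virtual_komi(90), virtual_komi(180))  # prints: 35.0 17.5 0.0
```

The search would then be run with `komi + virtual_komi(...)` in place of the real komi, so that White finds some "wins" early on and Black has something to work for, while the true target is restored by the endgame.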
Re: [computer-go] Really basic question
On Tue, Jul 7, 2009 at 3:49 AM, Magnus Persson magnus.pers...@phmp.se wrote:

> Quoting Oliver Lewis ojfle...@gmail.com:
>
>> Others on this list have reported in the past that the randomness is
>> actually very important. Playouts that are very heavy, no matter how
>> clever they are, actually reduce the performance because they narrow the
>> number of games too much.
>
> I would like to disagree with this statement. I think it is difficult to
> make good heavy playouts, but it is not impossible. Failing to make a
> playout stronger through heaviness does not prove anything; it just means
> one has failed. If I could make a heavy playout of 1 dan strength and
> then run it in MC tree search, I am sure it would be stronger than the
> playout itself.

I agree with everything you said. In the Mogo papers it is claimed that having stronger playouts does not NECESSARILY lead to stronger play in general. I don't believe everything I read, but I think there may be something to this. Also, having a random element does not imply weakening the play; perhaps it's more like varying the playing style. If the Mogo team is correct, the formula seems to be that there is something in the playouts that can complement the search in some way not fully understood.

As you gradually go from random to deterministic, you cease to have Monte Carlo and you have something that is more like a best-first search. It may be that in a few years our programs will gradually transition away from MC and more toward tree search using best-first methods. Maybe that is really what we have discovered, and the MC part is distracting us?

> The problem, I think, is to find a good tradeoff between heaviness and
> speed. In my tests of Valkyria vs Fuego, Valkyria is superior when the
> number of playouts is the same. But Fuego can play 5 times more playouts
> per second on the same hardware, which results in Fuego being slightly
> stronger than Valkyria at the moment.

The search is good at some things and poor at other things.
These two aspects need to work together in a complementary way, and in my opinion that will mean more domain-specific knowledge. Your observation concerning Fuego and Valkyria indicates that there is a lot of overlap -- you can make up for search with knowledge, and for knowledge with search. I believe the huge lack of progress over the years (until recently) has been the blind failure to recognize that you cannot cover everything with knowledge -- but we must not move too far in the other direction either. Domain-specific Go knowledge is too brittle to do it all, but should be provided as hints to a search. That is exactly how we humans do it: we try to back up our positional judgement and knowledge with concrete analysis. When the knowledge, understanding and intuition is strong, less analysis is needed, and vice versa.

I have had many discussions with Larry Kaufman, who works with the Rybka chess team -- currently the strongest chess program in the world. I think some of the things we have talked about apply very much to Go. It is very interesting to me that even though he is heavily involved in the knowledge engineering aspect of the program, he seems to feel that Rybka is confined by the knowledge part. He tells me (and this applies to ALL programs, not just Rybka) that there are certain positions, ideas or themes where Rybka is a victim of its evaluation function. Adding an additional order of magnitude more time to the search is not going to change the basic misconception if the evaluation just doesn't understand the position. So you can have two equally strong chess programs, both playing stronger than any human player, choosing different moves in the same position, and one can be correct and the other wrong -- and the one that is wrong may not find the correct move even thinking 10X longer. That is a pretty depressing thought for Go programming, because if it's a problem in chess, then it can only be worse for Go.
So I suspect, from your description of the differences between Valkyria and Fuego, that there will be huge differences in which positions Valkyria plays better vs Fuego. One thing I have found in chess is that each piece of knowledge has side effects. Every new rule of thumb that you impose makes your program a bit more dogmatic. I believe the trick is figuring out how not to make it too dogmatic, while still giving it a sensible set of heuristics to guide it along.

> It just cannot be that having strong playouts leads to worse play in
> general. It is just a matter of not slowing down the code too much, and
> adding just those heavy elements that do increase playing strength.

Yes, I guess I just repeated you, but in a different way.

- Don
RE: [computer-go] Experimentation (was: Really basic question)
Experiments with equal playouts are much easier to control to get accurate results.

David

> What is the motivation for this? I cannot conceive of any good reason for
> running an experiment this way, so I would be interested in opinions. It
> seems to me that making algorithms heavier and then demonstrating that
> they are stronger with the same number of playouts misses the point --
> why would one not run an experiment under the same time conditions
> instead?
>
> Christian
[computer-go] Anchor on cgos 19x19?
Many Faces has won almost 100 games in a row to get a high rating, but without an anchor, that rating is probably not accurate. Is anyone willing to put up an anchor, or a stronger program that has lots of games and a stable rating?

David
Re: [computer-go] Experimentation (was: Really basic question)
On Tue, Jul 7, 2009 at 6:31 AM, Christian Nentwich christ...@modeltwozero.com wrote:

>> The problem, I think, is to find a good tradeoff between heaviness and
>> speed. In my tests of Valkyria vs Fuego, Valkyria is superior when the
>> number of playouts is the same. But Fuego can play 5 times more playouts
>> per second on the same hardware, which results in Fuego being slightly
>> stronger than Valkyria at the moment.
>
> Indeed -- this makes me wonder why I keep seeing papers where different
> versions of algorithms are compared with the same number of playouts,
> rather than under the same time limit.

Because this is the right testing methodology to use. The first thing you want to know is whether the core idea is good, because you will never know if you implemented it in the fastest possible way. But once you know that the idea gives you better results with the same number of playouts, you have identified something about it that is superior -- then you can go from there.

There are two aspects that you are concerned about with tests like this. The first, and most important, is the theoretical soundness of the idea or approach being used. The second is the engineering issues, which are really quite open-ended and tough to nail down accurately. Not only that, but you can kill two birds with one stone: if the theory is not sound, then there is no need to pursue the engineering aspect.

There is probably no great crime in doing it your way if your only goal is to produce a strong program, but if your test fails, you don't really know whether it failed because the idea is stupid or because your implementation is unduly slow. If you are writing a paper, you certainly do not want to claim results based solely on your particular implementation of an idea that might be bad anyway.

There is nothing wrong with presenting an engineering paper about an interesting way to implement something, but even then it would be a pretty silly paper if you did not present at least some analysis of why someone would WANT to implement this thing, i.e. it is a commonly used thing (a new sorting algorithm) or has some practical application that you can identify.

> What is the motivation for this? I cannot conceive of any good reason for
> running an experiment this way, so I would be interested in opinions. It
> seems to me that making algorithms heavier and then demonstrating that
> they are stronger with the same number of playouts misses the point --
> why would one not run an experiment under the same time conditions
> instead?

As I said, when you are writing a paper you have to show that something is theoretically sound, at least empirically. If you are writing an engineering paper you might present an interesting implementation of some idea, but it should be an idea that has first been shown to be interesting in some way. For instance, a new faster sorting algorithm is interesting. But you certainly don't want to present evidence that YOUR IMPLEMENTATION of YOUR IDEA is no good when you have not even attempted to establish whether the idea itself is viable.

- Don
Re: [computer-go] Experimentation
Don, I see where this argument comes from, but you are not taking it to its conclusion. Unless you can, in the end, show that your algorithm can outperform another one with the same time limit, you have not proved that it is an advance. That is why tournaments are played with time limits, not limits on the number of playouts. Time is an important dimension of the game of Go. Here is why: if I follow your argument to its conclusion, I can come up with an algorithm that spends five minutes per move looking through databases of moves, performing simulations, and god knows what, and show a 20% improvement over an algorithm that thinks for 5 seconds. That is an interesting intermediate result. But would you then allow me to write obviously it will be just as quick as soon we've done a bit of optimisation in the epilogue, and accept that? What if there is no way to optimise it? What if the old algorithm will always scale better with the same time limit, on faster hardware? I guess you might let the argument pass under tight conditions: the other program is a version of the same program, and architecturally similar; and no algorithms beyond linear complexity per move were added, or something of the sort. You mix proof and evidence a few times in the same paragraph; the sorting algorithm is probably not a good comparison. With a sorting algorithm, you would first prove its big O and omega complexities, maybe look at average cases too. I don't see a lot of people trying to prove that their algorithms are improvements in Go, because it is generally too complex; so evidence is all we get. Christian Don Dailey wrote: On Tue, Jul 7, 2009 at 6:31 AM, Christian Nentwich christ...@modeltwozero.com mailto:christ...@modeltwozero.com wrote: The problem I think is to find a good tradeoff between heavyness and speed. In my test with Valkyria vs Fuego, Valkyria is superior when the number of playouts are the same. 
But Fuego can play 5 times more playouts per second on the hardware that results in Fuego being slightly stronger than Valkyria at the moment. Indeed - this makes me wonder why I keep seeing papers where different versions of algorithms are compared with the same number of playouts, rather than under the same time limit. Because this is the right testing methodology to use. The first thing you want to know is if the core idea is good. This is because you will never know if you implemented it in the fastest possible way. But once you know that the idea gives you better results with the same number of playouts you have identified something about it that is superior - then you can go from there. There are two aspects that you are concerned about with tests like this. The first and most important thing is the theoretical soundness of the idea or approach being used.The second is the engineering issues, which are really quite open ended and tough to nail down accurately. Not only that, but you can kill 2 birds with one stone - if the theory is not sound, then there is no need to pursue the engineering aspect. There is probably no great crime in doing it your way if your only goal is to produce a strong program, but if your test fails you don't really know if it failed because the idea is stupid or if your implementation is unduly slow. If you are writing a paper you certainly do not want to claim results based solely on just your particular implementation of an idea that might be bad anyway. There is nothing wrong with presenting an engineering paper about an interesting way to implement something, but even then it would be a pretty silly paper if you did not present at least some analysis on why someone would WANT to implement this thing i.e. it is a commonly used thing (a new sorting algorithm) or has some practical application that you can identify. What is the motivation in this? 
I cannot conceive of any good reason for running an experiment this way, so I would be interested in opinions. It seems to me that making algorithms heavier and then demonstrating that they are stronger with the same number of playouts misses the point - why would one not run an experiment under the same time conditions instead? As I said, when you are writing a paper you have to prove that something is theoretically sound, at least empirically. If you are writing an engineering paper you might present an interesting implementation of some idea, but it should be an idea that has first been shown to be interesting in some way. For instance a new faster sorting algorithm is interesting. But you certainly don't want to present evidence that YOUR IMPLEMENTATION of YOUR IDEA is no good when you have not even attempted to establish whether the idea itself is viable. - Don Christian ___ computer-go mailing list
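To make the two testing protocols in this exchange concrete, here is a minimal sketch in Python. The Engine class and all names are toy stand-ins of my own, not from Valkyria, Fuego, or any real program; playout cost is simulated as integer milliseconds rather than measured.

```python
class Engine:
    """Toy stand-in for a playout engine; cost_ms is ms per playout."""
    def __init__(self, name, cost_ms):
        self.name = name
        self.cost_ms = cost_ms
        self.playouts = 0

    def run_playout(self):
        self.playouts += 1
        return self.cost_ms  # pretend time spent on this playout

def search_fixed_playouts(engine, n):
    """Protocol A: equal playout count - isolates the idea itself."""
    spent = 0
    for _ in range(n):
        spent += engine.run_playout()
    return engine.playouts, spent

def search_fixed_time(engine, budget_ms):
    """Protocol B: equal time budget - tests the implementation too."""
    spent = 0
    while spent + engine.cost_ms <= budget_ms:
        spent += engine.run_playout()
    return engine.playouts, spent

# A "heavy" engine 5x slower per playout, echoing the Valkyria/Fuego
# ratio mentioned above: under equal time, the light engine gets 5x
# as many playouts; under equal playouts, the handicap disappears.
light, heavy = Engine("light", 1), Engine("heavy", 5)
print(search_fixed_time(light, 1000))  # (1000, 1000)
print(search_fixed_time(heavy, 1000))  # (200, 1000)
```

The point of the sketch is only that the two protocols answer different questions: protocol A holds the sample count fixed, protocol B charges the heavier playout policy for its cost.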
Re: [computer-go] Experimentation
Magnus, along the lines of the argument I am trying to make: did you try your experiments with time limits from 30 seconds per game to five minutes per game (say), rather than playouts? Using gogui-twogtp this is relatively easy to achieve. I am obviously not associated with Fuego, but I guess it is reasonable to assume that Fuego's architecture was not designed to operate at limits like 2, 8 or 32 simulations in the same way Valkyria was. It is an interesting study in its own right for scalability purposes; but to go on to infer strength from it would seem like comparing apples and oranges. Christian Magnus Persson wrote: Quoting Darren Cook dar...@dcook.org: * The scaling behavior might be different. E.g. if Fuego and Valkyria are both run with 10 times more playouts the win rate might change. Just to dismiss an algorithm that loses at time limits that happen to suit rapid testing on today's hardware could mean we miss out on the ideal algorithm for tomorrow's hardware. (*) I just happened to have experimental data on exactly this topic. This is Valkyria vs Fuego, where I scale the number of playouts (Sims) x4 in each row.

Sims   Winrate  Err  N    WR     EloDiff
2      99.2     0.4  500  0.992  837
8      98.2     0.6  500  0.982  696
32     94.2     1.0  500  0.942  484
128    88.8     1.4  500  0.888  360
512    82.0     1.7  500  0.820  263
2048   83.2     1.7  499  0.832  278
8192   81.3     1.7  497  0.813  255
32768  75.5     3.6  139  0.755  196

The data shows clearly that the 0.3.2 version of Fuego I use probably plays really bad moves with a high frequency in the playouts. With more playouts a lot of these blunders can be avoided, I guess, and the win rate goes down from 99% towards 80%. The question here is whether it goes asymptotically towards 80%, or perhaps 50%, with more simulations. Unfortunately I cannot continue this plot because I run out of memory and it takes ages to finish the games. So the question is then: is there a fixed gain from the heavy playouts beyond 512 simulations, or does the effect of the heavy playouts get less and less important with larger tree size? 
Note also that this is not only a matter of having heavy playouts or not. There is also a difference in tree search, since Valkyria and Fuego probably search their trees differently, and it could be that Valkyria searches deep trees inefficiently. Maybe I should run a similar test against a light version of Valkyria to control for the search algorithm. -Magnus ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ -- Christian Nentwich Director, Model Two Zero Ltd. +44-(0)7747-061302 http://www.modeltwozero.com ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
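As an aside, the Err and EloDiff columns in Magnus's table can be recomputed from the win rate (WR) and game count (N) alone: Err is the binomial standard error of the win percentage, and EloDiff is the standard logistic Elo conversion. A quick check in Python (a sketch; the function names are mine):

```python
import math

def elo_diff(wr):
    """Elo advantage implied by win rate wr (0 < wr < 1)."""
    return 400 * math.log10(wr / (1 - wr))

def std_err_pct(wr, n):
    """Binomial standard error of the win percentage over n games."""
    return 100 * math.sqrt(wr * (1 - wr) / n)

# First and last rows of the table above: (Sims, WR, N)
for sims, wr, n in [(2, 0.992, 500), (32768, 0.755, 139)]:
    print(sims, round(elo_diff(wr)), round(std_err_pct(wr, n), 1))
    # -> 2 837 0.4  and  32768 196 3.6, matching Err/EloDiff above
```

This makes it easy to regenerate the last two columns if the plot is ever extended with more simulations.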
Re: [computer-go] Experimentation
On Tue, 2009-07-07 at 17:09 +0100, Christian Nentwich wrote: [..] Unless you can, in the end, show that your algorithm can outperform another one with the same time limit, you have not proved that it is an advance. That is why tournaments are played with time limits, not limits on the number of playouts. Time is an important dimension of the game of Go. This is somewhat funny: normally, it is Don who argues along these lines.. ;-) Hellwig ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Really basic question
megasnippage of Don's post: Adding an additional order of magnitude more time to the search [ in Rybka] is not going to change the basic misconception if the evaluation just doesn't understand the position. [ Don suggests that this applies even more so to Go ] Amen! We've covered variants of this theme a few times. I believe that when the playout component knows as much about life-and-death, corner plays, nakade, running fights, semeai, and seki as a dan-level player, the power of the tree search will leverage that knowledge far better than at present. Terry McIntyre terrymcint...@yahoo.com “We hang the petty thieves and appoint the great ones to public office.” -- Aesop ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] Experimentation
If your program is limited by memory size, then it makes sense to limit experiments by trial count. Running on your development computer, you might be limited by clock time. Running on competition hardware, you might not be. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Experimentation
Brian Sheppard wrote: Running on your development computer, you might be limited by clock time. Running on competition hardware, you might not be. Only if the algorithm doesn't scale. Which makes it uninteresting to begin with. -- GCP ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Experimentation
On Tue, Jul 7, 2009 at 12:09 PM, Christian Nentwich christ...@modeltwozero.com wrote: Don, I see where this argument comes from, but you are not taking it to its conclusion. Unless you can, in the end, show that your algorithm can outperform another one under the same time limit, you have not proved that it is an advance. That is why tournaments are played with time limits, not limits on the number of playouts. Time is an important dimension of the game of Go. You are the one that started this conversation with a question about why authors of papers seem to be concerned more with the theoretical properties of algorithms than the practical implementation details. I'm just trying to give you the answer. If you don't like the answer, why did you ask the question? You may not like it, but I'm telling you that in general these kinds of papers are about ideas, not implementations. A good paper will have a primary focus on one or the other. Here is why: if I follow your argument to its conclusion, I can come up with an algorithm that spends five minutes per move looking through databases of moves, performing simulations, and god knows what, and show a 20% improvement over an algorithm that thinks for 5 seconds. That is an interesting intermediate result. But would you then allow me to write "obviously it will be just as quick as soon as we've done a bit of optimisation" in the epilogue, and accept that? I never said that. You constructed a ridiculous example here that does NOT produce an interesting result. Your basic idea has to be interesting in the first place if you are going to write a paper that someone might actually want to read. If you can produce a result that gives a 20% improvement and the idea is not ridiculous, then you may have an interesting paper even if it's difficult to get the performance you need to put it in a practical program. It would certainly be a badly written paper if your primary conclusion was that we couldn't get it to run fast enough. 
The first question any reader would ask is: how much improvement is it with the same number of playouts? What would your answer be? "We didn't try that, because the only thing that matters to me is that it plays well on my computer with the given time constraints"? (That may be true, and there is nothing wrong with that attitude, but it's not one you would focus a paper on.) There are a ton of algorithms out there that do not run well on a given architecture, but may run well on some other. Or it may be highly feasible with dedicated hardware, or perhaps be particularly easy to parallelize. If the basic idea is sound, that is a much more important result than how much time YOUR implementation took. If you draw conclusions based on that, you just have a paper that has your opinion in it. You become a philosopher instead of a scientist. There is nothing wrong with presenting the idea and performance figures, but if you want others to have some respect for your paper you need to make sure you are not drawing conclusions based on YOUR implementation and hardware platform. You mix proof and evidence a few times in the same paragraph; So what? I'm not confusing the terms. You just mentioned them in the same sentence, but that was not wrong either. It's common, when you are dealing with inexact sciences like games programming, to accept strong empirical evidence as proof. It's called empirical proof. the sorting algorithm is probably not a good comparison. With a sorting algorithm, you would first prove its big-O and big-omega complexities, maybe look at average cases too. I don't see a lot of people trying to prove that their algorithms are improvements in Go, because it is generally too complex; so evidence is all we get. Yes, we are not dealing with an exact science here. That is why you need to at least show that in a given test program the algorithm has merit, by isolating the theory from the implementation details as much as possible. 
It's certainly more interesting to me to know that the basic idea works in other programs, so that I can speculate whether it is likely to work in mine. I don't want to also have to speculate on whether you implemented it efficiently or not, whether there is some advantage to some platforms over others, etc. Now I'm not saying it's bad to include performance numbers - they might be interesting to look at - but they are second-order information; they are not the key thing. Whether the paper's author could make it work from a performance standpoint is not as important as whether the basic idea has some feasibility. - Don Christian Don Dailey wrote: On Tue, Jul 7, 2009 at 6:31 AM, Christian Nentwich christ...@modeltwozero.com mailto:christ...@modeltwozero.com wrote: The problem I think is to find a good tradeoff between heaviness and speed. In my test with Valkyria vs Fuego, Valkyria is superior when the number of playouts are the
[computer-go] Experimentation
Running on your development computer, you might be limited by clock time. Running on competition hardware, you might not be. Only if the algorithm doesn't scale. Perhaps there is a misunderstanding? A scalable algorithm is limited by the hardware it runs on. The limit may differ when you change hardware. My dev machine is slow and old. Most of my testing is limited by clock time. If I switch to a modern computer then the limits would change quite a bit. So I am satisfied if a change breaks even while making the program slower. Which makes it uninteresting to begin with. Well, I thought it was interesting. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Experimentation
I have not used time exactly to match up Valkyria and Fuego. But also with fixed numbers of simulations one can match them up closely in processor time. And when I do that, Valkyria wins something like 41-45% of the time. I never intentionally designed Valkyria to work well with small numbers of simulations as in these tests, but in principle you have to do that no matter how many simulations you run per move, because you will always have few simulations in the leaves of the tree. And if the leaves are evaluated strongly, then the nodes nearer to the root also benefit. Magnus Quoting Christian Nentwich christ...@modeltwozero.com: Magnus, along the lines of the argument I am trying to make: did you try your experiments with time limits from 30 seconds per game to five minutes per game (say), rather than playouts? Using gogui-twogtp this is relatively easy to achieve. I am obviously not associated with Fuego, but I guess it is reasonable to assume that Fuego's architecture was not designed to operate at limits like 2, 8 or 32 simulations in the same way Valkyria was. It is an interesting study in its own right for scalability purposes; but to go on to infer strength from it would seem like comparing apples and oranges. Christian Magnus Persson wrote: Quoting Darren Cook dar...@dcook.org: * The scaling behavior might be different. E.g. if Fuego and Valkyria are both run with 10 times more playouts the win rate might change. Just to dismiss an algorithm that loses at time limits that happen to suit rapid testing on today's hardware could mean we miss out on the ideal algorithm for tomorrow's hardware. (*) I just happened to have experimental data on exactly this topic. This is Valkyria vs Fuego, where I scale the number of playouts (Sims) x4 in each row. 
Sims   Winrate  Err  N    WR     EloDiff
2      99.2     0.4  500  0.992  837
8      98.2     0.6  500  0.982  696
32     94.2     1.0  500  0.942  484
128    88.8     1.4  500  0.888  360
512    82.0     1.7  500  0.820  263
2048   83.2     1.7  499  0.832  278
8192   81.3     1.7  497  0.813  255
32768  75.5     3.6  139  0.755  196

The data shows clearly that the 0.3.2 version of Fuego I use probably plays really bad moves with a high frequency in the playouts. With more playouts a lot of these blunders can be avoided, I guess, and the win rate goes down from 99% towards 80%. The question here is whether it goes asymptotically towards 80%, or perhaps 50%, with more simulations. Unfortunately I cannot continue this plot because I run out of memory and it takes ages to finish the games. So the question is then: is there a fixed gain from the heavy playouts beyond 512 simulations, or does the effect of the heavy playouts get less and less important with larger tree size? Note also that this is not only a matter of having heavy playouts or not. There is also a difference in tree search, since Valkyria and Fuego probably search their trees differently, and it could be that Valkyria searches deep trees inefficiently. Maybe I should run a similar test against a light version of Valkyria to control for the search algorithm. -Magnus ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ -- Christian Nentwich Director, Model Two Zero Ltd. +44-(0)7747-061302 http://www.modeltwozero.com ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ -- Magnus Persson Berlin, Germany ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Experimentation
Digesting Don's remarks, it seems that a reasonable method would start with equal numbers of simulations. It could then proceed to try various optimizations, and compare algorithms which use equal time. It makes perfect sense for the ultimate goal to be better performance using the same time or less, but it might be easier to progress stepwise - first tweak the top-level design, then tweak the performance - in order to separate out the different inputs to the experiment. Terry McIntyre terrymcint...@yahoo.com “We hang the petty thieves and appoint the great ones to public office.” -- Aesop ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Experimentation
something i've played with a little bit: only look at algorithms with the following property: * every so often, they update an in-memory score for each board position. you can then run a timing loop around this and, when time is up, just play the highest-scoring valid move. you can use a signal handler to dump the list at any time. s. 2009/7/7 terry mcintyre terrymcint...@yahoo.com: Digesting Don's remarks, it seems that a reasonable method would start with equal numbers of simulations. It could then proceed to try various optimizations, and compare algorithms which use equal time. It makes perfect sense for the ultimate goal to be better performance using the same time or less, but it might be easier to progress stepwise - first tweak the top-level design, then tweak the performance - in order to separate out the different inputs to the experiment. Terry McIntyre terrymcint...@yahoo.com “We hang the petty thieves and appoint the great ones to public office.” -- Aesop ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
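The pattern sketched above - a continually refined in-memory score table, a timing loop, and a signal handler that can dump the list at any moment - might look like this minimal Python sketch (the scoring step is a random stand-in for a real playout, and all names are illustrative):

```python
import random
import signal

SIZE = 9
scores = [[0.0] * SIZE for _ in range(SIZE)]  # in-memory score per point

def refine_scores():
    """One cheap update step; a real engine would run a playout here."""
    r, c = random.randrange(SIZE), random.randrange(SIZE)
    scores[r][c] += random.random()

def best_move():
    """Pick the highest-scoring point; legality checks omitted."""
    return max(((r, c) for r in range(SIZE) for c in range(SIZE)),
               key=lambda rc: scores[rc[0]][rc[1]])

def dump(signum, frame):
    """Signal handler: print the current score table on demand."""
    for row in scores:
        print(" ".join(f"{s:5.2f}" for s in row))

# POSIX only: `kill -USR1 <pid>` dumps the list at any time.
if hasattr(signal, "SIGUSR1"):
    signal.signal(signal.SIGUSR1, dump)

# Timing loop stand-in: refine for a fixed budget, then play the best.
for _ in range(10000):
    refine_scores()
print(best_move())
```

The appeal of this shape is that it is an anytime algorithm: interrupting it at any point still yields a legal answer of the best quality found so far.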
Re: [computer-go] Scoring - step function or sigmoid function?
MC bots play handicap games poorly. ... So my suggestion would be to pretend that there is a komi that almost compensates the handicap. The same idea has been suggested before [1]. The counter-comments were mostly "hard to get right" or "tried that briefly and it didn't work well". Darren [1]: I think starting here, and then the dozen or so followup messages. http://computer-go.org/pipermail/computer-go/2008-August/015859.html -- Darren Cook, Software Researcher/Developer http://dcook.org/gobet/ (Shodan Go Bet - who will win?) http://dcook.org/mlsn/ (Multilingual open source semantic network) http://dcook.org/work/ (About me and my work) http://dcook.org/blogs.html (My blogs and articles) ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
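One illustrative way to sketch the pretend-komi idea: start from an artificial komi that nearly compensates the handicap and fade it back to the real komi as the game progresses. This is entirely my own sketch - the ~10 points per handicap stone and the linear fade schedule are made-up numbers, and as the thread cited above notes, getting this right in practice is reported to be hard.

```python
def effective_komi(real_komi, handicap, move_number, fade_moves=100):
    """Artificial komi that nearly compensates the handicap early on
    and fades linearly back to the real komi by move `fade_moves`."""
    # Assumption: each handicap stone is worth roughly 10 points.
    bonus = 10.0 * handicap
    remaining = max(0.0, 1.0 - move_number / fade_moves)
    return real_komi + bonus * remaining

# White giving 4 stones: the bot pretends komi is 40.5 at move 0,
# and is back to the real 0.5 by move 100.
print(effective_komi(0.5, 4, 0))    # 40.5
print(effective_komi(0.5, 4, 100))  # 0.5
```

The intent is that early in the game the MC evaluation sees a close contest rather than a hopeless position, so the win-rate signal stays informative.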