On Mon, Dec 15, 2008 at 5:47 PM, Don Dailey <[email protected]> wrote:
> Is Jrefgo the pure version that does not use tricks like the futures
> map? If you use things like that, all bets are off - I can't be sure
> this is not negatively scalable.
I don't know, although I was under the impression that I had downloaded
the "pure" version. I found a reference to the source here on the list,
and downloaded and compiled that. When I get back home, how would I
quickly determine which is the case?

> You cannot draw any reasonable conclusions by stopping after 10 moves
> and letting gnugo judge the game either. Why didn't you play complete
> games?

I think that complete games would have to be at least one of:

1. Against a similarly weak opponent. This casts doubt on whether the
results apply against other opponents.

2. Unlikely to be won by an AMAF program. This makes their differences
hard to measure.

3. Played with handicap stones. The granularity seems too coarse on
9x9. Nevertheless, it might be worthwhile to try this.

4. Played with a komi that is very far from an even game. In 9x9, this
would mean that the better player must control the entire board in
order to win. At that point, komi is no longer useful as a means of
providing a handicap.

Originally (about two years ago), I ran studies such as this in order
to tune parameters that affected the playouts, and that I thought could
probably have different optimum values at different points in the game.
Playing against an opponent that is generally stronger makes it more
likely that the improvements I find apply to opponents in general,
rather than simply tuning my program against one particular opponent.
Playing against a close relative of the same program (e.g., pitting 5K
against 100K directly) gives misleading results, in my experience.
Often, both programs will be blind to the same lines of play, allowing
genuinely bad moves to go unpunished.

On Mon, Dec 15, 2008 at 6:10 PM, Mark Boon <[email protected]> wrote:
> It would have been much more persuasive if you had simply run a 5K
> playout bot against a 100K bot and see which wins more. It shouldn't
> take much more than a day to gather a significant number of games.
I may do that, although personally I would be far more cautious about
drawing conclusions from those matches than from ones played against a
strong reference opponent. But I guess other people feel differently
about this. Anyway, the results would still be interesting to me no
matter which way they went, even if they failed to convince me of
anything.

> I did in fact put up a 100K+ ref-bot on CGOS for a little while, and
> it ended up with a rating slightly (possibly insignificantly) higher
> than the 2K ref-bot. Maybe I didn't put it there long enough,
> certainly not for thousands of games. But it didn't look anywhere near
> to supporting your findings.

That doesn't particularly disagree with my conclusions either. For
example, I would guess that the best overall performance is somewhere
around 5K-10K playouts, so a program with a setting in that range would
obtain a higher rating than either the 2K bot or your "100K+" bot. I
could easily be wrong about that, though.

> I say 100K+ because I didn't set it to a specific number, just run as
> many as it could within time allowed. Generally it would reach well
> over 100K per move, probably more like 250K-500K. That should only
> make things worse according to your hypothesis.

Yes, this is what sparked the conversation originally. When you
reported that a while ago, my reaction was, "Of course that won't work
very well; you're running way too many simulations." I was actually a
bit surprised that no one else thought this was as bad as I think it
is.

> So although I think the result of your experiment is very curious, I
> think it might be a bit hasty to draw your conclusion.

Yes, it very well may be. As I mentioned, I ran a number of similar
experiments a couple of years ago, for which I unfortunately lost the
results. My recollection is that they typically indicated the same
thing, across a number of variations on my own program.
Performance would improve up to a point, then degrade as the program's
behavior became essentially deterministic. But I may have made mistakes
in those tests, or I could be misremembering.

On Tue, Dec 16, 2008 at 12:20 PM, Don Dailey <[email protected]> wrote:
> A monte carlo bot like refbot, in most positions, is going to converge
> on some specific move. I think in the starting position it "wants" to
> play e5, and it is going to play e5 with an infinite number of
> playouts, whether that is the best move or not. There will be many
> situations where the move it "wants" to play is not the best, and so
> you can surmise that it's more likely to play a good move with fewer
> playouts.

Incidentally, when I get home, I'll post the line of play that follows
the moves with the highest (asymptotic) Monte Carlo values, according
to jrefgo. I have about 18 moves calculated with high accuracy.

> However, that by no means implies that it will play better with fewer
> playouts. It may play the worst move on the board too - the chances of
> that happening increase as the number of playouts drops. So this cuts
> both ways.

Yes, it certainly does cut both ways. But I think it should not be too
hard to convince yourself that if the worst move on the board happens
to have a low Monte Carlo value, the program will detect this much more
quickly than it will detect the comparatively slight differences
between the top few moves. The move with the best Monte Carlo value is,
in my understanding, a reasonable-looking move that nevertheless often
demonstrates clear "misconceptions" of that methodology. I'm sure that
it's rare for it to be the worst move on the board, but I also think it
is generally uncommon for it to be the best move.

Here's another experiment that I might try: Pick a few individual
positions from some games played by high-level players.
Then, calculate very accurate Monte Carlo values for the moves from
that position, as well as win rates for those moves, based on top
programs. From that information, I should be able to quickly run
simulations to find an overall expected win rate that would result from
playing the current best move of a Monte Carlo program, followed by the
moves played by the reference program. As the accuracy of the Monte
Carlo program increases, do more of these approach their asymptote from
above or below? (Is it possible that this depends only on the relative
win rates of the top two moves?)

> Another way to look at this is that "playing better" may not correlate
> well with increasing your chances of playing the best move - it's more
> about increasing your chances of playing a "good" move, or in my own
> personal opinion it's about avoiding bad moves.

I think that last part is exactly right. And I believe that current
Monte Carlo methods only really manage to avoid the very worst of the
bad moves, regardless of how many playouts they run.

Weston

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
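P.S. To make Don's convergence point concrete, here is a small Python
sketch of a flat Monte Carlo move chooser. The win rates below are
invented for illustration (they are not jrefgo's actual values); the
only point is that as the playout budget grows, the chooser settles on
whichever move has the highest asymptotic value, whether or not that
move is objectively best:

```python
import random

def choose_move(true_winrates, playouts_per_move, rng):
    """Flat Monte Carlo: estimate each move's value by sampling
    independent Bernoulli playouts, then pick the best estimate."""
    best_move, best_est = None, -1.0
    for move, p in enumerate(true_winrates):
        wins = sum(rng.random() < p for _ in range(playouts_per_move))
        est = wins / playouts_per_move
        if est > best_est:
            best_move, best_est = move, est
    return best_move

def top_move_frequency(true_winrates, playouts_per_move, trials, seed=0):
    """How often the asymptotically preferred move (the index of the
    maximum true value) is actually chosen at a given playout budget."""
    rng = random.Random(seed)
    target = true_winrates.index(max(true_winrates))
    picks = [choose_move(true_winrates, playouts_per_move, rng)
             for _ in range(trials)]
    return picks.count(target) / trials

# Hypothetical "true" Monte Carlo values for five candidate moves.
# Move 0 has the highest asymptotic value, so an infinite number of
# playouts always selects it -- whether or not it is the best move.
winrates = [0.55, 0.50, 0.48, 0.40, 0.20]

for n in (10, 100, 2000):
    f = top_move_frequency(winrates, n, trials=100)
    print(f"{n:>5} playouts/move: preferred move chosen {f:.0%} of trials")
```

At 10 playouts per move the selection is quite noisy; by a few thousand
it is essentially deterministic, which is the behavior I described as
"degrading" once it sets in.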
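P.P.S. The experiment I describe above can be sketched in a few lines.
Every number here is an invented placeholder (the real inputs would be
the accurate Monte Carlo values and program-based win rates mentioned
above); the sketch only shows how the expected win rate, as a function
of the playout budget, can approach its asymptote from above when the
move preferred by the Monte Carlo values is not the move with the best
actual win rate:

```python
import random

# Hypothetical per-move data for one position: each candidate move has
# a "true" Monte Carlo value (what the playouts converge to) and an
# actual win rate (how often playing it really wins). Invented numbers.
MOVES = [
    # (monte_carlo_value, actual_win_rate)
    (0.54, 0.45),   # the MC-preferred move is not the objectively best one
    (0.50, 0.60),
    (0.48, 0.50),
    (0.40, 0.35),
]

def expected_win_rate(moves, playouts, trials, seed=0):
    """Average win rate that results from letting a flat Monte Carlo
    chooser with a finite playout budget pick the move."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        best_idx, best_est = 0, -1.0
        for i, (mc_value, _) in enumerate(moves):
            wins = sum(rng.random() < mc_value for _ in range(playouts))
            est = wins / playouts
            if est > best_est:
                best_idx, best_est = i, est
        total += moves[best_idx][1]
    return total / trials

# Infinite playouts always pick the top Monte Carlo value (move 0), so
# the asymptotic expected win rate is that move's actual win rate.
asymptote = MOVES[0][1]
for n in (20, 200, 2000):
    ewr = expected_win_rate(MOVES, n, trials=200)
    print(f"{n:>5} playouts: expected win rate {ewr:.3f} "
          f"(asymptote {asymptote})")
```

With these made-up numbers the curve approaches the asymptote from
above, because low playout counts sometimes stumble onto the move with
the better actual win rate.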
