Re: [computer-go] Why not forums?
Dmitry Kamenetsky wrote:

I have been reading this list for nearly a year now and it is very discouraging to receive so much criticism for my first post. Yahoo groups were merely an example to show how easy it is to get a forum started. I also agree that Yahoo appends too much spam to its forums, and I am sure there are many much better free forums out there. The forums that I really like are the TopCoder forums (http://forums.topcoder.com/). I like them for these reasons:

* One can post in various sections. The sections we could have here: Monte Carlo Go, Search in Go, Learning in Go, CGOS, KGS, Human Go.
* Threads are easy to find and each thread has a post count. The post count is a good indication of how interesting that thread is. For example, if there are many threads that I haven't had time to read, then I will first read the ones with the highest post counts.
* Different viewing options: flat (newest first), threaded or tree. These can be useful for various purposes.
* Each post has a '+' and '-' associated with it. If you agree with the post you simply press the '+' button and the plus count goes up; similarly, if you disagree you press the '-' button. This serves two purposes: you don't have to post extra messages just to show your agreement or disagreement, which saves space and your time; it is also a great way to hold votes - those in favour press '+', those against press '-'.
* Each post is associated with a date and time. It is also easy to resurrect threads that are years old.
* If you had a typo or a mistake in your post, you can easily edit it. This is extremely useful.
* It is not necessary, but it is always nice to see who you are talking to.
* There is a very powerful message search engine, which incorporates section type, date range and member name.
* You can watch threads that are of interest to you.

I hope I have given some good reasons for having a forum.
Since so many people here are against losing the list, why not the following: we keep the list, but give members the option of using a forum? This way we can all be happy :)

The reason you are receiving so much criticism is a fundamental misunderstanding on your part. Having a mailing list does not exclude any of the things you mention, but rather grants exactly the freedom you point out in your last sentence. Most of the features you mention I already have in my client, and the ones I do not have, I lack mainly because I do not want them. If you want more features, the solution is not to change the back end; it is to change your personal front end. Get a better client. Don't enforce your preferences on everyone. Furthermore, it is a common attitude that web forums have a much lower quality of discussion than mailing lists and newsgroups. Some of this probably comes from the availability and ease of setting up web forums compared with mailing lists or, worse, a newsgroup, so an average forum will have less competent admins, moderators and, in the end, users. But some of it can definitely be ascribed to some of the points you mention in favor of forums; the ability to edit posts and the ability to simply agree or disagree without adding any content, for instance, have an effect on what kind of debate is held, and I do not think it is an overall positive one. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: Re[4]: [computer-go] Why not forums?
On Feb 7, 2007, at 5:20, Dmitry Kamenetsky wrote:

I hope I have given some good reasons for having a forum. Since so many people here are against losing the list, why not the following: we keep the list, but give members the option of using a forum? This way we can all be happy :)

Like I said, we could get both if we have a two-way gateway. This way every post from the mailing list ends up in the forum and every forum post ends up on the mailing list. If everyone (especially our dear list moderator) is OK with it, I'd try to set it up. Urban - -- http://bettong.net | Urban's Blog http://computer-go.bettong.net | Planet Computer-Go
Re: [computer-go] Monte Carlo (MC) vs Quasi-Monte Carlo (QMC)
As I have spent a lot of time trying to guess what could be done with Quasi-Monte-Carlo or other standard Monte-Carlo improvements in computer Go, I write below my (humble and pessimistic :-) ) opinion about that.

Let's formalize Monte-Carlo. Consider P a probability distribution. Consider that you want to estimate Pf, the expectation of f under distribution P. You can get values of f at some points; note hP ($\hat P$ is a usual notation) the empirical distribution, i.e.

Pf = unknown expectation of f for distribution P
hPf = (\sum_{i=1}^n f(x_i))/n

In standard naive Monte-Carlo, the x_i are independent, identically distributed according to distribution P. One can sample without replacement in the discrete case (no longer independent then, but almost :-) ).

Biasing Monte-Carlo (importance sampling) consists in replacing P by some P', which samples more often in more important areas. Standard tools use a P' with density proportional to the product of (i) the density of P and (ii) the local standard deviation of f (which is unknown and must be approximated). This does not introduce a bias (on average, hPf remains close to Pf), as we change the weighting of points; hPf is now:

hPf = (\sum_{i=1}^n P(x_i)f(x_i)/P'(x_i))/n

and the x_i are i.i.d. according to P' in the simplest case (but other, better schemes are possible, e.g. sampling in strata).

Let's consider how this can be applied to Monte-Carlo Go. Designing a good random player is not importance sampling as above. There is no reweighting. We change what is estimated; we do not want to know the probability of winning for a uniform random player; we just want a better player, so that the probabilities estimated by this player lead to a better MC-based player. Improving the random player is very important, but it is not importance sampling. Why not add importance sampling? Because it requires a good estimate of the important areas, in the sense of, e.g., areas with high standard deviation.
This is, in my humble opinion, nearly as hard as designing a good Go evaluation function. Of course, I might be missing something; but I think that if you can quickly find areas with significantly higher or lower probabilities of winning, then you have improved your random player, you don't want to reweight anything, and you are no longer interested in the probability of winning for the basic random player - you want to move to the new random player. This is only improving the random player; you change the distribution because you guess that it will be a better random player (and you hope that your MC-based player will be better).

Quasi-Monte-Carlo methods consist in replacing hP by a sum over a deterministic or moderately randomized sample, so that the convergence of hPf to Pf is faster than the O(1/\sqrt{n}) usual in MC methods. This is somewhat orthogonal to importance sampling (though the two techniques are, by the way, difficult to merge into one). In standard QMC the distribution is not modified, but the points are not chosen by independent identically distributed sampling (which is also not the case with most importance sampling - but in QMC the focus is on reducing the random part). (It is now often considered that QMC should be only partially derandomized, as randomized QMC is better than strictly deterministic QMC, especially in large dimension; but never mind, let's consider a QMC method independently of how it is designed; we can just pick a QMC sequence from the state of the art without (at first) trying to understand how it works.)

QMC is very efficient in continuous domains, and sometimes in specific discrete domains as well, but I don't see how to plug QMC into tree search, or even just into goban sampling for introducing QMC at a particular node.
It looks appealing, but you need something like a distance between moves that is related to the probability of winning, and even if you have such a distance, designing the QMC sequence is far from straightforward.

However, a simple form of derandomization, so-called jittered sampling, can perhaps be applied: instead of sampling the goban uniformly at random, split (partition) the goban into k zones, and pick the i-th point in zone 1+(i%k) (% means modulo). (This is for a partition into zones of equal size, but it can be adapted to unequal areas.) Perhaps it is a good idea, but I do not have enough Go knowledge to design such zones.

Another possible tool for improving Monte-Carlo comparisons is introducing correlations between the explorations of various nodes. This possibility, consisting in correlating the choices in nodes B and C, should improve the stability of the comparison between B and C, and therefore improve the choice performed at node A if A has sons B and C. However, this introduces a strong variance increase at node A if A is the father of B and C (much
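The reweighting formula hPf = (\sum_{i=1}^n P(x_i)f(x_i)/P'(x_i))/n from the post can be sketched on a toy discrete domain. Everything below (the distributions, the payoff f, the sample size) is illustrative, not taken from the post:

```python
import random

def importance_sampling_estimate(f, p, q, sample_q, n, rng):
    # hPf = (1/n) * sum_i p(x_i) * f(x_i) / q(x_i), with x_i sampled from q (the biased P')
    total = 0.0
    for _ in range(n):
        x = sample_q(rng)
        total += p(x) * f(x) / q(x)
    return total / n

# Toy setup: P is uniform on {0, ..., 9}; f is a rare, high-variance payoff;
# P' (called q here) oversamples the region where f is nonzero.
values = list(range(10))
q_weights = [1 if x < 8 else 6 for x in values]
z = float(sum(q_weights))
p = lambda x: 0.1
q = lambda x: q_weights[x] / z
f = lambda x: 100.0 if x >= 8 else 0.0

rng = random.Random(42)
sample_q = lambda r: r.choices(values, weights=q_weights, k=1)[0]
est = importance_sampling_estimate(f, p, q, sample_q, 20000, rng)
# True value is E_P[f] = (2/10) * 100 = 20; est should land close to it.
```

The p/q weights keep the estimate unbiased even though the sampling distribution changed; dropping them would instead estimate the expectation under P', which is exactly the "better random player" situation the post distinguishes from importance sampling.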
Re: [computer-go] Why not forums?
Dmitry Kamenetsky [EMAIL PROTECTED] writes:

The forums that I really like are the TopCoder forums (http://forums.topcoder.com/). I like them for these reasons:

And really all of these reasons are features of good e-mail clients (except the +/- voting). So with the right client you get all of this out of the box, plus much more flexibility (mail is more standardized than web forums, so you have many more options: converters, gateways, IMAP, clients, ...). So this is all about flexibility - nearly all web forum software is really a jail with no easy alternative access, but with mailing lists you have plenty of alternatives (including your own web server with your forum software, into which you can inject all these mails). -- Until the next mail..., Stefan.
Re: [computer-go] Why not forums?
Hi,

So this is all about flexibility - nearly all web forum software is really a jail with no easy alternative access, but with mailing lists you have plenty of alternatives (including your own web server with your forum software, into which you can inject all these mails).

I agree. Setting up a separate forum would only harm the community - so if people want a (quasi-)forum, it should just be an interface to the normal list. Cheers, Benjamin
Re: [computer-go] Monte Carlo (MC) vs Quasi-Monte Carlo (QMC)
I could see a case where it is possible to reduce the variance of a single variable even in the 0-1 case. Let us say that Black has about a 5% chance of winning. If we could (exactly) double Black's chances of winning by changing the nonuniform sampling somehow (say, enforce bad moves by White), we could sample from that and divide the estimated winning chance for Black by 2 in the end. This would of course be very difficult in practice. (A binary random variable gives more information when the chances are closer to 50-50.) This could be useful in practice in handicap games, for instance by enforcing a Black pass with 1% chance every move. Sampling would be distorted towards a White win, which is realistic since White is assumed to be the stronger player anyway.

I don't understand this line of reasoning.

Let me try again using the handicap example. Let's say an MC player is given a huge handicap. In the simulations, it is winning all of its games, so there is no information helping to select the next move. In information-theoretic terms, each playout gives one bit of information if the chances are 50:50, but if the chances are unbalanced, the information content is lower (see http://en.wikipedia.org/wiki/Binary_entropy_function ). In the extreme case, there is no information at all. Now, let us use distorted MC where we force Black to pass with a few percent chance every move. White begins to win some of the simulations, so MC is useful again.

How is this related to reducing the variance? Let us say that a Black move leads to a White win with probability p very close to zero. Let us also assume that distorting the simulations doubles White's chances to 2p. Using normal MC, the variance of our estimate of p using N samples is p*(1-p)/N; using distorted MC, the variance of the estimate of 2p is 2p*(1-2p)/N. Estimating p from the estimate of 2p divides that variance by 4, giving p*(1-2p)/(2N), which is less than p*(1-p)/N.
In practice, we cannot know that distorting would increase the chances by exactly doubling them, but if we use the same distortion to estimate all moves, we can still compare them. Tapani
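Tapani's variance argument can be checked numerically. The sketch below (all names, seeds and parameters are illustrative) treats each playout as a Bernoulli draw, doubles the win probability in the "distorted" simulations, and halves the resulting estimate to undo the distortion:

```python
import random

def estimate_p(p_win, n, rng, distortion=1.0):
    # One MC estimate of a win probability from n playouts. With
    # distortion > 1, wins occur distortion times as often, and the
    # estimate is divided back by distortion to stay unbiased.
    wins = sum(1 for _ in range(n) if rng.random() < distortion * p_win)
    return (wins / n) / distortion

def var(xs):
    # Plain (population) variance of a list of numbers.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

p, n, trials = 0.05, 400, 2000
rng = random.Random(0)
plain = [estimate_p(p, n, rng) for _ in range(trials)]
doubled = [estimate_p(p, n, rng, distortion=2.0) for _ in range(trials)]
# Theory: var(plain) ~ p(1-p)/n, var(doubled) ~ p(1-2p)/(2n) -- about half.
```

Both estimators center on p = 0.05, but the spread of the distorted one is roughly halved, matching the p*(1-2p)/(2N) versus p*(1-p)/N comparison in the post.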
[computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))
On Wed, Feb 07, 2007 at 12:06:40PM +0200, Tapani Raiko wrote:

Let me try again using the handicap example. Let's say an MC player is given a huge handicap. In the simulations, it is winning all of its games, so there is no information helping to select the next move.

This situation happens in normal games too, once one player is so far ahead that it wins almost no matter what. It leads to really stupid-looking endgames, where live groups are allowed to die, and dead ones are allowed to be rescued. All this could be avoided by a simple rule: instead of using +1 and -1 as the results, use +1000 and -1000, and add the final score to this. The purpose of the large constant (1000) is to make sure that the program prefers any win to any loss (so that large_win + small_loss < small_win + small_win). One could even add another term to the result, favouring games that end early (for the winner) or are postponed (for the loser), in the hope of allowing the opponent more chances to make mistakes. As far as I can see, this ought to fit straight into any MC or UCT program. It may not improve the winning chances, but it sure should make the program's play look more reasonable. Just my humble idea. Feel free to shoot it down (with serious arguments), and/or use it wherever you like. I would like to hear if this makes any practical difference, if anyone tries it. - Heikki -- Heikki Levanto In Murphy We Turst heikki (at) lsd (dot) dk
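The proposed scoring rule is easy to sketch. The helper name and the example margins below are illustrative, not from the post; the point is that the large constant makes any win outrank any loss, with the score margin only differentiating among wins (or among losses):

```python
def playout_result(black_won, black_score_margin, win_bonus=1000):
    # Result from Black's point of view: +/-win_bonus for a win/loss,
    # plus the final score margin added on top, as Heikki proposes.
    return (win_bonus if black_won else -win_bonus) + black_score_margin

# The inequality from the post: large_win + small_loss < small_win + small_win
large_win = playout_result(True, 60.5)
small_loss = playout_result(False, -0.5)
small_win = playout_result(True, 0.5)
```

Averaged over playouts, a move that sometimes wins big but sometimes loses still scores below a move that reliably wins small, which is exactly the preference the constant is meant to enforce.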
Re: [computer-go] MC Go Effectiveness
Matt Gokey wrote:

But what are some of the reasons MC is not even better?

1- Since MC engines don't deal with tactics directly, they are not likely to play tactical sequences well for low-liberty strings, securing eye space, cutting and connecting, ko fights, or ladders, etc.
2- Also, because most of the playouts are usually nonsense, they may have trouble dealing with meaningful nuances, because the positions that would lead to these distinctions just don't arise with enough statistical frequency in the playouts to affect the result. Yet when very selective moves are used in the playouts, too many possibilities can be missed.
3- Finally, on 19x19 anyway, the size of the board and game tree probably limits the practical effectiveness of the sampling and move ordering. I don't try to address this last point any further in this message.

Very good analysis, and I would like to contribute a 4th reason: as Luke Gustafson pointed out, we have to view the simulation as a _stochastic process_. We want to determine the conditional probability of a win given that a particular move is made. And that depends on the _length of the simulation_. Dramatically! This is a reason against the scalability of global-search MC/UCT. If the simulation is 500 moves long (Chinese rules with recaptures, etc.), the variance observed at an early move blurs out everything. Consider a simple stochastic process: count a dollar each time you correctly predict a p=1/2 coin thrown n=500 times. The expected total is (obviously) 250, but the variance of that total is n·p·(1-p) = 125, proportional to n. Jacques.
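Jacques's coin-flip process can be checked directly. The seeded simulation below (purely illustrative) confirms that both the expected count and its variance come out as the formulas n·p and n·p·(1-p) predict:

```python
import random

n, p = 500, 0.5
expected_mean = n * p            # 250 correct predictions on average
expected_var = n * p * (1 - p)   # 125, growing linearly with game length n

# Empirical check: repeat the 500-flip experiment many times.
rng = random.Random(1)
totals = [sum(rng.random() < p for _ in range(n)) for _ in range(3000)]
mean = sum(totals) / len(totals)
var = sum((t - mean) ** 2 for t in totals) / len(totals)
# mean should come out near 250 and var near 125
```

The spread of roughly sqrt(125) ≈ 11 dollars around 250 is the "blur" Jacques describes: the noise from a long playout is large compared with the signal from any single early move.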
Re: Re[4]: [computer-go] Why not forums?
I have been reading this list for nearly a year now and it is very discouraging to receive so much criticism for my first post.

it means that you've touched upon an issue that is important to the readers of the list. this, in and of itself, means that you are getting valuable information from the list readers about what's important to them. keep in mind that any criticism you received wasn't criticism of you, just of the idea of a forum. it could be that many of the list members are a bunch of barnacles who prefer to receive most of their information through a single interface. i'd certainly count myself in the barnacle crowd -- i think that the information in this list doesn't need much 'jazzing up', and that an arbitrary mail client gives me excellent capacity to participate in the list. low-ascii serves as a great carrier of information for me, and keeps the distraction level down. without the easy ability to communicate with other list members using shockwave flash or the urgent need to create 5.1 HD movies of computers battling it out mano-a-mano, i find myself more rapidly focused on the topic of finding good go playing algorithms. with that having been said, let me try to address some of the ways in which email lists can be finessed to serve many of the needs that you bring up. it could be that with a decent mail or news client, all of your worries will be like yesterday's mashed potatoes.

* One can post in various sections. The sections we can have here could be: Monte Carlo Go, Search in Go, Learning in Go, CGOS, KGS, Human Go.

Subject: lines work for this purpose, and additionally allow for the easy creation of new sections.

* Threads are easy to find and each thread has a post count. The post count is a good indication of how interesting that thread is. For example if there are many threads that I haven't had time to read, then I will first read the ones with the most post count.
You might want to use a mail client that allows you to thread messages by From: and by Subject:. Many usenet news readers do this automagically, but autosorting your inbound mail by sender and subsorting by subject line will get you 90% of the way there.

* Different viewing options: flat (newest first), threaded or tree. These can be useful for various purposes.

Again, most usenet news clients and many mail readers will allow you to do this -- you can treat the threads as a way to diminish the importance of old topics, or expand them and subsort in order to find exactly what you're looking for, quickly.

* Each post has a '+' and '-' associated with it. This means that if you agree with the post then you simply press the '+' button and the plus count goes up, similarly if you disagree you press the '-' button. This serves two purposes: you don't have to post extra posts just to show your agreement/disagreement, which saves space and your time; also this is a great way to make votes - those in favour press '+', those against press '-'.

this is indeed an advantage.

* Each post is associated with a date and time. Also it is easy to resurrect threads that are years old.

the From field of all mail messages contains this information, and many mail clients will allow you to see this information, sort your messages using it, selectively subsort messages inside threads by date, etc., etc. the Date: field is another one that many mail and news clients will let you thread, sort and subsort with. the world is your oyster. an IMAP mail client is one way to keep track of very old messages in a convenient way, and usually has threading and sorting capabilities built in as a necessary function for most users.

* If you had a typo or a mistake in your post, you can easily edit it. This is extremely useful.

here i might disagree.
i can see the advantage of being able to fix something that you later realize was wrong, but being required to take the extra effort to get it right the first time can lead to more thoughtful posting. it doesn't mandate it by any means, but it certainly seems to encourage it. it's a kind of self-moderation imposed by the interface.

* It is not necessary, but it is always nice to see who you are talking to.

a pasty, potato-chip-smeared face greets you with lips stretched into a wide, badly socialized grin and eyes that stare blankly into dimly-lit monitor space as unbrushed, poorly-aligned teeth wedged uncomfortably against a crimson gumline provide the main visual contrast to the field of empty soda bottles and cans that litter the backdrop of a room desperately in need of a good vacuuming. that's the horrorshow you're looking forward to, my friend.

* There is a very powerful message searching engine, which incorporates: section type, date range and member name.

a surprising number of mail and usenet news clients have this same functionality.

* You can watch threads that are of interest to you.

i'm sorry to
Re: [computer-go] MC Go Effectiveness
Jacques Basaldúa wrote:

Very good analysis and I would like to contribute a 4th reason: As Luke Gustafson pointed out, we have to contemplate the simulation as a _stochastic process_. We want to determine the conditional probability of a win given a particular move is made. And that depends on the _length of the simulation_. Dramatically! This is a reason against scalability of global search MC/UCT. If the simulation is 500 moves long (Chinese rules with recaptures, etc.) the observed variance at an early move blurs out everything. Just a simple stochastic process: Count a dollar each time you correctly predict a p=1/2 coin thrown n=500 times. The expected average is (obviously) 250, but the expected variance of that measure is n·p·(1-p) = 125 proportional to n.

Good point. This leads to another thought I have been wondering about: I question whether using more time to search more simulations in the opening is the best approach. For the opening, selecting reasonable, robust moves that tend to lead to more favorable options is probably a good objective. The lengths of the simulations are perhaps too long to expect anything better. Later, towards the pre-middle and middle game, it is very critical to play so that the position's tactical potential is exploited to secure connections and eye space, etc. It would seem to me that focusing the highest concentration of time and number of simulations on this part of the game might be most advantageous. It would be interesting for someone with a decent MC player to run an experiment like this, with one version concentrating the highest number of simulations in the opening and one concentrating on the middle game, but otherwise equal, and see which version wins more often.
Re: [computer-go] MC Go Effectiveness
Two quick comments. Quoting Matt Gokey [EMAIL PROTECTED]:

Jacques Basaldúa wrote: Very good analysis and I would like to contribute a 4th reason: against scalability of global search MC/UCT. If the simulation is 500 moves long (Chinese rules with recaptures, etc.) the observed variance at an early move blurs out everything.

This is not a problem for the scalability of MC/UCT. It just means that a program that is 5 kyu with 10 minutes on 9x9 might be 35 kyu with 10 minutes on 19x19. Yet for both 9x9 and 19x19 it holds that a bug-free implementation of MC/UCT will asymptotically reach perfect play as thinking time goes to infinity. True, going from 9x9 to 19x19 is frustrating. But this is because 19x19 is more complex than 9x9. An MC/UCT program is still very much better than random play, because a random player needs more than 100 free handicap stones to have a chance on 19x19.

Good point. This leads to another thought that I have been wondering about. That is I question whether using more time to search more simulations in the opening is the best approach. For the opening, selecting reasonable robust moves that tend to lead to more favorable options is probably a good objective. The lengths of the simulation are perhaps too long to expect anything better. Later towards the pre-middle to middle game it is very critical to play such that the position's tactical potential is exploited to secure connections and eye space, etc. It would seem to me that focusing the highest concentration of time and number of simulations during this part of the game might be most advantageous. It would be interesting for someone with a decent MC player to do an experiment like this with one version concentrating highest number of simulations in the opening and one concentrating in the middle game, but otherwise equal and see which version wins more often.

My experience with Valkyria is that most time must be allocated to the early moves.
If you make a mistake on 9x9, it is very hard to come back. MC/UCT programs such as MoGo are very good at defending a won position. Therefore it is important to get a won position as early as possible. The longer it thinks, the better it plays. In the opening it is always critical to find the best move - but later on, games are often already either won or lost, so saving time for those positions is not so important. Valkyria saves time in the opening by using an opening library, but as soon as it leaves the library it spends a lot of time on the first move. Moves it spends a lot of time on are also stored in the library. And later on I might correct, by hand, moves that were played in lost games. I am actually rarely satisfied with the opening moves of Valkyria. It needs more time for those moves... -Magnus
Re: [computer-go] Why not forums?
I've had a computer go forum running for a while, but it has low traffic. http://www.olympuschess.com/phpBB2 -josh

On 2/4/07, Dmitry Kamenetsky [EMAIL PROTECTED] wrote:

Hello everyone, Why can't we use proper forums instead of this outdated list? Forums are easier to keep track of and to search for messages in. As a start we could use Yahoo groups. What do you think? Cheers, Dmitry
Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))
-Original Message- From: [EMAIL PROTECTED] To: computer-go@computer-go.org Sent: Wed, 7 Feb 2007 5:34 AM Subject: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))

On Wed, Feb 07, 2007 at 12:06:40PM +0200, Tapani Raiko wrote:

Let me try again using the handicap example. Let's say an MC player is given a huge handicap. In the simulations, it is winning all of its games, so there is no information helping to select the next move.

This situation happens in normal games too, once one player is so far ahead that it wins almost no matter what. It leads to really stupid-looking endgames, where live groups are allowed to die, and dead ones are allowed to be rescued. All this could be avoided by a simple rule: instead of using +1 and -1 as the results, use +1000 and -1000, and add the final score to this. The purpose of the large constant (1000) is to make sure that the program prefers any win to any loss (so that large_win + small_loss < small_win + small_win). One could even add another term to the result, favouring games that end early (for the winner) or are postponed (for the loser), in the hope of allowing the opponent more chances to make mistakes. As far as I can see, this ought to fit straight into any MC or UCT program. It may not improve the winning chances, but it sure should make the program's play look more reasonable. Just my humble idea. Feel free to shoot it down (with serious arguments), and/or use it wherever you like. I would like to hear if this makes any practical difference, if anyone tries it.

Intuitively, it seems like this should work. You only give the winning margin a small weight, or only use it to break ties, or only apply it after the game is already decided. I've tried many variations, as have others, including your exact algorithm above. It can make some moves look a little prettier, but it always causes problems and I have to take it out.
I have my theories as to why that is, but for brevity's sake, in my experience, giving any consideration to winning margin is detrimental. After the game is decided, there are more elegant ways to bring it to a close. I came to this conclusion reluctantly. - Dave Hillis
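The variations Dave describes (a small weight on the margin, tie-breaking, and so on) can all be expressed as a single knob on the playout value. This sketch is illustrative; the helper name and weight are not his actual code:

```python
def node_value(win, margin, margin_weight=0.0):
    # margin_weight = 0.0 reproduces the plain win-rate target;
    # a small positive weight gives the "prettier but weaker"
    # variants the post reports as detrimental in practice.
    return (1.0 if win else 0.0) + margin_weight * margin

plain = node_value(True, 30.5)                        # pure win/loss signal
weighted = node_value(True, 30.5, margin_weight=0.01) # small margin term added
```

With the weight at zero, a narrow win and a huge win look identical to the search, which is exactly the property Dave's experience suggests should be preserved.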
Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))
I should have mentioned that I have only tested on 9x9. For larger boards, I don't know. - Dave Hillis

Intuitively, it seems like this should work. You only give the winning margin a small weight, or only use it to break ties, or only apply it after the game is already decided. I've tried many variations, as have others, including your exact algorithm above. It can make some moves look a little prettier, but it always causes problems and I have to take it out. I have my theories as to why that is, but for brevity's sake, in my experience, giving any consideration to winning margin is detrimental. After the game is decided, there are more elegant ways to bring it to a close. I came to this conclusion reluctantly.
Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))
If I recall correctly, someone spoke of constraining the opening moves to the 3rd, 4th, and 5th lines in the absence of nearby stones, or something to that effect. What was the impact of this experiment? I notice the recent discussion of the need for a lot of thinking time to find good opening moves; would this thinking time be reduced with a better-than-random selection of opening moves during the playouts? If fuseki and joseki databases were used to bias the playouts, would the speed and quality of opening play improve?

One concern about using expert knowledge of this sort: what happens when one's opponent plays out of book? Human players often find this frustrating - we sense that moves which deviate from joseki are wrong, but finding the correct refutation under time pressure is not always easy. To address this concern: at least one MC player uses an opening book; would it be profitable to automatically analyze past losses, and devote a few hundred thousand playouts to looking for better replies to plays which were favored by opponents in past games? This would be the analogue of advice often given to human players: study your own lost games; look for improvements.

Unfortunately, an opening book does not generalize well; learning a better move for position Xi won't help with position Xj, which differs by even one stone from Xi. What sort of move generators would cope with the vast number of similar positions? A single stone (a ladder-breaker or a key point which makes or denies a second eye, for instance) can make a large difference in the score, but one hopes that MC playouts would discover the negative consequences of moves which are almost but not quite right.
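The "lines 3 through 5" constraint mentioned at the top of this message is easy to state precisely. The helpers below are an illustration under assumed conventions (0-indexed coordinates, "line" counted from the nearest edge), not the experiment's actual code:

```python
def board_line(col, row, size=19):
    # Line number of a point, counted from the nearest edge; the
    # outermost line is line 1, so the 4-4 star point is on line 4.
    return min(col, row, size - 1 - col, size - 1 - row) + 1

def opening_candidate(col, row, size=19):
    # The described constraint: restrict early playout moves to the
    # 3rd through 5th lines (absent nearby stones, per the post).
    return 3 <= board_line(col, row, size) <= 5
```

Under this rule the 4-4 star point qualifies while the 1-1 corner point and the center point do not; a playout policy would presumably fall back to unconstrained sampling once nearby stones appear, as the post stipulates.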
Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))
What sort of sampling was used for the playouts? For this variable ( incorporating some information about the score vs only the win-loss variable ), does it make a difference whether playouts are totally random or incorporate varying degrees of similitude to good play?

From: [EMAIL PROTECTED] [EMAIL PROTECTED] (snip)
Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))
terry mcintyre wrote: If I recall correctly, someone spoke of constraining the opening moves to the 3rd, 4th, and 5th lines in the absence of nearby stones, or something to that effect. What was the impact of this experiment?

For what it's worth, I tried a number of experiments along these lines and none of them produced any improvement in play whatsoever. I'm still baffled by this. Sylvain says, both in his paper and in a message posted here, that it seems one must try to constrain moves based on sequences of moves rather than individual moves in isolation. All rather mysterious to me. -Richard
Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC))
The tests that involved factoring in the margin of victory while the game was still in play used pure random playouts (or relatively close to it). Later, I did some tests with esthetics as a goal in itself, and for these I used what I call a heavy playout. I doubt that the playout type makes a difference, but I don't know that for sure. If there were significant curiosity on this list about a specific configuration, I could probably run a quick test. - Dave Hillis

-Original Message- From: [EMAIL PROTECTED] To: computer-go@computer-go.org Sent: Wed, 7 Feb 2007 12:42 PM Subject: Re: [computer-go] MC approach (was: Monte Carlo (MC) vs Quasi-Monte Carlo (QMC)) (snip)
Re: [computer-go] MC approach
Don Dailey wrote: On Wed, 2007-02-07 at 11:34 +0100, Heikki Levanto wrote: All this could be avoided by a simple rule: Instead of using +1 and -1 as the results, use +1000 and -1000, and add the final score to this. Heikki, I've tried ideas such as this in the past and it's quite frustrating - everything that tries to take territory scoring into account weakens the program. If you just need to see prettier moves, I think it is good enough to prioritize the moves using some other algorithm at the root of the tree. If you just cover the case where a program is easily winning or losing, it will play nicer but not stronger.

Don, do you have any theories or information about why this is the case? I would think either way the algorithm should always prefer higher average win probabilities, but faced with alternatives where the win probabilities are the same or nearly the same while the average winning margins are higher for one alternative, wouldn't it be better to take the path with the better margin? I mean, it may in fact be wrong about the win/loss classifications, so choosing the better scores would seem to make sense within reason, as long as it's not greedy.
Re: [computer-go] MC Go Effectiveness
Magnus Persson wrote: Quoting Matt Gokey [EMAIL PROTECTED]: (snip) Good point. This leads to another thought that I have been wondering about: I question whether using more time to search more simulations in the opening is the best approach. For the opening, selecting reasonably robust moves that tend to lead to more favorable options is probably a good objective. The lengths of the simulations are perhaps too long to expect anything better. Later, towards the pre-middle to middle game, it is very critical to play such that the position's tactical potential is exploited - to secure connections and eye space, etc. It would seem to me that focusing the highest concentration of time and number of simulations on this part of the game might be most advantageous. It would be interesting for someone with a decent MC player to do an experiment like this, with one version concentrating the highest number of simulations in the opening and one concentrating in the middle game, but otherwise equal, and see which version wins more often.

My experience with Valkyria is that most time must be allocated towards early moves. If you make a mistake on 9x9 it is very hard to come back. MC/UCT programs such as MoGo are very good at defending a won position. Therefore it is important to get a won position as early as possible. The longer it thinks, the better it plays. In the opening it is always critical to find the best move - but later on, games are often either won or lost, so saving time for those positions is not so important.

I agree with this last statement for the early end-game / end-game phase. For the pre-middle to middle game I'm not sure. This is the critical part of the game where you need to solidify the winning positions. As long as reasonable opening moves that provide robust options are made and it can play into the middle game stronger than the opponent, winning positions should result. I'm not saying not to spend time up front, just less than in the middle.
On 9x9, the stage where this becomes tactically critical is probably much earlier than on 19x19. Even with 9x9, I predict an MC/UCT program that spent X simulations per move for the first 8 moves and 2X for the next 8 would tend to beat the same player doing the reverse, where X is some reasonably large number of simulations needed to open decently. I could be wrong, but it would be an interesting experiment anyway. However, IMO 9x9 is too small to see the real complications come out.

Valkyria saves time in the opening by using an opening library, but as soon as it leaves the library it spends a lot of time on the first move. Moves it spends a lot of time on are also stored in the library. And later on I might correct, by hand, moves that were played in lost games. I am actually rarely satisfied with the opening moves of Valkyria. It needs more time for those moves...

I was wondering if anyone was combining an opening library with MC/UCT by running a large number of the simulations and storing the results. This seems like a pretty natural extension to save simulation time in the opening.

How strong a player are you? You are probably unfairly evaluating Valkyria based on your strong expert play/perspective. I'm rather amazed that MC simulations find good moves at all, given that most of the playouts are nonsense games. That is why I say MC/UCT is finding reasonably robust moves with more favorable options (strategic play), not necessarily great/best moves. Because of mostly meaningless playouts, it misses nuances and tactics deeper into the game that would show otherwise. It seems the deeper the simulations, the worse this effect would be. -Matt
Re: [computer-go] MC approach
If it only did one playout you would be right, but imagine the following cases:

case 1: White wins by 0.5 x 100, Black wins by 0.5 x 100
case 2: White wins by 100.5 x 91, Black wins by 0.5 x 109

The method that takes the score into account would prefer the second case even though it has a lower winning percentage - which may reflect the fact that White is making an overplay, for instance. Obviously this is just one example, but there are many cases like this, and I would suspect that overplays tend to be privileged, in a sense, with this kind of algorithm. - Nick

On 2/7/07, Matt Gokey [EMAIL PROTECTED] wrote: (snip)
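Nick's two cases are easy to check numerically. The sketch below is an illustrative helper (not from any program discussed in this thread): it scores a batch of playouts from White's point of view with a configurable win bonus, so Heikki's +1000/-1000 rule (win_weight = 1000) and a pure average-score evaluator (win_weight = 0) fall out of the same formula.

```c
#include <assert.h>
#include <math.h>

/* Average evaluation, from White's point of view, of a batch of playouts
 * in which `wins` playouts were each won by `win_margin` points and
 * `losses` were each lost by `loss_margin` points.  Each playout scores
 * +/-win_weight for the result plus the signed margin.  Hypothetical
 * helper for illustration only. */
double eval_case(int wins, double win_margin,
                 int losses, double loss_margin, double win_weight) {
    double sum = wins   * ( win_weight + win_margin)
               + losses * (-win_weight - loss_margin);
    return sum / (wins + losses);
}
```

With win_weight = 0, case 1 evaluates to 0.0 and case 2 to about +45.5, so a score-based evaluator prefers the overplay; with win_weight = 1000 the ordering flips (0.0 vs. about -44.5), since a large enough win bonus makes the win/loss signal dominate, matching Don's advice.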
Re: [computer-go] MC Go Effectiveness
On Wed, 2007-02-07 at 14:10 -0600, Matt Gokey wrote: I was wondering if anyone was combining an opening library with MC/UCT by running a large number of the simulations and storing the results. This seems like a pretty natural extension to save simulation time in the opening.

This is not particularly effective if you mean to just save the tree. But here is what I do about the opening, which seems to help some:

1. Note the very first move out of book that the program is asked to search. Write the move sequence to a file.

2. Have a program running off-line that processes the positions first encountered out of book. The program searches each position 8 times, using several minutes per search. This is to provide variety. The moves returned from the searches are placed into the opening book.

3. The moves are played in the same proportion they were chosen. For instance, if e5 is chosen 7 times and d5 1 time, the program will play e5 7 out of 8 times.

The book searching code of course does all the board rotations and reflections - and when a move is chosen, a random orientation is chosen if it's appropriate. For instance, after the opening move e5, if d5 is a choice then it's equally likely to play d5, f5, e4 or e6.

I'm building the book off-line, and new positions are presented faster than new book responses can be generated. So I'm taking Lazarus off-line once in a while so the book building procedure can catch up. I don't know how much this really helps - it seems not to help much until the book gets fairly large. There are 2 ways it helps, of course: the opening moves are deep-searched, so presumably they might be higher quality, and the time saved can be spent to improve the quality of the later moves. Lazarus spends a lot more time on early moves, so this makes it possible to save a lot of time on the first 2 or 3 moves and spend it on later moves. I'm in a building phase now - the book is still quite small and has 91 positions at the moment.
Most of the moves it's generating are 3 or 4 ply (moves) into the game, but some are as deep as 10 ply into the game - probably positions from opponents who have fixed playing algorithms or books. At some point I intend to do some kind of analysis on the win/loss percentage of certain moves. One possibility is to mini-max the book lines.

This way of making a book has severe disadvantages. I think it works great on 9x9, but if the program is improved, the book is suddenly not very relevant and you have to start over. Since I am not a strong player, I have no way of fixing up the book. I can't recognize when it chooses a weak move or when a stronger move is available, even if it doesn't like it.

I have a hash function that creates a 64-bit canonical hash of the position. The same hash is produced after a rotation, for example. A book entry is a record with 3 fields: a priority value (how many times the deep search selected this position), the canonical hash of the parent position, and the canonical hash of the position AFTER the move is played. This makes collisions very unlikely. The canonical hash takes into consideration simple ko, so if a ko is possible it will hash differently than the same position where the ko is illegal. - Don
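The canonical hash Don describes can be sketched compactly: hash the position under all eight rotations/reflections of the board and keep the smallest value, so symmetric copies of a position hash identically. Everything below is illustrative (the key function and board encoding are invented, and the simple-ko term Don mentions is omitted); it is not Lazarus's actual code.

```c
#include <stdint.h>

#define N 9  /* board size */

/* Deterministic stand-in for a table of random 64-bit Zobrist keys,
 * one per (colour, point) pair (splitmix64-style mixer). */
static uint64_t zkey(int colour, int point) {
    uint64_t z = (uint64_t)(colour * N * N + point) + 0x9e3779b97f4a7c15ULL;
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

/* The 8 symmetries of the square board (the dihedral group D4). */
static int transform(int sym, int x, int y) {
    if (sym & 4) { int t = x; x = y; y = t; }  /* transpose   */
    if (sym & 1) x = N - 1 - x;                /* mirror in x */
    if (sym & 2) y = N - 1 - y;                /* mirror in y */
    return y * N + x;
}

/* Canonical hash: XOR the keys of all stones under each of the 8
 * symmetries and keep the smallest of the 8 results.
 * board[y*N+x] is 0 = empty, 1 = black, 2 = white. */
uint64_t canonical_hash(const int board[N * N]) {
    uint64_t best = ~0ULL;
    for (int sym = 0; sym < 8; sym++) {
        uint64_t h = 0;
        for (int y = 0; y < N; y++)
            for (int x = 0; x < N; x++) {
                int c = board[y * N + x];
                if (c) h ^= zkey(c - 1, transform(sym, x, y));
            }
        if (h < best) best = h;
    }
    return best;
}
```

As Crazy Ivan-style incremental hashing is discussed later in the thread, note this version rehashes from scratch, which is only acceptable for infrequent book lookups.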
Re: [computer-go] MC Go Effectiveness
On Wed, 2007-02-07 at 16:17 -0500, Don Dailey wrote: I have a hash function that creates a 64-bit canonical hash of the position. (snip)

Here is some more detail to make this clearer:

typedef struct {
    int count;   // number of times this position/response seen (priority)
    u64 key;     // canonical position key
    u64 resp;    // resulting canonical position
} book_entry;

These book_entry records are stored in a file, and I keep them sorted. So the procedure to see if there is a book move is to binary search the file on the parent position key, collect all of these records together (making sure there is a legal move which leads to the canonical response key), and then choose one of the available moves in proportion to the priority values (the count field). - Don
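Don's lookup procedure - binary search on the sorted key, then a count-proportional choice - can be sketched as follows. This is an illustrative reconstruction, not Lazarus's code: the legality check against the response hash is omitted, rand() stands in for a real PRNG, and counts are assumed positive.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t u64;

typedef struct {
    int count;   /* times the deep search chose this response (priority) */
    u64 key;     /* canonical hash of the parent position                */
    u64 resp;    /* canonical hash of the resulting position             */
} book_entry;

/* Pick a response for `pos` from a book of n entries sorted on `key`,
 * choosing among the matching entries in proportion to their counts.
 * Returns the index of the chosen entry, or -1 if not in book. */
int book_choose(const book_entry *book, int n, u64 pos) {
    int lo = 0, hi = n;                 /* binary search: first match */
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (book[mid].key < pos) lo = mid + 1; else hi = mid;
    }
    if (lo == n || book[lo].key != pos) return -1;

    int total = 0, last;
    for (last = lo; last < n && book[last].key == pos; last++)
        total += book[last].count;      /* sum priorities of all matches */

    int r = rand() % total;             /* weighted choice: counts 7 and */
    for (int i = lo; i < last; i++) {   /* 1 give the 7/8 vs 1/8 split   */
        r -= book[i].count;             /* from Don's e5/d5 example      */
        if (r < 0) return i;
    }
    return last - 1;                    /* not reached */
}
```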
Re: [computer-go] MC approach
That drives me nuts! Minimax search would eliminate bad lines of play whenever a refutation is found. A good opponent would not play badly, and the quantity of possible bad moves should not affect the evaluation of good moves - but that seems to be what MC does, averaging out all moves regardless of whether they are known to be good, have been refuted, or are of indeterminate status. What am I missing? Terry McIntyre

- Original Message From: Nick Apperson [EMAIL PROTECTED] (snip)
Re: [computer-go] Paper presents results on proximity heuristic
You should never play suicide, whether multiple or single stone, in the play-out portion of the search - ESPECIALLY when it's not legal anyway in the rule-set you are using. Although suicide can occasionally be the best move in some rule-sets, I think it weakens your program to include it, even in the UCT tree portion of an MC search. If the rule-set allows suicide, I believe it's best to just let your program accept suicide moves from the opponent. - Don

On Wed, 2007-02-07 at 13:23 -0800, Christoph Birk wrote: On Wed, 7 Feb 2007, Chris Fant wrote: Your paper does not mention suicide. Prohibiting single-stone suicide during playout gave me a nice increase in playout rate and strength. Does your program allow multiple-stone suicide during playout? myCtest does NOT allow it; and neither does GenericMC_ by Don. Christoph
Re: [computer-go] MC approach
On Wed, 2007-02-07 at 14:08 -0600, Matt Gokey wrote: Don, do you have any theories or information about why this is the case? (snip)

Not really. In truth the only thing that matters is to increase your winning percentage - not your score. There seems to be no point in tampering with this. - Don
Re: [computer-go] Monte-Carlo Go Misnomer?
Hello, David Doshay wrote: On 3 Feb 2007, at 2:51 AM, Sylvain Gelly wrote: the speed of the best simulation policy in MoGo is 0.6 * the speed of the uniform random one. I think that this is very good. You give up less than a factor of 2 from uniform random, and you probably get better than a factor of 2 in the % of relevant moves.

Is there any known (by theory or tests) function of how much an increase in the strength of the simulation policy increases the strength of the MC/UCT program as a whole? Imagine you're using an almost-random playout that has a strength, if playing alone, of say 10 ELO (just a number, I don't know how strong it really is). Using some thousand MC simulations with that 10 ELO playout and some other clever tricks like UCT, you get a program that plays at 1500 ELO. Now what if you change the playout to one that has a strength, if playing alone, of 20 ELO, 50 ELO or 100 ELO? How would that affect the strength of your overall program, if you run the same number of MC simulations? Any known numbers on that? [I understand that a non-random playout would decrease the speed, so it couldn't run as many simulations as before in the same time, but let's imagine we get it for free.] eph
Re: [computer-go] Paper presents results on proximity heuristic
Okay, thanks for the feedback. I mentioned that I was allowing multi-stone suicide a couple of days ago, but no one said anything. It seems a little more complicated to check for than single-stone suicide when only tracking pseudo-liberties. But I will get it in there and see what kind of improvement happens and how much it affects the speed.

On 2/7/07, Don Dailey [EMAIL PROTECTED] wrote: (snip)
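For reference, here is a self-contained single-stone suicide test over a plain 2-D board. It is a sketch, not code from any program mentioned here: pseudo-liberty engines can filter the same case more cheaply, and detecting multi-stone suicide needs the same flood fill applied to the merged friendly chain, which is the extra complication Chris refers to.

```c
#include <string.h>

#define N 9
enum { EMPTY = 0, BLACK = 1, WHITE = 2 };

static const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};

/* Flood-fill the chain containing (x,y); return 1 if it has a liberty. */
static int has_liberty(const int b[N][N], int x, int y, int seen[N][N]) {
    int colour = b[y][x];
    seen[y][x] = 1;
    for (int d = 0; d < 4; d++) {
        int nx = x + dx[d], ny = y + dy[d];
        if (nx < 0 || nx >= N || ny < 0 || ny >= N || seen[ny][nx]) continue;
        if (b[ny][nx] == EMPTY) return 1;
        if (b[ny][nx] == colour && has_liberty(b, nx, ny, seen)) return 1;
    }
    return 0;
}

/* A play at (x,y) by `colour` is a single-stone suicide when the stone
 * would sit alone (no friendly neighbour), have no empty neighbour, and
 * capture nothing (every adjacent enemy chain keeps a liberty elsewhere).
 * The stone is placed tentatively and the board restored before return. */
int is_single_stone_suicide(int b[N][N], int x, int y, int colour) {
    if (b[y][x] != EMPTY) return 0;
    b[y][x] = colour;                  /* tentative placement */
    int suicide = 1;
    for (int d = 0; d < 4 && suicide; d++) {
        int nx = x + dx[d], ny = y + dy[d];
        if (nx < 0 || nx >= N || ny < 0 || ny >= N) continue;
        if (b[ny][nx] == EMPTY || b[ny][nx] == colour) {
            suicide = 0;               /* a liberty, or a friendly chain */
        } else {
            int seen[N][N];
            memset(seen, 0, sizeof seen);
            if (!has_liberty(b, nx, ny, seen))
                suicide = 0;           /* enemy chain captured instead */
        }
    }
    b[y][x] = EMPTY;                   /* undo */
    return suicide;
}
```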
Re: [computer-go] MC approach
It's a *weighted* average of all moves. UCT tree search doesn't explore bad moves as often as good ones, so they don't contribute as much to the estimation of the worth of a node. - Dave Hillis

-Original Message- From: [EMAIL PROTECTED] To: computer-go@computer-go.org Sent: Wed, 7 Feb 2007 4:31 PM Subject: Re: [computer-go] MC approach (snip)
Re: [computer-go] MC approach
The only thing I can suggest which perhaps hasn't been tried is to only consider the score in the evaluation if NONE of the playouts resulted in a loss.

On 2/7/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: (snip)
Re: [computer-go] Paper presents results on proximity heuristic
By the way, the paper was rejected on first submission, largely because we were just testing Orego against itself. We're now testing Orego against GNU Go and have a revised version: https://webdisk.lclark.edu/xythoswfs/webui/_xy-2352013_1-t_Gct7yJ5s%22 (Markus, could you change the link and title? This was DPSV07.)

Peter Drake Assistant Professor of Computer Science Lewis & Clark College http://www.lclark.edu/~drake/

On Feb 7, 2007, at 12:36 PM, Chris Fant wrote: In this paper, you say that you limit the number of moves to BoardArea*2 during the playouts. For me, this barely increases the playout rate and slightly reduces the strength (perhaps not statistically significant). Your paper does not mention suicide. Prohibiting single-stone suicide during playout gave me a nice increase in playout rate and strength.

On 11/28/06, Peter Drake [EMAIL PROTECTED] wrote: Here it is: https://webdisk.lclark.edu/xythoswfs/webui/_xy-2115826_1-t_OX34gnaB Peter Drake Assistant Professor of Computer Science Lewis & Clark College http://www.lclark.edu/~drake/
Re: [computer-go] MC Go Effectiveness
Quoting Matt Gokey [EMAIL PROTECTED]: Magnus Persson wrote: Quoting Matt Gokey [EMAIL PROTECTED]: I was wondering if anyone was combining an opening library with MC/UCT by running a large number of the simulations and storing the results. This seems like a pretty natural extension to save simulation time in the opening.

I actually first tried to save the entire search tree for every position. Then it would load the position and search deeper every time it was encountered. The problems were simply disk storage, and the fact that there are always positions where the program itself cannot find strong moves by searching in a reasonable time.

How strong a player are you? You are probably unfairly evaluating Valkyria based on your strong expert play/perspective. (snip)

I am European 2 dan. I agree that MC/UCT finds robust moves; it is more important to avoid mistakes in go than to find the best move. There is often little difference between alternative best-looking moves, but a blunder can be huge. I still disagree a little, though, on "misses nuances and tactics". I think there is always a signal from the simulations that MC can pick up to some degree. The signals that get lost or cannot be distinguished are discovered by searching deeper with UCT. -Magnus
Re: [computer-go] Monte-Carlo Go Misnomer?
Unfortunately, it's just not that simple, because it depends far more on _how_ the playout is improved, rather than, say, the ELO numbers that measure that improvement. For example, programs exist that are good (in part) because they entirely disregard some lines of play. They may be correct to disregard these lines in almost every case, which generally makes the playout program stronger. However, for the few cases where this heuristic pruning is not correct, the calling program will suffer greatly, because these lines of play become completely invisible to the random playouts, no matter how many playouts are performed. As an extreme example, consider programs that play completely deterministic strategies. These are obviously useless as random players, yet it is in principle possible to construct ones that play arbitrarily well. Weston

On 2/7/07, Ephrim Khong [EMAIL PROTECTED] wrote: Is there any known (by theory or tests) function of how much an increase in the strength of the simulation policy increases the strength of the MC/UCT program as a whole?
Re: [computer-go] Effective Go Library v0.101
On 2/7/07, steve uurtamo [EMAIL PROTECTED] wrote: And of course don't forget about this no_variable_in_for thing... i'll have to read up on what you're describing.

The following pseudocode block

for (int i = 0; i < 10; ++i) {
    ... code ...
}
// i's lifetime ends after the for loop does

is valid in just about any version of C++ and in the 1999 ISO C standard (also known as C99), but it is not valid in most older C standards. (Some older versions of gcc would accept this code but assign the wrong lifetime (according to the standard) to variable i. If you want to test this code with gcc, then use the -std=c99 flag, which, as of quite recently at least, was not enabled by default.) There are at least a couple of other C++-isms - offhand, the // single-line comment form and variable declarations in the middle of code blocks come to mind - that are also valid in C99 but invalid in at least some of the older C standards.

I'm not trying to run off on a language standards tangent! I just feared that we might be headed towards one anyway, and I wanted to make sure the information in it was not 8 years out of date. Sorry, now back to go (I hope)...
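For completeness, the usual C89-compatible form of the same loop hoists the declaration to the top of the enclosing block (hypothetical function name, shown only to illustrate the scope difference):

```c
/* C89-compatible form: the control variable is declared at the top of
 * the enclosing block rather than in the for-statement header, so
 * pre-C99 compilers accept it.  Note i stays in scope after the loop. */
int sum_below(int n) {
    int i;
    int total = 0;
    for (i = 0; i < n; ++i)
        total += i;
    return total;
}
```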
Re: [computer-go] MC approach
On Wed, Feb 07, 2007 at 04:42:01PM -0500, Don Dailey wrote: In truth the only thing that matters is to increase your winning percentage - not your score. There seems to be no point in tampering with this. I guess I must accept the wisdom of those who have tried these things. Still, it hurts my intuition that it could be better for a program to choose a line where it seems to win by 2 points, when another line seems to end in a 100 point win. What if the opponent can improve his play from what I expected, and gain an extra 3 points somewhere? Maybe all this shows that we have not (yet?) understood all the complications of the MC evaluation, and that more research is needed? -H -- Heikki Levanto In Murphy We Turst heikki (at) lsd (dot) dk ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC Go Effectiveness
On Wed, 2007-02-07 at 16:28 -0500, Don Dailey wrote: On Wed, 2007-02-07 at 16:17 -0500, Don Dailey wrote: I have a hash function that creates a 64-bit canonical hash of the position. The same hash is produced after a rotation, for Most people do incremental hashing, which is cheaper even if 8 (16) symmetric copies are to be maintained. YMMV. example. A book entry is a record with 3 fields: a priority value (how many times the deep search selected this position), the canonical hash of the parent position, and the canonical hash of the position AFTER the move is played. This makes collisions very unlikely. The canonical hash takes simple ko into consideration, so if a ko is possible it will hash differently from the same position where the ko is illegal. Here is some more detail to make this clearer: Since you seem to be replying to yourself, I'll add some comments. typedef struct typedef is useless here, IMO. (But this is a matter of style, I agree.) { int count; // number of times this position/response seen (priority) Warning: alignment will cause this struct to be 3 * sizeof(u64), wasting 32 bits. Warning 2: if the count is never negative, unsigned would be more appropriate. u64 key; // canonical position key u64 resp; // resulting canonical position Warning: you are wasting (64 - ~9) bits here, since the response stems from a set of at most 361+1 choices. (This would be different if you intend to search backwards in the tree/dag, with 'resp' as search key.) } book_entry; That could reduce your struct to: struct booklet { u64 key; u32 count; u16 move; /* you *could* store the de-canonicalisation info here: */ u16 spare; }; , which will take only 2*64 bits. These book_entry records are stored in a file and I keep them sorted. So the procedure to see if there is a book move is Sorted on key?
(Keep them sorted == resort periodically.) In that case, some kind of hashing would seem more appropriate, IMHO. to binary search the file on the parent position key, collect all of these records together (making sure there is a legal move which leads to the canonical response key) and then You are not clear here. Is there only a one-move step between key and resp? Why not store the move in the first place (instead of the resulting hash)? choose one of the available moves in proportion to the priority values (the count field). HTH, AvK ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC approach
On Thu, 2007-02-08 at 00:46 +0100, Heikki Levanto wrote: On Wed, Feb 07, 2007 at 04:42:01PM -0500, Don Dailey wrote: In truth the only thing that matters is to increase your winning percentage - not your score. There seems to be no point in tampering with this. I guess I must accept the wisdom of those who have tried these things. Still, it hurts my intuition that it could be better for a program to choose a line where it seems to win by 2 points, when another line seems to end in a 100 point win. What if the opponent can improve his play from what I expected, and gain an extra 3 points somewhere? Maybe all this shows that we have not (yet?) understood all the complications of the MC evaluation, and that more research is needed? -H I agree, more research is needed. My intuition is also the same as yours - one would think 100 point win is better. It's not that it's better or it's a better move objectively, but it should be better against a fallible opponent who could easily get distracted by the little battles and forget the real point of the game. At least it's a way to fight when you are losing. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC Go Effectiveness
On 2/7/07, Unknown [EMAIL PROTECTED] wrote: to binary search the file on the parent position key, collect all of these records together (making sure there is a legal move which leads to the canonical response key) and then You are not clear here. Is there only a one-move step between key and resp? Why not store the move in the first place (instead of the resulting hash)? 128 bits of hash vs 64 bits. From his first post: This makes collision very unlikely. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC Go Effectiveness
On Thu, 2007-02-08 at 01:00 +0100, Unknown wrote: On Wed, 2007-02-07 at 16:28 -0500, Don Dailey wrote: On Wed, 2007-02-07 at 16:17 -0500, Don Dailey wrote: I have a hash function that creates a 64-bit canonical hash of the position. The same hash is produced after a rotation for Most people do incremental hashing, which is cheaper even if 8 (16) symmetric copies are to be maintained. YMMV. I'm an expert in Zobrist incremental hashing and have been doing it since the early years of computer chess. However, I have some other considerations here: No hashing is still faster than incremental hashing - which is why I don't do any hashing during the play-out phase. Since I am only hashing for the purpose of building or looking up a book position, this is the best approach. If I wanted to get clever, I could pass a flag to the move maker to tell it whether to do hashing or not and then do it incrementally, but for all this trouble I would not get a speedup I would ever notice, since I only hash to find a book move. example. A book entry is a record with 3 fields: a priority value (how many times the deep search selected this position), the canonical hash of the parent position, and the canonical hash of the position AFTER the move is played. This makes collisions very unlikely. The canonical hash takes simple ko into consideration, so if a ko is possible it will hash differently from the same position where the ko is illegal. Here is some more detail to make this clearer: Since you seem to be replying to yourself, I'll add some comments. typedef struct typedef is useless here, IMO. (But this is a matter of style, I agree.) There are some advantages to NOT doing it my way, but I prefer the concise way - I don't like having to write struct book_entry all over my code. It's a matter of style. { int count; // number of times this position/response seen (priority) Warning: alignment will cause this struct to be 3 * sizeof(u64), wasting 32 bits.
Warning 2: if the count is never negative, unsigned would be more appropriate. I'm not concerned about space usage in the slightest, because I have about 100 book positions currently, and in my dreams I might get to a few thousand. But you are right, I could save a few bits if I didn't worry about structure alignment. u64 key; // canonical position key u64 resp; // resulting canonical position Warning: you are wasting (64 - ~9) bits here, since the response stems from a set of at most 361+1 choices. (This would be different if you intend to search backwards in the tree/dag, with 'resp' as search key.) But if I use the compact approach I lose a lot of safety. These are hash keys, and if I get a collision on key, then it's extremely unlikely that I will also get a collision in any of the child hash keys. But if I use your scheme, I have very little extra safety (although I still have a little bit, since a move might prove to be legal in one position and not in the other, but this is probably only worth an extra bit or two.) To be honest, 64 bits is probably safe enough for a few hundred moves, unlikely to cause a collision. } book_entry; That could reduce your struct to: struct booklet { u64 key; u32 count; u16 move; /* you *could* store the de-canonicalisation info here: */ u16 spare; }; , which will take only 2*64 bits. Based on the considerations I have presented, a better layout is something like this: u64 key; u32 child_key; u32 count; I'm extremely unlikely to get a collision with a 64-bit parent and a 32-bit child, so I would use something like this to save space. If I wanted to use bit fields I could do something like this, if I want to sacrifice a few more bits of safety: u64 key; uint child_key:28; uint count:4; I would be happy with just a few bits in child_key, because 64 bits is reasonably safe by itself. These book_entry records are stored in a file and I keep them sorted. So the procedure to see if there is a book move is Sorted on key?
(Keep them sorted == resort periodically.) In that case, some kind of hashing would seem more appropriate, IMHO. I don't need a complicated scheme. The book is small and always will be small, and when I insert a record I just use insertion sort, on the fly, which is faster than quicksort on a file that is already sorted except for one record. This is trivially easy stuff and is bug-free. I dump the book to a file using fwrite and it's trivial. If I use a hashing scheme, I have to worry about resizes and other complexities and how to put them on the disk. If I really get to the point where I think I can generate a huge book, my plan would be to use the sqlite library and I would store them using this very easy-to-use embedded database. to binary search the file on the parent position key,
Re: [computer-go] MC approach
But of course, it's not the size of the win that counts, it is rather the confidence that it really is a win. In random playouts that continue from a position from a close game, the ones that result in a large victory are generally only ones where the opponent made a severe blunder. (Put another way, the score of the game is affected more by how bad the bad moves are, rather than how good the good ones are, or even how good most of the moves are. Others have commented on this effect in this list, in other contexts.) Since you can't count on that happening in the real game, these simulations have a lower value in the context of ensuring a win. Even when the program is losing (say by a little bit) it is more important to play moves that it thinks are the most likely to convert the game to a win by confusing the opponent, rather than to play moves that will make the losing outcome more apparent. These tend to be different moves, and Monte Carlo methods are good at uncovering these differences. I agree that this is a bit surprising, but I find it much less so when I think about it in these terms. Given that people have reported such a strong effect, I am actually wondering if these simulations (those that result in a large score difference) should be _penalized_, for not being properly representative of the likely outcome of the game. In other words: value = 1000 * win - score Weston On 2/7/07, Don Dailey [EMAIL PROTECTED] wrote: My intuition is also the same as yours - one would think 100 point win is better. It's not that it's better or it's a better move objectively, but it should be better against a fallible opponent who could easily get distracted by the little battles and forget the real point of the game. At least it's a way to fight when you are losing. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC Go Effectiveness
After looking at your message, I'm not sure you understand how I set this up. On Thu, 2007-02-08 at 01:00 +0100, Unknown wrote: to binary search the file on the parent position key, collect all of these records together (making sure there is a legal move which leads to the canonical response key) and then You are not clear here. Is there only a one-move step between key and resp? Why not store the move in the first place (instead of the resulting hash)? The book has 1 record per playable move. If there are 4 responses to some position in the book, there will be 4 records, one for each possible response. Those 4 records will have the same key field, which is a canonical key of the position, but they will have separate child keys (resp, for resulting position), which are also canonical. From some actual position it's trivial to convert the canonical child keys to actual moves using the canonical hash. The count is just a priority system, which right now happens to correspond to how many times the book searcher wanted to play the move in question (and since I set an arbitrary limit of 8 searches, all the moves for a given position would total 8 if all the lookups have been completed.) Doing 8 searches is time-consuming, but I really prefer a book that has a LOT of variety. If I decide that the book stuff is really useful, I will probably convert it to use an sqlite3 database, which is quite nice and easy to use and manage; I might even place search data such as the score and number of nodes searched in such a database. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC approach
Matt Gokey: [EMAIL PROTECTED]: Weston Markham wrote: But of course, it's not the size of the win that counts, it is rather the confidence that it really is a win. Yes, and my reasoning was that a larger average win implied a higher confidence, since there is more room for error. That intuition may not hold, though. In random playouts that continue from a position from a close game, the ones that result in a large victory are generally only ones where the opponent made a severe blunder. (Put another way, the score of the game is affected more by how bad the bad moves are, rather than how good the good ones are, or even how good most of the moves are. Others have commented on this effect in this list, in other contexts.) Since you can't count on that happening in the real game, these simulations have a lower value in the context of ensuring a win. That is the first reasonable argument I've heard that makes some sense as to why this effect may be true. The opposite of course may be true as well, and close games may really not be close, due to the same blunder effect. Perhaps it is just another symptom of the fact that most playouts are nonsense games. We can test this effect by using, for example, v = 0.5 * (1.0 + tanh(k * score)); // v is in [0...1] at a small cost in simulation speed. As k becomes larger, this function approaches the commonly used threshold function, and vice versa. I guess the best value of k depends on how sensible the simulated games are; i.e., current heuristics for pruning moves are not effective enough for a large k to be best. - gg -- [EMAIL PROTECTED] (Kato) ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/