Re: [Computer-go] Master Thesis: Information Sharing in MCTS
Petr, a huge congratulations! (Chetwynd, not Chetwyng, but hey, I had no expectation of a plug.) Looking forward to the browse. Best ~

On 3 Aug 2011, at 23:13, Petr Baudis wrote:
> Hi! If anyone is interested, you can read my master thesis at:
> http://pasky.or.cz/go/prace.pdf
> It could give a good introduction to current Monte Carlo techniques in Computer Go in general, and discusses some approaches for improvement (nothing too dramatic). It also gives a mid-level technical description of Pachi (with some important stuff left out, but we are preparing a paper).
> Kind regards,
> -- Petr "Pasky" Baudis
> UNIX is user friendly, it's just picky about who its friends are.

_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Master Thesis: Information Sharing in MCTS
Great thesis. Many Faces also uses rule-based playouts, so Pachi is not the only strong rule-based program. You mention that in the playouts you check ataris and extensions to avoid growing a losing ladder. Do you do a full ladder search, or just some local heuristics? Many Faces does not have any ladder search in the playouts.

David

-----Original Message-----
From: computer-go-boun...@dvandva.org [mailto:computer-go-boun...@dvandva.org] On Behalf Of Petr Baudis
Sent: Wednesday, August 03, 2011 3:13 PM
To: computer...@computer-go.org
Subject: [Computer-go] Master Thesis: Information Sharing in MCTS

> Hi! If anyone is interested, you can read my master thesis at:
> http://pasky.or.cz/go/prace.pdf
> [...]
Re: [Computer-go] European Go Congress 2012
Hello Ingo,

I would have liked to travel to Bordeaux for the computer Go, but it did not fit well with other things; also, Olivier thought I could help just as much from my home in England. However, I would like to help with the events in Bonn. Is the schedule for the computer Go known yet? I could attend for either week, but the middle weekend, August 27-29, will be difficult for me.

Best wishes,
Nick

On 21/07/2011 15:40, Ingo Althöfer wrote:
> Hello, right now the European Go Congress 2011 is about to start, in Bordeaux, France. At the same time, preparations are already running for the Congress 2012, which will take place in Bonn, Germany, in July and August 2012. On the website of the EGC-2012 organizers, I am writing a column on computer go at irregular intervals. Currently there is only one entry, from early June.
> http://www.egc2012.eu/home/news/2011/06/04/ingos-computer-go-column-times-are-hot-computer-go
> If anyone here has interesting material or ideas or questions for that column, please let me know.
> Ingo.

--
Nick Wedd n...@maproom.co.uk
Re: [Computer-go] Master Thesis: Information Sharing in MCTS
On Thu, Aug 04, 2011 at 12:46:21AM -0700, David Fotland wrote:
> You mention that in the playouts you check ataris and extensions to avoid growing a losing ladder. Do you do a full ladder search, or just some local heuristics? Many Faces does not have any ladder search in the playouts.

We do a full ladder search in the sense that we walk the board up to a ladder breaker. However, we do not actually play the moves, and the ladder-breaker test is very simple. Overall, this check takes little time, but it also is not very important strength-wise. Pachi still tends to misplay many ladders and we have not found a complete solution for that. (I'm very reluctant to completely prune moves on principle.)

-- Petr "Pasky" Baudis
UNIX is user friendly, it's just picky about who its friends are.
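Petr's description (walk the ladder path without playing any moves, with a very simple breaker test) might look roughly like the following toy sketch. To be clear, this is not Pachi's actual code: the function name, the board encoding, and the breaker test are all illustrative.

```python
# Toy sketch of "walk the board without playing moves": trace the zig-zag
# path a ladder would take and stop at the first stone found on it.
# Hypothetical illustration, NOT Pachi's implementation.

def ladder_reads_as_escape(stones, start, step_a, step_b, size=19):
    """stones: dict mapping (col, row) -> 'X' (defender) or 'O' (attacker).
    start: position of the chased stone. step_a/step_b: the two orthogonal
    directions the ladder alternates between. Returns True if the walk hits
    a defender stone (a ladder breaker) before reaching the edge or an
    attacker stone."""
    x, y = start
    steps = (step_a, step_b)
    for i in range(2 * size):          # a ladder path is at most ~2*size long
        dx, dy = steps[i % 2]
        x, y = x + dx, y + dy
        if not (0 <= x < size and 0 <= y < size):
            return False               # ran into the edge: the ladder works
        if stones.get((x, y)) == 'X':
            return True                # defender stone on the path: escape
        if stones.get((x, y)) == 'O':
            return False               # attacker stone on the path helps capture
    return False

# Empty board: the ladder reaches the edge, so the chased stone is captured.
print(ladder_reads_as_escape({}, (10, 10), (1, 0), (0, 1)))               # False
# A defender stone sitting on the ladder path acts as a breaker.
print(ladder_reads_as_escape({(13, 12): 'X'}, (10, 10), (1, 0), (0, 1)))  # True
```

Because nothing is actually played, the walk is cheap, which matches the remark that the check "takes little time"; the price is exactly the kind of misread Petr mentions when the real ladder interacts with nearby stones in ways this simple path test ignores.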
Re: [Computer-go] European Go Congress 2012
Answer given in private mail.

Ingo.

-------- Original Message --------
Date: Thu, 04 Aug 2011 09:03:32 +0100
From: Nick Wedd n...@maproom.co.uk
To: computer-go@dvandva.org
Subject: Re: [Computer-go] European Go Congress 2012

> Hello Ingo, [...] However I would like to help with the events in Bonn. Is the schedule for the computer Go known yet? I could attend for either week, but the middle weekend, August 27-29, will be difficult for me. [...]
[Computer-go] testing improvements
Hi all!

I finally got an idea that is worth investigating. Luckily, it is something that can be tested by modifying existing programs, so I started to set up an environment to test it. In order to have a reference, this morning I started a small tournament with two identical versions of fuego (@1k) and gnugo (@level 10), and they have played 863 rounds so far. The scores against gnugo are almost identical, but the two fuegos score 449-415 against each other, which is 52%, and the 95% confidence is ~3%, i.e. ~10 ELO. Now this is within limits, and it varies a bit, but it is always on the side of one of the instances, never less than 51.5%. Is this something normal? Sorry for the n00b question :-)

best regards,
Vlad
Re: [Computer-go] testing improvements
Did each fuego play the same number of games vs gnugo, and did each play half its games on each color?

-----Original Message-----
From: computer-go-boun...@dvandva.org [mailto:computer-go-boun...@dvandva.org] On Behalf Of Vlad Dumitrescu
Sent: Thursday, August 04, 2011 9:57 AM
To: computer-go@dvandva.org
Subject: [Computer-go] testing improvements

> In order to have a reference, I started this morning a small tournament with two identical versions of fuego (@1k) and gnugo (@level 10) and they played 863 rounds so far. The scores towards gnugo are almost identical, but the two fuegos score 449-415, which is 52% and the 95% confidence is ~3%, i.e. ~10 ELO. [...]
Re: [Computer-go] testing improvements
On Thu, Aug 4, 2011 at 6:57 PM, Vlad Dumitrescu vladd...@gmail.com wrote:
> The scores towards gnugo are almost identical, but the two fuegos score 449-415, which is 52% and the 95% confidence is ~3%, i.e. ~10 ELO.

That 3% is not a 95% confidence interval, more like 1 standard deviation... (so nothing with high confidence yet)

Erik
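How significant a 449-415 split actually is can be checked directly with the normal approximation to the binomial; a small sketch (the 449-415 numbers are the ones from the thread):

```python
from statistics import NormalDist
from math import sqrt

# Test of "both bots are equally strong" for a 449-415 head-to-head split.
wins, losses = 449, 415
n = wins + losses

# Under the null hypothesis, wins ~ Binomial(n, 0.5); normal approximation.
z = (wins - n / 2) / sqrt(n * 0.25)
p_one_sided = 1 - NormalDist().cdf(z)

print(round(z, 2))           # standard deviations above an even split, ~1.16
print(round(p_one_sided, 3)) # chance of a split at least this lopsided, ~0.12
```

So the observed 2% edge is only a bit over one standard deviation away from an even split, nowhere near the roughly two standard deviations a 95% test would require.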
Re: [Computer-go] testing improvements
All the more since you're testing the same idea on two bots simultaneously. So if you want to be wrong at most five percent of the time, and declare your idea an improvement as soon as one of the bots gets better, you have to make the individual tests at the 2.5% level. And I'm not even taking into account the fact that you want to continue testing until you reach significance. That would again require you to take a lower level.

Jonas

On Thu, Aug 4, 2011 at 6:57 PM, Vlad Dumitrescu vladd...@gmail.com wrote:
>> The scores towards gnugo are almost identical, but the two fuegos score 449-415, which is 52% and the 95% confidence is ~3%, i.e. ~10 ELO.
> That 3% is not a 95% confidence interval, more like 1 standard deviation... (so nothing with high confidence yet)
> Erik
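The adjustment Jonas describes is the Bonferroni correction; a quick sketch of what it does to the critical value, using only the Python standard library:

```python
from statistics import NormalDist

alpha = 0.05   # overall chance of a false positive you are willing to accept
n_tests = 2    # the same idea is tested on two bots at once

# Bonferroni correction: each individual test must use alpha / n_tests.
alpha_each = alpha / n_tests   # 0.025, the "2.5% level" from the thread

# One-sided critical z-values with and without the correction.
z_single = NormalDist().inv_cdf(1 - alpha)          # ~1.64 at the 5% level
z_corrected = NormalDist().inv_cdf(1 - alpha_each)  # ~1.96 at the 2.5% level

print(round(z_single, 2), round(z_corrected, 2))
```

The bar each bot must clear moves from about 1.64 to about 1.96 standard deviations, and as Jonas notes, stopping the test as soon as significance appears would require lowering the level further still.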
Re: [Computer-go] testing improvements
Hi,

On Thu, Aug 4, 2011 at 19:29, David Fotland fotl...@smart-games.com wrote:
> Did each fuego play the same number of games vs gnugo, and did each play half its games on each color?

Yes, I set up an all-play-all competition with gomill.

On Thu, Aug 4, 2011 at 19:55, Erik van der Werf erikvanderw...@gmail.com wrote:
> That 3% is not a 95% confidence interval, more like 1 standard deviation... (so nothing with high confidence yet)

I took the easy way out and used a formula mentioned by David Fotland on this list a while ago:

> There is a simple formula to estimate the confidence interval of a result. I use it to see if a new version is likely better than a reference version (but I use 95% confidence intervals, so over hundreds of experiments it gives me the wrong answer too often).
>
> 1.96 * sqrt(wr * (1 - wr) / trials)
>
> Where wr is the win rate of one version vs the reference, and trials is the number of test games.

On Thu, Aug 4, 2011 at 20:21, Kahn Jonas jonas.k...@math.u-psud.fr wrote:
> All the more since you're testing the same idea on two bots simultaneously. So if you want to be wrong at most five percent of the time, and consider you are better as soon as one of the bots gets better, you have to make individual tests at the 2.5% level.

At the moment I ran the bots without any modification, to see if everything works fine. So I think that the results between the identical bots should have been closer to 50%, or at least should sometimes swing to the other side of 50%. Right now it's 625-566, which is 52.5%, with 2.83% confidence according to the formula above.

The results are:

fuego-1.1 v fuego-new (1199/2000 games)
unknown results: 1 0.08%
board size: 9   komi: 6.5

             wins         black        white        avg cpu
fuego-1.1    569 47.46%   386 64.33%   183 30.55%   2.69
fuego-new    629 52.46%   415 69.28%   214 35.67%   2.67
                          801 66.81%   397 33.11%

I realize that statistical results don't always match what one would expect, but this should be a straightforward case...

Thanks a lot for all the answers!

regards,
/Vlad
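For reference, the quoted formula can be reproduced in a few lines; the 625-566 figures are the ones from this thread:

```python
from math import sqrt

def ci95_halfwidth(wr, trials):
    """95% confidence half-width for a win rate, via the normal approximation."""
    return 1.96 * sqrt(wr * (1 - wr) / trials)

# The 625-566 result from the thread: 1191 decided games, ~52.5% win rate.
wins, losses = 625, 566
trials = wins + losses
wr = wins / trials

print(round(wr, 3))                          # ~0.525
print(round(ci95_halfwidth(wr, trials), 4))  # ~0.0284, the "2.83%" above
```

Note that the half-width only shrinks with the square root of the number of games, so halving the confidence interval requires four times as many games.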
Re: [Computer-go] testing improvements
Remember that the confidence interval is two-sided, so 3% means plus or minus 3%. So a 52% win rate is within +-3% of 50%.

David

-----Original Message-----
From: computer-go-boun...@dvandva.org [mailto:computer-go-boun...@dvandva.org] On Behalf Of Vlad Dumitrescu
Sent: Thursday, August 04, 2011 1:14 PM
To: computer-go@dvandva.org
Subject: Re: [Computer-go] testing improvements

> Right now it's 625-566, which is 52,5% and 2.83% confidence according to the formula above. [...] I realize that statistic results don't always match what one would expect, but this should be a straightforward case... [...]
Re: [Computer-go] testing improvements
On Thu, Aug 4, 2011 at 22:41, David Fotland fotl...@smart-games.com wrote:
> Remember that the confidence interval is two sided, so 3% means plus or minus 3%. So 52% win rate is within +- 3% of 50%.

Yes, of course. What I reacted to was that over the whole test, one bot always had around 52% wins (well, after the first 100 games or so, at least). I would have thought it would move around the real value.

Thanks,
Vlad
Re: [Computer-go] testing improvements
On Thu, Aug 4, 2011 at 4:41 PM, David Fotland fotl...@smart-games.com wrote:
> Remember that the confidence interval is two sided, so 3% means plus or minus 3%. So 52% win rate is within +- 3% of 50%.

Yes. And something else that is rarely considered: the error margin does not mean what you think it does if you pick and choose when to observe it. For example, you can't just stop the test because you like the current result and error margin. The correct way to interpret the error margin is to decide in advance exactly how many games you are going to play; then the error margin means what it is supposed to mean (also considering that it is two-sided, that is).

When I test I use bayeselo, and I set the confidence to 99% instead of the standard 95%, because we are not as strict as we should be about this stuff (since, due to limited resources, we must be able to stop tests early). But we are at least aware of the problems and issues.

Don
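Don's point about optional stopping is easy to demonstrate by simulation. A hypothetical sketch (two equal "bots" modeled as fair coin flips; nothing here is bayeselo's method): a test that is allowed to stop the moment the running score leaves the 95% band produces false positives far more often than 5% of the time.

```python
import random

random.seed(42)

def false_positive(n_games, peek_every, z_crit=1.96):
    """Play n_games between two EQUAL bots; declare a 'significant' winner
    the first time the running win rate leaves the 95% band. Returns True
    if the test ever (wrongly) reached significance."""
    wins = 0
    for g in range(1, n_games + 1):
        wins += random.random() < 0.5
        if g % peek_every == 0:
            wr = wins / g
            half = z_crit * (0.25 / g) ** 0.5  # 95% half-width at wr = 0.5
            if abs(wr - 0.5) > half:
                return True
    return False

trials = 2000
# Peek every 50 games and stop at the first "significant" result...
stopped_early = sum(false_positive(1000, 50) for _ in range(trials)) / trials
# ...versus deciding in advance to look only once, after all 1000 games.
fixed_n = sum(false_positive(1000, 1000) for _ in range(trials)) / trials

print(stopped_early, fixed_n)  # peeking inflates the ~5% error rate a lot
```

The fixed-sample test errs at roughly the nominal 5% rate, while the peeking test errs several times as often, which is exactly why Don compensates by demanding 99% confidence when tests may be cut short.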
Re: [Computer-go] testing improvements
I often see that one side gets lucky early, and over a few hundred games the win rate moves back toward what I expected. Small numbers of games (like a few hundred) can be very misleading.

David

-----Original Message-----
From: computer-go-boun...@dvandva.org [mailto:computer-go-boun...@dvandva.org] On Behalf Of Vlad Dumitrescu
Sent: Thursday, August 04, 2011 2:02 PM
To: computer-go@dvandva.org
Subject: Re: [Computer-go] testing improvements

> Yes, of course. What I reacted to was that under the whole test, one bot always had around 52% wins (well, after some 100 games, at least). I would have thought it would move around the real value. [...]
Re: [Computer-go] testing improvements
On Thu, Aug 4, 2011 at 23:12, David Fotland fotl...@smart-games.com wrote:
> I often see that one side gets lucky early and over a few hundred games the win rate moves back toward what I expected. Small numbers of games (like a few hundred) can be very misleading.

Great, thanks. This means then that 1000 games can only detect a change of at least 50 ELO.

/Vlad
Re: [Computer-go] testing improvements
On Thu, Aug 4, 2011 at 5:37 PM, Vlad Dumitrescu vladd...@gmail.com wrote:
> Great, thanks. This means then that 1000 games can only detect a change of at least 50 ELO.

Of course it's a matter of how much certainty you want. If your results show only 2 or 3 ELO, you need tens of thousands of games to give high confidence that there is an actual improvement. However, if your results show 50 ELO, I believe the error margin after 1000 games is more than enough to prove that some of this 50 ELO is real. Of course it may all be real, but you can only assume that some of it is.

Don
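The Elo/win-rate conversion behind these estimates is the standard logistic formula; a small sketch (the 50-Elo and 1000-game figures are the ones from the thread):

```python
from math import log10, sqrt

def elo_to_winrate(d):
    """Expected win rate of a player rated d Elo above its opponent."""
    return 1 / (1 + 10 ** (-d / 400))

def winrate_to_elo(wr):
    """Elo difference implied by a head-to-head win rate."""
    return 400 * log10(wr / (1 - wr))

# A 50-Elo edge corresponds to roughly a 57% win rate...
wr_50 = elo_to_winrate(50)        # ~0.571

# ...while the 95% half-width after 1000 games is only about 3.1%,
half = 1.96 * sqrt(0.25 / 1000)   # ~0.031

# so a genuine 50-Elo improvement comfortably clears the noise band at n=1000.
print(round(wr_50, 3), round(half, 3))
```

Near a 50% score the rule of thumb is that one percentage point of win rate is worth about 7 Elo, which is why small Elo gains take so many games to confirm.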
Re: [Computer-go] pachi questions
Why does pachi fill its own eye here, when it can simply pass and win the game (komi = -35.5)?

IN: genmove b
pre-simulated 816 games (UCT tree; root white; extra komi 0.00; max depth 10)
[F6] 1.000/1002 [prior 0.059/119 amaf 1.000/824 crit 0.000] h=0 c#=3 fbfb7
[pass] 0.995/1004 [prior 0.500/14 amaf 0.000/0 crit -0.005] h=0 c#=5 fbfbd
(avg score 0.00/0 value 0.00/0) [186] best 0.00 | seq | can pass(0.995) A1(0.000) F1(0.000)
*** WINNER is A1 (1,1) with score 0. (0/1002:186/186 games), extra komi 0.00
genmove in 0.10s (1820 games/s, 606 games/s/thread)
playing move A1
Move: 41  Komi: -35.5  Handicap: 0  Captures B: 2 W: 8

    A B C D E F        A B C D E F
  +-------------+    +-------------+
6 | O O O O . O |  6 | O O O O O O |
5 | O O . O O . |  5 | O O O O O O |
4 | O O O . O O |  4 | O O O O O O |
3 | X X O O O O |  3 | X X O O O O |
2 | X X X X X X |  2 | X X X X X X |
1 | X)X X X X . |  1 | X X X X X X |
  +-------------+    +-------------+