Hi David,

Do you know if there is a big difference between Gnugo 3.7.10 and 3.8? That would explain a bit.
From what I've seen/heard of Many Faces, I really can't compare the number of
playouts with my program. My program doesn't have much "traditional" knowledge,
unlike what I believe Many Faces has.

Thanks, I will keep trying different ideas.

--
Francois van Niekerk
Email: [email protected] | Twitter: @francoisvn
Cell: +2784 0350 214 | Website: http://leafcloud.com

On Fri, Dec 31, 2010 at 4:02 AM, David Fotland <[email protected]> wrote:
> It's a good start. I haven't tested against gnugo on 9x9 for a long time so
> I tried a short test this afternoon.
>
> Many Faces, 1000 playouts per move, vs Gnugo 3.7.10, level 10.
> Many Faces won 75.9% of 3669 games (+-1.4% confidence level).
>
> It took me about 300 versions tested to get from 10% wins against gnugo in 3
> minute games to 90% wins against gnugo with 5K playouts, before I switched
> to 19x19. Many things you try won't make your program better, but if you
> keep trying it will get much better.
>
> David
>
>> -----Original Message-----
>> From: [email protected] [mailto:computer-go-
>> [email protected]] On Behalf Of Jacques Basaldúa
>> Sent: Thursday, December 30, 2010 1:57 PM
>> To: [email protected]
>> Subject: [Computer-go] Oakfoam and ELO Features
>>
>> Hi Francois, Welcome
>>
>> > For reference I need about 100k playouts with
>> > RAVE to get 50% winrate against GnuGo 3.8 L10.
>>
>> Yes that's more or less expected. At least before the
>> "big" improvements (yet to come ;-)
>>
>> In my case I do a lot of testing at 4x10000 because
>> the games are around 15 seconds long and I get fast
>> Elo confidence intervals. At that rate (40K playouts/move)
>> I get about 40-42% of wins against gnugo. This is more
>> or less consistent with a debugged barebones without
>> particular smarts (but with RAVE, without progressive
>> widening). I guess 40% at 40000 scales to 50% near
>> 100K but the exact point where I reach 50% has not
>> been studied as I expect it to be much lower in the
>> near future.
>> (Optimistic)
>>
>> > The next step is obviously to apply these to the
>> > playouts. I am currently testing my program with
>> > the ELO features in the playouts, but unfortunately
>> > the preliminary results don't look good.
>>
>> That's exactly my experience! Although you do get
>> improvement with the extend from atari / capture /
>> distance to previous move heuristics.
>>
>> The Mogo and CrazyStone papers report improvement
>> "all features included" which is true because the
>> other ideas produce improvement, but they don't
>> give results for the patterns in isolation.
>>
>> I got a lot of improvement from Rémi's Bradley-Terry
>> ideas in move prediction (although with some
>> overlearning which I didn't care much about as
>> predicting moves is not my interest.) But neither
>> the naive values (times played/times seen) nor the
>> improved Bradley-Terry values are better in playouts
>> than uniform random. They are 158 CI(114..202) Elo
>> points worse!
>>
>> That is good and bad news. Why should uniform random
>> be the best? Obviously it is not. But what humans
>> play lacks all the information about what they don't
>> play because it is obvious to them, but it is not
>> obvious to a "silly" playout policy.
>>
>> How to find good values for the patterns? (What I have
>> tried.)
>>
>> a. Use small patterns (3x3) with all non-ill-formed
>> patterns in the database. (Other databases have a value
>> for "unknown"; this one shouldn't.)
>>
>> b. Classify patterns. I have done that in 40 classes.
>> This way you reduce the number of degrees of freedom.
>> So your vector of gamma values is in R^40
>>
>> c. Then what? I really don't know. I have a "sort of
>> genetic algorithm". I like the idea that anything
>> changes at random, because the gammas are not
>> independent and this way the expected value of the
>> correlation is zero even under stochastic dependence.
>> Then I select the "best winners" and move my center of
>> gravity one little step in one or two classes of patterns,
>> then repeat the entire process. Then test to see if there
>> was improvement. A long process. I only won a little
>> in the first iterations. After that, fake improvement
>> that wasn't verified against uniform random.
>>
>> In all about 100 Elo points, which is less than what
>> the "human patterns" lose. I guess best playouts
>> are a research area where there is room for improvement.
>>
>>
>> Jacques.
>>
>> _______________________________________________
>> Computer-go mailing list
>> [email protected]
>> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
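
A note on the numbers quoted in this thread: the "+-1.4%" on 75.9% of 3669 games and the Elo figures can be related with a short calculation. The sketch below is only an illustration. It assumes a 95% normal-approximation interval on the win rate (the thread does not say how the margin was computed) and the usual logistic Elo model. The function names and the rounded win count are hypothetical, not taken from any of the programs discussed.

```python
import math


def win_rate_interval(wins, games, z=1.96):
    """Return (win rate, half-width) using a normal approximation.

    z=1.96 corresponds to a 95% interval; this choice is an assumption.
    """
    p = wins / games
    half_width = z * math.sqrt(p * (1.0 - p) / games)
    return p, half_width


def elo_difference(p):
    """Elo difference implied by win rate p under the logistic Elo model."""
    return 400.0 * math.log10(p / (1.0 - p))


# Figures quoted above: 75.9% of 3669 games.
games = 3669
wins = round(0.759 * games)  # win count reconstructed from the percentage

p, half_width = win_rate_interval(wins, games)
print(f"win rate {p:.3f} +- {half_width:.3f}")
print(f"implied Elo difference {elo_difference(p):.0f}")
```

Under these assumptions the half-width comes out close to the 1.4% quoted, and the same elo_difference function can be applied to the ends of the win-rate interval to get an Elo interval like the "158 CI(114..202)" style used above.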
