Hi Marc,

You may want to take a look at my papers; they are related to these ideas (although I haven't yet changed the "team" on the fly, as you are suggesting):
* http://teamcore.usc.edu/people/sorianom/aaai14.pdf
* http://teamcore.usc.edu/papers/2013/ijcai13.pdf
* http://teamcore.usc.edu/people/sorianom/AAMAS2011.pdf

Concerning changing the settings dynamically, my most recent work may help (when using voting):

* http://teamcore.usc.edu/papers/2015/aamas2015.pdf

Let me know if you have questions/comments... :)

Best,
Leandro

On Sun, Apr 26, 2015 at 7:59 AM, Marc Landgraf <mahrgel...@gmail.com> wrote:
> Heya,
> Lately I have been trying to work out whether it is possible to combine
> the strengths of different bots, or at least of different parameter
> sets/bias systems for one bot. They may shine in different
> situations/phases of the game, but how do you figure out which one is
> currently the better one?
>
> What I came up with is the following:
> For simplicity, assume for now that our different bots use the same
> playouts but different approaches during the tree phase. Maybe they use
> different ways to bias nodes, different selection formulas, etc. I will
> focus on the case where they use different bias systems.
> Now you split your playouts into percentiles:
> 25%: Bot1 selects the white moves, Bot2 the black ones
> 25%: the other way around
> 0 < x% < 50%: Bot1 selects for both.
> 50% - x%: Bot2 selects for both.
> You track the win rates of those first two quarters separately and also
> calculate the win rate of Bot1 vs. Bot2.
> If the bots are identical, obviously both should win 50%. But if the
> bots are different, you may see different results.
> E.g. if Bot1 wins 55% of its games, its move selection is probably
> better than Bot2's. You have to be careful about wrong conclusions here:
> if you set up a depth-first bot against a breadth-first one, you would
> certainly get win rates heavily in favor of the depth-first bot.
> But where this could shine is when using different bias systems.
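The percentile bookkeeping described above could be sketched as follows. This is an illustrative Python sketch only: the `SplitStats` class, the bucket names, and the default x = 0.25 are my own assumptions, not taken from any existing bot.

```python
import random

# Buckets for the playout split described above:
#   MIXED_1: Bot1 selects White's in-tree moves, Bot2 Black's
#   MIXED_2: the other way around
#   PURE_1 / PURE_2: one bot selects for both colors
MIXED_1, MIXED_2, PURE_1, PURE_2 = range(4)


class SplitStats:
    """Track results per bucket so the two in-tree selection
    policies can be compared while the search is running."""

    def __init__(self, x=0.25):
        # x is Bot1's share of the remaining 50% (0 < x < 0.5)
        self.x = x
        self.games = [0, 0, 0, 0]
        self.bot1_wins = [0, 0, 0, 0]  # wins by the side Bot1 selected for

    def pick_bucket(self):
        """Assign the next playout to one of the four percentiles."""
        r = random.random()
        if r < 0.25:
            return MIXED_1
        if r < 0.50:
            return MIXED_2
        if r < 0.50 + self.x:
            return PURE_1
        return PURE_2

    def record(self, bucket, bot1_side_won):
        self.games[bucket] += 1
        if bot1_side_won:
            self.bot1_wins[bucket] += 1

    def bot1_vs_bot2_winrate(self):
        """Head-to-head win rate of Bot1 over the two mixed quarters."""
        games = self.games[MIXED_1] + self.games[MIXED_2]
        wins = self.bot1_wins[MIXED_1] + self.bot1_wins[MIXED_2]
        return wins / games if games else 0.5
```

Each playout is assigned to a bucket before the tree descent, so the two mixed quarters give a direct head-to-head signal for the two selection policies while the playout policy stays identical.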
> Because it actually tells you which bias system is doing better in the
> current board situation.
> Now you can use that knowledge to calculate x. E.g. if either bot wins
> 60%+ head to head, it gains all 50% of the remaining playouts, and the
> balance slides linearly while the win rate between the bots is 40-60%.
> (Use whatever formula here; it is open for testing.)
>
> This should let you figure out on the fly, while doing playouts, which
> bias system is doing a better job in the current situation. You are just
> tracking some additional stats.
>
> Of course, there are pros and cons to this method:
> + In general, switching the selection method in the tree should not cost
> any time, and tracking those additional stats also costs close to no
> time. The only additional time comes from using a second bias function
> or similar (because now you have to calculate the bias twice for most
> nodes).
> - At the same time, the amount of data increases, because you have to
> track the stats for the different bots. This may cause memory issues!
> (But when using a distributed-memory solution anyway, it creates no
> additional data if each memory/thread unit is assigned to one of those
> percentiles of playouts.)
> +- In general, the cost increase depends on how different the two bots
> are. If they are the same, there would be basically no cost.
> + It allows you to figure out which bot/bias/selection is doing better
> right now.
> - But it may lead to false conclusions, like the depth-vs-breadth
> example mentioned above.
> +- As long as both bots are of similar strength, you should not lose by
> using this kind of system. The worst case is that you play the wrong
> bot's move, given a false conclusion like the one mentioned above. But
> when the bots are close in strength, that is no worse than using just
> one bot all the time. Of course, if one bot dominates the other in all
> situations, you are losing quality by figuring out again and again that
> this is the case.
> (Because in 50% of the playouts, half the moves were selected by the
> worse bot.)
>
> Some quick ideas for how this could be modified further:
> - All percentages are obviously placeholders and could be adjusted (even
> dynamically).
> - Assuming you have a low-cost and a very high-cost bias function, you
> can actually check whether it is worth using the high-cost bias
> function, or whether the board situation is simple enough to churn out
> more playouts with the cheaper function.
> - Identifying certain game situations: you may know that one bias does
> much better in corner fights or liberty races, so if that one is
> dominating, you are probably in such a situation and can adjust further
> (e.g. modify playouts, add additional routines, whatever).
> - Using more than two routines, possibly in conjunction with the idea
> above, to identify situations.
>
> Of course, one could also think about expanding this idea to use
> different playouts, but I am not sure how to judge which playouts are
> better when using different ones. Just because Bot1's playouts tell me I
> am winning 60%, that does not mean his playouts are better than Bot2's,
> which only give me 40%. (Even though I would wish so :D) So right now I
> need the same playouts to judge my selection routine. Maybe someone else
> has an idea for how to judge playouts?
>
> What do you guys think? Is there anything with potential in these ideas?
> Or are the cost and the danger of wrong conclusions too high for the
> possible gains?
> Sadly I can't test it with my own bot, as my versions strictly dominate
> each other, and with its current strength it would probably not be
> conclusive anyway.
>
> Marc
>
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
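The rebalancing rule quoted above (a bot that wins 60%+ head to head gains all 50% of the remaining playouts, with the balance sliding linearly between 40% and 60%) could look like this minimal Python sketch; the function name and the exact thresholds are placeholders, just as they are in Marc's mail:

```python
def bot1_share(head_to_head_winrate, lo=0.40, hi=0.60):
    """Fraction x of the remaining 50% of playouts given to Bot1,
    as a function of Bot1's win rate against Bot2 in the mixed
    quarters. Below `lo` Bot1 gets nothing, above `hi` it gets the
    full 50%, and the share slides linearly in between."""
    if head_to_head_winrate <= lo:
        return 0.0
    if head_to_head_winrate >= hi:
        return 0.5
    return 0.5 * (head_to_head_winrate - lo) / (hi - lo)
```

With these thresholds, a 50% head-to-head rate keeps the original even split (x = 0.25), and a pure bucket disappears only once one bot clears the 60% mark; x would be recomputed periodically from the stats of the two mixed quarters.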