Heya,

I recently tried to think about whether it would be possible to combine the strengths of different bots, or at least of different parameter sets/bias systems for one bot, in some way. They may shine in different situations/phases of the game, but how do you figure out which one is currently the better one?

What I came up with is the following. For simplicity, assume for now that our different bots use the same playouts but different approaches during the tree phase. So maybe they use different ways to bias nodes, different selection formulas, etc. I'm going to focus on them using different bias systems.

Now you split up your playouts into shares:

25%: Bot1 selects the white moves, Bot2 the black ones
25%: the other way around
x%, with x between 0 and 50: Bot1 selects for both
(50-x)%: Bot2 selects for both

You track the win rates of those first two quarters separately and also calculate the win rate of Bot1 vs. Bot2. If the bots are identical, obviously both should win 50%. But if the bots are different, you may see different results. E.g. when Bot1 wins 55% of its games, its move selection is probably better than Bot2's.

Here you have to be careful about wrong conclusions, because if you set up a depth-first bot against a breadth-first one, you would certainly also get win rates heavily in favor of the depth-first bot. But where this could shine is with different bias systems, because it actually tells you which bias system is doing better in the current board situation.

Now you can use that knowledge to calculate x. E.g. if either bot wins 60%+, it gets all 50% of the remaining playouts, and you let the balance slide linearly if the bots' win rate against each other is between 40% and 60%. (Use whatever formula here, open for testing.)

This should enable you to figure out on the fly, while doing playouts, which bias system is doing a better job in the current situation. You are just tracking some additional stats.

Of course, there are pros and cons to this method:

+ In general, switching the selection method in the tree should not cost any time, and tracking those additional stats also costs close to no time.
The only additional time comes from using a second bias function or similar (because now you have to calculate the bias twice for most nodes).
- At the same time, the amount of data actually increases, because you have to track the stats for the different bots. This may cause memory issues! (But when using a distributed memory solution anyway, it does not create additional data if each memory/thread unit is assigned to one of those shares of playouts.)
+- In general, the cost increase depends on how different the two bots are. If they are the same, there would be basically no cost.
+ Allows you to figure out which bot/bias/selection is doing better right now
- but may lead to false conclusions, like the depth vs. breadth example mentioned above.
+- As long as both bots are of similar strength, you should not lose from using this kind of system. The worst case is that you play the wrong bot's move if you drew one of the false conclusions mentioned above. But when they are close in strength, that is not worse than using just one bot all the time. Of course, if one bot dominates the other in all situations, you lose quality by figuring out again and again that this is the case (because in 50% of the playouts, half of the moves were selected by the worse bot).

So, some quick ideas for how it could be modified further:

- All percentages are obviously placeholders and could be adjusted (even dynamically).
- Assuming you have a low-cost and a very high-time-cost bias function, you can actually check whether it is worth using the high-cost bias function, or whether the board situation is simple enough to churn out more playouts using the cheaper function.
- Identifying certain game situations: you may know that one bias does much better in corner fights or liberty races, so if that one is dominating, you are probably in such a situation and can now adjust further (e.g.
modify playouts, add additional routines, whatever).
- Using more than two routines, possibly in conjunction with the idea above to identify situations.

Of course, one could now also think about how to expand this idea to different playouts, but I'm not sure how to judge which playouts are the better ones when using different ones. Just because Bot1's playouts tell me I'm winning 60% does not mean that its playouts are better than those of Bot2, which only gives me 40%. (Even though I would wish so :D) So right now I need the same playouts to judge my selection routine. Maybe someone else has an idea how to judge playouts?

What do you guys think? Is there anything with potential in those ideas? Or are the cost and the danger of wrong conclusions too high for the possible gains?

Sadly I can't test it with my own bot, as my versions strictly dominate each other, and with its current strength it would probably not be conclusive anyway.

Marc

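P.S. To make the allocation rule a bit more concrete, here is a rough Python sketch of the bookkeeping, not a tested implementation. It assumes the 25/25/x/(50-x) split and the 40%-60% linear slide from above; the names CrossPlayStats, bot1_share and split_playouts are made up for illustration, and the thresholds are placeholders like all the other numbers.

```python
from dataclasses import dataclass


@dataclass
class CrossPlayStats:
    """Results of the two mixed quarters, counted from Bot1's side (hypothetical helper)."""
    bot1_wins: int = 0
    games: int = 0

    def record(self, bot1_color: str, winner: str) -> None:
        # bot1_color is the color whose tree moves Bot1 selected in this playout.
        self.games += 1
        if winner == bot1_color:
            self.bot1_wins += 1

    def bot1_win_rate(self) -> float:
        # With no data yet, treat the bots as equal.
        return self.bot1_wins / self.games if self.games else 0.5


def bot1_share(win_rate: float, low: float = 0.40, high: float = 0.60,
               pool: float = 50.0) -> float:
    """Return x: the percentage of all playouts in which Bot1 selects for both colors."""
    if win_rate <= low:
        return 0.0        # Bot2 gets the whole remaining pool
    if win_rate >= high:
        return pool       # Bot1 gets the whole remaining pool
    # Linear slide between the two thresholds.
    return pool * (win_rate - low) / (high - low)


def split_playouts(total: int, win_rate: float) -> dict:
    """Split a playout budget into the four shares described above."""
    quarter = total // 4
    bot1_both = round(total * bot1_share(win_rate) / 100)
    bot2_both = total - 2 * quarter - bot1_both
    return {
        "bot1_white_bot2_black": quarter,
        "bot2_white_bot1_black": quarter,
        "bot1_both": bot1_both,
        "bot2_both": bot2_both,
    }
```

For example, split_playouts(1000, stats.bot1_win_rate()) could be called every so often during the search to rebalance the last two shares. How often to recompute it, and whether the slide should be linear at all, is open for testing.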
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go