Re: [Computer-go] Utilizing multiple parameter sets/bias systems/bots

2015-04-27 Thread Leandro Marcolino
Hi, Marc,

You may want to take a look at my papers; they are kind of related to these
ideas (although I haven't yet changed the team on the fly, as you are
suggesting):

* http://teamcore.usc.edu/people/sorianom/aaai14.pdf
* http://teamcore.usc.edu/papers/2013/ijcai13.pdf
* http://teamcore.usc.edu/people/sorianom/AAMAS2011.pdf

Concerning changing the settings dynamically, my most recent work may help
(when using voting):

* http://teamcore.usc.edu/papers/2015/aamas2015.pdf

Let me know if you have questions/comments... :)

Best,
Leandro


[Computer-go] Utilizing multiple parameter sets/bias systems/bots

2015-04-26 Thread Marc Landgraf
Heya,
Lately I have been trying to think about whether it would be possible to
combine the strengths of different bots, or at least of different parameter
sets/bias systems for one bot, in some way. They may shine in different
situations/phases of the game, but how do you figure out which one is
currently the better one?

What I came up with is the following:
For simplicity, assume for now that our different bots use the same
playouts but different approaches during the tree phase. So maybe they use
different ways to bias nodes, different selection formulas, etc. I'll focus
on them using different bias systems.
Now you split your playouts into shares:
25%: Bot1 selects the white moves, Bot2 the black ones
25%: the other way around
x% (with 0 <= x <= 50): Bot1 selects for both colors.
(50-x)%: Bot2 selects for both colors.
You track the win rates of those first two quarters separately and from
them calculate the win rate of Bot1 vs Bot2.
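
A minimal sketch of that bookkeeping, assuming a hook in the tree phase
that asks which bot selects/biases the move for a given color (all names
here are illustrative, not from any particular engine):

    import random

    BOT1, BOT2 = "Bot1", "Bot2"

    # Fixed shares for the two mixed quarters; x splits the remaining 50%.
    MIXED_A, MIXED_B = 0.25, 0.25
    x = 0.25  # Bot1's share of the pure-play playouts, adapted on the fly

    # [wins, games] from Bot1's perspective in the two mixed quarters
    head_to_head = {"bot1_as_white": [0, 0], "bot1_as_black": [0, 0]}

    def pick_group(rng=random):
        """Assign a fresh playout to one of the four groups by its share."""
        r = rng.random()
        if r < MIXED_A:
            return "mixed_a"            # Bot1 selects White, Bot2 Black
        if r < MIXED_A + MIXED_B:
            return "mixed_b"            # the other way around
        if r < MIXED_A + MIXED_B + x:
            return "bot1_only"
        return "bot2_only"

    def selector_for(group, color):
        """Which bot's bias/selection handles `color` in this playout."""
        if group == "mixed_a":
            return BOT1 if color == "white" else BOT2
        if group == "mixed_b":
            return BOT2 if color == "white" else BOT1
        return BOT1 if group == "bot1_only" else BOT2

    def record_result(group, white_won):
        """Update the Bot1-vs-Bot2 tally from the mixed quarters only."""
        if group == "mixed_a":                    # Bot1 played White
            head_to_head["bot1_as_white"][0] += int(white_won)
            head_to_head["bot1_as_white"][1] += 1
        elif group == "mixed_b":                  # Bot1 played Black
            head_to_head["bot1_as_black"][0] += int(not white_won)
            head_to_head["bot1_as_black"][1] += 1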
Now if the bots are identical, obviously both should win 50% of these mixed
games. But if the bots are different, you may see different results. E.g.
if Bot1 wins 55% of its games, its move selection is probably better than
Bot2's. You have to be careful about wrong conclusions here, though: if you
set up a depth-first bot against a width-first one, you would certainly
also get win rates heavily in favor of the depth-first bot. But where this
could shine is with different bias systems, because there it actually tells
you which bias system is doing better in the current board situation.
Now you can use that knowledge to calculate x. E.g. if either bot wins 60%+
of the head-to-head games, it gains the full 50% of the remaining playouts,
and the balance slides linearly while the head-to-head win rate is between
40% and 60%. (Use whatever formula here; it is open for testing.)
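
One concrete instance of that linear slide, as a sketch (the 40%/60% clamp
points are the placeholders from above):

    def bot1_share(head_to_head_winrate):
        """Map Bot1's head-to-head win rate onto its share x of the
        remaining 50% of playouts: all of it at >= 60%, none at <= 40%,
        linear in between. One possible schedule among many."""
        if head_to_head_winrate >= 0.60:
            return 0.50
        if head_to_head_winrate <= 0.40:
            return 0.00
        return 0.50 * (head_to_head_winrate - 0.40) / 0.20

    # e.g. bot1_share(0.55) == 0.375: Bot1 gets 37.5% of all playouts
    # for pure self-play, Bot2 the remaining 12.5%.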

This should enable you to figure out on the fly, while doing playouts,
which bias system is doing the better job in the current situation. You are
just tracking some additional stats.

Of course, there are pros and cons to this method:
+ In general, switching the selection method in the tree should not cost
any time, and tracking those additional stats also costs close to no time.
The only additional time comes from using a second bias function or similar
(because you now have to calculate the bias twice for most nodes).
- At the same time, the amount of data actually increases, because you have
to track the stats for the different bots separately (see the node-stats
sketch after this list). This may cause memory issues! (But when you are
using a distributed-memory solution anyway, it creates no additional data
if each memory/thread unit is assigned to one of those playout shares.)
+- In general, the cost increase depends on how different the two bots are.
If they are the same, there is basically no cost.
+ It allows you to figure out which bot/bias/selection is doing better
right now.
- But it may lead to false conclusions, like the depth-vs-width example
mentioned above.
+- As long as both bots are of similar strength, you should not lose by
using this kind of system. The worst case is that you play the wrong bot's
move because of one of the false conclusions mentioned above. But when the
bots are close in strength, that is no worse than using just one bot all
the time. Of course, if one bot dominates the other in all situations, you
are losing quality by establishing again and again that this is the case
(because in 50% of the playouts, half of the moves were selected by the
worse bot).
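
To make the memory point concrete, here is one way to read it, as a
hypothetical per-node layout (field names are illustrative): each node
carries one counter pair per bot instead of a single pair, so the per-node
statistics roughly double for two bots, while the board state and child
pointers are still stored only once.

    from dataclasses import dataclass, field

    @dataclass
    class NodeStats:
        # One (visits, wins) slot per bot; with k bots the counters
        # grow k-fold, which is where the extra memory goes.
        visits: list = field(default_factory=lambda: [0, 0])
        wins: list = field(default_factory=lambda: [0, 0])

        def update(self, bot_index, won):
            """Credit a finished playout to the bot that selected here."""
            self.visits[bot_index] += 1
            self.wins[bot_index] += int(won)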



So here are some quick ideas for how it could be modified further:
- All percentages are obviously placeholders and could be adjusted (even
dynamically).
- Assuming you have a low-cost and a very high-cost bias function, you can
actually check whether it is worth using the high-cost bias function, or
whether the board situation is simple enough to churn out more playouts
using the cheaper function (see the sketch after this list).
- Identifying certain game situations: you may know that one bias does much
better in corner fights or liberty races, so if that one is dominating, you
are probably in such a situation and can adjust further (e.g. modify
playouts, add additional routines, whatever).
- Using more than 2 routines, possibly in conjunction with the idea above,
to identify situations.
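
For the cheap-versus-expensive bias idea, a hedged decision sketch (all
thresholds and names are placeholders, not from the post): treat the two
bias functions as the two "bots", and keep the expensive one only if it
clearly wins head to head; otherwise fall back to the cheap one and spend
the saved time on extra playouts.

    def choose_bias(expensive_winrate, games, min_games=200,
                    threshold=0.55):
        """Decide between a costly and a cheap bias function based on
        their head-to-head win rate; all numbers are placeholders."""
        if games < min_games:
            return "keep_testing"    # not enough evidence yet
        if expensive_winrate >= threshold:
            return "expensive_bias"  # quality beats quantity here
        return "cheap_bias"          # simple position: more playouts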

Of course, one could also think about how to expand this idea to using
different playouts, but I'm not sure how to judge which playouts are the
better ones when they differ. Just because Bot1's playouts tell me I'm
winning 60% does not mean that its playouts are better than Bot2's, which
only give me 40%. (Even though I would wish so :D) Win-rate estimates are
only comparable when they come from the same playout policy, so right now I
need identical playouts to judge my selection routine. Maybe someone else
has an idea of how to judge playouts?


What do you guys think? Is there any potential in these ideas? Or are the
cost and the danger of wrong conclusions too high for the possible gains?
Sadly I can't test this with my own bot, as my versions strictly dominate
each other, and at its current strength the results would probably not be
conclusive anyway.


Marc