Don Dailey wrote:
Tim Foden wrote:
Don Dailey wrote:
I suggest that we standardize on exactly 25,000 play-outs. 50,000 will
tax my spare computer, which I like to use for modest CGOS tests. If it is agreed, I will start a 25k test. My prediction is that this will finish around 1600 Elo on CGOS.
OK, I added Fluke to this (25k) test (twice), before I saw the later
comment about using 10k too.

It's looking like your drdGeneric 25k bot is currently around 1475 (147
games).

Fluke on the other hand looks to be settling at around 1300 (125
games).  I feel that I've probably got a problem in my
implementation!  :)  (I've felt this for some time actually -- UCT
never seemed to work well for me at all.)

Details of Fluke's UCT + Random playouts.

1. UCT constant, c = 0.25.  e.g. UCB value = averageScore + c *
sqrt(log(n)/m).
2. New children are created once a node is visited 1 time (URd) or 2
times (UR2).
3. Eye rule for random playouts:
  * Solid eyes (all 4 from same group).
  * False non-solid eyes (at least 50% of corners are of opposite
colour).
4. Choosing legal moves for playouts:  1st probe is random, then scan.

Is there anything else that's likely to be significant here?
 1.  My UCT constant is 1.0  - my formula is  averageScore + c * sqrt(
(2.0 * log(n)) / (10.0 * m) );
That corresponds to a constant of about 0.447 in my formula. I've started some Flukes with this constant now.
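The conversion between the two formulas can be checked with a few lines of Python. This is a sketch under the assumption that n and m mean the same thing in both formulas; the function names are invented for the example.

```python
import math

def don_term(c, n, m):
    # Exploration term as quoted: c * sqrt((2.0 * log(n)) / (10.0 * m))
    return c * math.sqrt((2.0 * math.log(n)) / (10.0 * m))

def fluke_term(c, n, m):
    # Exploration term in the Fluke form: c * sqrt(log(n) / m)
    return c * math.sqrt(math.log(n) / m)

def equivalent_fluke_constant(c_don):
    # The factor 2/10 under the square root can be pulled out as sqrt(0.2).
    return c_don * math.sqrt(2.0 / 10.0)

c = equivalent_fluke_constant(1.0)
print(round(c, 3))  # 0.447
```

The two terms then agree for any n and m, e.g. don_term(1.0, 1000, 37) and fluke_term(c, 1000, 37) give the same value up to rounding.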

This discussion about constants made me go and check my maths for the 0.25 versions I'd submitted... and I found I'd made a stupid elementary maths mistake. I'd somehow decided that sqrt(1/2) == 0.25... doh! Of course it's more like 0.707.

I've now corrected the 0.25 versions so they really are running with 0.25 (i.e. sqrt(1/16)), and I've added one with 0.71 too.

 2.  New children are created when parent exceeds 100 visits.
OK. Mine is using the child visit counts rather than the parent visit counts. I've now got versions which expand the child nodes after 1, 2, 4 and 8 visits. I've only just added the 8 version, and it's still pretty early to tell, but, ATM, the 4 looks better than the 2 which looks better than the 1.

 3.  I think the eye rule is the same (you state it differently, but I
believe it's the same.)
OK :)

4. playouts are truly uniform random - yours are not.
I think point 4 could be significant but I can't be sure.
I'm not sure it's significant either. Something else for me to test later on methinks.
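To make point 4 concrete, here is a rough sketch of why "1st probe is random, then scan" is not uniform. It is not Fluke's actual code: it assumes the scan runs forward from the probe and wraps around the point list, and all names are hypothetical.

```python
import random
from collections import Counter

def probe_then_scan(points, is_legal, rng=random):
    """Probe a random index, then scan forward (wrapping) to the first legal point."""
    n = len(points)
    if n == 0:
        return None
    start = rng.randrange(n)
    for offset in range(n):
        p = points[(start + offset) % n]
        if is_legal(p):
            return p
    return None  # no legal point found

def uniform_choice(points, is_legal, rng=random):
    """Pick uniformly among the legal points only."""
    legal = [p for p in points if is_legal(p)]
    return rng.choice(legal) if legal else None

def probe_then_scan_distribution(points, is_legal):
    """Exact selection probabilities of probe_then_scan, by enumerating every start."""
    n = len(points)
    counts = Counter()
    for start in range(n):
        for offset in range(n):
            p = points[(start + offset) % n]
            if is_legal(p):
                counts[p] += 1
                break
    return {p: k / n for p, k in counts.items()}
```

For example, with ten points of which only 0 and 1 are legal, probe_then_scan_distribution(list(range(10)), lambda p: p in (0, 1)) gives point 0 a probability of 0.9 and point 1 only 0.1, because a legal point that follows a run of illegal points inherits all of their probes. uniform_choice would give each 0.5.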

Cheers, Tim.
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
