[Computer-go] CNN for winrate and territory

Detlef Schmicker Sun, 08 Feb 2015 02:22:53 -0800

Hi,

I am working on a CNN for winrate and territory:


Approach:
 - Input: 2 layers, for black and white stones

 - Output 1: 1 layer of territory (0.0 for owned by white, 1.0 for owned by black; I used SIGMOID because I missed TANH in the first place)

 - Output 2: a label from -60 to +60 for the territory lead of black

The losses of both outputs are trained.
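A minimal sketch of how such inputs and margin labels could be encoded, in plain Python. The board representation ('b', 'w', '.'), the helper names, and the clipping of margins outside -60..+60 are assumptions for illustration, not taken from my actual training code:

```python
# Sketch only: board representation and helper names are hypothetical.
MIN_MARGIN, MAX_MARGIN = -60, 60            # label range from the post
NUM_CLASSES = MAX_MARGIN - MIN_MARGIN + 1   # 121 margin classes

def encode_input(board):
    """Turn a board (list of rows of 'b', 'w', '.') into 2 planes.

    Returns [black_plane, white_plane], each a 2D list of 0.0/1.0.
    """
    black = [[1.0 if p == 'b' else 0.0 for p in row] for row in board]
    white = [[1.0 if p == 'w' else 0.0 for p in row] for row in board]
    return [black, white]

def margin_to_label(margin):
    """Map black's territory lead to a class index in 0..120.

    Clipping leads beyond -60/+60 into the end classes is an assumption.
    """
    clipped = max(MIN_MARGIN, min(MAX_MARGIN, int(round(margin))))
    return clipped - MIN_MARGIN
```

With this indexing, a lead of 0 maps to class 60 and a lead of +7 for black maps to class 67.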

The idea is that this way I do not have to put komi into the input, and I can derive the winrate from the statistics of the trained label:

e.g. for komi 6.5, I sum the probabilities from +7 to +60 and get something like a winrate.
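The summation can be sketched like this, assuming the network outputs a list of 121 probabilities where index i stands for a lead of -60 + i points for black. The function name and the list layout are assumptions:

```python
MIN_MARGIN = -60  # lowest margin label from the post

def winrate_for_black(probs, komi):
    """Sum the probabilities of all margins that beat the given komi.

    probs[i] is assumed to be P(lead of black == MIN_MARGIN + i).
    For komi 6.5 this sums the classes for +7 up to +60.
    """
    return sum(p for i, p in enumerate(probs) if MIN_MARGIN + i > komi)
```

The same output distribution can be evaluated for any komi without retraining, e.g. `winrate_for_black(probs, 6.5)` and `winrate_for_black(probs, 0.5)`.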

I trained with 800000 positions, with territory information from 500 playouts of oakfoam, which I symmetrized by the 8 transformations, leading to >6000000 positions. (It is expensive to produce the positions due to the playouts...)
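The 8 transformations are the 4 rotations of the board, each with and without a mirror. A minimal sketch for one square plane stored as a 2D list; the function names are hypothetical:

```python
def rotate90(plane):
    """Rotate a square 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*plane[::-1])]

def mirror(plane):
    """Flip a 2D list left to right."""
    return [row[::-1] for row in plane]

def eight_symmetries(plane):
    """Return the 8 symmetric versions: 4 rotations, each plain and mirrored."""
    out = []
    current = [list(row) for row in plane]
    for _ in range(4):
        out.append(current)
        out.append(mirror(current))
        current = rotate90(current)
    return out
```

The same transformation has to be applied to both input planes and to the territory target, while the margin label stays unchanged.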

The layers are the same as the large network from Christopher Clark <http://arxiv.org/find/cs/1/au:+Clark_C/0/1/0/all/0/1> and Amos Storkey <http://arxiv.org/find/cs/1/au:+Storkey_A/0/1/0/all/0/1>: http://arxiv.org/abs/1412.3409

I get reasonable territory predictions from this network (compared to 500 playouts of oakfoam), but the winrates seem to be overestimated. Anyway, it looks like it is worth doing some more work on it.

The idea is that, at some point, I can do the equivalent of, let's say, 1000 playouts with a call to the CNN for the cost of 2 playouts...

Now I am trying to do a soft turnover from conventional playouts to CNN-predicted winrates within the framework of MC.


I do have some ideas, but I am not happy with them.

Maybe you have better ones :)


Thanks a lot

Detlef
