Thanks,

I will try re-rolling out these positions. Do you have any experience with
doing good rollouts of race positions? Which rollout settings would you
recommend for them?

-Øystein



On Sun, Jun 9, 2019 at 11:38 PM Philippe Michel <[email protected]>
wrote:

> On Fri, Jun 07, 2019 at 08:30:10PM +0200, Øystein Schønning-Johansen wrote:
>
> > (Of course I remove any position duplicated in the two datasets, such that
> > the training and validation sets are disjoint.)
>
> Is it really important (in general)? I know one shouldn't use the same
> dataset, but is some limited random overlap really an issue? I didn't
> verify how limited it is in the case of gnubg's databases, though...
>
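The de-duplication described above, together with a way to measure how large
the overlap actually is, could be sketched roughly as follows. This is only an
illustration: the record layout (a position ID string paired with its rollout
targets) and both function names are assumptions, not the format of gnubg's
databases.

```python
def overlap_fraction(train, validation):
    """Fraction of distinct validation positions that also occur in train.

    Both arguments are lists of (position_id, targets) tuples; this layout
    is a hypothetical one chosen for the example.
    """
    train_ids = {pid for pid, _ in train}
    validation_ids = {pid for pid, _ in validation}
    if not validation_ids:
        return 0.0
    return len(train_ids & validation_ids) / len(validation_ids)


def make_disjoint(train, validation):
    """Return train without any position that appears in validation."""
    validation_ids = {pid for pid, _ in validation}
    return [(pid, targets) for pid, targets in train
            if pid not in validation_ids]


if __name__ == "__main__":
    train = [("A", 0.10), ("B", 0.20), ("C", 0.30)]
    validation = [("B", 0.25), ("D", 0.40)]
    print(overlap_fraction(train, validation))
    print(make_disjoint(train, validation))
```

Checking the overlap fraction first shows whether the duplicates are numerous
enough to matter before anything is thrown away.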
> > I train a neural network. If I validate the training with a 10% fraction of
> > the training dataset itself, I get an MSE of about 1.0e-04. But if I
> > validate against the dataset generated from train.bm-1.00.bz2 I get an MSE
> > of 7e-04. About 7 times higher!
> >
> > This makes me believe that the rolled out positions in the race-train-data
> > file are rolled out in another way (different tool, different settings,
> > different neural net?) than the positions in train.bm-1.00.bz2.
>
> Different tool and different neural net.
>
> For the benchmark databases, the rollout settings are recorded as a comment
> at the beginning of the file:
>
> s version 1.93 weights 1.00 moves2plyLimit 20 rolloutLimit 5 nRollOutGames
> 1296 cubeAway 7 include0Ply 1 evalPlies 2 shortCuts 1 osrGames 1296
> osrInRoll 1
>
> This is version 1.93 of the sagnubg tool, using the 1.00 weights file
> (the current one). I rerolled the benchmark databases with it after the
> new weights file was generated.
>
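To compare the settings of two benchmark files, a header line like the one
quoted above can be split into key/value pairs. This is a minimal sketch that
assumes the layout seen in that single example (one leading tag token, here
"s", followed by alternating names and values); the function names are made up
for the illustration and are not part of sagnubg.

```python
def parse_rollout_header(line):
    """Split a header line into (tag, settings).

    Assumes one leading tag token followed by alternating key/value tokens,
    as in the quoted example. Values are kept as strings.
    """
    tag, *rest = line.split()
    if len(rest) % 2 != 0:
        raise ValueError("expected alternating key/value tokens")
    return tag, dict(zip(rest[0::2], rest[1::2]))


def differing_settings(a, b):
    """Return the keys whose values differ between two settings dicts."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


if __name__ == "__main__":
    header = ("s version 1.93 weights 1.00 moves2plyLimit 20 rolloutLimit 5 "
              "nRollOutGames 1296 cubeAway 7 include0Ply 1 evalPlies 2 "
              "shortCuts 1 osrGames 1296 osrInRoll 1")
    tag, settings = parse_rollout_header(header)
    print(tag, settings["weights"], settings["nRollOutGames"])
```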
> The training database was rolled out with a slightly modified gnubg
> (merely to have gnubg -t print the rollout results in the right format).
>
> This was done with earlier weights. I didn't keep notes but I think I
> used one intermediate weights set for the race and possibly more than
> one for the crashed net (roll out the training database with the 0.90
> net, train a new net, reroll the training database with it, etc...). For
> the contact net I'm not sure.
>
> In any case, this was with different weights than the current benchmark
> database.
>
> > Joseph? Philippe? Ian? Others? Do you know how these data were generated?
> > Is it maybe worth rolling these positions out again? I do remember that
> > Joseph made a separate rollout tool, but I'm not sure what Philippe did?
>
> It is likely the different errors you got have another cause: as far as
> I can see, the sagnubg tool used for creating the benchmark databases
> doesn't use variance reduction.
>
> That should be enough of a reason to seriously consider rerolling them,
> but we would have to implement variance reduction in sagnubg first or
> use gnubg with some substantial pre- and post-processing.
>
> > (I also remember that the original benchmark was move-based, and it
> > calculates the loss based on incorrect moves picked, and that it might not
> > be that interesting if the rollout values are a bit wrong....)
>
> I'm afraid they may not be just a bit wrong. It seems the standard
> deviation of a 1296-trial rollout without variance reduction is larger
> than the vast majority of the "errors" found when running the benchmark.
>
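For a rough sense of scale, the noise of a rollout without variance reduction
can be estimated for a single output such as the win probability. The sketch
below assumes plain Monte Carlo: independent games, each a win or a loss, with
no variance reduction and no stratification of the dice. It is not a model of
what sagnubg actually does, and the function names are made up for the
example.

```python
import math


def win_prob_standard_error(p, n_games):
    """Standard error of a win probability estimated from n_games
    independent games (binomial model, an assumption of this sketch)."""
    return math.sqrt(p * (1.0 - p) / n_games)


def noise_variance(p, n_games):
    """Variance of that estimate, i.e. the expected squared error that the
    rollout noise alone contributes when the estimate is used as a target."""
    return p * (1.0 - p) / n_games


if __name__ == "__main__":
    n = 1296  # sqrt(1296) = 36
    for p in (0.5, 0.8, 0.95):
        print(p, win_prob_standard_error(p, n), noise_variance(p, n))
```

With p = 0.5 and 1296 games this gives a standard error of 0.5/36, about
0.0139, and a noise variance of about 1.9e-04. Under these assumptions the
noise in the targets alone can be of the same order as the MSE figures quoted
above.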
_______________________________________________
Bug-gnubg mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-gnubg
