Seems like you got a very slightly better race net, but I would be surprised if it made a difference in real life.
Would be much more interesting to:
- get a better contact or crashed net
- expand the roll-out database for all categories (should be easy with the current availability of cycles)
- improve cube decisions (this is a hard one)
- improve back game evaluation and play (a very hard one)

-Joseph

On 5 January 2012 12:40, Philippe Michel <[email protected]> wrote:
> I have just tried to use the gnubg-nn tools to train nets and I have some
> questions for Joseph or others who may have some experience with these.
>
> I started from the existing weights file (nngnubg.weights, 0.17-c5-123)
> and with the race net:
>
> % ./train.py -v $DATA/training_data/race-train-data race
> reading training data file
>
> cycle 0 : checking
> eqerr 0.00857, Max 0.11473 (0:01 210360)
> creating race.save.86-115
> cycle 1: training (20.000) 251862 positions in 0:02
>
> and about a day later I was at 15000 cycles and stopped there:
>
> cycle 15000 : checking
> eqerr 0.00791, Max 0.11625 (0:01 207806)
> cycle 15001: training (5.649) 251862 positions in 0:02
>
> cycle 15001 : checking
> eqerr 0.00785, Max 0.11597 (0:01 207807)
> ^Ccycle 15002: training (5.649)
> Traceback (most recent call last):
>   File "./train.py", line 210, in <module>
>     trainer.train(alpha, order)
> KeyboardInterrupt
>
> First, how much is 15000 cycles? There are counters in the weights file,
> but at these values (15000 cycles * 251862 positions) 32-bit counters are
> close to wrapping around, so I'm not sure whether they are meaningful.
>
> For some time, there were multiple intermediate weights files saved that
> were apparently best for various combinations of average and maximum error.
> But when I stopped there was only one. Is this especially favourable (a
> "dominant" net best for all cases), or not particularly significant because
> what's best will be determined by the benchmark data, not the training data?
>
> Is it good practice to let the training run for a long time, or should it
> be stopped relatively frequently and restarted from the most promising
> intermediate weights?
>
> The benchmarks for the original weights and the result of the training were:
>
> % ./perr.py -W $DATA/nets/nngnubg.weights $DATA/benchmarks/race.bm
> 14388 Non interesting, 96620 considered for moves.
> 0p errors 16847 of 96620 avg 0.000588801601517
> n-out ( 488 ) 0.51%
> 7478 errors of 119067
> cube errors interesting 5688 of 105573
> me 3.72024388105e-05 eq 3.79352135977e-06
> cube errors non interesting 1790 of 13494
> me 0.000114145065754 eq 0.0
>
> % ./perr.py -W ../train/race.save.73-100 $DATA/benchmarks/race.bm
> 14388 Non interesting, 96620 considered for moves.
> 0p errors 16483 of 96620 avg 0.000581237898433
> n-out ( 633 ) 0.66%
> 7500 errors of 119067
> cube errors interesting 5710 of 105573
> me 3.58447342447e-05 eq 3.68966502016e-06
> cube errors non interesting 1790 of 13494
> me 8.47618947172e-05 eq 0.0
>
> How should one interpret this? It looks like the new weights made 364 fewer
> checker play errors and 22 more cube errors out of about 100000 in each case,
> but the cost of those errors was a few % lower in both cases. Is this right?
> And are these differences possibly significant, or rather just noise?
>
> What can be called a worthwhile improvement? Joseph's page about the racing
> net mentions an error rate halved between an old and a new net, but says it
> was an especially successful step. What were the improvements between
> generations of the contact net, for instance?
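
The two numeric questions above (the counter wrap and the size of the benchmark
differences) can be sanity-checked with a few lines of Python. This is a
standalone back-of-the-envelope sketch; every figure is copied from the
perr.py output quoted above, and nothing here uses gnubg-nn itself:

#!/usr/bin/env python
# Back-of-the-envelope checks; all figures are copied from the
# perr.py output quoted above.

CYCLES = 15000
POSITIONS_PER_CYCLE = 251862

# 1. How close is a 32-bit position counter to wrapping?
seen = CYCLES * POSITIONS_PER_CYCLE
print("positions seen: %d (%.0f%% of 2^32)" % (seen, 100.0 * seen / 2**32))
# -> 3777930000, about 88% of 2^32: not wrapped yet, but close.

# 2. Relative changes between the old and the new weights.
old = {"checker errors": 16847, "checker avg cost": 0.000588801601517,
       "cube errors": 7478, "cube mean cost": 3.72024388105e-05}
new = {"checker errors": 16483, "checker avg cost": 0.000581237898433,
       "cube errors": 7500, "cube mean cost": 3.58447342447e-05}

for key in sorted(old):
    print("%-17s %+6.2f%%" % (key, 100.0 * (new[key] - old[key]) / old[key]))
# checker errors down ~2.2% and their average cost down ~1.3%;
# cube errors up ~0.3%, but their mean cost down ~3.7%.

On these figures the "a few % lower" reading of the costs is right; whether a
shift of one to four percent is signal or noise is a separate question that the
output alone does not settle.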
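On the stop-and-restart question, one low-effort option is to keep the
intermediate race.save.* checkpoints and rank them by benchmark error rather
than training error. A minimal sketch follows; it assumes only the perr.py
command line and the "0p errors ... avg ..." output format quoted above, and
that the checkpoints sit in ../train/ as in the transcript. The script itself
is hypothetical, not part of gnubg-nn:

#!/usr/bin/env python
# Hypothetical helper: score every intermediate weights file against
# the race benchmark and list them best-first.
import glob, os, re, subprocess

BENCHMARK = os.path.expandvars("$DATA/benchmarks/race.bm")

def benchmark_avg_error(weights):
    out = subprocess.check_output(["./perr.py", "-W", weights, BENCHMARK])
    # Parse e.g. "0p errors 16483 of 96620 avg 0.000581237898433"
    m = re.search(r"0p errors \d+ of \d+ avg (\S+)", out.decode())
    return float(m.group(1))

scores = [(benchmark_avg_error(w), w) for w in glob.glob("../train/race.save.*")]
for err, weights in sorted(scores):
    print("%.12f  %s" % (err, weights))
# Restart training from the file on the first line.

This only automates picking the restart point against the benchmark instead of
the training set, which matches the intuition above that what's best will be
determined by benchmark data, not training data.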
