Albert Silver wrote on August 2006 20:10
> You see, while scouring the archives, I saw from a discussion
> from not long ago, that Ian Shaw had managed to get the plain
> TD training working, though no one (at least I saw nothing in
> the archives) commented on the effectiveness of his attempts.
> While looking at the CLI version, I saw two functions linked
> to 'train':
>
> Train database - Train the network from a database of positions
> Train td - Train the network from TD(0) zero-knowledge self-play
>

The train db function runs from the command line. I've set it running to see what happens. Does anyone know how many GHz-days it needs to run to have an effect? I can send the resulting weights file to someone to benchmark. Better still, I will run the benchmark myself if someone can help me set it up.

Like Øystein, I'm suspicious of the procedure. It reports that the contact and crashed networks have been trained on the same number of positions, which implies that both networks' weights are being adjusted during TD training, irrespective of whether positions are contact or crashed.

Frank Berger mentioned that BgBlitz had only received TD training, which indicates that it is possible to reach an expert standard of play using TD training alone.

As I understand it, gnubg's training went roughly like this:

1. TD training to intermediate level (FIBS 1650).
2. Supervised training on rolled-out positions to get to advanced level (FIBS 1800). Positions included some where gnubg was known to err, and others chosen at random.
3. Supervised training on positions where the 0-ply evaluation differed from the 2-ply evaluation. Gnubg achieved expert level (FIBS 1930).

Since gnubg is now beyond the plateau reached by TD training, I wondered if a new bout of TD training on top of the supervised training might be beneficial. Øystein and Joseph, are you saying that you have already tried this, to no avail?

-- Ian

_______________________________________________
Bug-gnubg mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnubg
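To make the concern about the contact and crashed networks concrete, here is a minimal Python sketch of what I would expect a TD(0) update with per-class routing to look like. It is not gnubg's code: `LinearNet`, `td0_update` and `train_on_game` are hypothetical names, the "network" is a toy single-layer sigmoid unit, and all values are assumed to be from one fixed player's point of view. The point is only that each position should update the one network matching its class, so the per-network position counts would normally differ.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LinearNet:
    """Toy stand-in for one evaluation network (hypothetical, not gnubg's)."""

    def __init__(self, n_inputs):
        self.w = [0.0] * n_inputs
        self.positions_trained = 0

    def value(self, x):
        # Estimated win probability for the given feature vector.
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)))

    def td0_update(self, x, target, alpha=0.1):
        # TD(0): nudge V(x) towards the target by gradient descent
        # on the squared error; v * (1 - v) is the sigmoid derivative.
        v = self.value(x)
        delta = target - v
        grad = v * (1.0 - v)
        for i, xi in enumerate(x):
            self.w[i] += alpha * delta * grad * xi
        self.positions_trained += 1


def train_on_game(nets, game, final_outcome, alpha=0.1):
    """Run one TD(0) pass over a game.

    nets: dict mapping a position class name to its LinearNet.
    game: list of (position_class, features) in play order.
    final_outcome: 1.0 for a win, 0.0 for a loss (assumed encoding).
    """
    for t, (cls, x) in enumerate(game):
        if t + 1 < len(game):
            # Target is the value of the next position, taken from
            # whichever network is responsible for that position.
            next_cls, next_x = game[t + 1]
            target = nets[next_cls].value(next_x)
        else:
            target = final_outcome
        # Only the network matching this position's class is updated.
        nets[cls].td0_update(x, target, alpha)
```

With routing like this, a game with two contact positions and one crashed position leaves `positions_trained` at 2 and 1 respectively. Identical counts for both networks after a long run would be consistent with every position updating both networks, which is what I suspect is happening.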
