I had a lot of fun training transformers on some toy data, and now I am thinking about what to do next. I was surprised how fast even small transformers learn long multiplication.
Does anyone know if someone has trained a GPT on set.mm yet? set.mm is only ~40 MB, which I guess is not enough to avoid overfitting very quickly. Has anyone produced procedurally generated Metamath code? I believe a well-trained model could be a great help as some kind of Metamath copilot. Have a nice day :D
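
P.S. To make the procedural-generation idea a bit more concrete, here is a rough sketch of the kind of thing I mean (pure Python, my own toy example, not an existing generator; the variable names and connectives follow set.mm's wff conventions, everything else is made up). It emits random well-formed propositional formulas that could serve as synthetic training text:

    import random

    VARS = ["ph", "ps", "ch"]             # wff variable names used in set.mm
    BINOPS = ["->", "<->", "/\\", "\\/"]  # implication, biconditional, and, or

    def random_wff(depth):
        """Build one random well-formed formula as a space-separated token string."""
        if depth == 0 or random.random() < 0.3:
            return random.choice(VARS)            # base case: a bare variable
        if random.random() < 0.2:
            return "-. " + random_wff(depth - 1)  # negation
        op = random.choice(BINOPS)
        return "( " + random_wff(depth - 1) + " " + op + " " + random_wff(depth - 1) + " )"

    if __name__ == "__main__":
        for _ in range(5):
            print(random_wff(3))

Of course, real training data would need the surrounding $a/$p statements and proofs, not just bare formulas, but generating those from a small axiom set seems doable too.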
