> On Feb 14, 2023, at 1:46 PM, Jon P <[email protected]> wrote:
>
> I think that's a great idea David! Seems like very low downside and it
> would be interesting to see if language models could train on the datasets.
>
> There's already an option to Customise GPT3 with additional datasets so
> once this is ready it could be an interesting experiment to try.
>
> https://openai.com/blog/customized-gpt-3/
Thanks so much for that link. So I looked into creating a specially-trained
model to create metamath proofs. That page points to this pricing model for
GPT-3: https://openai.com/api/pricing/#faq-token

I did a quick estimate of training costs for the top GPT-3 model on all of
set.mm. It comes out to about $2600. That's less than I expected, and it's
easily affordable in a grant (you could train multiple times), but it's not
an amount I have pocket change for :-(. Inference isn't expensive; trying to
generate axtgcgrrflx is $0.15 (15 cents). See below for how I created the
quick estimate.

We could do a subset of set.mm (e.g., skip mathboxes, etc.), though fewer
training examples would typically produce less-good results.

I can't possibly afford that myself. That said, there are alternatives.
There are many other Generative Pre-trained Transformer (GPT) models that
are less capable but also less expensive. My current personal computer is
useless for ML training, but Kaggle.com provides a nice Jupyter notebook
interface for AI/ML work that's free to use. Here are some pages that
discuss using GPT-2 and Kaggle (see the rough fine-tuning sketch at the end
of this message):

https://www.kaggle.com/code/changyeop/how-to-fine-tune-gpt-2-for-beginners
https://www.kaggle.com/code/nulldata/fine-tuning-gpt-2-to-generate-netlfix-descriptions/notebook
https://www.kaggle.com/general/116387

I had also been looking at adding definition checks to metamath-knife, and
maybe trying to help this new JavaScript metamath proof assistant. Sigh, not
enough time in the day :-).

--- David A. Wheeler

=== Cost estimate ===

Here's a cost estimate using https://openai.com/api/pricing/

OpenAI's most powerful GPT-3 model is Davinci. Their charges are:
* Fine-tuned training: $0.0300 / 1K tokens
* Usage: $0.1200 / 1K tokens

I did a quick estimate of tokens in the entirety of set.mm using:

metamath 'set scroll continuous' 'set width 9999' 'read set.mm' \
  'show statement *' 'show proof * /lemmon /renumber' 'quit' | \
  grep -v -- '-------' | wc

That yielded 64218080 words. A token is about 4 characters or 0.75 words,
so for a quick estimate, tokens = 64218080/0.75 = 85624107 tokens.

Fine-tuned training on this entire set = (85624107/1000)*$0.03 = $2569.

Theorem axtgcgrrflx: 942 words, tokens = 942/0.75 = 1256,
cost = (1256/1000)*$0.12 = $0.15.
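If you want to fiddle with the numbers, here is the same arithmetic as a
small Python snippet. Nothing in it goes beyond the figures above: the word
count comes from the metamath|wc command, 0.75 words per token is OpenAI's
rule of thumb, and the prices are the Davinci prices quoted above.

# Same cost arithmetic as above, in Python so the numbers are easy to tweak.
WORDS_SETMM = 64_218_080          # from the metamath | wc run above
WORDS_PER_TOKEN = 0.75            # OpenAI's rough rule of thumb
TRAIN_PER_TOKEN = 0.03 / 1000     # $ per token, Davinci fine-tuned training
USAGE_PER_TOKEN = 0.12 / 1000     # $ per token, Davinci usage (inference)

setmm_tokens = WORDS_SETMM / WORDS_PER_TOKEN       # ~85,624,107 tokens
print(f"training on all of set.mm: ~${setmm_tokens * TRAIN_PER_TOKEN:,.0f}")  # ~$2,569

theorem_tokens = 942 / WORDS_PER_TOKEN             # axtgcgrrflx, ~1,256 tokens
print(f"generating axtgcgrrflx: ~${theorem_tokens * USAGE_PER_TOKEN:.2f}")    # ~$0.15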

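=== GPT-2 on Kaggle (rough sketch) ===

For anyone who wants to try the free route mentioned above, here is a
rough, untested sketch of what fine-tuning GPT-2 in a Kaggle notebook might
look like using the Hugging Face transformers library. The input file name
(setmm_proofs.txt), the base model size, and all the training settings are
placeholders I made up, not anything I've run.

# Rough, untested sketch: fine-tune GPT-2 on a plain-text dump of set.mm
# (one statement/proof per line). File name and hyperparameters are
# placeholders, not tested values.
from datasets import load_dataset
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token   # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Plain-text training data, one example per line (assumed format).
dataset = load_dataset("text", data_files={"train": "setmm_proofs.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-setmm",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

A small GPT-2 won't be anywhere near Davinci in capability, but it would at
least tell us whether the training data format is workable before spending
grant money.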