Hi Johnathan,

I read why you didn't include Metamath; still, to help me understand your 
paper, let me restate it in the Metamath "environment":

- to keep things simple, assume we have a set.mm with all proofs in normal 
mode (no compression)

- assume you use symbols + labels + Metamath reserved keywords (in set.mm) 
as your tokens (no BPE applied)

- assume you obtain a transformer T from your genetic optimizer (over 
hyperparameters) at generation i

- then, do you fit T's parameters on (say) 90% of the proofs in set.mm?

- then, do you measure "learnability" by how well T predicts the next 
token on the 10% of proofs held out of sample?

- and would that mean either the next token in a wff or the next label in 
a proof?

Is this roughly how it would go in the Metamath framework?
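
To make my question concrete, here is a minimal Python sketch of the 
pipeline I have in mind (tokenize, split_proofs, next_token_accuracy and 
the model.predict_next API are all hypothetical names of mine, not 
anything from your paper):

import random

def tokenize(proof_text: str) -> list[str]:
    # Metamath source is whitespace-delimited, so with no BPE the
    # tokenizer is trivial: symbols, labels and reserved keywords
    # ($p, $=, $., ...) each become a single token.
    return proof_text.split()

def split_proofs(proofs: list[list[str]], train_frac: float = 0.9):
    # Fit T's parameters on 90% of the proofs, hold out the rest.
    shuffled = proofs[:]
    random.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

def next_token_accuracy(model, held_out: list[list[str]]) -> float:
    # "Learnability" as I understand it: for every prefix of every
    # held-out proof, ask T for the next token and count exact matches.
    hits = total = 0
    for proof in held_out:
        for i in range(1, len(proof)):
            hits += model.predict_next(proof[:i]) == proof[i]
            total += 1
    return hits / total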

I'm trying to look at it from a proof-assistant standpoint, where maybe 
the most intriguing part would be to "generate" the next missing label in 
a proof (where "next" could be any one of the missing labels).
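
Continuing the sketch above (again, hypothetical names; I also assume only 
the prefix is used as context, though conditioning on both sides of the 
gap would make sense too):

def fill_one_missing_label(model, proof: list[str], missing: list[int]):
    # "next" need not mean left-to-right: pick any one of the missing
    # label positions and ask T for a label given the proof so far.
    pos = random.choice(missing)
    return pos, model.predict_next(proof[:pos])  # hypothetical API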


Thanks in advance for your help
Glauco
 
