Dear folks, I'm working on some Math-rich Natural Language Processing research, using Archive of Formal Proofs and Isabelle as the source of training data.

## Advertising

I can imagine I might have to change Isabelle code in order to solve my problems, so I decided to post in isabelle-dev instead of isabelle-user, sorry if it's the wrong place. Here are my goals. Start from a toy problem. 1. We have 90k lemmas in 300 articles, I'm looking for a command line which can extract all lemmas and do some very basic processing. Assume the lemma looks like this: ``` lemma "exec (comp a) s stk_option = (case stk_option of None ⇒ None | Some stk ⇒ Some (aval a s # stk))" ``` The expected result is like this (with no line break - gmail might add unexpected line break): ``` exec (comp a) s stk_option = (case stk_option of None ⇒ None | Some (stk::int list) ⇒ Some (aval a s # stk)) ``` I know it is trivial to do that in Isabelle JEdit GUI, but how to do that in command line? I tried several `isabelle build` arguments but with no luck yet. 2. Similar to the previous one, but I also want to show the result of `declare [[show_brackets]]` `declare [[show_types]]` and `declare [[show_consts]]`. Not sure if it is trivial given the previous goal is reached. Goal: ``` ((exec (comp a) s stk_option) = (case stk_option of (None ⇒ None) | ((Some stk) ⇒ (Some ((aval a s) # stk))))) constants: Pure.prop :: (prop ⇒ prop) aval :: (aexp ⇒ ((char list ⇒ int) ⇒ int)) op # :: (int ⇒ (int list ⇒ int list)) Some :: (int list ⇒ int list option) None :: int list option case_option :: (int list option ⇒ ((int list ⇒ int list option) ⇒ (int list option ⇒ int list option))) comp :: (aexp ⇒ instr list) exec :: (instr list ⇒ ((char list ⇒ int) ⇒ (int list option ⇒ int list option))) op = :: (int list option ⇒ (int list option ⇒ bool)) Trueprop :: (bool ⇒ prop) op ⟹ :: (prop ⇒ (prop ⇒ prop)) variables: stk_option :: int list option s :: (char list ⇒ int) a :: aexp ``` 3. What if I want to embed all the type signatures into the above formula? Like: ``` ((exec :: (instr list ...) (comp :: (aexp ... ) a :: aexp) s :: (char list => int) stk_option :: int list option) = .... # stk :: (int list)))))) ``` I know the question I ask sounds weird to most people, basically I'm trying to create some dataset to train a Deep Learning model on Mathematics Part-of-Speech task. If anyone is interested I'm happy to explain more details. Thanks in advance! [1] http://prodg.org/talks/mnlp_billion_token_corpora -- Regards, Qian Hong

_______________________________________________ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev