There have been a couple of AI systems, gpt-f <https://arxiv.org/abs/2009.03393> and Holophrasm <https://arxiv.org/abs/1608.02644>, built on metamath whose aim is to generate proofs. Are there more I’m not aware of?
If I have understood correctly I think how the search process works is as follows: You have some set of hypotheses and a proposition you want to prove. You apply a series of already proven theorems, which are substitution rules, to the proposition until the resulting statements either match the hypotheses or match theorems already proven. The question “Is there a proof of any length which proves this proposition” is in the same complexity class as the halting problem I think. Whereas “Is there a proof of length less than L steps which proves the proposition” is NP-complete, so is easier to tackle I think. Trying to find such a chain of substitution rules is essentially a tree search problem. If there are N theorems in the database then to answer the latter question comes down to searching a tree of size N^L. Is that somewhat right? So does this mean, in the abstract, given enough hardware and time you could find all proofs of length less than L just by brute force search? I am also interested in what the true branching factor is. For instance not all N theorems in the database can be applied to all hypotheses, only a subset can be. How much is the reduction? Can most theorems be applied to most statements or is it that only a very small subset can? Moreover I guess there is a reduction because it is sufficient to find only one proof whereas there will be many for any provable statement. Is it true that any AI system built on metamath will need these same foundations? For instance it will need a parser for the database, a verifier and also it will need a function which: for any statement returns the possible substitutions/theorems which could be applied. Is there any value in us (I don’t know if anyone would be interested in working on this) gathering some programs which already do this or writing some new code in a nice optimised way? For example in MineRL <https://minerl.io/docs/tutorials/first_agent.html#creating-an-environment>, a minecraft AI challenge, they provide an environment which offers the agent the state of the game and a list of options available to it. Maybe some sort of environment like this for metamath would help encourage people to tackle the AI side of these problems? Another question I have is how much can be done with optimised code, or multithreading, or gpu parallelisation? For instance the unify command in MMJ2 presumably is searching for all possible proofs of length 1 to fill in a gap? Could something like this be optimised to run in the background on other threads on the cpu so it could fill in 2, 3, or more steps for you? Would this be a powerful thing to add to a proof assistant, a brute force solver which is always trying to help? If a gpu optimised parallel search could be performed on a large amount of hardware might it be able to actually generate some reasonable length proofs? Finally for the general AI problem presumably if any kind of mild heuristic could be found then it might be possible to generate proofs much more efficiently? I think that’s what the above systems are doing, if I have understood correctly, is given a list of possible substitutions to prefer some other overs as more likely to lead in the right direction? Their hope is that the Neural Networks can develop some sort of intuition. In terms of finding a heuristic there is some interesting work here <https://arxiv.org/abs/2005.12535> which might apply. The basic idea, as applied to metamath, would be to use some variant of word2vec to embed all the statements in the database into a high dimensional (say 100) vector space. Then when searching the tree of possible proofs statements could be evaluated based on whether they were leading closer to the hypotheses or not. Might this approach have any value here? Is anyone interested in it? Are there other ways of generating a heuristic to help with search? Thank you for taking the time to read this. I’d really value any feedback or input. I apologise in advance for the many misunderstandings I have and would love to be corrected on anything. Jon -- You received this message because you are subscribed to the Google Groups "Metamath" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/metamath/027a6381-8caa-4a3b-b2a0-5435d55e0e77n%40googlegroups.com.
