Hi all,

As you might already know, Daniel Kühlwein (a Ph.D. student of Josef Urban) and I have been working on a machine-learning-based relevance filter for Sledgehammer, called MaSh. The idea is to learn from successful proofs which facts are more likely to be useful for future proofs.
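To give a feel for what "learning" means here, the sketch below is a deliberately naive Python toy (MaSh's learner happens to be written in Python, but its actual algorithm is more sophisticated than this): each successful proof associates the goal's symbols with the facts used to prove it, and facts are ranked for a new goal by how strongly they correlate with its symbols.

    # Toy illustration only -- NOT MaSh's actual code or algorithm.
    from collections import Counter, defaultdict

    feature_counts = defaultdict(Counter)  # fact -> goal symbols it helped with
    uses = Counter()                       # fact -> number of proofs using it

    def learn(goal_symbols, used_facts):
        """Record one successful proof."""
        for fact in used_facts:
            uses[fact] += 1
            feature_counts[fact].update(goal_symbols)

    def rank(goal_symbols):
        """Rank all known facts against a new goal's symbols."""
        def score(fact):
            return sum(feature_counts[fact][s] for s in goal_symbols) / uses[fact]
        return sorted(uses, key=score, reverse=True)

    learn({"rev", "zip"}, ["zip_rev"])
    learn({"map", "length"}, ["length_map"])
    print(rank({"rev", "zip", "map"}))  # ['zip_rev', 'length_map']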
We are looking for volunteers to try it out. By default, Sledgehammer uses the good old MePo (Meng & Paulson) filter, but if you do any of the following, you will find yourself using a combination of MePo and MaSh:

  1. sledgehammer_params [fact_filter = mesh]   -- at the top of your theory
  2. sledgehammer [fact_filter = mesh]   -- for a single invocation
  3. export MASH=yes

("mesh" = "mash" + "mepo"; yes, the names are a bit crazy. You can also cheat and drop "fact_filter =".)

You will need a recent repository version (e.g. 05f8ec128e83) as well as Python on your machine (which you probably already have).

The Sledgehammer manual has some documentation on MaSh. To get the latest version:

    isabelle build_doc Sledgehammer
    isabelle doc sledgehammer

Sections 5.1, 6.1, and 7.2 are relevant. There's not much to know, really -- the learning, and the use of that learning, take place automatically (but can be guided in various ways).

Example:

    theory Scratch
    imports Main
    begin

    lemma "map f xs = ys ==> rev (zip xs ys) = zip (rev xs) (rev ys)"

The command

    sledgehammer learn_isar

forces learning at this point, so that we can start using MaSh right away. (Otherwise, learning takes place in the background and is available only from the second Sledgehammer invocation onwards.) This takes about 5 seconds and prints

    MaShing through 7127 facts for Isar proofs...
    Learned 4644 nontrivial Isar proofs.

Next step:

    sledgehammer [prover = e, fact_filter = mesh, verbose]

This prints

    ATP: Including (up to) 1000 relevant facts: zip_append list_induct2 ... divide_nonneg_pos wf_lenlex.
    ...
    Facts in "e" proof (of 1000): zip_rev@4, length_map@11.
    ...
    Try this: by (metis length_map zip_rev) (20 ms).

"@4" and "@11" indicate that the 4th and 11th facts were used in the proof. The lower the numbers, the better. If we try again,

    sledgehammer [prover = e, fact_filter = mesh, verbose]

we get

    Facts in "e" proof (of 1000): zip_rev@1, length_map@5.

which is an improvement that comes from learning. In contrast, MePo doesn't get any smarter:

    sledgehammer [prover = e, fact_filter = mepo, verbose]

    Facts in "e" proof (of 1000): zip_rev@4, length_map@14.

The learning is persistent across sessions and should be robust in the face of theory reorganizations etc.

If you want to improve MaSh's accuracy, you can let

    sledgehammer learn_atp

run for hours with your favorite theories loaded. It will then invoke Sledgehammer on each available theorem to learn from its proofs. You can interrupt it at any time. Learning from ATP proofs is usually better than learning from human-written proofs, according to evaluations by Kühlwein, Urban et al.

I hope those of you who use Sledgehammer regularly will give this a try and let me know about your experience. We've had very useful feedback about brand-new features this way before (e.g. the tight SPASS integration). If you have nice examples, they might easily end up in one of our papers.

Cheers,
Jasmin
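P.S. For convenience, here is the walkthrough above assembled into a single theory that you can paste into a .thy file. Everything in it is taken from the example; the final metis one-liner is the proof Sledgehammer suggested in my run, so your output may differ slightly:

    theory Scratch
    imports Main
    begin

    lemma "map f xs = ys ==> rev (zip xs ys) = zip (rev xs) (rev ys)"
      sledgehammer learn_isar
      sledgehammer [prover = e, fact_filter = mesh, verbose]
      by (metis length_map zip_rev)

    end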