Those all sound like good things to add to the testsuite. (The tests in the metamath-exe and metamath-knife repos are I think better developed, although in principle some of their tests could be fed back to a testsuite which could be run on any verifier).
On September 16, 2025 12:44:47 PM PDT, Matthew House <[email protected]> wrote: >A few issues I spotted, many of which are nitpicks and some of which are >more serious: > > - The verifier does not check that each file only uses printable ASCII + > "\t\n\f\r". This can be problematic if not checked, since differences in > how characters outside this set are handled (e.g., what counts as white > space) can lead to discrepancies between verifiers. > - The verifier does not check that keyword tokens are separated by white > space, e.g., it accepts $($) as a valid comment, or foo$a|- bar$. as a > valid $a statement. > - The verifier does not check that comments do not contain $( or $). > - The verifier does not check that ${ $} delimiters for blocks are > balanced within each file. > - The verifier does not check that math symbols do not contain $. > - The verifier accepts a meaningless dangling $ character at the end of > a file. > - The verifier does not check that active constants and variables cannot > be redeclared, nor that variables cannot be declared as constants or vice > versa. It has no support for variables becoming inactive at the end of > their containing block. > - The verifier does not check that math symbols in $d statements are > distinct active variables. > - The verifier does not check that all labels are unique, nor that all > labels are distinct from all declared math symbols. > - The verifier does not check that typecodes are active constants. > - The verifier does not check that $f statements use an active variable, > nor that all variables in an $e, $a, or $p statement have an active $f > statement. > - The verifier does not check that a variable has at most one active $f > statement at a time, nor that every $f statement for a variable has the > same typecode. > - The verifier does not ignore repeated inclusions of the same file. For > instance, a file test.mm containing $[ test.mm $] will lead to a runaway > memory leak. > - The verifier does not ignore comments in certain contexts in > statements. For instance, it will interpret foo $a $( baz $) |- bar $. > as a statement with typecode $( and expression $( baz $) |- bar, which > is nonsense and will lead to discrepancies between verifiers. > - The verifier does not check for integer overflow in compressed-label > steps. This allows multiple ways to write step numbers, as well as ways to > write the special code Z as a number, which will lead to discrepancies > between verifiers. > - The verifier has no support for unknown steps ?. After reading a proof > with a ? step, a typical verifier will continue verifying the rest of > the database as usual, but warn the user that it contains incomplete proofs. > - *The verifier does not check that a proof cannot use its own label in > a step.* This allows incorrect proofs such as $[ set.mm $] evil $p |- ph > $= ( evil ) AB $. which insist upon themselves. > - *The verifier does not check that typecodes match when substituting > variables.* This allows incorrect proofs such as $[ set.mm $] evil $p |- > ph $= ( vx wn weq wal equid ax-gen ax6 pm2.24ii ) BCZJDZBEAKBJFGBJHI $. > which exploit syntax ambiguities. > >Writing a correct Metamath verifier requires very careful attention to >detail, which it seems that LLMs still struggle with, short of directed >hand-holding or extraordinarily thorough test suites. The original >mmverify.py code is quite compact for how many details it captures. > >Matthew House > >On Tue, Sep 16, 2025 at 5:03 AM Zarathustra Goertzel <[email protected]> >wrote: > >> I was curious if Codex could write a Metamath verifier, so asked ChatGPT >> for a list of verifiers and what languages it thought would be both *easy >> and interesting*: it chose Go. >> >> I started from my mmverify.py repo (where iIhave a too-slow verifier in >> another weird language, MeTTa <https://metta-lang.dev/>), so it had >> access to reference implementations. >> >> The task was completed pretty easily (perhaps in 40-60min of AI-work) with >> my feeding in some errors. It seems to pass David Wheeler's Metamath-test >> suite <https://github.com/zariuq/metamath-test>. >> >> It's here if anyone's curious: >> https://github.com/zariuq/mmverify.py/tree/mmverifyGo/go >> >> FWIW, I decided to try to see if it could do Metamath Zero, and I must say >> it struggles (even to just do .mm0 + .mmu verification). >> >> -- >> You received this message because you are subscribed to the Google Groups >> "Metamath" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To view this discussion visit >> https://groups.google.com/d/msgid/metamath/e55ea1ba-b3c2-4018-80a1-a46149ead54fn%40googlegroups.com >> <https://groups.google.com/d/msgid/metamath/e55ea1ba-b3c2-4018-80a1-a46149ead54fn%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > >-- >You received this message because you are subscribed to the Google Groups >"Metamath" group. >To unsubscribe from this group and stop receiving emails from it, send an >email to [email protected]. >To view this discussion visit >https://groups.google.com/d/msgid/metamath/CADBQv9aKA%3DMBJY6SvJNqneUhm54jP1FQ186WK8VSgSbCENt57w%40mail.gmail.com. -- You received this message because you are subscribed to the Google Groups "Metamath" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion visit https://groups.google.com/d/msgid/metamath/836103D7-290E-4EF9-93C2-7F1CFE6620CD%40panix.com.
