Rather than shuffling which sounds complex, a simpler way to verify that
hypothesis might be to count the number of different chapters/sections
each theorem uses, and have a statistical approach to confirm whether
that matches with O(1) as Mario suggests.


On 12/05/2022 18:04, Mario Carneiro wrote:
It's tricky to shuffle an mm file because theorem dependencies put a
constraint on allowable reorderings. I would probably do a random
shuffle and then do DFS to get a topological ordering in order to get
a random allowable ordering.

My prediction would be that it does not make a huge effect on
verification speed. Except for the most optimized, Metamath verifiers
are typically bottlenecked on things other than loading the theorem
into cache; stuff like substituting expressions into theorem statement
are probably 20-100 times more expensive except at *really* large scale.

On Thu, May 12, 2022 at 5:24 AM Antony Bartlett <[email protected]> wrote:

    Wow, great answer, thanks Mario!  Mind duly blown.

    The claim that contributors to mm files have in effect sorted the
    proofs in a manner favorable to efficient verification should be
    scientifically testable by the way (i.e. to a lower standard of
    'proof' than what we're used to around here).  It's possible to
    write a program to shuffle an mm file, the prediction being that
    makes verification a lot less efficient.  After that it might be
    possible to run some sort of regression on the timings.

    (I'm glossing over a lot of detail here, but a shuffle algorithm
    1. makes a pass of an mm file to construct a DAG, 2. writes out a
    random axiom, 3. removes the axiom from the DAG which promotes any
    theorem no longer dependent on any axioms to the status of axiom
    (though obviously still outputted as a theorem), 4. return to step
    2 until the whole file is written out)

        Best regards,

            Antony


    On Wed, May 11, 2022 at 5:21 PM Mario Carneiro <[email protected]>
    wrote:

        It's true that O(n) is an approximation of the reality. If you
        assume that array lookups can be done in O(1) (and there are
        at most 2^word size theorems total, which is almost certainly
        the case unless you have a "streaming" model for theorem
        verification, which is not the case for most proof systems),
        then you can get O(n) total time on a word RAM computational
        model. There is another asterisk though which is that theorems
        have to be O(1) in size for this to work. In general if
        theorems have size O(m) then verifying one metamath step takes
        O(m) so total verification time is O(mn), so in the worst
        case, if you have a single theorem whose statement takes up
        half of the entire file, you can spend the other half of the
        file applying that theorem O(n) times for O(n^2) work. This is
        the actual bound I have established for MM0 verification, but
        if you make a less pessimistic assumption it looks
        approximately linear. (There are also complications getting
        this linear time bound in metamath when *expressions* stop
        being O(1): you can actually get exponential size terms in
        O(n) steps by reusing subproofs. MM0 uses dag-like sharing of
        terms to circumvent this problem, but metamath fundamentally
        uses token strings for its expressions so you have to get
        fancy with data structures in order to express such a highly
        redundant string without losing the ability to freely
        reassociate token strings.)

        Setting aside the issues with large expressions and theorems
        and returning to the lookup question, the reality is still
        more complicated. If you don't use a word RAM model and
        instead use functional data structures, then your array is
        actually a tree and then you get that O(n log n) bound. If you
        remember The Myth of RAM
        (http://www.ilikebigbits.com/2014_04_21_myth_of_ram_1.html)
        then you will actually want to revise this to O(n sqrt n). But
        actually, it's a bit better than this because like regular
        programs, mathematics also has a lot of temporal and spatial
        locality, and this reflects in the fact that the theorems you
        access in the proof of a given theorem will often be hit many
        times and nearby theorems will also likely be accessed. So the
        usual caching tricks work and it goes down to O(n sqrt s)
        where s is the working set size, which is difficult to
        quantify but general considerations of how humans work suggest
        that this should be roughly constant - each area of
        mathematics generally has direct connections to O(1) other
        areas of mathematics. So we're back to O(n) again.

        Mario

        On Wed, May 11, 2022 at 7:37 AM Antony Bartlett
        <[email protected]> wrote:

            Great talk Mario! Thanks for sharing.

            I nearly asked a couple of nights ago when it turned out
            Glauco and I have both accidentally written Metamath
            verifiers in TypeScript which probably run in O(n^2),
            hopefully on our way to writing verifiers that are
            actually useable!  What's a 'good' length of time in big-O
            notation for a verifier to run in?

            In your talk you state that this is O(n) - unless I've
            wildly misunderstood.

            I'm pleasantly surprised by that, and also curious.  The
            best I was hoping for was O(n.log(n)), and my thinking was
            that as you add more proofs to the system, subsequent
            proofs sometimes refer back to them, which requires the
            verifier to do a look up, which is why I was assuming 'n'
            to make your pass of the file, and 'log(n)' for the
            lookups.  Broadly, how does an O(n) verifier avoid the
            need for this, please?

                Best regards,

                    Antony


            On Wed, May 11, 2022 at 11:03 AM Mario Carneiro
            <[email protected]> wrote:

                Hi all,

                I finally put the recording of my talk "Lessons from
                Metamath" from the Lorentz Center meeting for
                Machine-Checked Mathematics on YouTube. It is
                primarily aimed at an audience of people who are
                familiar with proof assistants but not with Metamath,
                but I'm sure you all can get plenty out of it as well.

                https://www.youtube.com/watch?v=OOF4NWRyue4

                Mario Carneiro
                --
                You received this message because you are subscribed
                to the Google Groups "Metamath" group.
                To unsubscribe from this group and stop receiving
                emails from it, send an email to
                [email protected].
                To view this discussion on the web visit
                
https://groups.google.com/d/msgid/metamath/CAFXXJSsoNfduP5TgNbDzPMQEfFYYc4aCRt0y2aPRFPn6sH8ZyQ%40mail.gmail.com
                
<https://groups.google.com/d/msgid/metamath/CAFXXJSsoNfduP5TgNbDzPMQEfFYYc4aCRt0y2aPRFPn6sH8ZyQ%40mail.gmail.com?utm_medium=email&utm_source=footer>.

            --
            You received this message because you are subscribed to
            the Google Groups "Metamath" group.
            To unsubscribe from this group and stop receiving emails
            from it, send an email to
            [email protected].
            To view this discussion on the web visit
            
https://groups.google.com/d/msgid/metamath/CAJ48g%2BCwN2q7qKqi5oh4WezRV75V%2BikgT9DJyT4_tbmO825yyQ%40mail.gmail.com
            
<https://groups.google.com/d/msgid/metamath/CAJ48g%2BCwN2q7qKqi5oh4WezRV75V%2BikgT9DJyT4_tbmO825yyQ%40mail.gmail.com?utm_medium=email&utm_source=footer>.

        --
        You received this message because you are subscribed to the
        Google Groups "Metamath" group.
        To unsubscribe from this group and stop receiving emails from
        it, send an email to [email protected].
        To view this discussion on the web visit
        
https://groups.google.com/d/msgid/metamath/CAFXXJSuz8Q5Hxcu7U3iV6%3DU2KwwbWg6dhiJEVj5ZJSTHpayA_A%40mail.gmail.com
        
<https://groups.google.com/d/msgid/metamath/CAFXXJSuz8Q5Hxcu7U3iV6%3DU2KwwbWg6dhiJEVj5ZJSTHpayA_A%40mail.gmail.com?utm_medium=email&utm_source=footer>.

    --
    You received this message because you are subscribed to the Google
    Groups "Metamath" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected].
    To view this discussion on the web visit
    
https://groups.google.com/d/msgid/metamath/CAJ48g%2BDNwi2KhpTfYh%2BgfSTYd%2BV33Qmr_hWNfLSn3O6OnfgBXQ%40mail.gmail.com
    
<https://groups.google.com/d/msgid/metamath/CAJ48g%2BDNwi2KhpTfYh%2BgfSTYd%2BV33Qmr_hWNfLSn3O6OnfgBXQ%40mail.gmail.com?utm_medium=email&utm_source=footer>.

--
You received this message because you are subscribed to the Google
Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/metamath/CAFXXJSvYQQn1sDmjt1E_czs_vAuROUTTDVnWZpjC%3DjkaVSTWVA%40mail.gmail.com
<https://groups.google.com/d/msgid/metamath/CAFXXJSvYQQn1sDmjt1E_czs_vAuROUTTDVnWZpjC%3DjkaVSTWVA%40mail.gmail.com?utm_medium=email&utm_source=footer>.

--
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/26b8ed35-fe41-6643-fdd0-2da994968e79%40gmx.net.

Reply via email to