Correction: There was a bug in the code that led to unusually large values for "avg score", due to a single very large data point. The corrected results:
avg score: 75178012 / 3075357 = 24.4
avg lg(score): 10750001.8 / 3075357 = 3.4955 = lg(11.2787)
avg sqrt(score): 12640434.2 / 3075357 = 4.1102 = sqrt(16.8940)

shuffled 1:
avg score: 157481709 / 3075357 = 51.2
avg lg(score): 12857460.5 / 3075357 = 4.1808 = lg(18.1362)
avg sqrt(score): 17549817.2 / 3075357 = 5.7066 = sqrt(32.5652)

shuffled 2:
avg score: 157220284 / 3075357 = 51.1
avg lg(score): 12861417.8 / 3075357 = 4.1821 = lg(18.1524)
avg sqrt(score): 17552661.4 / 3075357 = 5.7075 = sqrt(32.5758)

completely random 1:
avg score: 565223103 / 3075357 = 183.8
avg lg(score): 22394511.3 / 3075357 = 7.2819 = lg(155.6242)
avg sqrt(score): 40151474.7 / 3075357 = 13.0559 = sqrt(170.4559)

completely random 2:
avg score: 565104301 / 3075357 = 183.8
avg lg(score): 22394838.4 / 3075357 = 7.2820 = lg(155.6357)
avg sqrt(score): 40149999.7 / 3075357 = 13.0554 = sqrt(170.4433)

On Thu, May 12, 2022 at 9:36 AM Mario Carneiro <[email protected]> wrote:

> I got curious about this, so here's some code to do just that. We can
> calculate the "unified bound" (
> https://www.youtube.com/watch?v=DZ7jt1F8KKw&t=1178s) explicitly for the
> access sequence used in set.mm using some stuff on top of metamath-knife
> (code here:
> https://gist.github.com/digama0/78c5a2de9589d53b4667b9a75b3ca5f3), and
> the result is:
>
> Define the score of an access x_j to be the min of (t_ij + |x_i - x_j|)
> over all previous x_i in the sequence, where t_ij is the number of distinct
> values in {x_i, ..., x_{j-1}}.
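For concreteness, here is a small Python sketch of that score and of the three average weightings used in the tables. The function names and the naive O(n^2) double loop are mine, not the metamath-knife code from the gist, which computes the same quantity more efficiently.

```python
import math

def scores(seq):
    """Score of access seq[j]: min over previous i < j of t_ij + |x_i - x_j|,
    where t_ij is the number of distinct values among seq[i], ..., seq[j-1].
    Naive O(n^2) version; the very first access gets no score."""
    out = []
    for j in range(1, len(seq)):
        distinct = set()
        best = None
        # walk backwards from j-1, so the distinct-count t_ij grows as i recedes
        for i in range(j - 1, -1, -1):
            distinct.add(seq[i])
            cand = len(distinct) + abs(seq[i] - seq[j])
            if best is None or cand < best:
                best = cand
        out.append(best)
    return out

def averages(sc):
    """The three weightings from the tables: plain mean,
    2^(avg log(score)), and (avg sqrt(score))^2."""
    n = len(sc)
    avg = sum(sc) / n
    geo = 2 ** (sum(math.log2(s) for s in sc) / n)
    sq = (sum(math.sqrt(s) for s in sc) / n) ** 2
    return avg, geo, sq
```

For example, on the sequence [1, 2, 1] both scored accesses get score 2 (the repeat access of 1 pays only the two distinct values in between), so all three averages are 2.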
> Then we can compute the average score of the
> access sequence, for different kinds of average weighting:
>
> set.mm commit 3de3ae52
> number of theorems: 42879
>
> avg score: 4370145307 / 3075357 = 1421.0
> 2^(avg log(score)): 2^(10750033.8 / 3075357) = 2^3.4955 = 11.2788
> (avg sqrt(score))^2: (12705970.2 / 3075357)^2 = 4.1315^2 = 17.0696
>
> shuffled 1:
> avg score: 4452015191 / 3075357 = 1447.6
> 2^(avg log(score)): 2^(12857388.6 / 3075357) = 2^4.1808 = 18.1359
> (avg sqrt(score))^2: (17604771.6 / 3075357)^2 = 5.7245^2 = 32.7695
>
> shuffled 2:
> avg score: 4453096882 / 3075357 = 1448.0
> 2^(avg log(score)): 2^(12873701.1 / 3075357) = 2^4.1861 = 18.2027
> (avg sqrt(score))^2: (17659385.1 / 3075357)^2 = 5.7422^2 = 32.9731
>
> completely random 1:
> avg score: 4860210535 / 3075357 = 1580.4
> 2^(avg log(score)): 2^(22395391.0 / 3075357) = 2^7.2822 = 155.6550
> (avg sqrt(score))^2: (40220315.5 / 3075357)^2 = 13.0783^2 = 171.0409
>
> completely random 2:
> avg score: 4860319828 / 3075357 = 1580.4
> 2^(avg log(score)): 2^(22396702.0 / 3075357) = 2^7.2826 = 155.7010
> (avg sqrt(score))^2: (40224473.6 / 3075357)^2 = 13.0796^2 = 171.0762
>
> That is: the log-weighted average of the scores is 11.2, meaning that the
> s in O(n log(s)) is 11.2 and log(s) is about 3.49. To get a sense for how
> good spatial layout in set.mm is, I compare this to a randomized layout
> (I didn't pay attention to dependency order here, it's just a shuffle),
> which results in 18.1 for s. To double check the variance on the random
> layout calculation I ran it twice and you can see the second result is
> almost the same, 18.2, so clearly set.mm is laid out much better than
> random.
>
> Note that this shuffle simulation does not account for the difference in
> temporal locality: I am still simulating the same accesses in the same
> order, even though they would presumably be done in the same random order
> used for the layout.
> This is probably not a big effect, since this only
> affects ordering between proofs: most of those 3 million accesses occur
> because one proof involves many accesses (there are only around 65000
> theorems in the set), so the majority of temporal locality would not be
> affected by reshuffling theorems.
>
> The "completely random" data just uses the output of a random number
> generator with the same range and length as the set.mm access sequence.
> The value of s is now much larger, 155.7. I believe the operative effect
> here is the within-proof locality: the same theorems are referenced many
> times in a given proof, while a random proof will touch a lot more of the
> library.
>
> All in all I would say this is a pretty small number: s = 11.2 is very far
> from n = 3075357, so even if it's not constant (obviously difficult to
> measure) it is still very sub-linear.
>
> On Thu, May 12, 2022 at 6:35 AM Thierry Arnoux <[email protected]>
> wrote:
>
>> Rather than shuffling, which sounds complex, a simpler way to verify that
>> hypothesis might be to count the number of different chapters/sections each
>> theorem uses, and take a statistical approach to confirm whether that
>> matches the O(1) behavior Mario suggests.
>>
>> On 12/05/2022 18:04, Mario Carneiro wrote:
>>
>> It's tricky to shuffle an mm file because theorem dependencies put a
>> constraint on allowable reorderings. I would probably do a random shuffle
>> and then do DFS to get a topological ordering, in order to get a random
>> allowable ordering.
>>
>> My prediction would be that it does not have a huge effect on
>> verification speed. Except for the most optimized ones, Metamath verifiers are
>> typically bottlenecked on things other than loading the theorem into cache;
>> stuff like substituting expressions into theorem statements is probably
>> 20-100 times more expensive except at *really* large scale.
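The "random allowable ordering" idea discussed in this thread can be sketched in a few lines of Python. This uses Kahn's algorithm with a random choice among the currently available statements rather than the shuffle-then-DFS Mario describes, but the effect is the same: a random order that still respects dependencies. The `deps` input format is hypothetical; a real experiment would first parse the dependency DAG out of the mm file.

```python
import random

def random_topo_order(deps):
    """deps maps each statement label to the list of labels it depends on.
    Returns a random ordering in which every statement appears after all
    of its dependencies (Kahn's algorithm with random tie-breaking)."""
    remaining = {t: len(ds) for t, ds in deps.items()}  # unmet dependency counts
    users = {}                                          # dep -> statements using it
    for t, ds in deps.items():
        for d in ds:
            users.setdefault(d, []).append(t)
    ready = [t for t, c in remaining.items() if c == 0]
    order = []
    while ready:
        # pick any currently-allowed statement at random
        i = random.randrange(len(ready))
        ready[i], ready[-1] = ready[-1], ready[i]
        t = ready.pop()
        order.append(t)
        for u in users.get(t, []):
            remaining[u] -= 1
            if remaining[u] == 0:
                ready.append(u)
    return order
```

Note that picking uniformly among the ready statements does not sample uniformly over all topological orders, but for "is set.mm laid out better than random?" any dependency-respecting shuffle should do.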
>>
>> On Thu, May 12, 2022 at 5:24 AM Antony Bartlett <[email protected]> wrote:
>>
>>> Wow, great answer, thanks Mario! Mind duly blown.
>>>
>>> The claim that contributors to mm files have in effect sorted the proofs
>>> in a manner favorable to efficient verification should be scientifically
>>> testable, by the way (i.e. to a lower standard of 'proof' than what we're
>>> used to around here). It's possible to write a program to shuffle an mm
>>> file, the prediction being that this makes verification a lot less efficient.
>>> After that it might be possible to run some sort of regression on the
>>> timings.
>>>
>>> (I'm glossing over a lot of detail here, but a shuffle algorithm 1.
>>> makes a pass over an mm file to construct a DAG, 2. writes out a random
>>> axiom, 3. removes the axiom from the DAG, which promotes any theorem no
>>> longer dependent on any axioms to the status of axiom (though obviously
>>> still output as a theorem), 4. returns to step 2 until the whole file is
>>> written out.)
>>>
>>> Best regards,
>>>
>>> Antony
>>>
>>>
>>> On Wed, May 11, 2022 at 5:21 PM Mario Carneiro <[email protected]>
>>> wrote:
>>>
>>>> It's true that O(n) is an approximation of the reality. If you assume
>>>> that array lookups can be done in O(1) (and there are at most 2^wordsize
>>>> theorems total, which is almost certainly the case unless you have a
>>>> "streaming" model for theorem verification, which is not the case for most
>>>> proof systems), then you can get O(n) total time on a word RAM
>>>> computational model. There is another asterisk, though, which is that
>>>> theorems have to be O(1) in size for this to work. In general, if theorems
>>>> have size O(m) then verifying one metamath step takes O(m), so total
>>>> verification time is O(mn); in the worst case, if you have a single
>>>> theorem whose statement takes up half of the entire file, you can spend the
>>>> other half of the file applying that theorem O(n) times for O(n^2) work.
>>>> This is the actual bound I have established for MM0 verification, but if
>>>> you make a less pessimistic assumption it looks approximately linear.
>>>> (There are also complications getting this linear time bound in metamath
>>>> when *expressions* stop being O(1): you can actually get exponential-size
>>>> terms in O(n) steps by reusing subproofs. MM0 uses dag-like sharing of
>>>> terms to circumvent this problem, but metamath fundamentally uses token
>>>> strings for its expressions, so you have to get fancy with data structures
>>>> in order to express such a highly redundant string without losing the
>>>> ability to freely reassociate token strings.)
>>>>
>>>> Setting aside the issues with large expressions and theorems and
>>>> returning to the lookup question, the reality is still more complicated. If
>>>> you don't use a word RAM model and instead use functional data structures,
>>>> then your array is actually a tree, and then you get that O(n log n) bound.
>>>> If you remember The Myth of RAM (
>>>> http://www.ilikebigbits.com/2014_04_21_myth_of_ram_1.html) then you
>>>> will actually want to revise this to O(n sqrt n). But actually, it's a bit
>>>> better than this, because like regular programs, mathematics also has a lot
>>>> of temporal and spatial locality, and this is reflected in the fact that the
>>>> theorems you access in the proof of a given theorem will often be hit many
>>>> times, and nearby theorems will also likely be accessed. So the usual
>>>> caching tricks work and it goes down to O(n sqrt s), where s is the working
>>>> set size, which is difficult to quantify, but general considerations of how
>>>> humans work suggest that this should be roughly constant: each area of
>>>> mathematics generally has direct connections to O(1) other areas of
>>>> mathematics. So we're back to O(n) again.
>>>>
>>>> Mario
>>>>
>>>> On Wed, May 11, 2022 at 7:37 AM Antony Bartlett <[email protected]> wrote:
>>>>
>>>>> Great talk Mario! Thanks for sharing.
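A toy illustration of the exponential blowup from subproof reuse that Mario mentions above: if each of n proof steps builds f(e, e) from the previous term e, a flat token-string representation doubles in size at every step, while a dag with sharing grows by only one node per step. The `f(e, e)` duplication shape is my assumption for the sketch, not anything specific to set.mm.

```python
def token_string_size(n):
    """Size of the term after n duplication steps e -> f(e, e),
    stored as a flat token string: two copies of e plus the head symbol."""
    size = 1  # the initial atom
    for _ in range(n):
        size = 2 * size + 1
    return size

def dag_size(n):
    """Same term with dag-like sharing: each step adds a single node
    whose two children both point at the previous (shared) term."""
    return n + 1  # the atom plus one node per step
```

After just 10 steps the string form already has 2^11 - 1 = 2047 tokens versus 11 dag nodes, which is why MM0's sharing matters for the linear bound.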
>>>>>
>>>>> I nearly asked a couple of nights ago, when it turned out Glauco and I
>>>>> have both accidentally written Metamath verifiers in TypeScript which
>>>>> probably run in O(n^2), hopefully on our way to writing verifiers that are
>>>>> actually usable! What's a 'good' length of time in big-O notation for a
>>>>> verifier to run in?
>>>>>
>>>>> In your talk you state that this is O(n) - unless I've wildly
>>>>> misunderstood.
>>>>>
>>>>> I'm pleasantly surprised by that, and also curious. The best I was
>>>>> hoping for was O(n log(n)), and my thinking was that as you add more proofs
>>>>> to the system, subsequent proofs sometimes refer back to them, which
>>>>> requires the verifier to do a lookup; hence I was assuming 'n' for
>>>>> the pass over the file, and 'log(n)' for the lookups. Broadly, how
>>>>> does an O(n) verifier avoid the need for this, please?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Antony
>>>>>
>>>>>
>>>>> On Wed, May 11, 2022 at 11:03 AM Mario Carneiro <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I finally put the recording of my talk "Lessons from Metamath" from
>>>>>> the Lorentz Center meeting for Machine-Checked Mathematics on YouTube. It
>>>>>> is primarily aimed at an audience of people who are familiar with proof
>>>>>> assistants but not with Metamath, but I'm sure you all can get plenty out
>>>>>> of it as well.
>>>>>>
>>>>>> https://www.youtube.com/watch?v=OOF4NWRyue4
>>>>>>
>>>>>> Mario Carneiro
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Metamath" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/metamath/CAFXXJSsoNfduP5TgNbDzPMQEfFYYc4aCRt0y2aPRFPn6sH8ZyQ%40mail.gmail.com
>>>>>> .
