Hi Jon,
I am pleased you noticed, as I have really put a lot of thought into this.
> Are there any other optimisations you can think of which might make it
> quicker?
Not beyond what I already noted. But in the scenario of using a
supercomputer, I would first try to skip the time-intensive schema
filtering altogether and instead quickly obtain the results that become
reachable thanks to the sheer amount of available memory.
Unless I found a suitable database structure for schema search, that is.
The current approach simply isn't practicable for much longer: it already
takes weeks, and at roughly a factor of 10 per iteration, it would reach
durations of years just a few iterations up.
That way, the limiting factor is available memory rather than time (as it
was, in fact, for my PC as well, which is why I spent so much time
squeezing the data in).
My program already supports both modes. When the 'redundantSchemasRemoval'
mode is disabled, it first reads every 'dProofs<n>.txt' for n = 17, 19, ...
(everything up to 15 is built into the program); from the first m for
which 'dProofs<m>.txt' does not exist, it continues reading
'dProofs<k>-unfiltered<m>+.txt' for k = m, m+2, m+4, ..., until no such
file exists either. It then parses all the proofs from those files and
keeps generating further such files indefinitely.
The difference in timing is currently insane; let me quote parts of the
output from when I generated dProofs29.txt [on an Intel Core i7-3770 with
Linux Mint 20.3, doing just that the whole time]:
Sun Sep 18 20:03:39 2022: Unlimited D-proof representative generator
started. [parallel ; 8 hardware thread contexts]
0.18 ms taken to load built-in representatives.
2.04 ms taken to read 5221 condensed detachment proofs from
data/dProofs17.txt. [tid:140462833604352]
5.22 ms taken to read 15275 condensed detachment proofs from
data/dProofs19.txt. [tid:140462825211648]
11.59 ms taken to read 44206 condensed detachment proofs from
data/dProofs21.txt. [tid:140462543337216]
23.26 ms taken to read 129885 condensed detachment proofs from
data/dProofs23.txt. [tid:140462534944512]
61.70 ms taken to read 385789 condensed detachment proofs from
data/dProofs25.txt. [tid:140462526551808]
170.14 ms taken to read 1149058 condensed detachment proofs from
data/dProofs27.txt. [tid:140462518159104]
171.27 ms total read duration.
Loaded 14 representative collections of sizes:
1 : 3
3 : 6
5 : 12
7 : 38
9 : 89
11 : 229
13 : 672
15 : 1844
17 : 5221
19 : 15275
21 : 44206
23 : 129885
25 : 385789
27 : 1149058
1732327 representatives in total.
Sun Sep 18 20:03:46 2022: Parsed 5% of D-proofs. [ 86616 of 1732327]
(ETC: Sun Sep 18 20:05:51 2022 ; 2 min 5 s 466.39 ms remaining ; 2 min 12
s 69.85 ms total)
Sun Sep 18 20:03:54 2022: Parsed 10% of D-proofs. [ 173232 of 1732327]
(ETC: Sun Sep 18 20:06:07 2022 ; 2 min 12 s 868.68 ms remaining ; 2 min 27
s 631.80 ms total)
[...]
Sun Sep 18 20:06:59 2022: Parsed 90% of D-proofs. [1559094 of 1732327]
(ETC: Sun Sep 18 20:07:21 2022 ; 22 s 134.78 ms remaining ; 3 min 41 s
347.38 ms total)
Sun Sep 18 20:07:09 2022: Parsed 95% of D-proofs. [1645710 of 1732327]
(ETC: Sun Sep 18 20:07:20 2022 ; 11 s 11.10 ms remaining ; 3 min 40 s
220.40 ms total)
nodeReplacements: 7035620, valueReplacements: 2986551
219579.17 ms (3 min 39 s 579.17 ms) total parse & insertion duration.
Sun Sep 18 20:07:19 2022: Starting to generate D-proof representatives of
length 29.
Sun Sep 18 20:08:42 2022: Iterated 2% of D-proof candidates. [ 511784 of
25589216] (ETC: Sun Sep 18 21:16:35 2022 ; 1 h 7 min 53 s 250.44 ms
remaining ; 1 h 9 min 16 s 377.95 ms total)
Sun Sep 18 20:10:04 2022: Iterated 4% of D-proof candidates. [ 1023568 of
25589216] (ETC: Sun Sep 18 21:16:16 2022 ; 1 h 6 min 11 s 820.93 ms
remaining ; 1 h 8 min 57 s 313.36 ms total)
[...]
Sun Sep 18 20:58:10 2022: Iterated 96% of D-proof candidates. [24565647 of
25589216] (ETC: Sun Sep 18 21:00:18 2022 ; 2 min 7 s 143.96 ms remaining ;
52 min 58 s 597.79 ms total)
Sun Sep 18 20:58:58 2022: Iterated 98% of D-proof candidates. [25077431 of
25589216] (ETC: Sun Sep 18 21:00:02 2022 ; 1 min 3 s 255.95 ms remaining ;
52 min 42 s 793.29 ms total)
nodeReplacements: 34152765, valueReplacements: 9736419
3143396.47 ms (52 min 23 s 396.47 ms) taken to collect 4375266 D-proofs of
length 29. [iterated 25589216 condensed detachment proof strings]
Mon Sep 19 05:34:12 2022: Removed ≈ 2% of redundant conclusions. [ 18395 of
approximately 919795] (ETC: Thu Oct 6 17:43:27 2022 ; 17 d 12 h 9 min 14
s 461.78 ms remaining ; 17 d 20 h 43 min 41 s 379.16 ms total)
Mon Sep 19 14:08:43 2022: Removed ≈ 4% of redundant conclusions. [ 36791 of
approximately 919795] (ETC: Thu Oct 6 17:44:06 2022 ; 17 d 3 h 35 min 23
s 743.01 ms remaining ; 17 d 20 h 44 min 20 s 833.92 ms total)
Mon Sep 19 22:39:25 2022: Removed ≈ 6% of redundant conclusions. [ 55187 of
approximately 919795] (ETC: Thu Oct 6 16:41:03 2022 ; 16 d 18 h 1 min 37
s 814.24 ms remaining ; 17 d 19 h 41 min 17 s 279.24 ms total)
[...]
Wed Oct 5 13:42:18 2022: Removed ≈94% of redundant conclusions. [864607 of
approximately 919795] (ETC: Thu Oct 6 15:16:56 2022 ; 1 d 1 h 34 min 38 s
351.86 ms remaining ; 17 d 18 h 17 min 10 s 855.44 ms total)
Wed Oct 5 22:10:50 2022: Removed ≈96% of redundant conclusions. [883003 of
approximately 919795] (ETC: Thu Oct 6 15:13:48 2022 ; 17 h 2 min 58 s
35.65 ms remaining ; 17 d 18 h 14 min 2 s 549.91 ms total)
Thu Oct 6 06:40:15 2022: Removed ≈98% of redundant conclusions. [901399 of
approximately 919795] (ETC: Thu Oct 6 15:11:41 2022 ; 8 h 31 min 26 s
485.28 ms remaining ; 17 d 18 h 11 min 55 s 923.48 ms total)
1546071343.66 ms (17 d 21 h 27 min 51 s 343.66 ms) taken to detect 926015
conclusions for which there are more general variants proven in lower or
equal amounts of steps.
6342.49 ms (6 s 342.49 ms) taken to filter and order new representative
proofs.
Found 3449251 representative, 11549939 redundant, and 10590026 invalid
condensed detachment proof strings.
lengths up to 29 ; amounts per length: {(1,3), (3,6), (5,12), (7,38),
(9,89), (11,229), (13,672), (15,1844), (17,5221), (19,15275), (21,44206),
(23,129885), (25,385789), (27,1149058), (29,3449251)} ; 3449251 new
representative proofs (11549939 redundant, 10590026 invalid)
484.77 ms taken to print and save 103477529 bytes of representative
condensed detachment proof strings to data/dProofs29.txt.
Thu Oct 6 18:27:42 2022: Starting to generate D-proof representatives of
length 31.
About using graphics cards: I actually tried this and used Nvidia's NVC++
on the same computer (with a GTX 1660 Ti), but the results were a little
slower than with the GCC-compiled build. I don't know how one could
rewrite parts of the program such that a GPU would be able to help. As it
currently stands, the required operations seem too elaborate for a GPU,
but I might be totally mistaken; I am not an expert on this by any means.
However, I don't see how one could handle operations on pairwise distinct
tree structures (where the type of the next required operation depends on
the structure discovered just before) by applying the same instructions to
whole arrays of such operations. There might be ways to restructure the
program entirely in order to do it, though, and I would like to hear the
thoughts of people with better knowledge of the subject. On the other
hand, I will explain below why GPUs wouldn't really help anyway, since
time is not the limiting factor.
I think that, apart from the schema filtering (where I don't know whether
there are better heuristics or data structures), the algorithms I used for
this feature (1. the string and tree enumerator, 2. the D-rule parser, and
3. the tbb::concurrent_unordered_map-based filter) are pretty close to
time-optimal.
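To illustrate the third component, here is a minimal sketch of such a
concurrent filter (placeholder data and names, not my actual code):

#include <tbb/concurrent_unordered_map.h>
#include <tbb/parallel_for_each.h>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Sketch: keep, for each conclusion (given as a canonical string), only one
// proof that reaches it; concurrent insertion makes the check thread-safe.
int main() {
    // (conclusion, proof) candidates; placeholders only - in the real program
    // the conclusions result from parsing D-proof candidates of the new length
    std::vector<std::pair<std::string, std::string>> candidates = {
        { "conclusion1", "proofA" }, { "conclusion2", "proofB" }, { "conclusion1", "proofC" } };

    tbb::concurrent_unordered_map<std::string, std::string> representatives;
    tbb::parallel_for_each(candidates.begin(), candidates.end(),
        [&](const std::pair<std::string, std::string>& c) {
            // insert() is a no-op if the conclusion is already known,
            // i.e. the candidate proof is redundant and gets dropped
            representatives.insert({ c.first, c.second });
        });

    for (const auto& [conclusion, proof] : representatives)
        std::cout << proof << " : " << conclusion << '\n';
}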
W.r.t. space, on the other hand, my structures (such as the tree structure
that stores formulas) are far from space-optimal, precisely so that they
can do a lot of non-trivial things faster once they exist. For example,
each node of a tree contains an array of integers sufficient to identify
the whole formula beneath it. That array is left empty in the end, but it
is first used to find identical subformulas via a hashmap that takes such
integer arrays as keys. Outside of that algorithm, I use the array, for
example, to print formulas via an integer-to-string hashmap (which is
quicker than traversing the nodes) and to allow individual representations
(e.g. more or fewer parentheses) without modifying the structure. (This
way, issues like having df-3or
<https://us.metamath.org/mpeuni/df-3or.html> pollute your system in order
to save parentheses cannot happen.)
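The subformula merging works roughly like this (a bare-bones sketch with
invented names; my actual node type stores more information):

#include <cstdint>
#include <iostream>
#include <memory>
#include <unordered_map>
#include <vector>

// Each node carries an integer sequence identifying the whole subformula
// beneath it, so identical subformulas can be merged via a hashmap that is
// keyed by that sequence.
struct FormulaNode {
    std::vector<std::uint32_t> sequence; // identifies the subformula
    std::vector<std::shared_ptr<FormulaNode>> children;
};

struct SequenceHash {
    std::size_t operator()(const std::vector<std::uint32_t>& v) const {
        std::size_t h = 0;
        for (std::uint32_t x : v) // simple polynomial rolling hash
            h = h * 1000003u + x;
        return h;
    }
};

using SubformulaPool = std::unordered_map<std::vector<std::uint32_t>, std::shared_ptr<FormulaNode>, SequenceHash>;

// Return the pooled node for 'node's sequence, merging duplicates.
std::shared_ptr<FormulaNode> intern(const std::shared_ptr<FormulaNode>& node, SubformulaPool& pool) {
    return pool.emplace(node->sequence, node).first->second;
}

int main() {
    SubformulaPool pool;
    auto a = std::make_shared<FormulaNode>(FormulaNode{ { 3, 1, 2 }, {} });
    auto b = std::make_shared<FormulaNode>(FormulaNode{ { 3, 1, 2 }, {} });
    std::cout << (intern(a, pool) == intern(b, pool)) << '\n'; // 1: both refer to the same node
}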
When it comes to resources, first let me show you a comment with some
measurements from my code:
// e.g. 25: 831 MB memory commit size ; nodeReplacements: 806328,
valueReplacements: 345001 ; at "[...] total parse & insertion duration."
// 2892 MB memory commit size ; nodeReplacements: 3796756,
valueReplacements: 1103491 ; at "[...] taken to collect 490604 D-proofs of
length 25."
// 27: 2601 MB memory commit size ; nodeReplacements: 2373851,
valueReplacements: 1011151 ; at "[...] total parse & insertion duration."
// 9748 MB memory commit size ; nodeReplacements: 11348932,
valueReplacements: 3265546 ; at "[...] taken to collect 1459555 D-proofs of
length 27."
// 29: 8663 MB memory commit size ; nodeReplacements: 7036815,
valueReplacements: 2986586 ; at "[...] total parse & insertion duration."
// 32190 MB memory commit size ; nodeReplacements: 34154357,
valueReplacements: 9736481 ; at "[...] taken to collect 4375266 D-proofs of
length 29."
For each word length limit, the first of the two measurements is taken
before the collection of candidates of the new word length n (i.e. after
parsing all proofs of lengths 1, 3, ..., n-2), and the second one after
the collection. The node and value replacement counts are just about
saving memory: they state how many formulas (or their values – they
contain shared pointers to strings) have been merged. The first
measurement of the next higher step is again a little lower, due to the
schema filtering that took place in between, but which wouldn't take place
for larger numbers.
So the displayed factors are around 25: 3.480..., 27: 3.747..., and
29: 3.715.... The first is probably not quite representative; I'd say
around 3.73 is a reasonable guess. That makes 32190*3.73^n MB of memory
commit size an estimate for n double-steps above 29, e.g. approximately
120.07 GB for 31, 447.86 GB for 33, 1.67 TB for 35, ..., and 86.69 TB for
41. Going up another 10, you would surpass the few petabytes of RAM of the
most powerful present-day supercomputers.
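For reference, those numbers come from nothing more than this plain
exponential extrapolation (the constants are the ones quoted above):

#include <cmath>
#include <cstdio>

// Memory extrapolation from above: ~32190 MB commit size at length 29,
// assumed growth factor ~3.73 per double-step (n double-steps above 29).
int main() {
    const double baseMB = 32190.0, factor = 3.73;
    for (int n = 1; n <= 6; n++) { // lengths 31, 33, ..., 41
        double mb = baseMB * std::pow(factor, n);
        std::printf("length %d: %.2f GB = %.2f TB\n", 29 + 2 * n, mb / 1e3, mb / 1e6);
    }
}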
Doing similar calculations with time, starting from just hours, using
// e.g. 17: 1631.72 ms ( 1 s 631.72 ms) taken to collect 6649
[...]
// 19: 5883.22 ms ( 5 s 883.22 ms) taken to collect 19416
[...]
// 21: 21007.67 ms ( 21 s 7.67 ms) taken to collect 56321
[...]
// 23: 75916.37 ms ( 1 min 15 s 916.37 ms) taken to collect 165223
[...]
// 25: 268873.93 ms ( 4 min 28 s 873.93 ms) taken to collect 490604
[...]
// 27: 957299.91 ms (15 min 57 s 299.91 ms) taken to collect 1459555
[...]
// 29: 3378071.50 ms (56 min 18 s 71.50 ms) taken to collect 4375266
[...]
as a basis (i.e. factors 25: 3.541..., 27: 3.560..., 29: 3.528...) and
guessing 3.54, we arrive at an estimated duration of only
3378071.50*3.54^n ms per step, e.g. ≈ 3 h 19 min for 31, ≈ 11 h 45 min for
33, ≈ 41 h 37 min for 35, ..., and ≈ 1847 h (or ≈ 76.94 days) for 41
(according to WolframAlpha). But recall that those are durations with only
8 weak hardware threads, whereas a supercomputer has many hundreds of
thousands or even millions of strong ones! So even when going up to word
length 51, you would probably end up at only some hours [e.g.
(3378071.50*3.54^11 ms)/100000 is 10 h 15 min 57.68 s according to
WolframAlpha
<https://www.wolframalpha.com/input?i=%283378071.50*3.54%5E11+ms%29%2F100000>],
even for such long proofs! (Though some extra time – and space – would
probably have to be added for the distributed memory architecture of
supercomputers; I don't know much about that subject.)
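The time estimates follow the same kind of extrapolation; the division by
100000 is just the idealized linear-scaling guess from the WolframAlpha
link above, not a measurement:

#include <cmath>
#include <cstdio>

// Time extrapolation from above: ~3378071.50 ms for the length-29 collection
// step, assumed growth factor ~3.54 per double-step (n double-steps above 29).
int main() {
    const double baseMs = 3378071.50, factor = 3.54;
    for (int n = 1; n <= 11; n++) { // lengths 31, 33, ..., 51
        double hours = baseMs * std::pow(factor, n) / 3.6e6; // ms -> h
        std::printf("length %d: %10.2f h on 8 threads, %8.2f h at 100000x the throughput\n",
                    29 + 2 * n, hours, hours / 100000.0);
    }
}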
> How much more do you think you need in terms of computing resources?
I hope the overview gave a good idea, but it depends entirely on how far
we want to go. :-) We could use all the computing resources in the world
for millennia and most likely still not find shortest proofs for most of
the theorems in pmproofs.txt, at least not knowingly. We would gain more
knowledge about lower bounds, and some surprisingly short proofs might pop
up, but the gain diminishes quickly in comparison to the resources spent
this way. But this is also quite a niche subject, yeah, so I wouldn't be
too optimistic about funding.
To summarize: yes, the data blowup is definitely more of an issue here.
Even on supercomputers, it might be useful to spend more time on schema
filtering up to a certain threshold in order to be able to compute results
for longer proofs at all. In the end, it takes away a good part of the
exponential data blowup.
— Samiro Discher
[email protected] wrote on Friday, October 14, 2022 at 21:08:36 UTC+2:
> This is really cool and impressive work :)
>
> Are there any other optimisations you can think of which might make it
> quicker? Like how close do you think this is to optimal?
>
> Do you think it would help to implement it for gpus? Maybe the data blowup
> is more of a problem, if you already have it for multithread cpu then that
> sounds pretty good.
>
> How much more do you think you need in terms of computing resources? If
> you have the program ready to go it might be nice to apply to some grant
> bodies for money to go further. Though maybe it's of quite niche interest.
>
> Anyway yeah this is all really awesome :) Just wanted to say well done!
>
> On Sunday, October 9, 2022 at 9:03:14 PM UTC+1 Samiro Discher wrote:
>
>> Thanks :-)
>>
>> Yes, the ones for the second proposal are ties, but I wouldn't say it's
>> worth keeping them all since one could generate huge amounts of ties from
>> what my tool can generate and what information such a proof – and the
>> collection as a whole – contains. Except if you mean keeping only both, the
>> historically first discovered and the alphanumerically minimal variants,
>> since they are both "relevant" in a way.
>>
>> Let me explain this and my approach a bit. (Sorry for the incoming wall
>> of text. Feel free to skip at your pace. :-)
>>
>> A parser can quickly extract all subproof strings, e.g.
>> DD2D1DD22D11DD2D112 (length 19) for theorem *1.5 has 7 non-trivial proper
>> subproofs (i.e. excluding axioms and itself): D11, D22, D2D11, DD22D11,
>> DD2D112, D1DD22D11, and D2D1DD22D11. Since every proof is essentially a
>> modus ponens (two inputs and one output), a proof of length *n* ≥ 3
>> (they all have odd lengths) has up to (*n*-3)/2 non-trivial proper
>> subproofs (there may be overlaps).
>>
>> These numbers can grow quite large for longer proofs, and there may be
>> many overlaps, e.g. my 1883-step proof of biass alone has 222 different
>> non-trivial proper subproofs. When counting such, the unified collection
>> (pmproofs-unified.txt) consists of 1374 proofs (which are all listed
>> explicitly in data/pmproofs_1-unified.dpc), and FYI the non-unified
>> collection consists of 1436 proofs (listed explicitly in
>> data/pmproofs_1.dpc).
>> Now each such proof has a final formula (the conclusion), and many of
>> those formulas have several shortest (known) proofs. There may even be
>> cases where both a formula and a different schema of that formula are
>> suitable with both having a minimal proof of the same length.
>>
>> Now, for a proof that uses a certain conclusion *m* times, where that
>> conclusion has *n* shortest (known) proofs, there are *n* to the power
>> of *m* ways to insert proofs of that conclusion into the final proof
>> alone. So we are talking about exponential growth of possibilities
>> based on the number of overlaps and redundant shortest (known) proofs.
>> There is certainly quite a number of other 1883-step proofs of biass that
>> could be deduced from the information that is already in pmproofs.txt. But
>> not in pmproofs-unified.txt since the overlaps have been removed there.
>>
>>
>> Now for the overall approach, my parser creates the list of non-trivial
>> subproofs merged from all proofs, sorts them by their lengths and parses
>> them all to their conclusions.
>> Then I used files I generated which contain proofs of all possible
>> conclusions of proofs up to length 29 to find better variants, using a lot
>> of schema checks (my formulas are actually tree structures). If a
>> conclusion *A* from my files is a schema of a conclusion *B* of
>> pmproofs.txt, all string occurrences of proof(*B*) within pmproofs.txt
>> can simply be replaced by proof(*A*), which keeps all the proofs valid
>> and ideally shortens them.
>>
>> So I collected all the best potential replacements, and then applied them
>> from long to short.
>> Not too interesting I'd say, but rather straightforward (it took me one
>> day when all the basic functionality was already implemented).
>> What really took its time and great effort was the very optimized
>> implementation of the D-proof-to-formula (tree structure) parser and
>> subtree checks, and the generator of all possible outcomes. For instance, I
>> can parse pmproofs.txt in less than 170 ms (to what you see in
>> data/pmproofs_2.dpc) using a single thread of my old Intel Core i7-3770,
>> but I use multiple threads to compare and generate results.
>>
>> The generator of all provable formulas is fully parallelized via Intel's
>> oneTBB library. I essentially implemented a push-down automaton to generate
>> all possible D-proofs of length *n*+2 based on when all of length *n*
>> are already known, then do this step by step (i.e. file by file). But I
>> sorted out all of the redundancies, including proofs with a conclusion for
>> which there is a schema that has a known proof at least as short. So the
>> remaining proofs have a conclusion that represents a whole bunch of other
>> proofs and potentially other formulas. In short, I kept the amount of data
>> as small as possible.
>> Here is the amount of new representative proofs of lengths up to 29:
>> 1 : 3
>> 3 : 6
>> 5 : 12
>> 7 : 38
>> 9 : 89
>> 11 : 229
>> 13 : 672
>> 15 : 1844
>> 17 : 5221
>> 19 : 15275
>> 21 : 44206
>> 23 : 129885
>> 25 : 385789
>> 27 : 1149058
>> 29 : 3449251
>> (5181578 representatives in total)
>>
>> For example, dProofs29.txt is already 103.477529 megabytes (that is
>> 3449251*(29+1)-1 bytes) in size, and all the parsed data from those files,
>> including the formula trees of all of the conclusions, just barely fits
>> into my 32 GiB of DDR memory. What takes a lot of time currently are the
>> schema checks of which there are too many (quadratic complexity; pairwise
>> comparisons only limited by fitting formula lengths), and I am still hoping
>> to be able to come up with a database structure to make this more
>> efficient. (For the 27->29 step, everything else took just under an hour on
>> my PC, but the schema filtering took over 17 days.) The gain of schema
>> filtering is little but still has an exponential effect on subsequent
>> files. (926015 proofs were filtered out from the proofs of length 29 in
>> that manner.) Anyway, a supercomputer could probably produce more shortest
>> results using my HPC-ready program, and I will try to get access to one.
>>
>>
>> I hope this is somewhat readable, if something is still unclear, feel
>> free to ask. And let's hope at least David is still alive .. (´ ∀ ` *)
>>
>> — Samiro Discher
>>
>>
>> Benoit wrote on Sunday, October 9, 2022 at 14:50:47 UTC+2:
>>
>>> Congratulations !
>>>
>>> Since that page of the website is not on git, I think only David can add
>>> your proposals. I checked both using drule.c.
>>>
>>> As for your second proposal, that happened because you found "ties"
>>> among shortest known proofs, is that it ? Maybe it's worth keeping them
>>> all ?
>>>
>>> I'm curious to see that C++ tool when you publicly release it. I don't
>>> know when the number of "21 steps" in pmproofs.txt for exhaustive search
>>> was written, but being able to reach 29 steps, given the exponential nature
>>> of the problem, seems impressive. And also for the non-exhaustive part,
>>> the heuristics you're using are probably interesting.
>>>
>>> Benoît
>>>
>>>
>>> On Thursday, October 6, 2022 at 9:50:28 PM UTC+2 Samiro Discher wrote:
>>>
>>>> I'd like to contribute my findings to
>>>> https://us.metamath.org/mmsolitaire/pmproofs.txt.
>>>>
>>>> I found several (17) shorter proofs. See 'pmproofs.txt' at
>>>> https://github.com/xamidi/mmsolitaire [changes: 2be2349
>>>> <https://github.com/xamidi/mmsolitaire/commit/2be2349a5621eaaee5d5940e98202bd265c9d50a>
>>>> ]:
>>>> *2.62: 89 -> 77
>>>> *3.31: 103 -> 83
>>>> *3.43: 137 -> 117
>>>> *3.44: 203 -> 183
>>>> *4.14: 321 -> 283
>>>> *4.32: 355 -> 317
>>>> *4.39: 609 -> 479
>>>> *4.41: 271 -> 249
>>>> *4.76: 271 -> 249
>>>> *4.77: 375 -> 287
>>>> *4.86: 617 -> 557
>>>> *4.87: 523 -> 447
>>>> *5.31: 119 -> 109
>>>> *5.6: 203 -> 167
>>>> *5.75: 387 -> 351
>>>> meredith: 145 -> 141
>>>> biass: 2127 -> 1883
>>>>
>>>> Furthermore, I modified another 26 proofs such that all subproofs of
>>>> the same formula (ranging over the whole collection) are equal (preferring
>>>> the lexicographically smaller proof strings). See
>>>> 'pmproofs-unified.txt'. Feel free to use its contents if desired. I find
>>>> that property useful since this way, there are no redundant proofs when
>>>> parsing the collection, which saves time and space and determines ideal
>>>> and
>>>> unique references. To showcase this, I uploaded example files of the
>>>> parsed
>>>> collections to https://github.com/xamidi/mmsolitaire/tree/master/data.
>>>> The string "pmproofs_*n*" in a filename indicates that helper proofs
>>>> are referenced for parts that are used in at least *n* different
>>>> proofs.
>>>>
>>>> My findings were assisted by a custom tool I wrote in C++ (still a
>>>> private GitHub repo for assisting my research), and I have – if I didn't
>>>> mess something up – exhaustively searched all proofs up to a length of 29
>>>> steps.
>>>>
>>>> — Samiro Discher
>>>>
>>>> PS: The contact information on the Metamath website and in pmproofs.txt
>>>> is still of Dr. Norman Megill. At first I tried to use it despite him
>>>> being
>>>> deceased, and got no response from anyone.
>>>>
>>>> PPS: Is there a way I can add a non-Google mail address to this mailing
>>>> list?
>>>>
>>>>
>>>>