Unfortunately I found bugs in my tool which I mentioned in my previous post. I tested it by generating a table proof, then generating a compressed proof from the table proof obtained on the previous step, and again generating a second table proof from the compressed proof obtained on the previous step. Ideally those two table proofs should be identical, but there are discrepancies for some assertions. I will try to fix bugs and let you know once it is ready (if this topic is still relevant at that time :) ). Sorry for posting before validation.
- Igor On Wednesday, February 15, 2023 at 7:04:49 PM UTC+1 David A. Wheeler wrote: > > 2023-02-14 21:57 GMT-05:00 "Mario Carneiro" <[email protected]>: > > I think a json version of the web pages would make a lot more sense than > a > > plain text version, if the intent is to support machine consumption by > > tools that can't parse metamath already. There are even uses for that in > > the web pages themselves, since an interactive version of the web pages > > would like to have the underlying data in a more easily processed form so > > that it can do things with it. In fact, we could even eliminate the > > mpeuni/mpegif pages entirely and replace them with a on-the-spot client > > side renderer which uses the json file as input along with some global > > formatting data (i.e. a json version of the $t comment), which would > > significantly decrease the space cost of the web site without changing > the > > user experience at all. > > I can see value in generating JSON formats, and it's wouldn't be that hard. > > I do see big downsides for the use cases listed here though: > 1. For training a GPT, I fear that the system would focus on "it's a JSON > file!" > and any tool-specific training would be swamped by thinking it's the same > as other JSON files. > I think a GPT system that trains on the "Internet as a whole" is more > likely to > generate some useful information if we minimize distractors. > In short, make it REALLY OBVIOUS that this text is regular in some way, yet > clearly different from anything else, and expose structure where we > reasonably can. > > 2. For rendering on the client side, there are big advantages to showing > static data > as straight HTML. For one, people who disable JavaScript can still see it > fine. > Sure, sometimes it's useful to run arbitrary code sent by strangers, but > there are good > reasons to minimize that as a requirement :-). It also means that search > engines see > "what we actually have". > > That said, I can see JSON being quite useful, but that still leaves > generating a GPT-friendly > text fiole as useful. > > --- David A. Wheeler -- You received this message because you are subscribed to the Google Groups "Metamath" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/metamath/670013b1-6268-4ecf-886c-e169b81d3559n%40googlegroups.com.
