Re: [Metamath] Future website directions

David A. Wheeler Mon, 27 Feb 2023 06:51:55 -0800

> On Feb 26, 2023, at 11:04 PM, Mario Carneiro <[email protected]> wrote:
> 
> FYI I did a fairly comprehensive HTML modernization in mm-web-rs 
> [https://github.com/digama0/mm-web-rs/blob/master/src/render.rs] (which is 
> basically a clone of the metamath-exe theorem page generator), which among 
> other things lowercased all the tags and made it pass the w3c validator again 
> (there is a link at the bottom of most pages but they don't pass more modern 
> HTML standards and still use things like <FONT>).

Awesome!

When I spoke I was thinking more about the static HTML files in:
https://github.com/metamath/set.mm
https://github.com/metamath/metamath-website-seed

... but yes, we want to modernize those as well.

--- David A. Wheeler

> 
> On Sun, Feb 26, 2023 at 11:52 AM David A. Wheeler <[email protected]> 
> wrote:
> I completely agree, a page on AI/ML would be great. The pages could do with a 
> refresh anyway, many places suggest contacting Norm.
> 
> Norm used an HTML 1.0 convention of using uppercase tags. I suggest that new 
> text use lowercase tags as a style choice
> It doesn't matter technically, but it is the current convention, and it would 
> make it easier to see which parts are unchanged.
> 
> On February 25, 2023 5:33:02 PM EST, Mario Carneiro <[email protected]> wrote:
> A page with short descriptions of papers and projects using Metamath for AI 
> would be great to put on the website. I think it would be good publicity for 
> both metamath and the AI systems, and maybe we can solicit the paper authors 
> to write the descriptions and provide related links as well.
> 
> On Sat, Feb 25, 2023 at 5:27 PM Jon P <[email protected]> wrote:
> Re the subject of LLMs being trained on metamath this new model called LLama 
> from facebook is interesting. It seems to be smaller than other systems and 
> it looks like they're already training on metamath.
> 
> https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
> 
> It links to this page and this paper which talk about the matheamtical aspects
> 
> https://ai.facebook.com/blog/ai-math-theorem-proving/
> 
> https://arxiv.org/abs/2205.11491
> 
> It's interesting they talk about the "metamath benchmark". Maybe one approach 
> is to take that and have a page which presents the metamath database for 
> training (kind of like the .txt files or json or whatever is preferred) and 
> then has some notes about the papers that have already been published about 
> it (gpt-f, LLama) and their results. Might be interesting.
> 
> On Sunday, February 19, 2023 at 10:39:55 PM UTC David A. Wheeler wrote:
> 
> > On Sun, Feb 19, 2023 at 2:19 PM David A. Wheeler <[email protected]> 
> > wrote: 
> > My first choice when building a website, where it's easily achieved, is 
> > static rendering (statically-generated pages) where the most important data 
> > is visible without client-side JavaScript. Those are easily understood, 
> > incredibly hard to attack, trivial to configure, trivial to mirror, support 
> > security/privacy-conscious users, etc. Some sites *cannot* be developed 
> > this way, of course!! But using that as the "starting point" helps clarify 
> > "what else do you need?" - that is, it forces a discussion of "what are the 
> > unspoken requirements?". It's also where we are in the metamath site. 
> > 
> > With pre-rendering as a replacement for dynamic server side stuff, we can 
> > still do most of the things. For example we could prerender all the json 
> > queries and use client side JS to request them when doing cross linking 
> > stuff. This is only made difficult if the set of all queries is infinite, 
> > for example if we can do a search query. 
> 
> 
> > On Feb 19, 2023, at 3:13 PM, Mario Carneiro <[email protected]> wrote: 
> > The main advantages of dynamic rendering are: 
> > (1) quicker turnaround time, e.g. if we want to hastily put up a mm100 
> > theorem on the website; 
> > (2) faster iteration - it is possible to make changes to the server and see 
> > the results ~immediately; 
> > (3) lower storage cost, since we won't have to generate all the HTML for 
> > the web site in advance and just cache important stuff (I assume that most 
> > of the work going into the website build is wasted, since no one looks at 
> > that page before it is next regenerated); 
> > (4) the ability to have many more rendering styles (unicode, gif, text, 
> > latex, mathml, json have all been considered and more are possible) and 
> > more databases, which is a combinatorial problem for the website build 
> > limited by storage space and generation time. 
> 
> Sure. As always, it's all about trade-offs :-). 
> 
> I don't know if quicker turnaround time & such is all that important. 
> A "local" server run (by, say, a tool that called metamath-knife) would have 
> the same 
> tooling problems: You'd need to set up additional tools to run the tool 
> locally. 
> 
> Of that list, the ability to have more rendering styles is possibly the most 
> compelling of that list. 
> But exactly what rendering styles? 
> 
> 
> > > We can obviously use a program to dynamically generate many of the pages 
> > > instead, but what would it need to do? 
> > 
> > For the first version, I would aim simply to replicate the behavior / look 
> > and feel of the current metamath web pages. That's what I implemented in 
> > https://github.com/digama0/mm-web-rs . Once we have something that can 
> > potentially replace the functionality of metamath-exe it will be easier to 
> > consider incremental improvements, in an architecture where we can do fancy 
> > things if we want. 
> 
> Okay, but what is desirable that is *different* long term? I think there are 
> plausible good answers, 
> but it'd be good to have some idea about them. 
> 
> > One technique that I use in the MM0 web pages is to use the HTML as the 
> > data source: put all the proofs in a table like they would be normally, but 
> > perhaps not as marked up as they should be. Then postprocess it with JS to 
> > add all the markup and make additional server requests if needed. That 
> > solves the "make the JS-disabled experience palatable" problem, since you 
> > just get the basic version of the page if no JS is running. 
> 
> Yup! Sounds like a great approach. 
> 
> > The sort of thing that might require additional data is parsing support; 
> > for example with MM0 pages the formulas are printed in ascii but ideally 
> > they would reflow using the layout algorithm, using correct indentation. 
> > Metamath web pages are horrible at this; they get good per-token rendering 
> > but reflowing is garbage so hopefully you just don't have too-long lines. 
> > mmj2 has a nice indenting formatter (actually there are several algorithms 
> > available); wouldn't it be nice if that could be done in-browser so that it 
> > adapts to the viewport width? If the lines are sent in HTML as token 
> > strings, then that means that the JS has to do the parsing itself so it 
> > needs to know the signatures of declarations like 'wi'. 
> 
> An alternative is to provide HTML in a table (or some other grid), along with 
> something that's simpler to parse but not shown to someone just looking at 
> the HTML. It could be in an attribute, something with "display: none", 
> whatever. If there's a client-side renderer being used, I would want it to be 
> *easy* to write, not make it parse complex HTML. The individual theorems 
> aren't big, so it'd be pretty easy to include both in one file. 
> 
> > Also expensive queries can be hidden behind an expandable drop-down and a 
> > server request. We probably want to keep axioms-used and used-by lists 
> > enabled by default even though they are somewhat expensive (although if the 
> > server has some time to warm up and precompute some data structures then it 
> > can cheaply answer these requests), but there are some other requests along 
> > the same lines, like "how many steps would this proof be if everything was 
> > expanded to axioms" (which is fun if you like to look at numbers in the 
> > range of 10^50), which are accessible to metamath-exe but not shown on the 
> > web page. 
> 
> Agreed. You can do the same with pregenerated static files, BTW, where 
> searches bring you to a *dynamic* program that does the search & provides the 
> results. 
> 
> An alternative to a lengthy warm-up each time would be to precompute some 
> data structures & answers, then store them in a database (probably using 
> SQLite, but MySQL or PostgreSQL would work, though perhaps even a simple 
> key/value store would be enough). Or do the lengthy warm-up only when that 
> .mm has changed, then store that in a database. Then server restarts can be 
> fast when the .mm file hasn't changed. 
> 
> > > I don't know if we need a JSON format for the .mm file as a whole. ... 
> > 
> > I agree with this. The main use case for JSON is at the per-theorem level; 
> > a JSON for the mm file as a whole would be huge and you would be better off 
> > parsing MM directly at that point, since it would be more efficient. Some 
> > other potential files are table-of-contents like things, which cover the 
> > whole mm file but only represent one 'aspect' of the theorems. So an actual 
> > table of contents (the section name hierarchy), the theorem pages (i.e. 
> > give the statements of theorems 1000-1100), the definition list, the 
> > grammar (syntax axioms). The bare list of all theorem names, in order, 
> > might also be a useful query - I would just give this in plain text, 
> > newline separated. Maybe include the $f/$e/$a/$p classifications too. 
> 
> --- David A. Wheeler 
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Metamath" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/metamath/3fcc0b30-42ad-4bda-9627-73aa59c66fa2n%40googlegroups.com.
> 

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/00387AA9-F563-4312-BB32-A610E1E4AC8C%40dwheeler.com.
Re: [Metamath] Future website directions

Reply via email to