> On Feb 26, 2023, at 11:04 PM, Mario Carneiro <[email protected]> wrote:
>
> FYI I did a fairly comprehensive HTML modernization in mm-web-rs
> [https://github.com/digama0/mm-web-rs/blob/master/src/render.rs] (which is
> basically a clone of the metamath-exe theorem page generator), which among
> other things lowercased all the tags and made it pass the w3c validator again
> (there is a link at the bottom of most pages but they don't pass more modern
> HTML standards and still use things like <FONT>).
Awesome!
When I spoke I was thinking more about the static HTML files in:
https://github.com/metamath/set.mm
https://github.com/metamath/metamath-website-seed
... but yes, we want to modernize those as well.
--- David A. Wheeler
>
> On Sun, Feb 26, 2023 at 11:52 AM David A. Wheeler <[email protected]>
> wrote:
> I completely agree, a page on AI/ML would be great. The pages could do with a
> refresh anyway, many places suggest contacting Norm.
>
> Norm used an HTML 1.0 convention of using uppercase tags. I suggest that new
> text use lowercase tags as a style choice
> It doesn't matter technically, but it is the current convention, and it would
> make it easier to see which parts are unchanged.
>
> On February 25, 2023 5:33:02 PM EST, Mario Carneiro <[email protected]> wrote:
> A page with short descriptions of papers and projects using Metamath for AI
> would be great to put on the website. I think it would be good publicity for
> both metamath and the AI systems, and maybe we can solicit the paper authors
> to write the descriptions and provide related links as well.
>
> On Sat, Feb 25, 2023 at 5:27 PM Jon P <[email protected]> wrote:
> Re the subject of LLMs being trained on metamath this new model called LLama
> from facebook is interesting. It seems to be smaller than other systems and
> it looks like they're already training on metamath.
>
> https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
>
> It links to this page and this paper which talk about the matheamtical aspects
>
> https://ai.facebook.com/blog/ai-math-theorem-proving/
>
> https://arxiv.org/abs/2205.11491
>
> It's interesting they talk about the "metamath benchmark". Maybe one approach
> is to take that and have a page which presents the metamath database for
> training (kind of like the .txt files or json or whatever is preferred) and
> then has some notes about the papers that have already been published about
> it (gpt-f, LLama) and their results. Might be interesting.
>
> On Sunday, February 19, 2023 at 10:39:55 PM UTC David A. Wheeler wrote:
>
> > On Sun, Feb 19, 2023 at 2:19 PM David A. Wheeler <[email protected]>
> > wrote:
> > My first choice when building a website, where it's easily achieved, is
> > static rendering (statically-generated pages) where the most important data
> > is visible without client-side JavaScript. Those are easily understood,
> > incredibly hard to attack, trivial to configure, trivial to mirror, support
> > security/privacy-conscious users, etc. Some sites *cannot* be developed
> > this way, of course!! But using that as the "starting point" helps clarify
> > "what else do you need?" - that is, it forces a discussion of "what are the
> > unspoken requirements?". It's also where we are in the metamath site.
> >
> > With pre-rendering as a replacement for dynamic server side stuff, we can
> > still do most of the things. For example we could prerender all the json
> > queries and use client side JS to request them when doing cross linking
> > stuff. This is only made difficult if the set of all queries is infinite,
> > for example if we can do a search query.
>
>
> > On Feb 19, 2023, at 3:13 PM, Mario Carneiro <[email protected]> wrote:
> > The main advantages of dynamic rendering are:
> > (1) quicker turnaround time, e.g. if we want to hastily put up a mm100
> > theorem on the website;
> > (2) faster iteration - it is possible to make changes to the server and see
> > the results ~immediately;
> > (3) lower storage cost, since we won't have to generate all the HTML for
> > the web site in advance and just cache important stuff (I assume that most
> > of the work going into the website build is wasted, since no one looks at
> > that page before it is next regenerated);
> > (4) the ability to have many more rendering styles (unicode, gif, text,
> > latex, mathml, json have all been considered and more are possible) and
> > more databases, which is a combinatorial problem for the website build
> > limited by storage space and generation time.
>
> Sure. As always, it's all about trade-offs :-).
>
> I don't know if quicker turnaround time & such is all that important.
> A "local" server run (by, say, a tool that called metamath-knife) would have
> the same
> tooling problems: You'd need to set up additional tools to run the tool
> locally.
>
> Of that list, the ability to have more rendering styles is possibly the most
> compelling of that list.
> But exactly what rendering styles?
>
>
> > > We can obviously use a program to dynamically generate many of the pages
> > > instead, but what would it need to do?
> >
> > For the first version, I would aim simply to replicate the behavior / look
> > and feel of the current metamath web pages. That's what I implemented in
> > https://github.com/digama0/mm-web-rs . Once we have something that can
> > potentially replace the functionality of metamath-exe it will be easier to
> > consider incremental improvements, in an architecture where we can do fancy
> > things if we want.
>
> Okay, but what is desirable that is *different* long term? I think there are
> plausible good answers,
> but it'd be good to have some idea about them.
>
> > One technique that I use in the MM0 web pages is to use the HTML as the
> > data source: put all the proofs in a table like they would be normally, but
> > perhaps not as marked up as they should be. Then postprocess it with JS to
> > add all the markup and make additional server requests if needed. That
> > solves the "make the JS-disabled experience palatable" problem, since you
> > just get the basic version of the page if no JS is running.
>
> Yup! Sounds like a great approach.
>
> > The sort of thing that might require additional data is parsing support;
> > for example with MM0 pages the formulas are printed in ascii but ideally
> > they would reflow using the layout algorithm, using correct indentation.
> > Metamath web pages are horrible at this; they get good per-token rendering
> > but reflowing is garbage so hopefully you just don't have too-long lines.
> > mmj2 has a nice indenting formatter (actually there are several algorithms
> > available); wouldn't it be nice if that could be done in-browser so that it
> > adapts to the viewport width? If the lines are sent in HTML as token
> > strings, then that means that the JS has to do the parsing itself so it
> > needs to know the signatures of declarations like 'wi'.
>
> An alternative is to provide HTML in a table (or some other grid), along with
> something that's simpler to parse but not shown to someone just looking at
> the HTML. It could be in an attribute, something with "display: none",
> whatever. If there's a client-side renderer being used, I would want it to be
> *easy* to write, not make it parse complex HTML. The individual theorems
> aren't big, so it'd be pretty easy to include both in one file.
>
> > Also expensive queries can be hidden behind an expandable drop-down and a
> > server request. We probably want to keep axioms-used and used-by lists
> > enabled by default even though they are somewhat expensive (although if the
> > server has some time to warm up and precompute some data structures then it
> > can cheaply answer these requests), but there are some other requests along
> > the same lines, like "how many steps would this proof be if everything was
> > expanded to axioms" (which is fun if you like to look at numbers in the
> > range of 10^50), which are accessible to metamath-exe but not shown on the
> > web page.
>
> Agreed. You can do the same with pregenerated static files, BTW, where
> searches bring you to a *dynamic* program that does the search & provides the
> results.
>
> An alternative to a lengthy warm-up each time would be to precompute some
> data structures & answers, then store them in a database (probably using
> SQLite, but MySQL or PostgreSQL would work, though perhaps even a simple
> key/value store would be enough). Or do the lengthy warm-up only when that
> .mm has changed, then store that in a database. Then server restarts can be
> fast when the .mm file hasn't changed.
>
> > > I don't know if we need a JSON format for the .mm file as a whole. ...
> >
> > I agree with this. The main use case for JSON is at the per-theorem level;
> > a JSON for the mm file as a whole would be huge and you would be better off
> > parsing MM directly at that point, since it would be more efficient. Some
> > other potential files are table-of-contents like things, which cover the
> > whole mm file but only represent one 'aspect' of the theorems. So an actual
> > table of contents (the section name hierarchy), the theorem pages (i.e.
> > give the statements of theorems 1000-1100), the definition list, the
> > grammar (syntax axioms). The bare list of all theorem names, in order,
> > might also be a useful query - I would just give this in plain text,
> > newline separated. Maybe include the $f/$e/$a/$p classifications too.
>
> --- David A. Wheeler
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Metamath" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/metamath/3fcc0b30-42ad-4bda-9627-73aa59c66fa2n%40googlegroups.com.
>
--
You received this message because you are subscribed to the Google Groups
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/metamath/00387AA9-F563-4312-BB32-A610E1E4AC8C%40dwheeler.com.