Summarising (and elaborating on) some points from a discussion Michael and I 
had after my seminar yesterday.

Michael was of the opinion that everything would be much easier for me if I 
just used STeX.

I had certain reservations. If the aim was strictly for me to write math papers 
where the semantic content[*] was machine-readable, then indeed STeX would be 
better for this than the openmathcd package I've been working on, but that's 
not the main use-case for me at the moment. Also, many of the problems I talked 
about have to do with mathematical structures that in formalised logic would be 
part of a well-formed formula but in traditional publications are written out in 
words, often spanning several sentences. It's not clear to me that STeX helps 
with that (but I could be convinced otherwise).

Rather, I see OM primarily as a tool for letting computers speak math. There is 
a population of people who write programs for carrying out some advanced 
calculation, because it's simply too massive to do by hand. If the result is 
just a list of numbers then exporting the result is straightforward, but if the 
result of the calculation is rather some formula then things get messier. I've 
written such programs encoding their results as Maple input, as Mathematica 
input, as LaTeX, as XHTML, and probably other things still — it's quirky, 
fragile, and not at all pretty. A great thing about OM is that it provides 
exactly the level of detail these people need for their primary output: only 
the semantics, none of that messy prettyprinting or presentation.
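
To make the "only the semantics" point concrete, here is a minimal sketch of 
what such an export could look like in Python, using nothing but the standard 
library. The nested-tuple input format and the function names encode and 
to_omobj are assumptions made up for this illustration; only the element names 
(OMOBJ, OMA, OMS, OMV, OMI) and the arith1 symbols come from OpenMath itself.

```python
import xml.etree.ElementTree as ET

OM_NS = "http://www.openmath.org/OpenMath"


def encode(expr):
    """Encode an expression as an OpenMath XML element.

    Hypothetical input format (an assumption of this sketch):
      int                      -> OMI (integer)
      str                      -> OMV (variable)
      ((cd, name), arg, ...)   -> OMA (symbol applied to arguments)
    """
    if isinstance(expr, int):
        node = ET.Element("OMI")
        node.text = str(expr)
        return node
    if isinstance(expr, str):
        return ET.Element("OMV", name=expr)
    (cd, name), *args = expr
    node = ET.Element("OMA")
    ET.SubElement(node, "OMS", cd=cd, name=name)
    for arg in args:
        node.append(encode(arg))
    return node


def to_omobj(expr):
    """Wrap the encoded expression in an OMOBJ and serialise it to a string."""
    root = ET.Element("OMOBJ", xmlns=OM_NS, version="2.0")
    root.append(encode(expr))
    return ET.tostring(root, encoding="unicode")


# x^2 + 1, written with symbols from the arith1 content dictionary
print(to_omobj((("arith1", "plus"), (("arith1", "power"), "x", 2), 1)))
```

Note that the program never decides how the formula should look on paper; it 
only states which symbols are applied to which arguments.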

A further selling point for us would be Open Research Data — increasingly it's 
not sufficient to just publish a paper, you must also make your research data 
available. If your research data consist of big symbolic expressions, then how 
can you make sure people will be able to parse them? You use OpenMath!

One stumbling block is finding all the symbols you need. Michael claimed 
writing new content dictionaries is easy in STeX, and that he himself has 
written hundreds. But if they don't show up in the big list on 
www.openmath.org, then how is anyone outside the Kohlhase academic lineage to 
know?

Another stumbling block is notation. It's nice that your program outputs 
results that are future-proof and unambiguously machine-readable, but you 
probably would like to read them yourself as well. That means you need to 
generate a presentation, and you may want to control the notation used in that. 
How does one do that? Last time I looked into the matter, there wasn't much of 
a system (apart from the collection of XSLTs on www.openmath.org that generate 
MathML from OMOBJs in our CDs, and it's not clear to me if that is meant to be 
adaptable).

Michael claims there is not one system, but two: an STeX one, and an MMT one. 
He expresses regret that there is no common standard (but he only has so many 
hours in a day).

From my perspective, it is news that there even IS some system — presuming that 
it is generic and comprehensible — and I consider standardisation to be a minor 
issue (that may well solve itself, by some system emerging as a de facto 
standard). I would LOVE to see a talk on how it works and how one uses it to 
this end: given an OMOBJ and some source of notation specifications, generate a 
presentation of that OMOBJ in some *ML or LaTeX.
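
To illustrate the kind of pipeline meant here, the following is a toy sketch 
in Python, and explicitly NOT a description of how the STeX or MMT systems 
work. The notation table format (a dict mapping (cd, name) pairs to LaTeX 
templates) and the function name render are assumptions invented for this 
example. The sketch only handles OMI, OMV and OMA with an OMS head, and it 
ignores operator precedence, so it never inserts brackets.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.openmath.org/OpenMath}"

# Hypothetical notation specifications: (cd, name) -> LaTeX template,
# where {0}, {1}, ... stand for the rendered arguments.
NOTATIONS = {
    ("arith1", "plus"): "{0} + {1}",
    ("arith1", "times"): "{0} \\cdot {1}",
    ("arith1", "power"): "{{{0}}}^{{{1}}}",
    ("arith1", "divide"): "\\frac{{{0}}}{{{1}}}",
}


def render(node, notations):
    """Render a parsed OpenMath XML element as a LaTeX string."""
    tag = node.tag.replace(NS, "")
    if tag == "OMOBJ":
        return render(node[0], notations)
    if tag == "OMI":
        return node.text.strip()
    if tag == "OMV":
        return node.get("name")
    if tag == "OMA":
        head, *args = list(node)
        if head.tag.replace(NS, "") != "OMS":
            raise ValueError("this sketch only supports OMS heads")
        key = (head.get("cd"), head.get("name"))
        rendered = [render(arg, notations) for arg in args]
        if key in notations:
            return notations[key].format(*rendered)
        # No notation known: fall back to generic function-call syntax.
        return "\\mathrm{%s}(%s)" % (key[1], ", ".join(rendered))
    raise ValueError("unsupported element: " + tag)


SOURCE = (
    '<OMOBJ xmlns="http://www.openmath.org/OpenMath" version="2.0">'
    '<OMA><OMS cd="arith1" name="plus"/>'
    '<OMA><OMS cd="arith1" name="power"/><OMV name="x"/><OMI>2</OMI></OMA>'
    '<OMI>1</OMI></OMA></OMOBJ>'
)
print(render(ET.fromstring(SOURCE), NOTATIONS))
```

Changing the notation then means editing the table, not the program that 
produced the OMOBJ. A real system would also have to deal with bracketing, 
bindings, and symbols with a variable number of arguments.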

This needn't even be a drain on valuable professor time, but could be handled 
as an exercise (in preparing and performing academic presentations) for some 
student who is already familiar with one system: either do a Zoom 
talk/tutorial, or produce a write-up for posting on www.openmath.org.

Lars Hellström


[*] As a side remark, I note that there are developments in public 
administration vis-a-vis accessibility that over time could grow into 
requirements that formulae in official documents must have machine-readable 
semantic content. I'm extrapolating, but last month when the Swedish national 
library requested comments on some future guidelines for open access, they were 
very explicit that replies in PDF had to be "accessible PDF" (which after some 
unravelling turned out to mean tagged PDF, to aid speech synthesis). I can 
certainly picture bureaucrats imposing similar requirements on grant proposals.


_______________________________________________
Om mailing list
Om@openmath.org
https://mailman.openmath.org/cgi-bin/mailman/listinfo/om
