Re: [Axiom-developer] Design Thoughts on Semantic Latex (SELATEX)

Tim Daly Thu, 18 Aug 2016 21:46:27 -0700

One of the Axiom project goals is to develop a "Computer Algebra Test
Suite" (CATS). Albert Rich has done this with RUBI and integration. That
work is already partially in the test suite and work has been done on the
pattern matching. Large datasets (like Kamke) are always welcome. Some,
such as Schaums were hand-developed. This is tedious.

As Axiom develops more explanations and documentation it would be
useful to execute the formulas directly so there is a local incentive to be
clear about semantics.

In the long term the hope is that we can just grab formulas directly from
their sources (ala literate programming). Your work makes it plain that raw
latex does not carry sufficient semantics. There is a global question of
how to make this work. Unfortunately a general cross-platform solution
is difficult (cite Dewar/Davenport/et al. for OpenMath).

Since Axiom is literate and extracting formulas is trivial it seems that
literate markup is a natural goal. Since Axiom uses abstract algebra
as a scaffold the type tower already has a lot of axiomatic semantics.
The natural join of literate latex and abstract algebra is clearly
semantic markup, aka selatex.

===========
Consideration 10: semantic->inputform translation (weaver? :-) )

>x and not x   has no particular meaning,  but if x is explicitly true or
false,
>Maxima simplifies it to false.  If SEALATEX has a semantics -- are you
>defining yet another CAS?  Or perhaps you should link it 100% to Axiom's
>semantics, which you presumably know about and can modify.

I am NOT defining another CAS. The goal is a "well-designed hack" using
universally understood latex, a latex package, and a translation program.

The selatex idea is only partially Axiom specific. \INT, for instance,
seems
pretty generic. However, if the idea is to read formulas and disambiguate
a=b (boolean) vs a=b (equation) then the markup needs to be grounded
to have meaning. Axiom's domains (BOOLEAN) and (EQ) as the ground

\BOOLEAN(a=b)
\EQ(a=b)

are unambiguous relative to each other in Axiom. I don't know enough
about Maxima to understand how this might translate.

The extracted formulas with the decorated semantics still needs a
semantics->inputform (weaver) pre-processor which could be Maxima
specific. This would lead to debate about what "equality" means, of course.

Axiom has tried to create a first-order "rosetta stone" to translate between
systems (rosetta.pdf [1]) but it is too shallow to consider providing
cross-platform semantics.

=============
Consideration 11: \scope in selatex

>As far as recording stuff in DLMF -- there are presumably scope issues
>("in this chapter n,m are natural numbers....")  and maybe even a need
>to make value assignments.
>I think you need to model these in SEALATEX too.

(See Consideration 6)

Clearly there are scoping issues. My current thinking is to create a
\scope markup that would manage the environment(s). This is not
a new issue (see "Lisp in Small Pieces" [0])

There seem to be three concerns.

First is the scope name, with something like 'global' as a keyword.
Second is the "closure chain" of other scopes.
Third is the symbol being scoped.

\scope{name}{chain}{symbol}

The weaver program would walk this chain to create the proper
file syntax for system input.

============
Consideration 12: System specific commands \axiom

Along with the formulas it is clear that some system specific
input may be required, such as loading files, clearing workspaces,
etc. Some of these may be done in the weaver program, such as
between formulas. Others may need to be added to the semantics
block. So a markup that provides verbatim quoting per system
might be defined, e.g.

\axiom{)clear all}  %clear the workspace

which would simply quote an input line.

==============

Note that so far all that is being suggested is transparent formula
markups which do not impact the presentation, some special tags
(\scope, \axiom,...) and a weaver program, along with the ability to
read the latex and extract named formulas (aka a literate program,
which Axiom already can do).

It ought to be possible (by design) to create a semantic version of
CRC that any system could import, assuming a "sufficiently clever
weaver".

On a more ambitious note, I am trying to find a way to keep the selatex
markup "hidden" in a pdf and use it as the clipboard paste when the
formula is selected. Anyone with a clue, please help.

===============

[0] Queinnec, Christopher, "Lisp in Small Pieces" ISBN 978-0521545662
(2003)

[1] Wester, Michael J. and Daly, TImothy "Rosetta"
http://axiom-developer.org/axiom-website/rosetta.pdf

On Thu, Aug 18, 2016 at 5:30 PM, Richard Fateman <[email protected]>
wrote:

> thanks for all the references :)
>
> I'm not sure if I'm going to repeat comments I  made already somewhere.
> First,  has Dan Zwillinger weighed in?  I think that it would be useful
> to see what he has done.
>
> Next, there are ambiguities among CAS and even within a single CAS.
>
> For example, in Macsyma/ Maxima  there is generally no semantics
> associated with "=" or ">".   But in some contexts, there is some meaning.
>
> x>2*y
>
> is a tree expression.  It is not associated with x or with y.
>
> assume(x>2*y)   does mean something ... it puts info in a database.
> Somehow encoding the method to extract this information into SEALATEX
> (SeLaTeX?) in a CAS-independent way -- that's quite a task.   In
> particular, it would seem to require an understanding of what assume()
> does in Maxima, and what is() does also.
>
> x and not x   has no particular meaning,  but if x is explicitly true or
> false,
> Maxima simplifies it to false.  If SEALATEX has a semantics -- are you
> defining yet another CAS?  Or perhaps you should link it 100% to Axiom's
> semantics, which you presumably know about and can modify.
>
> As far as recording stuff in DLMF -- there are presumably scope issues
> ("in this chapter n,m are natural numbers....")  and maybe even a need
> to make value assignments.
> I think you need to model these in SEALATEX too.
>
> Just musing about where you are heading.
>
> RJF
>
>
>
>
> On 8/18/2016 11:45 AM, Tim Daly wrote:
>
> Fateman [0] raised a set of issues with the OpenMath
> approach. We are not trying to be cross-platform in this
> effort. Axiom does provide an algebraic scaffold so it is
> possible that the selatex markup might be useful elsewhere
> but that is not a design criterion.
>
> Fateman[1] also raises some difficult cross-platform issues
> that are not part of this design.
>
> Fateman[2] shows that parsing tex with only syntactic markup
> succeeded on only 43% of 10740 inputs. It ought to be posible
> to increase this percentage given proper semantic markup.
> (Perhaps there should be a competition similar to the deep
> learning groups? PhDs have been awarded on incremental
> improvements of the percentage)
>
> This is a design-by-crawl approach to the semantic markup
> idea. The hope is to get something running this week that
> 'works' but giving due consideration to global and long-term
> issues. A first glance at CRC/NIST raises more questions
> than answers as is usual with any research.
>
> It IS a design goal to support a Computer Algebra Test Suite
> (http://axiom-developer.org/axiom-website/CATS). It is very
> tedious to hand construct test suites. It will be even more
> tedious to construct them "second-level" by doing semantic
> markup and then trying to use them as input, but the hope is
> that eventually the CRC/NIST/G&R, etc will eventually be
> published with semantics so computational mathematics can
> stop working from syntax.
>
> ===========
> Consideration 4: I/O transparency
>
> Assume for the moment that we take a latex file containing
> only formulas. We would like to be able to read this file so
> it has computational mathematics (CM) semantics.
>
> It is clear that there needs to be semantic tags that carry the
> information but these tags have to be carefully designed NOT
> to change the syntactic display. They may, as noted before,
> require multiple semantic versions for a single syntax.
>
> It is also clear that we would like to be able to output formulas
> with CM semantics where currently we only output syntax.
>
> ===========
> Consideration 5: I/O isomorphism
>
> An important property of selatex is an isomorphism with
> input/output. Axiom allows output forms to be defined for a
> variety of targets so this does not seem to be a problem. For
> input, however, this means that the reader has to know how
> to expand \INT{3} into the correct domain. This could be done
> with a stand-alone pre-processor from selatex->inputform.
>
> It should be possible to read-then-write an selatex formula,
> or write-then-read an selatex formula with identical semantics.
>
> That might not mean that the I/O is identical though due to
> things like variable ordering, etc.
>
> ===========
> Consideration 6: Latex semantic macros
>
> Semantic markup would be greatly simplified if selatex provided
> a mechanism similar to Axiom's ability to define types "on the fly"
> using either assignment
>
>    TYP:=FRAC(POLY(INT))
>
> or macro form
>
>    TYP ==> FRAC(POLY(INT))
>
> Latex is capable of doing this and selatex should probably include
> a set of pre-defined common markups, such as
>
>   \FRINT ==> \FRAC\INT
>
> ===========
> Consideration 7: selatex \begin{semantic} environment?
>
> Currently Axiom provides a 'chunk' environment which surrounds
> source code. The chunks are named so they can be extracted
> individually or in groups
>
>    \begin{chunk}{a name for the chunk}
>       anything
>    \end{chunk}
>
> We could provide a similar environment for semantics such as
>
>   \begin{semantics}{a name for the block}
>   \end{semantics}
>
> which would provide a way to encapsulate markup and also allow
> a particular block to be extracted in literate programming style.
>
> ===========
> Consideration 8: Latex-time processing
>
> Axiom currently creates specific files using \write to create
> intermediate files (e.g. for tables). This technique can be used
> to enhance latex-time debugging (where did it fail?).
>
> It can be used to create Axiom files which pre-construct domains
> needed when the input file with semantic markup is read.
>
> This would help a stand-alone selatex->inputform preprocessor.
>
> ===========
> Consideration 9: Design sketches
>
> It is all well-and-good to hand-wave at this idea but a large
> amount of this machinery already exists.
>
> It would seem useful to develop an incremental test suite that
> starts with "primitive" domains (e.g. INT), creating selatex I/O.
>
> Once these are in place we could work on "type tower" markup
> such as \FRAC\INT or \POLY\COMPLEX\FLOAT.
>
> Following that might be pre-existing latex functions like \int, \sum,
> \cos, etc.
>
> To validate these ideas Axiom will include an selatex.sty file and
> some unit tests files on primitive domain markup. That should be
> enough to start the bikeshed discussions.
>
> Ideas? Considerations? Suggestions?
>
> Tim
>
> [0] Fateman, Richard J.
> "A Critique of OpenMath and Thoughts on
> Encoding Mathematics, January, 2001"
> https://people.eecs.berkeley.edu/~fateman/papers/openmathcrit.pdf
>
> [1] Fateman, Richard J.
> "Verbs, Nouns, and Computer Algebra, or What's Grammar Got to
> do with Math? ", December 18, 2008
> https://people.eecs.berkeley.edu/~fateman/papers/nounverbmac.pdf
>
> [2] Fateman, Richard J.
> "Parsing TeX into Mathematics",
> https://people.eecs.berkeley.edu/~fateman/papers/parsing_tex.pdf
>
>
>

_______________________________________________
Axiom-developer mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/axiom-developer

Re: [Axiom-developer] Design Thoughts on Semantic Latex (SELATEX)

Reply via email to