William,

It is unlikely that authors will provide a special chunk for Axiom in
papers.

Such an ability already exists but I don't expect anyone will adopt it.
The \usepackage{axiom} and \begin{chunk} / \end{chunk} pair works and is
all that is needed.
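
For reference, the usage is just (a minimal sketch; the chunk name and
the formula are invented for illustration):

  \usepackage{axiom}
  ...
  \begin{chunk}{an example formula}
  integrate(1/(a*x+b),x)
  \end{chunk}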

The primary targets of this effort (although not restricted to them) are the
various collections (NIST/CRC/G&R/Jeffrey/etc.) of formulas. I use these
references to create the computer algebra test suite but it takes months
to do this by hand. I also use them to build regression tests for Axiom.

There have been various attempts to extract semantics from latex. Some
are quite interesting (see http://mathpix.com). Unfortunately, there isn't
enough information in the latex. For instance, are your formulas given
over the real or complex domain?

In the longer term I am campaigning to bend these tomes toward a
more computational mathematics basis. Instead of listing the names of
20 graph invariants we really need reference versions of the
algorithms. And we need them in machine-readable form. And we need
them now so a whole generation of computational mathematicians do
not write yet-another-CAS from scratch.

Tim

On Thu, Aug 25, 2016 at 9:13 AM, William Sit <[email protected]> wrote:

> Hi Tim:
>
>
> Would it be simpler to only add semantic markups to algorithmic
> descriptions in papers? Authors can be asked to provide a separate chunk
> with [Axiom] semantic markups (in essence, a skeleton implementation or
> pseudo-code of the algorithm involved---skeleton because the data
> structures of mathematical objects are usually ignored in a math paper).
> This would avoid having to mess with the latex source (already hard to read
> sometimes) or to "weave" to remove the semantic markups to recapture the
> latex: all that is needed would be to ignore the semantic chunk. Put
> another way, the semantic chunk is a direct (by hand or
> automatic) translation of the latex version of an algorithm chunk.
>
>
> Also, what would be the scope of the added semantic macros in LaTeX (like
> \AD, \INT)? Can their scope be limited only to semantic chunks?
>
>
> William
>
>
> William Sit
> Professor Emeritus
> Department of Mathematics
> The City College of The City University of New York
> New York, NY 10031
> homepage: wsit.ccny.cuny.edu
> ------------------------------
> *From:* Axiom-developer
> <[email protected]> on behalf of
> Tim Daly <[email protected]>
> *Sent:* Thursday, August 25, 2016 6:17 AM
> *To:* Dan Zwillinger
> *Cc:* Richard Fateman; James Davenport; [email protected];
> Mike Dewar; axiom-dev; [email protected]
> *Subject:* Re: [Axiom-developer] Design Thoughts on Semantic Latex
> (SELATEX)
>
> My initial approach was too heavy-handed and Axiom specific.
>
> It seems the semantic markup task can be viewed as an editor
> task. Editors don't care about semantics, they just work on text.
> Viewed this way the markup's only function is decoration for
> post-processing. It is (mostly) system independent.
>
> The edit tasks seem to be {delete, insert, replace} and some
> post-markup hints {function, type}.
>
> The delete markup tells weaver to remove the latex completely.
> This is useful for things like {dx}.
>
> The insert markup adds missing semantic text, such as {*} which
> is needed to indicate multiplication.
>
> The replace markup gives alternate text for weaver, for things where
> the function name might differ, e.g. \int -> integrate.
>
> The function markup names a function that weaver should call
> which allows special handling in post-processing. It can be any
> s-expression (the weaver implementation is currently lisp-based).
>
> The type markup passes type information to weaver so Axiom
> knows the target type, useful for things like matrix.
>
> The macros are
>
> \newcommand{\AD}[1]{#1}% delete
> \newcommand{\AI}[1]{}% insert
> \newcommand{\AR}[2]{#1}% replace
>
> \newcommand{\AF}[2]{#1}% function
> \newcommand{\AT}[2]{#1}% type
>
> Note that \AI outputs nothing, so 3\AI{*}x == 3x to latex.
>
> \AD tells weaver to delete the text, e.g. \AD{~dx} == ~dx to latex.
>
> \AR tells weaver to replace the text, e.g. \AR{\pi}{\%pi}.
>
> \AF tells weaver to call a function, e.g. one that knows how to
> rewrite the input in a special way, or does tracing, etc.
>
> \AT adds target type information for Axiom
> e.g. \AT{3x+6}{POLY(INT)} == 3x+6 to latex but passes it as
> a Polynomial(Integer) to Axiom
>
> For example,
>
> $\int{\frac{dx}{ax+b}}$
>
> becomes
>
> $\AT{\AR{\int}{integrate}{\frac{\AD{dx}}{a\AI{*}x+b}}}{EXPR(INT)}$
>
> telling weaver that the target type (AT) is EXPR(INT),
> that \int is really integrate,
> that the dx is to be ignored, and
> that the ax+b should read a*x+b.
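>
> Presumably weaver would then emit the Axiom inputform (a sketch; the
> exact textual form is the weaver implementation's choice):
>
>   integrate(1/(a*x+b),x)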
>
> There is an obvious tradeoff of markup vs weaver.
> For example, \int might be known to weaver, or expressions
> might call an equation rewriter to add {*}.
>
> The markup could vary from almost nothing to massive detail
> depending on the downstream cleverness.
>
> This initial markup set seems sufficient to handle every task
> that requires semantic markup so far. The overhead seems
> small and the gain seems large.
>
> Now the only problem is post-processing the latex. Sigh.
>
> There is no such thing as a simple job.
>
> Tim
>
>
> On Tue, Aug 23, 2016 at 7:27 PM, Tim Daly <[email protected]> wrote:
>
>> For those of you at home wishing to play along, there is a
>> selatex.test1 file at
>> http://axiom-developer.org/axiom-website/selatex.test1
>> containing 620 integrals.
>>
>> Each line is a call of the form
>>
>>   weaver(latex-string,axiom-string)
>>
>> The goal is to transform the latex into Axiom.
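>>
>> A typical entry might look like this (a hypothetical example, not
>> copied from the file):
>>
>>   weaver("\int{\frac{dx}{a x+b}}", "integrate(1/(a*x+b),x)")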
>>
>> Implicit is the idea that weaver will use the selatex tokens
>> to disambiguate the input. The current file has no selatex
>> tokens. They will be added as needed. The idea is to keep
>> the problem simple by adding print-invisible semantics to the
>> latex-string. In the ideal case the weaver program is trivial,
>> as is the markup. Any tradeoff should prioritize simplicity.
>> Another priority is to align the semantic markup with
>> Axiom domains in order to ground the semantics with code.
>>
>> Once all of these calls translate correctly the Axiom output
>> routines need to output the latex-string with the added
>> semantic markup so the mapping is bi-directional.
>>
>> The current file only looks at integration as I already have
>> the latex->axiom text available. Future test files will look
>> at other areas of interest. The long term goal is to parse
>> NIST/CRC/etc formulas.
>>
>> Tim
>>
>>
>> On Sun, Aug 21, 2016 at 11:02 PM, Tim Daly <[email protected]> wrote:
>>
>>> Dan,
>>>
>>> While paging through the CRC 31st Standard Mathematical
>>> Tables I landed on page 219, section 3.4.1.2 Graph Invariants.
>>>
>>> It would be a vast improvement if there were algorithms
>>> associated with these invariants. Clearly they exist somewhere.
>>>
>>> To "cross the gap" between tables and computational mathematics
>>> it would be valuable to include implementations of these invariants.
>>>
>>> It is hard to walk away from that page. An Axiom implementation
>>> would be fun to write, especially given the next section that lists
>>> different kinds of graphs which, presumably, would all have the
>>> invariants. Even better, the graph algorithms are likely good
>>> candidates for proof technology (ACL2 if done in Lisp, COQ if
>>> done in Spad). Lisp has the advantage of an ANSI standard.
>>>
>>> It seems worthwhile to take sections like this, expand them
>>> across computational and proof tools, and publish them in a
>>> form that is generally useful. It is "nice to know" that a graph
>>> has a radius but it would be even better if I could "just point and
>>> click" to import the algorithm.
>>>
>>> Axiom has been pushing literate programming for years. The
>>> tools exist to "make it so", as the saying goes.
>>>
>>> Tim
>>>
>>>
>>>
>>>
>>> On Sun, Aug 21, 2016 at 10:40 PM, Tim Daly <[email protected]> wrote:
>>>
>>>> Like any research problem it is a struggle to get a useful grip on this.
>>>> Looking at G&R (I just ordered the latest, mine is 4th edition), the
>>>> task quickly gets out of hand.
>>>>
>>>> The CATS tests in the past were created by reading the printed latex
>>>> in various volumes and hand-translating them to Axiom input.
>>>>
>>>> It is not difficult to re-create the latex input for these examples.
>>>> Doing so and combining the results gives a set of examples with
>>>> matching input latex and output Axiom. The homework problem is
>>>> to write the necessary markup and weaver.
>>>>
>>>> It is immediately obvious that this is more challenging than it seems.
>>>> For example, when writing y'(x)=0, Axiom needs y:=operator 'y
>>>> so it knows about the symbol as an operator. This falls under
>>>> "Consideration 12: System Specific Commands"... which implies
>>>> that the latex environment and quoting macros have to be
>>>> implemented. Sigh.
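>>>>
>>>> For instance (a sketch, using the \axiom quoting markup from
>>>> Consideration 12), the needed setup line could ride along as
>>>>
>>>>   \axiom{y := operator 'y}
>>>>
>>>> just ahead of the marked-up formula.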
>>>>
>>>> There is no such thing as a simple job.
>>>>
>>>> Anyway, at least there is a way to make a proof of concept
>>>> prototype that reproduces existing results.
>>>>
>>>> Tim
>>>>
>>>>
>>>> On Sun, Aug 21, 2016 at 4:17 PM, Tim Daly <[email protected]> wrote:
>>>>
>>>>> Dan,
>>>>>
>>>>> Welcome.
>>>>>
>>>>> Re: Howard Cohl. Yes, I'd like an introduction. It seems important to
>>>>> make DLMF, CRC, and other sources contain enough semantics that
>>>>> they can be read by a computer algebra system. There are an
>>>>> enormous number of issues, such as what to do with functions
>>>>> unknown to the CAS, which need to be thought through.
>>>>>
>>>>> I believe that NIST/CRC/G&R collections with semantic markup will
>>>>> have a great normalizing effect on CAS development since it will raise
>>>>> cross-platform questions like "What percent of G&R do you handle?".
>>>>> Albert Rich (RUBI)[0] has been doing this for integration using
>>>>> patterns.
>>>>> This can only benefit computational mathematics in the long term.
>>>>>
>>>>> I've also campaigned for associating algorithms with published tables.
>>>>> It is important in the long term to have reference versions. The ACM
>>>>> used to do this years ago. I'd like to see a Gruntz algorithm for
>>>>> limits
>>>>> where it can be applied, for instance. It would also provide a focus on
>>>>> "missing algorithms" or edge cases. Davenport/Trager/Bronstein
>>>>> algorithms promise a decision procedure but there are no existing
>>>>> complete implementations. The tables could highlight missing cases,
>>>>> giving focus to efforts to complete the procedure.
>>>>>
>>>>> It will also put back-pressure on the tables to define different
>>>>> versions
>>>>> of the same formulas based on domains (C, R, etc).
>>>>>
>>>>>
>>>>> "The GR work was more than I had anticipated"... wins the award for
>>>>> understatement of the decade.
>>>>>
>>>>> The goal of this effort is to make it possible to read those
>>>>> formulas directly into a CAS. Axiom is my primary target but
>>>>> it should be done in a somewhat system agnostic form.
>>>>>
>>>>> I've spent well over a year creating the computer algebra test suite.
>>>>> It would be so much easier and more useful if the original sources
>>>>> could be read directly.
>>>>>
>>>>> I read your paper. There is an interesting mix of syntax and semantics.
>>>>>
>>>>> I guess the difference in this effort is that the semantic markup is
>>>>> intended to be transparent and grounded. The transparent aspect
>>>>> keeps the tables unchanged in print form. The grounded aspect keeps
>>>>> the semantics based on some algorithmic base. This implies, of
>>>>> course, that there IS an algorithmic base which does not yet exist
>>>>> for some of the work.
>>>>>
>>>>> As you can see from the CATS work I've been trying to validate
>>>>> Axiom's results with respect to published texts. This has found both
>>>>> Axiom bugs and misprints in published texts.
>>>>>
>>>>> The Kamke[1] suite was the first effort for differential equations.
>>>>>
>>>>> The Spiegel[2] chapter 14 on indefinite integrals for integration.
>>>>>
>>>>> The von Seggern[3] book on curves and surfaces for graphics.
>>>>>
>>>>> The Legendre and Graziani[4] on Pasta by Design for 3D graphics.
>>>>>
>>>>> The RUBI work on integration.
>>>>>
>>>>> and, currently I'm re-creating the numerics that were lost when NAG
>>>>> released the open source version, leaving me swimming through
>>>>> Luke's[5] Algorithms book.
>>>>>
>>>>> which, to quote a famous phrase "was more than I had anticipated".
>>>>>
>>>>> Your Handbook of Integration[6] has a section on various known
>>>>> "Caveats, How an integration result may be incorrect". This raises the
>>>>> wonderful topic of branch cuts yet again. I did some testing and it
>>>>> seems that Axiom and Mathematica share one set while Maple and
>>>>> Maxima share another.
>>>>>
>>>>> All of which leads to a need to create better reference materials that
>>>>> are generally available (unlike the ACM algorithms for non-paying
>>>>> customers) and directly useful for computational mathematics.
>>>>>
>>>>> The current plan is to take some tables, find or re-create the latex,
>>>>> invent a semantic markup, and then write the "weaver". At this point
>>>>> the research is still at the "proof of concept" stage. (tex formula
>>>>> sources are most welcome).
>>>>>
>>>>> Ultimately I'd really like to see a book of formulas and algorithms
>>>>> that I can just drag-and-drop into Axiom and be able to use them
>>>>> without lifetimes of work.
>>>>>
>>>>> Actually, that's only the penultimate goal. I have augmented
>>>>> Axiom to include proofs (ACL2,COQ) so I'd also like to see proofs,
>>>>> (this IS mathematics, after all) but maybe we'll leave that for
>>>>> next month :-)
>>>>>
>>>>> Tim
>>>>>
>>>>> [0] Rich, Albert "Rule-based Mathematics"
>>>>> http://www.apmaths.uwo.ca/~arich/
>>>>>
>>>>> [1] Kamke, E. "Differentialgleichungen: Lösungsmethoden und Lösungen",
>>>>> Chelsea Publishing Company, New York, 1959
>>>>>
>>>>> [2] Spiegel, Murray R. "Mathematical Handbook", Schaum's Outline
>>>>> Series; McGraw-Hill Book Company 1968
>>>>>
>>>>> [3] von Seggern, David "CRC Standard Curves and Surfaces",
>>>>> CRC Press, 1993 ISBN 0-8493-0196-3
>>>>>
>>>>> [4] Legendre, George L. and Graziani, Stefano "Pasta by Design",
>>>>> Thames and Hudson, 2011
>>>>>
>>>>> [5] Luke, Yudell "Algorithms for the Computation of Mathematical
>>>>> Functions", Academic Press, 1977 ISBN 0-12-459940-6
>>>>>
>>>>> [6] Zwillinger, Daniel "Handbook of Integration" Jones and Bartlett,
>>>>> London, 1992
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Aug 21, 2016 at 10:16 AM, Dan Zwillinger <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> All
>>>>>>
>>>>>> I began reading this topic's emails when they first appeared, and
>>>>>> then fell behind.
>>>>>> Sorry for my late comments.
>>>>>>
>>>>>> I admire your efforts.
>>>>>> They may be somewhat related to what I have done in the past.
>>>>>> My experience is as follows:
>>>>>>
>>>>>> (1) CRC SMTF (Standard Mathematical Tables and Formula)
>>>>>>
>>>>>> I put the ~700 integrals in CRC's SMTF into a format from which
>>>>>> (A) they could be typeset in LaTeX
>>>>>> (B) they could be converted into Mathematica (which either did a
>>>>>> symbolic or numeric computation) - and this was done
>>>>>>
>>>>>> I let Richard Fateman use them for his experiments.
>>>>>>
>>>>>>
>>>>>> (2) Elsevier's GR (Gradshteyn and Ryzhik's "Table of Integrals,
>>>>>> Series, and Products")
>>>>>>
>>>>>> I put the ~12,000 (if my memory is correct) integrals into a format
>>>>>> from which
>>>>>> (A) they could be beautifully typeset in LaTeX
>>>>>> (B) they could be converted into Mathematica - and this was NOT done
>>>>>> Enclosed is a PDF file describing the work and the resulting format.
>>>>>>
>>>>>>
>>>>>> A different format was used for SMTF and GR.
>>>>>> While the SMTF work was not too arduous, the GR work was more than I
>>>>>> had anticipated.
>>>>>> The input (the previous version of GR) had little syntactic structure
>>>>>> and it took much effort to get it into shape.
>>>>>> I used (many different) regular expressions (in perl) to translate
>>>>>> the bulk of the book, and then lots of hand tuning.
>>>>>>
>>>>>> While I think you are well beyond my thinking on these topics, please
>>>>>> let me know if I can help.
>>>>>> I am friends with Howard Cohl at NIST, who may be the current lead
>>>>>> for DLMF ("Digital Library of Mathematical Functions" at
>>>>>> dlmf.nist.gov).
>>>>>> Let me know if you need an introduction.
>>>>>>
>>>>>>
>>>>>> Dan [email protected]
>>>>>>
>>>>>> On 8/20/2016 11:30 PM, Tim Daly wrote:
>>>>>>
>>>>>> The game is to define latex markup that is transparent to the syntax
>>>>>> but adds semantics for post-processing.
>>>>>>
>>>>>> Some simple tests show that this works. Suppose selatex.sty contains:
>>>>>>
>>>>>> \newcommand{\INT}[1]{#1}
>>>>>> \newcommand{\VARIABLE}[1]{#1}
>>>>>> \newcommand{\POLY}[1]{#1}
>>>>>> \newcommand{\INTEG}[2]{\int{#1}}
>>>>>>
>>>>>> This defines 4 new latex markups. The number in the square brackets
>>>>>> defines the number of expected arguments. The brace argument
>>>>>> delimits the characters that will occur during expansion with the #1
>>>>>> replaced by the first argument.
>>>>>>
>>>>>> (As an aside, INT, VARIABLE, and POLY just happen to be valid
>>>>>> Axiom domain abbreviations, hence the name choice. This choice
>>>>>> of names gives grounding to the semantics.)
>>>>>>
>>>>>> Notice that \INTEG takes two arguments but will display only one.
>>>>>> This allows the variable of integration to be passed in the semantics
>>>>>> without showing up in the output. This allows the semantics to carry
>>>>>> additional, non-display information needed by the CAS.
>>>>>>
>>>>>> Some examples follow.
>>>>>>
>>>>>> An integer 3 can be wrapped as \INT{3} but will still display as 3.
>>>>>>
>>>>>> A variable x can be wrapped as \VARIABLE{x}, displayed as x.
>>>>>>
>>>>>> $\POLY{\INT{3}\VARIABLE{x}}$ will display as 3x
>>>>>>
>>>>>> $\INTEG{\POLY{\INT{3}\VARIABLE{x}~dx}}{x}$ will produce the same
>>>>>> result as $\int{3x~dx}$, that is, an
>>>>>>   (integralsign) 3x dx
>>>>>> but notice that the variable of integration is in the semantic markup.
>>>>>>
>>>>>> These trivial macros can be made less verbose and certainly
>>>>>> more clever but that's not the point being made here.
>>>>>>
>>>>>> A 'weaver' program would see the integration expression as
>>>>>>
>>>>>> $\INTEG{\POLY{\INT{3}\VARIABLE{x}~dx}}{x}$
>>>>>>
>>>>>> with all of the semantic tags. The weaver's job is to rewrite this
>>>>>> expression into an inputform for the CAS. In Axiom that would be
>>>>>>
>>>>>> integrate(3*x,x)
>>>>>>
>>>>>> The semantic markup makes the display pretty and the semantic
>>>>>> parsing possible. Depending on the system, more or less parsing
>>>>>> markup can exist. Axiom, for example, would not need the \INT or
>>>>>> \VARIABLE to get a correct parse so the expression could be
>>>>>> $\INTEG{3x~dx}{x}$
>>>>>>
>>>>>> This validates the fundamental idea.
>>>>>>
>>>>>> The next step is to write a simple weaver program. The clever path
>>>>>> would be to embed a declarative form of the parser syntax (BNF?)
>>>>>> as comments in selatex.sty. That way the latex semantics and the
>>>>>> weaver syntax are kept in sync.
>>>>>>
>>>>>> Weaver would read the BNF comments from selatex.sty and
>>>>>> the formula with semantic markup as input and parse the semantic
>>>>>> markup into inputforms. (Wish I thought of this homework problem
>>>>>> when I taught the compiler course :-) ).
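>>>>>>
>>>>>> For instance, selatex.sty entries might pair each macro with its
>>>>>> grammar rule (a hypothetical sketch of the comment form):
>>>>>>
>>>>>> %BNF: INTEG := "\INTEG" "{" expr "}" "{" var "}" => integrate(expr,var)
>>>>>> \newcommand{\INTEG}[2]{\int{#1}}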
>>>>>>
>>>>>> Note that, depending on the BNF, weaver could be used to
>>>>>> generate output for Maxima's tree-based representation.
>>>>>>
>>>>>> An alternative next step is to look at a CRC book, re-create the
>>>>>> syntactic latex and then create the selatex.sty entries necessary
>>>>>> to generate weaver input.
>>>>>>
>>>>>> Infinitesimal progress, but progress nonetheless.
>>>>>>
>>>>>> Tim
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 19, 2016 at 12:45 AM, Tim Daly <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> One of the Axiom project goals is to develop a "Computer Algebra
>>>>>>> Test
>>>>>>> Suite" (CATS). Albert Rich has done this with RUBI and integration.
>>>>>>> That
>>>>>>> work is already partially in the test suite and work has been done
>>>>>>> on the
>>>>>>> pattern matching. Large datasets (like Kamke) are always welcome.
>>>>>>> Some,
>>>>>>> such as Schaum's, were hand-developed. This is tedious.
>>>>>>>
>>>>>>> As Axiom develops more explanations and documentation it would be
>>>>>>> useful to execute the formulas directly so there is a local
>>>>>>> incentive to be
>>>>>>> clear about semantics.
>>>>>>>
>>>>>>> In the long term the hope is that we can just grab formulas directly
>>>>>>> from
>>>>>>> their sources (ala literate programming). Your work makes it plain
>>>>>>> that raw
>>>>>>> latex does not carry sufficient semantics. There is a global
>>>>>>> question of
>>>>>>> how to make this work. Unfortunately a general cross-platform
>>>>>>> solution
>>>>>>> is difficult (cite Dewar/Davenport/et al. for OpenMath).
>>>>>>>
>>>>>>> Since Axiom is literate and extracting formulas is trivial it seems
>>>>>>> that
>>>>>>> literate markup is a natural goal. Since Axiom uses abstract algebra
>>>>>>> as a scaffold the type tower already has a lot of axiomatic
>>>>>>> semantics.
>>>>>>> The natural join of literate latex and abstract algebra is clearly
>>>>>>> semantic markup, aka selatex.
>>>>>>>
>>>>>>> ===========
>>>>>>> Consideration 10: semantic->inputform translation (weaver? :-) )
>>>>>>>
>>>>>>> >x and not x   has no particular meaning,  but if x is explicitly
>>>>>>> true or false,
>>>>>>> >Maxima simplifies it to false.  If SEALATEX has a semantics -- are
>>>>>>> you
>>>>>>> >defining yet another CAS?  Or perhaps you should link it 100% to
>>>>>>> Axiom's
>>>>>>> >semantics, which you presumably know about and can modify.
>>>>>>>
>>>>>>> I am NOT defining another CAS. The goal is a "well-designed hack"
>>>>>>> using
>>>>>>> universally understood latex, a latex package, and a translation
>>>>>>> program.
>>>>>>>
>>>>>>> The selatex idea is only partially Axiom specific. \INT, for
>>>>>>> instance, seems
>>>>>>> pretty generic. However, if the idea is to read formulas and
>>>>>>> disambiguate
>>>>>>> a=b (boolean) vs a=b (equation) then the markup needs to be grounded
>>>>>>> to have meaning. With Axiom's domains (BOOLEAN) and (EQ) as the ground,
>>>>>>>
>>>>>>> \BOOLEAN(a=b)
>>>>>>> \EQ(a=b)
>>>>>>>
>>>>>>> are unambiguous relative to each other in Axiom. I don't know enough
>>>>>>> about Maxima to understand how this might translate.
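>>>>>>>
>>>>>>> As transparent macros these would presumably just be (sketch):
>>>>>>>
>>>>>>> \newcommand{\BOOLEAN}[1]{#1}
>>>>>>> \newcommand{\EQ}[1]{#1}
>>>>>>>
>>>>>>> so both still print as a=b while weaver sees the intended domain.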
>>>>>>>
>>>>>>> The extracted formulas with the decorated semantics still need a
>>>>>>> semantics->inputform (weaver) pre-processor which could be Maxima
>>>>>>> specific. This would lead to debate about what "equality" means, of
>>>>>>> course.
>>>>>>>
>>>>>>> Axiom has tried to create a first-order "rosetta stone" to translate
>>>>>>> between
>>>>>>> systems (rosetta.pdf [1]) but it is too shallow to consider
>>>>>>> providing
>>>>>>> cross-platform semantics.
>>>>>>>
>>>>>>> =============
>>>>>>> Consideration 11: \scope in selatex
>>>>>>>
>>>>>>> >As far as recording stuff in DLMF -- there are presumably scope
>>>>>>> issues
>>>>>>> >("in this chapter n,m are natural numbers....")  and maybe even a
>>>>>>> need
>>>>>>> >to make value assignments.
>>>>>>> >I think you need to model these in SEALATEX too.
>>>>>>>
>>>>>>> (See Consideration 6)
>>>>>>>
>>>>>>> Clearly there are scoping issues. My current thinking is to create a
>>>>>>> \scope markup that would manage the environment(s). This is not
>>>>>>> a new issue (see "Lisp in Small Pieces" [0])
>>>>>>>
>>>>>>> There seem to be three concerns.
>>>>>>>
>>>>>>> First is the scope name, with something like 'global' as a keyword.
>>>>>>> Second is the "closure chain" of other scopes.
>>>>>>> Third is the symbol being scoped.
>>>>>>>
>>>>>>> \scope{name}{chain}{symbol}
>>>>>>>
>>>>>>> The weaver program would walk this chain to create the proper
>>>>>>> file syntax for system input.
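>>>>>>>
>>>>>>> For example (all names invented for illustration):
>>>>>>>
>>>>>>>   \scope{chapter3}{global}{n}% n is a natural number in chapter 3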
>>>>>>>
>>>>>>> ============
>>>>>>> Consideration 12: System specific commands \axiom
>>>>>>>
>>>>>>> Along with the formulas it is clear that some system specific
>>>>>>> input may be required, such as loading files, clearing workspaces,
>>>>>>> etc. Some of these may be done in the weaver program, such as
>>>>>>> between formulas. Others may need to be added to the semantics
>>>>>>> block. So a markup that provides verbatim quoting per system
>>>>>>> might be defined, e.g.
>>>>>>>
>>>>>>> \axiom{)clear all}  %clear the workspace
>>>>>>>
>>>>>>> which would simply quote an input line.
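>>>>>>>
>>>>>>> In latex terms \axiom would presumably be print-invisible, in the
>>>>>>> same style as \AI (a sketch):
>>>>>>>
>>>>>>>   \newcommand{\axiom}[1]{}% weaver quotes #1 verbatim as input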
>>>>>>>
>>>>>>> ==============
>>>>>>>
>>>>>>> Note that so far all that is being suggested is transparent formula
>>>>>>> markups which do not impact the presentation, some special tags
>>>>>>> (\scope, \axiom,...) and a weaver program, along with the ability to
>>>>>>> read the latex and extract named formulas (aka a literate program,
>>>>>>> which Axiom already can do).
>>>>>>>
>>>>>>> It ought to be possible (by design) to create a semantic version of
>>>>>>> CRC that any system could import, assuming a "sufficiently clever
>>>>>>> weaver".
>>>>>>>
>>>>>>> On a more ambitious note, I am trying to find a way to keep the
>>>>>>> selatex
>>>>>>> markup "hidden" in a pdf and use it as the clipboard paste when the
>>>>>>> formula is selected. Anyone with a clue, please help.
>>>>>>>
>>>>>>> ===============
>>>>>>>
>>>>>>> [0] Queinnec, Christian, "Lisp in Small Pieces" ISBN
>>>>>>> 978-0521545662
>>>>>>> (2003)
>>>>>>>
>>>>>>> [1] Wester, Michael J. and Daly, Timothy "Rosetta"
>>>>>>> http://axiom-developer.org/axiom-website/rosetta.pdf
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Aug 18, 2016 at 5:30 PM, Richard Fateman <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> thanks for all the references :)
>>>>>>>>
>>>>>>>> I'm not sure if I'm going to repeat comments I  made already
>>>>>>>> somewhere.
>>>>>>>> First,  has Dan Zwillinger weighed in?  I think that it would be
>>>>>>>> useful
>>>>>>>> to see what he has done.
>>>>>>>>
>>>>>>>> Next, there are ambiguities among CAS and even within a single CAS.
>>>>>>>>
>>>>>>>> For example, in Macsyma/ Maxima  there is generally no semantics
>>>>>>>> associated with "=" or ">".   But in some contexts, there is some
>>>>>>>> meaning.
>>>>>>>>
>>>>>>>> x>2*y
>>>>>>>>
>>>>>>>> is a tree expression.  It is not associated with x or with y.
>>>>>>>>
>>>>>>>> assume(x>2*y)   does mean something ... it puts info in a database.
>>>>>>>> Somehow encoding the method to extract this information into
>>>>>>>> SEALATEX
>>>>>>>> (SeLaTeX?) in a CAS-independent way -- that's quite a task.   In
>>>>>>>> particular, it would seem to require an understanding of what
>>>>>>>> assume()
>>>>>>>> does in Maxima, and what is() does also.
>>>>>>>>
>>>>>>>> x and not x   has no particular meaning,  but if x is explicitly
>>>>>>>> true or false,
>>>>>>>> Maxima simplifies it to false.  If SEALATEX has a semantics -- are
>>>>>>>> you
>>>>>>>> defining yet another CAS?  Or perhaps you should link it 100% to
>>>>>>>> Axiom's
>>>>>>>> semantics, which you presumably know about and can modify.
>>>>>>>>
>>>>>>>> As far as recording stuff in DLMF -- there are presumably scope
>>>>>>>> issues
>>>>>>>> ("in this chapter n,m are natural numbers....")  and maybe even a
>>>>>>>> need
>>>>>>>> to make value assignments.
>>>>>>>> I think you need to model these in SEALATEX too.
>>>>>>>>
>>>>>>>> Just musing about where you are heading.
>>>>>>>>
>>>>>>>> RJF
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/18/2016 11:45 AM, Tim Daly wrote:
>>>>>>>>
>>>>>>>> Fateman [0] raised a set of issues with the OpenMath
>>>>>>>> approach. We are not trying to be cross-platform in this
>>>>>>>> effort. Axiom does provide an algebraic scaffold so it is
>>>>>>>> possible that the selatex markup might be useful elsewhere
>>>>>>>> but that is not a design criterion.
>>>>>>>>
>>>>>>>> Fateman[1] also raises some difficult cross-platform issues
>>>>>>>> that are not part of this design.
>>>>>>>>
>>>>>>>> Fateman[2] shows that parsing tex with only syntactic markup
>>>>>>>> succeeded on only 43% of 10740 inputs. It ought to be possible
>>>>>>>> to increase this percentage given proper semantic markup.
>>>>>>>> (Perhaps there should be a competition similar to the deep
>>>>>>>> learning groups? PhDs have been awarded on incremental
>>>>>>>> improvements of the percentage)
>>>>>>>>
>>>>>>>> This is a design-by-crawl approach to the semantic markup
>>>>>>>> idea. The hope is to get something running this week that
>>>>>>>> 'works' but giving due consideration to global and long-term
>>>>>>>> issues. A first glance at CRC/NIST raises more questions
>>>>>>>> than answers as is usual with any research.
>>>>>>>>
>>>>>>>> It IS a design goal to support a Computer Algebra Test Suite
>>>>>>>> (http://axiom-developer.org/axiom-website/CATS). It is very
>>>>>>>> tedious to hand construct test suites. It will be even more
>>>>>>>> tedious to construct them "second-level" by doing semantic
>>>>>>>> markup and then trying to use them as input, but the hope is
>>>>>>>> that the CRC/NIST/G&R, etc. will eventually be
>>>>>>>> published with semantics so computational mathematics can
>>>>>>>> stop working from syntax.
>>>>>>>>
>>>>>>>> ===========
>>>>>>>> Consideration 4: I/O transparency
>>>>>>>>
>>>>>>>> Assume for the moment that we take a latex file containing
>>>>>>>> only formulas. We would like to be able to read this file so
>>>>>>>> it has computational mathematics (CM) semantics.
>>>>>>>>
>>>>>>>> It is clear that there needs to be semantic tags that carry the
>>>>>>>> information but these tags have to be carefully designed NOT
>>>>>>>> to change the syntactic display. They may, as noted before,
>>>>>>>> require multiple semantic versions for a single syntax.
>>>>>>>>
>>>>>>>> It is also clear that we would like to be able to output formulas
>>>>>>>> with CM semantics where currently we only output syntax.
>>>>>>>>
>>>>>>>> ===========
>>>>>>>> Consideration 5: I/O isomorphism
>>>>>>>>
>>>>>>>> An important property of selatex is an isomorphism with
>>>>>>>> input/output. Axiom allows output forms to be defined for a
>>>>>>>> variety of targets so this does not seem to be a problem. For
>>>>>>>> input, however, this means that the reader has to know how
>>>>>>>> to expand \INT{3} into the correct domain. This could be done
>>>>>>>> with a stand-alone pre-processor from selatex->inputform.
>>>>>>>>
>>>>>>>> It should be possible to read-then-write an selatex formula,
>>>>>>>> or write-then-read an selatex formula with identical semantics.
>>>>>>>>
>>>>>>>> That might not mean that the I/O is identical, though, due to
>>>>>>>> things like variable ordering, etc.
>>>>>>>>
>>>>>>>> ===========
>>>>>>>> Consideration 6: Latex semantic macros
>>>>>>>>
>>>>>>>> Semantic markup would be greatly simplified if selatex provided
>>>>>>>> a mechanism similar to Axiom's ability to define types "on the fly"
>>>>>>>> using either assignment
>>>>>>>>
>>>>>>>>    TYP:=FRAC(POLY(INT))
>>>>>>>>
>>>>>>>> or macro form
>>>>>>>>
>>>>>>>>    TYP ==> FRAC(POLY(INT))
>>>>>>>>
>>>>>>>> Latex is capable of doing this and selatex should probably include
>>>>>>>> a set of pre-defined common markups, such as
>>>>>>>>
>>>>>>>>   \FRINT ==> \FRAC\INT
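>>>>>>>>
>>>>>>>> In latex that might be written as (sketch; weaver reads \FRINT
>>>>>>>> as FRAC(INT)):
>>>>>>>>
>>>>>>>>   \newcommand{\FRINT}[1]{#1}% FRAC(INT)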
>>>>>>>>
>>>>>>>> ===========
>>>>>>>> Consideration 7: selatex \begin{semantic} environment?
>>>>>>>>
>>>>>>>> Currently Axiom provides a 'chunk' environment which surrounds
>>>>>>>> source code. The chunks are named so they can be extracted
>>>>>>>> individually or in groups
>>>>>>>>
>>>>>>>>    \begin{chunk}{a name for the chunk}
>>>>>>>>       anything
>>>>>>>>    \end{chunk}
>>>>>>>>
>>>>>>>> We could provide a similar environment for semantics such as
>>>>>>>>
>>>>>>>>   \begin{semantics}{a name for the block}
>>>>>>>>   \end{semantics}
>>>>>>>>
>>>>>>>> which would provide a way to encapsulate markup and also allow
>>>>>>>> a particular block to be extracted in literate programming style.
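>>>>>>>>
>>>>>>>> A minimal definition might be (sketch; the name argument exists
>>>>>>>> only so the extraction tools can find the block):
>>>>>>>>
>>>>>>>>   \newenvironment{semantics}[1]{}{}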
>>>>>>>>
>>>>>>>> ===========
>>>>>>>> Consideration 8: Latex-time processing
>>>>>>>>
>>>>>>>> Axiom currently creates specific files using \write to create
>>>>>>>> intermediate files (e.g. for tables). This technique can be used
>>>>>>>> to enhance latex-time debugging (where did it fail?).
>>>>>>>>
>>>>>>>> It can be used to create Axiom files which pre-construct domains
>>>>>>>> needed when the input file with semantic markup is read.
>>>>>>>>
>>>>>>>> This would help a stand-alone selatex->inputform preprocessor.
>>>>>>>>
>>>>>>>> ===========
>>>>>>>> Consideration 9: Design sketches
>>>>>>>>
>>>>>>>> It is all well and good to hand-wave at this idea but a large
>>>>>>>> amount of this machinery already exists.
>>>>>>>>
>>>>>>>> It would seem useful to develop an incremental test suite that
>>>>>>>> starts with "primitive" domains (e.g. INT), creating selatex I/O.
>>>>>>>>
>>>>>>>> Once these are in place we could work on "type tower" markup
>>>>>>>> such as \FRAC\INT or \POLY\COMPLEX\FLOAT.
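>>>>>>>>
>>>>>>>> e.g. a marked-up fraction of polynomials (sketch):
>>>>>>>>
>>>>>>>>   $\FRAC{\frac{\POLY{\INT{3}\VARIABLE{x}}}{\POLY{\VARIABLE{x}+\INT{1}}}}$
>>>>>>>>
>>>>>>>> which prints as an ordinary fraction but carries FRAC(POLY(INT))
>>>>>>>> for weaver.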
>>>>>>>>
>>>>>>>> Following that might be pre-existing latex functions like \int,
>>>>>>>> \sum,
>>>>>>>> \cos, etc.
>>>>>>>>
>>>>>>>> To validate these ideas Axiom will include an selatex.sty file and
>>>>>>>> some unit tests files on primitive domain markup. That should be
>>>>>>>> enough to start the bikeshed discussions.
>>>>>>>>
>>>>>>>> Ideas? Considerations? Suggestions?
>>>>>>>>
>>>>>>>> Tim
>>>>>>>>
>>>>>>>> [0] Fateman, Richard J.
>>>>>>>> "A Critique of OpenMath and Thoughts on
>>>>>>>> Encoding Mathematics, January, 2001"
>>>>>>>> https://people.eecs.berkeley.edu/~fateman/papers/openmathcrit.pdf
>>>>>>>>
>>>>>>>> [1] Fateman, Richard J.
>>>>>>>> "Verbs, Nouns, and Computer Algebra, or What's Grammar Got to
>>>>>>>> do with Math? ", December 18, 2008
>>>>>>>> https://people.eecs.berkeley.edu/~fateman/papers/nounverbmac.pdf
>>>>>>>>
>>>>>>>> [2] Fateman, Richard J.
>>>>>>>> "Parsing TeX into Mathematics",
>>>>>>>> https://people.eecs.berkeley.edu/~fateman/papers/parsing_tex.pdf
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
Axiom-developer mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/axiom-developer