Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On 26 February 2011 04:32, rjf fate...@gmail.com wrote:

On Feb 25, 4:28 pm, David Kirkby david.kir...@onetel.net wrote:

Of course, creating the BNF is a non-trivial task, but it seems the descriptions of most languages don't actually include a BNF.

I think you can find a formal grammar for almost every computer programming language except for Mathematica, which presumably has a grammar but it is secret. Look at it this way: with such a formal description it is relatively easy to write a parser, and to be assured that the parser corresponds to the grammar, which a computer scientist would use to help design a language. Most people would run the grammar through a parser generator and eliminate constructions that are not consistent with the needs of the parser generator. It cleans up ambiguities, among other things. Since you were planning to write a parser, why not write the grammar first...

I was hoping someone might have done it, as the task seems non-trivial. Did you write one for MockMMA, and if so are you willing to share it?

I would have expected that in your perusal of compiler books you would get this idea.

Yes, I did, though getting it from reading the Mathematica docs is a lot more difficult than generating one for a language one chose to design oneself.

Perhaps one can be found for C, but it's not in the K&R book.

C Programming Language (2nd Edition) (Paperback) by Brian W. Kernighan, Dennis M. Ritchie page 234 et seq

I overlooked that. Thank you for the correction.

Dave

--
To post to this group, send an email to sage-devel@googlegroups.com
To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/sage-devel
URL: http://www.sagemath.org
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 26, 5:37 am, David Kirkby david.kir...@onetel.net wrote:

I was hoping someone might have done it, [Write a grammar for Mathematica] as the task seems non-trivial. Did you write one for MockMMA, and if so are you willing to share it?

1. I (with a student) looked at the task and wrote a partial grammar, but despaired of completing it as we found too many ad hoc patches. Now it may be that someone has devised a complete and accurate grammar, especially one suited to some automated parser generator, e.g. LALR(1) or LL(1) or GLR, but we were not clever enough to find a neat one. I believe, and this is a contentious point, that it is likely for someone to think he has a complete and accurate grammar, even if it is not complete or accurate. People found errors in my parser (since corrected) after several years. There may still be errors. And now there are deliberate differences because I found more convenient notation, e.g. for comparisons.

2. The material in the file http://www.cs.berkeley.edu/~fateman/lisp/mma4max/parser.lisp is source code for a parser. The parser is a recursive descent parser with programs named in a fairly uniform fashion from which one can generate a grammar, e.g. parse-dot, parse-power. But the actions associated with some of these reduction routines are nasty, and in some cases depend critically on material which should have been abstracted out of the grammar into the lexical phase. You (or anyone) are welcome to read up on recursive descent parsing and run the algorithm for generating an R.D. parser in reverse, and generate a grammar. But it won't be complete and accurate because the component subroutines do screwy things, like looking ahead for the occasional odd character -- not token, but character. Now it may be possible to do all this with a parser; indeed it most assuredly IS possible, but the obvious grammar might be quite huge. A cleverly-encoded grammar might be smaller, but I think it would not be trivial to do that. Among other things it would have to take into account that blank space is sometimes a token, but sometimes not.

3. I am of course perfectly willing to share the code. It has, in fact, even been used in a proposal to the NSF under some small business initiation grant. I had nothing to do with the proposal, and the proposers did not seek my permission (not that they needed it). I'm sure that this discussion of the Mathematica syntax could be placed in a more accessible location, but who knows.

Here's a puzzle. a.b.c is parsed as Dot[a,b,c] or in Lisp, (Dot a b c). 1 . 2 . 3 is parsed as Dot[1,2,3], and is displayed by Mathematica as 1.2.3 ... Observe the white-space or the lack of it around the dots. But if you type in 1.2.3 what do you think you get? And what about 1. 2.3 ?

For those who do not have Mathematica handy, 1.2.3 comes out as 0.36. And if you type it into Wolfram Alpha you get 6. If you type it into MockMMA, you get 6/5.3 (that is, (Dot 6/5 3)). Note that 1.2 is converted to 6/5. What do you suppose Ira's parser produces? It is not the first case in which the person who wrote the display for Mathematica gives a result which is apparently at odds with the input syntax.

[the C grammar] I overlooked that. Thank you for the correction.

You are welcome.

I think the important point I might re-emphasize is this: Parsing Mathematica syntax with a Sage back end does not get you very far. Any non-trivial piece of Mathematica will require a Mathematica evaluator with matching, binding of variables, etc. While parsing Sin[x] to get sin(x) is trivial, it is mostly pointless. You might think that converting Integrate[f,x] into integrate(f,x) gets you somewhere, but the accessible syntax for translating f into Sage (or Maxima) consists of +,*,/, [,], and the typical functions like sin, cos, tan. Why would someone write in Mathematica syntax in the first place? Wolfram Alpha seems to not expect Mathematica syntax -- its parser seems to disagree.
RJF
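A note on the 1.2.3 puzzle: the 0.36 result is consistent with a longest-match ("maximal munch") tokenizer that reads 1.2 and .3 as two adjacent real literals, which are then multiplied (1.2 * 0.3 = 0.36). The sketch below only illustrates that idea in Python. The token classes and the tokenize function are hypothetical and are not taken from Mathematica, MockMMA, or DMS, and the sketch says nothing about how any of those tools actually lex numbers.

```python
import re

# Hypothetical token classes for illustration only.
# The "num" alternative is tried first and prefers the longest real literal.
TOKEN = re.compile(r"""
    (?P<num>\d+\.\d*|\.\d+|\d+)
  | (?P<ident>[A-Za-z][A-Za-z0-9]*)
  | (?P<dot>\.)
  | (?P<ws>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "ws":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for source in ["a.b.c", "1 . 2 . 3", "1.2.3", "1. 2.3"]:
        print(repr(source), [lexeme for _, lexeme in tokenize(source)])
```

Under these assumptions, a.b.c and 1 . 2 . 3 both yield separate dot tokens (so a Dot expression is possible), while 1.2.3 yields just the two numbers 1.2 and .3 with no dot token at all. White space changes the token stream, which is the point of the puzzle.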
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
It was late and I transcribed the wrong shell output. Here is the parse of Fateman's example.

C:\DMS\Domains\Mathematica\Tools\Parser\Source>type C:\DMS\Domains\Mathematica\Examples\Fateman.m
r[s[]]

C:\DMS\Domains\Mathematica\Tools\Parser\Source>run ../domainparser ++AST C:\DMS\Domains\Mathematica\Examples\Fateman.m
Domain Parser for Mathematica 2.3.3
Copyright (C) Semantic Designs 1996-2010; All Rights Reserved
25 tree nodes in tree.
(Mathematica@Mathematica=1#481c6a0^0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
 (Commands@Mathematica=3#481c3e0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
  (Commands@Mathematica=3#481c660 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   (Commands@Mathematica=2#4819dc0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m)Commands
   (Command@Mathematica=5#481ca20 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |(ExpressionSequence@Mathematica=17#481c840 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   | (Rule@Mathematica=29#481c5a0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |  (Disjunction@Mathematica=34#481cae0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   (Conjunction@Mathematica=36#481c920 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |(EqualitySequence@Mathematica=38#481c6e0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   | (Sum@Mathematica=56#481c680 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |  (Primary@Mathematica=108#481c5c0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   (QualifiedIdentifier@Mathematica=203#4819e20 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   |(IDENTIFIER@Mathematica=206#4819da0[`r'] Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m)IDENTIFIER
   |   |   )QualifiedIdentifier
   |   |   (ExpressionSequence@Mathematica=17#481c940 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   |(Rule@Mathematica=29#481c2e0 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   | (Disjunction@Mathematica=34#481c4c0 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   |  (Conjunction@Mathematica=36#481c780 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   |   (EqualitySequence@Mathematica=38#4819f40 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   |   |(Sum@Mathematica=56#481c340 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   |   | (Primary@Mathematica=108#481c700 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   |   |  (QualifiedIdentifier@Mathematica=203#481c1c0 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m
   |   |   |   |   (IDENTIFIER@Mathematica=206#481c220[`s'] Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/Fateman.m)IDENTIFIER
   |   |   |   |  )QualifiedIdentifier
   |   |   |   |  (ExpressionsStar@Mathematica=177#481c760 Line 1 Column 5 File C:/DMS/Domains/Mathematica/Examples/Fateman.m)ExpressionsStar
   |   |   |   | )Primary
   |   |   |   |)Sum
   |   |   |   )EqualitySequence
   |   |   |  )Conjunction
   |   |   | )Disjunction
   |   |   |)Rule
   |   |   )ExpressionSequence
   |   |  )Primary
   |   | )Sum
   |   |)EqualitySequence
   |   )Conjunction
   |  )Disjunction
   | )Rule
   |)ExpressionSequence
   )Command
  )Commands
  (Command@Mathematica=4#481c980 Line 2 Column 1 File C:/DMS/Domains/Mathematica/Examples/Fateman.m)Command
 )Commands
)Mathematica
Exiting with final status 0

For good measure, here's a larger file without the parse tree dump:

C:\DMS\Domains\Mathematica\Tools\Parser\Source>wc C:\DMS\Domains\Mathematica\Examples\dominators.m
1359 6375 71215 C:\DMS\Domains\Mathematica\Examples\dominators.m

C:\DMS\Domains\Mathematica\Tools\Parser\Source>run ../domainparser C:\DMS\Domains\Mathematica\Examples\dominators.m
Domain Parser for Mathematica 2.3.3
Copyright (C) Semantic Designs 1996-2010; All Rights Reserved
18748 tree nodes in tree.
Exiting with final status 0
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On 25 February 2011 05:57, Ira Baxter idbax...@semdesigns.com wrote:

Mathematica is otherwise not hard to parse, and you don't need a hand-written parser to do it. Ira D. Baxter, CTO Semantic Designs, Inc.

Thank you Ira for clarifying this. (For the record, I contacted Ira off-list and asked him to respond to Richard Fateman's comments on sage-devel.)

Dave
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 25, 3:05 am, David Kirkby david.kir...@onetel.net wrote:

On 25 February 2011 05:57, Ira Baxter idbax...@semdesigns.com wrote:

Mathematica is otherwise not hard to parse, and you don't need a hand-written parser to do it. Ira D. Baxter, CTO Semantic Designs, Inc.

Part of this discussion started because Dave suggested that writing and maintaining a hand-written parser was harder than a parser-generator one, and consequently that Wolfram probably didn't write a parser by hand.

Writing a top-down recursive descent parser isn't terribly hard for a lot of languages; I've done many this way. Mathematica in fact is probably singularly easy, because of its Lisp-like syntax. Lisp parsers have been pretty much hand-rolled from day one because of this rather simple notation with convenient brackets to guide the parser as to when to push, and when to pop.

The rumor that I heard was that the early versions of MMa were based in Fortran. My guess is that Wolfram did write a recursive descent parser by hand, because you can do that easily even in Fortran. I would further guess there isn't/wasn't any great reason to change it.

Final remark: the effort to write the parser is tiny compared to the effort to build the rest of any interesting system attached to a parser. The same is true for maintenance costs.

Ira Baxter, CTO Semantic Designs
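To make the "brackets tell the parser when to push and when to pop" remark concrete, here is a minimal Python sketch of a stack-based parser for fully bracketed, FullForm-style input such as Dot[a, b, c] or r[s[]]. It is an illustration only: the function name parse_fullform and the nested-list output format are hypothetical, it handles just names, integers, square brackets and commas, and it makes no attempt at Mathematica's infix operators or the [[ ]] Part notation. Commas are skipped rather than validated.

```python
import re

# One token per match: a name, an integer, or one of "[", "]", ",".
TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9]*|\d+|[\[\],])")


def parse_fullform(text):
    """Parse e.g. 'f[a, g[b]]' into nested lists: ['f', 'a', ['g', 'b']]."""
    stack = [[]]  # stack[-1] collects the items of the innermost open bracket
    pos = 0
    text = text.strip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tok, pos = m.group(1), m.end()
        if tok == "[":
            # In a nested frame, element 0 is that frame's own head,
            # so it cannot be reused as the head of a new call.
            reserved = 0 if len(stack) == 1 else 1
            if len(stack[-1]) <= reserved:
                raise SyntaxError("'[' must follow a head expression")
            head = stack[-1].pop()
            stack.append([head])        # push: open a new argument list
        elif tok == "]":
            if len(stack) == 1:
                raise SyntaxError("unbalanced ']'")
            finished = stack.pop()      # pop: the call is complete
            stack[-1].append(finished)
        elif tok != ",":
            stack[-1].append(tok)       # a name or an integer
    if len(stack) != 1 or len(stack[0]) != 1:
        raise SyntaxError("unbalanced '[' or more than one expression")
    return stack[0][0]


if __name__ == "__main__":
    print(parse_fullform("r[s[]]"))
    print(parse_fullform("Dot[a, b, c]"))
```

The explicit stack is a design choice made to mirror the push/pop wording; the same input could equally be handled by a few mutually recursive functions.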
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On 25 February 2011 10:48, Ira Baxter idbax...@semdesigns.com wrote:

On Feb 25, 3:05 am, David Kirkby david.kir...@onetel.net wrote:

On 25 February 2011 05:57, Ira Baxter idbax...@semdesigns.com wrote:

Mathematica is otherwise not hard to parse, and you don't need a hand-written parser to do it. Ira D. Baxter, CTO Semantic Designs, Inc.

Part of this discussion started because Dave suggested that writing and maintaining a hand-written parser was harder than a parser-generator one,

That's the impression I've got from spending a bit of time on reading compiler books. I personally don't have a computer science background, so have very little knowledge of this area, whereas clearly this is one of your areas of expertise.

and consequently that Wolfram probably didn't write a parser by hand.

Wolfram is a bright guy, so I concluded (perhaps incorrectly) that he would have done it the easiest way possible. I got the impression that would have been to use pre-written tools for the lexical analysis and parsing, rather than hand-craft them, which I understood had high maintenance costs.

Writing a top-down recursive descent parser isn't terribly hard for a lot of languages; I've done many this way. Mathematica in fact is probably singularly easy, because of its Lisp-like syntax. Lisp parsers have been pretty much hand-rolled from day one because of this rather simple notation with convenient brackets to guide the parser as to when to push, and when to pop.

Richard Fateman has written a parser for Mathematica in Lisp http://www.eecs.berkeley.edu/~fateman/papers/lmath.ps I was interested in his comments in the section "Lexical Analysis and Parsing" and in how accurate they are. His code is public http://www.cs.berkeley.edu/~fateman/mma1.6/ I personally find Richard can be helpful, but some of his comments (like those directed at you and your company) are somewhat less than helpful. He wrote a review of Mathematica some time ago. http://www.math.bme.hu/~jtoth/FelsMma/mma.review.pdf

The rumor that I heard was that the early versions of MMa were based in Fortran. My guess is that Wolfram did write a recursive descent parser by hand, because you can do that easily even in Fortran.

Would you mind answering the following question, and I would not blame you if you chose not to! If someone wanted to write an open-source, cross-platform parser for Mathematica, without using commercial tools like that produced by your company, what approach (or approaches) would you consider sensible ones? Would this be easier in Lisp than in say C or Python?

I would further guess there isn't/wasn't any great reason to change it.

That makes sense.

Final remark: the effort to write the parser is tiny compared to the effort to build the rest of any interesting system attached to a parser. The same is true for maintenance costs.

Yes, that's clear. But Sage does have a reasonable subset (and in some cases superset) of Mathematica, so hooking up a Mathematica parser to Sage (which is basically Python-like) could have some uses and would not involve writing all of Mathematica. However, the approach of an open-source multi-platform clone of Mathematica, not using Sage (but some components of it, such as MPIR, MPFR etc), would be an interesting open-source project.

Ira Baxter, CTO Semantic Designs

Anyway, thank you Ira. You have convinced me that Richard's numbers 1 and 3 were wrong.

Dave
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 25, 6:55 am, David Kirkby david.kir...@onetel.net wrote:

On 25 February 2011 10:48, Ira Baxter idbax...@semdesigns.com wrote:

On Feb 25, 3:05 am, David Kirkby david.kir...@onetel.net wrote:

On 25 February 2011 05:57, Ira Baxter idbax...@semdesigns.com wrote:

Mathematica is otherwise not hard to parse, and you don't need a hand-written parser to do it.

Part of this discussion started because Dave suggested that writing and maintaining a hand-written parser was harder than a parser-generator one,

That's the impression I've got from spending a bit of time on reading compiler books. I personally don't have a computer science background, so have very little knowledge of this area, whereas clearly this is one of your areas of expertise.

Your impression is right. Many languages can be parsed with top-down parsers (in fact, with enough hacking, you can probably do them all). But many are hard and that's the point of a parser generator. Your original point, and the one we follow, is that if you have a big nasty parser generator, you should use it for everything including the easy stuff.

Wolfram is a bright guy, so I concluded (perhaps incorrectly) that he would have done it the easiest way possible. I got the impression that would have been to use pre-written tools for the lexical analysis and parsing, rather than hand-craft them, which I understood had high maintenance costs.

Easy is relative. Using a parser generator isn't as easy if you are coding in Fortran (probably not terribly hard, either, but if you're coding everything in Fortran, that thought may not cross your mind). And if you happen to have a top-down easy-to-parse language, you don't need to. I suspect (but I haven't asked Wolfram) this is what happened. It makes perfect sense to me.

Richard Fateman has written a parser for Mathematica in Lisp

No surprise, and no surprise it is hand-written. LISPers do a lot of that.

Would you mind answering the following question, and I would not blame you if you chose not to! If someone wanted to write an open-source, cross-platform parser for Mathematica, without using commercial tools like that produced by your company, what approach (or approaches) would you consider sensible ones? Would this be easier in Lisp than in say C or Python?

I think you are asking the wrong question. Building the parser, however done, is such a small part of the overall work that goes into a system that it really doesn't matter how you do it. Nor would it be easier or harder in interesting ways in any language you choose; you can code a parser by hand in all these languages, and if you look, you can find parser generators for all of them.

The easiest thing to do is write it using a parser generator. Richard has made some noise about MMa not being LALR(1), which would kill most conventional parser generators, but I'm not sure I agree with that analysis. Our GLR parser generator happens to analyze its grammar as to whether it is LALR or not; for the particular grammar I have, it says it isn't LALR-like in a few places. This is pretty typical of a simple grammar when a parser engineer first crashes his grammar against an LALR parser generator like YACC. There's fiddly stuff and black magic you can do to a grammar that usually fixes that kind of stuff, and there's always resorting to hacking (Richard's "lexer asks the parser for help" is one of them). People succeed with standard parser generators by spending energy doing this, and this is why parser generators are widely accepted: you can get them to work. Based on our parser analyzer output, I suspect an LALR version wouldn't be difficult. We didn't bother with an LALR version, because we have the GLR version, and I have no hacks in the parser, which made this grammar singularly easy to build.

Because it's an easy grammar, you could quite reasonably do the top-down version. For instance, you could simply steal, uh, reuse Richard's code as a direct design if he coded it cleanly.

The always hard part is, what grammar will you use? Wolfram doesn't publish one. I invented ours based on my long experience and lots of staring at the MMa reference manuals. If you are an expert at building parsers, and understood MMa really well, my guess is you could build a MMa parser (either way) in a few weeks. Probably what Richard took.

However, the approach of an open-source multi-platform clone of Mathematica, not using Sage (but some components of it, such as MPIR, MPFR etc), would be an interesting open-source project.

That's a matter of personal taste. If you want a hobby, this is a fine proposal. Our customer base is composed of people that want to get on with the larger task; the JPL guys took our MMa parser because it meant they could concentrate on a code analysis/code generation/some MMa generation task immediately. And that's what DMS with the predefined base of language parsers for lots of (hard-to-parse) languages, and
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 24, 9:57 pm, Ira Baxter idbax...@semdesigns.com wrote:

On Feb 23, 11:37 am, rjf fate...@gmail.com wrote:

On Feb 23, 9:17 am, Dr. David Kirkby david.kir...@onetel.net wrote:

On 02/22/11 10:57 PM, Dr. David Kirkby wrote:

On 02/22/11 03:49 PM, rjf wrote: [snip].

The real difficulty is to implement a Mathematica language parser, since the language fails to fit the standard expectations for computer languages.

It does? It is a context-free language, therefore parsable by any parser capable of parsing context-free languages.

The traditional efficient and easily automated version of syntax-directed compiling separates lexical analysis and parsing into two stages. Lexical analysis, or the collection of characters into tokens, can generally be done with a simple technique based on finite-state machines. Read about, for example, the program Lex or flex: http://en.wikipedia.org/wiki/Flex_lexical_analyser The generation of a parser that operates on a stream of tokens is well-known and can be written by hand or through the use of a tool like YACC or Bison http://en.wikipedia.org/wiki/Yacc or a little googling can show you both, working together.

The separation of stages vastly shrinks the size of the parser description, arguably speeds up the processing (speed depends on lots of things), and makes programming and debugging simpler. If the grammar that is used fits one of the favorite sub-categories of context-free languages (LALR(1), LL(1)) the parser is going to be fast relative to the length of the sentence or program being parsed. Like linear in the length, e.g. O(n).

Now it is possible to have a grammar that is not in one of these categories, and even one that is ambiguous. And it is possible to write a parser, e.g. using something like the technology used by Semantic Designs. While this can be handy to demonstrate, the parser may take considerably more time. I think it may be more like O(n^3) for the hard parts, or worse if it is an ambiguous grammar.
Since computers are so fast, maybe it doesn't matter much anymore. When a program seems sluggish people probably blame it on their network bandwidth or the fact that their computer is checking for updates in their mailbox. etc. I know you said that, but I've heard different from another source. See http://groups.google.com/group/comp.compilers/msg/8c4e6ccad3c40599 The person there, who is the CTO of a company producing this http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html which has an option for a Mathematica parser. It does. He says Mathematica is not a particularly difficult language to parse, and a GLR parser is a bit over the top. It isn't, and GLR is (capable of parsing a context free language) but AFAICT, isn't really needed to parse MMa. For most of the language it is not. The only problems are the places where extra information on the context of the parse position is needed to determine how to proceed. Here you can see a Mathematica parser is listed for the DMS toolkit http://www.semanticdesigns.com/Products/FrontEnds/index.html?Home=DMS... So I don't know what to believe Richard. You are saying the Mathematica language can't be parsed with a conventional parser, so (you?) had to hand-write the parser for MockMMA, Our parser for MMa consists of a relatively conventional lexical definition for tokens, and a very straightforward grammar for the language itself. Relatively conventional? Not conventional. Very straightforward? Well, how do you know it is correct? WRI does not publish a grammar. You could publish it, if you chose to, though you might treat that as proprietary. yet someone from a commercial company selling this DMS toolkit claims the language is not particularly difficult to parse, and have a front end for their toolkit (a GLR parser) able to parse Mathematica. Here are my suggestions: 1. The guy is lying. He doesn't really have a Mathematica parser that works. Hmph. 
For your example r[s[]] below, which you claim is *so* hard to parse, Well, it is not *so* hard to parse; I even suggest how to do it. I also provided a parser to do it. It is just one of several glitches. here's the output of DMS parsing it using our Mathematica grammar: C:\DMS\Domains\Mathematica\Tools\Parser\Sourcerun ../domainparser + snipped long and not particularly illuminating example... Exiting with final status 0 If that is supposed to be the useful output of DMS, then the tool looks rather difficult to use. I suppose one could generate such an output from my parser by essentially tracing all programs that do reductions. A parse tree dump?? I think this is better.. r[[s[]]] -- (Part r (s )) Yes, it parses much bigger, much more complex examples. JPL has used it internally. I would be fascinated to learn why. If JPL has a Mathematica license, then the parsing can be done without
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 25, 1:05 am, David Kirkby david.kir...@onetel.net wrote:

On 25 February 2011 05:57, Ira Baxter idbax...@semdesigns.com wrote:

Mathematica is otherwise not hard to parse, and you don't need a hand-written parser to do it.

No, you don't. It is sufficient to mess with the grammar and augments to do essentially any program at all. It is a matter of convenience in programming (which is partly a matter of taste), and partly a question of quality. I have noticed, for example, that my parser gives different and sometimes better syntax error messages than WRI's.

Ira D. Baxter, CTO Semantic Designs, Inc.

Thank you Ira for clarifying this. (For the record, I contacted Ira off-list and asked him to respond to Richard Fateman's comments on sage-devel.) Dave

Thanks for contacting Ira. Readers of this list will have to make their own judgment as to whether Ira's claims are reliable or not. If he wishes to (for example) provide a web site where one can enter text alleged to be Mathematica code, and which will translate it into (say) Lisp, that would be interesting. Of course if the back end were actually Mathematica, this would be simple, so it could be faked. So maybe it's not really a good test. I also suspect that the Sage list and his customer base have a zero intersection.
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 25, 2:48 am, Ira Baxter idbax...@semdesigns.com wrote:

On Feb 25, 3:05 am, David Kirkby david.kir...@onetel.net wrote:

On 25 February 2011 05:57, Ira Baxter idbax...@semdesigns.com wrote:

Mathematica is otherwise not hard to parse, and you don't need a hand-written parser to do it. Ira D. Baxter, CTO Semantic Designs, Inc.

Part of this discussion started because Dave suggested that writing and maintaining a hand-written parser was harder than a parser-generator one, and consequently that Wolfram probably didn't write a parser by hand.

Unless a language changes, it is not often that one is compelled to change a parser. There are lots of reasons not to change a language. Which is harder: changing a hand-written parser (typically the only plausible technique is recursive descent) vs. changing the syntax table and augments for a (say) LALR or perhaps GLR parser? Why Dave should have an educated opinion on this, who knows. He claims to not be an expert, and I accept that assessment.

Writing a top-down recursive descent parser isn't terribly hard for a lot of languages; I've done many this way. Mathematica in fact is probably singularly easy, because of its Lisp-like syntax.

I suspect you do not mean syntax here, unless you mean FullForm.

Lisp parsers have been pretty much hand-rolled from day one because of this rather simple notation with convenient brackets to guide the parser as to when to push, and when to pop.

Um, there are a bunch of things wrong with this simple sentence.

1. Programs that read lisp are called readers, not parsers. One can write a lisp reader for traditional lisp symbolic expressions in a few lines of code.

2. That code would not include any instructions like push or pop. It would be a set of mutually recursive programs to read atoms and to read lists or dotted pairs.

3. Common Lisp has a highly customizable reader, where programs can be attached to characters. This is fairly hairy.

The rumor that I heard was that the early versions of MMa were based in Fortran. My guess is that Wolfram did write a recursive descent parser by hand, because you can do that easily even in Fortran.

I am unaware of such rumors, but it sounds pretty doubtful. Wolfram and friends wrote SMP at Caltech. I think that Wolfram was able to write programs in C.

I would further guess there isn't/wasn't any great reason to change it. Final remark: the effort to write the parser is tiny compared to the effort to build the rest of any interesting system attached to a parser. The same is true for maintenance costs.

I agree. For example, I find that merely parsing the language is a lisp program of about 1,300 lines. The pattern matcher is about that size, as is the display program. Doing polynomial and rational arithmetic is again comparable in size to the parser.

RJF
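Points 1 and 2 above describe a reader as a few mutually recursive routines with no explicit push or pop. A rough Python rendering of that idea follows, as a sketch only: the names read_sexpr, read_form, read_list and read_atom are hypothetical, atoms are limited to integers and bare symbols, and dotted pairs, strings, quoting and the customizable reader macros of Common Lisp are all left out.

```python
def read_sexpr(text):
    """Read one symbolic expression, e.g. '(Dot a b c)' -> ['Dot', 'a', 'b', 'c']."""
    # Crude tokenization: parentheses are always separate tokens.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    if not tokens:
        raise SyntaxError("empty input")
    form, i = read_form(tokens, 0)
    if i != len(tokens):
        raise SyntaxError("extra input after the first form")
    return form


def read_form(tokens, i):
    """Dispatch on the next token: a list if it is '(', otherwise an atom."""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[i] == "(":
        return read_list(tokens, i + 1)
    if tokens[i] == ")":
        raise SyntaxError("unexpected ')'")
    return read_atom(tokens, i)


def read_list(tokens, i):
    """Read forms until the matching ')'; recursion handles the nesting."""
    items = []
    while True:
        if i >= len(tokens):
            raise SyntaxError("missing ')'")
        if tokens[i] == ")":
            return items, i + 1
        item, i = read_form(tokens, i)
        items.append(item)


def read_atom(tokens, i):
    """Integers become int; everything else is kept as a symbol string."""
    tok = tokens[i]
    try:
        return int(tok), i + 1
    except ValueError:
        return tok, i + 1


if __name__ == "__main__":
    print(read_sexpr("(Dot a b c)"))
    print(read_sexpr("(Part r (s))"))
```

The call stack of the host language does the bookkeeping here, which is why no push or pop appears in the code itself.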
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 25, 10:36 am, rjf fate...@gmail.com wrote:

On Feb 24, 9:57 pm, Ira Baxter idbax...@semdesigns.com wrote:

On Feb 23, 11:37 am, rjf fate...@gmail.com wrote:

On Feb 23, 9:17 am, Dr. David Kirkby david.kir...@onetel.net wrote:

On 02/22/11 10:57 PM, Dr. David Kirkby wrote:

On 02/22/11 03:49 PM, rjf wrote: [snip].

The real difficulty is to implement a Mathematica language parser, since the language fails to fit the standard expectations for computer languages.

yet someone from a commercial company selling this DMS toolkit claims the language is not particularly difficult to parse, and have a front end for their toolkit (a GLR parser) able to parse Mathematica.

Here are my suggestions: 1. The guy is lying. He doesn't really have a Mathematica parser that works.

It is clear that Fateman, with no evidence, insulted our ability to parse Mathematica, as well as our character. It should be clear that we can parse it just fine. You can decide about our character. Mr. Fateman does not apparently understand what we do with our tools, or how they work. Given the way this conversation started, and the way he is continuing, I see no good reason why I should explain any of this to him.

-- IDB
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On 25 February 2011 21:36, Ira Baxter idbax...@semdesigns.com wrote:

Here are my suggestions: 1. The guy is lying. He doesn't really have a Mathematica parser that works.

It is clear that Fateman with no evidence insulted our ability to parse Mathematica, as well as our character. It should be clear that we can parse it just fine. You can decide about our character. Mr. Fateman does not apparently understand what we do with our tools, or how they work. Given the way this conversation started, and the way he is continuing, I see no good reason why I should explain any of this to him. -- IDB

I think Richard Fateman was probably born spiteful and did a degree in spitefulness[1]. Although Richard has never tried Sage, he sometimes does have some useful contributions to the sage-devel list, even though that might be hard for you to believe given his attitude towards you and your company.

I do agree with Richard that getting the input to FullForm[] is useful. The format of your output is hard to understand, but I've not taken the time to read your web site in great detail. I still don't understand all the issues around parsing Mathematica, but you have given me good reason to believe some of Richard's comments on this topic may be inaccurate.

I did have a look at Fateman's Lisp code, but there is no README file explaining how to use it. This is despite Richard's paper Software Fault Prevention by Language Choice: Why C is Not My Favorite Language http://www.cs.berkeley.edu/~fateman/papers/software.pdf saying The program should be written so that it that can be modified, extended, or re-used in the future by the original author or others. To most people, including a README file with the source code would help in this matter. I tried to use Richard's code with the ECL Lisp interpreter but that failed. I was then told the Allegro Lisp interpreter would work, which I've not tried as that is a commercial product and I don't have it.

Do you think lex and yacc could be suitably employed for the task? These are quite nice in that they are included in most operating systems (there are versions for Windows, Linux, Solaris, AIX, HP-UX ... etc etc). If I recall correctly, Richard stated they would not be suitable, but I don't trust his judgment on this issue.

Despite the impression you might get, the sage-devel list is usually quite well behaved and professional. Of course we get conflicts sometimes (I've had a few with people myself), but generally there is respect for each other's opinions.

Dave

[1] Adapted from a comment made by someone in the procurement department at University College London about his colleague who administered the Mathematica site license. He was born awkward and did a degree in awkwardness.
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 25, 1:36 pm, Ira Baxter idbax...@semdesigns.com wrote:

It is clear that Fateman with no evidence insulted our ability to parse Mathematica, as well as our character. It should be clear that we can parse it just fine. You can decide about our character. Mr. Fateman does not apparently understand what we do with our tools, or how they work.

You claim that you can parse Mathematica. You claim that it should be clear that you can parse Mathematica. You provide no evidence that you can parse Mathematica except a (rather obscure) trace of a parse of an 8 character string. It is not clear even that the parse is correct. By comparison, the result I provided, (Part r (s)) explicitly represents the fact that the parse results in a Part[] subexpression. The output from the semanticdesigns parser does not seem to anywhere mention Part. I don't see a resulting form in any readable format corresponding to the expression. So if you can parse the given expression, you have not shown that to me at all.

Given the way this conversation started, and the way he is continuing, I see no good reason why I should explain any of this to him.

The other evidence that you provide is that your program produces an output consisting of the number of nodes in a tree from (allegedly) parsing a Mathematica file. Is the program parsed the same way as done by Mathematica?

Now none of this is THAT hard; after all WRI has been doing this for quite a few years. So has MockMMA, though the internal languages now differ by choice. For example, I parse x<y as Comparison[x,Less,y] which generalizes x<y<=z to Comparison[x,Less,y,LessEqual,z]. Instead of Inequality, I use Comparison, so I can also reasonably say Comparison[x,Equal,y].

Now I can understand that a Mathematica parser is not a big seller because, as you point out, a parser per se is not particularly useful (or THAT hard), because without a very substantial effort towards providing more of the system, you can't do much with it. So commercially, it doesn't make sense to put a lot of effort into a parser. Fair enough. I asked why JPL would be interested in such a thing, but you didn't respond.

Now do you actually have a Mathematica language parser or not? You might. I suggested that perhaps you didn't REALLY have a parser; it is apparently not a supported product. But maybe you do have a parser.

In order to parse Mathematica programs you would need to have a representation for arbitrary precision integers and floating-point numbers. Probably omitted. Does your intermediate form of nodes allow for such things? Does your GLR parser overcome such issues? It is possible by using such tricks as encoding a 200-digit integer using 200 nodes of 1-digit each.

To what extent is the GLR parser taking time O(n^3) for programs of length n? Or worse? A conventional parser, in my view, is an O(n) time/space parser. Typically LALR(1), but not necessarily. Most programming language constructs are easily parsed. Adopting a grammar and strategy that is of higher complexity is unconventional, but of course still possible. Writing a parser using GLR that also happens to be O(n) is possible with the right grammar. Devising a correct grammar for Mathematica is itself tricky; getting one that is correct and has an O(n) complexity eluded me when I first decided to write a parser; the alternative was the parser I actually wrote, and you can view.

Now as for whether I understand how your tools work or not, there is not a bad top-level description of GLR in Wikipedia, and a Google search gets you lots of other stuff. This is not difficult technology.

Should you (Ira) bother to respond? I don't care.

RJF
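The flat Comparison form described in the message above can be illustrated with a short sketch. It is not MockMMA's code (which is Lisp); the operator table beyond Less, LessEqual and Equal, the token regex, and the function name parse_comparison are assumptions made for illustration only.

```python
import re

# Operator names Less, LessEqual and Equal appear in the message;
# Greater and GreaterEqual are assumed by analogy.
OPS = {"<": "Less", "<=": "LessEqual", ">": "Greater",
       ">=": "GreaterEqual", "==": "Equal"}


def parse_comparison(text):
    """Turn a chain such as 'x<y<=z' into one flat Comparison[...] string."""
    # Two-character operators are listed first so '<=' is not split into '<', '='.
    toks = re.findall(r"<=|>=|==|[<>]|\w+", text)
    if not toks or len(toks) % 2 == 0 or toks[0] in OPS:
        raise SyntaxError("expected: operand (operator operand)*")
    parts = [toks[0]]
    for op, operand in zip(toks[1::2], toks[2::2]):
        if op not in OPS or operand in OPS:
            raise SyntaxError(f"bad comparison chain near {op!r} {operand!r}")
        parts += [OPS[op], operand]
    if len(parts) == 1:
        return parts[0]  # a lone operand is not a comparison
    return "Comparison[" + ",".join(parts) + "]"


print(parse_comparison("x<y"))     # Comparison[x,Less,y]
print(parse_comparison("x<y<=z"))  # Comparison[x,Less,y,LessEqual,z]
```

The point of the flat form is that a chain of any length, with mixed operators, stays a single node instead of nesting binary comparisons.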
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On 26 February 2011 00:05, rjf fate...@gmail.com wrote:

Should you (Ira) bother to respond? I don't care. RJF

It would be good if Ira did respond, but given your attitude (use of the word liar for example), who could blame him if he did not? I certainly would not blame him.

Dave
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On 25 February 2011 23:30, David Kirkby david.kir...@onetel.net wrote:

Do you think lex and yacc could be suitably employed for the task? These are quite nice in that they are included in most operating systems (there are versions for Windows, Linux, Solaris, AIX, HP-UX ... etc etc). If I recall correctly, Richard stated they would not be suitable, but I don't trust his judgment on this issue. Dave

Or more to the point, flex and bison (the GNU versions), since they are widely available. From http://en.wikipedia.org/wiki/GNU_bison:

Bison also supports “Generalized Left-to-right Rightmost” (GLR) parsers for grammars that are not LALR.

Since you are using a GLR parser, it makes me suspect bison would be suitable. Of course, creating the BNF is a non-trivial task, but it seems the descriptions of most languages don't actually include a BNF. Perhaps one can be found for C, but it's not in the K&R book.

Dave
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 25, 4:28 pm, David Kirkby david.kir...@onetel.net wrote:

Of course, creating the BNF is a non-trivial task, but it seems the descriptions of most languages don't actually include a BNF.

I think you can find a formal grammar for almost every computer programming language except for Mathematica, which presumably has a grammar but it is secret. Look at it this way: with such a formal description it is relatively easy to write a parser, and to be assured that the parser corresponds to the grammar, which a computer scientist would use to help design a language. Most people would run the grammar through a parser generator and eliminate constructions that are not consistent with the needs of the parser generator. It cleans up ambiguities, among other things. Since you were planning to write a parser, why not write the grammar first... I would have expected that in your perusal of compiler books you would get this idea.

Perhaps one can be found for C, but it's not in the K&R book.

C Programming Language (2nd Edition) (Paperback) by Brian W. Kernighan, Dennis M. Ritchie, page 234 et seq.

Any language reference (not necessarily a manual or programming in X for dummies) should have a grammar. For more grammars, da google is your friend: http://www.thefreecountry.com/sourcecode/grammars.shtml
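As a rough illustration of the "write the grammar first" advice above, here is a minimal sketch of a recursive descent parser in which each function mirrors one grammar rule, so the correspondence between parser and grammar can be checked by eye. The three-rule arithmetic grammar is a toy invented for this example; it is not a grammar for Mathematica, C, or any language discussed in the thread.

```python
import re

# Toy grammar, invented for illustration:
#   expr   ::= term   { "+" term }
#   term   ::= factor { "*" factor }
#   factor ::= NUMBER | "(" expr ")"


def parse(text):
    """Parse text with the toy grammar and return a nested-tuple tree."""
    toks = re.findall(r"\d+|[+*()]", text)
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():  # expr ::= term { "+" term }
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():  # term ::= factor { "*" factor }
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():  # factor ::= NUMBER | "(" expr ")"
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok)

    tree = expr()
    if peek() is not None:
        raise SyntaxError("trailing input")
    return tree


print(parse("1+2*3"))    # ('+', 1, ('*', 2, 3))
print(parse("(1+2)*3"))  # ('*', ('+', 1, 2), 3)
```

A parser generator such as bison automates this step from the same kind of grammar; the hand-written form is shown only because it makes the rule-to-function mapping visible.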
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 23, 11:37 am, rjf fate...@gmail.com wrote: On Feb 23, 9:17 am, Dr. David Kirkby david.kir...@onetel.net wrote: On 02/22/11 10:57 PM, Dr. David Kirkby wrote: On 02/22/11 03:49 PM, rjf wrote:

[snip]. The real difficulty is to implement a Mathematica language parser, since the language fails to fit the standard expectations for computer languages.

It does? It is a context-free language, therefore parsable by any parser capable of parsing context-free languages.

I know you said that, but I've heard different from another source. See http://groups.google.com/group/comp.compilers/msg/8c4e6ccad3c40599 The person there, who is the CTO of a company producing this http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html which has an option for a Mathematica parser

It does.

He says Mathematica is not a particularly difficult language to parse, and a GLR parser is a bit over the top.

It isn't, and GLR is (capable of parsing a context free language) but AFAICT, isn't really needed to parse MMa.

Here you can see a Mathematica parser is listed for the DMS toolkit http://www.semanticdesigns.com/Products/FrontEnds/index.html?Home=DMS... So I don't know what to believe Richard. You are saying the Mathematica language can't be parsed with a conventional parser, so (you?) had to hand-write the parser for MockMMA,

Our parser for MMa consists of a relatively conventional lexical definition for tokens, and a very straightforward grammar for the language itself.

yet someone from a commercial company selling this DMS toolkit claims the language is not particularly difficult to parse, and have a front end for their toolkit (a GLR parser) able to parse Mathematica.

Here are my suggestions: 1. The guy is lying. He doesn't really have a Mathematica parser that works.

Hmph. For your example r[s[]] below, which you claim is *so* hard to parse, here's the output of DMS parsing it using our Mathematica grammar:

C:\DMS\Domains\Mathematica\Tools\Parser\Source>run ../domainparser + +AST C:\DMS\Domains\Mathematica\Examples\multiply.m
Domain Parser for Mathematica 2.3.3
Copyright (C) Semantic Designs 1996-2010; All Rights Reserved
17 tree nodes in tree.
(Mathematica@Mathematica=1#481c320^0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
(Commands@Mathematica=3#481c300 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
(Commands@Mathematica=3#481c2c0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
(Commands@Mathematica=2#4819dc0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m)Commands
(Command@Mathematica=5#481c2a0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
|(ExpressionSequence@Mathematica=17#481c280 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| (Rule@Mathematica=29#4819f80 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| (Disjunction@Mathematica=34#4819fc0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| (Conjunction@Mathematica=36#481c040 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| |(EqualitySequence@Mathematica=38#481c080 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| | (Sum@Mathematica=56#481c0e0 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| | (Product@Mathematica=60#481c220 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| | (QualifiedIdentifier@Mathematica=203#4819e40 Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| | |(IDENTIFIER@Mathematica=206#4819da0[`a'] Line 1 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m)IDENTIFIER
| | )QualifiedIdentifier
| | (QualifiedIdentifier@Mathematica=203#481c260 Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/multiply.m
| | |(IDENTIFIER@Mathematica=206#4819e20[`b'] Line 1 Column 3 File C:/DMS/Domains/Mathematica/Examples/multiply.m)IDENTIFIER
| | )QualifiedIdentifier
| | )Product
| | )Sum
| |)EqualitySequence
| )Conjunction
| )Disjunction
| )Rule
|)ExpressionSequence
)Command
)Commands
(Command@Mathematica=4#481c2e0 Line 2 Column 1 File C:/DMS/Domains/Mathematica/Examples/multiply.m)Command
)Commands
)Mathematica
Exiting with final status 0

Yes, it parses much bigger, much more complex examples. JPL has used it internally. Does it parse all of current 2011 MMa syntax? Probably not, we haven't used it much recently. But I spent 4 years working on a 80,000 line MMa program so I think I understand the basics of the language, and given its Lisp-like syntax, I don't think I'll be surprised. Wolfram could be crazy, though.

2. The company has a really neat parser generating tool and a lot of engineering to go with it and Mathematica can be easily parsed with it.

It is indeed the case
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On 02/22/11 10:57 PM, Dr. David Kirkby wrote: On 02/22/11 03:49 PM, rjf wrote:

A parser for the maxima language is not only easier to write, it is available in source form. It is also based on a well known technique which is also used by Reduce. The real difficulty is to implement a Mathematica language parser, since the language fails to fit the standard expectations for computer languages.

I know you said that, but I've heard different from another source. See http://groups.google.com/group/comp.compilers/msg/8c4e6ccad3c40599 The person there, who is the CTO of a company producing this http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html which has an option for a Mathematica parser (I assume the Mathematica parser costs extra too). He says Mathematica is not a particularly difficult language to parse, and a GLR parser is a bit over the top.

Here you can see a Mathematica parser is listed for the DMS toolkit http://www.semanticdesigns.com/Products/FrontEnds/index.html?Home=DMSDomains

So I don't know what to believe Richard. You are saying the Mathematica language can't be parsed with a conventional parser, so had to hand-write the parser for MockMMA, yet someone from a commercial company selling this DMS toolkit claims the language is not particularly difficult to parse, and have a front end for their toolkit (a GLR parser) able to parse Mathematica.

Clearly Wolfram|Alpha is a bit more clever, as it parses written English and tries (sometimes not very successfully) to work with that.

-- A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing in e-mail?

Dave
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 23, 9:17 am, Dr. David Kirkby david.kir...@onetel.net wrote: On 02/22/11 10:57 PM, Dr. David Kirkby wrote: On 02/22/11 03:49 PM, rjf wrote:

A parser for the maxima language is not only easier to write, it is available in source form. It is also based on a well known technique which is also used by Reduce. The real difficulty is to implement a Mathematica language parser, since the language fails to fit the standard expectations for computer languages.

I know you said that, but I've heard different from another source. See http://groups.google.com/group/comp.compilers/msg/8c4e6ccad3c40599 The person there, who is the CTO of a company producing this http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html which has an option for a Mathematica parser (I assume the Mathematica parser costs extra too). He says Mathematica is not a particularly difficult language to parse, and a GLR parser is a bit over the top.

Here you can see a Mathematica parser is listed for the DMS toolkit http://www.semanticdesigns.com/Products/FrontEnds/index.html?Home=DMS...

So I don't know what to believe Richard. You are saying the Mathematica language can't be parsed with a conventional parser, so had to hand-write the parser for MockMMA, yet someone from a commercial company selling this DMS toolkit claims the language is not particularly difficult to parse, and have a front end for their toolkit (a GLR parser) able to parse Mathematica.

Clearly Wolfram|Alpha is a bit more clever, as it parses written English and tries (sometimes not very successfully) to work with that.

-- A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing in e-mail?

Dave

Here are my suggestions:

1. The guy is lying. He doesn't really have a Mathematica parser that works.
2. The company has a really neat parser generating tool and a lot of engineering to go with it and Mathematica can be easily parsed with it.
3. The company has nothing much beyond a good term project in a compiler-technology course (perhaps at a graduate level) plus a bunch of engineering and marketing.

My guess is 1 + 3.

A VERY simple example. r[s[]] is legal in Mathematica.

A traditional lexical analyzer, including the one apparently used by Mathematica, typically looks for the longest string of characters that makes a token. Hence a===b has a token === which is SameQ even though there are tokens = and ==. So the longest one is found, in general.

Now in Mathematica, s[[4]] means take the 4th part of the list or structure s. s[4] means (essentially) call the function s on the argument 4. { really it has to do with pattern matching too, but that's a nuance not needed here.}

Anyway, how does one do lexical analysis or scanning on r[s[]] ? The correct tokenization is r, [, s, [, ], ] . But the maximal token deal returns r, [, s, [, ]] .

What does this mean? It means that the conventional separation of lexical analysis and parsing must be intermixed in parsing Mathematica. I know of no other programming language that requires this.

Oh, there are also other glitches in Mathematica of this sort.
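The longest-match problem described above can be reproduced with a few lines of code. This is a sketch of a generic maximal-munch scanner, not Mathematica's or MockMMA's lexer; the token list TOKENS and the function name tokenize are assumptions chosen to match the operators named in the message.

```python
# Hypothetical operator table for illustration; not Mathematica's real lexer tables.
TOKENS = ["===", "==", "=", "[[", "]]", "[", "]"]


def tokenize(text, tokens=TOKENS):
    """Longest-match (maximal munch) scanner over a fixed operator list."""
    ops = sorted(tokens, key=len, reverse=True)  # try longer operators first
    out, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        if ch.isalnum():  # names and numbers: consume the whole run
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            out.append(text[i:j])
            i = j
            continue
        for op in ops:
            if text.startswith(op, i):
                out.append(op)
                i += len(op)
                break
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return out


print(tokenize("a===b"))   # ['a', '===', 'b']
print(tokenize("s[[4]]"))  # ['s', '[[', '4', ']]']
print(tokenize("r[s[]]"))  # ['r', '[', 's', '[', ']]']
```

The last line shows the failure: the two closing brackets of r[s[]] are fused into one ]] token, although they close two different single brackets.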
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 23, 2011, at 6:37 PM, rjf wrote: On Feb 23, 9:17 am, Dr. David Kirkby david.kir...@onetel.net wrote: On 02/22/11 10:57 PM, Dr. David Kirkby wrote: On 02/22/11 03:49 PM, rjf wrote:

A parser for the maxima language is not only easier to write, it is available in source form. It is also based on a well known technique which is also used by Reduce. The real difficulty is to implement a Mathematica language parser, since the language fails to fit the standard expectations for computer languages.

I know you said that, but I've heard different from another source. See http://groups.google.com/group/comp.compilers/msg/8c4e6ccad3c40599 The person there, who is the CTO of a company producing this http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html which has an option for a Mathematica parser (I assume the Mathematica parser costs extra too). He says Mathematica is not a particularly difficult language to parse, and a GLR parser is a bit over the top.

Here you can see a Mathematica parser is listed for the DMS toolkit http://www.semanticdesigns.com/Products/FrontEnds/index.html?Home=DMS...

So I don't know what to believe Richard. You are saying the Mathematica language can't be parsed with a conventional parser, so had to hand-write the parser for MockMMA, yet someone from a commercial company selling this DMS toolkit claims the language is not particularly difficult to parse, and have a front end for their toolkit (a GLR parser) able to parse Mathematica.

Clearly Wolfram|Alpha is a bit more clever, as it parses written English and tries (sometimes not very successfully) to work with that.

-- A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing in e-mail?

Dave

Here are my suggestions:

1. The guy is lying. He doesn't really have a Mathematica parser that works.
2. The company has a really neat parser generating tool and a lot of engineering to go with it and Mathematica can be easily parsed with it.
3. The company has nothing much beyond a good term project in a compiler-technology course (perhaps at a graduate level) plus a bunch of engineering and marketing.

My guess is 1 + 3.

A VERY simple example. r[s[]] is legal in Mathematica.

A traditional lexical analyzer, including the one apparently used by Mathematica, typically looks for the longest string of characters that makes a token. Hence a===b has a token === which is SameQ even though there are tokens = and ==. So the longest one is found, in general.

Now in Mathematica, s[[4]] means take the 4th part of the list or structure s. s[4] means (essentially) call the function s on the argument 4. { really it has to do with pattern matching too, but that's a nuance not needed here.}

Anyway, how does one do lexical analysis or scanning on r[s[]] ? The correct tokenization is r, [, s, [, ], ] . But the maximal token deal returns r, [, s, [, ]] .

What does this mean? It means that the conventional separation of lexical analysis and parsing must be intermixed in parsing Mathematica. I know of no other programming language that requires this.

C++0x will require something similar for templates, so that

std::vector<SomeType<bool>> x;

will parse instead of requiring

std::vector<SomeType<bool> > x;

That said, I don't think many people consider C++ to be an easy language to parse. :-)

-Ivan
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 23, 1:45 pm, Ivan Andrus darthand...@gmail.com wrote:

(RJF) I know of no other programming language that requires this.

C++0x will require something similar for templates, so that

std::vector<SomeType<bool>> x;

will parse instead of requiring

std::vector<SomeType<bool> > x;

That said, I don't think many people consider C++ to be an easy language to parse. :-) -Ivan

That looks similarly nasty. This semanticdesigns company is using a GLR parser, which is fairly general but potentially slow if the grammar in use is more ambiguous than not.

I suppose there is another way around this in Mathematica, which is to not recognize [[ or ]] as tokens, but only [ and ]. And some investigation shows this is what they do.

Let L={a,b,c,d}, a list of 4 elements.
L[[2]] evaluates to the 2nd element, namely b.
L[ [ 2 ] ] evaluates to b as well.

There are other 2- or 3-character tokens, such as ==, and ===. The question naturally occurs as to whether you can put spaces in the middle of those too.

A==B tests to see if A is equal to B.
A= =B is syntactically incorrect.

Probably a C++ parser would use the same trick and not have a >> token at all.
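The workaround described above (only [ and ] are tokens, and the parser decides what a doubled bracket means) can be sketched as follows. This is an illustration for a tiny subset of the syntax (names, integers, brackets, commas), not Mathematica's or MockMMA's actual parser; the class name Parser, the output format, and the rule "two opening brackets directly after a head mean Part" are assumptions for this example.

```python
import re


def tokenize(text):
    """Names, integers, and single-character punctuation only; no [[ or ]] tokens.

    Whitespace (and, in this sketch, any unknown character) is silently skipped.
    """
    return re.findall(r"[A-Za-z]\w*|\d+|[\[\],]", text)


class Parser:
    def __init__(self, text):
        self.toks = tokenize(text)
        self.pos = 0

    def peek(self, offset=0):
        k = self.pos + offset
        return self.toks[k] if k < len(self.toks) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def args(self):
        """Parse a possibly empty comma-separated argument list up to ']'."""
        items = []
        if self.peek() != "]":
            items.append(self.expr())
            while self.peek() == ",":
                self.pos += 1
                items.append(self.expr())
        return items

    def expr(self):
        node = self.peek()
        if node is None or node in {"[", "]", ","}:
            raise SyntaxError(f"unexpected token {node!r}")
        self.pos += 1
        while self.peek() == "[":
            if self.peek(1) == "[":  # "[" "[" after a head: Part extraction
                self.pos += 2
                items = self.args()
                self.expect("]")
                self.expect("]")
                node = "Part[" + ", ".join([node] + items) + "]"
            else:                    # single "[": function application
                self.pos += 1
                items = self.args()
                self.expect("]")
                node = node + "[" + ", ".join(items) + "]"
        return node


def parse(text):
    p = Parser(text)
    result = p.expr()
    if p.peek() is not None:
        raise SyntaxError("trailing input")
    return result


print(parse("r[s[]]"))      # r[s[]]
print(parse("L[[2]]"))      # Part[L, 2]
print(parse("L[ [ 2 ] ]"))  # Part[L, 2]
```

Because an argument can never begin with an opening bracket in this subset, one token of extra lookahead is enough to tell Part from an ordinary call, and r[s[]] no longer trips over a fused closing token.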
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Wed, Feb 23, 2011 at 5:29 PM, rjf fate...@gmail.com wrote: On Feb 23, 1:45 pm, Ivan Andrus darthand...@gmail.com wrote:

(RJF) I know of no other programming language that requires this.

C++0x will require something similar for templates, so that

std::vector<SomeType<bool>> x;

will parse instead of requiring

std::vector<SomeType<bool> > x;

That said, I don't think many people consider C++ to be an easy language to parse. :-) -Ivan

that looks similarly nasty.

The worst example I've seen is

    #if FIRST_MEANING
    template<bool B> class foo { };
    #else
    static const int foo = 0;
    static const int bar = 15;
    #endif

    [...lots of code...]

    static int foobar( foo < 2 ? 1 < 1 : 0 > & bar );

courtesy of http://stackoverflow.com/questions/1172939/is-any-part-of-c-syntax-context-sensitive .

- Robert
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
A parser for the maxima language is not only easier to write, it is available in source form. It is also based on a well known technique which is also used by Reduce. The real difficulty is to implement a Mathematica language parser, since the language fails to fit the standard expectations for computer languages. For example, the lexical analysis cannot be done by a finite state machine, and is not LALR(1), a category of grammar sufficient for almost any reasonable language. But they already have that. (Oh, there is one of those, free in MockMMA...) Writing a sloppy parser that provides many parses for an alleged sentence has an attraction, too (I used it in TILU). It's fine for short utterances, even if it is exponentially expensive. It's also very easy to write. You can even allow things like integrate sinxdx, where the segmentation into tokens is done heuristically. You have some kind of after-the-fact sorting to pick out the intended meaning. This can be done by computer or with the help of the human. While I assume that some people at WRI may be observing the Sage activity, I doubt that they feel Sage breathing down their necks. On Feb 21, 10:13 am, Robert Bradshaw rober...@math.washington.edu wrote: On Mon, Feb 21, 2011 at 8:37 AM, kcrisman kcris...@gmail.com wrote: On Feb 21, 2:00 am, Eviatar eviatarb...@gmail.com wrote: I've noticed this too. I wonder if they purposely implemented Sage syntax or if it's just a very comprehensive parser. I think the goal is to understand any natural syntax for many questions, and certainly this syntax is relatively unambiguous, and pretty much the Maxima (i.e. old and well-known) syntax in the first case. It tries to also understand much more 'natural' things like the integral of ... with respect to x etc. so I don't see any need to impute any extra thought on this. Yes, I'd say this is *much* more natural than, e.g., using square brackets for function calls and capitalizing log/trig functions. 
- Robert
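The "sloppy parser" idea in the message above (segmenting an utterance like integrate sinxdx into tokens heuristically, producing many candidate readings, and sorting afterwards) can be sketched as follows. This is not TILU's code; the vocabulary VOCAB, the function names, and the "fewest tokens wins" ranking are all assumptions made for illustration.

```python
from functools import lru_cache

# Made-up vocabulary; a real system would use a much larger word list.
VOCAB = frozenset({"sin", "si", "n", "x", "d", "dx"})


def segmentations(text, vocab=VOCAB):
    """Return every way to split text into vocabulary words.

    The number of results can grow exponentially with the input length,
    which is acceptable only for short utterances.
    """
    @lru_cache(maxsize=None)
    def split(i):
        if i == len(text):
            return [[]]  # one way to segment the empty remainder
        results = []
        for j in range(i + 1, len(text) + 1):
            word = text[i:j]
            if word in vocab:
                results.extend([word] + rest for rest in split(j))
        return results

    return split(0)


def best_segmentation(text, vocab=VOCAB):
    """After-the-fact ranking: prefer the reading with the fewest tokens."""
    candidates = segmentations(text, vocab)
    if not candidates:
        raise ValueError(f"no segmentation found for {text!r}")
    return min(candidates, key=len)


for reading in segmentations("sinxdx"):
    print(reading)
print("chosen:", best_segmentation("sinxdx"))  # chosen: ['sin', 'x', 'dx']
```

With this vocabulary the input has four readings; the ranking step stands in for the "sorting to pick out the intended meaning", which the message says can also be done with the help of the human.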
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
While I assume that some people at WRI may be observing the Sage activity,

Yes, I can definitely confirm this.

I doubt that they feel Sage breathing down their necks.

I agree - so far. But it was amazing how many visitors we had at the JMM booth talking about doing an institutional switch. If larger state consortia and other institutions keep having year after year of hard budgets, this could change. But for most places that can afford it, continuity is the biggest thing - hence places that still use Derive or other less well-known systems than Mma or Maple.

- kcrisman
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On 02/22/11 03:49 PM, rjf wrote: A parser for the maxima language is not only easier to write, it is available in source form. It is also based on a well known technique which is also used by Reduce. The real difficulty is to implement a Mathematica language parser, since the language fails to fit the standard expectations for computer languages. I know you said that, but I've herd different from another source. See http://groups.google.com/group/comp.compilers/msg/8c4e6ccad3c40599 The person there, who is the CTO of a company producing this http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html which has an option for a Mathematica parser (I assume the Mathematica parser costs extra too). He says Mathematica is not a particularly difficult language to parse, and a GLR parser is a bit over the top. Do you have any comments about the viability of using a GLR parser? If you believe it is not suitable, it might be helpful if you contributed to that discussion on comp.compilers. I find it someone hard to believe Steven Wolfram would have written his own parser, rather than use a standard one, as it would have made his life a lot more difficult. (Of course, he could have done it to obfuscate the language, but I'm not so convinced that he did that. Otherwise he would not have left so much of Mathematica in simple text files - now more is built into the kernel of course. For example, the lexical analysis cannot be done by a finite state machine, and is not LALR(1), a category of grammar sufficient for almost any reasonable language. But they already have that. (Oh, there is one of those, free in MockMMA...) If a standard parser of some sort could be used, it is much more attractive than a hand-written one like you have. From what I understand, after reading some compiler books, writing the parser by hand is not only tedious, but it's quite difficult to make the inevitable small changes, so the maintenance costs are much higher than using a standard parser. 
While I assume that some people at WRI may be observing the Sage activity, I doubt that they feel Sage breathing down their necks. Time will tell. I think there is an increased acceptance of open-source software now, especially in these rather tight economic times. Sage certainly lacks a lot of the features of Mathematica, and since it stitches together a large range of separate tools, Sage is less uniform in its usage. I don't know how many maths departments are now using Sage, and if they are, whether Mathematica or Maple is used too. -- A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing in e-mail? Dave
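To make the hand-written versus generated parser trade-off in the message above concrete, here is a minimal sketch of a recursive descent parser in Python for the Sage-style expressions quoted in this thread. It is an illustration only: the grammar, the `Parser` class and the tuple-based tree shape are assumptions made for this example, and it is not the MockMMA parser or any Mathematica grammar. Note how each precedence level needs its own method, which is where the maintenance cost of a hand-written parser tends to come from.

```python
import re

# Tokens: numbers, names, and single-character operators/punctuation.
TOKEN_RE = re.compile(r"\s*(?:(\d+\.?\d*)|([A-Za-z_]\w*)|(.))")


def tokenize(text):
    tokens = []
    for number, name, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", number))
        elif name:
            tokens.append(("name", name))
        elif op.strip():              # skip stray whitespace matches
            tokens.append(("op", op))
    return tokens


class Parser:
    """Hypothetical toy parser; one method per grammar rule."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("end", "")

    def take(self, value=None):
        kind, text = self.peek()
        if value is not None and text != value:
            raise SyntaxError(f"expected {value!r}, found {text!r}")
        self.pos += 1
        return kind, text

    def parse(self):
        tree = self.expr()
        if self.peek()[0] != "end":
            raise SyntaxError(f"unexpected {self.peek()[1]!r}")
        return tree

    def expr(self):    # expr := term (('+' | '-') term)*
        left = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.take()[1]
            left = (op, left, self.term())
        return left

    def term(self):    # term := power (('*' | '/') power)*
        left = self.power()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.take()[1]
            left = (op, left, self.power())
        return left

    def power(self):   # power := atom ('^' power)?  (right associative)
        base = self.atom()
        if self.peek() == ("op", "^"):
            self.take()
            return ("^", base, self.power())
        return base

    def items(self):   # items := expr (',' expr)*
        found = [self.expr()]
        while self.peek() == ("op", ","):
            self.take()
            found.append(self.expr())
        return found

    def atom(self):    # atom := NUMBER | NAME | NAME '(' items ')' | '(' items ')'
        kind, text = self.take()
        if kind == "num":
            return ("num", text)
        if kind == "name":
            if self.peek() == ("op", "("):   # function call
                self.take("(")
                args = self.items()
                self.take(")")
                return ("call", text, args)
            return ("name", text)
        if (kind, text) == ("op", "("):      # grouping or tuple such as (x,0,1)
            inner = self.items()
            self.take(")")
            return inner[0] if len(inner) == 1 else ("tuple", inner)
        raise SyntaxError(f"unexpected {text!r}")


tree = Parser("integrate( sqrt(x^2+4)/(x^2+1), x )").parse()
print(tree[0], tree[1], len(tree[2]))
```

Adding a new operator here means adding or editing a method and re-checking every precedence level by hand, whereas with a parser generator the same change would normally be a one-line edit to the grammar file.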
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Feb 21, 2:00 am, Eviatar eviatarb...@gmail.com wrote: I've noticed this too. I wonder if they purposely implemented Sage syntax or if it's just a very comprehensive parser. I think the goal is to understand any natural syntax for many questions, and certainly this syntax is relatively unambiguous, and pretty much the Maxima (i.e. old and well-known) syntax in the first case. It tries to also understand much more 'natural' things like the integral of ... with respect to x etc. so I don't see any need to impute any extra thought on this. - kcrisman
Re: [sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
On Mon, Feb 21, 2011 at 8:37 AM, kcrisman kcris...@gmail.com wrote: On Feb 21, 2:00 am, Eviatar eviatarb...@gmail.com wrote: I've noticed this too. I wonder if they purposely implemented Sage syntax or if it's just a very comprehensive parser. I think the goal is to understand any natural syntax for many questions, and certainly this syntax is relatively unambiguous, and pretty much the Maxima (i.e. old and well-known) syntax in the first case. It tries to also understand much more 'natural' things like the integral of ... with respect to x etc. so I don't see any need to impute any extra thought on this. Yes, I'd say this is *much* more natural than, e.g., using square brackets for function calls and capitalizing log/trig functions. - Robert
[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs
I've noticed this too. I wonder if they purposely implemented Sage syntax or if it's just a very comprehensive parser. On Feb 20, 7:40 pm, David Kirkby david.kir...@onetel.net wrote: I noticed a couple of things on sage-devel recently about integration with Maxima. It appears Maxima can't do either of these two. (Well, it does the second one, but leaves it in an overly complex form that Sage's n() can't even use.)

integrate( sqrt(x^2+4)/(x^2+1), x )
integrate(log(1+x)/(x^2+1),(x,0,1)) # (This one from the Putnam 2005 challenge)

I noticed that if one sticks that exact syntax into Wolfram|Alpha, it evaluates the integrals. There's no need to rewrite the integrals in Mathematica's syntax. If one were to use Mathematica directly, the Sage syntax would not be understood. I guess none of this is totally surprising, but it can be useful to get a second opinion on something, without even taking the trouble to rewrite the problem in Mathematica's syntax. See:

http://www.wolframalpha.com/input/?i=integrate%28%20sqrt%28x^2%2B4%29%2F%28x^2%2B1%29%2C%20x%20%29t=ff3tb01
http://www.wolframalpha.com/input/?i=integrate%28log%281%2Bx%29%2F%28x^2%2B1%29%2C%28x%2C0%2C1%29%29t=ff3tb01

Dave
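As another way of getting a "second opinion" on the definite integral above, the sketch below checks it numerically in plain Python, without Sage, Maxima or Wolfram|Alpha. It compares a composite Simpson's rule estimate against the closed form usually quoted for this Putnam problem, pi*log(2)/8. The helper names `integrand` and `simpson` are made up for this example, and the check is only a sanity test, not a proof.

```python
import math


def integrand(x):
    # log(1+x)/(x^2+1), the integrand from the second example above
    return math.log(1 + x) / (x * x + 1)


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # odd interior points get weight 4, even interior points weight 2
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


numeric = simpson(integrand, 0.0, 1.0)
closed_form = math.pi * math.log(2) / 8
print(numeric, closed_form, abs(numeric - closed_form))
```

If the two printed values agree to many digits, a long symbolic answer from any of the systems can be checked the same way by evaluating it numerically and comparing.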