Martin Baker wrote:
>
> On Saturday 13 Nov 2010 16:34:40 Waldek Hebisch wrote:
> > The Spad compiler is divided into stages. Lexical analysis and parsing
> > are not a serious problem. Standard tools could help a bit, but the
> > part that tools can handle is actually very simple, while some
> > other aspects (like whitespace sensitivity) must be done in an
> > ad-hoc way because tools do not support them.
>
> Better error messages and a better build environment would be very nice
> from my point of view! (The error messages on the first pass seem worst
> of all; I can't even work out which line is causing the problem.)
> Regarding whitespace, I seem to remember a message from you saying
> you were planning to delimit blocks using braces instead of
> indentation. So if you made a clean break to such a syntax, would that
> make things easier? Also, when working on graphics stuff, Aldor-like OO
> capability would be very nice.
I do not want to make a "clean break": we have about 140_000 lines
of Spad code and this code must compile correctly. If the compiler
stops supporting some construct, then we need to modify the algebra.
This is practical only for rarely used constructs. In particular,
we need to support indentation-based syntax for a long time. The
plan is to support an alternative "nopile" mode and to have better
support for multiline constructs in pile mode.
Let me add that I do not expect really good error messages
from the Spad parser or lexer: basically, Spad syntax contains so
little redundancy that the parser has no chance to guess what
malformed code is intended to say. But the messages could be
better than now: currently indentation and parentheses get
mixed in the lexer and parser, so indentation errors give quite
strange error messages, and similarly for unbalanced parentheses.
I have experimental code which uses the Spad parser with the lexer
from the interpreter. When ready, this code should give
much clearer messages about indentation errors and
unbalanced parentheses. But ATM it interprets indentation
differently than old Spad, so I need to fix this.
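To make the indentation point concrete, here is a rough sketch (in OCaml,
purely as an illustration -- it is not FriCAS code and all the names are
invented) of a layout pass that tracks indentation levels and parenthesis
depth separately. The point is that a dedent to an unknown column and an
unbalanced parenthesis are then reported as two distinct, specific errors
instead of surfacing later as a confusing parse error:

    type tok =
      | Word of string     (* ordinary text of a line *)
      | BlockOpen          (* virtual block start generated from an indent *)
      | BlockClose         (* virtual block end generated from a dedent *)

    let indent_of line =
      let n = String.length line in
      let rec go i = if i < n && line.[i] = ' ' then go (i + 1) else i in
      go 0

    (* Turn (line_number, text) pairs into tokens, or report an error. *)
    let layout lines =
      let toks = ref [] and stack = ref [0] and paren = ref 0 and err = ref None in
      let emit t = toks := t :: !toks in
      List.iter
        (fun (lineno, text) ->
          if !err = None && String.trim text <> "" then begin
            let ind = indent_of text in
            if !paren = 0 then begin       (* offside rule only outside parens *)
              if ind > List.hd !stack then begin
                stack := ind :: !stack; emit BlockOpen
              end else
                while ind < List.hd !stack do
                  stack := List.tl !stack; emit BlockClose;
                  if ind > List.hd !stack then
                    err := Some (Printf.sprintf
                      "line %d: indentation %d matches no enclosing block"
                      lineno ind)
                done
            end;
            String.iter                    (* parens tracked independently *)
              (fun c ->
                if c = '(' then incr paren
                else if c = ')' then
                  if !paren = 0 then
                    err := Some (Printf.sprintf "line %d: unbalanced ')'" lineno)
                  else decr paren)
              text;
            emit (Word (String.trim text))
          end)
        lines;
      match !err with
      | Some msg -> Error msg
      | None -> if !paren > 0 then Error "unclosed '(' at end of input"
                else Ok (List.rev !toks)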
> > When one comes to semantic analysis, things get more tricky. First,
> > when trying to write a simple specification of what the compiler should
> > do, one quickly sees that the desired behaviour is not computable.
> > So one has to introduce restrictions to turn the problem into a
> > computable one (but retain as much as possible of the nice
> > features). The second difficulty is that currently in some places
> > the compiler is doing rather strange things, but if one tries
> > to disable such behaviour the algebra no longer compiles.
>
> I was assuming that a big part of the difficulty is the pattern
> matching (not sure if that's semantic analysis?), which is why I
> thought of factoring it out, but I guess there are lots of other
> issues.
>
> > Also, there is a saying that real men can do several things
> > simultaneously. But my experience is that trying to multitask
> > slows me down. When I concentrate on one thing, I can do it,
> > then I can move to another thing. In the last two years
> > I spent most of my time working on the algebra and only a little
> > on the compiler.
>
> Yes, I can't multitask either, and I would not claim your level of
> expertise in either area; however, if you did decide to use standard
> tools and could think of some way that I could help, I would be happy
> to try to do so.
>
Concerning tools: of the existing tools, in my experience parser and lexer
generators are useful; others I have found of little use. In particular,
I have met generators of data structures and tree-walker generators.
I would say that the language should have appropriate support
for data structures, and tools do not change much here. Similarly
for tree walkers: we use tree walkers to perform some operation, and
typically the tree-walker code is small compared to the operations.
Actually, IME ML-style pattern matching helps more for creating tree
walkers than external tools.
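For example, here is the kind of walker I have in mind, as a minimal OCaml
sketch (the tree type and the constant-folding pass are invented for
illustration; Boot's pattern matching looks different, but the idea is the
same). The walking skeleton is just a few match clauses, and the interesting
work sits in the per-node actions:

    (* Toy expression tree -- invented names, not the FriCAS/Boot structures. *)
    type expr =
      | Var of string
      | Num of int
      | Apply of string * expr list      (* operator applied to arguments *)

    (* A typical small walker: fold additions of literal numbers. *)
    let rec fold e =
      match e with
      | Apply ("+", [a; b]) ->
          (match (fold a, fold b) with
           | (Num x, Num y) -> Num (x + y)
           | (a', b')       -> Apply ("+", [a'; b']))
      | Apply (op, args)    -> Apply (op, List.map fold args)
      | Var _ | Num _       -> e

    (* fold (Apply ("+", [Num 1; Num 2]))  ==>  Num 3 *)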
As I wrote, for FriCAS a lexer generator would help. But we have
a working lexer; it has a few hundred lines, so not using a lexer
generator is not a big deal. I have doubts whether a parser generator
would help: Spad syntax is very well suited to the existing parser
and not so well to typical parser generators. More precisely, the bulk
of the syntax is in operator priorities. Because we use _four_
priorities per operator, Spad priorities are awkward to describe
in a parser generator. OTOH the part of the FriCAS parser handling
priorities is quite small, and the other parts of Spad syntax are
rather simple, so it is easy to handle them in a hand-written
parser. For some comparison, the current parser has 543 lines,
_including_ priority tables. The previous version, using a
home-grown parser generator (META), had slightly more than
200 lines. The grammar rules of a PL/1 parser have more than 600 lines.
The grammar plus actions for GNU Pascal have more than 2000 lines.
So the potential gain from using a parser generator is quite
limited.
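To show why the priority-driven part of a hand-written parser stays small,
here is a simplified sketch (OCaml, for illustration only -- the real parser
is not written in OCaml, and Spad's four priorities per operator are
collapsed here into a single left/right binding-power pair with made-up
values). One small table plus one short loop replaces a long list of
grammar rules:

    type ast = Leaf of string | Node of string * ast * ast

    (* (left, right) binding powers -- a stand-in for the priority tables. *)
    let priority = function
      | "+" | "-" -> Some (10, 11)
      | "*" | "/" -> Some (20, 21)
      | "**"      -> Some (31, 30)       (* right associative *)
      | _         -> None

    (* Precedence climbing: parse an operand, then keep absorbing operators
       whose left binding power is at least min_prio. *)
    let rec parse_expr toks min_prio =
      match toks with
      | [] -> failwith "expected an operand"
      | t :: rest ->
          let rec loop lhs toks =
            match toks with
            | op :: rest' ->
                (match priority op with
                 | Some (l, r) when l >= min_prio ->
                     let rhs, rest'' = parse_expr rest' r in
                     loop (Node (op, lhs, rhs)) rest''
                 | _ -> (lhs, toks))
            | [] -> (lhs, toks)
          in
          loop (Leaf t) rest

    (* parse_expr ["a"; "+"; "b"; "*"; "c"] 0
       ==> (Node ("+", Leaf "a", Node ("*", Leaf "b", Leaf "c")), []) *)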
As I wrote, for tree walkers I find ML-style pattern matching
quite useful. Now, Boot has special support for pattern
matching -- not as good as ML's, but better than many
"standard" languages.
Now, if you want to use ML (or Haskell, ...) and associated tools
and develop an alternative compiler, then go on.
However:
- Both the compiler and the Spad runtime have to deal with types. It
is preferable to share the type machinery.
- We have the "interpreter"; I feel that the interpreter and the Spad
compiler could share a lot (currently there is only limited sharing).
So the code of the Spad compiler should communicate with algebra code,
which is easy if the Spad compiler is in Spad, Boot or Lisp, but
gets tricky otherwise. Let me say that I considered using a
lexer generator and communicating with the lexer via FFI -- my
conclusion was that the gain from a lexer generator was not worth
the trouble with FFI.
Concerning the difficulty of compiling Spad: the main work is
in "type checking". Parametric types, overloading and
conditionals involving types make it more difficult than
in other languages -- doing it really well is a research topic.
After type checking it is relatively easy to generate
code. The Spad compiler could do more optimizations after
type checking, and code generation could probably be more
efficient. But currently most of the compile time is spent in
type checking, and type checking generates the error messages, so
clearly improvements in type checking will be visible to
users. Also, type checking currently resolves overloading
and after that essentially forgets types (it keeps only the type
of the result). This means that to do more after type checking
one probably needs to rework type checking to keep types
(and other information) longer.
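As a toy illustration of what resolving overloading and then forgetting
types means (this is not the FriCAS algorithm; the types and signatures
below are invented), a resolver can simply keep the candidate signatures
whose argument types match the call site, after which only the result type
of the chosen signature is retained:

    type typ = Integer | Float | Poly of typ

    type signature = { name : string; args : typ list; result : typ }

    (* A few made-up candidates for "+". *)
    let candidates = [
      { name = "+"; args = [Integer; Integer];           result = Integer };
      { name = "+"; args = [Float; Float];               result = Float };
      { name = "+"; args = [Poly Integer; Poly Integer]; result = Poly Integer };
    ]

    (* Keep the candidates whose name and argument types match.
       One match: resolved (only result is kept afterwards);
       none: type error; several: ambiguity. *)
    let resolve name arg_types =
      List.filter (fun s -> s.name = name && s.args = arg_types) candidates

    let () =
      match resolve "+" [Poly Integer; Poly Integer] with
      | [_] -> print_endline "resolved: one matching signature"
      | []  -> print_endline "no applicable operation"
      | _   -> print_endline "ambiguous call"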
--
Waldek Hebisch
[email protected]