Re: OODL: Interpreter/NuParser

DeRobertis Sat, 12 Feb 2000 02:46:10 -0800
At 8:07 AM +1000 on 2/10/00, Paul Sutton wrote:
>Adrian: Anthony, I have been reviewing the things that are sitting around
>on my hard drive here at home and discovered a file "NuParser.png" which
>is a diagram of the conceptual foundation of NuParser.  There are a
>couple of things that concern me about this diagram but there is not
>enough information in the diagram to be sure that these concerns are
>founded, or to discuss them informatively.

I'm not sure to which diagram you are referring. Please send me a copy.
I don't have any file called "NuParser.png" around here. And the mail
archive search is down.

>
>Adrian: Has documentation relating to NuParser been uploaded to the UFP
>server yet?

No. NuParser is not done yet, and has not been documented yet. It has
online help, the incomplete version of which has been pasted to the end
of this message.

NOTE: NuParser is a parser generator, like bison. NuInterpreter is the
(presently nonexistent) new interpreter generated with NuParser instead
of Bison. Interpreter is the old snapshot from summer(?). The naming
confusion will be straightened out sometime. Anyone with suggestions for
better names should mail me.

>
>Adrian: Finally, I believe that FreeScripts are translated to nullCPU
>assembly which is then interpreted,

Yes.

>if this is still the case could any
>documentation relating to nullCPU assembly be uploaded as well?

Look in the Interpreter sources, in NullCPUInstr.h. That's the best
documentation available. Most of it is fairly obvious (e.g., multiply),
some is based in HT (e.g., mod) and some in PPCAsm (e.g., bc).

Also, OTDisAsm.cp contains the disassembler and CodeRunner.cp the actual
emulator.

>It would also help with my
>understanding of C/C++

The Interpreter is _not_ a good place to try to understand C/C++. It is
the one place where, when the choice is between a 1% speed gain and
100x more readability, the speed gain is chosen.

You can see all sorts of evils, such as a single string being used as
both a C and Pascal string in OTVar.h (which, stunningly, is
well-documented).
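
For anyone unfamiliar with the trick, here is a rough sketch of one
common way a single buffer can serve as both kinds of string: a length
byte in front (Pascal) and a NUL byte at the end (C). This is only an
illustration in Python, not the actual layout in OTVar.h, which may
differ; the function names are made up for the example.

```python
def make_dual_string(text: bytes) -> bytes:
    """Build one buffer readable as a Pascal string and as a C string.

    Assumed layout (an illustration, not taken from OTVar.h):
    [length byte][text bytes][NUL terminator]
    """
    if len(text) > 255:
        raise ValueError("a Pascal string holds at most 255 bytes")
    if b"\0" in text:
        raise ValueError("a C string cannot contain a NUL byte")
    return bytes([len(text)]) + text + b"\0"


def as_pascal(buf: bytes) -> bytes:
    """Read the buffer the Pascal way: trust the leading length byte."""
    length = buf[0]
    return buf[1:1 + length]


def as_c(buf: bytes) -> bytes:
    """Read the buffer the C way: start past the length byte, stop at NUL."""
    end = buf.index(0, 1)
    return buf[1:end]


buf = make_dual_string(b"Hello, ")
pascal_view = as_pascal(buf)
c_view = as_c(buf)
```

Both views return the same seven bytes. The price is that every write
has to keep the length byte and the terminator in step, which is why it
counts as an evil.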





--- PRELIMINARY NUPARSER DOCS ---

Welcome to NuParser!

Contents
    Contents
    Introduction
    Warning
    General Operation
    Grammar
        Basic Syntax
        Specials
        More to Come Soon

Introduction
NuParser is an object-oriented, user-friendly, infinite-lookahead
finite(?)-state machine parser with a simple integrated lexical
analyzer.

NuParser is not intended to be used alone as an interpreter. Rather, it
is intended to be a frontend to a compiler (the compiler can, of
course, be part of an interpreter). For that reason, NuParser does not
use every possible hack in town to get the greatest possible speed.

NuParser is being used in, and is being written for, an xTalk
interpreter called Interpreter, for FreeCard, a free (as in speech, not
beer) replacement for HyperCard. For more information, visit
<http://ufp.uqam.ca/OpenCard/>.

Warning
This is a pre-alpha version of NuParser. In all likelihood, there are
bugs in it. In all likelihood you can come up with a grammar it can't
handle, or input that just goes wrong. I am interested in hearing of
such; by examining those grammars and inputs, I can better debug
NuParser. Please send all comments, flames, bug reports, patches, et
cetera to me at [EMAIL PROTECTED]

In case it is not yet blatantly obvious what I'm saying: Do not depend
on NuParser for serious work, unless you're ready to track down bugs in
NuParser and fix them. In any case, I'm not liable for any errors, and
make no warranty that it will work. It's not like you actually paid
anything for NuParser!

General Operation
In general, NuParser reads text from the getter function and continues
to do so until the getter function says there is nothing more to be
read.

NuParser makes no guarantee that it will actually use the text it
reads, and usually does not. This is because it may read in many
characters checking to see if a rule fits, only to find that the rule
does not fit at all. It then will reparse with the next rule, which may
take far fewer characters. And if it is then done parsing, it could
have read a significant number of unused characters.
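
The over-read described above can be sketched in a few lines of Python.
This is not NuParser's code; the BufferedInput class, the make_getter
helper, and the two literal "rules" are hypothetical, and the getter is
assumed to return one character at a time and an empty string at end of
input.

```python
class BufferedInput:
    """Pulls characters from a getter and keeps them for backtracking."""

    def __init__(self, getter):
        self.getter = getter  # assumed: returns one character, "" at end
        self.buffer = []      # every character read from the getter so far
        self.pos = 0          # index of the next character to examine

    def next_char(self):
        # Only call the getter when the buffer has been exhausted.
        if self.pos == len(self.buffer):
            ch = self.getter()
            if ch == "":
                return ""
            self.buffer.append(ch)
        ch = self.buffer[self.pos]
        self.pos += 1
        return ch

    def match(self, words):
        """Try to match a literal; rewind to the start on failure."""
        start = self.pos
        for expected in words:
            if self.next_char() != expected:
                self.pos = start  # backtrack; the characters stay read
                return False
        return True


def make_getter(text):
    it = iter(text)
    return lambda: next(it, "")


inp = BufferedInput(make_getter("Hello, world and more"))
first = inp.match("Hello, world!")  # long rule: fails at the "!"
second = inp.match("Hello")         # shorter rule: succeeds
consumed = inp.pos                  # characters the parse actually used
read = len(inp.buffer)              # characters taken from the getter
```

Here the first rule pulls 13 characters from the getter before failing,
and the second rule uses only 5 of them. The other 8 have already left
the input stream, so a second parser reading from the same getter would
never see them.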

This could make it very hard to put two NuParsers on the same input
stream. There is presently no solution to this mess, but there will be
by alpha.

Grammar
The NuParser grammar is surprisingly simple (after all, remember the
motto). Each card contains a separate rule segment. Parsing starts on
the card called "main." Note that main is implicitly a list.

Basic Syntax
The syntax for a line is:
  ["words"] <target>|special ["words"] <target>|special ...
The less-than sign, greater-than sign, and quotes are part of the
syntax. The vertical bar represents alternatives (you may have either a
target or a special). Items in brackets are optional (you don't need to
have words between your targets and specials).

The way parsing works is that first the quoted string (words) is
matched. If it matches, parsing recurses into target or handles special,
depending on which is there. So, take an example:
  "Hello, " <object>;
You can see the words rather readily; in this case it is "Hello, ".
Both the contained space and comma are part of words. The quotes are
not. Next you see target is object, which will be defined below.
Lastly, you see there is an omitted words, followed by a special. An
omitted words matches the inherent nothingness between bytes in the
input stream. Thus, a null grammar matches a null input, as it should
be (a null grammar is just a semicolon).
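
As a rough illustration of how such a line breaks down, here is a small
Python sketch that splits a rule into (words, kind, name) items. It is
not how NuParser itself reads grammars. The split_rule function and the
regular expression are made up for this example, quoted words are
assumed to have no escape sequences, and only the semicolon, comma,
period, and exclamation mark are treated as specials.

```python
import re

# One item: optional quoted words, then either <target> or a special.
ITEM = re.compile(r'\s*(?:"([^"]*)")?\s*(?:<([^>]+)>|([;,.!]))')


def split_rule(line):
    """Split one rule line into (words, kind, name) tuples."""
    line = line.strip()
    items = []
    pos = 0
    while pos < len(line):
        m = ITEM.match(line, pos)
        if m is None:
            raise ValueError(f"cannot read rule at column {pos}: {line[pos:]!r}")
        words = m.group(1) or ""  # omitted words become the empty string
        if m.group(2) is not None:
            items.append((words, "target", m.group(2)))
        else:
            items.append((words, "special", m.group(3)))
        pos = m.end()
    return items


example = split_rule('"Hello, " <object>;')
null_grammar = split_rule(";")
```

For the example line this gives the words "Hello, " with target object,
followed by empty words with the semicolon special. The null grammar
gives a single item: empty words and the semicolon.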

Grammar rules are delimited by a newline.

Specials
Specials are essentially targets with a special (hence the name)
meaning. They are listed below, along with their meaning.

Semicolon-Reduce immediately, continue parse at main.
Comma-Declare there to be no backtracking across this point.
Period-Reduce immediately, then end parse.
Exclamation Mark-Abort parse immediately.
Vertical bar-Mark of ludditism.