About the THIF

Jeffrey Kegler Fri, 10 Jan 2014 18:08:57 -0800

In cleaning up my repository, I came across my design notes for what was
to become the THIF. I found that, with some rewriting, they would serve
quite well as an introduction to the THIF for someone already familiar
with Marpa.  They are particularly relevant to anyone thinking of creating
their own Marpa interface.

Here they are:


In making design choices, whenever a choice allowed the THIF to gain
in efficiency or flexibility, that was the choice that I made. I did
this, even when it carried a price in terms of programmer-friendliness.
My reasoning was that a decision against user- and programmer-friendliness
was something that an upper layer could always undo.  By emphasizing
flexibility and efficiency, I believed that I was in fact giving the
final user the best of both worlds.

To be clear, because of the difficulty of working directly with the THIF,
I do not expect that typical application writers will use it directly.
The THIF is targeted at writers of higher-level interfaces.  For writers
of higher-level interfaces, use of the THIF will have several advantages.

1.)  The THIF is entirely in heavily-optimized C.  It is fast.

2.)  The THIF does not have the restrictions of the "thicker" interfaces.
The thicker interfaces have to make many choices that the THIF leaves up
to the user.  These choices are by no means always minor or matters of
detail: among them are all naming conventions, most defaults and almost
the entire semantics.

3.)  The THIF has, by C language standards, an object-oriented interface.
There are separate objects for (in sequence) grammars, recognizers,
bocages, parse orderings, parse trees and parse valuators. Objects in
each class in this sequence have a one-to-many relationship to objects
in the next class in the sequence.

4.) In the THIF, everything is an integer.  There are no floats or
strings.  All decisions involving strings are up to the higher levels.
Symbols and rules are integers in libmarpa, leaving naming conventions up
to the higher levels.  Similarly, errors are represented by integer codes.

5.) Token values are tracked by the THIF, but only as a convenience
for the higher levels.  (The higher level is free to track token values
on its own.)  But even token values are coded as integers.  The intent is
that a typical interface will use the THIF's token values to index an
array to find the actual token value.  This leaves all questions about
what a token value actually is up to the higher level.

6.)  Even though the THIF has a valuator, almost all of the semantics
is left up to the interface.  The THIF's valuator does not directly
maintain the evaluation stack -- instead it issues instructions on how
to manage it.  These instructions indicate the rule or token involved,
where the result goes in the stack and, in the case of a rule, where
the operands for its semantics are to be found.  What goes on the stack,
and what form the semantics of rules and symbols take, is left entirely
up to the higher level.
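
As a rough illustration of points 4 through 6, here is a minimal sketch
of an upper layer that owns the names, the real token values and the
evaluation stack, and simply follows integer-only instructions.  Every
name in it (the Step struct, its fields, the IDs) is hypothetical and
made up for this sketch; it does not call the THIF and is not its API.

```c
#include <assert.h>
#include <stddef.h>

/* Integer IDs as an upper layer might record them.  The values are
   hypothetical; a real interface would obtain them from the THIF. */
enum { SYM_NUMBER = 0, SYM_PLUS = 1, SYM_TIMES = 2 };
enum { RULE_ADD = 0, RULE_MULTIPLY = 1 };

/* Names exist only in the upper layer, keyed by symbol ID. */
static const char *const symbol_names[] = { "number", "plus", "times" };

const char *symbol_name(int id) {
    return (id >= 0 && id <= SYM_TIMES) ? symbol_names[id] : NULL;
}

typedef enum { STEP_TOKEN, STEP_RULE } StepType;

/* One valuator instruction (hypothetical layout): integers only. */
typedef struct {
    StepType type;
    int id;          /* symbol ID for a token, rule ID for a rule */
    int result;      /* stack slot that receives the result */
    int arg_first;   /* rules only: slot of the first operand */
    int arg_last;    /* rules only: slot of the last operand */
    int token_value; /* tokens only: index into the payload array */
} Step;

#define STACK_SIZE 16

/* Apply one instruction to a stack owned by the upper layer.  The
   payload array holds the "real" token values; the caller must make
   sure token_value is a valid index into it. */
void apply_step(long *stack, const long *payloads, const Step *s) {
    assert(s->result >= 0 && s->result < STACK_SIZE);
    if (s->type == STEP_TOKEN) {
        stack[s->result] = payloads[s->token_value];
        return;
    }
    /* Both rules here have the form: operand operator operand. */
    assert(s->arg_first >= 0 && s->arg_last < STACK_SIZE);
    long lhs = stack[s->arg_first];
    long rhs = stack[s->arg_last];
    stack[s->result] = (s->id == RULE_ADD) ? lhs + rhs : lhs * rhs;
}

/* Run a whole instruction sequence; the final result is in slot 0. */
long evaluate(const Step *steps, int count, const long *payloads) {
    long stack[STACK_SIZE] = { 0 };
    for (int i = 0; i < count; i++)
        apply_step(stack, payloads, &steps[i]);
    return stack[0];
}
```

Here the stack holds longs, but nothing in the instructions requires
that: the same loop would work with a stack of pointers, Perl scalars
or anything else the upper layer prefers.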

The THIF's valuator may sound strange, and it does in fact take some
getting used to.  But it offers two major advantages over callbacks.
First, and most important, the problem of debugging foreign semantics
(for example, Perl closures) inside the environment of a C library
is avoided.  All code is executed in its native environment. Second,
the THIF's valuator is faster than callbacks.

Note that, if you really do think callbacks are the right way to do
semantics, the upper layer can also convert THIF instructions into
callbacks.  This is, in fact, exactly what Marpa::R2 does.
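
To make the conversion concrete, here is a minimal sketch of an upper
layer turning a rule instruction into a callback invocation through a
table indexed by rule ID.  The callback signature, the table and the
rule IDs are all assumptions of this sketch; it is not how Marpa::R2
is actually written, only one way such a layer could look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback signature: operands in, one value out. */
typedef long (*RuleCallback)(const long *args, int nargs);

long sum_callback(const long *args, int nargs) {
    long total = 0;
    for (int i = 0; i < nargs; i++)
        total += args[i];
    return total;
}

long max_callback(const long *args, int nargs) {
    long best = args[0];
    for (int i = 1; i < nargs; i++)
        if (args[i] > best)
            best = args[i];
    return best;
}

/* Rule IDs are plain integers, so they can index the table directly. */
enum { RULE_SUM = 0, RULE_MAX = 1, RULE_COUNT = 2 };

static const RuleCallback rule_callbacks[RULE_COUNT] = {
    sum_callback, max_callback
};

/* Convert one rule instruction into a callback call: the operands are
   the stack slots arg_first..arg_last, and the return value is stored
   in the slot named by result. */
void dispatch_rule(long *stack, int rule_id,
                   int arg_first, int arg_last, int result) {
    assert(rule_id >= 0 && rule_id < RULE_COUNT);
    assert(arg_first <= arg_last);
    stack[result] = rule_callbacks[rule_id](&stack[arg_first],
                                            arg_last - arg_first + 1);
}
```

The callbacks run entirely in the upper layer, so they can be debugged
there with the upper layer's own tools.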
