On Friday 22 February 2002 12:36, you wrote:
> Just for fun, I'm wondering how a regular expressions parser could be
> implemented on MSX.
If you just want to tokenize config files, you don't need full regex support
on MSX. From the regular expressions for your config language, create a state
machine. You can do this step in any convenient language; I would suggest
something like Python or Perl, but you can use an MSX language instead if you
like. Then implement a small engine in MSX assembly that executes the state
machine.
Advantages:
- INS won't increase much in size: only the state machine and the engine have
to be integrated; the generator remains an external program
- you can do most programming in a high-level language
- the state machine generator and engine are independent of the config
language, so you don't have to re-write them if the config syntax changes
Example of state machine:
Let's say you have two keywords, "on" and "off". The state machine would
look like this:
0 --(o)--> 1
1 --(n)--> 2
1 --(f)--> 3
3 --(f)--> 4
Where:
0 is the initial state
2 is the end state for "on"
4 is the end state for "off"
--(x)--> means you perform this state transition if the current input
character is "x"
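To make this concrete, here is a rough sketch in Python of how a generator
could build that table from a keyword list, plus a tiny interpreter to try it
out. The function names and the dictionary layout are my own choices for
illustration, not part of any existing tool; the MSX engine would do the same
loop in assembly.

```python
def build_keyword_machine(keywords):
    """Build a transition table and an end-state map from a keyword list."""
    transitions = {}   # (state, character) -> next state
    accepting = {}     # end state -> keyword it recognises
    next_state = 1     # state 0 is the initial state
    for word in keywords:
        state = 0
        for ch in word:
            key = (state, ch)
            if key not in transitions:
                # First time we see this prefix: allocate a new state.
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accepting[state] = word
    return transitions, accepting


def run_machine(transitions, accepting, text):
    """Return the keyword that text matches, or None if it is not a keyword."""
    state = 0
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:
            return None   # no valid transition
    return accepting.get(state)


transitions, accepting = build_keyword_machine(["on", "off"])
```

With the keywords given in that order, the states come out numbered as in
the diagram above: 2 is the end state for "on" and 4 the end state for "off".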
If there is no valid transition, the string you're reading is not a
recognised keyword. In most languages, you would then keep scanning until a
separator and treat the result as an identifier (non-keyword token). In a
language without separators you would have to jump to the correct internal
state, which is hard, so don't design such a language unless you actually
need it.
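A possible sketch of that fallback in Python, assuming whitespace is the only
separator and using the "on"/"off" table from the example. The token names
KEYWORD and IDENT and the function name are made up for illustration.

```python
# Table for the keywords "on" and "off", as in the example above.
TRANSITIONS = {(0, "o"): 1, (1, "n"): 2, (1, "f"): 3, (3, "f"): 4}
ACCEPTING = {2: "on", 4: "off"}
SEPARATORS = " \t\n"   # assumption: whitespace separates tokens


def tokenize(text):
    """Split text into (kind, word) pairs, kind being KEYWORD or IDENT."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] in SEPARATORS:
            pos += 1
            continue
        start = pos
        state = 0
        # Scan up to the next separator. Once the machine has no valid
        # transition, state stays None and we just keep reading.
        while pos < len(text) and text[pos] not in SEPARATORS:
            if state is not None:
                state = TRANSITIONS.get((state, text[pos]))
            pos += 1
        word = text[start:pos]
        if state is not None and state in ACCEPTING:
            tokens.append(("KEYWORD", word))
        else:
            tokens.append(("IDENT", word))
    return tokens
```

Note that a word which stops in a non-end state, such as "of", is also
reported as an identifier, just like a word that falls off the machine.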
Of course "on | off" is a very trivial regex, but any regex can be expressed
as a state machine. A non-trivial regex would make too large an example,
which is also why state machine generation should be automated.
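As a sketch of what the automated step might output, the Python below turns
the "on"/"off" table into data lines for the engine. The layout (three bytes
per transition, 255 as end marker, token ids counting from 1) is only an
assumption for illustration, as is an assembler that accepts "db" lines; the
real format would be whatever the engine expects.

```python
# Table for the keywords "on" and "off", as in the example above.
TRANSITIONS = {(0, "o"): 1, (1, "n"): 2, (1, "f"): 3, (3, "f"): 4}
ACCEPTING = {2: "on", 4: "off"}


def emit_table(transitions, accepting):
    """Return the state machine as assembler data lines (hypothetical layout)."""
    lines = ["; transitions: state, character, next state"]
    for (state, ch), nxt in sorted(transitions.items()):
        lines.append(f"    db {state}, '{ch}', {nxt}")
    lines.append("    db 255    ; end of transitions (assumed marker)")
    lines.append("; end states: state, token id")
    for token_id, (state, word) in enumerate(sorted(accepting.items()), start=1):
        lines.append(f'    db {state}, {token_id}    ; "{word}"')
    lines.append("    db 255    ; end of end states (assumed marker)")
    return "\n".join(lines)


table_source = emit_table(TRANSITIONS, ACCEPTING)
```

The point is that only this generated data and the small engine end up in
the MSX program; changing the config syntax means re-running the generator.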
Bye,
Maarten
--
For info, see http://www.stack.nl/~wynke/MSX/listinfo.html