I'm underway on an implementation, so this is just a quick follow-up.
First, thanks to a side comment by Ivan Goddard, I'm doing away with the
notion of token priorities. Token IDs will remain, because we need them for
lexing. I'm moving instead to a two-priority system: an RE that consists
exclusively of single-character matching and concatenation (e.g. "do") is
assumed to match in preference to one that uses any other operations (e.g.
"[_a-zA-Z][_a-zA-Z0-9]*"). At some point I'll probably extend that to
handle case-insensitive keyword matching somehow, but for now I don't need it.
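For concreteness, the two-priority classification can be as crude as scanning
the RE for metacharacters. The helper below is purely hypothetical — a sketch
of the rule as I understand it, not anything from the implementation:

```python
def is_literal_only(pattern):
    """Sketch of the two-priority rule: True when the RE uses only
    single-character matching and concatenation (i.e. it is a plain
    literal string), which gives it the higher of the two priorities.
    Any metacharacter (class, repetition, alternation, escape, ...)
    demotes it to the lower priority."""
    metachars = set(".^$*+?{}[]()|\\")
    return all(ch not in metachars for ch in pattern)

# the keyword "do" outranks the identifier RE
assert is_literal_only("do")
assert not is_literal_only("[_a-zA-Z][_a-zA-Z0-9]*")
```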
I've also zeroed in on a slightly different bytecode from the one previously
described. The differences exist mainly for convenience of representation:
STOP (current thread)
STOPALL (all threads)
ACCEPT *tokenid*
JUMP *targetpc*
FORK *targetpc*
MATCH {   (extended opcode taking one of the following parameter forms)
  BASIC *first* *last*      (code point range within the Basic Multilingual Plane)
  EXTENDED *first* *last*   (any Unicode code point range)
}
The BASIC parameter is purely a representation optimization; anything you
can encode with BASIC can also be encoded with EXTENDED.
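To illustrate what "representation optimization" buys here — and this packing
scheme is entirely my guess, not the actual encoding — a BASIC range can fit
both endpoints in one 32-bit word, since BMP code points are at most 0xFFFF,
while an EXTENDED range needs a full word per endpoint to reach 0x10FFFF:

```python
import struct

def encode_basic(first, last):
    # Hypothetical packing: two 16-bit endpoints in one 32-bit word.
    assert first <= last <= 0xFFFF
    return struct.pack("<HH", first, last)

def encode_extended(first, last):
    # Hypothetical packing: one 32-bit word per endpoint, covering
    # code points up to 0x10FFFF.
    assert first <= last <= 0x10FFFF
    return struct.pack("<II", first, last)

# the same BMP range costs 4 bytes as BASIC but 8 as EXTENDED
assert len(encode_basic(0x41, 0x5A)) == 4
assert len(encode_extended(0x41, 0x5A)) == 8
```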
FORK is just a different encoding of SPLIT, for instruction-space reasons:
it takes a single target and implicitly continues at the next instruction,
where SPLIT would need two target operands.
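To check my own reading of the opcode list, here is a rough Python sketch of
a Pike-style, breadth-first interpreter for this bytecode. Everything beyond
the opcode names is an assumption on my part: the fall-through behavior of
FORK, the "longest match, then program order" tie-break (which is where the
two-priority rule would hook in, by emitting literal-only token programs
first), and the example program encoding "do" versus an identifier:

```python
STOP, STOPALL, ACCEPT, JUMP, FORK, MATCH = range(6)

def add_thread(prog, pc, threads, accepts, pos):
    # Follow non-consuming instructions until the thread parks on a
    # MATCH or dies.  FORK is assumed to fall through to the next
    # instruction and also spawn a thread at *targetpc*.
    op = prog[pc]
    if op[0] == JUMP:
        add_thread(prog, op[1], threads, accepts, pos)
    elif op[0] == FORK:
        add_thread(prog, pc + 1, threads, accepts, pos)
        add_thread(prog, op[1], threads, accepts, pos)
    elif op[0] == ACCEPT:
        accepts.append((pos, op[1]))
    elif op[0] == MATCH:
        if pc not in threads:
            threads.append(pc)
    # STOP kills the thread; STOPALL (kill every thread) elided here.

def run(prog, text):
    # Returns (match_length, tokenid) of the longest match, or None.
    # Ties on length go to the earliest-recorded ACCEPT, i.e. to
    # thread order -- literal-only token programs are forked first.
    accepts, threads = [], []
    add_thread(prog, 0, threads, accepts, 0)
    for i, ch in enumerate(text):
        cp, new_threads = ord(ch), []
        for pc in threads:
            ranges = prog[pc][1]   # (first, last) pairs; BASIC vs
                                   # EXTENDED differ only in encoding
            if any(lo <= cp <= hi for lo, hi in ranges):
                add_thread(prog, pc + 1, new_threads, accepts, i + 1)
        threads = new_threads
        if not threads:
            break
    best = None
    for length, tok in accepts:
        if best is None or length > best[0]:
            best = (length, tok)
    return best

# "do" as keyword token 1, [_a-zA-Z][_a-zA-Z0-9]* as identifier token 2
IDENT1 = [(0x5F, 0x5F), (0x41, 0x5A), (0x61, 0x7A)]
IDENT2 = IDENT1 + [(0x30, 0x39)]
PROG = [
    (FORK, 4),                 # 0: keyword thread first => higher priority
    (MATCH, [(0x64, 0x64)]),   # 1: 'd'
    (MATCH, [(0x6F, 0x6F)]),   # 2: 'o'
    (ACCEPT, 1),               # 3: keyword "do"
    (MATCH, IDENT1),           # 4: [_a-zA-Z]
    (FORK, 8),                 # 5: either extend the identifier...
    (MATCH, IDENT2),           # 6: [_a-zA-Z0-9]
    (JUMP, 5),                 # 7: ...and loop,
    (ACCEPT, 2),               # 8: ...or accept as identifier
]

assert run(PROG, "do") == (2, 1)    # keyword wins the tie at length 2
assert run(PROG, "dogs") == (4, 2)  # longer identifier match wins
assert run(PROG, "99") is None
```

With SPLIT-style two-target operands, instruction 0 would carry both 1 and 4
explicitly; the single-target FORK gets the same effect with one operand.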
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev