Jens!

It looks as if the “poker” language mainly consists of a simple list of key-value pairs, so a plain tokenizer might suffice. Compared to tools like flex or the scanner part of ANTLR, Ragel's focus is more on states and transitions than on entire tokens. When matching arbitrary text you probably also need to have a look at the longest-match Kleene star operator. With Ragel you can do many things “on the fly”, but if you are just transforming a list into a different format, you may not need this power. Likewise you wouldn't need the power of an LR or LL(*) parser (though LL(*) grammars are very easy to code and the speed penalty might be acceptable). You could use the tokenizer of the C runtime library and then match keywords using the output from gperf. Simple and not slow.
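
To make the tokenizer-plus-keyword-lookup idea concrete, here is a minimal sketch in C. It assumes the input is a writable buffer of "key=value" lines, which is only a guess at the "poker" format, and the keywords (bet, card, player) are made up. The function lookup_keyword() is a hand-written stand-in for the perfect-hash lookup that gperf would generate from a keyword file; it is not gperf output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword IDs; the real key names of the "poker" format are not known here. */
enum keyword { KW_UNKNOWN = 0, KW_BET, KW_CARD, KW_PLAYER };

/* Hand-written stand-in for the lookup function gperf would generate. */
static enum keyword lookup_keyword(const char *s)
{
    static const struct { const char *name; enum keyword id; } table[] = {
        { "bet", KW_BET }, { "card", KW_CARD }, { "player", KW_PLAYER },
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(s, table[i].name) == 0)
            return table[i].id;
    return KW_UNKNOWN;
}

/* Splits a writable buffer of "key=value" lines in place with strtok()
 * and counts the lines whose key is a known keyword.
 * Note: strtok() modifies its input and is not reentrant. */
static int count_known_pairs(char *buf)
{
    int known = 0;
    assert(buf != NULL);
    for (char *line = strtok(buf, "\n"); line != NULL; line = strtok(NULL, "\n")) {
        char *eq = strchr(line, '=');
        if (eq == NULL)
            continue;            /* not a key=value line; skip it */
        *eq = '\0';              /* terminate the key; the value starts at eq + 1 */
        if (lookup_keyword(line) != KW_UNKNOWN)
            known++;
    }
    return known;
}
```

In a real converter the loop body would emit the transformed output for each pair instead of counting. Swapping the linear search for the gperf-generated function should not change the calling code much.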

I am using both techniques to remotely control an Asterisk PBX server (telephony system) using the AMI protocol (http://www.voip-info.org/wiki/view/Asterisk+manager+API). The AMI protocol shares a lot with your "poker" language. The main difference is that I am dealing with a real-time system (asynchronous communication, timing issues, network problems, etc.) and I know that a valid input stream always has terminating characters (or I insert a "timeout" token at the socket level, so there is no need for expr**). Unfortunately not all Asterisk modules follow the AMI protocol exactly (rather than a violation of the protocol, view this as an extension of it) and there are a couple of exceptions that make the handwritten code now very ugly. This is where Ragel starts to shine.
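
For illustration, the "terminating characters" part can be sketched in C as follows. AMI messages are "Key: Value" lines ending in CRLF, and a blank line closes the message. The function names are hypothetical and the sketch assumes a NUL-terminated receive buffer; it ignores the timeout handling and the module-specific exceptions mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the length of the first complete message in buf (including the
 * terminating blank line), or 0 if no terminator has arrived yet. */
static size_t ami_message_length(const char *buf)
{
    assert(buf != NULL);
    const char *end = strstr(buf, "\r\n\r\n");
    return end ? (size_t)(end - buf) + 4 : 0;
}

/* Copies the value of header `key` from one message into out.
 * Returns 1 on success, 0 if the key is missing or out is too small. */
static int ami_get_header(const char *msg, size_t msg_len, const char *key,
                          char *out, size_t out_size)
{
    size_t key_len = strlen(key);
    const char *p = msg, *stop = msg + msg_len;

    while (p < stop) {
        const char *eol = strstr(p, "\r\n");
        if (eol == NULL || eol > stop)
            break;
        /* A match needs "key" followed directly by ':' within this line. */
        if ((size_t)(eol - p) >= key_len + 1 &&
            strncmp(p, key, key_len) == 0 && p[key_len] == ':') {
            const char *v = p + key_len + 1;
            while (v < eol && *v == ' ')
                v++;                      /* skip blanks after the colon */
            size_t n = (size_t)(eol - v);
            if (n >= out_size)
                return 0;
            memcpy(out, v, n);
            out[n] = '\0';
            return 1;
        }
        p = eol + 2;                      /* advance to the next line */
    }
    return 0;
}
```

A caller would append socket data to the buffer, call ami_message_length() until it returns non-zero, handle that message, and then drop it from the front of the buffer.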

There are also various text transformation tools out there. I think it could be possible to transform your key-value lists into SQL code without writing a single line of source code (if you don't count the transformation specification as code).

If you have a well-behaved source, the system-supplied tokenizer (+ gperf) is probably preferable, otherwise Ragel. Ragel is more fun, though. There's a Graphviz installer for your Windows machine, and starting from your simple example you could add some output to all available actions to see what's going on at execution time. It won't take long before you get a feeling for how things work and where you must be careful.

Of course, if Adrian had a lot of spare time left over, he could add an "instrumentation" option to Ragel by adding diagnostic code to all states and transitions (essentially adding every allowable action with some code). In the simplest form there would be just some console output. A better solution would be to fire events through a socket, and with a little extra work to control the input, one could write a nice graphics tool to visualize the FSM and its transitions. This would be helpful for beginners, and it would also be useful as a teaching tool for a college-level CS class.
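
To show the kind of instrumentation meant here, below is a small hand-written state machine in C with a trace hook that fires on every transition. This is not Ragel output and not an existing Ragel feature; the states, the "key=value" line format, and all names are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

enum state { ST_KEY, ST_VALUE, ST_ERROR };

/* Hook called on every transition; a real tool might print the event
 * or send it over a socket to a visualizer. */
typedef void (*trace_fn)(enum state from, enum state to, char input, void *ctx);

/* Transition function for lines of the form key=value\n. */
static enum state step(enum state s, char c)
{
    switch (s) {
    case ST_KEY:   return c == '=' ? ST_VALUE : (c == '\n' ? ST_ERROR : ST_KEY);
    case ST_VALUE: return c == '\n' ? ST_KEY : ST_VALUE;
    default:       return ST_ERROR;   /* the error state is a sink */
    }
}

/* Runs the machine over text, reporting each step to trace (if non-NULL). */
static enum state run_traced(const char *text, trace_fn trace, void *ctx)
{
    enum state s = ST_KEY;
    assert(text != NULL);
    for (const char *p = text; *p != '\0'; p++) {
        enum state next = step(s, *p);
        if (trace)
            trace(s, next, *p, ctx);
        s = next;
    }
    return s;
}

/* Example hook: counts the transitions that actually change state. */
static void count_changes(enum state from, enum state to, char input, void *ctx)
{
    (void)input;
    if (from != to)
        ++*(int *)ctx;
}
```

Passing a hook that prints from, to, and input gives the simple console variant; the socket variant would only need a different hook.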

Happy tokenizing,
jg


_______________________________________________
ragel-users mailing list
[email protected]
http://www.complang.org/mailman/listinfo/ragel-users
