Hmm, delightful. Thanks for sharing. There are obviously some very talented people out there :-)
Gotta put this in my input queue for later consumption.

JJ

downs Wrote:

> Justin Johansson wrote:
> >downs Wrote:
> >>
> >> Justin Johansson wrote:
> >>> Can D people please recommend suitable tools for generating a parser (in
> >>> D) for an LL(1) grammar. There's bound to be much better parser
> >>> generator tools available nowadays, since my last foray into this area
> >>> 10+ years ago with YACC. I've heard of tools like bison, SableCC etc but
> >>> apart from the names know nothing about them.
> >>>
> >>> (Note. This question is not about writing a parser for D. It is about
> >>> writing a parser in D for another language which has an LL(1) grammar).
> >>>
> >>> Thanks in advance for all help.
> >>>
> >>> -- Justin Johansson
> >>>
> >> In a completely different vein, tools.rd is a simplistic recursive
> >> descent parser framework implemented at compile time that I've used for
> >> most/all of my toy languages. It keeps things trivial - there's no lexing
> >> stage, it parses straight from input string. It's not that well
> >> documented, but if you want, give me a simple language description and I
> >> can write you a sample parser. It's probably the easiest to use though -
> >> just mix it in from D code :)
> >
> > Hi downs,
> >
> > Thanks for the offer but since YACC is my prior background I'll probably go
> > to the closest tool which is the modern variant for LL(1). Still if you
> > have a small sample to share I'm sure other D people will be delighted.
> >
> > <JJ/>
> >
>
> Well for instance, take the PAD (Pastebin Adventure) component of my IRC bot,
> that can run simple text adventures from a variety of sources, like local
> Gobby sessions, Wikis and (originally) Pastebin.com:
>
> http://dsource.org/projects/scrapple/browser/trunk/idc/pad
>
> Let's look at
> http://dsource.org/projects/scrapple/browser/trunk/idc/pad/engine.d
>
> L175: gotToken
>
> Functions like this form the building blocks of tools.rd parsing. They always
> have the form "bool gotBlarghle(ref string st, out T result)" and return true
> if result could be parsed from st, otherwise false (in which case st is not
> modified).
>
> gotToken trivially removes a token from the input text.
>
> L200: bool accept(ref string st, string cmp): This function is called
> internally by the parser framework to decide if st starts with a comparison
> string, in which case it is removed and true returned. bool accept removes
> tokens from both strings and compares until a comparison fails (false, st not
> modified) or cmp is used up (true).
>
> L230: The first use of the actual Parser DSL.
>
> return mixin(gotMatchExpr("s: log"));
>
> This simply matches "log" against the input string s. Nothing fancy.
>
> L282: Not related to the parser but still, IMHO, insanely cool.
> const string Table = `
>         | bool          | int         | string               | float
> --------+---------------+-------------+----------------------+--------
> Boolean | b             | b           | b?q{true}p:q{false}p | ø
> Integer | i != 0        | i           | Format(i)            | i
> String  | s == q{true}p | atoi(s)     | s                    | atof(s)
> Float   | ø             | cast(int) f | Format(f)            | f`;
>
> This table contains a conversion matrix for internal types to basic types. Two
> things are of interest:
>
> 1) q{}p is unrolled by .litstring_expand() into nested and escaped ""s. It's
> a backport of D2 nestable string literals to D1.
>
> 2) The table itself. tools.ctfe contains functionality to select rows,
> columns, and iterate the table in column-major order. This means the above
> table can be automatically translated into nested if/switch statements.
>
> L487: A more instructive use of the parser framework.
>
> if (mixin(gotMatchExpr("st:
> [==$#eq=true$|!=$#neq=true$|<=$#eq=smaller=true$|>=$#eq=greater=true$|<$#smaller=true$|>$#greater=true$]
> "
> "$dg2 <- genExprMath$"
> ))) { ... }
>
> Okay, first we have a conditional branch: [a|b|c|d]. This matches each of the
> possible branches against the input string in turn. Segments in $$ indicate
> variable matches and/or programmatic reactions. $#eq=smaller=true$ basically
> translates to "execute eq=smaller=true when this part of the parse string is
> successfully reached".
>
> "$dg2 <- genExprMath$" means "Generate dg2 using the genExprMath function". It
> is assumed that this function follows the convention of bool(ref string, out
> typeof(dg2)).
>
> It hasn't been used in that sample, but "y <- foo/x" means "pass x as an
> extra parameter to foo". And that's basically it. :)
>
> Oh, just for fun, here's the unrolled D syntax for the above expression:
>
> (ref string s) {
>   auto scratch = s;
>   return (
>     true && (ref string s) {
>       auto scratch = s;
>       return (true && scratch.accept("==") && (((eq=true), true))) && ((s=scratch), true)
>         || (((scratch=s), true) && scratch.accept("!=") && (((neq=true), true))) && ((s=scratch), true)
>         || (((scratch=s), true) && scratch.accept("<=") && (((eq=smaller=true), true))) && ((s=scratch), true)
>         || (((scratch=s), true) && scratch.accept(">=") && (((eq=greater=true), true))) && ((s=scratch), true)
>         || (((scratch=s), true) && scratch.accept("<") && (((smaller=true), true))) && ((s=scratch), true)
>         || (((scratch=s), true) && scratch.accept(">") && (((greater=true), true))) && ((s = scratch), true);
>     }(scratch) && ( genExprMath(scratch, dg2 ))
>   ) && ((s = scratch), true);
> }(st)
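
For anyone skimming the quoted walkthrough, here is a rough hand-written sketch (D2, Phobos only) of the "bool gotX(ref string st, out T result)" convention and the scratch-copy backtracking that the unrolled expression relies on. This is not tools.rd code: gotInt and gotComparison are made-up names for illustration, and the accept below is a simplified prefix match, not the token-wise comparison that engine.d is described as doing.

```d
import std.algorithm.searching : startsWith;
import std.conv : to;
import std.string : stripLeft;

// Simplified stand-in for accept(): if st (ignoring leading whitespace)
// starts with cmp, consume it and return true; otherwise leave st untouched.
bool accept(ref string st, string cmp)
{
    auto rest = st.stripLeft();
    if (!rest.startsWith(cmp))
        return false;
    st = rest[cmp.length .. $];
    return true;
}

// Hypothetical building block following the gotX convention:
// parse a run of decimal digits into result, or fail without modifying st.
bool gotInt(ref string st, out int result)
{
    auto rest = st.stripLeft();
    size_t n = 0;
    while (n < rest.length && rest[n] >= '0' && rest[n] <= '9')
        ++n;
    if (n == 0)
        return false;
    result = rest[0 .. n].to!int;
    st = rest[n .. $];
    return true;
}

// Hand-rolled analogue of "[==|!=|<=|>=|<|>]" followed by an operand.
// All work happens on a scratch copy; st is only updated on full success.
bool gotComparison(ref string st, out string op, out int rhs)
{
    auto scratch = st;
    bool matched = false;
    // Longer operators come first so "<=" is not mistaken for "<".
    foreach (candidate; ["==", "!=", "<=", ">=", "<", ">"])
    {
        if (scratch.accept(candidate))
        {
            op = candidate;
            matched = true;
            break;
        }
    }
    if (!matched || !gotInt(scratch, rhs))
        return false; // st was never touched, so nothing to undo
    st = scratch;     // commit the consumed input
    return true;
}

unittest
{
    string input = "<= 42 rest";
    string op;
    int rhs;
    assert(gotComparison(input, op, rhs));
    assert(op == "<=" && rhs == 42);

    string bad = "< x";
    assert(!gotComparison(bad, op, rhs));
    assert(bad == "< x"); // failed parse leaves the input as it was
}
```

Compiling with something like "dmd -unittest -main" should run the checks. The point is only the shape: every parser step takes the input by ref, works on a copy, and writes back on success, which is what all the "(s=scratch), true" noise in the unrolled version is doing.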
