On 12/20/12, David A. Wheeler <dwhee...@dwheeler.com> wrote: > Alan Manuel Gloria: >> I pretty much learned parsing from Parsec, so I'm actually more >> familiar with Parsec than with the standard parsing syntax in books. > > Yes, but we have to explain it to *others*. That's an unusual road to learn > parsing!!
Yes! I found yacc/bison too difficult for me, especially regarding shift/reduce and reduce/reduce conflicts (I still have no idea what those are, I only know it has something to do with the state machine, which can either "shift" a token into a stack (??) or "reduce" a sequence on the stack (??) or input stream (????) or something (?????) into some parse rule). I was actually looking for uses of Monads when I stumbled on Parsec, then I followed Haskell's rules and expanded little bits of Parsec parsers by hand, and lo and behold, it reduces to a rather cute recursive-descent parser (especially if you avoid using `try`). (Monads are kewl, btw. It's what I used in letterfall, which is why the letterfall program (the main module) looks almost like old-style BASIC programs. Even though it's a modern GUI program, which implies inversion of control: the inner core is a continuation monad, whose continuation is executed when an event occurs. For various reasons (Guile's heavyweight continuations, in particular, but there are others that are not at all tied to performance, I can't remember why out-of-hand) I don't use call/cc) Personally, I'll probably grok a parser spec if I can get it converted in my head into Parsec (or some elaboration thereof), because I implicitly trust Parsec because it's so kewl (^^);; > My expectation is that most readers will know standard parsing > notation, not Parsec's. I'd also guess (though this is far less certain to > me) that many readers won't know Haskell. Haskell's lazy evaluation is > *really* different from Scheme and ML's eager evaluation, even though they > can all be used as functional languages. Yes, although IMO that's fine. My proposed approach is: 1. We define some syntax for our parser parser (call it FOOPP). FOOPP supports %INDENT %SAME %DEDENT internally as special guard symbols, the rest of FOOPP is standard parsing lore. 2. Write our parser spec in FOOPP input. 3. Run FOOPP on our parser spec. It outputs X code (where X = ANTLR | Parsec | YACC | whatever). 4. Process X code into target code 5. Show that the target parses our examples correctly. -- Of course, admittedly that's a lot more complicated than writing in some pre-existing parser parser language directly (like ANTLR). My main concern is that there is no current parser parser that already implements our notion of INDENT / SAME / DEDENT (i.e. guards, not actual input tokens), so I worry that any spec we make may actually depend on peculiarities of the parser parser's operation. That is, we should be able to define %INDENT %DEDENT %SAME and use them in a parser that isn't t-expr parsing, and show that they are general and valid transformations (or something, anyway). In implementation, I suppose that INDENT/DEDENT/SAME would end up having the bulk of its operations inside the action part. In Parsec, the "action" part of a rule can still signal failure of the rule as a whole (I think this is a property of LL parsers) and Parsec has a valid transformation for that case, I still have to hack into the ANTLR docs to see if it can be done. > I've also found that there may be a future work-related reason for me to > learn ANTLR, which is another reason (for me) to try using it. > > I think I'll try a "stupid stub" grammar with ANTLR v3 and see what I learn > about it. Sometimes a little experiment teaches more than anything else. > Basically, I'll just try a few rules (not trying to process the whole thing > or complicated parts) & see if I can get it to run. > >> we can't use a preprocessor since we are "supposed" to use only flat >> character streams provided by Scheme. > > I think we can express it at a higher level in the BNF that makes it > *appear* to be a preprocessor, and then, in a hand-written recursive descent > parser, optimize that away. I already have some ideas of how to do that. It's pretty standard in LL parsing lore to have "tokens" be just productions, and have inner "lexer" productions use characters directly as tokens. At least this is pretty standard in Parsec parsing, anyway. (^^);; Sincerely, AmkG ------------------------------------------------------------------------------ LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial Remotely access PCs and mobile devices and provide instant support Improve your efficiency, and focus on delivering more value-add services Discover what IT Professionals Know. Rescue delivers http://p.sf.net/sfu/logmein_12329d2d _______________________________________________ Readable-discuss mailing list Readable-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/readable-discuss