On 12/20/12, David A. Wheeler <dwhee...@dwheeler.com> wrote:
> Alan Manuel Gloria:
>> I pretty much learned parsing from Parsec, so I'm actually more
>> familiar with Parsec than with the standard parsing syntax in books.
>
> Yes, but we have to explain it to *others*.  That's an unusual road to learn
> parsing!!

Yes!  I found yacc/bison too difficult for me, especially regarding
shift/reduce and reduce/reduce conflicts (I still have no idea what
those are, I only know it has something to do with the state machine,
which can either "shift" a token into a stack (??) or "reduce" a
sequence on the stack (??) or input stream (????) or something (?????)
into some parse rule).

I was actually looking for uses of Monads when I stumbled on Parsec,
then I followed Haskell's rules and expanded little bits of Parsec
parsers by hand, and lo and behold, it reduces to a rather cute
recursive-descent parser (especially if you avoid using `try`).

(Monads are kewl, btw.  It's what I used in letterfall, which is why
the letterfall program (the main module) looks almost like old-style
BASIC programs.  Even though it's a modern GUI program, which implies
inversion of control: the inner core is a continuation monad, whose
continuation is executed when an event occurs.  For various reasons
(Guile's heavyweight continuations, in particular, but there are
others that are not at all tied to performance, I can't remember why
out-of-hand) I don't use call/cc)

Personally, I'll probably grok a parser spec if I can get it converted
in my head into Parsec (or some elaboration thereof), because I
implicitly trust Parsec because it's so kewl (^^);;

>  My expectation is that most readers will know standard parsing
> notation, not Parsec's.  I'd also guess (though this is far less certain to
> me) that many readers won't know Haskell.  Haskell's lazy evaluation is
> *really* different from Scheme and ML's eager evaluation, even though they
> can all be used as functional languages.

Yes, although IMO that's fine.  My proposed approach is:

1.  We define some syntax for our parser parser (call it FOOPP).
FOOPP supports %INDENT %SAME %DEDENT internally as special guard
symbols, the rest of FOOPP is standard parsing lore.

2.  Write our parser spec in FOOPP input.

3.  Run FOOPP on our parser spec.  It outputs X code (where X = ANTLR
| Parsec | YACC | whatever).

4.  Process X code into target code

5.  Show that the target parses our examples correctly.

--

Of course, admittedly that's a lot more complicated than writing in
some pre-existing parser parser language directly (like ANTLR).  My
main concern is that there is no current parser parser that already
implements our notion of INDENT / SAME / DEDENT (i.e. guards, not
actual input tokens), so I worry that any spec we make may actually
depend on peculiarities of the parser parser's operation.

That is, we should be able to define %INDENT %DEDENT %SAME and use
them in a parser that isn't t-expr parsing, and show that they are
general and valid transformations (or something, anyway).

In implementation, I suppose that INDENT/DEDENT/SAME would end up
having the bulk of its operations inside the action part.  In Parsec,
the "action" part of a rule can still signal failure of the rule as a
whole (I think this is a property of LL parsers) and Parsec has a
valid transformation for that case, I still have to hack into the
ANTLR docs to see if it can be done.

> I've also found that there may be a future work-related reason for me to
> learn ANTLR, which is another reason (for me) to try using it.
>
> I think I'll try a "stupid stub" grammar with ANTLR v3 and see what I learn
> about it.  Sometimes a little experiment teaches more than anything else.
> Basically, I'll just try a few rules (not trying to process the whole thing
> or complicated parts) & see if I can get it to run.
>
>> we can't use a preprocessor since we are "supposed" to use only flat
>> character streams provided by Scheme.
>
> I think we can express it at a higher level in the BNF that makes it
> *appear* to be a preprocessor, and then, in a hand-written recursive descent
> parser, optimize that away.  I already have some ideas of how to do that.

It's pretty standard in LL parsing lore to have "tokens" be just
productions, and have inner "lexer" productions use characters
directly as tokens.  At least this is pretty standard in Parsec
parsing, anyway. (^^);;

Sincerely,
AmkG

------------------------------------------------------------------------------
LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial
Remotely access PCs and mobile devices and provide instant support
Improve your efficiency, and focus on delivering more value-add services
Discover what IT Professionals Know. Rescue delivers
http://p.sf.net/sfu/logmein_12329d2d
_______________________________________________
Readable-discuss mailing list
Readable-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/readable-discuss

Reply via email to