+++ Michel Fortin [May 11 08 22:26]:
> On 2008-05-11 at 20:55, Jacob Rus wrote:
>
>> You should write it in something closer to a BNF-like format. The
>> current version is about 10x more verbose than necessary, and it makes
>> reading the spec considerably more difficult.
>
> The reason I'm doing it like this is that I doubt everything will be
> expressible in a BNF format.

You can come pretty close with a PEG grammar:

http://github.com/jgm/peg-markdown/tree/master/markdown_parser.leg#L236

I have implemented the basic markdown syntax + the footnote syntax from
PHP markdown extra, and so far I've found only two things that can't be
cleanly expressed using a PEG:

1. Indented block contexts like lists and blockquotes. Here I use a
multi-pass approach. The first pass takes, say, a list item

    1. my list item
       - with
       - nested list

and returns a listitem with "raw" contents

    my list item
    - with
    - nested list

which are piped through the markdown parser again.

2. Inline code. PEG can't express "a row of backticks, followed by a
string of characters not containing an equally long row of backticks,
followed by an equally long row of backticks." It can express, for
particular values of N, "a row of N backticks, followed by a string of
characters not containing a row of N backticks, followed by a row of N
backticks." So if you have a fixed limit on the number of backticks that
can start a stretch of inline code, you're okay. peg-markdown sets this
limit at 5, which should be enough for most purposes. But one could set
it higher without much of a performance penalty.

The PEG representation is concise, precise, and readable. But the big
advantage is that it can be converted automatically into a fast parser.
This means that you can be sure that your markdown program really does
implement the formal specification. An informal English specification
won't give you that.

John
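
To make the backtick point concrete, here is a minimal Python sketch of the "one alternative per fixed N" idea. It is not peg-markdown's actual grammar or code: the names `MAX_TICKS`, `ticks`, `code_span_n`, and `code_span` are hypothetical, and stripping the surrounding whitespace from the span is an assumption made for illustration. Each value of N gets its own rule, and the rules are tried in order, the way a PEG ordered choice would try them.

```python
MAX_TICKS = 5  # fixed limit; the post says peg-markdown uses 5


def ticks(text, pos, n):
    """Match a run of exactly n backticks at pos; return the end index or None."""
    end = pos + n
    if text[pos:end] == "`" * n and text[end:end + 1] != "`":
        return end
    return None


def code_span_n(text, pos, n):
    """One fixed-N alternative: n backticks, content with no run of exactly
    n backticks, then n backticks. Returns (content, end index) or None."""
    start = ticks(text, pos, n)
    if start is None:
        return None
    i = start
    while i < len(text):
        if text[i] == "`":
            close = ticks(text, i, n)
            if close is not None:
                return text[start:i].strip(), close
            # A backtick run of some other length is ordinary content: skip it.
            while i < len(text) and text[i] == "`":
                i += 1
        else:
            i += 1
    return None  # no matching closer


def code_span(text, pos=0):
    """Ordered choice over the fixed alternatives N = 1 .. MAX_TICKS."""
    for n in range(1, MAX_TICKS + 1):
        result = code_span_n(text, pos, n)
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    print(code_span("`a`"))
    print(code_span("`` a ` b `` rest"))
    print(code_span("``````x``````"))  # six backticks exceed the limit
```

Raising `MAX_TICKS` only adds more alternatives to the loop, which mirrors the remark above that a higher limit costs little.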
