Re: Backtick Hickup

Eric Astor Mon, 27 Aug 2007 14:03:52 -0700

Michel Fortin wrote:

As to how to parse it with an incremental parser, I assume you could do that:
    text: this
    mark: **
    text: is
    mark: `
    (switch tokenizer into "raw" mode until it sees a backtick)
    text: raw** text
    mark: `
    (take last text token, remove backtick marks, and make a code span)
    (switch back tokenizer into "span" mode)
    end reached in span
The hard part comes when no matching backtick is found (assuming non-paired backticks do not constitute code). Here's what I suggest for the same case with no ending backtick:
    text: this
    mark: **
    text: is
    mark: `
    (switch tokenizer into "raw" mode until it sees a backtick)
    text: raw** text
    end reached in raw
      (reparse last text token in "span" mode)
        text: raw
        mark: **
        (take tokens between the two ** marks and put them in emphasis, the two marks are removed)
        text: text
        end
Note that in this case backtracking is limited to the last token, which is itself limited in length by the current block (paragraph, list item, ...). I have no idea how that could fit any formal grammar language, however.
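
For illustration, the two traces above could be sketched roughly as follows in Python. This is only a minimal sketch, not anyone's actual implementation: the names `MARK` and `tokenize_span` are hypothetical, only `**` and a single backtick are treated as marks, and an unmatched backtick is kept as literal text (an assumption; the trace above does not say what happens to it).

```python
import re

# Hypothetical mark set for this sketch: only "**" and a single backtick.
MARK = re.compile(r"\*\*|`")

def tokenize_span(text):
    """Tokenize one block in "span" mode, with a "raw" mode for backticks."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MARK.search(text, pos)
        if m is None:
            # No more marks: the rest of the block is plain text.
            tokens.append(("text", text[pos:]))
            break
        if m.start() > pos:
            tokens.append(("text", text[pos:m.start()]))
        if m.group() == "`":
            # "Raw" mode: look for the closing backtick within this block.
            close = text.find("`", m.end())
            if close == -1:
                # End reached in raw mode. Assumption: keep the lone
                # backtick as literal text, then let the loop reparse
                # everything after it in span mode.
                tokens.append(("text", "`"))
                pos = m.end()
            else:
                # Matched pair: the raw text becomes a single code span.
                tokens.append(("code", text[m.end():close]))
                pos = close + 1
        else:
            tokens.append(("mark", "**"))
            pos = m.end()
    return tokens

print(tokenize_span("this **is `raw** text`"))
print(tokenize_span("this **is `raw** text"))
```

In the second call the text after the unmatched backtick is scanned a second time, which is the only backtracking involved, and it cannot extend past the end of the block passed in. Pairing up the `**` marks into emphasis would be a separate pass over the returned tokens.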

Well - has anyone else looked into ANTLR 3.0 at all? The LL(*) grammar language it uses (an EBNF) allows for full backtracking support, and unspecified lookahead as far as necessary. It's fairly well-optimized, as I understand it, taking advantage of some of the packrat-parsing ideas to save handling a single text section repeatedly...

I suspect Markdown might be formally specifiable in ANTLR v3, and I'd bet that even if it's not, it's very close. If it is - getting Markdown parsers into various languages would just be a matter of helping develop new ANTLR v3 language-translation backends.


- Eric Astor
_______________________________________________
Markdown-Discuss mailing list
[email protected]
http://six.pairlist.net/mailman/listinfo/markdown-discuss
