Re: Parsing Code Blocks

Michel Fortin Thu, 22 May 2008 20:25:37 -0700

On 2008-05-16 at 0:31, Yuri Takhteyev wrote:

Your first two examples are not treated as the same by any
implementation.  It seems that all implementations interpret this:

~~~

One

Two

   Three

  Four

  Five
~~~

as meaning that "One" is in a code block, but "Two" is not.

Or did you mean to put a few more spaces in front of "Two"?

Hum, yes I did, and in fact I had. It just looks like my email client (Mac OS X's Mail) ate the first space on each line that begins with a space... I really wish it wasn't using WebKit as its text editor when in text-only mode.

[spec]: <http://michelf.com/specs/markdown-extra/#block-element-generator>


I think it would help if the spec made it clearer what part of
each line of the blockquote is consumed before we go looking for
sub-elements, especially as far as consuming initial whitespace goes.


Quoting item 2 of blockquote (at the moment you wrote the above):

> A run of the [block element generator](#block-element-generator) by
> pushing the following sequence to the <var>context-line-prefix</var>
> stack:
> 1.  Zero or one [insignificant-indent](#insignificant-indent)
> 2.  ">"
> 3.  Zero or one [space](#space)

This means that the block element generator is used as a grammar rule at this point. It matches if it can generate one or more block elements. Since each rule in the block generator first checks for a hard-block-content-line-prefix, you could check for yourself that you can match a hard-block-content-line-prefix prior to calling the generator (this *could* be more performant).
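
A minimal sketch of what the three-item prefix quoted above consumes from each line before any sub-element is looked for. It assumes that an insignificant-indent is one to three spaces, as in conventional Markdown; the spec's own definition may differ. The names `BLOCKQUOTE_PREFIX`, `strip_blockquote_prefix` and `blockquote_content` are hypothetical and do not come from the spec or from any implementation.

```python
import re

# Assumption: "zero or one insignificant-indent" means zero to three spaces.
# Then a literal ">", then zero or one space.
BLOCKQUOTE_PREFIX = re.compile(r"^ {0,3}> ?")


def strip_blockquote_prefix(line):
    """Return the line with one blockquote prefix removed.

    Returns None when the line does not start with the prefix.
    """
    match = BLOCKQUOTE_PREFIX.match(line)
    if match is None:
        return None
    return line[match.end():]


def blockquote_content(lines):
    """Collect the stripped content of the leading lines that carry the prefix."""
    content = []
    for line in lines:
        stripped = strip_blockquote_prefix(line)
        if stripped is None:
            break
        content.append(stripped)
    return content
```

Under these assumptions only one space after the ">" is consumed, so a line made of ">" followed by five spaces and some text still has four spaces of indentation left when the sub-elements are parsed.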


I've added this to the block element generator section:

> The block element generator is used as a parsing rule in the grammar of
> the document element generator and the block element generator. The block
> element generator matches if one of the following rules matches and creates
> an element.

That said, I decided to revamp the blockquote rule so that it no longer uses the block element generator directly. Everything now passes through a rule named block-element-run, which matches one or more block elements (using the block-element generator), and the blockquote's first ">" is parsed separately in the blockquote rule instead of indirectly from attempting to parse block elements.
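
A rough sketch of the "one or more block elements" idea behind block-element-run. This is not the spec's grammar: the two toy rules (`code_block`, `paragraph`), the tuple representation of elements, and the skipping of blank lines between elements are all assumptions made for illustration.

```python
def code_block(lines):
    """Toy rule: a run of lines indented by at least four spaces."""
    n = 0
    while n < len(lines) and lines[n].startswith("    "):
        n += 1
    if n == 0:
        return None
    return ("code", [line[4:] for line in lines[:n]]), lines[n:]


def paragraph(lines):
    """Toy rule: a run of non-blank lines that are not code-indented."""
    n = 0
    while n < len(lines) and lines[n].strip() and not lines[n].startswith("    "):
        n += 1
    if n == 0:
        return None
    return ("p", lines[:n]), lines[n:]


def block_element_run(lines, rules):
    """Match one or more block elements at the start of `lines`.

    Returns (elements, remaining_lines), or None if no rule matches,
    so the run itself can be used as a rule by a caller.
    Each rule must consume at least one line when it matches.
    """
    elements = []
    rest = lines
    while True:
        # Assumption: blank lines only separate elements.
        while rest and not rest[0].strip():
            rest = rest[1:]
        if not rest:
            break
        for rule in rules:
            result = rule(rest)
            if result is not None:
                element, rest = result
                elements.append(element)
                break
        else:
            break  # no rule matched: the run ends here
    if not elements:
        return None
    return elements, rest
```

For example, `block_element_run(["One", "", "    Two"], [code_block, paragraph])` yields a paragraph element followed by a code element, and an input with no matching element makes the whole run fail rather than succeed with nothing.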


Does this make it clearer?

By the way, I agree things are not optimal at the moment. They are also way off the tracks of what PHP Markdown and Markdown.pl actually do in many cases. The plan is to start by making something that mostly works. Then I'll compare with the actual regular expressions used in the code and make adjustments as necessary. After that, I'll compare with the test cases in MDTest, and with the output given by other implementations in Babelmark. And I might mix the order a bit.



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/


