I have only played a little with PetitParser, but I think the answer is in
PetitXml>>#element. The action block there compares the "qualified" names of
the open and close tags and returns a PPFailure if they differ. The same block
also takes care of the inlineTag case by checking whether the fifth node is
'/>'.
element
    "[39] element ::= EmptyElemTag | STag content ETag"

    ^ $< asParser , qualified , attributes , whitespace optional ,
        ('/>' asParser
            / ($> asParser , content , [ :stream | stream position ] asParser ,
                '</' asParser , qualified , whitespace optional , $> asParser))
        ==> [ :nodes |
            "Inline (empty-element) tag: the fifth node is the literal '/>'"
            nodes fifth = '/>'
                ifTrue: [ Array with: nodes second with: nodes third with: #() ]
                ifFalse: [
                    "Open and close tag names must be equal"
                    nodes second = nodes fifth fifth
                        ifTrue: [ Array with: nodes second with: nodes third with: nodes fifth second ]
                        ifFalse: [ PPFailure
                            message: 'Expected </' , nodes second qualifiedName , '>'
                            at: nodes fifth third ] ] ]
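To make the indexing in that block easier to follow, here is a minimal sketch of the shape of nodes for an input like <a>hi</a>. The array is built by hand rather than by a real parse, and the values are only stand-ins (plain strings instead of name objects, #() for the attributes, nil for the optional whitespace, a made-up position), so treat it as an illustration of the layout and not as PetitXml output.

```smalltalk
"Hand-built stand-in for the nodes of <a>hi</a>; not produced by a real parse."
| nodes |
nodes := {
    $<.      "1: opening bracket"
    'a'.     "2: open tag name (plain string standing in for the name object)"
    #().     "3: attributes (empty stand-in)"
    nil.     "4: optional whitespace (stand-in)"
    { $>. 'hi'. 5. '</'. 'a'. nil. $> } }.   "5: the rest of a non-empty element"

nodes fifth = '/>'.                "false, so this is not an inline tag"
nodes second = nodes fifth fifth.  "true, the open and close names agree"
nodes fifth second.                "'hi', the content"
nodes fifth third.                 "5, the position recorded just before '</'"
```

So nodes fifth fifth is the name in the close tag, nodes fifth second is the content, and nodes fifth third is the stream position that the failure message points at.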
I hope this helps.
Cheers,
Richo
On Mon, Apr 25, 2011 at 6:42 PM, Esteban Lorenzano <[email protected]> wrote:
> Hi Lukas, all
> I'm finally working on an HTML petit parser (a very basic one, based on the
> XML petit parser) and I have a serious problem (well... besides my complete
> ignorance about petit parser, heh...)
> I need to match this pattern:
>
> openTag, contents, closeTag (something like "<html> ... </html>")
> inlineTag (something like "<br/>")
> openTag (something like "<link ...>" or "<img src='anUrl'>")
>
> So, after trying some variants, I came up with this construct:
>
> element
>     "[39] element ::= EmptyElemTag | STag content ETag"
>
>     ^ (self inlineTag / (self openTag, content, self closeTag) / self openTag)
>         ==> [ :nodes | ].
>
> openTag
> ^ $< asParser, qualified, whitespace optional, attributes,
> whitespace optional, $> asParser
>
> inlineTag
> ^ $< asParser, qualified, whitespace optional, attributes,
> whitespace optional, '/>' asParser
>
> closeTag
> ^'</' asParser , qualified , whitespace optional , $> asParser
>
>
> so... the problem here is that the statement
>
> self openTag, contents, self closeTag
>
> matches
>
> ...
> <link ...>
> </html>
>
> and for that reason, the resulting tree is invalid.
>
> So, I need a way to ensure the openTag name is equal to the closeTag name.
>
> How can I do that?
>
> Cheers,
> Esteban
>