I played with PEGs a while ago. I recommend reading the Wikipedia article on parsing expression grammars. You can use a couple of lookahead-type tricks; pay attention to the bit about "syntactic predicates."
I can't remember the exact phrasing in PetitParser, but there should be an "and" predicate and a "not" predicate that effectively give you lookahead, because they don't consume any input... or something like that. Once you've grokked the whole predicates bit, you'll probably look at PetitParser again and it'll just make sense. That's how it went for me anyhow, though it's all rather jumbled by now because it was about a week of my life a year ago ;)

On Apr 25, 2011, at 2:42 PM, Esteban Lorenzano <[email protected]> wrote:

> Hi Lukas, all
>
> I'm finally working on a HTML petit parser (a very basic one, based on XML
> petit parser) and I have a serious problem (well... besides my complete
> ignorance about petit parser, he...)
>
> I need to match this pattern:
>
> openTag, contents, closeTag (that will be something like "<html> ... </html>")
> inlineTag (that will be something like "<br/>")
> openTag (that will be something like "<link ...>" or "<img src='anUrl'>")
>
> so, after try some variants... I came with this construct:
>
> element
>     "[39] element ::= EmptyElemTag | STag content ETag"
>
>     ^(self inlineTag / (self openTag, content, self closeTag) / self openTag)
>         ==> [ :nodes | ].
>
> openTag
>     ^ $< asParser, qualified, whitespace optional, attributes, whitespace
>         optional, $> asParser
>
> inlineTag
>     ^ $< asParser, qualified, whitespace optional, attributes, whitespace
>         optional, '/>' asParser
>
> closeTag
>     ^'</' asParser , qualified , whitespace optional , $> asParser
>
>
> so... the problem here is that the statement
>
> self openTag, contents, self closeTag
>
> matchs with
>
> ...
> <link ...>
> </html>
>
> and for that reason, the resulting tree is invalid.
>
> So, I need a way to ensure the openTag name is equal to the closeTag name.
>
> How can I do that?
>
> Cheers,
> Esteban
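
To make the "and"/"not" predicate idea concrete, here is a rough sketch in Python rather than PetitParser, since I don't want to guess at its exact selectors. Everything in it (`lit`, `seq`, `star`, `any_char`, `and_pred`, `not_pred`) is a made-up helper for illustration, not a real library. It only shows that a predicate reports whether a match would succeed while leaving the input position where it was.

```python
# Hypothetical mini-PEG for illustration only; this is NOT PetitParser's API.
# A parser is a function (text, pos) -> new position, or None on failure.

def lit(s):
    """Match the literal string s and consume it."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def any_char(text, pos):
    """Consume exactly one character, failing at end of input."""
    return pos + 1 if pos < len(text) else None

def seq(*parsers):
    """Match each parser in order; fail as soon as one fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def star(p):
    """Greedily repeat p zero or more times; never fails."""
    def parse(text, pos):
        while True:
            nxt = p(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return parse

def and_pred(p):
    """'And' predicate: succeed if p would match here, consuming nothing."""
    def parse(text, pos):
        return pos if p(text, pos) is not None else None
    return parse

def not_pred(p):
    """'Not' predicate: succeed if p would NOT match here, consuming nothing."""
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse

# Usage: consume characters up to, but not including, a closing tag.
close_html = lit("</html>")
content = star(seq(not_pred(close_html), any_char))

text = "<br/></html>"
end = content(text, 0)
print(text[:end], "|", text[end:])
```

The `content` rule is the usual lookahead trick: before each character it checks that the closing tag does not start at the current position, so the closing tag is left in the input for the next rule to consume.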
