How do I support nested comments with ANTLR?

The standard example won't work with either greedy or non-greedy
matching: it matches either too much or not enough.

Here is a basic sample:
        <!---
                a comment
                <!--- nested comment --->
                still comment
        --->
        this is not commented
        <!--- more commenting --->

I want to tokenise comments, rather than ignore them, since the text
inside might be significant later on.

How the comments are tokenised doesn't matter - so I don't mind if
that first comment is split into one, two, or three tokens.
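
To make the intended behaviour concrete, here is a minimal sketch in
Python (not ANTLR) of the depth counting I'd like the lexer to do. The
names split_comments, OPEN, and CLOSE are made up for illustration, and
treating an unterminated comment as a COMMENT token is just my
assumption. It emits each outermost comment as a single token:

```python
OPEN, CLOSE = "<!---", "--->"

def split_comments(text):
    """Split text into ("COMMENT", chunk) and ("TEXT", chunk) pairs.

    Nested comments are folded into their outermost comment by
    tracking the nesting depth, so no recursion is needed.
    """
    tokens = []
    depth = 0   # current comment nesting level
    start = 0   # index where the current chunk began
    i = 0
    while i < len(text):
        if text.startswith(OPEN, i):
            if depth == 0:
                # Entering an outermost comment: flush any plain text first.
                if i > start:
                    tokens.append(("TEXT", text[start:i]))
                start = i
            depth += 1
            i += len(OPEN)
        elif depth > 0 and text.startswith(CLOSE, i):
            depth -= 1
            i += len(CLOSE)
            if depth == 0:
                # Outermost comment closed: emit it as one token.
                tokens.append(("COMMENT", text[start:i]))
                start = i
        else:
            i += 1
    if start < len(text):
        # Assumption: an unterminated comment is still reported as COMMENT.
        tokens.append(("COMMENT" if depth else "TEXT", text[start:]))
    return tokens
```

Calling split_comments on the sample above should give one COMMENT
token for the nested block, a TEXT token for the uncommented line, and
a second COMMENT token for the last comment.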


However, given that comments can be nested to an arbitrary level, I'm
having trouble working out how to define the rule without recursing
into itself.

(For practical purposes, I could probably set a maximum of maybe
ten to twenty levels deep, but I'd prefer not to create potential edge
cases like that.)


Here's my current attempt...

        COMMENT:
                SUB_CommentStart
                ( SUB_NoComment | COMMENT )*
                SUB_CommentEnd
        ;
        
        fragment SUB_CommentStart : '<!---' ;
        fragment SUB_CommentEnd   : '--->' ;
        fragment SUB_NoComment    : ~(SUB_CommentStart|SUB_CommentEnd)+;


This of course fails because COMMENT includes itself - so what's the
best way to fix that?
