Hi,

Thanks to both of you for sharing your approaches. Right now I am pondering how to alter the sequence of tokens before they reach the parser. Intuitively, I want three processing units (lexer, preprocessor, parser) connected through I/O pipes of tokens (e.g. token FIFOs), but this is not how ANTLR was architected (it is how I would have done it in hardware, though!).
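For illustration, the three-stage arrangement described above can be sketched with Python generators, where each stage pulls tokens lazily from the one before it, much like a FIFO between hardware blocks. This is only a sketch and does not use ANTLR's API: the names lexer, preprocessor, parser and TOKEN_RE are hypothetical, the macro table is passed in as a dictionary instead of being built from define directives, and macros with arguments are not handled.

```python
import re
from typing import Dict, Iterable, Iterator, List

# Hypothetical token pattern: a backtick-prefixed name, a word,
# or any other single non-space character.
TOKEN_RE = re.compile(r"`\w+|\w+|\S")


def lexer(text: str) -> Iterator[str]:
    """Stage 1: yield raw tokens one at a time."""
    for match in TOKEN_RE.finditer(text):
        yield match.group()


def preprocessor(tokens: Iterable[str],
                 macros: Dict[str, List[str]]) -> Iterator[str]:
    """Stage 2: replace macro tokens with their expanded bodies.

    Assumption: macros are not self-referential; a recursive
    definition would recurse without bound in this sketch.
    """
    for tok in tokens:
        if tok.startswith("`") and tok[1:] in macros:
            # Feed the macro body back through this stage so that
            # nested macros are expanded as well.
            yield from preprocessor(macros[tok[1:]], macros)
        else:
            yield tok


def parser(tokens: Iterable[str]) -> List[str]:
    """Stage 3: stand-in for a real parser; it only drains the stream."""
    return list(tokens)


# Macro table equivalent to: `define x 3  and  `define y `x
macros = {"x": ["3"], "y": ["`x"]}
print(parser(preprocessor(lexer("a = `y ;"), macros)))  # ['a', '=', '3', ';']
```

Because every stage is a generator, no token is produced until the parser asks for it, so the preprocessor can insert or drop tokens without the lexer or parser knowing about it.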
Martin

On 11-04-04 09:25 AM, Sam Harwell wrote:
> I used a hand-crafted implementation of TokenSource between the lexer and
> parser. In the preprocessor, whenever I manipulated a token I used a new
> token class derived from CommonToken (call it SubstitutedToken) which
> contained a linked list leading from the effective position in the stream
> (stored in CommonToken) all the way back to the original location (file and
> position) of the token definition. When a CommonToken substitution occurs,
> the linked list has one node containing the original source position where
> defined. Whenever a SubstitutedToken substitution occurs, a new node for the
> token's previous effective position is added to the linked list and that new
> head pointer is stored in the new token.
>
> `define x 3
> `define y `x
> `y
>
> In this case, token `y is eventually replaced with a SubstitutedToken which
> appears at (line 2, column 1, length 1, text "3") containing the following
> linked list:
>
> Line 3, column 1, length 2 (list head, the location where `y was substituted
> with `x)
> Line 2, column 11, length 2 (the location where `x was substituted with '3')
> Line 1, column 11, length 1 (the actual source location where the token '3'
> is defined)
>
> This list allows true relative ordering of all tokens in the processed
> source: when two tokens appear to be at the same location in the
> preprocessed stream, you simply compare the positions of the first node in
> the position list.
>
> Sam
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of A Z
> Sent: Monday, April 04, 2011 12:13 AM
> To: Martin d'Anjou
> Cc: [email protected]
> Subject: Re: [antlr-interest] Q: how to incorporate a preprocessor in the
> flow?
>
> Hi Martin,
>
> I just completed an SV preprocessor which can parse UVM 1.0 successfully.
> After 2 revisions I settled on a completely separate preprocessor (lexer and
> parser). As you saw, you need to tokenize the macro_text in order to easily
> support macros with arguments and detect the three escaped tokens `", `\`"
> and ``. I'm not sure how well a lexer-only approach could handle cases where
> a macro substitution can merge text with a previously lexed token. The
> separate approach still has flaws, such as the difficulty of good error
> reporting. Of course I could be missing an obvious easy solution.
>
> On Sun, Apr 3, 2011 at 9:51 PM, Martin d'Anjou <[email protected]> wrote:
>
>> Hello,
>>
>> I am trying to find a way to incorporate a preprocessor in the ANTLR
>> flow. I thought of doing this before the lexer, but I need to tokenize
>> the incoming char stream for macro substitution to be easy. I thought
>> of doing it between the lexer and the parser, and replace the
>> preprocessor tokens with their expansion before feeding the token
>> stream to the parser, so I guess I would end up using something like
>> the TokenRewriteStream??? Can someone steer me in the right direction
>> please? Or should I be using lexer rule actions? In which case, any
>> example on how to access the token stream of the replacement token
>> list of an identifier? Too many questions sorry.
>>
>> The language I am hoping to tokenize is SystemVerilog and has C-like
>> preprocessor macros (`include, `ifdef, `define NAME(params,...), token
>> concatenation, etc.).
>>
>> Regards,
>> Martin
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe:
>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
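The position chain Sam describes in the quoted message above can be sketched as an immutable linked list that gains one node per substitution. This is a plain-Python illustration under stated assumptions, not ANTLR code: Position, Node, Token, source_token, substitute and chain are hypothetical names, and the sketch keeps the effective position as the head of the list so that it reproduces the three-node list from Sam's example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Position:
    line: int
    column: int
    length: int


@dataclass(frozen=True)
class Node:
    pos: Position
    next: Optional["Node"] = None  # points towards the original definition


@dataclass(frozen=True)
class Token:
    text: str
    origin: Node  # head = where the token effectively appears


def source_token(text: str, line: int, column: int) -> Token:
    """A token as written in the source: a chain with a single node."""
    return Token(text, Node(Position(line, column, len(text))))


def substitute(replacement: Token, site: Position) -> Token:
    """Return `replacement` as it appears after being substituted at `site`.

    The previous chain is shared, not copied; only a new head is added.
    """
    return Token(replacement.text, Node(site, replacement.origin))


def chain(token: Token) -> List[Tuple[int, int, int]]:
    """Walk the chain from the effective position back to the definition."""
    out = []
    node = token.origin
    while node is not None:
        out.append((node.pos.line, node.pos.column, node.pos.length))
        node = node.next
    return out


# The example from the message:
#   line 1: `define x 3
#   line 2: `define y `x
#   line 3: `y
three = source_token("3", 1, 11)                 # '3' as defined on line 1
in_y_body = substitute(three, Position(2, 11, 2))  # `x replaced by 3
final = substitute(in_y_body, Position(3, 1, 2))   # `y replaced by its body
print(final.text, chain(final))
# 3 [(3, 1, 2), (2, 11, 2), (1, 11, 1)]
```

Sharing the tail of the list means each substitution costs one new node regardless of how deep the expansion already is.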
