This is a very tricky thing to do perfectly, but not so difficult to do as a 'best guess' type of algorithm. It is manageable, for instance, if the comments are found before certain tokens and can just be pushed to the output before the translated version (like doxygen or javadoc comments), or if 'comments close by' is a reasonable guess. It is difficult to speak to your problem generically, but some translations make this easy enough and some make it very difficult.
However, what you will need to do is locate the token that 'starts' your construct output, then find its equivalent token position in the original tokenized input stream. If the token in the tree is from the original input stream then it is easy; otherwise you can use the user1, user2, user3 fields of a token to record the token that 'starts' the code you have translated, or perhaps the start and end tokens of the comment block.

Now, knowing the input token position, you can traverse backwards in the token stream (use get and not LT, as LT skips off-channel tokens) and find the first of the comment tokens that precede it (by checking each token's channel). This will be easier if you set the comments to a particular channel and not just HIDDEN (which is channel 99). When you know the token position of the first comment token, you can traverse forwards and copy the token text to the output (changing the comment lead-in characters should you need to) using the pointers available in the token (which point to the original text).

So, you just need to get familiar with asking the tree nodes for their tokens, then asking the tokens what index they are, and using the get methods to access the tokens in the input stream. So, given:

    // A comment
    // Another
    // yet another
    int Cfunc( ....

if the comments are going on channel 2 then you will have:

    0 COMMENT
    1 COMMENT
    2 COMMENT
    3 ID
    4 ID
    5 LPAREN

Now, your first parser is probably going to generate:

    ^(FUNCDECL ID ID .....)

You can now attach the index of the first comment (0) to user1 and the index of the last comment (2) to user2 of, say, FUNCDECL or the first ID. Assuming that the token is preserved through all the rewrites, this information will propagate to your final AST. Of course this is just illustrating what you need to do generally, as I do not know exactly what you are trying to do.
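To make the backward scan concrete, here is a rough sketch in plain C++. It does not use the ANTLR3 C runtime at all: Token, CommentSpan, commentsBefore and copyComments are hypothetical names invented for illustration, and the token stream is modelled as a simple vector. It also assumes comments are on channel 2 as in the example above, and that anything on the default channel (0) ends the search. In a real translator you would fetch tokens through the token stream's get method and store the two indexes in user1 and user2.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a token; NOT the ANTLR3 C runtime type.
struct Token {
    std::string text;  // original source text of the token
    int channel;       // 0 = default, 2 = comments (assumed), 99 = HIDDEN
};

constexpr int kDefaultChannel = 0;
constexpr int kCommentChannel = 2;  // assumption: the lexer sends comments here

// Indexes of the first and last comment tokens in front of a construct.
// These are the values you might record in user1 and user2.
struct CommentSpan {
    bool found = false;
    std::size_t first = 0;
    std::size_t last = 0;
};

// Walk backwards from the token that starts the construct. Comment tokens
// extend the span, other off-channel tokens (e.g. whitespace) are skipped,
// and the first default-channel token stops the search.
CommentSpan commentsBefore(const std::vector<Token>& tokens,
                           std::size_t tokenIndex) {
    assert(tokenIndex <= tokens.size());
    CommentSpan span;
    for (std::size_t i = tokenIndex; i > 0; --i) {
        const Token& t = tokens[i - 1];
        if (t.channel == kDefaultChannel) break;
        if (t.channel != kCommentChannel) continue;
        if (!span.found) {
            span.found = true;
            span.last = i - 1;  // nearest comment to the construct
        }
        span.first = i - 1;     // keeps moving back to the earliest comment
    }
    return span;
}

// Walk forwards over the span and copy the comment text, one per line.
std::string copyComments(const std::vector<Token>& tokens,
                         const CommentSpan& span) {
    std::string out;
    if (!span.found) return out;
    for (std::size_t i = span.first; i <= span.last; ++i) {
        if (tokens[i].channel == kCommentChannel) {
            out += tokens[i].text;
            out += '\n';
        }
    }
    return out;
}
```

With the six-token example stream, calling commentsBefore with index 3 (the first ID) gives first = 0 and last = 2, and copyComments then returns the three comment lines in their original order. Stopping at the first default-channel token is only one possible 'comments close by' rule; a blank-line check or a line-distance limit may suit your language better.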
Jim

> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-
> [email protected]] On Behalf Of Howard Nasgaard
> Sent: Wednesday, August 11, 2010 1:13 PM
> To: [email protected]
> Subject: [antlr-interest] How do I preserve comments in a language to
> language translator
>
> I am writing a translator that will convert from one version of a language to a
> newer version of that language. The versions are syntactically similar so their
> underlying ASTs are similar. I am using parsers for the grammar and tree
> grammars generated as C++. The old language is parsed and an AST is built.
> Then numerous walks of the AST are done using generated tree grammars.
> One of the walks creates a new AST, the translation, which conforms to the
> tree hierarchy that describes the new language elements. A final walk of the
> new AST "pretty prints" the translation.
>
> As part of the translation walk, or whatever works, I would like to copy as
> many of the comment tokens across to the new AST as possible. Based on
> my reading, the comments are there as they are being directed to the
> HIDDEN channel. It is just not clear how, in my tree grammar, I would access
> them. I have been unable to find any descriptions of how to do this that
> apply to antlr3 and C++.
>
> Howard W. Nasgaard
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
