Thanks for the quick response. There was a bug in my printf statements that caused the pointer addresses to print incorrectly. I was fairly certain they worked as you described, but I wanted to be sure.

On 11/19/10, Jim Idle <[email protected]> wrote:
> The very first token gives you a -1 for the char position in line I am
> afraid, I need to work around that I think, but the indexes are pointers
> into memory (your input) and not 0, 1, 2 etc. Note that the token also
> remembers the start of the line that it is located on.
>
> If the start of the first token is not the start of your data, then perhaps
> there are comments and newline tokens that are skipped before the first
> token that the parser sees? If this did not work, there would be a lot of
> broken parsers out there.
>
> So, use the pointer to get the start, subtract it from the end pointer to
> get the length and print out that many characters, which will show you what
> the token matched. The line start is updated when a '\n' is seen by the
> parser, but you can change the character. This is useful for error messages
> when you want to print the text line that an error occurs in.
>
> The offset of the token is the start point minus the input start (use the
> address you pass in (databuffer) and not input->data), however, the pointer
> is pointing directly at that anyway. I think that you are forgetting that
> the token stream does not return off channel tokens or SKIP()ed tokens.
>
> Jim
>
>> -----Original Message-----
>> From: [email protected] [mailto:antlr-interest-[email protected]] On Behalf Of A Z
>> Sent: Friday, November 19, 2010 4:44 AM
>> To: [email protected]
>> Subject: [antlr-interest] C target character position
>>
>> Hello,
>>
>> I'm trying to record the offset of the start of a token, relative to
>> the beginning of the input buffer. My program passes a (char *) buffer
>> to ANTLR and then runs a simple grammar that builds a data structure
>> containing the element types and pointer to their position in the text
>> buffer. The problem is I can't find a way to get the true character
>> offset from ANTLR in order to set the pointer. Below it prints out the
>> results of most of the values for the ANTLR3_COMMON_TOKEN for the very
>> first token. The two subsequent values are the data member and the
>> address of the character buffer. I would expect start, getStartIndex
>> and input->data to be the same but they are different. How can I find
>> the offset of a token, in terms of the number of characters from the
>> start of the stream?
>>
>> Thanks
>>
>> charPosition : -1
>> getCharPositionInLine : -1
>> getLine : 1
>> getStartIndex : 23213648
>> getStopIndex : 23213653
>> getTokenIndex : 0
>> index : 0
>> line : 1
>> lineStart : 23213648
>> start : 23213648
>> stop : 23213653
>>
>> (pANTLR3_INPUT_STREAM)input->data 23217928
>> (uint8_t*)dataBuffer 23213624
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
