Sorry, that is antlr.markmail.org

Jim
> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-[email protected]] On Behalf Of Jim Idle
> Sent: Tuesday, September 07, 2010 12:38 PM
> To: [email protected]
> Subject: Re: [antlr-interest] getText() of C runtime.
>
> Please consult markmail.antlr.org, where I answer this question numerous
> times ;-), the documentation of the API, or the code. I am contemplating
> just getting rid of it and having C programmers just use the token to
> build the string in whatever way they want.
>
> The STRING stuff is meant as an aid and is not useful if you want to
> parse lots of things. Also, it is not a leak as it auto tracks the memory
> and releases it when you free the tree walker. It is basically the
> support for $text. It gets a new copy at each reference because I cannot
> know what you did with the last copy. So, you must store the pointer if
> you want to reuse it.
>
> However, if you want something more efficient, then you must use the
> token struct directly, which will give you pointers directly to the text
> in the input. The demo C parser in the downloadable examples shows some
> manipulation of this, but it is just a pointer to the start of the text
> and a pointer to the end of the text. Assuming that you know the encoding
> of your input, then you have everything you need. If you are not
> manipulating the text, then you can use it directly without copying it,
> as in the downloadable examples.
>
> Jim
>
> > -----Original Message-----
> > From: [email protected] [mailto:antlr-interest-[email protected]] On Behalf Of Kenneth Domino
> > Sent: Tuesday, September 07, 2010 12:23 PM
> > To: [email protected]
> > Subject: [antlr-interest] getText() of C runtime.
> >
> > Hi All,
> >
> > I'm using the C runtime of an Antlr-generated parser. I noticed a huge
> > memory leak in my code, but it turns out it's because I call function
> > getText() (def'ed in antlr3commontoken.c of the Antlr C runtime) quite a
> > bit, on tree nodes during my hand-coded tree walking interpreter.
> > Apparently, getText() creates a new copy of the string every time. E.g.:
> >
> >     pANTLR3_BASE_TREE node = ...;
> >     char * text = node->getText(node);
> >     char * text2 = node->getText(node); // text2 is another malloc'ed
> >                                         // buffer containing the same
> >                                         // string for node.
> >
> > However, if you read the source code, it obviously intends to do some
> > memoizing, because it takes into consideration "token->textState",
> > where the previous value computed is returned for ANTLR3_TEXT_STRING.
> > I can, of course, and probably will, create a string table wrapper for
> > getText(). But I'm wondering if anyone knows if there is some way of
> > hooking into this part of the API so that I don't have to.
> >
> > Ken
> >
> > The source for the runtime function is:
> >
> > static pANTLR3_STRING getText (pANTLR3_COMMON_TOKEN token)
> > {
> >     switch (token->textState)
> >     {
> >         case ANTLR3_TEXT_STRING:
> >
> >             // Someone already created a string for this token, so we just
> >             // use it.
> >             //
> >             return token->tokText.text;
> >             break;
> >
> >         case ANTLR3_TEXT_CHARP:
> >
> >             // We had a straight text pointer installed, now we
> >             // must convert it to a string. Note we have to do this here
> >             // or otherwise setText8() will just install the same char*
> >             //
> >             if (token->strFactory != NULL)
> >             {
> >                 token->tokText.text =
> >                     token->strFactory->newStr8(token->strFactory,
> >                         (pANTLR3_UINT8)token->tokText.chars);
> >                 token->textState = ANTLR3_TEXT_STRING;
> >                 return token->tokText.text;
> >             }
> >             else
> >             {
> >                 // We cannot do anything here
> >                 //
> >                 return NULL;
> >             }
> >             break;
> >
> >         default:
> >
> >             // EOF is a special case
> >             //
> >             if (token->type == ANTLR3_TOKEN_EOF)
> >             {
> >                 token->tokText.text =
> >                     token->strFactory->newStr8(token->strFactory,
> >                         (pANTLR3_UINT8)"<EOF>");
> >                 token->textState = ANTLR3_TEXT_STRING;
> >                 return token->tokText.text;
> >             }
> >
> >             // We had nothing installed in the token, create a new string
> >             // from the input stream
> >             //
> >
> >             if (token->input != NULL)
> >             {
> >                 ////////////////////// The following code does a malloc/string copy every time I call getText. //////////
> >                 return token->input->substr( token->input,
> >                                              token->getStartIndex(token),
> >                                              token->getStopIndex(token)
> >                                            );
> >             }
> >
> >             // Nothing to return, there is no input stream
> >             //
> >             return NULL;
> >             break;
> >     }
> > }
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
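
The two suggestions in the thread (work from the start and end pointers of the token text instead of calling getText() repeatedly, and store the pointer if you want to reuse a copy) can be sketched roughly as below. This is a standalone illustration, not code from the ANTLR C runtime: TokenSpan, NodeText, span_len, span_equals, span_dup and node_text are all hypothetical names, and the sketch assumes 8-bit input in which the end pointer addresses the last character of the token (inclusive). Check the token struct in your runtime version before relying on either assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two text pointers a token carries. */
typedef struct {
    const char *start;  /* first character of the token text */
    const char *stop;   /* last character of the token text (assumed inclusive) */
} TokenSpan;

/* Length of the token text in bytes; no allocation. */
size_t span_len(const TokenSpan *t)
{
    assert(t->stop >= t->start);
    return (size_t)(t->stop - t->start) + 1;
}

/* Compare the token text with a NUL-terminated string without copying. */
int span_equals(const TokenSpan *t, const char *s)
{
    size_t n = span_len(t);
    return strlen(s) == n && memcmp(t->start, s, n) == 0;
}

/* Make a NUL-terminated heap copy; the caller must free() it. */
char *span_dup(const TokenSpan *t)
{
    size_t n = span_len(t);
    char *copy = malloc(n + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, t->start, n);
    copy[n] = '\0';
    return copy;
}

/* Hypothetical per-node wrapper that copies at most once. */
typedef struct {
    TokenSpan span;
    char *cached;       /* NULL until the text is first requested */
} NodeText;

/* Return the cached copy, creating it on first use. */
const char *node_text(NodeText *n)
{
    if (n->cached == NULL)
        n->cached = span_dup(&n->span);
    return n->cached;
}
```

In a tree walker, comparisons such as span_equals(&tok, "foo") stay on the hot path with no allocation, and node_text() is called only where a NUL-terminated string is actually needed. The owner of each NodeText is responsible for freeing cached when the tree is torn down.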
