You could look at the JavaFX lexer. JavaFX allows expressions in strings in a similar manner, but I did not need to use so many predicates there. It would probably help you. Find the JavaFX project on Kenai and you can download the source code. Just search for *.g and you will find the lexer.

Jim

> -----Original Message-----
> From: Marcus Klimstra [mailto:[email protected]]
> Sent: Thursday, May 27, 2010 7:58 AM
> To: Jim Idle
> Subject: Re: [antlr-interest] Solution for specialStateTransition exceeding 65k
>
> Hi Jim,
>
> Basically the language has string literals which can contain
> 'placeholders'; expressions surrounded by angle brackets:
>
> stringLiteral
>     : SQUOTE! stringPart* SQUOTE!
>     ;
>
> stringPart
>     : STRCONT
>     | LT! expr XGT!
>     ;
>
> expr can also be a string, so 'foo <bar('baz')> quux' would be a valid
> expression. The only exception is that '> is not allowed within
> placeholders.
>
> The lexer handles this with a stack of 'modes'. All operators and
> keywords have a predicate that the current mode must be 'normal' (i.e.
> outside a string or in a placeholder). When inside a placeholder the
> '>' character yields an XGT token instead of the normal GT, to prevent
> it from being gobbled up by a relational expression.
>
> PLUS    : {inNormal}?=> '+' ;
> MINUS   : {inNormal}?=> '-' ;
> MUL     : {inNormal}?=> '*' ;
> DIV     : {inNormal}?=> '/' ;
> MOD     : {inNormal}?=> '%' ;
> //etc
> NOT     : {inNormal}?=> 'not' ;
> OR      : {inNormal}?=> 'or' ;
> AND     : {inNormal}?=> 'and' ;
> TRUE    : {inNormal}?=> 'true' ;
> FALSE   : {inNormal}?=> 'false' ;
> //etc
>
> SQUOTE
>     : {inNormal}?=> '\'' { pushMode(MODE_STRING); }
>     | {inString}?=> '\'' { popMode(); }
>     ;
>
> XGT : {inPlaceholder}?=> '>' { popMode(); }
>     ;
>
> GT  : {inNormal}?=> '>'
>     ;
>
> LT  : '<' { if (inString) { pushMode(MODE_NORMAL); } }
>     ;
>
> STRCONT
>     : {inString}?=> ('a'..'z'|'A'..'Z'|'0'..'9'|' '|'_')+
>     ;
>
> As you can see, at the moment strings can only contain /[a..z][0..9] _/i,
> since using (~('\''|'<'))+ results in an OutOfMemoryError...
>
> inNormal, inString and inPlaceholder are booleans which are updated by
> pushMode and popMode:
>
> private void updateMode() {
>     Integer mode = stack.peekFirst();
>     inNormal = (stack.isEmpty() || mode == MODE_NORMAL);
>     inString = (mode == MODE_STRING);
>     inPlaceholder = (mode == MODE_NORMAL);
> }
>
> Although my current approach seems to work pretty well, I am of course
> open to suggestions. I can't really wait for ANTLR v4 however :)
>
> Thanks,
>
> - Marcus
>
>
> On Thu, May 27, 2010 at 3:50 PM, Jim Idle <[email protected]> wrote:
> >
> > There is quite often a way to rejig the lexer to avoid the huge
> > expansion; if you post your grammar, maybe we can help. I think that
> > such issues will go away in v4 :-)
> >
> > Jim
> >
> > > -----Original Message-----
> > > From: [email protected] [mailto:antlr-interest-[email protected]] On Behalf Of Marcus Klimstra
> > > Sent: Thursday, May 27, 2010 2:19 AM
> > > To: [email protected]
> > > Subject: [antlr-interest] Solution for specialStateTransition exceeding 65k
> > >
> > > Hi,
> > >
> > > I ran into the problem of the huge specialStateTransition bytecode
> > > size when using many gated semantic predicates in the lexer (in all
> > > my lexer rules actually). After a Google search I found that this
> > > is a known issue to which there are some workarounds, but no real
> > > solutions. At first I used the workaround to manually add local
> > > variables for the outer-class references, but at some point even
> > > that no longer worked.
> > > Therefore I changed the Java code generator to create separate
> > > methods for each switch-case. This works quite well for me, so I
> > > wanted to share it with the community. Note that I only tested this
> > > in the lexer, since my parser has no specialStateTransition-method
> > > at the moment. I also added annotations to suppress the useless
> > > warnings in the generated code. A diff-file with these changes is
> > > attached.
> > >
> > > - Marcus
> > >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

--
You received this message because you are subscribed to the Google Groups "il-antlr-interest" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/il-antlr-interest?hl=en.
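
For readers following the thread, the mode stack described in the quoted message can be modelled outside ANTLR in a few lines of plain Java. This is only a rough sketch under stated assumptions, not the poster's lexer: the class name ModeStack, the scan helper, and the values of MODE_NORMAL and MODE_STRING are hypothetical, and updateMode is written with explicit null checks rather than copied verbatim. It only tracks what the quote and angle-bracket characters do to the stack and ignores every other character.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-alone model of the lexer's mode stack; not ANTLR-generated code. */
final class ModeStack {
    // Assumed values: the original constants are not shown in the thread.
    static final int MODE_NORMAL = 0;
    static final int MODE_STRING = 1;

    private final Deque<Integer> stack = new ArrayDeque<>();
    boolean inNormal, inString, inPlaceholder;

    ModeStack() { updateMode(); }

    void pushMode(int mode) { stack.push(mode); updateMode(); }

    // pollFirst tolerates an empty stack instead of throwing.
    void popMode() { stack.pollFirst(); updateMode(); }

    private void updateMode() {
        Integer mode = stack.peekFirst();  // null when the stack is empty
        inNormal = (mode == null || mode == MODE_NORMAL);
        inString = (mode != null && mode == MODE_STRING);
        // An explicit MODE_NORMAL entry means we are inside a placeholder.
        inPlaceholder = (mode != null && mode == MODE_NORMAL);
    }

    /** Names the tokens that the quote and angle-bracket characters would produce. */
    static List<String> scan(String input) {
        ModeStack m = new ModeStack();
        List<String> tokens = new ArrayList<>();
        for (char c : input.toCharArray()) {
            if (c == '\'') {                       // SQUOTE: enter or leave a string
                tokens.add("SQUOTE");
                if (m.inString) m.popMode(); else m.pushMode(MODE_STRING);
            } else if (c == '<') {                 // LT: opens a placeholder inside a string
                tokens.add("LT");
                if (m.inString) m.pushMode(MODE_NORMAL);
            } else if (c == '>') {
                if (m.inPlaceholder) {             // XGT wins over GT inside a placeholder
                    tokens.add("XGT");
                    m.popMode();
                } else if (m.inNormal) {
                    tokens.add("GT");
                }
                // Assumption: a '>' inside plain string content is simply skipped here.
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("'foo <bar('baz')> quux'"));
        System.out.println(scan("a > b"));
    }
}
```

Run on the example from the thread, 'foo <bar('baz')> quux', the sketch reports SQUOTE, LT, SQUOTE, SQUOTE, XGT, SQUOTE and ends with an empty stack, while a top-level "a > b" yields a plain GT. Note that inNormal and inPlaceholder are both true inside a placeholder, so the sketch has to test inPlaceholder first to get XGT rather than GT.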
