On 09/08/2011 02:08 PM, [email protected] wrote:
> Hello- I'm working on a grammar that needs to support embedded blanks in
> strings: "identifier=two words"
> The interpreter keeps breaking at 'two' and doesn't know what to do with
> 'words'.
> I was initially ignoring white space (because 'id1 = oneword, id2 =" two
> words"' must also be supported with spaces around the = and ,), but
> obviously, can't do that.
> I have tried what was suggested in an archived post:
>
> STRING_LITERAL : (STRCHAR)+ ( ((' ')+ STRCHAR)=> (' ')+ (STRCHAR)+ )*
>
> But that didn't work either! (no viable alternative at input 'words'). It's
> not including 'words' as part of the string.
>
> In my grammar:
> fragment LETTER :('a'..'z' | 'A'..'Z');
> fragment DIGIT : '0'..'9';
> fragment OTHERCHARS : ('.' | '/' | '-' | '&');
> STRCHAR : (LETTER | DIGIT | OTHERCHARS)+;

Why can't ' ' be a part of either OTHERCHARS or STRCHAR? Then you don't
need the syntactic predicate in your STRING_LITERAL rule. I don't see
your rule for handling the " characters. If you are worried about strings
containing NLs or TABs (which would be errors), then you might want your
STRING_LITERAL rule to check for them in a semantic predicate and
explicitly disallow them, instead of trying to allow blanks.

> I have tried various combinations of handling the blank in the lexing v. the
> parsing, including trying to create a quoted-string rule.
> I will have to support the following:
>
> "identifier =two words"
> identifier ="two words"

Ouch, you're not going to try to parse:

"identifier =two words"

as if it were

identifier ="two parts"

are you?

> The identifier=value pairs appear in a comma-separated line. There are
> various nested structures of identifier=value pairs involved, which is why
> both of the above formats are supported.
>
> *** Bottom line*** I just want to indicate: If a space appears between
> quotation marks, include it as part of the current token; if not, throw it
> away.
>
> I have everything working in a complex structure and tree walker except for
> the embedded blanks allowed in strings! Any suggestions are appreciated.
> List: http://www.antlr.org/mailman/listinfo/antlr-interest

-- 
Kevin J. Cummings
Registered Linux User #1232 (http://counter.li.org)
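
For reference, the "bottom line" in the quoted message (keep blanks that fall between quotation marks, throw away all others) might be sketched in ANTLR 3 roughly as follows. This is only a sketch under stated assumptions, not the poster's actual grammar: the grammar name KeyValueLexer and the rule names WORD, EQUALS, COMMA and WS are made up for illustration, and the quote handling is a guess, since the original rule for the " characters was never shown. It also uses a negated character set to reject NLs and TABs inside a string, rather than the semantic predicate suggested above.

```antlr
lexer grammar KeyValueLexer;   // hypothetical name, ANTLR 3 syntax

EQUALS : '=' ;
COMMA  : ',' ;

// A quoted string owns everything up to the closing quote, blanks
// included. NL, CR and TAB are excluded, so a string containing one
// of them fails to match instead of being silently accepted.
STRING_LITERAL
    : '"' ( ~('"' | '\n' | '\r' | '\t') )* '"'
    ;

// Unquoted words never contain blanks (this plays the role of STRCHAR).
WORD : (LETTER | DIGIT | OTHERCHARS)+ ;

// Whitespace outside quotes is routed to the hidden channel, so the
// parser never sees it (Java target assumed for the action).
WS : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;

fragment LETTER     : 'a'..'z' | 'A'..'Z' ;
fragment DIGIT      : '0'..'9' ;
fragment OTHERCHARS : '.' | '/' | '-' | '&' ;
```

With rules like these, id1 = oneword, id2 =" two words" would lex as WORD EQUALS WORD COMMA WORD EQUALS STRING_LITERAL, with the blanks inside the quotes preserved. Note that a fully quoted "identifier =two words" comes out as a single STRING_LITERAL, so splitting it into an identifier and a value would need separate handling.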
