Hi David,

Thank you for your suggestion. However, a MIX can start with a number, '_' or '.' :((
Actually I am trying to write a CIMP message format in Antlr.
Reference: http://www.parse2.com/example-cargoimp-FFA4.shtml

    Alpha   = %x41-5A;
    Numeric = %x30-39;
    Decimal = %x30-39 / ".";
    Mixed   = Alpha / Numeric;
    Text    = %x41-5A / %x30-39 / "." / "-" / " ";   <--- this is my MIX token

This format can be written in ABNF easily... but in Antlr, once I introduce the MIX token, anything that mixes numeric and alpha characters is returned as a single MIX. Currently I have to use Java code in actions to split the MIX string. I wonder if there's a better way to define tokens, because my grammar is now full of Java code :(!

For example:

    manifestHeader
        : ( (n=NUMBER) SLANT (r1=field) SLANT (r2=field) SLANT (r3=ALPHA) (SLANT (r4=field)?)? )
          {
            ffm.setAttribute("MessageSequenceNumber", $n.text);
            ffm.setAttribute("CarrierCode", $r1.value.substring(0,2));
            ffm.setAttribute("FlightNumber", $r1.value.substring(2));
            ffm.setAttribute("Day", $r2.value.substring(0,2));
            ffm.setAttribute("Month", $r2.value.substring(2));
            ffm.setAttribute("AirportCode", $r3.text);
            if ($r4.value != null)
                ffm.setAttribute("AircraftIdentification", $r4.text);
          }
        ;

Regards,
Helen

> Message: 1
> Date: Thu, 22 Oct 2009 03:20:47 +0100
> From: David-Sarah Hopwood <[email protected]>
> Subject: Re: [antlr-interest] [Antlr3 grammar] how to specify alpha
>         token, numeric token and mix of both
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=UTF-8
>
> Hieu Phung wrote:
> > Hi all,
> >
> > My grammar has 3 kinds of tokens:
> > 1) number: contain numeric character
> > 2) alpha: contain alphabetic character;
> > 3) mix: contain number and alpha and hyphen, full stop or space
> >
> > For example:
> > 1/VEC305/03MAR/PTY
> > => in the above input data, 03MAR should be interpreted as a number of
> > length 2 followed by alpha of length 3. But VEC305 is a mix of length 6.
> >
> > If I define grammar like below:
> >
> > NUMBER : ('0'..'9')+ ;
> > ALPHA : ('a'..'z'|'A'..'Z')+;
> > MIX : (NUMBER | ALPHA | OTHER)+;
> > fragment OTHER : (' ' | '-' | '.')+;
> > SLANT : '/';
> >
> > Antlr will return me VEC305 and 03MAR as two MIX tokens. Is there any way to
> > define tokens such that Antlr will return me number, slant, mix, slant,
> > number, alpha, slant, alpha for the input "1/VEC305/03MAR/PTY" ?
>
> Since you don't want "03MAR" to be interpreted as a MIX, presumably you
> mean that a MIX cannot start with a NUMBER. In that case, try:
>
> fragment DIGIT : '0'..'9' ;
> fragment LETTER : 'a'..'z' | 'A'..'Z' ;
> fragment SYMBOL : ' ' | '-' | '.' ;
>
> NUMBER : DIGIT+ ;
> ALPHA : LETTER+ ;
> MIX : LETTER+ (DIGIT | SYMBOL) (DIGIT | LETTER | SYMBOL)*
>     | SYMBOL (DIGIT | LETTER | SYMBOL)*
>     ;
> SLANT : '/';
>
> --
> David-Sarah Hopwood ? http://davidsarah.livejournal.com
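
A note on the "grammar full of Java code" concern in the message above: one possible way to keep rule actions short is to move the splitting logic into a plain helper class. The sketch below is only an illustration under assumptions. The class name ManifestHeaderFields and its parse method are hypothetical, and the two-character splits simply mirror the substring(0,2) calls in the manifestHeader action; they are not taken from the CIMP specification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the original grammar): splits a
// slant-separated manifest header into named attributes outside the parser.
public class ManifestHeaderFields {

    /** Splits a header such as "1/VEC305/03MAR/PTY" into named attributes. */
    public static Map<String, String> parse(String header) {
        String[] parts = header.split("/");
        if (parts.length < 4) {
            throw new IllegalArgumentException(
                "expected at least 4 slant-separated fields: " + header);
        }
        if (parts[1].length() < 2 || parts[2].length() < 2) {
            throw new IllegalArgumentException(
                "flight and date fields need at least 2 characters: " + header);
        }

        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("MessageSequenceNumber", parts[0]);
        // Assumption: the first two characters are the carrier code,
        // as in the substring(0,2) call of the original action.
        attrs.put("CarrierCode", parts[1].substring(0, 2));
        attrs.put("FlightNumber", parts[1].substring(2));
        // Assumption: the first two characters are the day of month.
        attrs.put("Day", parts[2].substring(0, 2));
        attrs.put("Month", parts[2].substring(2));
        attrs.put("AirportCode", parts[3]);
        // The fifth field (aircraft identification) is optional.
        if (parts.length > 4 && !parts[4].isEmpty()) {
            attrs.put("AircraftIdentification", parts[4]);
        }
        return attrs;
    }

    public static void main(String[] args) {
        // Print each attribute for the sample header from this thread.
        for (Map.Entry<String, String> e : parse("1/VEC305/03MAR/PTY").entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

With a helper along these lines, a rule action could pass the matched text to one method call instead of repeating substring calls inline. This does not change how the lexer splits MIX from NUMBER and ALPHA; it only moves the Java out of the grammar.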
List: http://www.antlr.org/mailman/listinfo/antlr-interest Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
