Hieu Phung wrote:
> Hi all,
> 
> My grammar has 3 kinds of tokens:
> 1) number: contains numeric characters
> 2) alpha: contains alphabetic characters
> 3) mix: contains numbers, letters, hyphens, full stops or spaces
> 
> For example:
> 1/VEC305/03MAR/PTY
> => in the above input data, 03MAR should be interpreted as a number of
> length 2 followed by alpha of length 3. But VEC305 is a mix of length 6.
> 
> If I define the grammar like below:
> 
> NUMBER    : ('0'..'9')+ ;
> ALPHA    : ('a'..'z'|'A'..'Z')+;
> MIX    : (NUMBER | ALPHA | OTHER)+;
> fragment OTHER    : (' ' | '-' | '.')+;
> SLANT    :    '/';
> 
> ANTLR returns VEC305 and 03MAR as two MIX tokens. Is there any way to
> define the tokens so that ANTLR returns number, slant, mix, slant,
> number, alpha, slant, alpha for the input "1/VEC305/03MAR/PTY"?

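Since you don't want "03MAR" to be interpreted as a MIX, presumably you
mean that a MIX cannot start with a digit. In that case, make MIX begin
either with letters followed by a digit or symbol, or with a symbol, so
that no string matches both MIX and NUMBER or ALPHA: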

  fragment DIGIT  : '0'..'9' ;
  fragment LETTER : 'a'..'z' | 'A'..'Z' ;
  fragment SYMBOL : ' ' | '-' | '.' ;

  NUMBER : DIGIT+ ;
  ALPHA  : LETTER+ ;
  MIX    : LETTER+ (DIGIT | SYMBOL) (DIGIT | LETTER | SYMBOL)*
         | SYMBOL (DIGIT | LETTER | SYMBOL)*
         ;
  SLANT  : '/';
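
As a quick sanity check of the rules above, here is a minimal Python sketch
that emulates them with regular expressions. It is not ANTLR output: the
regex translations, the tokenize helper, and the "longest match wins, ties
go to the earlier rule" policy are all my assumptions, and a real ANTLR
lexer chooses between rules by lookahead prediction, so its behaviour may
differ on other inputs.

```python
import re

# Regex equivalents of the fragment rules (an assumed translation, not ANTLR output).
DIGIT = r"[0-9]"
LETTER = r"[a-zA-Z]"
SYMBOL = r"[ .-]"
ANY = r"[0-9a-zA-Z .-]"  # DIGIT | LETTER | SYMBOL

# Token rules in the same order as the grammar.
TOKEN_RULES = [
    ("NUMBER", re.compile(rf"{DIGIT}+")),
    ("ALPHA", re.compile(rf"{LETTER}+")),
    ("MIX", re.compile(rf"{LETTER}+(?:{DIGIT}|{SYMBOL}){ANY}*|{SYMBOL}{ANY}*")),
    ("SLANT", re.compile(r"/")),
]


def tokenize(text):
    """Split text into (rule name, lexeme) pairs.

    Hypothetical helper: at each position every rule is tried and the
    longest match is kept; on a tie the earlier rule wins.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        best = None  # (rule name, end offset) of the longest match so far
        for name, pattern in TOKEN_RULES:
            match = pattern.match(text, pos)
            if match and (best is None or match.end() > best[1]):
                best = (name, match.end())
        if best is None:
            raise ValueError(f"no token rule matches at offset {pos}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens


if __name__ == "__main__":
    for name, lexeme in tokenize("1/VEC305/03MAR/PTY"):
        print(name, repr(lexeme))
```

Under those assumptions, "1/VEC305/03MAR/PTY" comes out as NUMBER, SLANT,
MIX, SLANT, NUMBER, ALPHA, SLANT, ALPHA, which is the sequence you asked for:
"03MAR" cannot be a MIX because it starts with a digit, and "PTY" cannot be
one because no digit or symbol follows the letters.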

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com

