[il-antlr-interest: 31627] Re: [antlr-interest] Problem in grammar (#a, #b, #c, #d, #f are not well recognized)

Jim Idle Fri, 25 Feb 2011 12:59:02 -0800

Why don't you try the grammar I contributed to the ANTLR download page and see
if that works? The grammar you have here is not going to work, because you have
hard-coded the tokens as literals in the parser rules, and ANTLR is generating a
lexer that will fall over on keywords etc. CSS is a lot more difficult to parse
than you think.
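
To illustrate the likely failure mode, here is a rough sketch in Python. It is
only an approximation: it uses a plain longest-match tokenizer, which is not how
the ANTLR-generated lexer actually makes its predictions, and the names
HASH_LIT, HASH, ORIGINAL_RULES, SINGLE_HASH_RULES and tokenize are made up for
this example. It shows how a COLOR rule can swallow "#f" at the start of
"#formSearch", and how a single HASH token (which is how CSS 2.1 tokenization
treats '#' followed by a name) avoids the split by leaving the id-versus-color
decision to the parser.

```python
import re

# Approximation of the '#'-related lexer rules in the posted grammar.
# HASH_LIT stands for the implicit token created by the '#' literal (assumed name).
ORIGINAL_RULES = [
    ("COLOR", r"#[0-9a-fA-F]+"),
    ("HASH_LIT", r"#"),
    ("DOT", r"\."),
    ("IDENT", r"-?[_a-zA-Z][-_a-zA-Z0-9]*"),
    ("WS", r"\s+"),
]

# Hypothetical alternative: one HASH token for '#' plus name characters.
# The parser would then decide from context whether it is an id or a color.
SINGLE_HASH_RULES = [
    ("HASH", r"#[-_a-zA-Z0-9]+"),
    ("DOT", r"\."),
    ("IDENT", r"-?[_a-zA-Z][-_a-zA-Z0-9]*"),
    ("WS", r"\s+"),
]


def tokenize(text, rules):
    """Longest-match tokenizer; earlier rules win ties; whitespace is dropped."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, regex in compiled:
            m = regex.match(text, pos)
            if m and m.end() > best_end:
                best_name, best_end = name, m.end()
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}: {text[pos]!r}")
        if best_name != "WS":
            tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens


if __name__ == "__main__":
    # '#f' is taken as COLOR, leaving 'ormSearch' as a stray IDENT.
    print(tokenize("#formSearch.disabled", ORIGINAL_RULES))
    # 'g' is not a hex digit, so '#' and 'gormSearch' come out separately.
    print(tokenize("#gormSearch.disabled", ORIGINAL_RULES))
    # With a single HASH token the whole '#formSearch' stays together.
    print(tokenize("#formSearch.disabled", SINGLE_HASH_RULES))
```

With the original rules the parser would be handed COLOR followed by IDENT where
it expects '#' IDENT, which matches the symptom that only ids starting with a
hex letter fail.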


Microsoft were going to hire me to write some parsers for a project but
they cancelled the project before I was even hired. I had jumped the gun
and written the CSS parser to get a head start on the project, so I just
contributed the grammar. I did not do a lot of testing, but it is pretty
accurate I think.

http://www.antlr.org/grammar/1240941192304/css21.g

Jim

> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-
> [email protected]] On Behalf Of Aurélien Baudet
> Sent: Friday, February 25, 2011 11:43 AM
> To: [email protected]
> Subject: [antlr-interest] Problem in grammar (#a, #b, #c, #d, #f are
> not well recognized)
>
>    Hello,
>
> I'm currently writing an Xtext plugin for CSS. I have a problem and I
> can't find a solution. My grammar works quite well for many CSS files
> but it fails on this:
>
> .dj_iPad #header #formSearch.disabled {
>       opacity: 1;
> }
>
>
>
> However, it works for (g instead of f):
>
> .dj_iPad #header #gormSearch.disabled {
>       opacity: 1;
> }
>
>
> It works for any first character other than a, b, c, d, e, f.
>
> So I think the parser recognizes that as a hex character.
>
> Can somebody help me fix this bug?
>
> The grammar:
>
> grammar css;
>
> options {
>       output=AST;
>       ASTLabelType=CommonTree;
>       language=Java;
>       //k=4;
> }
>
> tokens {
>       IMPORT;
>       NESTED;
>       NEST;
>       RULE;
>       ATTRIB;
>       PARENTOF;
>       PRECEDEDS;
>       ATTRIBEQUAL;
>       HASVALUE;
>       BEGINSWITH;
>       PSEUDO;
>       PROPERTY;
>       FUNCTION;
>       TAG;
>       ID;
>       CLASS;
>       PERCENTAGE;
>       UNIT;
>       EMS;
>       EXS;
>       LENGTH;
>       ANGLE;
>       TIME;
>       FREQ;
> }
>
>
> stylesheet:
>       charset?
>       importRule*
>       namespace*
>       (ruleset | media | page | font_face | keyframes)+;
>
> charset:
>       '@charset' STRING ';';
>
> namespace:
>       '@namespace' IDENT? (STRING|url) ';';
>
> importRule:
>       '@import' (STRING|url) (medias)? ';';
>
> medias:
>       IDENT (',' IDENT)*;
>
> keyframes:
>       '@keyframes' IDENT '{' keyframes_blocks* '}';
>
> keyframes_blocks:
>       keyframes_selectors block;
>
> keyframes_selectors:
>       ('from' | 'to' | PERCENTAGE) (',' ('from' | 'to' | PERCENTAGE))*;
>
> media:
>       '@media' medias '{' ruleset* '}';
>
> page:
>       '@page' IDENT? (':' IDENT)? block;
>
> font_face:
>       '@font-face' block;
>
> ruleset:
>       selectors block;
>
> selectors:
>       selector (',' selector)*;
>
> selector:
>       simple_selector (selectop? simple_selector)*;
>
> simple_selector:
>       (elem | '*') (attrib | pseudo)?;
>
> block:
>       '{' properties* ';'? '}';
>
> properties:
>       declaration (';' declaration)*;
>
> elem:
>       IDENT
>       | '#' IDENT
>       | '.' IDENT;
>
> pseudo:
>       (':' | '::') IDENT
>       | (':' | '::') function;
>
> attrib:
>       '[' IDENT (attribRelate (STRING | IDENT))? ']';
>
> declaration:
>       IDENT ':' args '!important'?;
>
> args:
>       expr (','? expr)*;
>
> expr:
>       ('-' | '+')? (NUM | PERCENTAGE | LENGTH | EMS | EXS | ANGLE | TIME | FREQ)
>       | IDENT
>       //| COLOR
>       | STRING
>       | URI
>       | function;
>
> function:
>       IDENT '(' args ')';
>
> // TODO: allow url(http://...)
> url:
>       'url(' STRING ')';
>
> attribRelate:
>       '='
>       | '~='
>       | '|=';
>
> selectop:
>       '>'
>       | '+';
>
>  URI:
>       'url(' STRING ')'
>       | 'url(' ('a'..'~')* ')';
>
>  PERCENTAGE:
>       NUM '%';
>
>  EMS:
>       NUM 'em';
>
>  EXS:
>       NUM 'ex';
>
>  LENGTH:
>       NUM ('px' | 'cm' | 'mm' | 'in' | 'pt' | 'pc');
>
>  ANGLE:
>       NUM ('deg' | 'rad' | 'grad');
>
>  TIME:
>       NUM ('ms' | 's');
>
>  FREQ:
>       NUM ('khz' | 'hz');
>
>  IDENT:
>       ('_' | 'a'..'z' | 'A'..'Z') ('_' | '-' | 'a'..'z' | 'A'..'Z' | '0'..'9')*
>       | '-' ('_' | 'a'..'z' | 'A'..'Z') ('_' | '-' | 'a'..'z' | 'A'..'Z' | '0'..'9')*;
>
>  NUM:
>       (('0'..'9')* '.')? ('0'..'9')+;
>
>  COLOR:
>       '#' ('0'..'9' | 'a'..'f' | 'A'..'F')+;
>
> STRING        :
>                       '"' ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|'\''|'\\') | ~('\\'|'"') )* '"' |
>                       '\'' ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|'\''|'\\') | ~('\\'|'\'') )* '\''
>               ;
>
> // Single-line comments
> SL_COMMENT
>       :       '//'
>               (~('\n'|'\r'))* ('\n'|'\r'('\n')?)
>               {$channel=HIDDEN;}
>       ;
>
> // multiple-line comments
> COMMENT
>       :       '/*' .* '*/' { $channel = HIDDEN; }
>       ;
>
> // Whitespace -- ignored
> WS    : ( ' ' | '\t' | '\r' | '\n' | '\f' )+ { $channel = HIDDEN; }
>       ;
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address


