[il-antlr-interest: 34095] [antlr-interest] Parsing CSS accurately and fast

Vivek Jhaveri Sun, 18 Sep 2011 19:04:55 -0700

For the last few months we've been trying to build a high-performance yet
accurate CSS parser using ANTLR.


 

To date, our efforts have yielded accuracy, but not performance.

 

The main difficulty with CSS is its error-handling rules, the so-called CSS
parsing conventions <http://www.w3.org/TR/CSS21/syndata.html#parsing-errors>,
which specify how parse errors must be handled.

There is a core syntax
<http://www.w3.org/TR/CSS21/syndata.html#tokenization> that all versions of
CSS use. Conceptually, to parse, say, CSS 2.1, we first parse the file
according to the core syntax, and then flesh out the parse tree with the
CSS 2.1 grammar. Parsing against the core syntax first ensures that invalid
tokens are skipped in the way the specification requires.
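
To make the core-syntax idea concrete, here is a minimal sketch of one of the
parsing conventions: a declaration ends at the next ';' (or the block's
closing '}') that is not nested inside (), [], {} or a quoted string. This is
only an illustration under simplifying assumptions (no comments, no handling
of strings broken by a newline). The class and method names are hypothetical
and are not taken from our grammar or from ANTLR.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: splits the body of a declaration block into
// declarations while respecting matching pairs, as the core syntax requires.
class CoreDeclarationSplitter {

    // Returns the index just past the string that opens at quoteIndex.
    static int skipString(String css, int quoteIndex) {
        char quote = css.charAt(quoteIndex);
        int i = quoteIndex + 1;
        while (i < css.length()) {
            char c = css.charAt(i);
            if (c == '\\') { i += 2; continue; }   // skip the escaped character
            if (c == quote) return i + 1;
            i++;
        }
        return css.length();                       // unterminated: consume the rest
    }

    // Returns the index of the ';' or '}' that ends the declaration starting
    // at 'start', or css.length() if there is no such terminator.
    static int endOfDeclaration(String css, int start) {
        Deque<Character> open = new ArrayDeque<>(); // expected closing brackets
        int i = start;
        while (i < css.length()) {
            char c = css.charAt(i);
            if (c == '"' || c == '\'') { i = skipString(css, i); continue; }
            if (c == '(') open.push(')');
            else if (c == '[') open.push(']');
            else if (c == '{') open.push('}');
            else if (!open.isEmpty() && c == open.peek()) open.pop();
            else if (open.isEmpty() && (c == ';' || c == '}')) return i;
            i++;
        }
        return css.length();
    }

    // Splits a block body (text between '{' and '}') into trimmed declarations.
    static List<String> splitDeclarations(String body) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < body.length()) {
            int end = endOfDeclaration(body, pos);
            String decl = body.substring(pos, end).trim();
            if (!decl.isEmpty()) out.add(decl);
            pos = end + 1;                          // step over the terminator
        }
        return out;
    }

    public static void main(String[] args) {
        String body = "color: red; background: url(a;b.png); content: \"x;y\"; width: 1px";
        for (String decl : splitDeclarations(body)) {
            System.out.println(decl);
        }
    }
}
```

Each piece found this way would then be handed to the CSS 2.1 grammar, and a
piece that fails to parse can be dropped without affecting its neighbours.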

 

We implemented it this way; see this Stack Overflow question:
http://stackoverflow.com/questions/5437835/parsing-css-2-1-with-the-correct-css-parsing-conventions-in-antlr

 

However, this double parsing creates a new instance of the CSS2.1 parser for
each successfully parsed piece of the core grammar. This results in
extremely slow parse times.

 

We also tried rewriting the input stream and adding custom terminators
around each piece parsed by the CSS core grammar, and feeding the result in
its entirety to the CSS2.1 parser (augmented with rules for the custom
terminators), but this turned out to be even slower.

 

Is there a way to do better than this in ANTLR?

 

At this point, we're considering writing a hand-coded recursive-descent
parser, but hopefully there is a better way with ANTLR :)

 

Regards,

 

Vivek

 

 


