On Sun, May 9, 2010 at 1:43 PM, Husam Senussi <[email protected]> wrote:
> I'm trying to create RFC2822 parser but I'm having problem generating the code
> for some reason ragel keep running until I press CTRL C "was running for 20
> minutes", below the grammar I'm trying to use.

Whenever I've encountered very long Ragel processing times with my own
grammars, the reason has usually been nondeterminism. For example, with a
grammar like this:

    word = space* alpha+ space*;
    number = space* digit+ space*;
    main = ( space+ | word | number )*;

there is an ambiguity: if the first input character is a space, it might be
the start of the "space+" option in main, but it might also be the start of
the "word" option or the "number" option, since those can start with a space
too.

Internally, Ragel has to build a state graph that models those
nondeterministic states. The more ambiguity there is in the grammar, the
bigger this graph becomes, and the longer it takes for Ragel to run. With my
own grammars, I've found that the run time of Ragel and the subsequent C
compilation is a good estimator of how nondeterministic my grammar is.

I've found that it helps, when using the "|" operator, to make the different
options start with distinct prefixes. In the case of my example grammar
above, a less ambiguous alternative would be:

    word = alpha+;
    number = digit+;
    main := ( space+ | word | number )+;

For more complicated languages, another good pattern I've learned from other
people's grammars is to put optional space at the end of each rule and never
at the start.

-Brian
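
The pattern mentioned above, putting optional space at the end of each rule and never at the start, could look roughly like the fragment below. This is only an illustrative sketch and not part of the original message: the machine name `tokens` and the helper definition `ws` are made up for the example, and a real scanner would still need host-language code and write statements around it.

```ragel
%%{
    machine tokens;

    # Hypothetical helper: optional whitespace, used only as a suffix.
    ws = space*;

    # Each token starts with a distinct character class and owns the
    # whitespace that follows it, so no alternative can begin with a space.
    word   = alpha+ ws;
    number = digit+ ws;

    # Leading whitespace is consumed once, before the first token.
    main := ws ( word | number )*;
}%%
```

Because every alternative inside the parentheses begins with a different character class, a space seen after a token can only belong to that token's trailing `ws`.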
