Feature Requests item #1540845, was opened at 2006-08-15 20:47
Message generated for change (Comment added) made by helly
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=616203&aid=1540845&group_id=96864
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
>Status: Closed
Priority: 5
Submitted By: Justin Mason (jmason)
>Assigned to: Marcus Börger (helly)
Summary: RFE: way for scanner to report subsumed tokens

Initial Comment:
hi --

Looking at re2c for SpamAssassin -- it's improved a lot since the last time I checked ;) nice work!

One thing, though: it would be really great if re2c could track subsumed tokens. For example:

    /*!re2c
    "foo" {return "FOO";}
    "food" {return "FOOD";}
    [\000-\377] { return NULL; }
    */

Assume the input string is "food", and that the caller is smart enough to track the YYCURSOR state and to call the scanner repeatedly until it receives NULL.

This should return "FOO" on the first call, then "FOOD" on the second call, then NULL on the third call. Instead, the longest matching token is used: it returns "FOOD" on the first call, then NULL on the second call.

Most re2c users could write their token tables to automatically return *both* "FOO" and "FOOD" on the first call -- and initially I was doing this. However, in my usage the tokens are derived from SpamAssassin rules, so I can't always know whether one is subsumed by another... and determining this programmatically in advance would require rewriting most of re2c ;)

Instead, I've been changing my calling code to not support full regexp semantics in the input to re2c. This obviously defeats much of the point, so I'd love to fix that...

Are there any plans to implement this?

cheers,

--j.

----------------------------------------------------------------------

>Comment By: Marcus Börger (helly)
Date: 2006-08-15 22:53

Message:
Logged In: YES
user_id=271023

re2c is designed in a way that requires the most complex rule first. In this case that means the "FOOD" rule needs to be in front of the "FOO" rule. Then when re2c reads "FOO" you get the token and the story ends.
However, you can write some handling code around the code generated by re2c to do what you want. That is why re2c was built with a focus on extreme flexibility.

----------------------------------------------------------------------

_______________________________________________
Re2c-general mailing list
Re2c-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/re2c-general
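
For illustration, the behaviour requested in the initial comment (every matching token reported in turn, including subsumed ones) can be sketched as caller-side handling code. This is a minimal sketch in plain C, not re2c output, and it assumes literal patterns only rather than full regexp semantics. The names `struct rule`, `struct scanner`, `RULES`, and `next_token` are hypothetical and appear in neither re2c nor SpamAssassin.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token table: a literal pattern and the name to report. */
struct rule {
    const char *pattern;
    const char *name;
};

static const struct rule RULES[] = {
    { "foo",  "FOO"  },
    { "food", "FOOD" },
};
#define NRULES (sizeof RULES / sizeof RULES[0])

/* State the caller keeps between calls, playing the role of YYCURSOR. */
struct scanner {
    const char *cursor;    /* current start position in the input */
    size_t      next_rule; /* next rule to try at this position */
};

/*
 * Returns the name of the next rule whose pattern matches at the current
 * position, or NULL once the input is exhausted. Every matching rule is
 * reported before the cursor advances, so a token subsumed by a longer
 * one ("foo" inside "food") is still returned.
 */
const char *next_token(struct scanner *s)
{
    assert(s != NULL && s->cursor != NULL);

    while (*s->cursor != '\0') {
        while (s->next_rule < NRULES) {
            const struct rule *r = &RULES[s->next_rule++];
            /* strncmp stops at the input's terminating NUL, so a pattern
             * longer than the remaining input simply fails to match. */
            if (strncmp(s->cursor, r->pattern, strlen(r->pattern)) == 0)
                return r->name;
        }
        /* All rules tried here: move one byte forward and start over. */
        s->cursor++;
        s->next_rule = 0;
    }
    return NULL;
}
```

With a scanner initialised as `struct scanner s = { "food", 0 };`, successive calls to `next_token(&s)` return "FOO", then "FOOD", then NULL. The cost is that every rule is retried at every input position, which gives up the single-pass matching that makes a generated scanner fast.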