Support Requests item #2407218, was opened at 2008-12-08 19:52
Message generated for change (Comment added) made by helly
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=616201&aid=2407218&group_id=96864

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Closed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: YYFill bounds checking insufficient

Initial Comment:
Looking at the generated code from re2c's examples directory, re2c calls
YYFILL(n) with an n value of at least the size of the longest possible
match. e.g.

/*!re2c
"a" { return 0; }
"b"{6} { return 1; }
*/

will generate:

if ((YYLIMIT - YYCURSOR) < 6) YYFILL(6);

If we use this value to check for end of buffer, the generated scanner will
give up before matching "a" if it lies in the last 5 characters of the
buffer. If we do not check this value, the generated code will read past the
buffer limit if it matches 'b' in the last 5 characters, as there is no
further check against YYLIMIT until another 6 b's are consumed.

It seems to me that re2c cannot generate a correct solution without an EOF
sentinel?
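
One possible workaround, which is an assumption here and not something this
thread confirms, is to pad the input with as many bytes as the longest rule
needs, so that the up-front length check passes near the end of the buffer and
the scanner never reads past the memory it owns. The sketch below is
hand-written C, not re2c output. The names scan_padded, MAX_FILL and MAX_INPUT
and the return codes 2 and 3 are hypothetical, and it assumes the real input
contains no NUL bytes.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FILL  6   /* hypothetical: length of the longest rule, "b"{6} */
#define MAX_INPUT 64  /* hypothetical cap so the sketch needs no malloc */

/* Returns 0 for "a", 1 for "bbbbbb", 2 for no match, 3 for not enough
 * input, and -1 if the input is too long for this sketch. */
static int scan_padded(const char *input, size_t len)
{
    char buf[MAX_INPUT + MAX_FILL];
    const char *cursor, *end, *limit;
    int i;

    if (len > MAX_INPUT)
        return -1;
    memcpy(buf, input, len);
    memset(buf + len, 0, MAX_FILL);  /* NUL padding matches neither rule */

    cursor = buf;
    end = buf + len;                 /* end of the real input */
    limit = end + MAX_FILL;          /* limit includes the padding */

    if (cursor == end)
        return 3;                    /* no real input at all */
    if ((limit - cursor) < MAX_FILL)
        return 3;                    /* cannot trigger while real input remains */

    if (*cursor == 'a')
        return 0;
    if (*cursor == 'b') {
        for (i = 1; i < MAX_FILL; ++i) {
            if (cursor[i] != 'b')    /* may read padding, never beyond it */
                return (cursor + i >= end) ? 3 : 2;
        }
        return 1;
    }
    return 2;
}
```

With this layout a lone "a" is matched even when it is the last character of
the input, and a short run of b's stops at the padding instead of running off
the end of the buffer.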




----------------------------------------------------------------------

>Comment By: Marcus Börger (helly)
Date: 2008-12-08 23:54

Message:
Well, with the else rule you actually do not need a YYFILL. But in some way I
see where you are heading. That is, instead of the single YYFILL(6) you'd
want to see two YYFILL() calls. The first, with length one, would come before
any character is read, to determine whether the first char is a or b. In the
case of a you're done; otherwise you'd want a YYFILL(5), which would allow
checking the remaining 5 b's. Anything else would return an error. Maybe at
some point in the future we could add yet another mode to allow this, but at
the moment I see no easy solution, sorry.
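
The two-step check described above can be sketched by hand in C. This is not
code that re2c generates; the function name scan_two_stage is hypothetical,
and the return codes follow the ones used elsewhere in this thread (0 for "a",
1 for six b's, 3 for not enough input, 4 for anything else).

```c
#include <stddef.h>

/* Hand-written sketch of the two-step bounds check; not re2c output. */
static int scan_two_stage(const char *buf, size_t len)
{
    const char *cursor = buf;        /* plays the role of YYCURSOR */
    const char *limit = buf + len;   /* plays the role of YYLIMIT */
    int i;

    /* First check: a single character is enough to tell a from b. */
    if ((limit - cursor) < 1)
        return 3;
    if (*cursor == 'a')
        return 0;
    if (*cursor != 'b')
        return 4;
    ++cursor;

    /* Second check: only now ask for the remaining 5 characters. */
    if ((limit - cursor) < 5)
        return 3;
    for (i = 0; i < 5; ++i) {
        if (cursor[i] != 'b')
            return 4;
    }
    return 1;
}
```

Here scan_two_stage("a", 1) returns 0 even though only one character is
available, while a buffer holding fewer than six characters that starts with b
still reports 3.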

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2008-12-08 22:39

Message:
with the code:

#define YYFILL(n) return 3

/*!re2c
"a" { return 0; }
"b"{6} { return 1; }
[^] { return 4; }
*/

If we try to scan a buffer that holds only the single character "a", we get
the incorrect result 3. This is because YYFILL gets called when
YYLIMIT - YYCURSOR < 6, before any scanning even happens, so we'll never get
a correct result for any buffer that is smaller than the longest constant
string.
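
The behaviour described above can be reproduced with a hand-written
approximation of the scanner. The real re2c output differs in detail; the
function name scan_upfront is hypothetical, and the early return stands in for
"#define YYFILL(n) return 3".

```c
#include <stddef.h>

/* Hand-written approximation of the generated scanner; not re2c output. */
static int scan_upfront(const char *buf, size_t len)
{
    const char *cursor = buf;        /* plays the role of YYCURSOR */
    const char *limit = buf + len;   /* plays the role of YYLIMIT */
    int i;

    /* Single up-front check sized for the longest rule, "b"{6}. */
    if ((limit - cursor) < 6)
        return 3;                    /* YYFILL(6) expands to "return 3" */

    if (*cursor == 'a')
        return 0;                    /* rule "a" */
    if (*cursor == 'b') {
        for (i = 1; i < 6; ++i) {
            if (cursor[i] != 'b')
                return 4;            /* falls back to the [^] rule */
        }
        return 1;                    /* rule "b"{6} */
    }
    return 4;                        /* rule [^] */
}
```

Calling scan_upfront("a", 1) returns 3 rather than 0, because the length check
fires before the first character is examined.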






----------------------------------------------------------------------

Comment By: Marcus Börger (helly)
Date: 2008-12-08 21:50

Message:
In this example you need an else rule to distinguish the fourth case of:
1) a
2) bbbbbb
3) not enough input
4) anything else

the rule to use is:
[^] { return 4; }

Case 3 gets handled by:
#define YYFILL(n) return 3

----------------------------------------------------------------------

_______________________________________________
Re2c-general mailing list
Re2c-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/re2c-general
