Hi Marcus,

Understood. I've been trying to port the current lexer, written with flex,
over to re2c and have been using the PHP ini scanner work as the basis for
doing so.

I've read the manual and looked through the examples but haven't really
found anything solid on defining start conditions (the %x/%s conditions
from flex). It also seems that macros like BEGIN need to be defined by
hand. Is there a way to add these to the scanner automatically if -F is
specified? If not, what are your thoughts about doing so? I've read the
interface section and understand that re2c does not generate complete
scanners, but I'd say there should be some option to generate a complete,
or at least more complete, scanner.
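
To make the question concrete, here is a minimal hand-written C sketch of
what flex's BEGIN amounts to: a macro that assigns to a condition variable,
plus a dispatch on that variable. This is not re2c output and not re2c's
interface; the condition names (INITIAL, IN_PHP), the variable name `state`
and the function count_php_bytes are all made up for illustration.

```c
#include <string.h>

/* Hypothetical condition names, standing in for flex's %x declarations. */
enum cond { INITIAL, IN_PHP };

/* Flex-style BEGIN: switch the active start condition.
   Assumes a local variable named `state` in the calling function. */
#define BEGIN(c) (state = (c))

/* Count the bytes that sit between "<?php" and "?>" in a
   zero-terminated string. */
int count_php_bytes(const char *p)
{
    enum cond state = INITIAL;
    int n = 0;

    while (*p != '\0') {            /* never read past the terminator */
        switch (state) {
        case INITIAL:               /* outside a PHP block */
            if (strncmp(p, "<?php", 5) == 0) {
                BEGIN(IN_PHP);
                p += 5;
            } else {
                p++;
            }
            break;
        case IN_PHP:                /* inside a PHP block */
            if (strncmp(p, "?>", 2) == 0) {
                BEGIN(INITIAL);
                p += 2;
            } else {
                n++;
                p++;
            }
            break;
        }
    }
    return n;
}
```

With this sketch, count_php_bytes("a<?php xy ?>b<?php z?>") returns 6: four
bytes from the first block and two from the second.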

Regards,
John

On Wed, Apr 23, 2008 at 01:50:20PM +0200, Marcus Boerger wrote:
> Hello John,
> 
> Wednesday, April 23, 2008, 6:45:59 AM, you wrote:
> 
> > Hi Marcus,
> 
> > Thanks for the response. I've gone ahead and added:
> 
> > #define YYFILL(n) { if(YYCURSOR >= YYLIMIT) return 0; }
> 
> > and:
> 
> > [^] {
> >         return 0;
> > }
> 
> That should do it :-)
> 
> > For the missing ?> case. However, I didn't really understand what you
> > meant regarding the '\0' character. How exactly would I modify the ANYCHAR
> > rule to exclude this character? Just a side note, I modified my PHP rule
> > and it did start working.
> 
> If you deal with zero-terminated ('\0') strings, you need a special
> rule to avoid reading beyond the buffer.
> 
> > Also, is there some documentation with regards to states in re2c?
> 
> In your local checkout you have an up-to-date manual in htdocs and you also
> have an up-to-date man page. However, there is nothing further, such as
> lessons. I hope that there will be time to add lessons as soon as the state
> stuff is finalized, so that they are available when the next stable version,
> 0.14.0, is released.
> 
> marcus
> 
> > On Tue, Apr 22, 2008 at 06:14:39PM +0200, Marcus Boerger wrote:
> >> Hello John,
> >> 
> >> Friday, April 18, 2008, 8:46:38 AM, you wrote:
> >> 
> >> > I've run into a segfault porting some rules over from flex/bison which
> >> > used to work. Admittedly, I'm just scratching the surface and starting
> >> > with re2c.
> >> 
> >> > Relevant code:
> >> 
> >> > #include <stdio.h>
> >> > #include <stdlib.h>
> >> > #include <string.h>
> >> > char *scan(char *p)
> >> > {
> >> > /*!re2c
> >> >         re2c:define:YYCTYPE  = "unsigned char";
> >> >         re2c:define:YYCURSOR = p;
> >> >         re2c:yyfill:enable   = 0;
> >> >         re2c:yych:conversion = 1;
> >> >         re2c:indent:top      = 1;
> >> >         re2c:define:YYMARKER = p;
> >> >         PLACEHOLDER = "<!!([^>]+)?>[^<]+</!!>";
> >> >         ANYCHAR = (.|[\n\t]);
> >> >         PHP = "<\?php"({ANYCHAR})+"?>";
> >> >         {PHP} {
> >> >                 return "passed";
> >> >         }
> >> > */
> >> > }
> >> 
> >> > void main() {
> >> >         char *test;
> >> >         test = malloc(4096);
> >> >         bzero(test,4096);
> >> >         strcpy(test,"<?php ... ?>");
> >> >         char *res = scan(test);
> >> >         printf("%s",res);
> >> >         free(test);
> >> > }
> >> 
> >> > Run with re2c -Fis
> >> > re2c 0.13.4
> >> 
> >> > I found that this works as expected if I drop the ? from <?php.
> >> 
> >> The first thing here is the backslash in front of the ?. The second is that
> >> you did not provide any code to be executed in case nothing
> >> matches. And last but not least, there is a problem with unbounded
> >> matches like the one in your case. The rule ANYCHAR+ has no limitation, hence
> >> there will be calls to YYFILL to fill your buffer. If that results in a
> >> reallocation you need to update all pointers re2c uses. Now in your case, if
> >> re2c does find the '<?php' but does not find the '?>', it will simply
> >> continue beyond the end of the string. To avoid that you can provide a rule
> >> for the '\0' character. Then either add that rule above the current one or
> >> change ANYCHAR to not contain '\0'. The other solution is to provide a
> >> check for YYFILL that stops matching at the end of the string.
> >> 
> >> > valgrind:
> >> 
> >> > ==14141== Invalid read of size 1
> >> > ==14141==    at 0x804850B: scan (<stdout>:50)
> >> > ==14141==    by 0x80485B5: main (lang.r:28)
> >> > ==14141==  Address 0x4169028 is 0 bytes after a block of size 4,096 
> >> > alloc'd
> >> > ==14141==    at 0x4020641: malloc (in
> >> > /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
> >> > ==14141==    by 0x8048579: main (lang.r:25)
> >> 
> >> 
> >> 
> >> -- 
> >> Best regards,
> >>  Marcus                            mailto:[EMAIL PROTECTED]
> >> 
> 
> 
> 
> -- 
> Best regards,
>  Marcus                            mailto:[EMAIL PROTECTED]
> 
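
The overrun discussed in the quoted thread comes down to a matcher that
treats '\0' as just another character. As a rough illustration of what
excluding '\0' from ANYCHAR achieves, here is a hand-written C sketch (not
re2c-generated code) that matches "<?php", then at least one character, then
"?>", and gives up as soon as it sees the terminator. The function name
match_php is hypothetical, and unlike the ANYCHAR+ rule the sketch stops at
the first "?>" rather than the longest match.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if p starts a "<?php ... ?>" block, return a
   pointer just past the closing "?>"; otherwise return NULL.
   The terminating '\0' is never skipped, so no byte beyond the
   string is read. */
const char *match_php(const char *p)
{
    if (strncmp(p, "<?php", 5) != 0)
        return NULL;                /* opening tag missing */
    p += 5;

    if (*p == '\0')
        return NULL;                /* ANYCHAR+ needs at least one char */
    p++;                            /* consume that first character */

    for (; *p != '\0'; p++) {
        /* p[1] is at worst the terminator, so this read is in bounds */
        if (p[0] == '?' && p[1] == '>')
            return p + 2;
    }
    return NULL;                    /* hit '\0' before finding "?>" */
}
```

For the test string from the thread, match_php("<?php ... ?>") returns a
pointer to the end of the string, while match_php("<?php ... ") returns NULL
instead of running off the end of the buffer.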

_______________________________________________
Re2c-general mailing list
Re2c-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/re2c-general
