linux and ast <regex.h> have
typedef int regoff_t;
this is a binary compatibility problem, but not insurmountable given
src/lib/libast/features/api => <ast_api.h>
all ast <regex.h> size-ish variables and struct members are [s]size_t -- good
some internal ast size-ish variables and struct members are [unsigned ] int --
not too bad
it would take a day or so to make sure all int => size_t changes were uncovered
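
a rough sketch of why an int regoff_t matters for >2GB subjects, assuming a
32-bit int: any match offset above INT_MAX cannot be stored without loss.
narrow_regoff_t, offset_fits_narrow() and store_offset() are hypothetical names
for illustration only, not part of ast or linux <regex.h>. the two offsets in
the usage note are where "y" sits in the matching and failing perl runs quoted
below.

```c
#include <limits.h>
#include <stddef.h>

/* hypothetical stand-in that mirrors "typedef int regoff_t;" */
typedef int narrow_regoff_t;

/* return 1 if byte offset off can be stored in narrow_regoff_t without loss */
int offset_fits_narrow(size_t off)
{
    return off <= (size_t)INT_MAX;
}

/* store off in *out, or return -1 instead of silently truncating it */
int store_offset(size_t off, narrow_regoff_t *out)
{
    if (!offset_fits_narrow(off))
        return -1;
    *out = (narrow_regoff_t)off;
    return 0;
}
```

with a 32-bit int, store_offset() accepts offset 2146435072 and rejects offset
2147483648; a [s]size_t regoff_t would hold both on a 64-bit build.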
On Sun, 6 May 2012 23:37:41 +0200 Olga Kryzhanovska wrote:
> Glenn, this came through the perl mailing list today. I do not have a
> suitable machine but you may have one. Can you test if the libast
> regex engine can reliably match strings >2GB, please?
> Olga
> ---------- Forwarded message ----------
> From: David Leadbeater <[email protected]>
> Date: Sun, May 6, 2012 at 8:19 PM
> Subject: [perl #112790] Regexp engine cannot match >2GB strings
> To: [email protected]
> # New Ticket Created by David Leadbeater
> # Please include the string: [perl #112790]
> # in the subject line of all future correspondence about this issue.
> # <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=112790 >
> Matching unexpectedly fails when the string is longer than I32. The
> following fixes it, but I see a lot of I32 in the regexp engine itself so
> this might be masking other issues (see also RT #72784).
> diff --git a/pp_hot.c b/pp_hot.c
> index 89165d9..662b908 100644
> --- a/pp_hot.c
> +++ b/pp_hot.c
> @@ -1303,7 +1303,7 @@ PP(pp_match)
>          rx = PM_GETRE(pm);
>      }
> -    if (RX_MINLEN(rx) > (I32)len)
> +    if ((STRLEN)RX_MINLEN(rx) > len)
>          goto failure;
>      truebase = t = s;
> Reproduce with:
> $ perl -Mre=debug -le'$a="x"x 1048576; $b.=$a for 1 .. 2047; $b.="y"; print
> length $b; print $b =~ /y/ ? "Matched" : "No match"'
> Compiling REx "y"
> Final program:
> 1: EXACT <y> (3)
> 3: END (0)
> anchored "y" at 0 (checking anchored isall) minlen 1
> 2146435073
> Guessing start of match in sv for REx "y" against
> "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"...
> Found anchored substr "y" at offset 2146435072...
> Starting position does not contradict /^/m...
> Guessed: match at offset 2146435072
> Matched
> Freeing REx: "y"
> $ perl -Mre=debugcolor -le'$a="x"x 1048576; $b.=$a for 1 .. 2048; $b.="y";
> print length $b; print $b =~ /y/ ? "Matched" : "No match"'
> Compiling REx "y"
> Final program:
> 1: EXACT <y> (3)
> 3: END (0)
> anchored "y" at 0 (checking anchored isall) minlen 1
> 2147483649
> No match
> Freeing REx: "y"
> --
> , _ _ ,
> { \/`o;====- Olga Kryzhanovska -====;o`\/ }
> .----'-/`-/ [email protected] \-`\-'----.
> `'-..-| / http://twitter.com/fleyta \ |-..-'`
> /\/\ Solaris/BSD//C/C++ programmer /\/\
> `--` `--`
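
a minimal sketch of the comparison changed by the quoted pp_hot.c patch, under
stated assumptions: perl's I32 is modeled as int32_t and STRLEN as size_t, and
rejects_old() / rejects_new() are hypothetical helpers, not perl functions.
converting an out-of-range length to a signed 32-bit type is
implementation-defined in C; on common two's-complement compilers it wraps
negative, which would make a minlen of 1 look larger than the string length.

```c
#include <stddef.h>
#include <stdint.h>

/* old test: narrows len to 32 bits before comparing.
 * returns 1 when the match would be rejected up front. */
int rejects_old(int32_t minlen, size_t len)
{
    /* implementation-defined when len > INT32_MAX; typically wraps negative */
    return minlen > (int32_t)len;
}

/* patched test: widens minlen instead of narrowing len.
 * assumes minlen is not negative. */
int rejects_new(int32_t minlen, size_t len)
{
    return (size_t)minlen > len;
}
```

for the two lengths in the report, rejects_old(1, 2146435073) is 0, while
rejects_old(1, 2147483649) is 1 on such compilers; rejects_new() gives 0 for
both.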
_______________________________________________
ast-developers mailing list
[email protected]
https://mailman.research.att.com/mailman/listinfo/ast-developers