Re: process_regexp bug, infinite recursion

2024-01-16 Thread Yann Ylavic
On Mon, Jan 8, 2024 at 5:54 PM Ruediger Pluem  wrote:
>
> On 1/8/24 1:37 PM, Yann Ylavic wrote:
> >
> > As noted in v2 we have an issue here by "losing" the beginning of the
> > value on recursion:
> > /* XXX: recursing by using AP_REG_NOTBOL (because we are not at 
> > ^
> >  * anymore) and then "losing" the beginning of the string is not
> >  * always correct. Say we match "(?<=a)ba" against "ababa", on
> >  * recursion ap_regexec_len() will not know that the second "b" 
> > is
> >  * preceded by "a" thus not match. We'd need a new 
> > ap_regexec_ex()
> >  * that can take match_end as an offset to fix this..
> >  */
> >
> > Not sure how far we should go with this patch..
>
> I think things do not get worse in this respect because of this patch but 
> only improve
> in the sense that an infinite recursion is avoided.
> Hence +1 on the patch.

I finally went with the full thing in r1915267, r1915268 and r1915271
(with new tests in r1915269 for what didn't work well).


Regards;
Yann.


Re: process_regexp bug, infinite recursion

2024-01-08 Thread Ruediger Pluem



On 1/8/24 1:37 PM, Yann Ylavic wrote:
> On Mon, Jan 8, 2024 at 10:49 AM Ruediger Pluem  wrote:
>>
>> On 1/5/24 3:08 PM, Yann Ylavic wrote:
>>>
>>> process_regexp.diff
>>>
>>> Index: modules/metadata/mod_headers.c
>>> ===
>>> --- modules/metadata/mod_headers.c(revision 1914804)
>>> +++ modules/metadata/mod_headers.c(working copy)
>>> @@ -622,41 +622,70 @@ static char* process_tags(header_entry *hdr, reque
>>>  }
>>>  return str ? str : "";
>>>  }
>>> -static const char *process_regexp(header_entry *hdr, const char *value,
>>> -  request_rec *r)
>>> +
>>> +static const char *process_regexp_rec(header_entry *hdr, const char *value,
>>> +  request_rec *r, ap_regmatch_t 
>>> pmatch[],
>>> +  int flags)
>>>  {
>>> -ap_regmatch_t pmatch[AP_MAX_REG_MATCH];
>>> -const char *subs;
>>> -const char *remainder;
>>>  char *ret;
>>> -int diffsz;
>>> -if (ap_regexec(hdr->regex, value, AP_MAX_REG_MATCH, pmatch, 0)) {
>>> +const char *subs, *remainder;
>>> +apr_size_t val_len, subs_len, rem_len;
>>> +apr_size_t match_start, match_end;
>>> +
>>> +val_len = strlen(value);
>>> +if (ap_regexec_len(hdr->regex, value, val_len,
>>> +   AP_MAX_REG_MATCH, pmatch,
>>> +   flags)) {
>>>  /* no match, nothing to do */
>>>  return value;
>>>  }
>>> +
>>>  /* Process tags in the input string rather than the resulting
>>> * substitution to avoid surprises
>>> */
>>> -subs = ap_pregsub(r->pool, process_tags(hdr, r), value, 
>>> AP_MAX_REG_MATCH, pmatch);
>>> +subs = ap_pregsub(r->pool, process_tags(hdr, r), value,
>>> +  AP_MAX_REG_MATCH, pmatch);
>>>  if (subs == NULL)
>>>  return NULL;
>>> -diffsz = strlen(subs) - (pmatch[0].rm_eo - pmatch[0].rm_so);
>>> +subs_len = strlen(subs);
>>> +
>>> +ap_assert(pmatch[0].rm_so >= 0 && pmatch[0].rm_so <= pmatch[0].rm_eo);
>>> +match_start = pmatch[0].rm_so;
>>> +match_end = pmatch[0].rm_eo;
>>>  if (hdr->action == hdr_edit) {
>>> -remainder = value + pmatch[0].rm_eo;
>>> +remainder = value + match_end;
>>>  }
>>>  else { /* recurse to edit multiple matches if applicable */
>>> -remainder = process_regexp(hdr, value + pmatch[0].rm_eo, r);
>>> -if (remainder == NULL)
>>> -return NULL;
>>> -diffsz += strlen(remainder) - strlen(value + pmatch[0].rm_eo);
>>> +if (match_end == 0 && val_len) {
>>> +/* advance on empty match to avoid infinite recursion */
>>> +match_start = match_end = 1;
>>
>> Not debating that
>>
>> Header edit* headername ^ prefix
>>
>> should be be
>>
>> Header edit headername ^ prefix
>>
>> but wouldn't that do the wrong thing and add the prefix *after* the first 
>> character of the header value?
> 
> Yes correct, this patch won't set the skipped char at the right place finally.
> New v2 attached.
> 
>>
>>> +}
>>> +if (match_end < val_len) {
>>> +remainder = process_regexp_rec(hdr, value + match_end,
>>
>> Shouldn't we increase match_end just here by 1 if match_end == 0 && val_len ?
> 
> See v2, match_end is not used below anyway.
> 
> 
>>
>>> +   r, pmatch, AP_REG_NOTBOL);
> 
> As noted in v2 we have an issue here by "losing" the beginning of the
> value on recursion:
> /* XXX: recursing by using AP_REG_NOTBOL (because we are not at ^
>  * anymore) and then "losing" the beginning of the string is not
>  * always correct. Say we match "(?<=a)ba" against "ababa", on
>  * recursion ap_regexec_len() will not know that the second "b" is
>  * preceded by "a" thus not match. We'd need a new ap_regexec_ex()
>  * that can take match_end as an offset to fix this..
>  */
> 
> Not sure how far we should go with this patch..

I think things do not get worse in this respect because of this patch but only 
improve
in the sense that an infinite recursion is avoided.
Hence +1 on the patch.

Regards

Rüdiger



Re: process_regexp bug, infinite recursion

2024-01-08 Thread Yann Ylavic
On Mon, Jan 8, 2024 at 10:49 AM Ruediger Pluem  wrote:
>
> On 1/5/24 3:08 PM, Yann Ylavic wrote:
> >
> > process_regexp.diff
> >
> > Index: modules/metadata/mod_headers.c
> > ===
> > --- modules/metadata/mod_headers.c(revision 1914804)
> > +++ modules/metadata/mod_headers.c(working copy)
> > @@ -622,41 +622,70 @@ static char* process_tags(header_entry *hdr, reque
> >  }
> >  return str ? str : "";
> >  }
> > -static const char *process_regexp(header_entry *hdr, const char *value,
> > -  request_rec *r)
> > +
> > +static const char *process_regexp_rec(header_entry *hdr, const char *value,
> > +  request_rec *r, ap_regmatch_t 
> > pmatch[],
> > +  int flags)
> >  {
> > -ap_regmatch_t pmatch[AP_MAX_REG_MATCH];
> > -const char *subs;
> > -const char *remainder;
> >  char *ret;
> > -int diffsz;
> > -if (ap_regexec(hdr->regex, value, AP_MAX_REG_MATCH, pmatch, 0)) {
> > +const char *subs, *remainder;
> > +apr_size_t val_len, subs_len, rem_len;
> > +apr_size_t match_start, match_end;
> > +
> > +val_len = strlen(value);
> > +if (ap_regexec_len(hdr->regex, value, val_len,
> > +   AP_MAX_REG_MATCH, pmatch,
> > +   flags)) {
> >  /* no match, nothing to do */
> >  return value;
> >  }
> > +
> >  /* Process tags in the input string rather than the resulting
> > * substitution to avoid surprises
> > */
> > -subs = ap_pregsub(r->pool, process_tags(hdr, r), value, 
> > AP_MAX_REG_MATCH, pmatch);
> > +subs = ap_pregsub(r->pool, process_tags(hdr, r), value,
> > +  AP_MAX_REG_MATCH, pmatch);
> >  if (subs == NULL)
> >  return NULL;
> > -diffsz = strlen(subs) - (pmatch[0].rm_eo - pmatch[0].rm_so);
> > +subs_len = strlen(subs);
> > +
> > +ap_assert(pmatch[0].rm_so >= 0 && pmatch[0].rm_so <= pmatch[0].rm_eo);
> > +match_start = pmatch[0].rm_so;
> > +match_end = pmatch[0].rm_eo;
> >  if (hdr->action == hdr_edit) {
> > -remainder = value + pmatch[0].rm_eo;
> > +remainder = value + match_end;
> >  }
> >  else { /* recurse to edit multiple matches if applicable */
> > -remainder = process_regexp(hdr, value + pmatch[0].rm_eo, r);
> > -if (remainder == NULL)
> > -return NULL;
> > -diffsz += strlen(remainder) - strlen(value + pmatch[0].rm_eo);
> > +if (match_end == 0 && val_len) {
> > +/* advance on empty match to avoid infinite recursion */
> > +match_start = match_end = 1;
>
> Not debating that
>
> Header edit* headername ^ prefix
>
> should be be
>
> Header edit headername ^ prefix
>
> but wouldn't that do the wrong thing and add the prefix *after* the first 
> character of the header value?

Yes correct, this patch won't set the skipped char at the right place finally.
New v2 attached.

>
> > +}
> > +if (match_end < val_len) {
> > +remainder = process_regexp_rec(hdr, value + match_end,
>
> Shouldn't we increase match_end just here by 1 if match_end == 0 && val_len ?

See v2, match_end is not used below anyway.


>
> > +   r, pmatch, AP_REG_NOTBOL);

As noted in v2 we have an issue here by "losing" the beginning of the
value on recursion:
/* XXX: recursing by using AP_REG_NOTBOL (because we are not at ^
 * anymore) and then "losing" the beginning of the string is not
 * always correct. Say we match "(?<=a)ba" against "ababa", on
 * recursion ap_regexec_len() will not know that the second "b" is
 * preceded by "a" thus not match. We'd need a new ap_regexec_ex()
 * that can take match_end as an offset to fix this..
 */

Not sure how far we should go with this patch..


Regards;
Yann.
Index: modules/metadata/mod_headers.c
===
--- modules/metadata/mod_headers.c	(revision 1914804)
+++ modules/metadata/mod_headers.c	(working copy)
@@ -622,41 +622,90 @@ static char* process_tags(header_entry *hdr, reque
 }
 return str ? str : "";
 }
-static const char *process_regexp(header_entry *hdr, const char *value,
-  request_rec *r)
+
+static const char *process_regexp_core(header_entry *hdr,
+   const char *val, apr_size_t val_len,
+   request_rec *r, ap_regmatch_t *pmatch,
+   int flags)
 {
-ap_regmatch_t pmatch[AP_MAX_REG_MATCH];
-const char *subs;
-const char *remainder;
 char *ret;
-int diffsz;
-if (ap_regexec(hdr->regex, value, AP_MAX_REG_MATCH, pmatch, 0)) {
+const char *sub, *rem;
+apr_size_t 

Re: process_regexp bug, infinite recursion

2024-01-08 Thread Ruediger Pluem



On 1/5/24 3:08 PM, Yann Ylavic wrote:
> On Fri, Jan 5, 2024 at 3:11 AM Eric Covener  wrote:
>> On Thu, Jan 4, 2024 at 9:04 PM Jason Pyeron  wrote:
>>> I am having some issue searching Bugzilla for any issue involving 
>>> process_regexp in mod_headers.c .
>>>
>>> It finds nothing, so I am assuming I did something wrong in my search. Will 
>>> file bug if not already filed.
>>>
>>> We are investigating an infinite loop (stack overflow) issue, caused by 
>>> "securing" a system.
>>>
>>> ZZZ-STIG-SV-214288r881493_rule.conf:Header always edit* Set-Cookie ^(.*)$ 
>>> $1;HttpOnly;secure
>> edit* just makes no sense at all here when you are capturing/replacing
>> the entire string and will loop by definition.
> Maybe we should avoid infinite recursion in any case, like in the
> attached patch?
> How does it work for you Jason?
> 
> Regards;
> Yann.
> 
> 
> process_regexp.diff
> 
> Index: modules/metadata/mod_headers.c
> ===
> --- modules/metadata/mod_headers.c(revision 1914804)
> +++ modules/metadata/mod_headers.c(working copy)
> @@ -622,41 +622,70 @@ static char* process_tags(header_entry *hdr, reque
>  }
>  return str ? str : "";
>  }
> -static const char *process_regexp(header_entry *hdr, const char *value,
> -  request_rec *r)
> +
> +static const char *process_regexp_rec(header_entry *hdr, const char *value,
> +  request_rec *r, ap_regmatch_t pmatch[],
> +  int flags)
>  {
> -ap_regmatch_t pmatch[AP_MAX_REG_MATCH];
> -const char *subs;
> -const char *remainder;
>  char *ret;
> -int diffsz;
> -if (ap_regexec(hdr->regex, value, AP_MAX_REG_MATCH, pmatch, 0)) {
> +const char *subs, *remainder;
> +apr_size_t val_len, subs_len, rem_len;
> +apr_size_t match_start, match_end;
> +
> +val_len = strlen(value);
> +if (ap_regexec_len(hdr->regex, value, val_len,
> +   AP_MAX_REG_MATCH, pmatch,
> +   flags)) {
>  /* no match, nothing to do */
>  return value;
>  }
> +
>  /* Process tags in the input string rather than the resulting
> * substitution to avoid surprises
> */
> -subs = ap_pregsub(r->pool, process_tags(hdr, r), value, 
> AP_MAX_REG_MATCH, pmatch);
> +subs = ap_pregsub(r->pool, process_tags(hdr, r), value,
> +  AP_MAX_REG_MATCH, pmatch);
>  if (subs == NULL)
>  return NULL;
> -diffsz = strlen(subs) - (pmatch[0].rm_eo - pmatch[0].rm_so);
> +subs_len = strlen(subs);
> +
> +ap_assert(pmatch[0].rm_so >= 0 && pmatch[0].rm_so <= pmatch[0].rm_eo);
> +match_start = pmatch[0].rm_so;
> +match_end = pmatch[0].rm_eo;
>  if (hdr->action == hdr_edit) {
> -remainder = value + pmatch[0].rm_eo;
> +remainder = value + match_end;
>  }
>  else { /* recurse to edit multiple matches if applicable */
> -remainder = process_regexp(hdr, value + pmatch[0].rm_eo, r);
> -if (remainder == NULL)
> -return NULL;
> -diffsz += strlen(remainder) - strlen(value + pmatch[0].rm_eo);
> +if (match_end == 0 && val_len) {
> +/* advance on empty match to avoid infinite recursion */
> +match_start = match_end = 1;

Not debating that

Header edit* headername ^ prefix

should be be

Header edit headername ^ prefix

but wouldn't that do the wrong thing and add the prefix *after* the first 
character of the header value?

> +}
> +if (match_end < val_len) {
> +remainder = process_regexp_rec(hdr, value + match_end,

Shouldn't we increase match_end just here by 1 if match_end == 0 && val_len ?

> +   r, pmatch, AP_REG_NOTBOL);
> +if (remainder == NULL)
> +return NULL;
> +}
> +else {
> +remainder = "";
> +}
>  }
> -ret = apr_palloc(r->pool, strlen(value) + 1 + diffsz);
> -memcpy(ret, value, pmatch[0].rm_so);
> -strcpy(ret + pmatch[0].rm_so, subs);
> -strcat(ret, remainder);
> +rem_len = strlen(remainder);
> +
> +ret = apr_palloc(r->pool, match_start + subs_len + rem_len + 1);
> +memcpy(ret, value, match_start);
> +memcpy(ret + match_start, subs, subs_len);
> +memcpy(ret + match_start + subs_len, remainder, rem_len + 1);
>  return ret;
>  }

Regards

Rüdiger



RE: process_regexp bug, infinite recursion

2024-01-05 Thread Jason Pyeron
> -Original Message-
> From: Yann Ylavic 
> Sent: Friday, January 5, 2024 9:09 AM
> 
> On Fri, Jan 5, 2024 at 3:11 AM Eric Covener  wrote:
> >
> > On Thu, Jan 4, 2024 at 9:04 PM Jason Pyeron  wrote:
> > >
> > > I am having some issue searching Bugzilla for any issue involving 
> > > process_regexp in
> mod_headers.c .
> > >
> > > It finds nothing, so I am assuming I did something wrong in my search. 
> > > Will file bug if not
> already filed.
> > >
> > > We are investigating an infinite loop (stack overflow) issue, caused by 
> > > "securing" a system.
> > >
> > > ZZZ-STIG-SV-214288r881493_rule.conf:Header always edit* Set-Cookie ^(.*)$ 
> > > $1;HttpOnly;secure
> >
> > edit* just makes no sense at all here when you are capturing/replacing
> > the entire string and will loop by definition.
> 
> Maybe we should avoid infinite recursion in any case, like in the
> attached patch?

Cool.

Likely, Monday I was going do similar - then only difference was checking if 
the substitution "made no change", abort. s/a/a/.

> How does it work for you Jason?
> 

Test early next week, I forwarded to the team to build and test.



Re: process_regexp bug, infinite recursion

2024-01-05 Thread Yann Ylavic
On Fri, Jan 5, 2024 at 3:11 AM Eric Covener  wrote:
>
> On Thu, Jan 4, 2024 at 9:04 PM Jason Pyeron  wrote:
> >
> > I am having some issue searching Bugzilla for any issue involving 
> > process_regexp in mod_headers.c .
> >
> > It finds nothing, so I am assuming I did something wrong in my search. Will 
> > file bug if not already filed.
> >
> > We are investigating an infinite loop (stack overflow) issue, caused by 
> > "securing" a system.
> >
> > ZZZ-STIG-SV-214288r881493_rule.conf:Header always edit* Set-Cookie ^(.*)$ 
> > $1;HttpOnly;secure
>
> edit* just makes no sense at all here when you are capturing/replacing
> the entire string and will loop by definition.

Maybe we should avoid infinite recursion in any case, like in the
attached patch?
How does it work for you Jason?

Regards;
Yann.
Index: modules/metadata/mod_headers.c
===
--- modules/metadata/mod_headers.c	(revision 1914804)
+++ modules/metadata/mod_headers.c	(working copy)
@@ -622,41 +622,70 @@ static char* process_tags(header_entry *hdr, reque
 }
 return str ? str : "";
 }
-static const char *process_regexp(header_entry *hdr, const char *value,
-  request_rec *r)
+
+static const char *process_regexp_rec(header_entry *hdr, const char *value,
+  request_rec *r, ap_regmatch_t pmatch[],
+  int flags)
 {
-ap_regmatch_t pmatch[AP_MAX_REG_MATCH];
-const char *subs;
-const char *remainder;
 char *ret;
-int diffsz;
-if (ap_regexec(hdr->regex, value, AP_MAX_REG_MATCH, pmatch, 0)) {
+const char *subs, *remainder;
+apr_size_t val_len, subs_len, rem_len;
+apr_size_t match_start, match_end;
+
+val_len = strlen(value);
+if (ap_regexec_len(hdr->regex, value, val_len,
+   AP_MAX_REG_MATCH, pmatch,
+   flags)) {
 /* no match, nothing to do */
 return value;
 }
+
 /* Process tags in the input string rather than the resulting
* substitution to avoid surprises
*/
-subs = ap_pregsub(r->pool, process_tags(hdr, r), value, AP_MAX_REG_MATCH, pmatch);
+subs = ap_pregsub(r->pool, process_tags(hdr, r), value,
+  AP_MAX_REG_MATCH, pmatch);
 if (subs == NULL)
 return NULL;
-diffsz = strlen(subs) - (pmatch[0].rm_eo - pmatch[0].rm_so);
+subs_len = strlen(subs);
+
+ap_assert(pmatch[0].rm_so >= 0 && pmatch[0].rm_so <= pmatch[0].rm_eo);
+match_start = pmatch[0].rm_so;
+match_end = pmatch[0].rm_eo;
 if (hdr->action == hdr_edit) {
-remainder = value + pmatch[0].rm_eo;
+remainder = value + match_end;
 }
 else { /* recurse to edit multiple matches if applicable */
-remainder = process_regexp(hdr, value + pmatch[0].rm_eo, r);
-if (remainder == NULL)
-return NULL;
-diffsz += strlen(remainder) - strlen(value + pmatch[0].rm_eo);
+if (match_end == 0 && val_len) {
+/* advance on empty match to avoid infinite recursion */
+match_start = match_end = 1;
+}
+if (match_end < val_len) {
+remainder = process_regexp_rec(hdr, value + match_end,
+   r, pmatch, AP_REG_NOTBOL);
+if (remainder == NULL)
+return NULL;
+}
+else {
+remainder = "";
+}
 }
-ret = apr_palloc(r->pool, strlen(value) + 1 + diffsz);
-memcpy(ret, value, pmatch[0].rm_so);
-strcpy(ret + pmatch[0].rm_so, subs);
-strcat(ret, remainder);
+rem_len = strlen(remainder);
+
+ret = apr_palloc(r->pool, match_start + subs_len + rem_len + 1);
+memcpy(ret, value, match_start);
+memcpy(ret + match_start, subs, subs_len);
+memcpy(ret + match_start + subs_len, remainder, rem_len + 1);
 return ret;
 }
 
+static const char *process_regexp(header_entry *hdr, const char *value,
+  request_rec *r)
+{
+ap_regmatch_t pmatch[AP_MAX_REG_MATCH];
+return process_regexp_rec(hdr, value, r, pmatch, 0);
+}
+
 static int echo_header(void *v, const char *key, const char *val)
 {
 echo_do *ed = (echo_do *)v;


Re: process_regexp bug, infinite recursion

2024-01-04 Thread Eric Covener
On Thu, Jan 4, 2024 at 9:04 PM Jason Pyeron  wrote:
>
> I am having some issue searching Bugzilla for any issue involving 
> process_regexp in mod_headers.c .
>
> It finds nothing, so I am assuming I did something wrong in my search. Will 
> file bug if not already filed.
>
> We are investigating an infinite loop (stack overflow) issue, caused by 
> "securing" a system.
>
> ZZZ-STIG-SV-214288r881493_rule.conf:Header always edit* Set-Cookie ^(.*)$ 
> $1;HttpOnly;secure

edit* just makes no sense at all here when you are capturing/replacing
the entire string and will loop by definition.

Are you looking to have mod_headers bail out after an unreasonable
number of iterations or header length?