On Oct 12, 2011, at 3:48 PM, William A. Rowe Jr. wrote:
> It doesn't seem reasonable to wait any longer to take this to dev.
> I don't really need to go into the root complaint since this aspect
> stands on its own.
>
> On 10/7/2011 3:51 PM, William A. Rowe Jr. wrote:
>> On 10/7/2011 7:26 AM, Eric Covener wrote:
>>>> It might be sensible to return NULL in this first loop when len exceeds
>>>> REALLY_BIG_NUMBER - a few mb? Not sure.
>>>
>>> sounds reasonable to me
>>
>> Should we tolerate a string longer than our usual 8kb limit? It's
>> not even that big a number.
>>
>> Returning NULL ... we would have to make sure all callers are really
>> prepared for a NULL.
>
> So the question comes down to what is realistically the absolute max
> valuable regex substitution which ap_pregsub() should be allowed to
> emit. I'm in favor of HUGE_STRING_LEN being the abs max result.
>
8k seems awfully small to me… 64k seems more useful.
> Secondly, what action should ap_pregsub take when this absolute max
> string size is exceeded? A sudden switch to returning NULL for these
> cases?
>
I say return the orig string...
>
>
> /** The max number of regex captures that can be expanded by ap_pregsub */
> #define AP_MAX_REG_MATCH 10
>
> And what is the behavior of > AP_MAX_REG_MATCH captures? Perhaps we should
> follow that same behavior for > HUGE_STRING_LEN results?
The thing is, I don't even see where we honor that at all… At
least in the way of if someone calls it with '20' we error out.
while ((c = *src++) != '\0') {
if (c == '$' && apr_isdigit(*src))
no = *src++ - '0';
else
no = 10;
if (no > 9) { /* Ordinary character. */
if (c == '\\' && *src)
src++;
len++;
}
else if (no < nmatch && pmatch[no].rm_so < pmatch[no].rm_eo) {
len += pmatch[no].rm_eo - pmatch[no].rm_so;
}
}
Note that no will always be <=10, so if they call with
match being 20, then we hit the edge-case.
I propose that if nmatch>AP_MAX_REG_MATCH, we return the orig
string. From what I can tell, we've never returned NULL, unless
the source itself was NULL, having our failure return the orig
string seems safest.