On 07/19/2009 05:26 PM, Nick Kew wrote:
> Ruediger Pluem wrote:
>>
>> On 07/19/2009 01:12 AM, [email protected] wrote:
>
>>> - if (re->nsub < idx || idx >= AP_MAX_REG_MATCH) {
>>> - ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r,
>>> - "regex capture $%" APR_SIZE_T_FMT
>>> - " is out of range (last regex was:
>>> '%s') in %s",
>>> - idx, re->rexp, r->filename);
>>> - return NULL;
>>> - }
>>> -
>>> - if (re->match[idx].rm_so < 0 || re->match[idx].rm_eo < 0) {
>>> - return NULL;
>>> - }
>>> + else if (re->match[idx]rm_so == re->match[idx].rm_eo) {
>>> + return NULL;
>>> + }
>>> + else if (re->match[idx].rm_so < 0 || re->match[idx].rm_eo <
>>> 0) {
>>> + /* I don't think this can happen if have_match is true.
>>> + * But let's not risk a regression by dropping this
>>> + */
>>> + return NULL;
>>> + }
>>> + else if (re->nsub < idx || idx >= AP_MAX_REG_MATCH) {
>>
>> IMHO this check should be the first (as in the old code) as it ensures
>> that
>> idx is of correct range. So it should be moved before
>> re->match[idx].rm_so == re->match[idx].rm_eo)
>
> The problem with that is that re->nsub isn't the actual number of
> matches, it's the maximum number we gave to regexec. So that check
> returns true even when idx indexes a non-match.
But if idx for whatever reason is above AP_MAX_REG_MATCH we can segfault
again in re->match[idx].rm_so == re->match[idx].rm_eo. So I still suggest
to reverse the order of both if statements. This doesn't cost us anything
but make things more foolprove.
>
> I'd say it would be better to set it to the number of actual matches
> when we run regexec. But I didn't want to think through the risk
> of side-effects in a quick-fix scenario.
>
> OTOH, bearing in mind the history of that code, it's never been
> part of a public API in a stable release, so if we do create
> side-effects, we're not at risk of breaking anything we shouldn't.
> I guess we could reasonably hack it in trunk?
>
>>> + ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r,
>>> + "regex capture $%" APR_SIZE_T_FMT
>>> + " is out of range (last regex was: '%s')
>>> in %s",
>>> + idx, re->rexp, r->filename);
>>> + return NULL;
>>> + }
>>>
>>> + else {
>>> val = apr_pstrmemdup(ctx->dpool, re->source +
>>> re->match[idx].rm_so,
>>> re->match[idx].rm_eo -
>>> re->match[idx].rm_so);
>>
>>
>> As apr_pstrmemdup does return '\0' instead of NULL when
>> re->match[idx].rm_so == re->match[idx].rm_eo
>> we change the behaviour by doing the re->match[idx].rm_so ==
>> re->match[idx].rm_eo check above.
>
> Fair point. Again, side-effects. Let's reverse that change for 2.2
> (and in trunk if you're unhappy with it).
As I see no benefit in returning NULL here I would say that we should
reverse this in trunk as well as it keeps 2.2.x and trunk closer together.
I have no problem if trunk is different from 2.2.x but it should be for
a reason which I fail to see here.
Regards
Rüdiger