mod_rewrite.c

Ruediger Pluem Mon, 13 Mar 2023 07:44:35 -0700

On 3/13/23 2:57 PM, Yann Ylavic wrote:
> On Mon, Mar 13, 2023 at 2:32 PM Eric Covener <[email protected]> wrote:
>>
>> On Mon, Mar 13, 2023 at 8:31 AM Yann Ylavic <[email protected]> wrote:
>>>
>>> Maybe we could make [B=] and [BTCLS] not mutually exclusive (encode
>>> both controls and whatever in B=).
>>> I was thinking of a [BNEG] flag too (encode everything but what's in
>>> B=), and never encode alnum or '_', so all in all some further patch
>>> like the attached one. WDYT?
>>
>> Looks good to me, I have an ancient patch where I did something very
>> similar to a copy of int:escape where you could set exceptions in
>> subprocess_env. The config is ugly so I never upstreamed it.
>>
>> if you make the change I can add some tests/doc. Should caution
>> against plain [B] and refer to the others in the doc?
> 
> r1902323, thanks for the tests/doc.
> IIUC, when the query-string is rewritten, I think we should caution
> against using [B] with a redirect (double encoding) and not using
> [BCTLS] (or some careful flavor of [B]) for a non-redirect..
> 

I think the biggest issue with [B] is that when you capture data from the path 
which is decoded
and try to use this data in the new URL in path and query string at the same 
time. For putting
it in the path B does the wrong thing, for putting it in the query string it 
does the correct thing.
If you do not use B then the substitution in the path will be correct, but not 
in the query string.
Apart from the additional B variants now available I guess a further internal 
rewritemap can help
that unlike 'int:escape' would consider using '+' instead of %20 to encode 
spaces.
Of course if you capture data from the query string you have to fight with the 
other way round and
need to ensure that you decode the data before putting it in the path as 
otherwise you probably have
an issue with double encoding.

BTW: I guess if time is found we should reorganize our escaping / unescaping 
code. It seems to be
widespread across the codebase to handle various special cases / edge cases and 
it seems to duplicate
code in a lot of locations. I think it would be better if it could be 
concentrated in server/util.c
and if needed add some helper functions that can be reused if we cannot move 
all special cases / edge cases
there such that their handling uses a more common base.
Another thought that I had from the performance point of view is if we could 
create an API that would allow
to build tables similar to the test_char_table on the "fly" e.g. during init 
and let functions use these
tables for effective character classification. I think this would be more 
effective for longer lists of characters
then using apr_isalnum or strchr in loops again and again.

Regards

Rüdiger

Re: svn commit: r1908301 - in /httpd/httpd/trunk: changes-entries/rewrite-bctls docs/manual/rewrite/flags.xml modules/mappers/mod_rewrite.c

Reply via email to