Re: [PATCH] Fix settings options with ProxyPassMatch

Yann Ylavic Tue, 29 Apr 2014 15:54:22 -0700

On Tue, Apr 29, 2014 at 3:51 PM, Jim Jagielski <[email protected]> wrote:
> On Apr 29, 2014, at 8:41 AM, Jan Kaluža <[email protected]> wrote:
>>
>> Because later we have to match the URL of request with some proxy_worker.
>>
>> If you configure ProxyPassMatch like this:
>> ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
>>
>> Then the proxy_worker name would be "http://x/$1/foo.jpg".
>>
>> If you receive request with URL "http://x/something/foo.jpg",
>> ap_proxy_get_worker() will have to find out the worker with name
>> "http://x/$1/foo.jpg". The question here is how it would do that?
>>
>> The answer used in the patch is "we change the worker name to 
>> http://x/*/foo.jpg" and check if the URL ("http://x/something/foo.jpg" in
>> our case) matches that worker.
>>
>> If we store the original name with $N, we will have to find out different 
>> way how to match the worker (probably emulating wildcard pattern matching)
>>
>> It would be possible to store only the original name (with "$N" variables), 
>> store the flag that the proxy worker is using regex and change 
>> ap_proxy_strcmp_ematch() function to treat "$N" as "*", but I don't see any 
>> real advantage here.
>>
>
> In Yann's suggested patch we don't store match_name where it
> belongs; so we'd need to put it in shm, which means more
> memory.


Agreed, plus this is not balancer-manager aware.

BTW, what's the difference between alias_match() used by proxy_trans()
and ap_proxy_get_worker()? Longest match?
Can an entry matched by proxy_trans() *not* belong to the worker
returned later by ap_proxy_get_worker()?
If not, another solution would be to backref the worker in each of its
struct proxy_alias entries.
That way the worker would be already known at proxy_trans() time (when
the entry is matched), and a new ap_proxy_get_worker_for_request(r)
could do the association later.
AFAICT, we don't use ap_proxy_get_worker() at runtime without a
request_rec available.
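
To make the backref idea concrete, here is a minimal sketch. It uses toy structures, not the real proxy_alias, proxy_worker or request_rec, and a plain prefix compare stands in for the regex match. All toy_* names are hypothetical, including the lookup that plays the role of ap_proxy_get_worker_for_request(r).

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins, NOT the real httpd structures. */
typedef struct toy_worker {
    const char *name;             /* e.g. "http://x/$1/foo.jpg" */
} toy_worker;

typedef struct toy_alias {
    const char *fake_prefix;      /* plain prefix standing in for the regex */
    toy_worker *worker;           /* the proposed back-reference */
} toy_alias;

typedef struct toy_request {
    const char *uri;
    const toy_alias *matched;     /* remembered at translate time */
} toy_request;

/* Translate step: the first alias whose prefix matches wins; remember it. */
static int toy_proxy_trans(toy_request *r, const toy_alias *aliases, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        size_t len = strlen(aliases[i].fake_prefix);
        if (strncmp(r->uri, aliases[i].fake_prefix, len) == 0) {
            r->matched = &aliases[i];
            return 1;
        }
    }
    r->matched = NULL;
    return 0;
}

/* Hypothetical lookup: no second name match, just follow the backref. */
static toy_worker *toy_get_worker_for_request(const toy_request *r)
{
    return r->matched ? r->matched->worker : NULL;
}
```

The worker lookup then costs one pointer dereference, but it only works where a request (and its translate-time match) is at hand.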

At least that could work for the *Match workers, for which the only
relevant match of the requested URL is the one from proxy_trans(), imo.

Still another solution for these workers would be to reuse the
ap_regmatch_t vector from proxy_trans() to exact-match the worker's
name (with its zero or more $N replaced by the substrings at the offsets
in vector[N], like ap_expr_str_exec_re() does).
That would also require a request_rec to be available at
ap_proxy_get_worker()'s (run)time though.
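
A rough sketch of that substitution, assuming POSIX <regex.h> in place of httpd's regex wrappers. Since POSIX ERE has no \d, the pattern would be written with [0-9]+. expand_backrefs() and worker_matches() are hypothetical names, and only single-digit $N is handled.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand "$N" (single digit N) in tmpl using the capture offsets in pm,
 * which index into subject. Returns 0 on success, -1 if out is too small.
 * Illustrative only; this is not ap_expr_str_exec_re() or any httpd API.
 */
static int expand_backrefs(const char *tmpl, const char *subject,
                           const regmatch_t *pm, size_t nmatch,
                           char *out, size_t outlen)
{
    size_t o = 0;

    if (outlen == 0)
        return -1;
    while (*tmpl) {
        if (tmpl[0] == '$' && tmpl[1] >= '0' && tmpl[1] <= '9') {
            size_t n = (size_t)(tmpl[1] - '0');
            tmpl += 2;
            /* unset or out-of-range groups expand to nothing */
            if (n < nmatch && pm[n].rm_so >= 0) {
                size_t len = (size_t)(pm[n].rm_eo - pm[n].rm_so);
                if (o + len >= outlen)
                    return -1;
                memcpy(out + o, subject + pm[n].rm_so, len);
                o += len;
            }
        }
        else {
            if (o + 1 >= outlen)
                return -1;
            out[o++] = *tmpl++;
        }
    }
    out[o] = '\0';
    return 0;
}

/* Returns 1 if uri, matched by pattern and expanded through tmpl,
 * yields exactly candidate_url; 0 otherwise. */
static int worker_matches(const char *pattern, const char *tmpl,
                          const char *uri, const char *candidate_url)
{
    regex_t re;
    regmatch_t pm[10];
    char expanded[256];
    int ok = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, uri, 10, pm, 0) == 0
        && expand_backrefs(tmpl, uri, pm, 10, expanded, sizeof(expanded)) == 0)
        ok = (strcmp(expanded, candidate_url) == 0);
    regfree(&re);
    return ok;
}
```

In the real code the regex would not be run a second time; the vector saved by proxy_trans() would be reused, which is why the request has to be available.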

> Instead, we store as is and add a simple char flag
> which sez if the stored name is a regex. Much savings.
>
> And I have no idea why storing with $1 -> * somehow makes
> things easier or implies a "different way how to match the worker".

Do we need to provide a way to escape (application/legitimate) $N in
the worker name, or simply document the limitation?
In the latter case this is indeed much simpler.
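
For illustration, a sketch of the "store as is plus a flag" variant, where "$N" in the stored name is treated like "*" at match time. The toy_* names and match_dollar_wild() are hypothetical, and whole strings are compared for simplicity. There is no escape mechanism here, so a literal "$5" in a worker name would also act as a wildcard, which is the limitation in question.

```c
#include <string.h>

/* Match url against name, where each "$N" (N a digit) in name acts like
 * a "*" wildcard matching any run of characters, including none. */
static int match_dollar_wild(const char *name, const char *url)
{
    if (*name == '\0')
        return *url == '\0';
    if (name[0] == '$' && name[1] >= '0' && name[1] <= '9') {
        /* try every possible length for the wildcard */
        for (;;) {
            if (match_dollar_wild(name + 2, url))
                return 1;
            if (*url == '\0')
                return 0;
            url++;
        }
    }
    return *name == *url && match_dollar_wild(name + 1, url + 1);
}

/* Toy worker name record, NOT the real shared-memory layout. */
typedef struct toy_worker_name {
    const char *name;    /* stored as configured, "$N" included */
    char is_regex;       /* the "simple char flag" */
} toy_worker_name;

static int toy_worker_name_matches(const toy_worker_name *w, const char *url)
{
    return w->is_regex ? match_dollar_wild(w->name, url)
                       : strcmp(w->name, url) == 0;
}
```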

>
> Finally, let's think about this deeper...
>
> Assume we do have
>
>         ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
>         ProxyPassMatch ^/zippy/(\d+)/bar.jpg http://x/$1/omar/propjoe.gif
>
> is the intent/desire to have 2 workers or 1? A worker is, in
> some ways, simply a nickname for the socket related to a host and port.

For which connections can be reused, different parameters apply...

> Maybe, in the interests of efficiency and speed, since regexes
> are slow as it is, a condition could be specified (a limitation,
> as it were), that when using PPM, only everything up to
> the 1st potential substitution is considered a unique worker.

That could be (another) limitation.
But one may want to apply different parameters to these somehow
different URLs, since they may be different backends/applications too.
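
To show what the "up to the 1st substitution" rule would mean for the two directives above, a small sketch follows. worker_key_len() and same_worker_key() are hypothetical helpers, not existing httpd functions.

```c
#include <stddef.h>
#include <string.h>

/* Length of url up to, but not including, the first "$N" (N a digit).
 * Without any "$N" this is the full length of url. */
static size_t worker_key_len(const char *url)
{
    const char *p;

    for (p = url; *p; p++) {
        if (p[0] == '$' && p[1] >= '0' && p[1] <= '9')
            break;
    }
    return (size_t)(p - url);
}

/* Two backend URLs share a worker if their keys are identical. */
static int same_worker_key(const char *a, const char *b)
{
    size_t la = worker_key_len(a);
    size_t lb = worker_key_len(b);

    return la == lb && strncmp(a, b, la) == 0;
}
```

Under this rule both http://x/$1/foo.jpg and http://x/$1/omar/propjoe.gif reduce to the key http://x/ and so share one worker, which is exactly where per-URL parameters could no longer differ.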
