On Wed, Mar 19, 2014 at 9:59 AM, Jan Kaluža <[email protected]> wrote: > On 03/18/2014 02:46 PM, Yann Ylavic wrote: >> >> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <[email protected]> wrote: >>> >>> Wouldn't it be possible to define wildcard workers when the URL is >>> known to be a regexp substitution? >>> For these workers' URLs, the dollars (plus the following digit) could >>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then >>> use ap_strcasecmp_match() against the requested URL. >> >> >> I meant ap_strcmp_match(), this has to be case sensitive... >> > > I've implemented your idea. Can you check the attached patch please? It > fixes the original PR and also ProxyPassMatch with UDS for me. > > If it's OK, I will commit it. >
I took a deeper look into this but I'm afraid it is a more complex story actually... 1. Workers can be defined in <Proxy[Match] ...> sections too. The same should probably be done in proxysection(). 2. Both '*' and '?' are legitimate URL chars. We need a way to escape the original ones in the configured worker's URL, but ap_strcmp_match doesn't handle (un)escaping. So maybe new ap_matchexp_[un]escape() and ap_str[case]cmp_ematch() functions should be implemented. 5. When no $ is used in the worker's URL, an implicit $1 is appended by proxy_trans(). Hence ProxyPassMatch /some/path/with(/capture) http://some.host is equivalent to ProxyPassMatch /some/path/with(/capture) http://some.host$1 So an implicit '*' should be appended to the worker's wildcard URL in the first case (maybe only when the match has a capture but this is just an optimization, * would match empty). 3. I think we should mark the worker as wildcard when it is defined by ProxyPassMatch or a <ProxyMatch> section. So that ap_proxy_get_worker() won't use the costly ap_strcmp_match() for "normal" workers, and won't either trigger false positives due to '*' or '?' in the configured URL (which is not escaped). 4. What about the same URL used both with ProxyPass and ProxyPassMatch, same worker or different ones? IMHO, they should be different workers, and by rewritting the URL to a wildcard one as done in the patch they will be, but we lose the configured name, and eg. the balancer-manager's param "w" would now work only with the rewritten name (which is, at least, non intuitive). A new balancer-manager's param could be added to access wildcard workers (eg. "ww"), since they are different they may be accessed separately, hence I think we need to save the original URL for such case(s). So maybe for point 2.'s mark we could use the configured (original) URL as a new member of the proxy_worker struct (I don't see any reason for it to be shared in proxy_worker_shared but one may object...). This new field would be NULL for "normal" workers. 5. What about wildcard balancers (ProxyPassMatch /foo(/.*) balancer://mycluster/bar$1)? Since the balancers are registered/selected solely by their names (path and everything after is ignored), and then their workers are selected according to the balancer's method (the path still does not matter), I think there is no matching issue here (the requested path, eventually rewritten by proxy_trans(), will finally be appended to the worker's URL as is). Hence the same balancer's URL used both in ProxyPass and ProxyPassMatch can refer to the same balancer (IMHO). Nothing to be done therefore, but I may be missing something... 6. ProxyPassReverseMatch would be welcome too, but that's probably another story... Regards, Yann.
