apr_hash_t and global scope
Hello modules-dev!

I've encountered a problem with an apr_hash_t in the global scope of my module. Let me explain the use-case a bit. Every time an asset (e.g. an image) is delivered via Apache, the module stores meta information about that file as a simple key/value pair in an apr_hash_t. That variable is part of the global scope of my module and is allocated in the register_hooks section (with the pool available there). If HTML is delivered that references an asset with meta information available, the module rewrites the HTML based on the data from the apr_hash_t.

My problem is: using -X (only one worker) everything is fine and the module works as expected. But starting httpd with more than one worker, I sometimes have no data in my apr_hash_t which is expected to be there. I've tried various things, e.g. using the server process pool and child_init to allocate the data, but the issue remains the same. I'm also using no global/thread mutex, because I'm never reading and writing in the same thread (but in theory in the same process) - but I've no idea yet how apr_hash_set is doing it internally, so this might still be a todo (is it? Do I really need a global mutex for data shared across workers/threads? Can I avoid it?). Using memcache is an option in theory, but I'd love to avoid it too; otherwise my module scales much differently, depending on the delivered HTML. But anyway, it's most likely an issue with the wrong pool or with a misunderstanding of the scope and the cleanup of those pools.

What's the best way to use an apr_hash_t shared everywhere? Do I need apr_shm to work properly with the workers (or apr_rmm in my use case) together with a global mutex, or should I try to dig into socache and avoid doing this handling manually? Many outdated resources around the web (and even different internal implementations for the same use-case) made me feel a bit ... doubtful to come to a final decision. Maybe you can help me here?! :-)

Regards,
Ingo
Re: apr_hash_t and global scope
On 2013-12-11 19:17, Ingo Walz wrote:
> Hello modules-dev! I've encountered a problem with an apr_hash_t in the global scope of my module. [...] Using -X (only one worker) everything is fine and the module works as expected. But starting httpd with more than one worker, I sometimes have no data in my apr_hash_t which is expected to be there. [...] But anyway, it's most likely an issue with the wrong pool or with a misunderstanding of the scope and the cleanup of those pools.

I'm making the assumption that you're using Unix and not Windows. I don't think it is related to pools or their cleanup. It is rather because of how Unix processes work. The request that updates your hash is served by a thread in one apache child, and the request that reads from your hash may be served by a thread in another apache child. The problem is that each apache child has its own copy of the hash.

> What's the best way to use an apr_hash_t shared everywhere? Do I need apr_shm to work properly with the workers (or apr_rmm in my use case) together with a global mutex, or should I try to dig into socache and avoid doing this handling manually?

My advice would be to avoid doing it manually.

Regards,
Sorin
Re: apr_hash_t and global scope
Am 12.12.2013 00:27, schrieb Sorin Manolache:
> [...] The request that updates your hash is served by a thread in one apache child, and the request that reads from your hash may be served by a thread in another apache child. The problem is that each apache child has its own copy of the hash.

I'm in a Linux environment, yes - sorry for missing this piece in my initial mail. I thought about this too, and if it's related to the way processes work internally, you're completely right in general - but I'm in a shared object scope. Shouldn't every process use the same reference to the memory segments where my .so is stored? I'm not 100% sure whether the -module parameter in libtool is affecting this behaviour (or some httpd internals), but this should be one of the major things that don't occur with shared libraries but do with static ones.

> > Do I need apr_shm to work properly with the workers (or apr_rmm in my use case) together with a global mutex, or should I try to dig into socache and avoid doing this handling manually?
> My advice would be to avoid doing it manually.

So socache with shared memory (shmcb) is the way you would suggest?

Ingo
Re: ProxyPass in directory... why not??
On Mon, 9 Dec 2013 14:02:33 -0500 Jim Jagielski j...@jagunet.com wrote:
> We know that ProxyPass works in Location, since it was added in http://svn.apache.org/viewvc?view=revision&revision=1026184. However, the very next patch (http://svn.apache.org/viewvc?view=revision&revision=1031758) prevents it from working in Directory (as well as Files, but I'm ignoring that)... My question is: why? From what I can see, the only reason is to close https://issues.apache.org/bugzilla/show_bug.cgi?id=47765, but I again wonder why we allow Location but not Directory? Anyone have any further info on why? And yes, this is related to https://issues.apache.org/bugzilla/show_bug.cgi?id=54965

Just one quick observation: on case-insensitive file systems, we correct /mixedcase/ (the directory name) to /docroot/MixedCase/ - how this would then interact with the proxied user URL becomes an interesting question. I would argue they shouldn't interact, because files and directory should not be in the 'core of cores' but in a separate loadable content-provider module. We know we aren't there right now (and have discussed this before). But over a long history, people misused Location when they desired directory-level ACLs, and this encourages Directory and Files to be similarly confused. That said, I'm -0 on mixing these concepts, but if you can invent a use case for Directory you can just as easily invent a use case for Files.
Re: mod_remoteip
On Mon, 09 Dec 2013 19:52:35 +0100 Reindl Harald h.rei...@thelounge.net wrote:
> the mod_remoteip config looks like below
>
>     RemoteIPHeader X-Forwarded-For
>     RemoteIPProxiesHeader X-Forwarded-For

That config would be bad, and disagrees with the documentation. The RemoteIPProxiesHeader leaves a breadcrumb recording which of the IP addresses were used to derive the apparent origin IP of the request; the apparent origin IP address of the request is the %a value (not a header value); and the RemoteIPHeader continues to preserve any remaining X-Forwarded-For values once the apparent origin IP is not trusted to present an IP address value. Which value - the list consumed, or the list of remaining values - would be undefined if one were foolish enough to write these two distinct values to the same header field.
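Following the objection above (and the documentation excerpt quoted elsewhere in this thread), the two directives must name distinct header fields. A corrected sketch of the configuration might look like this; the RemoteIPInternalProxy range is a hypothetical example, not taken from the original poster's setup:

```apache
# Header the upstream proxies populate with the client IP chain
RemoteIPHeader X-Forwarded-For

# Separate header recording which addresses were trusted and consumed;
# must NOT be the same field as RemoteIPHeader
RemoteIPProxiesHeader X-Forwarded-By

# Declare which upstream proxies are trusted to set X-Forwarded-For
# (example range -- adjust to your own proxy addresses)
RemoteIPInternalProxy 10.0.0.0/8
```

With this split, %a carries the derived client address, X-Forwarded-By records the consumed proxy hops, and any untrusted remainder stays in X-Forwarded-For.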
Re: ProxyPass in directory... why not??
On Mon, Dec 9, 2013 at 2:02 PM, Jim Jagielski j...@jagunet.com wrote:
> We know that ProxyPass works in Location, since it was added in http://svn.apache.org/viewvc?view=revision&revision=1026184. However, the very next patch prevents it from working in Directory (as well as Files, but I'm ignoring that)... My question is: why? [...]

My recollection was that it simply doesn't work (PR and quick test), so I tried to split it into a "don't silently fail to proxy" PR and a "please support this" PR. Proxy stakes its claim to a request in translate_name, which runs after the location walk but before the directory/file walk. So it makes sense that it sees no configuration there, given the current implementation of storing the per-directory ProxyPass in actual per-directory config. The fix would need to include (I think?) some way of normalizing or ignoring cmd->path, which is used instead of the first parameter but doesn't make much sense in terms of Directory. But I am not so invested in any of it, so if something works for you in sandbox, no qualms here with changes.
Re: mod_remoteip
On Mon, 09 Dec 2013 11:10:46 -0800 Mike Rumph mike.ru...@oracle.com wrote:
> As you can see from the bug report, I have been looking into this. It might also be important to consider the related bug 55637: https://issues.apache.org/bugzilla/show_bug.cgi?id=55637

Closed invalid. The incorrect assumptions are very similar to, but distinct from, the 55635 case. In this case, let's use a car's title as its internal proxy document and the car's ignition keys as the trusted proxy document. Although you might trust one party with your car keys, they can go ahead and share those keys with yet another party. We would not want to design the remoteip logic to then let that individual hand another party the title to the vehicle :) Once the InternalProxy list is exhausted and we have begun processing the TrustedProxy list, we can never again assign the next apparent proxy to be an InternalProxy. That would be a claim by an external party to whom we can't assign that much trust.

> The setups so far have not included a RemoteIPProxiesHeader. But if it is included, the mod_remoteip documentation seems to indicate that the value should be different from the RemoteIPHeader: http://httpd.apache.org/docs/trunk/mod/mod_remoteip.html#remoteipproxiesheader
>
>     RemoteIPHeader X-Forwarded-For
>     RemoteIPProxiesHeader X-Forwarded-By

You are correct.

> From my analysis so far it appears that mod_remoteip is behaving as documented. But the documentation is a little difficult to understand.

Correct, and I'm not sure how it can be improved. Feel free to quote, rephrase or build upon my responses to the bug tickets.
Re: Behavior of Host: vs. SNI Hostname in proxy CONNECT requests
On Tue, 26 Nov 2013 09:47:44 +0100 Yann Ylavic ylavic@gmail.com wrote:
> On Tue, Nov 26, 2013 at 9:29 AM, Yann Ylavic ylavic@gmail.com wrote:
>> On Tue, Nov 26, 2013 at 6:31 AM, Kaspar Brand httpd-dev.2...@velox.ch wrote:
>>> On 26.11.2013 00:46, Yann Ylavic wrote:
>>>> Ideas for the appropriate patch to httpd? Scope this fix to CONNECT requests alone, or all forward proxy requests?
>>> Maybe all forward proxy modules are concerned. There is PR 55782, which I started to debug but did not finish (ran out of time). From there it looked like ProxyPassPreserveHost may cause problems too with SNI (not sure yet). Also, 54656.
>> Anyway, shouldn't the proxy code(s) use the Host header from r->headers_in to fill in the SNI (when applicable), instead of r->hostname, at least for the CONNECT and ProxyPassPreserveHost cases? AFAICT, this was the idea in the original patch for PR 53134, but a slightly different approach was then committed (r1333969). As far as PR 55782 is concerned, the problem might be that proxy_util.c:ap_proxy_determine_connection() does not take Host: header differences into account when checking whether an existing connection can be reused (not sure). With SNI this would have the effect that the hostname from the TLS handshake is causing the mismatch with the Host: header. ap_proxy_determine_connection() should only care about IP:port reuse, which can happen even if the Host has changed.
> Oh wait, you are probably right here, the IP:port cannot be reused if the Host has changed with SNI, sorry for the noise...

Right :)

> Using the Host as SNI at first (handshake) for CONNECT/ProxyPreserveHost/subrequests (at least) is still the way to go, IMHO.

Yes, that initial SNI Host -> vhost -> SSL negotiation is all correct and gets us to the appropriate configuration of SSL keys and certs for the SNI connection's host target. The rest of the SNI hostname processing steps are where the problem lies. We still need to perform HTTP headers -> vhost translation after the connection is established.

If there's any desire to do SNI hostname validation, it has to be limited to comparing that hostname to the ServerName/ServerAlias entries, not to the HTTP Host:, which has a different semantic meaning and is the only hostname of interest to httpd as an HTTP server.
multiple load balancers with overlapping members
Hello dev,

Scenario: URL 1 must be handled by one of the first two workers, and URL 2 can be handled by any worker. I would like to achieve uniform balancing across all three workers. An attempt to achieve this is the example below.

What is the expected behavior of the following setup, where there are overlapping members between the two load balancers, as in the example below? Do the balancers act independently of each other (a), or do they take into consideration that they have shared workers (b)? If this does not work as in (b), is there a way to achieve (b)?

    <Proxy balancer://Cluster1>
        BalancerMember http://10.0.0.10:80
        BalancerMember http://10.0.0.11:80
        ProxySet lbmethod=bybusyness
    </Proxy>

    <Proxy balancer://Cluster2>
        BalancerMember http://10.0.0.10:80
        BalancerMember http://10.0.0.11:80
        BalancerMember http://10.0.0.20:80
        ProxySet lbmethod=bybusyness
    </Proxy>

URL 1 is redirected to Cluster 1.
URL 2 is redirected to Cluster 2.

Regards,

Mihai Iacob
DB2 Security Development
DB2 pureScale Development
Phone: (905) 413-5378
Email: mia...@ca.ibm.com
Re: [SPAM?]: Re: Behavior of Host: vs. SNI Hostname in proxy CONNECT requests
On Tue, 26 Nov 2013 18:47:39 +0100 Peter Sylvester peter.sylves...@edelweb.fr wrote:
> Hi:
> On 11/26/2013 06:18 PM, Kaspar Brand wrote:
>> On 26.11.2013 09:29, Yann Ylavic wrote:
>>> Another point is that SNI can not be an IP address, according to RFC 6066:
>>>     3. Server Name Indication
>>>     [...] Literal IPv4 and IPv6 addresses are not permitted in "HostName".
>>> and this is not specifically checked by mod_proxy before filling in the SNI. Shouldn't the SNI be omitted when the Host is missing/empty or an IP address, too?
>> Yes, ssl_engine_io.c:ssl_io_filter_handshake() takes care of that. (I argued for adding this to OpenSSL back in 2009 [1], but one reaction was "is not exactly a nice thing" and "Looks ugly" [2].)
> Since I am the culprit about that hasty response :-) The design for SNI is: the protocol is between the applications. The best thing that the client part in OpenSSL could check is whether the servername is syntactically an FQDN, and the server could validate this. "Well, then someone will ask about validation of I18N names" - not really; punycode names are syntactically normal DNS entries, since by design the I18N mapping was made not to break yesterday's DNS conventions. The cert needs to use the punycode representation in the CN. OpenSSL does not check such things AFAIK. It is not an application-level firewall. For example, there is no code to check whether a hostname matches a certificate, etc.

In fighting with the current logic, I note that we only test the literal CN, without any consideration of wildcards, alt-subject names, etc. Is there any OpenSSL logic to help an application author ask "is this name foo.example.com represented by this certificate?" in CN/alt name, testing wildcard expansion? Writing parsers is easy. Writing them correctly is error prone :)
Re: unsetting encrypted cookies when encryption key changes
On 09 Dec 2013, at 10:50 AM, Thomas Eckert thomas.r.w.eck...@gmail.com wrote:
>> So it should work out of the box.
> I figured as much but was unsure whether I hit a bug or forgot a configuration directive. Will look into it once I have the time :-/

Here is an untested patch, can you give it a try and confirm?

Index: modules/session/mod_session_crypto.c
===================================================================
--- modules/session/mod_session_crypto.c    (revision 1550312)
+++ modules/session/mod_session_crypto.c    (working copy)
@@ -222,7 +222,7 @@
  * Returns APR_SUCCESS if successful.
  */
 static apr_status_t decrypt_string(request_rec * r, const apr_crypto_t *f,
-        session_crypto_dir_conf *dconf, const char *in, char **out)
+        session_crypto_dir_conf *dconf, const char *in, char **out, int *dirty)
 {
     apr_status_t res;
     apr_crypto_key_t *key = NULL;
@@ -252,6 +252,9 @@
         apr_size_t len = decodedlen;
         char *slider = decoded;
 
+        /* if not first passphrase, mark the session as dirty */
+        *dirty = *dirty || (i > 0);
+
         /* encrypt using the first passphrase in the list */
         res = apr_crypto_passphrase(&key, &ivSize, passphrase,
                 strlen(passphrase),
@@ -382,7 +385,7 @@
     if ((dconf->passphrases_set) && z->encoded && *z->encoded) {
         apr_pool_userdata_get((void **)&f, CRYPTO_KEY,
                 r->server->process->pconf);
-        res = decrypt_string(r, f, dconf, z->encoded, &encoded);
+        res = decrypt_string(r, f, dconf, z->encoded, &encoded, &z->dirty);
         if (res != APR_SUCCESS) {
             ap_log_rerror(APLOG_MARK, APLOG_ERR, res, r, APLOGNO(01842)
                     "decrypt session failed, wrong passphrase?");

Regards,
Graham
Re: http_filter.c r1524770 open issue?
On Sat, 23 Nov 2013 19:10:21 +0100 Yann Ylavic ylavic@gmail.com wrote:
> On Sat, Nov 23, 2013 at 6:52 PM, Yann Ylavic ylavic@gmail.com wrote:
>> On Tue, Nov 19, 2013 at 3:27 PM, Yann Ylavic ylavic@gmail.com wrote:
>>> On Mon, Nov 18, 2013 at 6:28 PM, William A. Rowe Jr. wr...@rowe-clan.net wrote:
>>>> By closing our write-end of the connection, we can signal to the server that we can't efficiently forward their response to the client (owing to the fact that the server believes this to be a keep-alive connection, and that we can't know where this specific response ends, until the server has given up on receiving another keep-alive request).
>>> This would be a good way to ensure the connection is closed by the origin, but half-closes are sometimes (and even often) mishandled; the origin might consider the connection fully closed (unwritable) when the read-end is closed, and that could be an issue too. Otherwise, the following patch could do the trick:

Index: modules/http/http_filters.c
===================================================================
--- modules/http/http_filters.c    (revision 1541907)
+++ modules/http/http_filters.c    (working copy)
@@ -291,13 +291,19 @@ apr_status_t ap_http_filter(ap_filter_t *f, apr_bu
      * denoted by C-L or T-E: chunked.
      *
      * Note that since the proxy uses this filter to handle the
-     * proxied *response*, proxy responses MUST be exempt.
+     * proxied *response*, proxy responses MUST be exempt, but
+     * ensure the connection is closed after the response.
      */
-    if (ctx->state == BODY_NONE && f->r->proxyreq != PROXYREQ_RESPONSE) {
-        e = apr_bucket_eos_create(f->c->bucket_alloc);
-        APR_BRIGADE_INSERT_TAIL(b, e);
-        ctx->eos_sent = 1;
-        return APR_SUCCESS;
+    if (ctx->state == BODY_NONE) {
+        if (f->r->proxyreq != PROXYREQ_RESPONSE) {
+            e = apr_bucket_eos_create(f->c->bucket_alloc);
+            APR_BRIGADE_INSERT_TAIL(b, e);
+            ctx->eos_sent = 1;
+            return APR_SUCCESS;
+        }
+        f->r->connection->keepalive = AP_CONN_CLOSE;
+        apr_socket_shutdown(ap_get_conn_socket(f->r->connection),
+                            APR_SHUTDOWN_WRITE);
     }

>> Actually the shutdown would break SSL (which may need to write during read, ~roughly~). Maybe just closing the connection at the end of the response is enough, which is assured by setting connection->keepalive to AP_CONN_CLOSE here and ap_proxy_http_response() setting backend->close below in that case.
> Forget about that (chicken-and-egg problem); the end of the response is the close on the backend side, so there is no way to forcibly do it, just trust...

No, the linger_close logic does all of the read-till-end logic. Yes, closing the write end may cause the connection to forcibly terminate, but we were already in the process of closing that connection. The backend SSL should give up on its failure to read-before-write due to the closed socket. All we need to do is return a 400-class error response to the client, mark the backend connection closed, and close the write side of our backend connection, letting the linger_close pump do the rest of our work. That is, provided that the linger_close logic in the proxy worker pool behaves as expected.
Re: [SPAM?]: Re: Behavior of Host: vs. SNI Hostname in proxy CONNECT requests
On Thu, Dec 12, 2013 at 12:20 AM, William A. Rowe Jr. wr...@rowe-clan.net wrote:
> [...] In fighting with the current logic, I note that we only test the literal CN, without any consideration of wildcards, alt-subject names, etc. Is there any OpenSSL logic to help an application author ask "is this name foo.example.com represented by this certificate?" in CN/alt name, testing wildcard expansion? Writing parsers is easy. Writing them correctly is error prone :)

CN/SubjectAltName and wildcards are addressed in r1425874 (from PR 54030), with a comment noting that X509_check_host() is available in openssl-1.0.2 only.

Hope this helps.
Do pools lead to bad programming?
Hi all,

So I've been spending a fair bit of time inside Apache recently and I've seen a pattern. Consider the following code (from mod_proxy_fcgi.c):

    apr_uri_t *uri = apr_palloc(r->pool, sizeof(*uri));

    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(01076)
                  "url: %s proxyname: %s proxyport: %d",
                  url, proxyname, proxyport);

    if (strncasecmp(url, "fcgi:", 5) != 0) {
        ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(01077)
                      "declining URL %s", url);
        return DECLINED;
    }

That allocation of uri can be moved down (further than the code shown above) until right before it is used. I've seen this in a number of places, and it feels like it is considered OK because the pool is relatively short-lived and, in most cases I've seen so far, the allocation is pretty small. But as a concept this strikes me as bad. If that code was using traditional malloc/free, the above would be a memory leak. I am aware that pools provide protection against such leaks, but nonetheless, that is wasted memory, and although the allocation is quick, also a waste of time.

Am I being too obsessive? If not, would you like patches to correct these as I find them, and if so, should I open a bug about this or just post patches here (they are all likely to be a simple move of 1 or 2 lines)?

Sincerely,
Kean
mod_rewrite and mod_dav_svn
We've recently made a change to mod_dav_svn to start implementing translate_name and map_to_storage hooks in order to prevent r->filename from being set to a bogus path, since Subversion is serving content that isn't actually accessible via the standard file I/O APIs. You can see the reasoning for that here: https://subversion.apache.org/docs/release-notes/1.8.html#mod_dav_svn-fsmap

However, this has had the side effect of creating problems for people who use mod_rewrite. See this email to the d...@subversion.apache.org list for an example: https://mail-archives.apache.org/mod_mbox/subversion-dev/201312.mbox/%3C76026FDD-FF7C-4C71-84EC-3A73F12B6F34%40simonsoft.se%3E

Ultimately, what's happening here is that mod_rewrite changes r->uri, and that results in mod_dav_svn's configuration being inaccurate for what would actually be served. mod_dav_svn continues to believe it should rewrite the path, and hooks map_to_storage to prevent ap_core_translate() from running and producing an r->filename based on the configured DocumentRoot and r->uri. As a result, a mod_rewrite modification of r->uri to something mod_dav_svn doesn't serve produces a 404.

We could ignore translate_name calls that have been messed with by mod_rewrite, like the patch in this email: https://mail-archives.apache.org/mod_mbox/subversion-dev/201312.mbox/%3c52a8b2bf.2000...@reser.org%3E The problem here is that mod_rewrite could change the URI to anything, including a path that mod_dav_svn is responsible for or a path that it isn't. With that patch in place, rewritten paths that mod_dav_svn does serve are back to the original behavior that we made the change to avoid.

I can work around this by triggering an ap_location_walk() and ap_if_walk() in mod_dav_svn's translate_name hook before retrieving my configuration, as I did in the patch on this email: https://mail-archives.apache.org/mod_mbox/subversion-dev/201312.mbox/%3c52a8d7ae.5040...@reser.org%3E However, I'm messing with httpd internals that I don't think I should be, and it also seems like this fix really belongs in mod_rewrite - probably by way of an API call to update the per_dir_config, so the module doesn't have to be kept in sync with whatever is in ap_process_request_internal().

Before anyone says "but you shouldn't be using the per-dir config from a translate_name hook": I'll point out that mod_rewrite itself is accessing its per-directory configuration. Several other modules are accessing per-directory configuration as well (mod_proxy, mod_proxy_express, and mod_log_debug). I'm not sure if this presents an issue for them, since I haven't spent a lot of time looking into their behavior. I can say that they all hook translate_name with APR_HOOK_FIRST, and only mod_proxy is configured to always run after mod_rewrite.

I don't seem to have any better place to put this, since I need to prevent the directory_walk that happens in core_map_to_storage. The new post_perdir_config hook might seem like a good place, but it can't prevent a request from being subject to directory access control.

Thoughts?
Re: Do pools lead to bad programming?
On 12 Dec 2013, at 2:00 AM, Kean Johnston kean.johns...@gmail.com wrote:
> So I've been spending a fair bit of time inside Apache recently and I've seen a pattern. [...] That allocation of uri can be moved down (further than the code shown above) until right before it is used. [...] If that code was using traditional malloc/free, the above would be a memory leak. I am aware that pools provide protection against such leaks, but nonetheless, that is wasted memory, and although the allocation is quick, also a waste of time.

The idea behind pools is that you allocate an arbitrary and unpredictable set of memory, and then free the whole set at some future point in time. This means you don't need to keep track of each and every escape path out of a system and hope you've cleaned up everything; you allocate at will, confident that it will all be freed. Obviously allocating too early and then throwing away the results of the allocation is a waste, as you've pointed out, and should ideally be smoked out and fixed.

> Am I being too obsessive? If not, would you like patches to correct these as I find them, and if so, should I open a bug about this or just post patches here (they are all likely to be a simple move of 1 or 2 lines)?

I'd love to see these things fixed, because they add up. If you post them here they are likely to be reviewed very quickly, as they'll no doubt be simple to review.

Regards,
Graham
Re: Do pools lead to bad programming?
On 12/11/13 4:00 PM, Kean Johnston wrote:
> Am I being too obsessive? If not, would you like patches to correct these as I find them, and if so, should I open a bug about this or just post patches here (they are all likely to be a simple move of 1 or 2 lines)?

There are two ways this sort of thing can happen. The person who wrote the original code didn't feel like breaking the declaration of the variable and the allocation into two lines, which is a mistake. Alternatively, the allocation could have always been necessary and the only difference was the ordering of when it happened; i.e. there may not have been any returns before the use of the variable. This would be an error on the part of the person changing the code. Looking at this specific case, the variable was added in r357444 as part of the original template of mod_proxy_fcgi, so it seems like an error on the original developer's part. It certainly seems worthwhile to clean these up when they are found, in my opinion. I wouldn't say that pools lead to bad programming. It's mostly that pools limit the consequences of these sorts of mistakes. The mistakes are going to happen regardless.
Re: Do pools lead to bad programming?
I'd love to see these things fixed, because they add up. If you post them here they are likely to be reviewed very quickly, as they'll no doubt be simple to review.

Cool. Here's a patch for the case I just mentioned. It also eliminates an un-needed automatic (yes, I obsess about stack frames too). diff against trunk.

Kean

Index: modules/proxy/mod_proxy_fcgi.c
===================================================================
--- modules/proxy/mod_proxy_fcgi.c	(revision 1550327)
+++ modules/proxy/mod_proxy_fcgi.c	(working copy)
@@ -757,14 +757,11 @@
     char server_portstr[32];
     conn_rec *origin = NULL;
     proxy_conn_rec *backend;
+    apr_uri_t *uri;

     proxy_dir_conf *dconf = ap_get_module_config(r->per_dir_config,
                                                  proxy_module);

-    apr_pool_t *p = r->pool;
-
-    apr_uri_t *uri = apr_palloc(r->pool, sizeof(*uri));
-
     ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(01076)
                   "url: %s proxyname: %s proxyport: %d",
                   url, proxyname, proxyport);
@@ -790,7 +787,8 @@
     backend->is_ssl = 0;

     /* Step One: Determine Who To Connect To */
-    status = ap_proxy_determine_connection(p, r, conf, worker, backend,
+    uri = apr_palloc(r->pool, sizeof(*uri));
+    status = ap_proxy_determine_connection(r->pool, r, conf, worker, backend,
                                            uri, url, proxyname, proxyport,
                                            server_portstr,
                                            sizeof(server_portstr));
@@ -814,7 +812,7 @@
     }

     /* Step Three: Process the Request */
-    status = fcgi_do_request(p, r, backend, origin, dconf, uri, url,
+    status = fcgi_do_request(r->pool, r, backend, origin, dconf, uri, url,
                              server_portstr);

 cleanup:
Re: Behavior of Host: vs. SNI Hostname in proxy CONNECT requests
On 12.12.2013 00:15, William A. Rowe Jr. wrote: The rest of the SNI hostname processing steps are where the problem lies. We still need to perform the HTTP headers -> vhost translation after the connection is established. If there's any desire to do SNI hostname validation, that has to be limited to comparing that hostname to the ServerName/ServerAlias entries, not to the HTTP Host:, which has a different semantic meaning and is the only hostname of interest to httpd as an HTTP server.

The logic in ssl_hook_ReadReq was added with r757373. It wasn't in the initial version of the patch for SNI support (r606190). I didn't find prior discussion of r757373 on the mailing list, but perhaps it was motivated by this statement in RFC 6066 (RFC 4366 at the time): "If the server_name is established in the TLS session handshake, the client SHOULD NOT attempt to request a different server name at the application layer."

I never really understood the reasoning for turning this into "reject the request if the SNI extension and the Host header differ" (see e.g. [1], where I was becoming tired of things said in [2]). Also, I think that SSLStrictSNIVHostCheck is a pretty unnecessary knob.

Kaspar

[1] https://mail-archives.apache.org/mod_mbox/httpd-dev/200903.mbox/%3C49D0EFF7.8030902%40velox.ch%3E
[2] https://mail-archives.apache.org/mod_mbox/httpd-dev/200903.mbox/%3c49ce8500.5060...@apache.org%3E
Re: [SPAM?]: Re: [SPAM?]: Re: Behavior of Host: vs. SNI Hostname in proxy CONNECT requests
On 12/12/2013 12:20 AM, William A. Rowe Jr. wrote:
On Tue, 26 Nov 2013 18:47:39 +0100 Peter Sylvester peter.sylves...@edelweb.fr wrote:
Hi:
On 11/26/2013 06:18 PM, Kaspar Brand wrote:
On 26.11.2013 09:29, Yann Ylavic wrote: Another point is that the SNI cannot be an IP address, according to RFC 6066: "3. Server Name Indication [...] Literal IPv4 and IPv6 addresses are not permitted in HostName." And this is not specifically checked by mod_proxy before filling in the SNI. Shouldn't the SNI also be omitted when the Host is missing/empty or an IP address?

Yes, ssl_engine_io.c:ssl_io_filter_handshake() takes care of that. (I argued for adding this to OpenSSL back in 2009 [1], but one reaction was "is not exactly a nice thing" and "Looks ugly" [2].)

Hi, since I am the culprit of that hasty response :-) The design for SNI is: the protocol is between the applications. The best thing that the client part in OpenSSL could check is whether the servername is syntactically an FQDN, and the server could validate this.

well, then someone will ask about validation of I18N names

Not really: punycode names are syntactically normal DNS entries; by design, the i18n mapping was made not to break yesterday's DNS conventions.

The cert needs to use the punycode representation in the CN

Yes, and the client application sets such a value in the SNI. What I wanted to say is that, at least in our patch to OpenSSL, we did not add a convenience layer allowing an application to present an i18n name in Unicode or UTF-8, create the corresponding value, reject invalid input, and so on. An OpenSSL client can set whatever chain of characters it wants. I mentioned checking for an FQDN just as an example of a can of worms. The API does not pretend to check for a valid DNS name, nor the restriction against IP addresses, nothing. It is the server that has to check for possible garbage. Also, there is no check against whatever is in the returned ... client middleware like curl does this.
Furthermore, for curl, the application also uses a possibly i18n-encoded value as input. OpenSSL does not check such things, AFAIK. It is not an application-level firewall. For example, there is no code to check whether a hostname matches a certificate, etc.

In fighting with the current logic, I note that we only test the literal CN, without any consideration of wildcards, alt-subject names, etc. Is there any OpenSSL logic to help an application author ask "is this name foo.example.com represented by this certificate?" against the CN/alt names, testing wildcard expansion? Writing parsers is easy; writing them correctly is error-prone :)

There used to be no such code in OpenSSL. (There is such code in GnuTLS, AFAIK.) Therefore there is code in client libraries, e.g. curl.

best regards
Re: [SPAM?]: Re: Behavior of Host: vs. SNI Hostname in proxy CONNECT requests
On 12/12/2013 12:15 AM, William A. Rowe Jr. wrote:
On Tue, 26 Nov 2013 09:47:44 +0100 Yann Ylavic ylavic@gmail.com wrote:
On Tue, Nov 26, 2013 at 9:29 AM, Yann Ylavic ylavic@gmail.com wrote:
On Tue, Nov 26, 2013 at 6:31 AM, Kaspar Brand httpd-dev.2...@velox.ch wrote:
On 26.11.2013 00:46, Yann Ylavic wrote: Ideas for the appropriate patch to httpd? Scope this fix to CONNECT requests alone, or all forward proxy requests?

Maybe all forward proxy modules are concerned. There is PR 55782, which I started to debug but did not finish (ran out of time). From there it looked like ProxyPassPreserveHost may cause problems too with SNI (not sure yet). Also, 54656. Anyway, shouldn't the proxy code(s) use the Host header from r->headers_in to fill in the SNI (when applicable), instead of r->hostname, at least for the CONNECT and ProxyPassPreserveHost cases?

AFAICT this was the idea in the original patch for PR 53134, but a slightly different approach was then committed (r1333969). As far as PR 55782 is concerned, the problem might be that proxy_util.c:ap_proxy_determine_connection() does not take Host: header differences into account when checking if an existing connection can be reused (not sure). With SNI this would have the effect that the hostname from the TLS handshake is causing the mismatch with the Host: header.

ap_proxy_determine_connection() should only care about IP:port reuse, which is possible even if the Host has changed. Oh wait, you are probably right here: the IP:port cannot be reused if the Host has changed with SNI. Sorry for the noise...

Right :) Using the Host as SNI at first (handshake) for CONNECT/ProxyPreserveHost/subrequests (at least) is still the way to go, IMHO.

Yes, that initial SNI Host -> vhost -> SSL negotiation is all correct and gets us to the appropriate configuration of SSL keys and certs for the SNI connection's host target.
It gets to the target whether the conf is appropriate or not :-)

The rest of the SNI hostname processing steps are where the problem lies. We still need to perform the HTTP headers -> vhost translation after the connection is established. If there's any desire to do SNI hostname validation, that has to be limited to comparing that hostname to the ServerName/ServerAlias entries, not to the HTTP Host:, which has a different semantic meaning and is the only hostname of interest to httpd as an HTTP server.

This part was always a bit strange. The initial idea was: when the code sees the Host: and there was no SNI, force a renegotiation with the right cert/key.

best regards
Peter