walk caching to avoid extra authnz
Hi --

The short version of this email is: please, can people review this patch for server/request.c and see if it breaks anything? There are versions for trunk and 2.2.x. Thanks in advance!

http://people.apache.org/~chrisd/patches/walk_cache/

I recently finished setting up mod_dav behind mod_authn_dbd, and while we were testing I noticed that certain WebDAV clients (*ahem, DAVExplorer*) send PROPFIND requests that cause mod_dav to recurse down through the contents of the requested directory, generating sub-requests for each file or directory it encounters. Presumably such clients send a large DAV Depth header value; I didn't really probe the exact request contents.

What surprised me was discovering that each sub-request (or, equally, internal redirect) went through the authnz steps, despite the fact that I had just a single blanket authnz configuration for the entire directory on which I'd enabled mod_dav. This meant that if a user happened to request a directory with, say, 1000 files, each file triggered a mod_authn_dbd request to the database. Clearly, this was going to drag performance into the ground if released as-was, since we have lots of really large directories.

It surprised me in part because in ap_process_request_internal() there are comments and code like:

    /* Skip authn/authz if the parent or prior request passed the authn/authz,
     * and that configuration didn't change (this requires optimized _walk()
     * functions in map_to_storage that use the same merge results given
     * identical input.)  If the config changes, we must re-auth.
     */
    if (r->main && (r->main->per_dir_config == r->per_dir_config)) {
        r->user = r->main->user;
        r->ap_auth_type = r->main->ap_auth_type;
    }
    else if (r->prev && (r->prev->per_dir_config == r->per_dir_config)) {
        r->user = r->prev->user;
        r->ap_auth_type = r->prev->ap_auth_type;
    }

And in make_sub_request():

    /* Start a clean config from this subrequest's vhost.  Optimization in
     * Location/File/Dir walks from the parent request assure that if the
     * config blocks of the subrequest match the parent request, no merges
     * will actually occur (and generally a minimal number of merges are
     * required, even if the parent and subrequest aren't quite identical.)
     */

But, as it turned out, the ap_*_walk() functions only set r->per_dir_config to r->main->per_dir_config (or r->prev->per_dir_config) if the URI or filename exactly matches. The relevant bits look like:

    /* If we have a cache->cached location that matches r->uri,
     * and the vhost's list of locations hasn't changed, we can skip
     * rewalking the location_walk entries.
     */
    if (cache->cached
        && (cache->dir_conf_tested == sec_ent)
        && (strcmp(entry_uri, cache->cached) == 0)) {
        ...
        if (r->per_dir_config == cache->dir_conf_merged) {
            r->per_dir_config = cache->per_dir_result;
            return OK;
        }

This is why all the sub-requests generated by mod_dav, where a sub-request's r->uri might be /foo/bar.html while the main request's r->uri was /foo/, were each getting their own r->per_dir_config, so that ap_process_request_internal() re-ran all the authnz checks for each one.

However, the code in the ap_*_walk() functions for these cases looked like it might be very close to being able to recognize them as not requiring their own privately merged r->per_dir_config. As the functions step through the config sections they detect any variation from the sections which matched for the previous or parent request; if none is found, now_merged is set to the last set of merged configs from the cache and that's then merged onto r->per_dir_config. If they could also detect that r->per_dir_config was the same as cache->dir_conf_merged -- that is, if the previous or parent request started from the same config -- then maybe they could set r->per_dir_config to cache->per_dir_result just like the case where the URI or filenames matched exactly. That's what this patch does, in sum.
There's obviously one big assumption here: that it's OK to have lots of sub-requests and internal redirects pointing to the original request's r->per_dir_config, even if their URIs don't match. So far it seems to work OK for me, but there may be problems I haven't perceived or stumbled on! So, testing and review would be much appreciated.

The changes to the ap_*_walk() functions are pretty clear, I think. The big change is to prep_walk_cache(). The problem I had there was that ap_process_request_internal() resets r->per_dir_config to r->server->lookup_defaults after the first call to ap_location_walk(). Here's a crummy diagram of what seems to happen to the dir_conf_merged and per_dir_result values in the cache as you pass through the initial request and then a sub-request (S = r->server->lookup_defaults):

                       dir_conf_merged    per_dir_result
    init:
    ap_location_walk:  S                  -
Re: walk caching to avoid extra authnz
Hi --

Thanks for taking an initial look at these patches; I reviewed them a bit more as well and did some testing this morning which resulted in a pair of small changes. One of those changes is important; it catches the condition where the current walk finds additional matches beyond those which matched for the previously cached walk. That's a condition which doesn't matter with the current code, where you always get a private r->per_dir_config unless the URI or filename is an exact match with the one in the parent request. But with these patches, it matters because it's another case where you can't reuse the parent's r->per_dir_config.

Based on Justin's and Nick's comments, below, I confess that I feel rather stuck between a rock and a hard place. On the one hand, enabling mod_dav and mod_authn_dbd amounts to a kind of denial-of-service problem: DAV clients can legitimately issue requests that cause large numbers of DB queries, one for each item in a directory. (I suppose mod_autoindex might have similar problems, but without the extra recursive kick in the pants.) On the other hand, as Justin points out, some existing modules like mod_authz_svn may rely on httpd's current behaviour, in which sub-requests with distinct URIs from their parents always go through the authnz steps. Some thoughts on that score at the bottom of this message.

Nick Kew wrote: There's a comment in both 2.2.x and trunk, just at the start of your patch, saying:

    /* Skip authn/authz if the parent or prior request passed the authn/authz,
     * and that configuration didn't change (this requires optimized _walk()
     * functions in map_to_storage that use the same merge results given
     * identical input.)  If the config changes, we must re-auth.
     */

which looks like exactly what your patch is doing. WTF?

Yeah, that's surprising, isn't it? :-) But that comment (not part of the patch, but in the code for a long time) seems not to mean what you might think it means.
The applicable config directives can be identical, but the ap_*_walk() functions still merge a new, private r->per_dir_config for any sub-request or internal redirect where the URI or filename isn't precisely the same as the parent's. That's why things like mod_dav which trigger lots of sub-requests effectively can't be used with any modules that do complicated authnz (mod_authn_dbd, say, or maybe an LDAP thing). In our case, we could see thousands of redundant DB queries from a single DAV request; it's basically a non-starter without some kind of fix.

Second, there are substantial changes in directory_walk which I would expect to affect this. Did you observe the problem behaviour in both 2.2.x and trunk?

In our case, it's actually all about ap_location_walk(), so yes, both 2.2.x and trunk have the same behaviour. It's essentially independent of what kinds of specific transformations and checks the walks do. It has more to do with what happens when a path like /foo/bar/ matches exactly the same set of configuration sections as a parent request like /foo/. From the code comments, you might think it would get the parent's r->per_dir_config, but that's not what happens.

Alternative proposal for this scenario that doesn't involve a possible risk of breaking something: mod_auth_inherit. In anything that isn't a subrequest, it'll return DECLINED. In a subrequest, mod_auth_inherit will set r->user to r->main->user, without reference to any password lookup. If r->main->user is unset it'll return DECLINED. It'll also set an "inherited" token. A corresponding authz hook will implement a "Require inherit" to enable subrequests with "inherited" set to be authorized, and will run ahead of normal authz hooks.
Well, such a thing might work, but at first blush it seems to me like it puts an undue burden on administrators, who'd have to read the fine print to determine that a particular combination of authnz modules with other modules might produce serious denial-of-service consequences on their server, and that they'd need to install and correctly configure additional modules to prevent this.

Justin Erenkrantz wrote: BTW, I expect that this would break Subversion and mod_authz_svn. It is possible to have an authz setup such that I can read /foo, but not /foo/bar -- the *only* way to confirm that someone has access is via the authz hooks (and why SVN has such a penalty for path authz). Yet, if this breaks that, then that's a no-no. So, have you tested that setup and confirmed that it still works? -- justin

Indeed it does break this; thanks for the heads-up. I sort of feel like such modules are relying on less than ideal behaviour on httpd's part. To me at least, it seems like common sense that a config section like this implies one DB authn lookup per request:

    <Location /foo/>
        Dav filesystem
        AuthDigestProvider dbd
        Require valid-user
        ...
    </Location>

The same is true of
Re: walk caching to avoid extra authnz
Hi --

William A. Rowe, Jr. wrote: It so happens I'm starting one of those cycles again right now with the changes to the mis-handling of file matches that Nick(?) corrected in trunk, and I'll study your patch in tandem. Thanks for your work!!!

Much appreciated, but alas, Justin pointed out a serious conflict in mod_authz_svn, and more generally, various modules may exist out there that are also expecting authnz functions to be called for every sub-request that has a different URI/filepath. However, I'm still attached to the idea of the more aggressive walk caching, because it does address some otherwise potentially serious performance issues. Moreover, I'd think these issues are likely to be unexpected and non-intuitive for many users.

Interestingly, mod_dav_svn and mod_authz_svn appear to me to represent both sides of the issue. The mod_dav_svn docs [1] say:

    [E]ven if you haven't configured a module like mod_authz_svn at all,
    the mod_dav_svn module is still asking Apache httpd to run
    authorization checks on every path.  The mod_dav_svn module has no
    idea what authorization modules have been installed, so all it can
    do is ask Apache to invoke whatever might be present.

That's true, and it occurs because dav_svn_authz_read() invokes lots of sub-requests. But suppose that httpd could optimize away all the authnz checks in most of these sub-requests -- then mod_dav_svn wouldn't even need its SVNPathAuthz Off option, which makes dav_svn_authz_read() skip all the sub-requests:

    On the other hand, there's also an escape-hatch of sorts, one which
    allows you to trade security features for speed.  If you're not
    enforcing any sort of per-directory authorization (i.e. not using
    mod_authz_svn or similar module), then you can disable all of this
    path-checking.  In your httpd.conf file, use the SVNPathAuthz
    directive[.]

This optimization in httpd would be possible if it could assume its config files contained a complete description of all the per-directory authnz configurations.
But, as noted in the SVN docs, some modules like mod_authz_svn may provide authnz functions that do per-directory discriminations based on data from outside the httpd config files. They can do this because of the way sub-request per_dir_config data is handled now. Since there's no way for httpd to know about such external per-directory authnz discriminations, we can't add the walk cache optimizations as I first wrote them. Maybe in 3.x they could be added this way, but I'd prefer to see them arrive a little sooner, if at all possible. Thinking about it last night, what I thought might work is for httpd to assume that any module that provides authnz functions is by default unsafe for walk cache optimization. However, the majority -- maybe all -- of the authnz modules in the httpd distribution could explicitly set a flag which indicates that they're safe. If no modules are loaded that are unsafe, then the optimization could be done, otherwise the old behaviour would be the default. Existing external modules would continue to work as before. (Finally, mod_authz_svn might be modified to note whether its AuthzSVNAccessFile external config file contained any per-directory configurations; if not, it could go ahead and declare itself safe. I'm less sure about that since it's not really something I've studied closely.) At any rate, review of the patches is still much appreciated, as well as testing. I'll run them in production for a while here to see if anything shakes out, and maybe sometime I'll get around to looking at how to add the safe opt-in feature ... when time permits! :-) Thanks, Chris. [1] http://svnbook.red-bean.com/nightly/en/svn.serverconfig.httpd.html#svn.serverconfig.httpd.authz.pathauthzoff -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
mod_dbd revisions
Hi --

I've been spending a lot of time staring at mod_dbd and have created four sequential patches for trunk:

http://people.apache.org/~chrisd/patches/mod_dbd_pools_groups/

I've tested each one in both the threads and no-threads cases, making sure it compiles and runs with mod_authn_dbd. I've also tested the cumulative patch fairly extensively, using both ap_dbd_acquire() and ap_dbd_open()/ap_dbd_close() simultaneously (in different threads). I checked the threads and no-threads cases, with DBDPersist on and off for each, making sure all types of connections were always closed at the appropriate time. I also forced dbd_setup() to fail in the child_init hook to test the code path where dbd_setup() is called later after acquiring a mutex.

Would other folks mind taking these out for a trial run? One thing I'd be interested to know is whether multiple drivers work OK; with the cumulative patch it's possible to point one set of hosts to one kind of DB and another set to a different DB. Do the APR DBD drivers play well with each other if they're loaded into the same process?

Another pair of questions I have relate to the netware and beos MPMs and their handling of child_init hook functions and memory pools. Skim down to the *** for details.

Here's a rundown of the changes for the four patch files:

First, a pile of cleanups, including some function and variable renaming. This is primarily for the sake of readability and clarity (and my sanity).

Second, some miscellaneous fixes and changes, including:

- Using ap_log_error() or ap_log_rerror() instead of ap_log_perror(), wherever possible, since without a server_rec log_error_core() ignores messages below the default server log level.

- In the no-threads versions of ap_dbd_[c]acquire(), in the non-persistent case, cleanups are registered passing svr->conn as the data; but that's NULL since it's the non-persistent case.
- In the no-threads persistent case, ap_dbd_open() makes a call to apr_dbd_error() passing arec->driver and arec->handle if apr_dbd_check_conn() fails, but arec/rec is NULL; it should be svr->conn->driver and svr->conn->handle.

- Eliminating code duplication between the threads and no-threads cases.

- Including apr_lib.h and using apr_isdigit() instead of including ctype.h and using isdigit() directly.

- Including apr_want.h and using APR_WANT_MEMFUNC and APR_WANT_STRFUNC.

Third, a fairly major change to deal with the problems identified in PR #39985:

http://issues.apache.org/bugzilla/show_bug.cgi?id=39985

One issue raised by this bug report is that in the non-threaded persistent case, the DB connection is opened using s->process->pool and then dbd_close() is registered as a cleanup on that pool. However, the MPMs never call apr_pool_destroy() on that pool, so the DB connections aren't closed properly on shutdown.

The larger issue raised in the report is that in the threaded, persistent case, segfaults could occur during shutdown. This is ultimately due to a second call to apr_pool_destroy() for a pool that was previously destroyed. More generally, it is due to the fact that there's no easy way to use reslists if you plan to create a sub-pool in the constructor. But creating a sub-pool is often the right thing to do. This patch works around the problem by passing to apr_reslist_create() a pool for which apr_pool_destroy() is never called. For more details, see:

http://marc.theaimsgroup.com/?l=apr-dev&m=116683262613500&w=2

It turns out that the child processes created by MPMs call apr_pool_destroy() on their pchild pools, and this is the pool passed to the child_init hook functions. Currently dbd_setup_init() passes this pool to dbd_setup(), which creates a sub-pool and passes that to apr_reslist_create(). However, luckily, it turns out that s->process->pool, which is created in the main process, is not destroyed when a child process exits.
This means that we can use it to create the reslist, and the reslist's internal cleanup function will never be called. Instead, we explicitly register apr_reslist_destroy() as a cleanup on a sub-pool of pchild which is created in dbd_setup_init(). At shutdown, this cleanup takes care of closing all the DB connections, without causing a double-free situation.

(*** As a sidebar issue, it looks to me as though the netware MPM doesn't call apr_pool_destroy() on the pmain pool it creates and passes to the child_init hook functions. Also, the beos MPM doesn't seem to call the child_init hook functions at all, because it only creates threads. I know nothing about either platform, but I'm making the assumption here that these are both things which should be fixed. If that's a bad assumption, please holler!)

On advice from Nick Kew, I revised dbd_construct() so that it allocates its ap_dbd_t structure out of a sub-pool; this means the reslist won't leak memory over time. The dbd_close() is registered as a cleanup on the sub-pool,
Re: walk caching to avoid extra authnz
Hi -- Chris Darroch wrote: Much appreciated, but alas, Justin pointed out a serious conflict in mod_authz_svn, and more generally, various modules may exist out there that are also expecting authnz functions to be called for every sub-request that has a different URI/filepath. [snip] This optimization in httpd would be possible if it could assume its config files contained a complete description of all the per-directory authnz configurations. But, as noted in the SVN docs, some modules like mod_authz_svn may provide authnz functions that do per-directory discriminations based on data from outside the httpd config files. They can do this because of the way sub-request per_dir_config data is handled now. Since there's no way for httpd to know about such external per-directory authnz discriminations, we can't add the walk cache optimizations as I first wrote them. Maybe in 3.x they could be added this way, but I'd prefer to see them arrive a little sooner, if at all possible. Thinking about it last night, what I thought might work is for httpd to assume that any module that provides authnz functions is by default unsafe for walk cache optimization. However, the majority -- maybe all -- of the authnz modules in the httpd distribution could explicitly set a flag which indicates that they're safe. If no modules are loaded that are unsafe, then the optimization could be done, otherwise the old behaviour would be the default. Existing external modules would continue to work as before. I've revised my patch set for httpd trunk to do something like what's described above; it's available at: http://people.apache.org/~chrisd/patches/walk_cache/ Briefly, the walk cache changes I made earlier are still in place, but sub-requests and internal redirects only share the initial request's per_dir_config if ap_auth_initial_req_only() returns true. Otherwise, they get their own private copy, as before. 
The new function ap_auth_initial_req_only() simply returns the AND'ed value of two optional functions, one each for authn and authz. These functions return true or false depending on whether any authn or authz providers were registered using the old version string -- the one that's unsafe for walk cache optimization. If none were -- as should usually be the case for most installations -- then optimization can proceed.

The main trick here is that instead of the hard-coded version string "0" for authn/z providers, AUTHN/Z_PROVIDER_VERSION_ALL_REQ and AUTHN/Z_PROVIDER_VERSION_INITIAL_REQ are defined. Also, instead of doing ap_lookup_provider() directly, ap_authn_lookup_provider() and ap_authz_lookup_provider() should be used instead. These check first for a better INITIAL_REQ version, then fall back to the ALL_REQ version.

Based on a somewhat speedy evaluation, I don't see any authn/z modules that would require the ALL_REQ version -- that is, anything which works like mod_authz_svn, and discriminates between requests on the basis of URLs (using configuration information other than the normal httpd configuration directives).

I won't check this in anytime soon, because I'd like to hear if anyone can think of additional problems, or has issues with this proposed solution. Fire away!

Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: walk caching to avoid extra authnz
Justin Erenkrantz wrote: Without looking at the code, I'd hope that means that they are "version 1" providers. =) -- justin

Yes, that's what I used, although I wondered if that was ideal, or whether it should be something like "0-initial-req-only" ... but "1" seemed less ugly to me. :-)

However, it occurs to me that I probably have more to do here, to make sure that all modules that have registered hook functions for access_checker, check_user_id, and/or auth_checker have also signed off on the more aggressive walk caching. Sigh ... that I won't get to until the new year, for sure. Still, comments on what's there so far are much appreciated!

Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: 3.0 - Proposed Goals
Hi --

Paul Querna wrote:

- Rewrite the Core to be an Async Event state machine and data router.
- Break the 1:1 mapping of a worker to a single request.
- Change the meaning of MPMs. The problem with MPMs today is they are really mostly platform abstractions -- not just abstractions of the process model itself.
- Build a cleaner configuration system, enabling runtime reconfiguration.
- Experiment with the right way to abstract state machines, multi-threading, and async IO from module developers who want a 'simple world view'.

At a high level, I like these goals! I'd agree with others that it's probably most important to set specific goals which are achievable in a mid-term timeframe (3-6 months?) and focus on those. The process of actually implementing them is likely to spur new ideas, as well. I'm not keen on requiring an XML configuration file format; so long as it's optional, that's OK with me.

Here are a few itches that I'd personally like to see scratched as well; they may or may not align with your proposals:

- Stop passing sub-requests through the potentially expensive authn/z steps if they share the same authn/z configuration as the main request. This would have security implications for certain external modules like mod_authz_svn, which currently expect to be called to do private authorization of each sub-request. However, there are otherwise some serious performance implications when using, say, mod_dav with mod_authn_dbd.[1]

- Provide a generic inter-process data-sharing framework. Currently mod_ssl, mod_auth_digest, mod_ldap, and the scoreboard all use more-or-less independent implementations of shared memory data stores. As someone who maintains a module with yet another such data store, I think a standard interface for such things (beyond apr_rmm) might be useful. Perhaps something key/value based; maybe aligned with memcached somehow? See my final musings below.

- Provide a generic scoreboard interface for use by modules.
The current scoreboard is effectively sized at initial startup to (max MPM processes * max MPM threads). That wastes space, but also provides no way for modules to register their own private threads. As someone who maintains a module with such threads, I'd love to see them in mod_status. I'd also like to see the non-worker threads from an MPM like worker in there too (i.e., listener, start, main); I have a collection of incomplete patches to do this hanging around.

Admittedly, this may be a hard problem: how do you size the scoreboard's block of shared memory if modules can be added at restarts, and might suddenly require extra scoreboard space for their threads? I have no good solution. (Should the scoreboard use the above-mentioned generic data-sharing framework, or not? Perhaps shared memory isn't even the right tool?)

If we're generally moving to an increasingly asynchronous, threaded design, then I think such a scoreboard might also serve as a valuable sanity check during development ("What the heck is that thread doing?"). An API which allowed threads to register their possible states might be valuable; this would allow modules/providers/MPMs to define what states were meaningful to them, rather than trying to define them all in scoreboard.h. I confess I haven't followed the progress of the httpd-proxy-scoreboard branch; maybe there's some work in there that would apply to these issues.

As a long-term goal I think it would be interesting to try to design these interfaces in such a way as to allow them to work between multiple instances of httpd. This obviously heads into the tricky territory of distributed computing, clustering, etc. If one can't permit stale or cached data you may need write replication, a distributed lock manager, leader election schemes, and so forth. This is obviously complex stuff and maybe out of scope for httpd.
Still, I can't help but feel like there's a logical continuum here from the existing httpd 2.x shared memory data stores, to memcached, to a distributed locking and data storage system like Google's Chubby lock service.[2] Moving httpd's existing uses of inter-process data stores to a generic key/value interface might allow us to start with just a default provider that had a shared memory implementation no different than today's. Other providers could then be developed later to replicate the data across a cluster, if so desired.

Chris.

[1] http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=116556310814307&w=2
[2] http://labs.google.com/papers/chubby.html

-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: [PATCH]: Call dbd_setup() for all virtual hosts or create mutex in mod_dbd.c
Nick Kew wrote: Thanks. I've just reviewed both patches, and added them as an attachment to PR#42327 and a proposal in STATUS.

I apologize for joining this thread a little late. I know it's more complicated, but I'm inclined to suggest trying to bring the more comprehensive trunk fixes into 2.2.x. First, can I suggest that PR #42327 be marked as a duplicate of PR #41302? For many of the details, see this comment in particular, and its various references:

http://issues.apache.org/bugzilla/show_bug.cgi?id=41302#c1

Trunk also includes a large number of other fixes, including those for PR #39985. Both PR #39985 and PR #41302 have comments from their respective reporters indicating that the trunk version resolves their problems, and I think all three PRs could be closed as soon as backports are committed to 2.2.x.

Sorry again for the lack of time -- things should ease up for me in a month or two.

Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: [PATCH]: Call dbd_setup() for all virtual hosts or create mutex in mod_dbd.c
Bojan Smojver wrote: If mod_dbd.c from trunk works in 2.2.x, we should just have that instead. No need to carry two different things if the new stuff is backward compatible.

If you need to, you can just drop the mod_dbd.c from trunk into 2.2.x; we do that and it works fine. The main problem is simply that trunk has had a series of substantial patches, all of which can be backported, but need some round tuits, as Nick would say.

PS. I was just responding to a privately asked question about problems with mod_dbd.c in 2.2.x and decided to send the resulting patches to the list, in case those problems were not already addressed.

No problem ... I should have made up the backport proposal ages ago, but have been buried in work until recently, and am still digging my way out.

Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: [PATCH]: Call dbd_setup() for all virtual hosts or create mutex in mod_dbd.c
Nick Kew wrote: I was wondering about that, but reluctant to propose a backport from trunk without doing some more research. If you want to make it a backport proposal, I'll try and get my brain around it (and one or two related issues) in the morning.

The main thing I'd point to is this long explanation of the issues I found:

http://marc.info/?l=apache-httpd-dev&m=116742014418304&w=2

If you follow the links at the bottom of this page, it maps my original patches to what eventually got committed to trunk, along with the comments in SVN:

http://people.apache.org/~chrisd/patches/mod_dbd_pools_groups/

Anyone here using trunk mod_dbd operationally and have anything to report?

We just drop the mod_dbd.c from trunk into 2.2.x and it works fine. I did some really extensive testing of it beforehand, checking all the various combinations of options, and I think it's pretty solid. IIRC, r491884 contains a number of fixes for old problems with somewhat unusual combinations of things, like the use of ap_dbd_acquire() on a non-threaded platform. I tried to check all the combinations of the following:

- with and without threads
- with and without persistent connections
- ap_dbd_[c]acquire(), with release during request or connection cleanup
- ap_dbd_open(), with explicit release using ap_dbd_close()
- failure during setup when creating persistent connection pools

That's off the top of my head, so I may have missed a few things; it's been a while since I thought about it. I'll try to start making up backports but it won't be tomorrow, for sure. :-/

Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: [PATCH]: Call dbd_setup() for all virtual hosts or create mutex in mod_dbd.c
Bojan Smojver wrote: I apologize for joining this thread a little late. I know it's more complicated, but I'm inclined to suggest trying to bring the more comprehensive trunk fixes into 2.2.x. Just a ping on the status of this backport... The proposals are in the 2.2.x STATUS file, but no votes yet (except mine). I know the patches are a little daunting -- normally I wouldn't propose such large changes as backports -- but in this case I felt it was the best option, since the key pools and groups changes are simpler to understand and implement if you've got the other bits of tidying in place first. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: auth dbd pgsql issues
Phil Endecott wrote: http://marc.info/?l=apache-httpd-users&m=118765132424174&w=2 My guess is that I'm encountering the known issues with DBD described here: http://marc.info/?l=apache-httpd-dev&m=116742014418304&w=2 Am I right in thinking that this is fixed in the trunk but not in 2.2.4? What about 2.2.5/6 -- I don't see anything in the 2.2.5 changelog.

I think you're right about the problem you're encountering; the patches for 2.2.x await a third vote, so they're not expected in 2.2.5/6 as things stand at the moment.

- In mod_authn_dbd.c, a couple of global variables are used to point to the dbd_acquire and dbd_prepare functions. Am I right in thinking that this means you can have only one dbd driver for authn? So you can't, for example, have postgresql in one virtual host and mysql in another?

This shouldn't limit the use of multiple drivers (if you're using the mod_dbd from trunk). These mod_authn_dbd globals just point to common mod_dbd functions. Assuming there are no underlying incompatibilities between the drivers (or other bugs), mod_dbd's functions should invoke the relevant driver's functions for each request.

- It looks as if, when a new db connection is created, all prepared statements are prepared on the new connection. However, when a new prepared statement is created, it is not prepared on any existing connections. This is fine as long as all prepared statements are declared before any connections are established. Correct?

Yes. The ap_dbd_prepare() function must be used during the configuration phase only. The trunk code says:

    /* Prepare a statement for use by a client module during
     * the server startup/configuration phase.  Can't be called
     * after the server has created its children (use apr_dbd_*).
     */

- authn_dbd_password() uses the error message "Error looking up %s in database" for 3 different errors. It would be really great to have different messages in each case.
I'd suggest opening a Bugzilla report and, if possible, attaching a patch file with the revised messages you'd like to see (and please add the keyword PatchAvailable in this case). - The mod_authn_dbd docs (http://httpd.apache.org/docs/2.2/mod/mod_authn_dbd.html) claim DBD drivers recognise both stdio-like %s and native syntax. Is this accurate? It seems that the postgresql DBD driver does some type magic based on the character after the %, which wouldn't be possible with the postgresql $1 syntax. Maybe $1 only works for strings (which would be OK for usernames, of course). (Does it correctly count the number of parameters if I use $1?) I seem to recall some type magic in this driver, but I'm not particularly familiar with it. If you encounter problems, I'd again suggest opening a Bugzilla report. - The mod_dbd docs (http://httpd.apache.org/docs/2.2/mod/mod_dbd.html) say that DBDPersist can be 0 or 1; this should be on or off. I'll try to remember to fix that, but if you don't see it in trunk's docs after a while ... submit a Bugzilla report! :-) Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: auth dbd pgsql issues
Guenter Knauf wrote: I think you would increase the chance of another vote if there were only _one_ patch which applies cleanly against the current 2.2.x head; I've started testing with them, but got stopped since one patch of them all didn't apply; therefore I only voted on the first one, which I could test. If possible please create a new diff which includes all changes, and you get my vote at once after I have verified that at least some of the issues I see on threaded NetWare are gone with the patch (which I believe from what I've read so far)... If a single patch file helps, here it is: http://people.apache.org/~chrisd/patches/mod_dbd_pools_groups/mod_dbd-all-2.2.x.patch Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: auth dbd pgsql issues
Phil Endecott wrote: OK; my experience seems to be that in this respect 2.2.4 has regressed compared to 2.2.3 (though I may have been lucky in some way with my 2.2.3 setup) and certainly compared to 2.0.x + the 3rd-party mod_auth_pgsql. I don't know if this affects how the issue is prioritised for inclusion in future versions. I would love to see working authn_dbd ASAP. We have mod_authn_dbd + mod_dbd working but we use the trunk version of mod_dbd.c. It's a drop-in replacement for 2.2.x's mod_dbd.c; you can just copy it into 2.2.4 and recompile. I'd recommend trying that to see if it deals with your concerns; for example, it should handle the original issue I think you had with putting DBD directives into virtual hosts. If you still have problems, they may of course either be previously unnoticed bugs in the driver or elsewhere. But I'd suggest trying trunk's mod_dbd.c as a first step. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: mod_dav hack
Fred Woods wrote: I've written an extension/hack for mod_dav. I would like to know if it would be useful to others and how I might change it to be more portable. The hack hooks the code to store, copy, rename, and remove files. It creates a string representing the operation and the path elements, and writes the string to a Unix domain socket. The simple hook method was chosen so as to have as little effect on the Apache web server as possible. The service I have listening on the Unix domain socket is called davsync. It is a service that replicates the operations onto other servers and selectively checks some types of files into a configured version control system. To make it portable, I assume all I need to do is make the hook socket configurable in the httpd.conf file and handle other types of sockets? Sounds interesting ... can you put it up somewhere so others can take a look? (Hard to know in the abstract if those are the only portability issues.) It would be of interest to me if the store hook could also invoke a content validation provider to, for example, do XML validation against a schema before permitting the file to be changed. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
thoughts on ETags and mod_dav
Hi -- A couple of months ago a short thread started in relation to the PRs #16593 and #38034 (which also references #42987) on the various problems related to ETags: http://marc.info/?l=apache-httpd-dev&m=118831732512678&w=2 http://issues.apache.org/bugzilla/show_bug.cgi?id=16593 http://issues.apache.org/bugzilla/show_bug.cgi?id=38034 http://issues.apache.org/bugzilla/show_bug.cgi?id=42987 I'm not about to tackle these all immediately, but I wanted to send out a couple of proposals and find out what problems folks might see with them. 1) Per #38034, it appears that ap_meets_conditions() treats '*' incorrectly. The patch attached to the PR duplicates ap_meets_conditions() for exclusive use by mod_dav. The only changes are to add checks on resource->exists in two places and to use resource->hooks->getetag instead of looking for an ETag header in r->headers_out. In the interest of avoiding code duplication, I wonder if it would be possible to continue to use ap_meets_conditions() by using r->no_local_copy in place of resource->exists, and by setting/unsetting that and an ETag header in r->headers_out in mod_dav as necessary; e.g., prior to calls to ap_meets_conditions(). However, it looks to me as though r->no_local_copy is used in a few places to bypass the ap_meets_conditions() checks entirely (i.e., to never send a 304 response in certain cases). Does anyone have comments on what the intended purpose of r->no_local_copy is or was, especially in relation to the language of RFC 2616 sections 14.24 and 14.26 ("if '*' is given and any current entity exists for that resource ...")? Would it be more appropriate and/or required by our backporting rules to add another field to request_rec, like r->exists, to indicate to ap_meets_conditions() that a resource does not exist? Or to use r->notes or conceivably r->finfo for the same purpose?
Is it even possible to backport a change to the action of ap_meets_conditions() such that it would start depending on data which third-party callers won't be supplying? Advice is welcome, please! 2) I haven't looked at #16593 in any detail, but I wonder if the change proposed in the patch attached to #38034, where resource->hooks->getetag is used instead of looking for an ETag header in r->headers_out, might not be an attempt to deal with this issue. If so, then whatever the eventual solution for #38034 turns out to be will likely resolve this PR as well. 3) Per #42987, it seems as if there's a subtle difference between the requirements for ETags as described in RFC 2616 section 13.3.3 and the way we currently generate them in ap_make_etag(). When a file's mtime is in the past, ap_make_etag() uses the file's inode, length, and/or mtime to create a strong ETag. However, when the mtime is the same as r->request_time, a weak ETag is generated. As the attachment to the PR reveals, though, this isn't really sufficient when certain sequences of requests occur within a single second. The RFC says that a file's mtime "could" be a weak validator, because the file might change more than once in a second. But I think the subtly important issue here is that word "could". If I understand it correctly, the implication is that one has a resource for which one wants to generate weak ETags only, because one knows the semantic meaning isn't changing although the content may be. In this particular case, the modification time *might* be appropriate to use as a weak validator (although that seems somewhat unlikely, because non-semantic changes that occurred seconds apart would still cause different weak ETags to be generated, which probably isn't what you'd want in this circumstance). But, in the cases handled by ap_make_etag(), the content and/or the semantic meaning of a resource could always be changing, so weak ETags would seem to be, as the PR says, simply never appropriate.
Alas, though, there is probably no way to quickly generate a strong validator like a hash -- which is why we're using inodes, lengths, and mtimes in the first place, of course. So, a modest proposal, which aims to punt the problem to the administrator, and also to resolve the fact that FileETag is provided to configure the actions of ap_make_etag(), while mod_dav_fs's dav_fs_getetag() actions are fixed. The highlights would be:
- refactor ap_make_etag() and dav_fs_getetag() to use a common code base as far as possible
- make dav_fs_getetag() follow the options set by FileETag (if this is a problem, then introduce a DavFileETag directive, but I'm hoping that could be avoided)
- add MD5 and SHA1 as options to FileETag (while obviously much slower, certain applications may require really strong validators); for legacy reasons, these would have to be specifically configured and would not be implied by the All option
- add a nomtime=none|weak|ignore option to FileETag (in the manner of ProxyPass) which specifies what to do when the MTime option is configured but the file's mtime
Re: thoughts on ETags and mod_dav
Hi -- 1) Per #38034, it appears that ap_meets_conditions() treats '*' incorrectly. More precisely, I should say that ap_meets_conditions() isn't designed to support the null resources of RFC 2518 (WebDAV). I'm certainly no expert on these issues, so guidance is welcome. RFC 2616 section 14.24 (and 14.26 is similar) says, "If the request would, without the If-Match header field, result in anything other than a 2xx or 412 status, then the If-Match header MUST be ignored." Thus in the typical case, if a resource doesn't exist, 404 should be returned, so ap_meets_conditions() doesn't need to handle this case at all. RFC 2518 introduces lock-null resources for use with certain methods such as LOCK, PUT, and UNLOCK. (RFC 4918, it seems, deprecates lock-null resources in favour of locked empty resources.) For these particular methods and resources, other language from RFC 2616's section 14.24 (again, 14.26 is similar) comes into play, specifically, "If none of the entity tags match, or if '*' is given and no current entity exists, the server MUST NOT perform the requested method, and MUST return a 412 (Precondition Failed) response." Personally, I'd be inclined to try to make ap_meets_conditions() handle these different situations in as generic a way as possible (in case other non-mod_dav DAV modules or other HTTP extensions need to use the same logic), but that may be difficult to do without introducing compatibility problems and/or contorted code. Another option would be the one described by Paritosh Shah: for this, we can create a new dav_meets_conditions() which calls ap_meets_conditions() and also handles those other cases. Thoughts, corrections, etc.? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: thoughts on ETags and mod_dav
Henrik Nordstrom wrote: On fre, 2007-10-12 at 00:25 -0400, Chris Darroch wrote: RFC 2616 section 14.24 (and 14.26 is similar) says, "If the request would, without the If-Match header field, result in anything other than a 2xx or 412 status, then the If-Match header MUST be ignored." Thus in the typical case, if a resource doesn't exist, 404 should be returned, so ap_meets_conditions() doesn't need to handle this case at all. There is more to HTTP than only GET/HEAD. If-Match: * and If-None-Match: * are quite relevant taking only 2616 into account. Agreed -- by "in the typical case", I didn't mean in all cases covered by RFC 2616, although I don't think I made that particularly clear. I just meant that ap_meets_conditions() was likely written with the methods handled by the core handler in mind; that's why its logic has remained largely unchanged for many years. PUT isn't supported by the core handler, or most typical content handlers. WebDAV happens to bring this issue to light, perhaps because it's increasingly widely utilized. I'm all in favour of trying to make ap_meets_conditions() as generic and correct as possible, as I said: "Personally, I'd be inclined to try to make ap_meets_conditions() handle these different situations in as generic a way as possible." But, to do this it may be necessary to have data available that isn't at present. That's why I was interested in exploring various options, such as using r->no_local_copy. Or, possibly, we could add an r->does_not_exist flag which would be 0 by default; this could be introduced without compatibility problems, I think. Modules which needed to indicate that they were dealing with a resource that didn't exist, e.g., lock-null resources in mod_dav, could set the flag to 1. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: thoughts on ETags and mod_dav
Nick Kew wrote: My vote goes to r->notes. Anything else relies on something with semantic significance that'll risk breaking random things. For the future (e.g. 2.4), we could review other options. For example, abstract out r->finfo to a void* with an API for inspecting resources, including but not limited to those visible in the filesystem. Sounds good; maybe an r->notes key named NON_EXTANT_RESOURCE or NO_RESOURCE or some such? Feel free to ignore my totally half-baked ideas, but mod_dav is not the only area where we have issues with ETags. There's also the whole question of negotiated content to consider, and I'm still confused by the problems that still surround PR#39727. If we're reworking ETag logic, we should take the opportunity to deal with this, too. Can we incorporate some kind of tokenisation into ETags that will express dependence on negotiation? For example, Henrik Nordstrom suggests ";gzip". If we expand that to, say, ";CE=gzip", then we can start to apply semantics to decoding the ETag, and determine equivalence. So in that case, we could determine that the (uncompressed) file does indeed match the ETag. Not sure if there's anything we might do with other negotiated headers, but maybe we could leave that open, providing a hook that something like mod_negotiation might someday use. I confess I've been following this discussion at an abstract level only (at best). I don't think, though, that it directly affects any of the several bugs I'm hoping to tackle, if I understand the issue correctly. But I don't claim to fully understand the details of mod_negotiation. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: thoughts on ETags and mod_dav
Paritosh Shah wrote: Thanks for the patches ... I'll take a look when I get some time. I used resource-exists as the key instead of NON_EXTANT_RESOURCE or NO_RESOURCE as suggested by Chris Darroch, to avoid double negatives. I wanted to use such a term because the default case, when a module does nothing special, is that a resource exists. I think it's confusing if that default case (resource exists) maps to something in r->notes being not defined (i.e., the field does not exist in r->notes). I personally think it's less confusing if those few modules which need to signal that they're dealing with a non-extant resource must set a flag to that effect. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: thoughts on ETags and mod_dav
Paritosh Shah wrote: There are really three states here (w.r.t. ap_meets_conditions()): 1. resource exists 2. resource does not exist 3. nothing is known about the existence of the resource Currently ap_meets_conditions() does not make any assumptions about the existence of the resource (case 3). If we use a NO_RESOURCE flag without additional Y or N values, we can really only cover 2 states (flag is set and flag is not set). In the case that the flag is not set, we cannot directly assume that the resource exists, because this runs the risk of breaking a lot of existing modules which do not set any flags. I don't think it's that complex (although of course I may be missing something). The current code assumes the resource exists. All that's needed, I believe, is a flag which, when present, indicates that the resource does not exist. We must continue to assume that the resource exists when the flag is not present, to avoid breaking things, so I don't see a need to add an explicit "yes, it exists" flag state -- that's just adding confusion, IMHO. In an ideal situation (maybe for httpd 3.0), we could have a flag like r->exists which must be set to either 0 or 1. But, since we want a backportable solution, we're looking at adding an optional flag to r->notes instead. Now, when that flag is not present, the code in ap_meets_conditions() must assume (as it does now) that the resource exists. That's the current behaviour, and it's correct for all cases except those where a module (like mod_dav) needs to indicate that the resource doesn't exist. So our flag is only going to be present at all in a few cases, which means there will be a degree of mystery to the code no matter how you slice the problem. If we use a flag named something like RESOURCE_EXISTS, then what we care about is the case when that flag is set to false, which means we've also got to have a true state as well.
What I dislike about that is the code then has to assume that when the flag isn't present -- i.e., is undefined -- it should proceed as if the flag was set to the true state. But, that runs counter to a lot of other programmatic practice. A SQL expression like WHERE foo = 1 will fail if foo is NULL, for example. Or in (non-strict) Perl, an undefined variable is treated as zero/false/empty/etc. by default. That's why I see a flag named something like NON_EXTANT_RESOURCE as a little better. It doesn't need to have true or false values; its presence alone indicates the opposite case from the default assumption. Anyway, I'll continue to think it over for a bit before I commit anything; please let me know if I've missed something! Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
[PATCH 37533] mod_dbd pool and config handling
Hi -- Nick suggested I start posting more to the list, so this one's on his head. :-) Per the guidelines, the patches themselves are in Bugzilla in bug number 37533. I'd also like to float the suggestion that the DBDKeep and DBDExptime directives be renamed to match the use in mod_proxy, perhaps DBDSoftMax and DBDTimeToLive? (There's a comment in mod_proxy.c that suggests that someone doesn't love the smax parameter name there either, though.) The two patches in Bugzilla are for mod_dbd.c and mod_dbd.h, and implement the following:
- convert expiry/time-to-live values to seconds
- make DBDKeep accept On and Off flag values
- add *_set flags to svr_cfg so that, for example, if the main server config specifies a non-default value, then a virtual host can override it
- using the COND_PARAM(nmin, DEFAULT_NMIN) method would mean that a virtual host with an explicit setting of DEFAULT_NMIN via a DBDMin directive would be overridden by the main server's configuration; see mod_cache for the example I cribbed from
- create a private memory pool for use by the reslist so that threads using the reslist via ap_dbd_acquire, etc., don't conflict with the owner of the pool (i.e., the pool passed in ap_run_child_init to dbd_setup)
- create a private memory pool for each ap_dbd_t, primarily so that prepared statements will be cleaned up when the connection is closed (otherwise they linger until the reslist or s->process->pool is destroyed, as do any cleanup callbacks registered in the apr_dbd_prepare driver code)
- wrap dbd_setup in separate logic for the init and retry cases
- for the retry case, where init failed, create a thread mutex to serialize competing retries by multiple ap_dbd_open calls
- add status values to log messages where possible
- prevent params being dumped to the log file when unable to connect to the DB, since this string may contain a DB password
Comments, flames? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
[PATCH 36090] recursive ErrorDocument msgs
Hi -- I updated my patch for #36090 for 2.2.0; dunno if anyone cares to take a look. I've been applying this patch for some time, with no apparent problems, FWIW. One note about possible things for wiser minds to review, from the tail end of the bug description: The modules/arch/win32/mod_isapi.c file along with mod_rewrite.c, mod_actions.c, and mod_negotiation.c in modules/mappers might all be examined to determine if r->status could be a value other than HTTP_OK prior to their use of these functions [i.e., ap_internal_redirect() and ap_internal_redirect_handler()], since if these functions detect an error and call ap_die(), then ap_die() will think it is handling a recursive error, as is the case with this particular bug. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
[PATCH 38019, 36908] make SetEnv run during post_read_req
Hi -- Well, this may be a sore point, but I'll tackle it anyway, so apologies in advance. The fact that environment variables created with SetEnv are applied during the fixups phase, while SetEnvIf creates its variables during the post_read_request and header_parser phases, does make my life a little more painful. This has been raised in various bug reports, including #36908 and #35611. These are marked won't fix, with the explanation that SetEnv is intended for use by content generators like CGI scripts. However, as I've described in a new bug (sorry!), #38019, I have a particular setup where what I'd like to do is reject all requests that contain a particular HTTP header (in this case, a header injected by hardware that means the request is coming from outside our private network). Here's what I thought I could do:

SetEnv FOO 1
SetEnvIf Http-X-Foo .+ !FOO
<Directory /foo>
    Allow from env=FOO
</Directory>

The logic being, set FOO=1, then unset FOO if the HTTP header is present, and only allow access to a resource if FOO is still present. I attach a patch to the bug report in #38019 that simply moves mod_env's handler up to the post_read_request phase, ahead of mod_setenvif. Now, there may be historical reasons why this isn't desired, and it might, I suppose, also cause existing users' configurations to malfunction. Still, I'd propose it as a 2.3/2.4 change, because it does enable functionality like that I outlined above, and more generally, it makes sense that all of the SetEnv* and PassEnv*, etc., directives would take effect at the same time. If this change isn't accepted, then I would strongly suggest changing the documentation regarding environment variables to indicate the order in which directives take effect; at the moment, that's not clear.
In fact, the current documentation says the following, which may be true for the special purpose environment variables listed on the page, but isn't true in general for variables used in other ways (e.g., in Allow from env): "To make these mechanisms as flexible as possible, they are invoked by defining environment variables, typically with BrowserMatch, though SetEnv and PassEnv could also be used, for example." This section in particular, to my mind, clearly implies to the reader that SetEnv and SetEnvIf take effect at the same time. So does the section on the Allow directive from mod_authz_host, which says: "The third format of the arguments to the Allow directive allows access to the server to be controlled based on the existence of an _environment_variable_. When Allow from env=env-variable is specified, then the request is allowed access if the environment variable env-variable exists. The server provides the ability to set environment variables in a flexible way based on characteristics of the client request using the directives provided by mod_setenvif." The link to the Environment Variables page, and the wording, imply that while mod_setenvif directives may be used to control access based on request characteristics, that's not the only way it could work, and other mechanisms for setting environment variables should work too. If it's going to stay the way it is now for historical and user support reasons, let me know and I'll try to write something more obvious for docs/manual/env.xml, mod/mod_env.xml, mod/mod_authz_host.xml, etc. Thanks! Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: [PATCH 38019, 36908] make SetEnv run during post_read_req
Hi -- I have a particular setup where what I'd like to do is reject all requests that contain a particular HTTP header (in this case, a header injected by hardware that means the request is coming from outside our private network). Here's what I thought I could do:

SetEnv FOO 1
SetEnvIf Http-X-Foo .+ !FOO
<Directory /foo>
    Allow from env=FOO
</Directory>

The logic being, set FOO=1, then unset FOO if the HTTP header is present, and only allow access to a resource if FOO is still present. I thought a bit more about this last night, did some experimentation, and have a different proposal now. :-/ I found that for my particular situation, I can use:

SetEnvIf Http-X-Foo ^$ FOO

which works because SetEnvIf matches an empty regex if either the Http-X-Foo header is present and empty, or if it isn't present at all. I find this somewhat counter-intuitive (albeit useful in this particular case): how can a missing header match against anything? So, here's the proposal. Alas, I am deeply short of round tuits these days, but perhaps in a couple of months I can supply a patch. Comments welcome beforehand, though. 1) Leave mod_env as-is for the moment, but document that it only functions for content generators, and not for any prior part of the request handling process. In a future release (2.4? 3.0?) change the name of the directives to SetEnvCGI, etc. 2) Alter mod_setenvif so that SetEnvIf only matches if a header is present. To my mind, this is simply the single-header equivalent of what you'd want to happen when using a regex to match header names, e.g.: SetEnvIf ^Http-X-.*$ ... I'd presumably want nothing to happen if no headers match the name regex. In the case where I'm using a simple header name, like Http-X-Foo, that should be equivalent to using ^Http-X-Foo$, and thus follow the same logic; if no header is present that matches that name, nothing should happen (i.e., all value string regexes should fail to match, even for ^.*$ and other always-matching regexes).
3) Add to mod_setenvif a new directive, SetEnvPre, which works like this: SetEnvPre env-variable[=value] and simply sets env-variable to the given value (or an empty string if no value is supplied) before any SetEnvIf directives are applied. 4) Add to mod_setenvif another new directive, SetEnvExists, which works like this: SetEnvExists header|header-regex [!]env-variable[=value] [[!]env-variable[=value]] ... That is, just like SetEnvIf, but it only deals with headers (not other request attributes like Request_URI, etc.), and sets or unsets the env-variables based on whether any matching headers exist in the request. With these directives, one could set up something like:

SetEnvPre FOO=false
SetEnvExists ^Http-X-Foo-.*$ FOO=true
SetEnvPre FOO_IGNORE
SetEnvExists Http-X-Foo-Ignore FOO_IGNORE=all
SetEnvIfNoCase Http-X-Foo-Ignore ^header=(.*)$ FOO_IGNORE=$1

so that FOO is false unless at least one Http-X-Foo-* header exists, in which case it's true, and FOO_IGNORE is empty unless the Http-X-Foo-Ignore header exists, in which case FOO_IGNORE is all, or if Http-X-Foo-Ignore has the form header=foo then FOO_IGNORE is foo. Does that all make sense? As I said, comments welcome, and flames also. (What's this?! Santa flambé??) Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
shutdown and linux poll()
Hi -- This may be an old topic of conversation, in which case I apologize. I Googled and searched marc.theaimslist.com and Apache Bugzilla but didn't see anything, so here I am with a question. In brief, on Linux, when doing an ungraceful stop of httpd, any worker threads that are poll()ing on Keep-Alive connections don't get awoken by close_worker_sockets(), and that can lead to the process getting the SIGKILL signal without ever getting the chance to run apr_pool_destroy(pchild) in clean_child_exit(). This seems to relate to this particular choice by the Linux and/or glibc folks: http://bugme.osdl.org/show_bug.cgi?id=546 The backstory goes like this: I spent a chunk of last week trying to figure out why my module wasn't shutting down properly. First I found some places in my code where I'd failed to anticipate the order in which memory pool cleanup functions would be called, especially those registered by apr_thread_cond_create(). However, after fixing that, I found that when connections were still in the 15 second timeout for Keep-Alives, a child process could get the SIGKILL before it finished cleaning up. (I'm using httpd 2.2.0 with the worker MPM on Linux 2.6.9 [RHEL 4] with APR 1.2.2.) The worker threads are poll()ing and, if I'm reading my strace files correctly, they don't get an EBADF until after the timeout completes. That means that join_workers() is waiting for those threads to exit, so child_main() can't finish up and call clean_child_exit() and thus apr_pool_destroy() on the pchild memory pool. This is a bit of a problem for me because I really need join_workers() to finish up and the cleanups I've registered against pchild in my module's child_init handler to be run if at all possible. It was while researching all this that I stumbled on the amazing new graceful-stop feature and submitted #38621, which I see has already been merged ... thank you!
However, if I need to do an ungraceful stop of the server -- either manually or because the GracefulShutdownTimeout has expired without a chance to gracefully stop -- I'd still like my cleanups to run. My solution at the moment is a pure hack -- I threw in apr_sleep(apr_time_from_sec(15)) right before ap_reclaim_child_processes(1) in ap_mpm_run() in worker.c. That way it lets all the Keep-Alive timeouts expire before applying the SIGTERM/SIGKILL hammer. But that doesn't seem ideal, and moreover, doesn't take into account the fact that KeepAliveTimeout values greater than 15 seconds may have been assigned. Even if I expand my hack to wait for the maximum possible Keep-Alive timeout, it's still clearly a hack. Does anyone have any advice? Does this seem like a problem to be addressed? I tried to think through how one could signal the poll()ing worker threads with pthread_kill(), but it seems to me that not only would you have to have a signal handler in the worker threads (not hard), you'd somehow have to break out of whatever APR wrappers are abstracting the poll() once the handler set its flag or whatever and returned -- the APR functions can't just loop on EINTR anymore. (Is it socket_bucket_read() in the socket bucket code and then apr_socket_recv()? I can't quite tell yet.) Anyway, it seemed complex and likely to break the abstraction across OSes. Still, I imagine I'm not the only one who would really like those worker threads to cleanly exit so everything else does ... after all, they're not doing anything critical, just waiting for the Keep-Alive timeout to expire, after which they notice their socket is borked and exit. FWIW, I tested httpd 2.2.0 with the worker MPM on a Solaris 2.9 box and it does indeed do what the Linux bug report says; poll() returns immediately if another thread closes the socket and thus the whole httpd server exits right away. Thoughts, advice? Any comments appreciated. Chris.
-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: shutdown and linux poll()
Paul: This may be an old topic of conversation, in which case I apologize. I Googled and searched marc.theaimsgroup.com and Apache Bugzilla but didn't see anything, so here I am with a question.

In brief, on Linux, when doing an ungraceful stop of httpd, any worker threads that are poll()ing on Keep-Alive connections don't get awoken by close_worker_sockets(), and that can lead to the process getting the SIGKILL signal without ever getting the chance to run apr_pool_destroy(pchild) in clean_child_exit(). This seems to relate to this particular choice by the Linux and/or glibc folks: http://bugme.osdl.org/show_bug.cgi?id=546

To clarify, are you sure it's not using EPoll instead of Poll?

Well, I'll probe more deeply tomorrow, and while I'm no expert on this stuff, I don't think so. Here are the last two lines from an strace on one of the worker threads:

21:39:30.955670 poll([{fd=13, events=POLLIN, revents=POLLNVAL}], 1, 15000) = 1
21:39:42.257615 +++ killed by SIGKILL +++

That's the poll() on descriptor 13, for 15 keep-alive seconds, during which the main process decides to do the SIGKILL. Here, I think, is the accept() that opens fd 13:

21:38:51.017764 accept(3, {sa_family=AF_INET, sin_port=htons(63612), sin_addr=inet_addr("xxx.xxx.xxx.xxx")}, [16]) = 13

and while I do see some epoll stuff, it's on another descriptor:

21:38:43.012242 epoll_create(1) = 12

Now, the caveat here is that I'm learning as I go; sockets are not really my strong point. But it's fairly easy to reproduce this behaviour with a stock Apache 2.0 or 2.2 on a RedHat system; I've tried both. I can certainly provide more details if requested; let me know! Thanks, Chris.
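A small standalone sketch (not httpd code) of what the strace is showing: once the descriptor has been closed, poll() on it returns immediately with POLLNVAL in revents, which is the revents=POLLNVAL visible above. The Linux quirk is only about *when* the blocked poll() wakes up, not about what it reports once it does.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Poll a descriptor that has already been closed.
 * Returns 1 if poll() reported POLLNVAL, 0 otherwise. */
int poll_closed_fd(void)
{
    int pipefd[2];
    struct pollfd pfd;

    pipe(pipefd);
    pfd.fd = pipefd[0];
    pfd.events = POLLIN;
    close(pipefd[0]);           /* the descriptor is now invalid */
    close(pipefd[1]);

    /* POSIX: polling an invalid fd returns immediately, not after
     * the 15000ms timeout, with POLLNVAL set in revents. */
    if (poll(&pfd, 1, 15000) != 1)
        return 0;
    return (pfd.revents & POLLNVAL) != 0;
}
```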
Re: shutdown and linux poll()
Hi --

Paul Querna wrote: To clarify, are you sure it's not using EPoll instead of Poll?

The culprit is the poll() inside apr_wait_for_io_or_timeout(), which is indeed being called from within apr_socket_recv(). The stack is, basically:

apr_wait_for_io_or_timeout()
apr_socket_recv()
socket_bucket_read()
apr_bucket_read()
ap_rgetline_core()
ap_rgetline()
read_request_line()
ap_read_request()
ap_process_http_connection()

Here's the tail of my strace, after I hacked on waitio.c to spit out a write() just before and after polling:

11:47:21.757774 write(15, "about to poll with timeout 15000\n", 33) = 33
11:47:21.757877 close(15) = 0
11:47:21.757943 munmap(0xb7fff000, 4096) = 0
11:47:21.758016 poll([{fd=14, events=POLLIN, revents=POLLNVAL}], 1, 15000) = 1
11:47:33.261025 +++ killed by SIGKILL +++

I'd really love to hear opinions on this. Would anyone like a patch to make ap_reclaim_child_processes() wait first for the maximum configured Keep-Alive period? 
If that's too hacky, then what's the consensus -- ignore the issue, or try to invent a way for the worker (and event?) MPMs to signal their worker threads? It would seem to me that, other than major surgery on APR, the ideal would be for the signal handler to perform this snippet from the tail of worker_thread():

ap_update_child_status_from_indexes(process_slot, thread_slot,
                                    (dying) ? SERVER_DEAD : SERVER_GRACEFUL,
                                    (request_rec *) NULL);
apr_thread_exit(thd, APR_SUCCESS);

or at a bare minimum, the apr_thread_exit(). But I'm not sure offhand if having signal handlers perform thread exits is possible; I feel like it's verboten. Chris.
Re: shutdown and linux poll()
Hi -- I've crafted what seems to me like a reasonably minimal set of patches to deal with the issue I described in this thread: http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=113986864730305&w=2

The crux of the problem is that on Linux, when using httpd with the worker MPM (and probably the event MPM too), hard restarts and shutdowns often end up sending SIGKILL to httpd child processes because those processes are waiting for their worker threads to finish polling on Keep-Alive connections. Apparently, on most OSes, if one thread closes a socket descriptor then other threads polling on it immediately get a return value. This certainly seems to be the case on Solaris. But on Linux, worker threads polling on their sockets in apr_wait_for_io_or_timeout() don't get an error return value until the full (usually 15 second) Keep-Alive timeout period is up. The main httpd process deems that too long, and issues SIGKILL to the child processes.

For me personally, the consequence is that all my nice cleanup handlers registered against the memory pool that's passed during the child_init stage never get called. This is particularly painful if one is hoping to, for example, cleanly shut down DB connections that one has opened with mod_dbd/apr_dbd. In the case of mod_dbd, it opens its reslist of apr_dbd connections against the pool passed in the child_init stage, which with the worker MPM is its pchild pool. When SIGKILL is applied, the apr_pool_destroy(pchild) call is often not reached, so DB disconnections don't occur; even if I'm trying to shut down httpd in a hurry, I don't really want that to happen if at all possible.

Without further ado, then, my initial patches. These are Unix-only at the moment; I have little experience with other OSes. If anyone wants to propose something better, and/or suggest changes, that would be superb. In the meantime, since these work for me, I'll start applying them against APR and httpd for my own use. 
First, the APR patches (against trunk):

===
--- include/apr_network_io.h.orig	2006-02-20 16:20:44.841609000 -0500
+++ include/apr_network_io.h	2006-02-20 16:24:19.99359 -0500
@@ -99,6 +99,7 @@
  * until data is available.
  * @see apr_socket_accept_filter
  */
+#define APR_INTERRUPT_WAIT 65536 /** Return from IO wait on interrupt */
 /** @} */
--- network_io/unix/sockopt.c.orig	2006-02-17 11:24:13.058691778 -0500
+++ network_io/unix/sockopt.c	2006-02-17 11:28:08.910410867 -0500
@@ -318,6 +318,9 @@
         return APR_ENOTIMPL;
 #endif
         break;
+    case APR_INTERRUPT_WAIT:
+        apr_set_option(sock, APR_INTERRUPT_WAIT, on);
+        break;
     default:
         return APR_EINVAL;
     }
--- support/unix/waitio.c.orig	2005-07-09 03:07:17.0 -0400
+++ support/unix/waitio.c	2006-02-17 11:23:42.620856949 -0500
@@ -49,7 +49,8 @@
     do {
         rc = poll(pfd, 1, timeout);
-    } while (rc == -1 && errno == EINTR);
+    } while (rc == -1 && errno == EINTR
+             && (f || !apr_is_option_set(s, APR_INTERRUPT_WAIT)));
     if (rc == 0) {
         return APR_TIMEUP;
     }
===

Second, the httpd patches (also against trunk):

===
--- server/mpm/worker/worker.c.orig	2006-02-20 16:26:55.302701000 -0500
+++ server/mpm/worker/worker.c	2006-02-20 16:46:44.764980568 -0500
@@ -213,6 +213,19 @@
  */
 #define LISTENER_SIGNAL SIGHUP
 
+/* The WORKER_SIGNAL signal will be sent from the main thread to the
+ * worker threads after APR_INTERRUPT_WAIT is set true on their sockets.
+ * This ensures that on systems (i.e., Linux) where closing the worker
+ * socket doesn't awake the worker thread when it is polling on the socket
+ * (especially in apr_wait_for_io_or_timeout() when handling
+ * Keep-Alive connections), close_worker_sockets() and join_workers()
+ * still function in a timely manner and allow ungraceful shutdowns to
+ * proceed to completion.  Otherwise join_workers() doesn't return
+ * before the main process decides the child process is non-responsive
+ * and sends a SIGKILL.
+ */
+#define WORKER_SIGNAL AP_SIG_GRACEFUL
+
 /* An array of socket descriptors in use by each thread used to
  * perform a non-graceful (forced) shutdown of the server. */
 static apr_socket_t **worker_sockets;
@@ -222,6 +235,7 @@
     int i;
     for (i = 0; i < ap_threads_per_child; i++) {
         if (worker_sockets[i]) {
+            apr_socket_opt_set(worker_sockets[i], APR_INTERRUPT_WAIT, 1);
             apr_socket_close(worker_sockets[i]);
             worker_sockets[i] = NULL;
         }
@@ -822,6 +836,11 @@
     ap_scoreboard_image->servers[process_slot][thread_slot].generation = ap_my_generation;
     ap_update_child_status_from_indexes(process_slot, thread_slot, SERVER_STARTING,
[PATCH] #39275 MaxClients on startup [Was: Bug in 2.0.56-dev]
Hi -- Alexander Lazic wrote: After 'make install' i started apache, then some seconds later i got the message '...MaxClients reached...' but there was no entry in the access log, and nobody have make a request to this server.

Jeff Trawick wrote: There are problems accounting for child processes which are trying to initialize that result in the parent thinking it needs to create more children. The less harmful flavor is when it thinks (incorrectly) it is already at MaxClients and issues the "reached MaxClients" message. More disturbing is when MaxClients is very high and the parent keeps creating new children using exponential ramp-up. That can be very painful.

I have been seeing something similar with 2.2.0 using the worker MPM, where with the following settings, I get over 10 child processes initializing immediately (e.g., up to 15), and then they drop back to 10. I see the "server reached MaxClients" message as well right after httpd startup, although nothing is connecting yet.

<IfModule mpm_worker_module>
    StartServers        10
    MaxClients         150
    MinSpareThreads     25
    MaxSpareThreads    100
    ThreadsPerChild     10
</IfModule>

In my case, the problem relates to how long the child_init phase takes to execute. I can tune this by raising DBDMin (and DBDKeep) so that mod_dbd attempts to open increasingly large numbers of DB connections during child_init. With DBDMin set to 0 or 1, all is well; no funny behaviour. Up at DBDMin and DBDKeep at 3, that's when (for me) things go pear-shaped.

In server/mpm/worker/worker.c, after make_child() creates a child process it immediately sets the scoreboard parent slot's pid value. The main process goes into server_main_loop() and begins executing perform_idle_server_maintenance() every second; this looks at any process with a non-zero pid in the scoreboard and assumes that any of its worker threads marked SERVER_DEAD are, in fact, dead. 
However, if the child processes are starting slowly because ap_run_child_init() in child_main() is taking its time, then start_threads() hasn't even been run yet, so the threads aren't marked SERVER_STARTING -- they're just set to 0 as the default value. But 0 == SERVER_DEAD, so the main process sees a lot of dead worker threads and begins spawning new child processes, up to MaxClients/ThreadsPerChild in the worst case.

In this case, when no worker threads have started yet, but all possible child processes have been spawned (and are working through their child_init phases), then the following is true and the "server reached MaxClients" message is printed, even though the server hasn't started accepting connections yet:

else if (idle_thread_count < min_spare_threads) {
    /* terminate the free list */
    if (free_length == 0) {

I considered wedging another thread status into the scoreboard, between SERVER_DEAD (the initial value) and SERVER_STARTING. Then make_child() would set all the thread slots to this value and start_threads() would later flip them to SERVER_STARTING after actually creating the worker threads. That would have various ripple effects on other bits of httpd, though, like mod_status and other MPMs, etc.

So instead I tried adding a status field to the process_score scoreboard structure, and making the following changes to worker.c such that this field is set by make_child() to SERVER_STARTING and then changed to SERVER_READY once the start thread that runs start_threads() has done its initial work. During this period, while the new child process is running ap_run_child_init() and friends, perform_idle_server_maintenance() just counts that child process's worker threads as all being effectively in SERVER_STARTING mode. Once the process_score.status field changes to SERVER_READY, perform_idle_server_maintenance() begins to look at the individual thread status values. Any thoughts? 
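The accounting problem can be boiled down to a toy model (not httpd code; the enum values, THREADS_PER_CHILD, and function names here are illustrative): a freshly forked child's worker_score slots are still zeroed, and 0 happens to equal SERVER_DEAD, so a naive count sees every thread as dead, while a per-process state field lets the maintenance loop treat a slow-starting child as "about to be ready".

```c
#include <assert.h>
#include <string.h>

enum { SERVER_DEAD = 0, SERVER_STARTING, SERVER_READY };
#define THREADS_PER_CHILD 10

/* Count threads a naive p_i_s_m() would consider dead. */
int count_dead_naive(const int *status)
{
    int i, dead = 0;
    for (i = 0; i < THREADS_PER_CHILD; i++)
        if (status[i] == SERVER_DEAD)
            dead++;
    return dead;
}

/* Same count, but a per-process mpm_state lets children still in
 * child_init be treated as "all threads will be ready soon". */
int count_dead_with_mpm_state(const int *status, int mpm_state)
{
    if (mpm_state == SERVER_STARTING)
        return 0;               /* bypass: child_init still running */
    return count_dead_naive(status);
}
```

With the zeroed scoreboard, the naive count reports every slot dead and the parent forks more children; with the process-level state checked first, the same scoreboard contributes no "dead" threads until the child flips itself to READY.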
The patch in Bugzilla doesn't address other MPMs that might see the same behaviour (event, and maybe prefork?): http://issues.apache.org/bugzilla/show_bug.cgi?id=39275

It also doesn't necessarily play ideally well with the fact that new child processes can gradually take over thread slots in the scoreboard from a gracefully exiting old process -- the count of idle threads for that process will be pegged (only by perform_idle_server_maintenance()) at ap_threads_per_child until the new process creates its first new worker thread. But that may be just fine. I'll keep poking around and testing and maybe a better idea will present itself. Chris.
Re: [PATCH] #39275 MaxClients on startup [Was: Bug in 2.0.56-dev]
Hi -- Someone tried to send me a fax in the middle of the night, so I've been up for a while and I think I've realized there are several subtle contention issues involved with any fix for this issue. First of all, I should note that my initial patch in Bugzilla has a flaw; it needs an else clause around most of the logic in perform_idle_server_maintenance(). With that in place, it works pretty well for me. However, there's a theoretical race condition that I'll describe below.

To summarize where we stand, we're trying to avoid the situation where the parent process creates a worker MPM child, and then before the child gets a chance to run very far, the parent's perform_idle_server_maintenance() routine (hereafter p_i_s_m(), per Greg Ames) checks and sees no/few running worker threads for that child and spawns another child.

Jeff Trawick suggested a process_score.mpm_state field used much the way my patch uses it. The parent sets it to SERVER_STARTING after creating the child in make_child(), and the child sets it to SERVER_READY when some workers have been created. Then p_i_s_m() can check it and, if not READY, bypass the count of workers and assume they'll be ready soon.

Greg Ames preferred the other approach I initially mentioned: using the worker_score.status field only and adding a new status value; let's call it SERVER_INIT. In this case, either the parent or the child would perform this logic shortly after the child was created:

for (i = 0; i < ap_threads_per_child; i++) {
    if (status != SERVER_GRACEFUL && status != SERVER_DEAD) {
        ap_update_child_status_from_indexes(slot, i, SERVER_INIT, NULL);
    }
}

Then p_i_s_m() would accept this worker state as meaning "I'll be ready soon". Sticking with the latter approach, I'll try to explain the contention issue I see. 
Fundamentally, it comes about because normally only the parent changes process_score fields, while only the child changes worker_score fields (synchronizing between its threads and sometimes checking for a GRACEFUL or DEAD status from another child's threads).

Suppose the child runs the SERVER_INIT setup code above in child_main(), before ap_run_child_init(). It seems to me that it's at least theoretically possible that the parent's p_i_s_m() still gets scheduled first by the OS and concludes that another child needs to be created.

Suppose instead that the parent does the setup code in make_child(), right after successfully forking. Then p_i_s_m() can't see un-INIT worker status values. However, it's conceivable that right after make_child() does its check for != GRACEFUL and != DEAD, the child's threads get scheduled and set the worker status to READY; this then gets clobbered forever by the parent.

One option might be to use the apr_atomic_cas32() routines on the worker status fields -- perhaps only required in make_child(), even. That might be viable, but I confess I can't see all the ramifications right now, and those status fields are updated in lots of places via ap_update_child_status_from_indexes(). Let's call this option A.

Reviewing the former approach, the same kinds of contention exist because make_child() sets mpm_state to STARTING and then the child's start_threads() sets it to READY; if scheduling causes them to be reversed, the child will always be treated as having a full complement of about-to-be-ready workers. It might be straightforward to use apr_atomic_cas32() here since we're the only users of the field. Let's call this option B.

Still, based on scoreboard.h, I think it makes sense if the parent is the only one to edit process_score fields:

/* stuff which the parent generally writes and the children rarely read */
typedef struct process_score process_score;

Option C is somewhat more complex, but perhaps handles a number of different issues. 
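The compare-and-swap idea behind option B can be sketched standalone, using C11 atomics as a stand-in for apr_atomic_cas32() (the enum names and helper are illustrative, not proposed httpd code): the parent installs STARTING only if the field still holds its initial value, so a child that has already advanced to READY can't be clobbered even if the parent's write is scheduled late.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum { MPM_INIT = 0, MPM_STARTING, MPM_READY };

/* Parent side: STARTING only wins over the initial value.
 * Returns 1 if the CAS succeeded, 0 if the child got there first. */
int parent_mark_starting(_Atomic uint32_t *mpm_state)
{
    uint32_t expected = MPM_INIT;
    return atomic_compare_exchange_strong(mpm_state, &expected, MPM_STARTING);
}
```

A plain store in make_child() would overwrite READY with STARTING under the reversed scheduling described above; the CAS makes that ordering harmless.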
Both a process_score.mpm_state and a worker_score.internal field are added. The child process maintains two additional worker_score structures at ap_threads_per_child and ap_threads_per_child+1; these are marked internal and correspond to the listener and start threads. They just set their status to STARTING, READY, GRACEFUL, and DEAD as appropriate. No other changes to the child process are needed, I believe.

The parent process sets mpm_state to STARTING in make_child(), and follows the logic of option B, except that atomics aren't needed. That's because in p_i_s_m() it checks the child's start thread's internal worker_score structure, and if it is READY, then it flips the mpm_state to READY. Otherwise, if mpm_state is STARTING, it does the bypass logic and assumes that all workers will be ready soon.

This seems to me to have the advantage that only the parent edits process_score and only the child edits worker_score, which is conceptually clearer and reduces the ways contention issues can creep in. It does introduce some minor changes to make sure there's room in the scoreboard for the extra two worker_score structures.
Re: [PATCH] #39275 MaxClients on startup [Was: Bug in 2.0.56-dev]
Hi --

for (i = 0; i < ap_threads_per_child; i++) {
    if (status != SERVER_GRACEFUL && status != SERVER_DEAD) {
        ap_update_child_status_from_indexes(slot, i, SERVER_INIT, NULL);
    }
}

[snip] ... after make_child() does its check for != GRACEFUL and != DEAD ...

After heading back to bed I realized this should be, of course:

if (status == SERVER_GRACEFUL || status == SERVER_DEAD)

Never cut-'n'-paste when you're tired!

Re my option C, it also occurs to me that instead of squeezing the worker MPM's start and listener threads into extra, internal worker_score structures, it might be more appropriate to create a new section of the scoreboard with non_worker_score structures. These would have pid, tid, and status fields only. The worker and event MPMs would use these to track their non-worker threads; and the parent process for these MPMs could monitor them as per option C to decide when the child process's workers were ready to be counted. It would also solve an outstanding problem for me with a module I maintain that uses its own private threads; there's nowhere in the scoreboard that I know of where I can register them. That might also intersect with this conversation: http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=112902736905643&w=2

A possible set of calls for managing these might be:

num = ap_register_non_worker_thread(slot, tid, status)
ap_update_non_worker_status_from_indexes(slot, num, status)
ap_unregister_non_worker_thread(slot, num)
ret = ap_get_scoreboard_non_worker(slot, num, status)

The child process registers its non-worker threads, making sure that it registers one of them as STARTING immediately after ap_reopen_scoreboard() so that it precedes ap_run_child_init(). This will be guaranteed to go into the [slot][0] structure. It can then set that to READY once it's ready for its workers to be counted individually. 
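A hypothetical sketch of how that proposed API could behave -- the structure layout, array sizes, and return conventions here are my guesses at the proposal's intent, not committed httpd code:

```c
#include <assert.h>
#include <sys/types.h>

#define MAX_NON_WORKERS 2       /* e.g. start + listener threads per child */
#define MAX_SLOTS 16            /* illustrative stand-in for server_limit */

typedef struct {
    pid_t pid;
    unsigned long tid;
    int status;                 /* STARTING, READY, GRACEFUL, DEAD, ... */
    int in_use;
} non_worker_score;

static non_worker_score nw_board[MAX_SLOTS][MAX_NON_WORKERS]; /* [slot][num] */

/* First registration for a slot lands in [slot][0], as the text assumes. */
int ap_register_non_worker_thread(int slot, unsigned long tid, int status)
{
    int num;
    for (num = 0; num < MAX_NON_WORKERS; num++) {
        if (!nw_board[slot][num].in_use) {
            nw_board[slot][num].tid = tid;
            nw_board[slot][num].status = status;
            nw_board[slot][num].in_use = 1;
            return num;
        }
    }
    return -1;                  /* no free non-worker entries */
}

int ap_update_non_worker_status_from_indexes(int slot, int num, int status)
{
    if (!nw_board[slot][num].in_use)
        return -1;
    nw_board[slot][num].status = status;
    return 0;
}

int ap_unregister_non_worker_thread(int slot, int num)
{
    if (!nw_board[slot][num].in_use)
        return -1;
    nw_board[slot][num].in_use = 0;
    return 0;
}

/* Error return lets mod_status iterate until it runs out of entries. */
int ap_get_scoreboard_non_worker(int slot, int num, int *status)
{
    if (num < 0 || num >= MAX_NON_WORKERS || !nw_board[slot][num].in_use)
        return -1;
    *status = nw_board[slot][num].status;
    return 0;
}
```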
Then the parent process can monitor that structure's status using ap_get_scoreboard_non_worker(), and do the "wait till all workers are running" thing in p_i_s_m(). Meanwhile, mod_status can call ap_get_scoreboard_non_worker() until it returns an error code, and then it'll know it's checked all the registered non-workers; these can be displayed in the report.

Finally, a question without checking the code first ... I notice that worker.c has the following, taken from prefork.c:

/* fork didn't succeed.  Fix the scoreboard or else
 * it will say SERVER_STARTING forever and ever */
ap_update_child_status_from_indexes(slot, 0, SERVER_DEAD, NULL);

Is that right? Should it perhaps cycle through and set all non-DEAD workers to DEAD? I'm thinking that in prefork, threads_limit is set to 1, so mod_status only checks the first thread in a slot, but here it'll check all of them. Chris.
Re: [PATCH] #39275 MaxClients on startup [Was: Bug in 2.0.56-dev]
Colm:

The worker and event MPMs would use these to track their non-worker threads; and the parent process for these MPMs could monitor them as per option C to decide when the child process's workers were ready to be counted.

+1, I think this could be very useful. I came across the same kind of problems when looking at handling the odd mod_cgid race conditions when doing graceful stops, but in the end didn't solve them (we still have slightly different death semantics between prefork and worker-like MPMs for cgi termination).

Great, I'll get started on this first thing tomorrow. I'm away Thurs-Mon this week but with luck will have some patches before I go for all these co-mingled issues.

Finally, a question without checking the code first ... I notice that worker.c has the following, taken from prefork.c:

/* fork didn't succeed.  Fix the scoreboard or else
 * it will say SERVER_STARTING forever and ever */
ap_update_child_status_from_indexes(slot, 0, SERVER_DEAD, NULL);

Is that right? Should it perhaps cycle through and set all non-DEAD workers to DEAD? I'm thinking that in prefork, threads_limit is set to 1, so mod_status only checks the first thread in a slot, but here it'll check all of them.

I'm not sure what you mean. As in, all potential workers globally? That child isn't going to have any actual workers, since child_main() never got run.

What I noticed is that the prefork MPM seems to consistently use the first worker_score in the scoreboard for all its record-keeping; that makes sense because its thread_limit is always 1. So in make_child(), it sets that worker_score to SERVER_STARTING, does the fork(), and if that fails, resets the worker_score to SERVER_DEAD. No problems there. However, the code seems to have been copied over into the worker MPM, and that particular if-fork()-fails bit still does the same business, even though other things have changed around it. 
In particular, multiple worker_score structures are in use for each child (ap_threads_per_child of them), and make_child() doesn't begin by setting them all to SERVER_STARTING -- or even one of them -- so having it subsequently handle the fork()-fails error condition by resetting the first one to SERVER_DEAD seems incorrect.

Further, make_child() may be creating a child to replace a gracefully exiting one, and if so, it shouldn't actually touch any of the worker_score statuses. That's because what start_threads() in the child does -- in the case where fork() succeeds -- is to watch for threads in the old process that have marked their status as SERVER_GRACEFUL and only then initiate new workers to replace them. So in fact, if fork() fails, such old workers may still be executing and their status values may not yet be SERVER_GRACEFUL. As things stand, if one of them happens to have a scoreboard index of 0, its value will get overwritten by make_child().

I think probably the right thing is for make_child() just to do nothing here. That makes the same assumption the rest of the code does, namely, that worker threads will reach their normal termination and set their statuses to SERVER_DEAD or SERVER_GRACEFUL; and if a child process does fail, then the parent process will notice in server_main_loop() and set all of the child's threads' statuses to SERVER_DEAD.

Anyway, I'll include that change in my patchset for review; I may of course have missed something important. Chris.
Re: [PATCH] #39275 MaxClients on startup [Was: Bug in 2.0.56-dev]
Jeff Trawick wrote: On 5/1/06, Greg Ames [EMAIL PROTECTED] wrote: after more thought, there is a simpler patch that should do the job. the key to both of these is how threads in SERVER_DEAD state with a pid in the scoreboard are treated. this means that p_i_s_m forked on a previous timer pop but some thread never made it into SERVER_STARTING state. the difference: this patch just counts those potential threads as idle, and allows MinSpareThreads worth of processes to be forked before putting on the brakes. the previous patch pauses the forking immediately when the strange situation is detected but requires more code and a new variable.

new patch is fine with me; I think we've lost our other interested parties on this thread anyway ;)

I'm still here! I had to be away from the keyboard most of last week, so have only returned to this thread (pun intended, alas) recently. I also found myself fixing a variety of other things that turned up in the vicinity of this issue; patches forthcoming on those subjects.

This is indeed a very straightforward fix; I've been trying to ponder its consequences overnight. The questions I've got (no answers yet, just thought I'd ping so you know I'm here) would be (a) whether you want to allow for SERVER_GRACEFUL as well to be counted as idle, and (b) whether there's any reason to want to not proceed through the if (any_dead_threads ...) logic that follows in p_i_s_m().

If you can bear with me for a day or two more, I should have a collection of patches ready. These tackle the issue by tracking the start and listener threads in a nice new spot in the scoreboard, and also clean up various issues and bugs relating to fork(), ThreadLimit, ServerLimit, MaxClients, etc. They also tackle some issues raised in various XXX comments and clean up some stale cruft in scoreboard.h. Chris.
[PATCH 0/6] mpm and scoreboard fixes
Hi -- I've been working away on PR #39275 and in the process of thinking about that and studying the scoreboard and the various MPMs, I've turned up a few things. Rather than try to jam them all together I've broken out the first set of them and thought I'd start mailing them to the list. The first five are relatively simple and I'll send them now. The last one is a little more involved and while I'm hoping to have it wrapped up soon, it may take another day or so. After this batch I'll aim to have the stuff actually related to #39275 in another batch, but life was getting too confusing with all these other patches mixed in. If anyone's got time to review and commit these, that would be much appreciated! Thanks, Chris.
[PATCH 1/6] scoreboard over-sized
Hi -- It looks to me like the memory allocated for ap_scoreboard_image is a little bit over-sized. In r104404 the lb_score elements were added to the scoreboard in the manner of the worker_score array, and then in r105134 much of this was reversed, but the call to calloc() still sizes ap_scoreboard_image as if a two-dimensional array was required. Chris.

=
--- server/scoreboard.c.orig	2006-05-02 09:52:09.803650679 -0400
+++ server/scoreboard.c	2006-05-03 10:17:13.273161088 -0400
@@ -117,8 +117,7 @@
     ap_calc_scoreboard_size();
     ap_scoreboard_image =
-        calloc(1, sizeof(scoreboard) + server_limit * sizeof(worker_score *) +
-               server_limit * lb_limit * sizeof(lb_score *));
+        calloc(1, sizeof(scoreboard) + server_limit * sizeof(worker_score *));
     more_storage = shared_score;
     ap_scoreboard_image->global = (global_score *)more_storage;
     more_storage += sizeof(global_score);
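A back-of-the-envelope check of the over-allocation (standalone sketch, not httpd code; the base size and limits below are illustrative, and void * stands in for the lb_score */worker_score * pointer types): the old calloc() sized in an extra server_limit * lb_limit array of pointers that r105134 no longer uses.

```c
#include <assert.h>
#include <stddef.h>

/* What the pre-patch calloc() asks for. */
size_t old_size(size_t scoreboard_sz, int server_limit, int lb_limit)
{
    return scoreboard_sz
         + server_limit * sizeof(void *)                      /* worker_score * */
         + (size_t)server_limit * lb_limit * sizeof(void *);  /* lb_score *     */
}

/* What the patched calloc() asks for. */
size_t new_size(size_t scoreboard_sz, int server_limit)
{
    return scoreboard_sz + server_limit * sizeof(void *);
}
```

So the waste is exactly server_limit * lb_limit pointers' worth of memory; modest per-pointer, but pure dead weight since nothing indexes into it after r105134.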
[PATCH 3/6] generation number unset
Hi -- This may not be necessary, but I notice that prefork and most of the other MPMs set ap_my_generation to an initial value of zero. The worker and event MPMs don't, though. Chris.

=
--- server/mpm/experimental/event/event.c.orig	2006-05-03 07:20:37.953905913 -0400
+++ server/mpm/experimental/event/event.c	2006-05-03 10:37:41.262464486 -0400
@@ -395,7 +395,7 @@
 static int volatile restart_pending;
 static int volatile is_graceful;
 static volatile int child_fatal;
-ap_generation_t volatile ap_my_generation;
+ap_generation_t volatile ap_my_generation = 0;
 
 /*
  * ap_start_shutdown() and ap_start_restart(), below, are a first stab at
--- server/mpm/worker/worker.c.orig	2006-05-02 22:24:09.727476765 -0400
+++ server/mpm/worker/worker.c	2006-05-03 10:37:26.814544916 -0400
@@ -353,7 +353,7 @@
 static int volatile restart_pending;
 static int volatile is_graceful;
 static volatile int child_fatal;
-ap_generation_t volatile ap_my_generation;
+ap_generation_t volatile ap_my_generation = 0;
 
 /*
  * ap_start_shutdown() and ap_start_restart(), below, are a first stab at
[PATCH 5/6] hard restart on Linux #38737
Hi -- An older but essentially identical version of this patch is in Bugzilla PR #38737.

Using the worker MPM (but not the event MPM), if Keep-Alives are enabled and the timeout is reasonably long (e.g., 15 seconds), then worker threads wait in poll() after handling a request for any further requests on a Keep-Alive connection. On most Unix-like OSes (e.g., Solaris), when you perform a hard restart or shutdown (not a graceful one), the close_worker_sockets() function successfully alerts all worker threads that they should exit quickly. In particular, if the worker threads are polling on the socket, closing the socket in the main thread has the side-effect of causing the worker thread to immediately receive an EBADF. However, on Linux, this side-effect doesn't occur. This is intentional; see: http://bugme.osdl.org/show_bug.cgi?id=546

The consequence is that the child process's main thread waits in apr_thread_join() for each worker thread, and if these are just starting a reasonably long Keep-Alive timeout period, then the parent process soon decides that the child process has stalled and ap_reclaim_child_processes() sends SIGTERM and SIGKILL to the child process. In turn, any modules that have registered pool cleanup functions on the pchild memory pool, as passed in the child_init stage, won't see their cleanup functions run. For certain modules, e.g., mod_dbd, this has relatively severe consequences, such as failing to cleanly disconnect from a database.

This patch causes the main thread to signal each worker thread before attempting apr_thread_join(); if any workers are waiting in poll() they then receive EBADF immediately after the signal and this allows the restart or shutdown process to proceed as expected, with calls to all the registered pool cleanup functions. Chris. 
=
--- server/mpm/worker/worker.c.orig	2006-05-03 15:04:28.429547123 -0400
+++ server/mpm/worker/worker.c	2006-05-03 15:07:04.659719568 -0400
@@ -213,6 +213,19 @@
  */
 #define LISTENER_SIGNAL SIGHUP
 
+/* The WORKER_SIGNAL signal will be sent from the main thread to the
+ * worker threads during an ungraceful restart or shutdown.
+ * This ensures that on systems (i.e., Linux) where closing the worker
+ * socket doesn't awake the worker thread when it is polling on the socket
+ * (especially in apr_wait_for_io_or_timeout() when handling
+ * Keep-Alive connections), close_worker_sockets() and join_workers()
+ * still function in a timely manner and allow ungraceful shutdowns to
+ * proceed to completion.  Otherwise join_workers() doesn't return
+ * before the main process decides the child process is non-responsive
+ * and sends a SIGKILL.
+ */
+#define WORKER_SIGNAL AP_SIG_GRACEFUL
+
 /* An array of socket descriptors in use by each thread used to
  * perform a non-graceful (forced) shutdown of the server. */
 static apr_socket_t **worker_sockets;
@@ -822,6 +835,11 @@
     ap_scoreboard_image->servers[process_slot][thread_slot].generation = ap_my_generation;
     ap_update_child_status_from_indexes(process_slot, thread_slot,
                                         SERVER_STARTING, NULL);
+#ifdef HAVE_PTHREAD_KILL
+    unblock_signal(WORKER_SIGNAL);
+    apr_signal(WORKER_SIGNAL, dummy_signal_handler);
+#endif
+
     while (!workers_may_exit) {
         if (!is_idle) {
             rv = ap_queue_info_set_idle(worker_queue_info, last_ptrans);
@@ -1077,6 +1095,13 @@
     for (i = 0; i < ap_threads_per_child; i++) {
         if (threads[i]) { /* if we ever created this thread */
+#ifdef HAVE_PTHREAD_KILL
+            apr_os_thread_t *worker_os_thread;
+
+            apr_os_thread_get(&worker_os_thread, threads[i]);
+            pthread_kill(*worker_os_thread, WORKER_SIGNAL);
+#endif
+
             rv = apr_thread_join(&thread_rv, threads[i]);
             if (rv != APR_SUCCESS) {
                 ap_log_error(APLOG_MARK, APLOG_CRIT, rv, ap_server_conf,
Re: [PATCH 6/6] MPM configuration directives
Hi -- I don't suppose that anyone's had a chance to peek at this big glob of a patch? I know it's rather a lot to digest. Advice on whether such kinds of changes are desirable at all is welcome, too. On reflection, I should probably have done the ap_swap_nodes() refactoring in a separate patch. Sorry! And, if someone with a Windows compiler could test it using the winnt MPM, that would be lovely.

It all boils down to a few key fixes and changes, I think:

A) A fix for the following problem: [I]f ThreadsPerChild was not specified at all, then the default value of DEFAULT_THREADS_PER_CHILD could potentially be larger than the configured ThreadLimit, and no test would be performed, since set_threads_per_child() would never run. This could result in an out-of-bounds value for ap_threads_per_child in the running server.

B) A fix for the following less-than-critical issue: I noticed that although there was usually a warning if one tried to change ServerLimit or ThreadLimit and restart the server, no warning appeared if one removed (e.g., commented out) these directives and restarted. Presumably, removing these directives implies the use of the default values, but if the server was originally configured with a non-default value, then a warning should probably appear.

C) Attempting to ensure that after a restart, valid warning messages always appear in the log files about any changes that the code has made to the configured values for ServerLimit, ThreadLimit, MaxClients, and ThreadsPerChild. Warning messages are printed to the console too, as before, but may not always be valid since the original values of ServerLimit and ThreadLimit aren't known to the process that prints them.

D) Pulling the swap-configuration-nodes code out of the worker and event MPMs and refactoring it as ap_swap_nodes() and ap_order_nodes() in util_cfgtree.c. Is this even the right place for such a thing, if desirable at all? Perhaps server/config.c instead?

Chris. 
-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
configuration directives redux
Hi -- Some time ago, I proposed this large patchset (better described, I think, by the message referenced by the second link below):

http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=114729206702495&w=2
http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=114788040600327&w=2

Discussing the issues on IRC, I received a few responses, including a long one from William A. Rowe, Jr. His main point, I believe, was that the order of configuration directives in the tree should never be altered from what it is in the configuration files. Therefore, what the worker and event MPMs do now in their pre_config stages -- namely, re-ordering ThreadsPerChild ahead of MaxClients -- should not be done. Instead, all such order-dependent configuration directive checking should be done in a post-config phase, either open_logs or post_config.

(Note, BTW, that the existing re-ordering is in fact insufficient, because there are other dependencies between the four directives in question: ServerLimit, ThreadLimit, MaxClients, and ThreadsPerChild. Hence my original patchset proposed a standard ap_swap_nodes() to perform all the re-ordering that seemed necessary.)

Thinking about Bill's comments and doing some additional research, I've come up with the following patch. So far, I've just been working on prefork; if it seems sensible, then I'll apply the same ideas to the winnt, worker, and event MPMs, which all share the same essential logic when it comes to these particular configuration directives.

If some people could review this new approach, I'd really appreciate it, because (among other things) I'd love to get this item off my plate so I can move on to other stuff. (I suppose I could just do that anyway, but to me that feels like cheating. :-) So, to aid with that, a few comments about what I'm attempting to do in this patch. 
First, from the second link above, I'm trying to achieve items B and C:

B) A fix for the following less-than-critical issue: I noticed that although there was usually a warning if one tried to change ServerLimit or ThreadLimit and restart the server, no warning appeared if one removed (e.g., commented out) these directives and restarted. Presumably, removing these directives implies the use of the default values, but if the server was originally configured with a non-default value, then a warning should probably appear.

C) Attempting to ensure that after a restart, valid warning messages always appear in the log files about any changes that the code has made to the configured values for ServerLimit, ThreadLimit, MaxClients, and ThreadsPerChild. Warning messages are printed to the console too, as before, but may not always be valid since the original values of ServerLimit and ThreadLimit aren't known to the process that prints them.

(Item D doesn't apply anymore, since I'm looking to follow Bill's advice on that score, and item A doesn't apply to prefork, only worker and event. I figured I'd start with prefork first, though.)

To that end, this patch moves all bounds and sanity checking on MPM-related configuration directives into the open_logs phase. That has the advantage that it can all be performed in order, without swapping any configuration tree nodes. It also means that this checking gets performed regardless of whether the user has defined a particular directive or not. That means that item B above gets fixed automatically.

Item A, for the worker and event MPMs, will also get fixed in the same manner. All that's required is to make sure that the default values for all the directives are set properly in pre_config. Then the handlers for whatever directives are defined in the config files run and change those values, and finally open_logs does all the in-order bounds and sanity checking. 
Second, I studied the logic surrounding the changed_limit_at_restart flag and whether it was still necessary. Here I should perhaps pause to quickly describe the four different conditions under which these configuration handlers run (for the prefork, worker, and event MPMs; winnt is a little different, I believe):

1) In the initial startup process, when stderr prints to the console.

2) In the main loop, after detaching from the startup process; stderr has been pointed to /dev/null in apr_proc_detach() in the pre_config phase.

3) In a separate process that signals the main running process to stop or restart. Like condition 1, stderr prints to the console. The open_logs and post_config phases never run here because ap_signal_server() detects the running process, signals it, and exits.

4) In the main loop, after ap_mpm_run() returns because it's caught a restart signal, and the running process then re-reads the configuration files.

The comment in set_server_limit() that the error log is a bit bucket at this point actually refers to condition 4. The issue at hand is whether the running server, having detected a
Re: apache 2.2 crashes at the start time in mod_dbd.c then preparing AuthDBDUserPWQuery
Anton Golubev wrote: It's the same, but I don't agree it is Virtual Host specific, as you said. This configuration (almost taken from the documentation) also crashes the server:

    ServerRoot /usr/local/apache
    Listen 80
    User nobody
    Group nobody
    DocumentRoot /home/ivc2/public_html

    DBDriver mysql
    DBDParams dbname=Users user=auther passreplace=xx
    DBDMin 1
    DBDKeep 2
    DBDMax 10
    DBDExptime 60

    <Directory /home/ivc2/public_html>
        AuthType Basic
        AuthName 'ADMIN ZONE'
        AuthBasicProvider dbd
        AuthDBDUserPWQuery "select PASS from Users where LOGIN = %s"
        require valid-user
    </Directory>

Yes, it seems to be caused by the <Directory> block ... specifically, authn_dbd_prepare(), which is called to handle the AuthDBDUserPWQuery directive, calls ap_dbd_prepare() and passes cmd->server as the server_rec *s argument. Then ap_dbd_prepare() does:

    ap_get_module_config(s->module_config, &dbd_module);

and that returns NULL. I quickly also tried running:

    ap_get_module_config(cmd->server->module_config, &authn_dbd_module);

inside authn_dbd_prepare(), and it too returns NULL. And yet I feel sure I'm seeing things that do effectively the same thing in other directive handlers, like cmd_rewritelog() for mod_rewrite:

    ap_get_module_config(cmd->server->module_config, &rewrite_module);

Maybe it's something about that being RSRC_CONF vs. ACCESS_CONF for mod_authn_dbd? I can't grok it tonight; maybe I'll catch some time over the weekend.

Chris.

-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: configuration directives redux
Hi -- Some time ago, I proposed this large patchset (better described, I think, by the message referenced by the second link below):

http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=114729206702495&w=2
http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=114788040600327&w=2

Discussing the issues on IRC, I received a few responses, including a long one from William A. Rowe, Jr. His main point, I believe, was that the order of configuration directives in the tree should never be altered from what it is in the configuration files. Therefore, what the worker and event MPMs do now in their pre_config stages -- namely, re-ordering ThreadsPerChild ahead of MaxClients -- should not be done. Instead, all such order-dependent configuration directive checking should be done in a post-config phase, either open_logs or post_config. [snip]

Thinking about Bill's comments and doing some additional research, I've come up with the following patch. So far, I've just been working on prefork; if it seems sensible, then I'll apply the same ideas to the winnt, worker, and event MPMs, which all share the same essential logic when it comes to these particular configuration directives.

It's been busy on the list lately, so I'm not at all surprised that nobody's had a lot of time (or interest, perhaps!) for this little bundle of fixups. Unless someone wants to send me an explicit -1 vote on my prefork-only patch now (see link below), I'll keep working on matching ones for the worker, event, and winnt MPMs, then commit them to trunk and wait for responses at that point.

http://mail-archives.apache.org/mod_mbox/httpd-dev/200607.mbox/[EMAIL PROTECTED]

Cheers, Chris.

-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: configuration directives redux
Hi -- Some time ago, I proposed this large patchset (better described, I think, by the message referenced by the second link below):

http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=114729206702495&w=2
http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=114788040600327&w=2

Discussing the issues on IRC, I received a few responses, including a long one from William A. Rowe, Jr. His main point, I believe, was that the order of configuration directives in the tree should never be altered from what it is in the configuration files. Therefore, what the worker and event MPMs do now in their pre_config stages -- namely, re-ordering ThreadsPerChild ahead of MaxClients -- should not be done. Instead, all such order-dependent configuration directive checking should be done in a post-config phase, either open_logs or post_config.

Well, as promised (threatened?), here is a complete set of patches, I believe:

http://people.apache.org/~chrisd/patches/httpd_mpm_configs/

I thought I'd wait a few days and see if objections were raised, then commit at least the prefork, worker, and event MPM ones, which I've tested quite extensively. The patches for the other four platforms (winnt, netware, beos, and mpmt_os2) I have no way of testing. If anyone with access to one or more of those platforms could test and review, that would be greatly appreciated!

Of those, the Windows one is the one I'd particularly like someone to test and review. It has two key differences from the others:

a) Like the prefork, worker, and event MPMs, there are ordering issues involved -- the bounds checks on ap_threads_per_child need to occur after the bounds checks on thread_limit. The existing code, I believe, has the same problem that exists with the worker and event MPMs: [I]f ThreadsPerChild was not specified at all, then the default value of DEFAULT_THREADS_PER_CHILD could potentially be larger than the configured ThreadLimit, and no test would be performed, since set_threads_per_child() would never run. 
This could result in an out-of-bounds value for ap_threads_per_child in the running server.

b) Unlike the other MPMs, there seems to be a need for some special handling with regard to what code runs in the parent process. I've attempted to comprehend this without access to a test platform, but no doubt I've made some errors.

So ... anyone for some testing? Anyone, anyone? Also ... objections to committing the patches for the prefork, worker, and event MPMs?

Chris.

-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: configuration directives redux
Hi -- Well, as promised (threatened?), here are a complete set of patches, I believe: http://people.apache.org/~chrisd/patches/httpd_mpm_configs/

I did have one final (I hope) thought last night about this patch. Currently the prefork, worker, and event MPMs hack their open_logs hook function into place ahead of the core ap_open_logs() function:

    /* The prefork open_logs phase must run before the core's, or stderr
     * will be redirected to a file, and the messages won't print to the
     * console.
     */

My patches, as written, make all the MPMs do this, which is somewhat ugly. One option might be another hook phase that runs after ap_process_config_tree() and friends, but before both the signal_server function (prior to entering the main loop) and the open_logs phase. My inclination, though, is to add an ap_mpm_post_config optional function, like ap_signal_server, which gets called just before it does. MPMs could register a function that does most or all of what's in their command handlers now (i.e., what's been moved into their open_logs handlers in the patches).

One advantage of adding either the new hook phase or an optional function is that on restarts, messages can appear on the console and in the log files, formatted properly for each. This is difficult to achieve any other way, I think, for somewhat arcane reasons. Personally, I think I favour the optional function -- opinions? If none, I'll add that to the patches and start committing next week, barring more late-night insights or other roadblocks.

Chris.

P.S. Thanks, Brad, for the Netware test!

-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
mod_authn_dbd fix?
Hi -- I noticed some recent activity in mod_dbd.c to deal with virtual host configurations, but didn't pay a lot of attention, I confess. Today I happened to upgrade a system from 2.2.2 to 2.2.3 and discovered my AuthDBDUserRealmQuery directives now weren't getting inherited from the main server config down to the virtual hosts the way they used to. So, I whipped up this patch -- does it make sense to anyone else? I'll have to be away from the keyboard for a week, so I won't be committing anything for a while. Maybe someone else (Nick?) could noodle on this? Thanks!

Chris.

--- modules/database/mod_dbd.c.orig	2006-08-29 02:25:54.018358838 -0400
+++ modules/database/mod_dbd.c	2006-08-29 02:16:41.411371370 -0400
@@ -617,10 +617,27 @@
 {
     svr_cfg *svr;
     server_rec *sp;
+    dbd_prepared *main_server_prepared;
 
     for (sp = s; sp; sp = sp->next) {
         svr = ap_get_module_config(sp->module_config, &dbd_module);
         svr->prepared = apr_hash_get(dbd_prepared_defns, sp->server_hostname,
                                      APR_HASH_KEY_STRING);
+        if (sp == s) {
+            main_server_prepared = svr->prepared;
+        }
+        else if (main_server_prepared != NULL) {
+            if (svr->prepared == NULL) {
+                svr->prepared = main_server_prepared;
+            }
+            else {
+                dbd_prepared *last_prepared = svr->prepared;
+
+                while (last_prepared->next != NULL) {
+                    last_prepared = last_prepared->next;
+                }
+                last_prepared->next = main_server_prepared;
+            }
+        }
     }
     return OK;
 }
Re: mod_authn_dbd fix?
Nick: I don't think we ever had that kind of merge. And it also looks like possible overspill/pollution, where directives from the main config are not necessarily wanted in a vhost. The complexity here seems to stem from two things. One is specifically the treatment of prepared statements. In retrospect, this is a poor approach: we should instead have exported an optional dbd_construct hook which modules could use to prepare statements more cleanly. My fault, of course, and I'm going to think about whether that can be changed without breaking things. The other lies in the different configuration hierarchies: mod_dbd lives in the server hierarchy, but prepared statements commonly apply per-directory. This is falling into a similar trap to mod_proxy in 2.0. I *think* that if we can move to a hook, that'll go away, if the prepared statements for other modules can be taken away from the mod_dbd config.

Just to clarify my particular bug, here's the configuration I'm working with:

    <Location /foo>
        AuthDigestProvider dbd
        AuthDBDUserRealmQuery "SELECT password FROM users WHERE username = %s"
        ...
    </Location>

    <VirtualHost ...>
        ## no auth settings, but expect to inherit from main server
    </VirtualHost>

With the change to mod_dbd in 2.2.3, the auth settings are inherited by the virtual host, as you'd expect. However, because the prepared statement isn't inherited any longer, they don't work in the virtual host. Hence my quick attempt at a patch to ensure that they do: from the admin's perspective, I think, the fact that there are prepared statements involved here is invisible, and auth settings should just work when they're inherited by a virtual host.

At least, that was my quick take; sorry, I've been travelling and generally away from the keyboard lately. It didn't seem entirely bad to me that mod_dbd would allow prepared statements to be inherited by virtual hosts, anyway -- isn't that what you'd normally expect with other directives? 
Would recognizing a none argument to DBDPrepareSQL help, perhaps? Other directives often use this to allow inherited values to be overridden and effectively unset. Like I said, I'm not 100% on top of this at the moment, but I'm pretty sure that the DBD auth settings should inherit cleanly with no other action required by the admin. Thoughts, flames? :-) Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
prefix_stat() problems in mod_rewrite?
Hi -- I was looking at (the amazing!) mod_rewrite.c in order to figure out how it implements its passthrough mode, and in doing that I looked at prefix_stat(), which is called in the opposite case, when mod_rewrite is *not* passing a request along to other modules' translate_name hook functions.

What I noticed was that prefix_stat() seems to implement a heuristic that attempts to decide whether or not the URI has been mapped out of the document root, because if it hasn't, then the document root needs to be prepended to the URI and written into r->filename. It extracts the first URI path component and checks to see whether that exists in the file system's root directory; if it does, it assumes the intention was to map the URI out of the document root, and therefore doesn't prepend the document root to r->uri when setting r->filename.

Thinking about this I realized that an administrator might be surprised by this functionality, or that files could be exposed to the Internet unintentionally. For example, suppose the admin sets up a mod_rewrite mapping like this:

    RewriteRule ^/abc/(.*) /foo/$1

and it works just fine to map /abc/ -> /foo/ in the document root. Then, for some reason, they add:

    RewriteRule ^/def/(.*) /etc/$1

Unfortunately, if they don't test this, then (on a Unix system :-) it will expose /etc/passwd and so forth under the /def/ URI path. Now, obviously, mod_rewrite is very powerful and admins should know better than to add a configuration directive without testing its effects. Still, it's somewhat non-intuitive.

What troubled me more, though, was that even with just that first rule in place:

    RewriteRule ^/abc/(.*) /foo/$1

if someone with root access happened to put some files under /foo in the file system's root directory, for whatever reason, then they'd be unexpectedly exposed to the Internet, while files under docroot/foo would suddenly be unavailable. That seems somewhat dangerous to me, even if it is a bit of an unusual case. 
The other thing that I wondered about regarding prefix_stat() is that if you have a rule like:

    RewriteRule ^/([\d]+)([a-z]+)/(.*) /$2/$3

then the fact that the first path component of the rewritten URI could be extremely variable (depending on what kinds of requests you get) means that prefix_stat() is going to be running stat() against all kinds of different file names in the file system's root directory, e.g., abc and def and gromit and so forth. That seems like it might exceed the OS's ability to cache the results from all the different stat() calls, which the comments in mod_rewrite suggest it is relying on to avoid a performance penalty. Again, this is a somewhat unusual case, so maybe it's not a concern.

I tested my various examples (minus any transcription typos while composing email :-) on Solaris with httpd 2.0.53, and they do work as I suggested; you can trump files under the document root that were previously exposed, and reveal stuff outside the document root, by creating, say, /foo in the file system. I should note that I was only looking at the per-server context and not the per-directory context; prefix_stat() isn't used there, but there is some other logic about document roots that I didn't look at -- sorry about that.

At any rate, I thought that having prefix_stat() perform its stat() against docroot/first path component instead of /first path component might alleviate some of the issues, because it would mean that any files in the document tree would trump anything outside it. It would still mean that a rewrite rule that intended to map to a file outside the tree could be unexpectedly trumped by the addition of a file inside the tree, though. For instance, if you had:

    RewriteRule ^/abc/(.*) /home/chrisd/www/$1

and then someone created docroot/home/chrisd/www, that would trump a previously working map of /abc/ to /www in my home directory. 
My inclination would be -- although I realize this might cause painful problems for admins upgrading Apache -- to eliminate this source of confusion and potential mischief by removing prefix_stat() and the logic around it, and instead always prepending the document root in hook_uri2file(), unless a specific rule configuration flag was used, maybe A for alias? (This would be unrelated to the passthrough mode, where hook_uri2file() munges r->uri and then returns DECLINED; this would relate to the normal case of returning OK to end the translate_name phase.)

This obviously means a fair bit of coding work, especially since I'm not sure what it would entail in relation to rewrites done in a per-directory context -- maybe nothing? Thoughts, comments, criticisms, errors in the above?

Chris.

-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: mod_dir + caching brokenness
Colm: mod_dir insists on the indices being regular files, which breaks with mod_cache in a particularly nasty way. mod_cache doesn't fill out rr->finfo for speed reasons, and because it's not always a file that the cached entity is coming from. The result is:

    make request for /           - works (nothing gets cached)
    make request for /index.html - works (gets cached)
    make request for /           - doesn't work (we fall back to autoindex)

as rr->status for the subrequest is 200, but rr->finfo.filetype is not APR_REG. I'd like to apply this patch, but I have a feeling there must be some undocumented reason it's not like this already. Any thoughts?

As a voice from the wilderness, I for one actually do rely on the existing ability to redirect to a subdirectory whose name matches against the list of valid index files. For example, a request for /foo/ finds .../foo/index.bar/, redirects, then a request for /foo/index.bar/ finds /foo/index.bar/index.html and returns that. Odd, I know, but it's valuable for our purposes. This, I believe, requires that mod_dir fall through the test you're looking at, to the next test:

    /* XXX: (filetype == APR_REG) - we can't use a non-file index??? */
    if (   rr->status == HTTP_OK
        && (   (rr->handler && !strcmp(rr->handler, "proxy-server"))
            || rr->finfo.filetype == APR_REG)) {
        ap_internal_fast_redirect(rr, r);
        return OK;
    }

    /* If the request returned a redirect, propagate it to the client */
    if (ap_is_HTTP_REDIRECT(rr->status)
    ...

So I don't think your patch changes anything for us (I know you retracted it, but anyway), because rr->status won't be HTTP_OK when redirecting to a subdirectory. Other changes might, though, so I'll keep an eye on this, and I thought I'd throw in my $0.02 CDN, in case it helps you and/or anyone else using that particular bit of existing functionality. :-)

Chris.

-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: thoughts on ETags and mod_dav
Hi -- I'm delighted to see that others have stepped in to keep this thread going; thank you! Work commitments have kept me from having the appropriate time to devote to it. It looks like Nick Kew and Ruediger Pluem have committed some of the patches discussed in the thread and the bug reports; again, thanks to all involved. It would be great if the problems addressed in #38034 and #16593 could be handled properly (at least with respect to mod_dav).

Roy T. Fielding wrote: If you want a better meets_conditions in general, then write a new one with a new interface that can pass in the metadata for the chosen representation (if any) and does not stink of stupid DAVisms (like "resource exists" where it should say "representation exists"). Do not make patches to the old interface because those won't be distributed until Apache 3.x.

It looks like the patch Nick committed for mod_dav essentially wraps ap_meets_conditions() and deals with the additional If-Match/If-None-Match tests outside of that function. That should suffice to handle all the DAV issues from #38034, and that may be entirely sufficient until 3.x. With an eye on handling those additional If-Match/If-None-Match tests for all users, not just mod_dav, I'd hoped to allow callers to pass the necessary extra info to ap_meets_conditions() in r->notes. The function would remain entirely backwards-compatible for existing users, of course. I think Paul Querna suggested creating a parallel ap_meets_conditions2(), as reported by Paritosh Shah, and that may be less crufty than my notion. I just wish I had an idea for a more meaningful name for such a replacement function.

If the weak etags are not being matched to the strong etags on GET, then that is another bug that must be fixed. It is not an excuse to ignore the HTTP design. This, if I understand correctly, is the subject of #42987. 
My own best effort at understanding the issue, from my original post, is below -- if you can weigh in with any clarifications or corrections, that would be very helpful (to me, at least!).

Chris Darroch wrote: The RFC says that a file's mtime could be a weak validator, because the file might change more than once in a second. But I think the subtly important issue here is that word "could". If I understand it correctly, the implication is that one has [a] resource for which one wants to generate weak ETags only, because one knows the semantic meaning isn't changing although the content may be. In this particular case, the modification time *might* be appropriate to use as a weak validator (although that seems somewhat unlikely, because non-semantic changes that occurred seconds apart would still cause different weak ETags to be generated, which probably isn't what you'd want in this circumstance). But, in the cases handled by ap_make_etag(), the content and/or the semantic meaning of a resource could always be changing, so weak ETags would seem to be, as the PR says, simply never appropriate.

Reading that over, it still makes some sense to me. Apache httpd is never in the situation implied by the RFC's example, where one wants to generate a weak ETag because one has knowledge that the file's semantic meaning hasn't changed, and therefore one can use the file's mtime to generate the weak ETag. Since httpd is never in that situation, the RFC's example essentially doesn't apply to the problem httpd faces when a file's modification time is the current time (to the second). Put another way, modification times are not appropriate as weak ETag sources when one doesn't know, somehow, that a file's semantic meaning hasn't changed. I think! :-/

Currently, when httpd notices that a file's mtime and the current time are the same, ap_make_etag() just creates its usual ETag but then marks it as weak. 
This has the advantage of being fast, since, as you point out, high performance is critical. If I understand the issues at hand correctly, though, to resolve #42987 we should really just not generate any ETag at all in this case. I'd proposed putting this into the hands of the administrator via a configuration directive:

- add a nomtime=none|weak|ignore option to FileETag (in the manner of ProxyPass) which specifies what to do when the MTime option is configured but the file's mtime is the same as r->request_time; the default value is weak to retain backward compatibility

So, the out-of-box default would be the current functionality. One could tune it to not generate any ETag in this particular situation (where mtime is used as part of the strong ETags, and the mtime and current time match), or to ignore this situation and just always send out a strong ETag.

Does that seem out to lunch? Certainly the intention is not to run counter to the RFC, but rather to do the best job of conforming to it while providing high performance, and being backwards-compatible with existing behaviour where that's appropriate.

Chris.

-- GPG Key ID
Re: segfault in dav_validate_request
Michael Clark wrote: I'm getting a segfault here in mod_dav from trunk (after a make clean) running litmus using extras/httpd-dav.conf, whereas it was working for me last night. Not sure if this is a work-in-progress. No time to file a bug right now as I'm off for the weekend.

    /* Set the ETag header required by dav_meets_conditions() */
    if ((err = (*resource->hooks->set_headers)(r, resource)) != NULL) {
        return dav_push_error(r->pool, err->status, 0,
                              "Unable to set up HTTP headers.", err);
    }

I think the set_headers hook for mod_dav_fs is a NULL pointer unless DEBUG_GET_HANDLER in dav/fs/repos.c is non-zero, which it isn't by default. Using the getetag hook should work to make sure the ETag is in place before calling ap_meets_conditions(), though, if I read the code correctly. The originally proposed patch in #38034 does this, IIRC.

Like you, I'm off for the weekend, plus taking my first real holiday break in a few years. If no one else deals with this before Wednesday, though, I'll see what I can do. Sorry in advance for the delay.

Chris.

-- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: [PATCH] ap_socache.h mod_socache_*
Hi -- This looks great! Some semi-random thoughts, dealing just with the main header file.

I was a little puzzled by the name socache because I assumed "so" meant shared object, like mod_so, until I read the code comments. I wondered if it was true that people would only use this kind of interface to store small objects -- I won't, for one. (We can migrate our session cache to this interface, and we cram all kinds of data in there.) Also, I wondered if the term "cache" was necessarily accurate, if some providers allow unlimited persistent storage (DBM, DBD, etc.) I'm hoping to be able to store things with expiry=0 to mean "persist as long as possible". "Cache" also overlaps with mod_cache and friends, which implement something rather different.

With those thoughts in mind, some other possible names presented themselves -- perhaps grouped under modules/foo, where foo is the name of choice? I thought of map, dict, store, table, and hash, possibly with a d (data, distributed) or s (shared) prefix, e.g., mod_dtable, mod_dtable_distcache, etc. I kind of like mod_dict myself, or mod_dtable or mod_dmap.

Should the expiry argument to store() be an apr_interval_time_t or an apr_time_t? I'd love to be allowed 0 to mean "don't expire" too. And perhaps apr_size_t for the key and value lengths? Is there a particular reason for the store(), retrieve(), delete() names? APR hashes use get/set (set NULL to delete), tables use get/set/unset, and DBM uses store/fetch/delete. I'd sort of be inclined toward following the DBM names and changing retrieve() to fetch(). Thoughts?

I also wanted to suggest some different/additional flags; I've tried to describe their meaning in the proposed header file below. Thanks again for tackling this work -- let me know if I can help.

Chris.

/**
 * @file  ap_dict.h
 * @brief Shared data dictionary provider interface.
* * @defgroup AP_DICT ap_dict * @ingroup APACHE_MODS * @{ */ #ifndef AP_DICT_H #define AP_DICT_H #include "httpd.h" #include "ap_provider.h" #include "apr_pools.h" /* If this flag is set, the provider guarantees persistent storage of data; * if unset, data may be ejected as the provider deems necessary. */ #define AP_DICT_FLAG_PERSIST (0x0001) /* If this flag is set, the provider sets no fixed limit on the number of * key/value data pairs it can store. */ #define AP_DICT_FLAG_NO_LIMIT (0x0002) /* If this flag is set, the provider guarantees that stored data will * remain consistent if multiple processes or threads make concurrent calls * to the provider's functions. */ #define AP_DICT_FLAG_ATOMIC (0x0004) /* If passed as the klen argument, the provider will use strlen(key) * to determine the key length. */ #define AP_DICT_KEY_STRING (-1) typedef struct ap_dict_provider_t { /* Canonical provider name: */ const char *name; /* Bitmask of AP_DICT_FLAG_* flags: */ unsigned int flags; /* Create a dictionary based on the configuration parameters in * params. Returns NULL on success, or an error string on failure. * Pool tmp should be used for any temporary allocations, pool p * should be used for any allocations that should last as long as the * lifetime of the return context. * * The context pointer returned in *instance will be passed as the * first argument to subsequent invocations. */ const char *(*create)(void **instance, const char *params, apr_pool_t *tmp, apr_pool_t *p); /* Initialize the dictionary. Return APR error code. */ apr_status_t (*init)(void *instance, /* hints, namespace */ server_rec *s, apr_pool_t *pool); /* Destroy a given dictionary context. */ void (*destroy)(void *instance, server_rec *s); /* Store an object in the dictionary. If expiry is zero, the * provider will attempt to retain the data as long as possible. 
*/ apr_status_t (*store)(void *instance, server_rec *s, const char *key, apr_size_t klen, apr_interval_time_t expiry, char *val, apr_size_t vlen); /* Retrieve stored data with key of length klen, returning APR_SUCCESS * on success or an APR error code otherwise. Upon success, * the data will be written to *val, up to a maximum number of bytes * specified on entry by *vlen, and *vlen will be updated to the * length of the data written. */ apr_status_t (*fetch)(void *instance, server_rec *s, const char *key, apr_size_t klen, char *val, apr_size_t *vlen, apr_pool_t *pool); /* Remove an object from the dictionary. */ void (*delete)(void *instance, server_rec *s, const char *key, apr_size_t klen, apr_pool_t *pool); /* Dump dictionary status for mod_status output. */ void (*status)(void *instance, request_rec *r, int flags);
Re: [PATCH] ap_socache.h mod_socache_*
Hi -- With those thoughts in mind, some other possible names presented themselves -- perhaps grouped under modules/foo, where foo is the name of choice? I thought of map, dict, store, table, and hash, possibly with a d (data, distributed) or s (shared) prefix, e.g., mod_dtable, mod_dtable_distcache, etc. I kind of like mod_dict myself, or mod_dtable or mod_dmap. I have to retract mod_dict, since dictionary usually implies an ordered set of keys, and that's not the case with these providers. But, what about mod_shmap for shared map, since I think what is common to all these providers is essentially a shared associative array, with a mapping from keys to values. Thoughts? Chris. = /** * @file ap_shmap.h * @brief Shared associative array provider interface. * * @defgroup AP_SHMAP ap_shmap * @ingroup APACHE_MODS * @{ */ #ifndef AP_SHMAP_H #define AP_SHMAP_H #include "httpd.h" #include "ap_provider.h" #include "apr_pools.h" /* If this flag is set, the provider guarantees persistent storage of data; * if unset, data may be ejected as the provider deems necessary. */ #define AP_SHMAP_FLAG_PERSIST (0x0001) /* If this flag is set, the provider sets no fixed limit on the number of * key/value data pairs it can store. */ #define AP_SHMAP_FLAG_NO_LIMIT (0x0002) /* If this flag is set, the provider guarantees that stored data will * remain consistent if multiple processes or threads make concurrent calls * to the provider's functions. */ #define AP_SHMAP_FLAG_ATOMIC (0x0004) /* If passed as the klen argument, the provider will use strlen(key) * to determine the key length. */ #define AP_SHMAP_KEY_STRING (-1) typedef struct ap_shmap_provider_t { /* Canonical provider name: */ const char *name; /* Bitmask of AP_SHMAP_FLAG_* flags: */ unsigned int flags; /* Create a shared map based on the configuration parameters in * params. Returns NULL on success, or an error string on failure. 
* Pool tmp should be used for any temporary allocations, pool p * should be used for any allocations that should last as long as the * lifetime of the return context. * * The context pointer returned in *instance will be passed as the * first argument to subsequent invocations. */ const char *(*create)(void **instance, const char *params, apr_pool_t *tmp, apr_pool_t *p); /* Initialize the shared map. Return APR error code. */ apr_status_t (*init)(void *instance, /* hints, namespace */ server_rec *s, apr_pool_t *pool); /* Destroy a given shared map context. */ void (*destroy)(void *instance, server_rec *s); /* Store an object in the shared map. If expiry is zero, the * provider will attempt to retain the data as long as possible. */ apr_status_t (*store)(void *instance, server_rec *s, const char *key, apr_size_t klen, apr_interval_time_t expiry, char *val, apr_size_t vlen); /* Retrieve stored data with key of length klen, returning APR_SUCCESS * on success or an APR error code otherwise. Upon success, * the data will be written to *val, up to a maximum number of bytes * specified on entry by *vlen, and *vlen will be updated to the * length of the data written. */ apr_status_t (*fetch)(void *instance, server_rec *s, const char *key, apr_size_t klen, char *val, apr_size_t *vlen, apr_pool_t *pool); /* Remove an object from the shared map. */ void (*delete)(void *instance, server_rec *s, const char *key, apr_size_t klen, apr_pool_t *pool); /* Dump shared map status for mod_status output. */ void (*status)(void *instance, request_rec *r, int flags); } ap_shmap_provider_t; /* Shared map providers are registered using the ap_provider_* interface, * with the following group and version: */ #define AP_SHMAP_PROVIDER_GROUP "shmap" #define AP_SHMAP_PROVIDER_VERSION "0" #endif /* AP_SHMAP_H */ /** @} */
Re: Proposal: a cron interface for httpd
Graham Leggett wrote: On a number of occasions recently I have run into the need to run some kind of garbage collection within httpd, either in a dedicated process, or a dedicated thread. I've also written a few modules where each child process runs a private thread in the background. I'd suggest there are perhaps three variants here (in a Unix-like context, anyway): one separate process, one thread in the master process, or one thread per child process. Presumably MPMs should somehow indicate which of these they support. ap_cron_per_second ap_cron_per_minute ap_cron_per_hour ap_cron_per_day ap_cron_per_week I wonder if you could flatten these down to a single ap_monitor_interval() kind of thing, where the module specified the interval it wanted? I suppose one could go the other direction toward a full-blown scheduler, but that seems like a lot of extra effort for perhaps little gain. It might be nice to also offer the option to randomly stagger the creation of processes/threads within some additional time interval -- especially for one-per-child threads, that could help avoid a thundering herd kind of situation where a bunch of child processes start up together (e.g., at a restart), and later all kick off resource-intensive threads at nearly the same time. Of course a module's background threads could wait some random interval on their own after being started, but then they're eating up time sleeping while the invoking process thinks they're doing work. My other thought is that it would be really nice to be able to track the status of these processes and threads in mod_status; e.g., in the scoreboard or a scoreboard-like utility. Moreover, it would be excellent if the scheduler/MPM could use this info to avoid spawning new processes/threads if the old ones were still executing ... for a per-second kind of interval, that might be quite important, especially if there's any chance the task could occasionally get stuck. 
I feel this relates a bit to my continued interest (mod lack of time) in abstracting the scoreboard and shared-memory subsystems into a shared map facility. Joe Orton did a bunch of work on the SSL session cache which I think moves it in this direction; see his RFC and a couple of my responses: http://marc.info/?l=apache-httpd-dev&m=120397759902722&w=2 http://marc.info/?l=apache-httpd-dev&m=120406346306713&w=2 http://marc.info/?l=apache-httpd-dev&m=120491055413781&w=2 I also put a larger outline of some of my notions in this regard into this document: http://svn.apache.org/viewvc/httpd/sandbox/amsterdam/architecture/scoreboard.txt?view=markup In particular, in thinking about background processes and threads both of the type you're proposing and those created by modules like mod_cgid and mod_fcgid, I had begun tossing around the notion of modules being able to register additional scoreboard states beyond those hard-coded now into mod_status: - during pre/check/post-config phases, modules (including MPMs) may indicate if they need IPC space, what type, and how much: - private space in scoreboard table values - additional scoreboard states - at startup, the master process initializes the storage provider: - MPM sizes scoreboard based on runtime process and thread limits, not compile-time maximums - assigns IDs for additional scoreboard states requested by modules - creates scoreboard state-to-ID hash mappings in regular memory as part of read-only configuration data inherited by children We could then offer modules standard ways to ask for processes or threads to be spawned at startup/restart time, or on some schedule (as per your initial proposal), and for these processes/threads to update their status record in the scoreboard. Certain MPMs (e.g., worker, event) also spawn threads that aren't recorded in the scoreboard at the moment; it would be great to see them in the scoreboard too. 
The administrator could then see everything at a glance in mod_status, including all the background tasks, and the scheduler/MPM would have a standard way to check if a background task was still running from a previous invocation. I feel like there's a certain serendipity in this proposal coming along around the same time as Joe Orton's work, which seems to me to be heading toward not so much of a cache in the traditional httpd sense (i.e., mod_cache and friends) as a generic shared map interface that would be useful in a wide variety of ways, including implementing a configurable scoreboard that could help track arbitrary background tasks. Thoughts, flames? Fire away! Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: Dynamic configuration for the hackathon?
Jim Jagielski wrote: I'd prefer optimum runtime and let that drive how it gets exposed to the admin, rather than the reverse... And then we can see if that pain is worth it :) +1 to this as a guiding principle. I know our administrators would, above all else, like a standard way to build up configuration files from templates. Perhaps that could be mod_macro packaged in the main distribution, or some kind of configulator utility, or yet something else. Dynamic (re)configuration issues are a lesser concern for them, I think; it would be great if whatever is implemented could be largely optional so it can be loaded only if necessary. Good luck at the hackathon! Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: 2.4 (Was: Re: Configuration Issues to Address [was Re: Dynamic configuration for the hackathon?])
William A. Rowe, Jr. wrote: I'd -1 a 2.4.0 release today, because nobody has even bothered to make a candidate for 2.3-dev. Auth logic changes break most if not all third party auth modules (broke an auth feature in mod_ftp). Not talking about commercial modules but every third party auth extension out there. I've been working with the 2.4 authn/z stuff a bit lately and what I keep tripping over is that the default authorization merge rule uses OR logic. For example, if I enable mod_access_compat and put in a traditional: <Location /foo> Order deny,allow Deny from all </Location> it doesn't take effect, because the default top-level <Directory> contains Require all granted and since that succeeds for all requests, everything else is short-circuited by the OR merge logic. So at a minimum I seem to have to put in an AuthzMergeRules Off to get things to DWIM. I fear that might trip up a significant percentage of people upgrading ... perhaps an AuthzMergeRules Off in the default httpd.conf would be sufficient, but my experience with mod_authz_dbd suggested that I needed to put it in a lot of places to get stuff working the way I intended (e.g., the example config in the mod_authz_dbd manual page in the trunk docs). Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: 2.4 (Was: Re: Configuration Issues to Address [was Re: Dynamic configuration for the hackathon?])
William A. Rowe, Jr. wrote: I've been working with the 2.4 authn/z stuff a bit lately and what I keep tripping over is that the default authorization merge rule uses OR logic. For example, if I enable mod_access_compat and put in a traditional: I wonder if anyone would offer a fastfeather talk next week on wed or thurs - it's only 15 minutes - to introduce what's upcoming in 2.4? I won't be there, but here's a recap of the issue for discussion. (Caveat: I may be missing something important!) With 2.2 and prior versions, one can do something like: <Directory /htdocs> Require valid-user </Directory> <Directory /htdocs/admin> Require user admin </Directory> The logic which is then applied is: 1) For all requests under /htdocs, except those under /htdocs/admin, require any valid user. 2) For all requests under /htdocs/admin, require the admin user. With 2.4, unless I'm missing something, the same configuration produces the logic: 1) For all requests under /htdocs, except those under /htdocs/admin, require any valid user. 2) For all requests under /htdocs/admin, require any valid user OR require the user admin. Of course this grants any valid user access. To get the old behaviour, you seem to need to add AuthzMergeRules Off to the second <Directory> block. I just tested versions of this configuration with 2.2 and 2.4 and I think I'm describing the situation correctly. Assuming I am, I fear this will surprise a lot of people who think they've secured their systems after upgrading. It certainly caught me short. Perhaps the default AuthzMergeRules setting should be Off rather than On, at least when merging across configuration blocks? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: Apache support for form authentication
Graham Leggett wrote: A session is a simple table of key value pairs. mod_session_dbd stores sessions within a SQL database. The session is tracked by a cookie, very similar to a typical Tomcat session. The catch is that you need a database beefy enough to handle the resulting load, which may or may not be a problem for the admin, they can decide. Great idea. What would you think about converting from the session_save hook to a provider-based mechanism? I can imagine that administrators might want/need to use one kind of storage provider (dbd, say) for sessions on high-priority virtual hosts, while using another less reliable storage provider on others. Maybe this is possible with hooks if the storage modules return DECLINED unless they're the one that's most closely configured with a particular request. Still, it might be a little cleaner if the caller just looks up the appropriate provider and calls it directly. Again, great idea, thanks! Once more, with feeling, I'll plug the notion of a generic mod_shmap that implements shared map (key/value pair) storage, using any of various providers for each map: dbd, dbm, shared memory (w/ cyclic buffer), distcache, maybe memcached, etc. It would seem that mod_session would be an ideal potential customer for such a storage provider system, along with mod_proxy (for the stuff it shoehorns into the scoreboard now), mod_ssl, mod_auth_digest, and perhaps even the scoreboard itself. Administrators could set the storage provider for each individually. A default setup might use shmem for mod_proxy, distcache for mod_ssl and mod_auth_digest caches (would be nice to finally enable mod_auth_digest's shared cache code!), and dbd for user sessions. The trade-offs are the usual ones of performance vs. scalability/persistence/reliability. See http://marc.info/?l=apache-httpd-dev&m=120491055413781&w=2 for a proposed ap_shmap.h based on Joe Orton's refactoring of mod_ssl's session cache code. 
I can't be at the conference next week but I'd be willing to hack on either trunk, or perhaps better, a branch, per Paul Querna's suggestion. A way to start might be to take Joe Orton's latest dbm and shmem cyclic buffer providers and try to get mod_session + mod_shmap + {mod_shmap_dbm, mod_shmap_shmem} working as a proof of concept. Thoughts? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules directive
Brad Nicholes wrote: So here was the thinking behind it when AuthzMergeRules was introduced. Maybe there is still a bug here that needs to be addressed. http://mail-archives.apache.org/mod_mbox/httpd-dev/200607.mbox/[EMAIL PROTECTED] http://mail-archives.apache.org/mod_mbox/httpd-dev/200607.mbox/[EMAIL PROTECTED] I'm not sure it's a bug per se, but rather, an unexpected break from the intuition developed by administrators used to configuring 2.2.x and prior versions about how authorization cascades through configuration blocks. I may be wrong about this, but here's how I'd expect the example from your second thread to work. The example is: <Directory /www/pages> Reject ip 127.0.0.1 </Directory> <Directory /www/pages/secure> AuthType Basic ... <SatisfyAll> Require valid-user Reject user joe </SatisfyAll> </Directory> My hunch is that in 2.2 and prior, the fundamental logic when merging authz rules across blocks was neither AND nor OR, but reset. That is, it relies on the configuration walks to find the authz directives in the most closely relevant block. Those directives are then applied; what appeared in less relevant blocks is ignored. I haven't looked closely at the logic, but that's what seems to happen if you've got something like: <Directory /htdocs> Require valid-user </Directory> <Directory /htdocs/admin> Require user admin </Directory> So in your example, my configuration intuition suggests that only what appears in the <Directory /www/pages/secure> block applies to anything under /www/pages/secure, and that the Reject ip 127.0.0.1 would just not be applied at all to these URIs. So I'd expect that to access /www/pages/secure I'd need to be any valid user except joe; whether I was connecting from 127.0.0.1 wouldn't matter. Now, within a block, having OR be the default merge logic would seem to make sense; if you want AND, you need SatisfyAll. 
So: <Directory /www/pages/secure> Require user betty Require user joe </Directory> means betty and joe alone have access, regardless of what applied to /www/pages. But if the following is also configured, then a reset rule across blocks would mean that it does what one expects; betty and joe would be rejected along with everyone else when attempting to access URIs under /www/pages/secure/really. <Directory /www/pages/secure/really> Require all denied </Directory> This would presumably work identically: <Directory /www/pages/secure/really> Reject all granted </Directory> Anyway, that's a bit of a shot in the dark. Hope it helps. Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: Problems with mod_dbd and prepared statements
Graham Leggett wrote: Are there any special requirements necessary before mod_dbd will successfully register a prepared statement? You have to register statements to be prepared prior to the post_config phase ... that's the main one I can think of. Does that help? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: svn commit: r645395 - /httpd/httpd/trunk/server/request.c
[EMAIL PROTECTED] wrote: Make it compile on pre-C99 compilers (Move decls to beginning of func) Joe Orton wrote: The ap_clear_auth_internal() definition doesn't match the way it's called. Thanks, both. Joe's checks in the past have taught me to use -Wall religiously but apparently this time I did my final post-debugging tests on a machine with an insufficiently recent gcc (but not sufficiently old to catch the pre-C99 error). Sorry about that; thanks again, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: Apache support for form authentication
Graham Leggett wrote: The session modules are all designed to go exclusively inside Directory and Location sections, which allow you to precisely target which URL space your session is valid for, and this can be repeated as many times as you like creating as many individual sessions as you like. William A. Rowe, Jr. wrote: Hmmm... why can't I use a session across an entire vhost? That was my first thought as well: If you visit this host, we'll try to give you a session cookie. Of course it could go into a <Location /> but it's a little cleaner and more obvious if I can just use it directly in the RSRC_CONF context as well. Identify areas of the server where functionality is being reinvented over and over, and then create key modules, like mod_shmap and mod_session to replace the reinvented functionality. Agreed, and I suspect most folks are likely to be on board with that overall aim; of course, the question becomes one of getting agreement on which areas are in fact examples of duplicated logic, and of those which deserve to remain independent implementations for reasons of performance/clarity/etc. and which should be refactored. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: Problems with mod_dbd and prepared statements
Graham Leggett wrote: Is that prior to or up to? If it's up to, it would help me a lot... So long as you ensure you run before mod_dbd's hook in the post_config phase, you should be able to register statements to be prepared (I think). Once mod_dbd's post_config hook runs, it has an internal set of hashes that describe the query label-to-SQL mappings. So if you put mod_dbd in aszSucc when registering your post_config hook, I would think that should work. Then, whenever a DB connection is opened (some will be opened at child_init time, if DBDMin > 0, and others later as demand dictates) these SQL statements will be prepared against the DB connection. It's worth noting that they're not prepared during the config phases, since no DB connections yet exist; connections are only made from within child processes. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules directive
Brad Nicholes wrote: <Directory /www/pages> Reject ip 127.0.0.1 // Or any other Require directive </Directory> <Directory /www/pages/whatever> ... </Directory> Since the /www/pages/whatever directory did not specify any authz, what should happen? If the AuthzMergeRules is OFF then what is the authz for /www/pages/whatever? I'm not sure that 'all granted' is correct but then neither is 'all denied'. Since the AuthzMergeRules is OFF then merging the authz would be counter intuitive. If AuthzMergeRules is ON then 127.0.0.1 is rejected and all others are allowed. I guess the thinking was that leaving the default ON would leave fewer unknowns however in some instances may not be as intuitive. Well, again referring to my intuition based on configuring 2.2 and prior servers, I would expect the /www/pages authz to cover everything under /www/pages (including /www/pages/whatever) unless I define other authz rules -- at which point, the corresponding authz slate is wiped clean for that subdirectory, and my local authz directives take effect. (Not all authz directives, mind you, just those which I'm overriding.) So I can do something like: <Directory /www/pages> Require valid-user </Directory> <Directory /www/pages/images> ## still protected Dav filesystem </Directory> <Directory /www/pages/private> Require user admin </Directory> One key to this behaviour, IIRC, is that all of the stock authz modules in 2.2 use the default merge_dir_config rules; that is, none of them define their own merge function and all just say: NULL, /* dir merger --- default is to override */ Then the ap_merge_per_dir_configs() logic which gets applied when merging their per-directory configurations is to cause the ancestor's configuration to be inherited, unless there's a new per-directory configuration for the same module. 
The Require directive is, again IIRC, handled in the core per-directory configuration, and the logic in merge_core_dir_configs() is similar: duplicate the ancestor's configuration, and then override the inherited Require directives if there's a Require directive in the new per-directory configuration: if (new->ap_requires) { conf->ap_requires = new->ap_requires; } So I think the result is that each authz directive gets inherited down through the directory configurations, until something overrides it, at which point everything to do with that directive is started fresh. It's a relatively space-efficient configuration style, actually, because you only need to put in an authz directive if you're changing something from what's inherited. I hope I've described the existing situation accurately; at any rate, it seems to me to be a good way to structure the 2.4 authz merge rules. It would mean you mostly didn't need to specify AuthzMergeRules (saves typing and errors) and things are protected down through the directory hierarchy by default (also good) unless you specifically include a new authz directive, at which point that takes effect and is inherited downwards. So the default merge logic would be, I guess, neither AND nor OR but inherit-until-overridden. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules directive
Brad Nicholes wrote: Your assumptions about how the 2.2 per-dir merging works are correct. Unfortunately the same concepts no longer apply to 2.4. The reason why is this: <Directory /www/pages> <SatisfyAll> Require ip 10.10.0.1 Require ldap-group sales <SatisfyOne> Require ldap-group ne-sales Require ldap-group sw-sales </SatisfyOne> </SatisfyAll> </Directory> <Directory /www/pages/images> ## still protected Dav filesystem </Directory> <Directory /www/pages/private> Require ldap-group marketing </Directory> Which ldap-group is overridden vs merged? Since the 2.2 authz had no concept of logic and was simply a list of require statements, it was very easy to add to the list or override an entry in the list. This is no longer the case. The require statements have to be merged according to some type of logic rules. Your suggestion would work if /www/pages/private simply reset and applied only the require statements it found in its own directory block (ie. AuthzMergeRules OFF), but picking the inherited logic apart and trying to rework it, won't really work. Ideally, perhaps, it would be possible to throw a configuration-time error indicating that there was ambiguity here. However, since it's going to be effectively impossible to pre-determine all the possible Directory, Location, etc. blocks which might apply to a particular request (and in which order), that's not really possible. As you say, picking things apart isn't going to work either. One alternative is, as we've discussed, a default AuthzMergeRules Off state which would blank the slate when a per-dir config block contained any of Require/Reject/SatisfyAll/SatisfyOne. What would happen, though, if the default merge rule between per-dir configuration blocks was AND? (Within a block, OR would presumably still make sense.) 
I have a head cold today, and so may be making even less sense than usual, but I think this would at least ensure that people's configurations got gradually tighter and more restrictive as blocks were merged together down through the location and directory hierarchies. It might take an administrator a while after upgrading to figure out why they couldn't access something anymore, but at least they wouldn't have opened anything up unexpectedly. BTW, one of the main reasons why the logic operators were added to authz was to solve the problem that existed in 2.2. The problem was that if multiple require statements appeared in a single directory block or in a hierarchy, you never really knew in which order they were actually applied. Basically it came down to the order in which the authz modules happen to have been loaded. And there's no easy way, I think, to turn off authz completely, à la Require all granted ... the 2.2 docs refer you to using Satisfy All and Allow from all. 2.4 is much cleaner, no doubt. My concern just really comes down to the fact that I'm fairly sure that a default OR merge rule between per-dir configuration blocks will lead people into thinking they've secured a subdirectory, when in fact they haven't, because their more-secure rule will be silently short-circuited by less-secure authz rules from above. Here's another thought: for people doing mass virtual hosting, and who let their customers put authn/z directives into .htaccess files with AllowOverride AuthConfig, I would think it may be important to ensure that these rules still merge together in the way they used to. Otherwise upgrading to 2.4 might mean tracking down every .htaccess file and rewriting it to do the right thing (sticking in an AuthzMergeRules Off or something). For some people doing vhosting I suspect that would be a tall order, so it would be good if 2.4 would function securely in this situation, by default. 
That said, I don't use .htaccess files and may not be making any sense today; my apologies. Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules directive
Chris Darroch wrote: Here's another thought: for people doing mass virtual hosting, and who let their customers put authn/z directives into .htaccess files with AllowOverride AuthConfig, I would think it may be important to ensure that these rules still merge together in the way they used to. Otherwise upgrading to 2.4 might mean tracking down every .htaccess file and rewriting it to do the right thing (sticking in an AuthzMergeRules Off or something). For some people doing vhosting I suspect that would be a tall order, so it would be good if 2.4 would function securely in this situation, by default. That said, I don't use .htaccess files and may not be making any sense today; my apologies. Here's a follow-up notion; admittedly, it represents a lot of re-refactoring work. It would provide a secure upgrade path for people with complex configurations, including those with many legacy .htaccess files to consider. A new directive, Accept, is introduced to take the place of Require. It functions as Require does now in 2.4. Thus we have two groups of authz directives, old (Require/Satisfy/Order/Deny/Allow) and new (Accept/Reject/SatisfyOne/SatisfyAll). The old directives function as they did in 2.2. Authz directives would be parsed and merged as follows: 1) Within a per-dir config block (Location/Directory/etc.) old and new authz directives may not be mixed. If directives from both groups appear, a config-time error is thrown. 2) When merging new authz directives within a per-dir config block, the default merge rule is OR, as in 2.4 at present. This is equivalent to using a SatisfyOne around all new authz directives within a per-dir config block. 3) When merging per-dir config blocks at runtime, the following rules are applied; we'll call the parent block base and the child block new: 3.1) If the new block contains no authz directives, the base's authz configuration is inherited (if any). This follows current 2.2 behaviour. 
3.2) If the new block contains old authz directives, the base block's authz configuration is discarded, and the new block's authz directives are applied to a clean slate. This follows current 2.2 behaviour.

3.3) If the new block contains new authz directives, the base and new blocks' authz configurations are merged using the rule specified by AuthzMergeRules (as it appears within the new block):

3.3.1) If AuthzMergeRules is set to Off or is not defined, the base block's authz configuration is discarded, and the new block's authz directives are applied to a clean slate. This follows current 2.2 behaviour, to avoid confusion and simplify most configurations.

3.3.2) If AuthzMergeRules is set to Or or SatisfyOne, the base block's authz configuration is merged with the new block's as if they were collectively contained within a SatisfyOne block.

3.3.3) If AuthzMergeRules is set to And or SatisfyAll, the base block's authz configuration is merged with the new block's as if they were collectively contained within a SatisfyAll block.

Writing that all out, it mostly just seems like a depressingly large amount of work, but otherwise it feels like it might offer a way forward, both for people upgrading from 2.2 and those starting fresh with 2.4. Thoughts? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
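To make the proposed merge rules concrete, here's a hypothetical sketch (the directory paths and group names are invented, and Accept/AuthzMergeRules are the directives proposed above, not existing httpd syntax):

```apacheconf
<Directory "/docs">
    # New-style authz: anyone in group staff.
    Accept group staff
</Directory>

<Directory "/docs/reports">
    # Rule 3.3.3: merge with the base block as if inside a SatisfyAll,
    # so access requires membership in both staff and reports.
    AuthzMergeRules SatisfyAll
    Accept group reports
</Directory>

<Directory "/docs/archive">
    # Rule 3.3.1: no AuthzMergeRules, so the base authz is discarded
    # and only this block's directives apply (2.2 behaviour).
    Accept user archivist
</Directory>
```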
Re: svn commit: r646445 - in /httpd/httpd/trunk/modules/aaa: mod_auth_basic.c mod_auth_digest.c
Graham Leggett wrote: Let each consumer of authn providers redefine the list_provider_names callback in case they are loaded individually without mod_authn_core. Can you check if mod_auth_form also needs this? Yes, it does -- looks like you're doing your own ap_lookup_provider() calls on the authn provider group. I can put it in for you if you like. I'd propose also taking out the AP_MODULE_MAGIC_AT_LEAST(20080403,1) check because in trunk you can just call the new function; if you're backporting to 2.2 before the walk cache stuff gets backported, just do it the old way (and don't define the list_provider_names callback). I'll let the walk cache stuff settle a little more but I hope to propose it for backporting soon (see the relevant patch set under http://people.apache.org/~chrisd/patches/walk_cache/). It should be backwards-compatible and provides a performance boost, especially for mod_dav when running behind authn/z (which it almost always is). Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: svn commit: r646445 - in /httpd/httpd/trunk/modules/aaa: mod_auth_basic.c mod_auth_digest.c
Ruediger Pluem wrote: Let's hope that we never decide to change anything in authn_ap_list_provider_names. Otherwise I bet that we run into inconsistencies which might be hard to track and lead to funny bug reports. So I feel -0.5 on this. Do you mean changing the function definition, or the provider group/version whose providers it returns? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: svn commit: r646445 - in /httpd/httpd/trunk/modules/aaa: mod_auth_basic.c mod_auth_digest.c
Ruediger Pluem wrote: The implementation of the function. IMHO it must be the same in all modules. Otherwise it depends on the module load order what gets called and done.

True -- an alternative might be to do the following:

- #define AUTHN/Z_PROVIDER_VERSION 0 in mod_auth.h
- change all the authn/z modules to use #defines in place of hard-coded 0 provider versions (maybe not a bad thing to do anyway)
- #include mod_auth.h in request.c (mostly involves lots of fiddling with config.m4 files and the like, for each platform)
- eliminate the optional function call entirely

I worked on this approach but concluded people might not appreciate all the changes to the config files and so forth, although the actual code per se is a little simpler and cleaner. If this is preferred by a majority, though, I'm happy to implement it this way instead. Thoughts? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: svn commit: r646445 - in /httpd/httpd/trunk/modules/aaa: mod_auth_basic.c mod_auth_digest.c
Ruediger Pluem wrote: So we are using the optional function only because we do not have the defines for AUTHN_PROVIDER_GROUP and the version number around in request.c. If they would be around it would be quite fine to call ap_list_provider_names directly from there, correct? Yes, and in fact, there's no need for a function call at all, you can just look up the provider list directly in that case. Much cleaner, except for all the config magic changes (as discussed by others); I can more or less guarantee that I'll break the Windows and/or the Netware builds trying to get mod_auth.h to be included. :-) But a +1 from me on the idea, if others are willing to do the repair work afterwards for those platforms. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: [VOTE] move all mod_*.h with public APIs to ./include folder
Guenter Knauf wrote: in order to simplify future configuration, and most important to have same include path structure with both in-tree and installed compilations I think it makes sense to move all mod_*.h headers with public APIs to the common ./include folder. +1 since it simplifies my life. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules directive
Brad Nicholes wrote: I'm not real excited about adding a new authz directive. Authn and authz are already very complex and adding a new directive to the mix will just help to confuse people even more.

That's a good point. Mostly the idea of an Accept replacement for Require came up as a way to distinguish pre-2.4 and 2.4 per-dir authz, and in case there were any Require foo directives which had slightly different meanings in the two contexts and which might therefore trip people up. If we can do without it, all the better.

I am OK with this one except for the reason that I mentioned before. Allowing authz to be inherited even when AuthzMergeRules is OFF is kind of a conflict. In other words, since AuthzMergeRules OFF implies an AND, 1 AND 0 should be 0 or no authz rather than inherited authz. However I could buy into this if it seems to make more intuitive sense to the user.

Well, I may be missing something, but what I envisioned was that AuthzMergeRules had three options: Off (i.e., inherited until overridden, the pre-2.4 default), SatisfyOne, and SatisfyAll. That would give administrators full control over how they wanted authz in different per-dir blocks to be merged. It seems to me we have three basic possibilities when it comes to merging authz across per-dir blocks, and the most common authz case to consider is going to be where security gets tighter as you move down the document tree. Imagine something like the following in a pre-2.4 configuration, where admin is not a member of team:

    <Directory /htdocs>
        ## full access
    </Directory>

    <Directory /htdocs/team>
        ## anyone in team has access
        Require group team
    </Directory>

    <Directory /htdocs/team/admin>
        ## only admin user has access
        Require user admin
    </Directory>

1) The first option for 2.4 merging is to use OR logic (the current default in trunk). This leads to anyone in team having access to /htdocs/team/admin, I believe.
I think I'd have to vote -1 against this, because it will lead to lots of previously secure configurations becoming insecure. Plus it would seem to increase the required number of directives (since you have to add AuthzMergeRules Off in each sub-Directory) to achieve what seems to me to be a typical configuration, i.e., increasing security as you go down the tree.

2) Another option might be to use AND logic. In this case, if I'm applying the logic correctly, no one would be able to access /htdocs/team/admin given the configuration above in a 2.4 context (since admin isn't in team). While more secure, this also seems counter-intuitive to me. Maybe -0.5?

3) Finally there's the pre-2.4 logic of overriding all previous authz. This would seem to be the most preferable option, since it ensures many pre-2.4 configurations will continue to work as expected, and I think also reduces configuration verbosity in typical cases.

You were concerned, I think, about more complex configurations like this:

    <Directory /www/pages>
        <SatisfyAll>
            Require ip 10.10.0.1
            Require ldap-group sales
            <SatisfyOne>
                Require ldap-group ne-sales
                Require ldap-group sw-sales
            </SatisfyOne>
        </SatisfyAll>
    </Directory>

    <Directory /www/pages/private>
        Require ldap-group marketing
    </Directory>

I would suggest that the default pre-2.4 logic of overriding previous authz when any authz is defined in a per-dir block is still reasonable as a default. Thus, only those in marketing have access to /www/pages/private, and they can access it from other addresses than 10.10.0.1. Even if this isn't what is desired, it's clear enough that an administrator can figure out what's going on and why the configuration isn't achieving the desired result. I'd propose giving the administrator the choice of both alternatives to the default logic. Instead of simply offering AuthzMergeRules On, there would be two alternatives to the default Off condition.
These would be AuthzMergeRules SatisfyOne (the OR logic) and AuthzMergeRules SatisfyAll (the AND logic). We might offer Or and And as synonyms for SatisfyOne and SatisfyAll, respectively. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
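As a sketch of how those alternatives would read in practice (paths and the auditors group are invented; AuthzMergeRules is the proposed directive, not current httpd syntax):

```apacheconf
<Directory /htdocs/team>
    Require group team
</Directory>

<Directory /htdocs/team/admin>
    # Default Off: the parent's authz is overridden, so only admin
    # has access -- the pre-2.4 behaviour.
    Require user admin
</Directory>

<Directory /htdocs/team/reports>
    # SatisfyAll (the AND logic): a user must be in team *and* in
    # auditors to gain access.
    AuthzMergeRules SatisfyAll
    Require group auditors
</Directory>
```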
Re: AuthzMergeRules directive
Brad Nicholes wrote: This is where it starts to go wrong for me. Where it gets confusing for somebody who is trying to figure out what the configuration is doing is:

    <Directory /www/pages>
        <SatisfyAll>
            Require ip 10.10.0.1
            Require ldap-group sales
            <SatisfyOne>
                Require ldap-group ne-sales
                Require ldap-group sw-sales
            </SatisfyOne>
        </SatisfyAll>
    </Directory>

    <Directory /www/pages/private>
        AuthzMergeRules SatisfyOne
        <SatisfyAll>
            Require ldap-group marketing
            Require ldap-group alt-marketing
        </SatisfyAll>
    </Directory>

Now I have to reconcile the logic of the parent with the logic of both the AuthzMergeRules and the SatisfyAll tag. Even though it might not always look like the cleanest configuration, I think it will be less confusing if the logic rules were confined to the SatisfyAll and SatisfyOne tags rather than introducing alternate logic directives.

I take your point about complexity, but on the other hand, these aren't alternate logic directives -- they're additional. The logic specified above for /www/pages/private would be:

    (   Require ip 10.10.0.1
     && Require ldap-group sales
     && (Require ldap-group ne-sales || Require ldap-group sw-sales))
    ||                                <-- AuthzMergeRules SatisfyOne
    (   Require ldap-group marketing
     && Require ldap-group alt-marketing)

At least with AuthzMergeRules being ON or OFF, the only thing I need to know is if I am merging with the parent or not. All of the logic rules just flow from there.

Granted, the above example is complex and maybe non-intuitive, but most people aren't going to attempt or need such a large set of authz directives. For those who really do, there are surely going to be cases where AND'ing a block is useful while OR'ing is not at all (since OR'ing tends to short-circuit the configurations deeper down the document tree, which seems unlikely to be what people will mostly want). If you'd like to stick to just Off (my proposed default for AuthzMergeRules) and On, perhaps AND should be the logic implemented by On?
Consider the following, where AND'ing helps tighten security as you go down the tree:

    <Directory /www/private>
        Require ip 10.10.0.1
    </Directory>

    <Directory /www/private/sales>
        ## must come from 10.10.0.1
        AuthzMergeRules SatisfyAll
        <SatisfyAll>
            Require ldap-group sales
            <SatisfyOne>
                Require ldap-group ne-sales
                Require ldap-group sw-sales
            </SatisfyOne>
        </SatisfyAll>
    </Directory>

    <Directory /www/private/sales/admin>
        ## must come from 10.10.0.1, belong to sales, and also
        ## belong to one of ne-sales or sw-sales
        AuthzMergeRules SatisfyAll
        <SatisfyAll>
            Require ldap-group admin
            Require ldap-group sales-admin
        </SatisfyAll>
    </Directory>

Perhaps others have opinions on this stuff? Basically my principal concern is that the default AuthzMergeRules setting must be Off. Beyond that, I can live with any other choices. Personally, I'm gradually coming around to the feeling that AND is more useful/secure than OR when merging per-dir blocks, and possibly even within a single per-dir block (although that's another conversation), and so should either be an option to AuthzMergeRules or the action implemented by On if there are only two states. The reason I say it might make sense to AND authz requirements within a block is that it reads a little more naturally. Consider the following, which suggests to me that I need a shirt and shoes to be served, not one or the other:

    <Directory /www/service>
        Require shirt on
        Require shoes on
    </Directory>

At any rate, thanks for hashing through all my scattershot ideas on this stuff. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules directive
Brad Nicholes wrote: I could go along with switching the default merging rule from OR to AND, even within a dir block. The reason why it is OR today was basically for backward compatibility. Since there really wasn't any kind of logic before, OR was just the default. If we switch to AND as being the default within a dir block, it may break some existing configurations. However I also think that AND is a safer merging rule going forward. If we just switch the AuthzMergeRules default state to Off, and make On mean AND, that would be a great start. Then maybe we can revisit and take a look at what breaks if the within-block merging is AND too; if it breaks too much stuff, maybe it needs to stay OR. Right now I can't say I have an informed opinion on that. Thanks again for working through this stuff, Chris.
socache/shmap ZooKeeper provider
Hi -- I wanted to get a little experience with the socache (small object cache) providers which Joe Orton recently refactored out of mod_ssl, and so I've written a pair of modules, mod_shmap and mod_socache_zookeeper, which are available here: http://people.apache.org/~chrisd/projects/shared_map/

The mod_shmap module allows HTTP clients to access any configured socache provider's storage using GET, PUT, and DELETE requests; the URI path is used as the ID key. This might be a useful way for clients without a native API to access these providers. My immediate purpose, though, was to write a module that served as a test harness for the various providers and an example of how to use them without making people slog through mod_ssl.

The mod_socache_zookeeper module is an experimental socache provider that uses ZooKeeper as its data store.[1][2] ZooKeeper is a distributed, highly reliable coordination service similar to Google's Chubby lock service.[3] Like Chubby, it appears to implement the Paxos consensus algorithm.[4][5][6] (ZooKeeper's documentation and code comments are a little sketchy in this regard, however.) A key caveat about mod_socache_zookeeper is that it simply ignores expiry times at the moment. An enhancement would be to have a background thread that periodically culled expired nodes.

Here's an example configuration that maps all HTTP requests to ZooKeeper, except for those under /shm, which go to a shared-memory cyclic buffer cache:

    SharedMapProvider zookeeper zk1.example.com:7000
    <Location /shm>
        SharedMapProvider shmcb /tmp/shm
    </Location>
    SetHandler shmap

Based on the experience of writing these modules, I have a few thoughts and notes for discussion, in no particular order. I confess I still find socache sits oddly with me as a name, both because of its similarity to mod_so and .so files, and because I continue to doubt everyone will treat these providers as always implementing caches only.
It's true that some providers will always impose data size limits, but that could be something the caller can interrogate and reject or require as necessary. It would also be valuable, I think, to disambiguate these providers from the different functionality of mod_cache and its related modules. So I'd again suggest modules/shmap (for "shared map") as a possible location and naming scheme. Another very minor naming issue is that AP_SOCACHE_FLAG_NOTMPSAFE reads as "no temp safe" on first glance; perhaps NOT_MP_SAFE or NOT_ATOMIC would be more readable?

I ran into three particular inconsistencies when coding which I think could be addressed quickly:

a) The store() call should take an apr_pool_t argument like retrieve() and delete() for temporary allocations.

b) The delete() call should return an apr_status_t like the other two, since complex providers may fail here.

c) All providers should always return APR_NOTFOUND from retrieve() and delete() when data is not found. Currently at least the shmcb provider returns APR_EGENERAL in this case, which makes it impossible to distinguish the not-found case from serious errors.

Another minor problem is that many of the error messages in the providers betray their mod_ssl origins, such as "SSLSessionCache: Failed to Create Server" and so forth.

Following my instinct that some users may not care about the caching/expiry side of things, I think allowing expiry = 0 to mean "retain as long as possible" would be useful.

The namespace and hints arguments to the init() call are somewhat underused and also rather specific to the existing providers. I briefly thought I might be able to pass reslist min/max values in the hints but the ap_socache_hints structure isn't the right place; instead they'd need to be packed into the single string argument passed to create(). At the moment the memcached provider just hard-codes these values; so does my ZooKeeper provider.
I wonder if there's a way to open this up a little and make per-provider-instance configuration easier, but I don't have a specific idea here. In a related vein, I think a naive user is going to find invoking create() and init() at the right times a little tricky. The create() calls usually create an instance structure and parse arguments, but don't otherwise initialize. In order for their messages to make it to the console at startup time, one needs to invoke create() in the check_config phase, not post_config, since by then stderr is redirected to the logs. In mod_ssl, create() is called with s->process->pool, which means that if a provider is later unloaded, the structures it allocates in create() remain around forever. Global mutexes are also created out of s->process->pool, and similarly remain around indefinitely. In mod_shmap I use pconf exclusively to try to iron out these issues. Meanwhile, init() should ideally be called only during the second and subsequent configuration pass, so you need to do some magic with userdata in
Re: AuthzMergeRules directive
Brad Nicholes wrote: So what I am really trying to say is that intra-block logic and inter-block logic, as far as merging goes, are tied together. If we want to change the way that the logic of two blocks is merged, we would also have to change the base state of each independent block. It's all or nothing. This would affect how the following block is evaluated:

    <Directory /foo>
        require user joe
        require user sue
    </Directory>

As it currently stands, the logic when combining these two rules would be OR. If we make the change, this would also change the same configuration to use AND instead. I think we determined that this logic would be more secure anyway even if it is different than 2.2.x.

Well, I suppose the absolutely key thing is to set the default AuthzMergeRules state to Off. Next up, I guess, try changing the On merge state to AND and we'll do some thinking and testing at that point. It does look to me like the pre-2.4 intra-block logic was OR, but I don't know how widely people would have depended on that (as in your example above). My gut instinct is that we'll wind up having to replicate OR within blocks, but implement AND between blocks, to achieve both backwards-compatibility and good security. As a first step, though, I'd suggest just making the two changes to the existing defaults (AuthzMergeRules Off as default, and On = AND) and we'll review again at that point.

For my part I hope to deal with a parallel, unrelated, and hopefully uncontroversial set of authn/z edits once SVN returns, namely changing all the hard-coded "0" provider versions to use AUTHN/Z_PROVIDER_VERSION macros. That's a simple first step toward the work outlined in this thread: http://mail-archives.apache.org/mod_mbox/httpd-dev/200804.mbox/[EMAIL PROTECTED] Thanks again, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
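The "OR within blocks, AND between blocks" combination suggested above might look like this (illustrative only, using the proposed two-state AuthzMergeRules with On = AND; paths, users, and the staff group are invented):

```apacheconf
<Directory /foo>
    # Intra-block logic stays OR for 2.2 compatibility:
    # either joe or sue may enter.
    require user joe
    require user sue
</Directory>

<Directory /foo/bar>
    # Inter-block logic is AND when merging is on: a user must
    # satisfy the parent block (be joe or sue) *and* this one.
    AuthzMergeRules On
    Require group staff
</Directory>
```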
Re: svn commit: r654797 - in /httpd/httpd/branches/2.2.x: CHANGES STATUS docs/manual/mod/mod_headers.xml modules/metadata/mod_headers.c
Jim Jagielski wrote: Add in r568323 and 568879. The approved patch lacked updates to the doccos and so really shouldn't have been approved as is, but what the heck, so I went ahead and pulled the doccos changes from the orig commit anyway. Also, since this is a userland change, it should really have a CHANGES entry too, so I added one of those as well :) Thanks -- apologies for not doing this too well. I'd caught r568879 and put that into the proposed 2.2.x patch, but missed the necessary CHANGES and doc alterations. Oops, and thanks for catching these and the CHANGES for mod_authn_dbd. I'll try to be more attentive in the future. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules directive
Brad Nicholes wrote: I finally got around to making the switch so that the default merge rule is AND rather than OR. However after making the switch, it occurred to me that since the default rule is AND now, the AuthzMergeRules default should remain ON. Otherwise the rule inheritance won't happen by default, which would leave authentication holes in sub-directories. I think with the default being changed to AND, authz should be behaving as discussed on this thread.

Thanks, much appreciated. I'll try to set up some tests and take a look as soon as I can. It's been a while since I thought about this stuff, but I think the reason I was keen to make the AuthzMergeRules default Off was that it more closely emulated the pre-2.4 behaviour, so that people wouldn't be surprised to discover their 2.2 configurations weren't working as expected. Combined with a default OR rule, that might have led, I thought, to unanticipated security holes -- users given access to a subdir with its own authz config because the OR with the parent dir short-circuited the subdir's authz. Using AND as the default rule will at a minimum close off that problem.

My hunch (absent any testing; sorry!) is that a default AND will mean such subdirs are sometimes just not available after an upgrade to 2.4 (assuming no authz config changes are made by someone who doesn't read the release notes) because users won't have access to both the parent dir and the subdir. In 2.2, they'd be authorized against just the subdir's config; here, they'll need to pass the parent's too. (I think.) Anyway, I'll try to work in some testing in the next week or two. Regardless, this change closes a big security problem for quick-and-dirty upgraders, I believe. Thanks again, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: svn commit: r705361 - in /httpd/httpd/trunk/modules/aaa: mod_authz_dbd.c mod_authz_dbm.c mod_authz_groupfile.c mod_authz_owner.c mod_authz_user.c
Eric Covener wrote: Authorization in LDAP has a special path for when authentication wasn't handled by mod_authnz_ldap, but r->user still may be mappable to a DN on the LDAP server. Net, it can't do anything useful without r->user. This short-circuit should be possible well before the problematic functions you mention.

OK, I'll look at adding the same !r->user test, unless others think it's unnecessary. It looks like it might be necessary to me, but I'm not 100% sure. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules blocks everything in default configuration
Dan Poirier wrote: I like the idea of replacing ON with AND and OR. It would not only provide more control, but make it explicit what kind of merging was going to happen. I have mixed thoughts about changing the default to OFF. Cons: That would mean every container directive would have to specify some sort of access control (or at least AuthzMergeRules AND) or it'd be wide open, right?

I don't think so; at least, that's not what I was intending. Rather, something much like 2.2's behaviour: containers that don't specify any authz are simply protected by the nearest container merged ahead of them that does specify authz.

I'm hoping to put this thread to bed shortly with the patches available here: http://people.apache.org/~chrisd/patches/httpd_authnz_configs/ My intent is to finish up the necessary documentation changes and get everything committed to trunk in the next few days. (Fingers crossed!) In the meantime, an overview follows. Many, many thanks are due to Brad Nicholes, whose massive refactoring of the authn/z system makes all of this work possible.

1) Limit and LimitExcept are made nestable, with an error in the case where all methods are configured out. There are also some tuneups related to Limit/LimitExcept being intended to contain authz configurations only and to not be functional outside Directory/Location/etc.

2) A setting of AuthType None is allowed, which sets ap_auth_type() to NULL and thus provides a way to turn off authentication for a sub-directory. This corresponds to several convenient ways in 2.4 to turn off authorization, including Require all granted (and, at a deeper level, the new SatisfySections Off).

3) The mod_authz_core.c module is rewritten to attempt to deal with the issues discussed on this thread and the previous one, as well as those described at the end of this email. The authz_provider_list two-pronged linked lists are replaced by a tree structure that mirrors what is configured via SatisfyAll and SatisfyAny.
A pair of negative authz containers are introduced, SatisfyNotAll and SatisfyNotAny, which negate their operands in the same manner as Reject. Thus we have the following table:

    Require          A
    Reject           !A
    SatisfyAll       (A && B && ...)
    SatisfyAny       (A || B || ...)
    SatisfyNotAll    !(A && B && ...)
    SatisfyNotAny    !(A || B || ...)

The SatisfyAny directive is renamed from SatisfyOne so as not to imply XOR-like functionality (requiring exactly one successful operand). A number of configuration-time checks are implemented to warn administrators regarding redundant or non-functional authz configurations. In particular, since the negative authz directives cannot contribute meaningfully to OR-like blocks, as they can only supply neutral (AUTHZ_NEUTRAL) or false (AUTHZ_DENIED) values, they are simply not allowed in these containers. (The code should support them, though, if this check is ever removed.) Similarly, AND-like blocks with only negative authz directives also produce a configuration-time error.

The MergeAuthzRules directive is renamed SatisfySections and takes three possible values: Off, All, and Any. The default is Off, meaning that as directory configuration sections are merged, new authz configurations replace previously merged ones. However, a directory section may specify SatisfySections All to force its predecessor's authz to be successful as well as its own. The SatisfySections Any option permits either the predecessor's or the current section's authz to grant the user access. Note that the setting of SatisfySections continues to be local only to the directory section it appears in; it is not inherited by subsequent sections as they are merged.

The default setting of SatisfySections is Off, corresponding to traditional pre-2.4 authz logic. Within a directory section, the default logic corresponds to an AND-like block (i.e., SatisfyAll), which differs from the pre-2.4 logic whereby the first Require statement to succeed authorized the request.
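A sketch of the renamed directive in use (paths and groups invented; SatisfySections as described in this overview, not released httpd syntax):

```apacheconf
<Directory /www>
    Require ip 10.10.0.1
</Directory>

<Directory /www/priv>
    # All: the merged predecessor's authz must also succeed, so a
    # user needs both the address check and the group membership.
    SatisfySections All
    Require ldap-group staff
</Directory>

<Directory /www/priv/open>
    # Default Off: the predecessor's authz is replaced entirely.
    Require all granted
</Directory>
```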
Legacy 2.2 configurations should, I hope, work with few or no changes as a result of these revisions. Few administrators, I hope, have configurations with multiple Require directives in a section; e.g.:

    <Directory /foo>
        Require group shirt
        Require group shoes
    </Directory>

If they do, these would need to be revised to either place all the items in a single Require directive (e.g., Require group shirt shoes) or to use a SatisfyAny section. I feel this makes the overall intent of the configuration directives clearer, since it is not apparent that the example above grants access to members of either group, not just those who are members of both. It also means that the following 2.4-style configuration makes intuitive sense, because the negative Reject directive
Re: AuthzMergeRules blocks everything in default configuration
Eric Covener wrote: I tend to prefer something closer to the old name, especially with the Satisfy containers being optional. IOTW the sections here may not be something explicit the user can look back to.

(Maybe IOW, in other words? :-) They should be: since Directory and friends all function as SatisfyAll, everything is always a section of some type. I could live with SatisfyOnMerge or something similar, but I'd like Satisfy in there somewhere to clarify the relationship with both the legacy Satisfy and the new Satisfy* blocks.

    <Directory /foo>
        Require group shirt
        Require group shoes
    </Directory>

More common migration breakage might be using two different authz providers there.

Yes, and in that case, adding a SatisfyAny would be needed. Or, changing the default, which is simple. I'm inclined to commit as written, and if it looks like this is a sticking point, change it. That should be trivial, since SatisfySections Any creates exactly the same thing (during a per-dir merge), so the code paths are all tested already.

When do we need the negative satisfy containers? I'm having a hard time with this one, especially with how negating Require/Reject are interpreted inside -- it almost seems like the containers should be dictating the verbs and you should just have some more agnostic match-type operator with the authz providers.

I suppose the most likely case might be something like:

    <Directory /humans_and_friendlies_only>
        Require valid-user
        <SatisfyNotAll>
            Require group alien
            <SatisfyAny>
                Require group hostile
                Require group neutral
                Require group noninterventionist
            </SatisfyAny>
        </SatisfyNotAll>
    </Directory>

so aliens are OK, unless they're in one or more of a set of other groups. Presumably you can't just say Reject group alien at the top level because some aliens are OK, nor can you say Reject group hostile at the top level, because some hostiles are human. Probably makes the most sense with complex LDAP setups where people belong to lots of overlapping categories.
I think negated sections are pretty clear and have some utility; at a minimum, providing the ability to negate a single match (i.e., Reject) but no equivalent negation operator at a higher level makes for some odd constructions which can be more clearly expressed with higher-level negation. It's the equivalent of negated parenthetical constructions in logical expressions, like !(A && B). As you note, it's a little counter-intuitive to be using Require/Reject in these locations -- something like Match and NotMatch might make everything clearer. I'm happy to consider such a thing; I think I'll get the commit in first and then we can adjust further. The main reason to keep Require, of course, is backward compatibility. But we could set that up to work like Match, say, and throw an error unless it's used in a naked context (outside of any Satisfy*). That would be easy, and then for configurations with Satisfy* blocks, one would use Match/NotMatch inside of them. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
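For what it's worth, the hypothetical Match/NotMatch spelling discussed in the message above might look like this (purely illustrative -- neither directive exists; Require would remain valid only in a naked context):

```apacheconf
<Directory /humans_and_friendlies_only>
    Require valid-user
    <SatisfyNotAll>
        Match group alien
        <SatisfyAny>
            Match group hostile
            Match group neutral
        </SatisfyAny>
    </SatisfyNotAll>
</Directory>
```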
Re: svn commit: r709708 - in /httpd/httpd/trunk: include/ap_mmn.h include/mod_auth.h modules/aaa/mod_authz_owner.c
Ruediger Pluem wrote: IMHO this requires a major bump (no problem on trunk) and not only a minor one. Thanks for catching that and the missing apr_pcalloc() in mod_authz_core.c! Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: svn commit: r709553 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_authn_core.xml modules/aaa/mod_authn_core.c
Ruediger Pluem wrote: could you please split such changes into atomic commits? One issue - one commit. You also committed docs changes you didn't mention in the log message. I'm not sure if the crash fix shouldn't go into CHANGES. Plus it makes it really hard to read the diffs if you mix formatting changes with functional changes. Please separate them.

Yes, my apologies for that. I confess I'd hoped to cut a couple of corners because these files exist only in trunk, I wasn't planning any backports, and there seemed to be such a remarkably low level of interest in the trunk authn/z stuff. Still, I should have done a better job. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules blocks everything in default configuration
Dan Poirier wrote: I'd find it much easier to understand if we had fewer directives, and just built up the more complicated ideas by writing boolean expressions, which most of us already know how to cope with.

Perhaps, and the underlying code should support a range of alternative configuration schemes; if someone wants to add an expression parser, that should be feasible. My own perspective was that I wanted to satisfy a number of goals, in descending order of priority:

First, and most important, I wanted to ensure default 2.2-style authz. Imagine administrating a large mass virtual hosting service whose customers have uploaded thousands of .htaccess files, and trying to upgrade to 2.4. Clearly, those .htaccess files need to work exactly as before. Even if we supplied a batch conversion script that could find and auto-upgrade them, customers would later upload their own private copies of their old files, thus inadvertently breaking their sites. So, that meant an OR-like context for Require directives, and no merging of authz configurations by default. This whole thread started because I was trying mod_authz_dbd and noticed it didn't work as expected because AuthzMergeRules was On (i.e., OR) by default. (In my previous message I described switching to an AND-like default context for Require directives, but realized before committing that that would break with this prime directive of backwards-compatibility.) So, if people could please test with 2.2-style authz configurations and make sure everything works as expected, that would be superb.

Second, I wanted to simplify things a little. The revised mod_auth.h and mod_authz_core.c have shrunk a little, and I felt I could remove the default authn/z modules. (Having both core and default modules for authn and authz, any of which could be compiled out, seemed a likely source of trouble.) 
Third, while looking into how mod_authz_core worked, I found some ways to configure it that would cause unexpected results, and while trying to fix those I came to the conclusion I needed to start over with a tree-based implementation. Doing that and working through the implications of a non-Boolean tri-state logic (where negating a false value results in a neutral, not true, value), I found that adding negated authz container directives was something that just fell out naturally.

Finally, there was a certain amount of bike-shed re-painting in the form of renaming configuration directives. I settled on Match, MatchAll, etc. based on Eric Covener's comments. If people dislike those names, please jump in and change them. Or if most folks think we'd be better off without authz containers entirely, please vote for the following patch, which simply turns all that stuff off, leaving (I hope) a fairly clean core authz implementation that supports default 2.2-style behaviour and is extensible down the road, should that be desired.

Chris.

Index: mod_authz_core.c
===================================================================
--- mod_authz_core.c    (revision 710120)
+++ mod_authz_core.c    (working copy)
@@ -405,6 +405,7 @@
     return NULL;
 }
 
+#ifdef AUTHZ_CORE_CONTAINERS
 static const char *add_authz_section(cmd_parms *cmd, void *mconfig,
                                      const char *args)
 {
@@ -534,6 +535,7 @@
     return NULL;
 }
+#endif /* AUTHZ_CORE_CONTAINERS */
 
 static int authz_core_check_section(apr_pool_t *p, server_rec *s,
                                     authz_section_conf *section, int is_conf)
@@ -634,6 +636,7 @@
                  "specifies legacy authorization directives "
                  "of which one must pass for a request to succeed"),
+#ifdef AUTHZ_CORE_CONTAINERS
 AP_INIT_RAW_ARGS("Match", add_authz_provider, NULL, OR_AUTHCFG,
                  "specifies authorization directives "
                  "that must pass (or not) for a request to succeed"),
@@ -656,6 +659,7 @@
                  "controls how a Directory, Location, or similar "
                  "directive's authorization directives are combined with "
                  "those of its predecessor"),
+#endif /* AUTHZ_CORE_CONTAINERS */
     {NULL}
 };
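To make the tri-state point above concrete, here's a minimal standalone sketch (my own illustration, not the actual mod_authz_core code) of how negation behaves in such a logic:

```c
/* Sketch of the tri-state authz logic described above; the names here
 * are illustrative, not the real mod_authz_core implementation.
 * A check can be granted, denied, or neutral (no opinion), and
 * negating "denied" yields "neutral" rather than "granted". */
typedef enum { AUTHZ_DENIED, AUTHZ_GRANTED, AUTHZ_NEUTRAL } authz_result;

authz_result authz_negate(authz_result r)
{
    if (r == AUTHZ_GRANTED)
        return AUTHZ_DENIED;   /* "not granted" is a definite denial */
    return AUTHZ_NEUTRAL;      /* failing a check proves nothing positive */
}
```

This is why a negated container can't simply flip an inner Require's result to true: the inner check failing only leaves the question open.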
mod_unixd troubs?
Hi -- I've been trying to get trunk to compile and run today, and if I compile it without mod_unixd (and with the worker MPM) it compiles and runs, but then logs "Server MUST relinquish startup privileges ..." and exits. If I try to compile mod_unixd, I get compile-time warnings about conflicts with os/unix/unixd.h and unixd.c.

The following hack gets it to compile without mod_unixd (not the intention, I understand) and run without exiting at startup. I'm guessing the goal is to remove os/unix/unixd.h and unixd.c? I tried that, naively, and found that the MPM wouldn't compile because it uses #include "unixd.h" in mpm.h, etc.

Chris.

Index: os/unix/unixd.c
===================================================================
--- os/unix/unixd.c    (revision 710182)
+++ os/unix/unixd.c    (working copy)
@@ -246,6 +246,8 @@
     unixd_config.chroot_dir = NULL; /* none */
 
+    sys_privileges_handlers(1);
+
     /* Check for suexec */
     unixd_config.suexec_enabled = 0;
     if ((apr_stat(&wrapper, SUEXEC_BIN,
Re: AuthzMergeRules blocks everything in default configuration
Ruediger Pluem wrote: I was hoping that your patches would fix this, but sadly they did not.

Ironically, the problem appears to have little to do with authz, but rather authn. The test httpd logs show it's failing to find an htpasswd-type file in which to check the user's login and password. That's because there's no AuthBasicProvider in the test config, and the code -- since way back, I think -- defaults to the file authn provider. Looking back in SVN it seems like that should have been the behaviour for quite a number of years, but I confess I didn't dig too deeply. Nor did I try out the test suite with 2.2.x to see if it succeeds as-is, and if so, why.

If it does, my hunch would be that it succeeds because the suite doesn't actually use the normal authn/z providers, but rather supplies its own module called mod_authany.c. That module contains two different alternative sets of functions based on an #if AP_MODULE_MAGIC_AT_LEAST(20060110, 0) which corresponds to, I think, an attempt to just get it to compile after the authn/z refactoring in trunk. The log comment for r375595 is: "- attempt to adapt for trunk aaa changes; this doesn't work but it does at least compile - not sure whether the problem is in this code or the aaa code."

At any rate, my guess would be that it works (if it does) with 2.2.x because the module supplies its own raw check_user_id (i.e., authn) and auth_checker (i.e., authz) hook functions which run as APR_HOOK_FIRST and probably bypass all the stuff in the usual APR_HOOK_MIDDLE hook functions of mod_auth_basic and mod_authz_file. So the fact that it doesn't specify any AuthBasicProvider never comes up because it bypasses mod_auth_basic entirely. Just a guess.

At any rate, the patch below makes it run with trunk, and properly too, in the sense that it uses mod_authn_core and mod_authz_core to do the heavy lifting and just supplies a tiny pair of authn/z providers. 
But, this isn't really a perfect solution because it's not really correct to put the authn provider into the AP_MODULE_MAGIC_AT_LEAST(20060110, 0) block, since it doesn't (I think) have anything in particular to do with that change. Sorry not to spend more time on this.

Chris.

Index: mod_authany.c
===================================================================
--- mod_authany.c    (revision 710145)
+++ mod_authany.c    (working copy)
@@ -5,6 +5,7 @@
    require user any-user
    AuthType Basic
    AuthName authany
+   AuthBasicProvider any
 </Location>
 #endif
@@ -19,6 +20,18 @@
 #include "ap_provider.h"
 #include "mod_auth.h"
 
+static authn_status authn_check_password(request_rec *r, const char *user,
+                                         const char *password)
+{
+    return strtrue(r->user) && strcmp(r->user, "guest") == 0
+           ? AUTH_GRANTED : AUTH_DENIED;
+}
+
+static const authn_provider authn_any_provider =
+{
+    authn_check_password
+};
+
 static authz_status any_check_authorization(request_rec *r,
                                             const char *requirement)
 {
@@ -28,11 +41,13 @@
 static const authz_provider authz_any_provider =
 {
-    any_check_authorization,
+    any_check_authorization
 };
 
 static void extra_hooks(apr_pool_t *p)
 {
+    ap_register_provider(p, AUTHN_PROVIDER_GROUP,
+                         "any", "0", &authn_any_provider);
     ap_register_provider(p, AUTHZ_PROVIDER_GROUP,
                          "user", "0", &authz_any_provider);
 }
Re: svn commit: r709839 - in /httpd/httpd/trunk: ./ build/ modules/aaa/modules/arch/netware/ os/netware/ os/win32/
Brad Nicholes wrote: I haven't tried out the new authnz directives yet, but it at least builds on NetWare. Thanks -- glad to know it compiles, at least! Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules blocks everything in default configuration
Roy T. Fielding wrote: IIRC, trunk contains (or contained) a security problem with regard to backward compatibility with 2.x configs. I won't consider it releasable until that has been fixed one way or another, and I can't tell from this mail thread whether the actual fix was committed or not.

This posting might answer some of your questions: http://marc.info/?l=apache-httpd-devm=122573959206980w=2 Yes, the fix was committed, and the current intention is that 2.2 configurations should be usable as-is with trunk, without changes. I don't think I can *promise* that's working as intended, but that's the idea; if you encounter bugs, please report them!

AuthzMergeRules Off still appears in docs/conf/httpd.conf:162-167 even though it doesn't appear to be a valid config directive either.

This should be gone in httpd.conf.in; there's no httpd.conf in SVN. Is there any chance you need to refresh from SVN trunk?

The docs seem to indicate this is now MergeAuthz Off and is off by default? Is that true in the code?

Yes, that's the intention; again, please report bugs or backwards-compatibility problems.

(why those directive names are Match* instead of AuthMatch* boggles my mind). What is the conclusion to this thread? Why are all the Authz directives given random names? Am I the only one that finds this feature set impossible to follow?

Please see the final set of comments in the posting I linked to above, and the patch in that posting as well. Just to reiterate, what I wrote there was: Finally there was a certain amount of bike-shed re-painting in the form of renaming configuration directives. I settled on Match, MatchAll, etc. based on Eric Covener's comments. If people dislike those names, please jump in and change them. 
Or if most folks think we'd be better off without authz containers entirely, please vote for the following patch, which simply turns all that stuff off, leaving (I hope) a fairly clean core authz implementation that supports default 2.2-style behaviour and is extensible down the road, should that be desired. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: AuthzMergeRules blocks everything in default configuration
Hi -- Eric Covener wrote: I had meant if containers are used, I'd like their name to communicate the require or reject part while the authz providers would be match-like (because the Require on the inside is confusing when surrounded by all the variations).

Yes, I thought that was a good point; my further thought was that the container names can't imply require/reject either though, because they can be nested and so their meaning can be inverted if they're contained in a negated context.

Roy T. Fielding wrote: But we are already using *Match all over the place to indicate the use of regex matching. :(

These are good points; I hadn't thought of the overlap with LocationMatch and friends. A lot of the other obvious access-control-related words and terms are also already in use, especially for older authorization directives (e.g., Allow, Deny, Order, Limit, Require, Satisfy, etc.) In order to avoid confusion, we should probably stay away from all of these too. Perhaps something like Check or Test would suffice, maybe prefixed with Authz? Hopefully someone else has a good idea, or at least stronger opinions. :-) Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: svn commit: r724745 - in /httpd/httpd/trunk: include/ap_socache.h modules/ssl/ssl_scache.c
Joe Orton wrote: * include/ap_socache.h: Use C++ safety wrappers, and rename ->delete to ->remove since the former is a C++ reserved word.

Thanks again for the socache refactoring! I've been trying to keep these two modules up-to-date with both the socache stuff and Apache ZooKeeper (http://hadoop.apache.org/zookeeper/) releases: http://people.apache.org/~chrisd/projects/shared_map/

These were conceived as a way to set up a simple test harness for all the socache providers, and also to write one for something interesting outside of the standard SSL cache providers. With a config like this:

LoadModule shmap_module modules/mod_shmap.so
LoadModule socache_zookeeper_module modules/mod_socache_zookeeper.so
SharedMapProvider zookeeper localhost:7000
SetHandler shmap-handler

and ZooKeeper listening on localhost:7000, one can get and put stuff into ZK with simple GET/PUT/DELETE commands. Other providers can be swapped in or mapped to various sub-directories:

<Location /shm>
    SharedMapProvider shmcb /home/chrisd/work/install/dlm1/var/httpd/shm
</Location>

A couple of things came up while I was writing these modules which I think it would be great to tackle before the socache API is locked for 2.4:

- have all providers consistently return APR_NOTFOUND from retrieve() when an item is not found
- pass a pool argument for temporary allocations to store() and remove()
- return an apr_status_t instead of void from remove() and maybe also destroy()

And -- I know, I know, I'm a glutton for punishment when it comes to naming issues -- I still feel that socache as a name for this API does it an injustice, since it can be used for a lot more than just a small-object cache. The ZK module, at a minimum, I think shows how you could use it for something that's not a cache at all, and not necessarily for small objects either. Something more generic to the idea of a common, shared key/value storage API seems like a good idea to me. 
But if no one sees this as a concern, I'll shut up forever about it. I promise. :-) Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
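As an illustration of the first and third suggestions in the previous message (a consistent "not found" return from retrieve(), and a status return instead of void from remove()), here is a toy single-slot key/value store; the type and function names are my own stand-ins, not the real ap_socache.h declarations:

```c
#include <string.h>

/* Toy single-slot store illustrating the suggested calling conventions;
 * status_t and STATUS_NOTFOUND are hypothetical stand-ins for
 * apr_status_t and APR_NOTFOUND, not the real socache API. */
typedef int status_t;
enum { STATUS_SUCCESS = 0, STATUS_NOTFOUND = 1 };

static char slot_key[64], slot_val[64];
static int slot_used = 0;

status_t kv_store(const char *key, const char *val)
{
    strncpy(slot_key, key, sizeof(slot_key) - 1);
    slot_key[sizeof(slot_key) - 1] = '\0';
    strncpy(slot_val, val, sizeof(slot_val) - 1);
    slot_val[sizeof(slot_val) - 1] = '\0';
    slot_used = 1;
    return STATUS_SUCCESS;
}

status_t kv_retrieve(const char *key, char *buf, size_t buflen)
{
    if (!slot_used || strcmp(slot_key, key) != 0)
        return STATUS_NOTFOUND;   /* consistent "not found" code */
    strncpy(buf, slot_val, buflen - 1);
    buf[buflen - 1] = '\0';
    return STATUS_SUCCESS;
}

status_t kv_remove(const char *key)
{
    if (!slot_used || strcmp(slot_key, key) != 0)
        return STATUS_NOTFOUND;   /* a status, not void, so callers can
                                     tell "removed" from "was never there" */
    slot_used = 0;
    return STATUS_SUCCESS;
}
```

The point of the status returns is that every operation's outcome becomes observable to the caller, which matters once backends like ZooKeeper can fail in ways an in-memory cache cannot.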
Re: Authz directives
Roy T. Fielding wrote: I totally understand the desire to make the implementation more modular and to make a more sensible Satisfy logic, but I don't understand the need for Match (as opposed to just extending Require) and the odd changes in defaults (multiple Require defaults to MatchAny semantics, but multiple Match defaults to MatchAll).

Match came up only because negated containers cause the meaning of a plain old Require to be inverted and to have the opposite meaning from what it appears. If we drop negated containers that problem goes away. The negated containers, as I mentioned previously, were just something which fell out easily from the refactoring; they're hardly worth keeping if they just cause trouble.

1) make the new directives self-documenting
   - remove MatchNotAll (nobody needs this)
   - s/MatchAny/RequireAny/ig;
   - s/MatchAll/RequireAll/ig;
   - s/MatchNotAny/RequireNone/ig;
   - s/(MergeAuthz|AuthzMerge)/AuthMerging/ig; (off | and | or)
2) move new Match functionality to Require
3) default for multiple Require* is RequireAny
   - implies that Require and "Require not" are only mixed when used within a RequireAll or RequireNone container.

This is all fairly simple, I think, especially if MatchNotAny/RequireNone is removed as well so that Require retains its apparent meaning everywhere. See if the patch below meets your expectations; if so, I'll commit it and update the docs. The next item on the agenda, then, is probably PR #46214, i.e., getting the various providers to quiet down their log messages.

Chris.

=====
--- modules/aaa/mod_authz_core.c    (revision 724744)
+++ modules/aaa/mod_authz_core.c    (working copy)
@@ -52,17 +52,19 @@
 #define FORMAT_AUTHZ_COMMAND(p,section) \
     ((section)->provider \
-     ? apr_pstrcat((p), \
+     ? apr_pstrcat((p), "Require ", \
                    ((section)->negate ? \
-                    "Match not " : "Match "), \
+                    "not " : ""), \
                    (section)->provider_name, " ", \
                    (section)->provider_args, NULL) \
-     : apr_pstrcat((p), "Match", \
+     : apr_pstrcat((p), "Require", \
                    ((section)->negate ? "Not" : ""), \
                    (((section)->op == AUTHZ_LOGIC_AND) \
                     ? "All" : "Any"), \
                    "", NULL))
+
+#undef AUTHZ_NEGATIVE_CONFIGS
 
 typedef struct provider_alias_rec {
     char *provider_name;
     char *provider_alias;
@@ -95,7 +97,6 @@
 struct authz_core_dir_conf {
     authz_section_conf *section;
     authz_logic_op op;
-    int old_require;
     authz_core_dir_conf *next;
 };
@@ -314,12 +315,11 @@
     return errmsg;
 }
 
-static authz_section_conf* create_default_section(apr_pool_t *p,
-                                                  int old_require)
+static authz_section_conf* create_default_section(apr_pool_t *p)
 {
     authz_section_conf *section = apr_pcalloc(p, sizeof(*section));
 
-    section->op = old_require ? AUTHZ_LOGIC_OR : AUTHZ_LOGIC_AND;
+    section->op = AUTHZ_LOGIC_OR;
 
     return section;
 }
@@ -331,21 +331,9 @@
     authz_section_conf *section = apr_pcalloc(cmd->pool, sizeof(*section));
     authz_section_conf *child;
 
-    if (!strcasecmp(cmd->cmd->name, "Require")) {
-        if (conf->section && !conf->old_require) {
-            return "Require directive not allowed with "
-                   "Match and related directives";
-        }
-
-        conf->old_require = 1;
-    }
-    else if (conf->old_require) {
-        return "Match directive not allowed with Require directives";
-    }
-
     section->provider_name = ap_getword_conf(cmd->pool, args);
 
-    if (!conf->old_require && !strcasecmp(section->provider_name, "not")) {
+    if (!strcasecmp(section->provider_name, "not")) {
         section->provider_name = ap_getword_conf(cmd->pool, args);
         section->negate = 1;
     }
@@ -377,7 +365,7 @@
     section->limited = cmd->limited;
 
     if (!conf->section) {
-        conf->section = create_default_section(cmd->pool, conf->old_require);
+        conf->section = create_default_section(cmd->pool);
     }
 
     if (section->negate && conf->section->op == AUTHZ_LOGIC_OR) {
@@ -416,12 +404,6 @@
     apr_int64_t old_limited = cmd->limited;
     const char *errmsg;
 
-    if (conf->old_require) {
-        return apr_pstrcat(cmd->pool, cmd->cmd->name,
-                           " directive not allowed with "
-                           "Require directives", NULL);
-    }
-
     if (endp == NULL) {
         return apr_pstrcat(cmd->pool, cmd->cmd->name,
                            " directive missing closing '>'", NULL);
@@ -437,13 +419,14 @@
     section = apr_pcalloc(cmd->pool, sizeof(*section));
 
-    if (!strcasecmp(cmd->cmd->name, "MatchAll")) {
+    if (!strcasecmp(cmd->cmd->name, "RequireAll")) {
         section->op = AUTHZ_LOGIC_AND;
     }
-
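For reference, with the renaming in the patch above applied, the shirt-and-shoes example from earlier in the thread would be written with a RequireAll container (a sketch of the syntax the renamed directives imply):

```apacheconf
<Directory /foo>
    <RequireAll>
        Require group shirt
        Require group shoes
    </RequireAll>
</Directory>
```

With multiple bare Require directives defaulting to RequireAny semantics, the container is what expresses the AND.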
mod_fcgid incubation?
Hi -- As Paul Querna noted recently, some folks are using mod_fcgid these days instead of mod_fastcgi, in part because it was (I believe) the first of the two to work with httpd 2.2. Unfortunately, the original developer of mod_fcgid, Pan Qingfeng, has largely moved on to other things. He and I have been in touch lately about long-term maintenance of mod_fcgid. (We've been helped by a colleague of mine who can translate fluently between Chinese and English, although Pan Qingfeng's English is quite good as well.) In a pinch, I've offered to help get releases out and review some of the queued-up patches contributed by various folks.

I also asked, though, whether he'd be interested in contributing mod_fcgid to the ASF as, perhaps, a sub-project of httpd, and received a very positive reply:

很高兴认识你们。我开发这个模块的目的也是想尽可能多的人可以用上它,所以把它交给ASF完全没有问题(实际上我也很希望可以交给ASF),过程中有任何问题我都可以帮忙的。

which my colleague translates as follows: I am very glad to know you guys. I did this module and hoped more people could use it. There is no problem at all for me to donating the module to ASF (actually, I am glad to do so). If there is anything I can do to help, please feel free to let me know.

So, I said I'd raise the question on the httpd dev list. Since I'm a neophyte to such things, and also not a lawyer, I assume this would need to be voted on somehow, and then (if accepted) moved into the incubation stage until legal issues were resolved. One key question I have (jumping ahead a little) is whether everyone who has contributed a patch to the project needs to be contacted and a signed contributor agreement obtained from them. If not, then I would think that we'd just need a CLA from Pan Qingfeng to proceed (and perhaps, in that case, we could skip the incubator stage, as Apache ZooKeeper recently did). If we do need CLAs from all patch contributors, that might take more time. 
Fortunately, there's a good change log for the project with names and email addresses; it includes some of the usual suspects (like Paul Querna and myself) but also a number of other folks we'd need to track down.

I know there's mod_proxy_fcgi in httpd already, but among other things, it only handles the FCGI_RESPONDER role. It might be very nice to be able to provide the FCGI_AUTHORIZER functionality as well either via a separate module based on mod_fcgid, or through code merged into mod_proxy_fcgi from that project. (The FCGI_FILTER role seems to be unsupported in both mod_fcgid and mod_fastcgi; looks like that's a job for the future, one way or another.)

At any rate, if people wouldn't mind chewing this over for a bit, I'd really appreciate it. The project is available from SourceForge under the GPL version 2.0. (Unfortunately, the actual code files are largely missing the legal headers, but the intent is clear; I'm sure this is an oversight due to [human] language barriers.) See the following for info, code, and my own pending patches:

http://fastcgi.coremail.cn/
http://sourceforge.net/projects/mod-fcgid
http://people.apache.org/~chrisd/patches/mod_fcgid/

Thanks very much in advance for any advice people can offer, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: Authz directives
Hi -- This is all fairly simple, I think, especially if MatchNotAny/RequireNone is removed as well so that Require retains its apparent meaning everywhere. See if the patch below meets your expectations; if so, I'll commit it and update the docs.

Sorry, here's a slightly updated one without any whitespace issues; it should apply cleanly against the latest SVN trunk as of a few minutes ago.

Chris.

=====
--- modules/aaa/mod_authz_core.c    (revision 724813)
+++ modules/aaa/mod_authz_core.c    (working copy)
@@ -52,17 +52,19 @@
 #define FORMAT_AUTHZ_COMMAND(p,section) \
     ((section)->provider \
-     ? apr_pstrcat((p), \
+     ? apr_pstrcat((p), "Require ", \
                    ((section)->negate ? \
-                    "Match not " : "Match "), \
+                    "not " : ""), \
                    (section)->provider_name, " ", \
                    (section)->provider_args, NULL) \
-     : apr_pstrcat((p), "Match", \
+     : apr_pstrcat((p), "Require", \
                    ((section)->negate ? "Not" : ""), \
                    (((section)->op == AUTHZ_LOGIC_AND) \
                     ? "All" : "Any"), \
                    "", NULL))
+
+#undef AUTHZ_NEGATIVE_CONFIGS
 
 typedef struct provider_alias_rec {
     char *provider_name;
     char *provider_alias;
@@ -95,7 +97,6 @@
 struct authz_core_dir_conf {
     authz_section_conf *section;
     authz_logic_op op;
-    int old_require;
     authz_core_dir_conf *next;
 };
@@ -314,12 +315,11 @@
     return errmsg;
 }
 
-static authz_section_conf* create_default_section(apr_pool_t *p,
-                                                  int old_require)
+static authz_section_conf* create_default_section(apr_pool_t *p)
 {
     authz_section_conf *section = apr_pcalloc(p, sizeof(*section));
 
-    section->op = old_require ? AUTHZ_LOGIC_OR : AUTHZ_LOGIC_AND;
+    section->op = AUTHZ_LOGIC_OR;
 
     return section;
 }
@@ -331,21 +331,9 @@
     authz_section_conf *section = apr_pcalloc(cmd->pool, sizeof(*section));
     authz_section_conf *child;
 
-    if (!strcasecmp(cmd->cmd->name, "Require")) {
-        if (conf->section && !conf->old_require) {
-            return "Require directive not allowed with "
-                   "Match and related directives";
-        }
-
-        conf->old_require = 1;
-    }
-    else if (conf->old_require) {
-        return "Match directive not allowed with Require directives";
-    }
-
     section->provider_name = ap_getword_conf(cmd->pool, args);
 
-    if (!conf->old_require && !strcasecmp(section->provider_name, "not")) {
+    if (!strcasecmp(section->provider_name, "not")) {
         section->provider_name = ap_getword_conf(cmd->pool, args);
         section->negate = 1;
     }
@@ -377,7 +365,7 @@
     section->limited = cmd->limited;
 
     if (!conf->section) {
-        conf->section = create_default_section(cmd->pool, conf->old_require);
+        conf->section = create_default_section(cmd->pool);
     }
 
     if (section->negate && conf->section->op == AUTHZ_LOGIC_OR) {
@@ -416,12 +404,6 @@
     apr_int64_t old_limited = cmd->limited;
     const char *errmsg;
 
-    if (conf->old_require) {
-        return apr_pstrcat(cmd->pool, cmd->cmd->name,
-                           " directive not allowed with "
-                           "Require directives", NULL);
-    }
-
     if (endp == NULL) {
         return apr_pstrcat(cmd->pool, cmd->cmd->name,
                            " directive missing closing '>'", NULL);
@@ -437,13 +419,14 @@
     section = apr_pcalloc(cmd->pool, sizeof(*section));
 
-    if (!strcasecmp(cmd->cmd->name, "MatchAll")) {
+    if (!strcasecmp(cmd->cmd->name, "RequireAll")) {
         section->op = AUTHZ_LOGIC_AND;
     }
-    else if (!strcasecmp(cmd->cmd->name, "MatchAny")) {
+    else if (!strcasecmp(cmd->cmd->name, "RequireAny")) {
         section->op = AUTHZ_LOGIC_OR;
     }
-    else if (!strcasecmp(cmd->cmd->name, "MatchNotAll")) {
+#ifdef AUTHZ_NEGATIVE_CONFIGS
+    else if (!strcasecmp(cmd->cmd->name, "RequireNotAll")) {
         section->op = AUTHZ_LOGIC_AND;
         section->negate = 1;
     }
@@ -451,6 +434,7 @@
         section->op = AUTHZ_LOGIC_OR;
         section->negate = 1;
     }
+#endif
 
     conf->section = section;
@@ -473,7 +457,7 @@
     authz_section_conf *child;
 
     if (!old_section) {
-        old_section = conf->section = create_default_section(cmd->pool, 0);
+        old_section = conf->section = create_default_section(cmd->pool);
     }
 
     if (section->negate && old_section->op == AUTHZ_LOGIC_OR) {
@@ -521,15 +505,15 @@
     if (!strcasecmp(arg, "Off")) {
         conf->op = AUTHZ_LOGIC_OFF;
     }
-    else if (!strcasecmp(arg, "MatchAll")) {
+    else if (!strcasecmp(arg, "And")) {
         conf->op = AUTHZ_LOGIC_AND;
     }
-    else if (!strcasecmp(arg, "MatchAny")) {
+    else if (!strcasecmp(arg, "Or")) {
         conf->op = AUTHZ_LOGIC_OR;
     }
     else {
         return apr_pstrcat(cmd->pool, cmd->cmd->name,
                            " must be one of: "
-
Re: mod_fcgid incubation?
William A. Rowe, Jr. wrote: The mod_fastcgi implementation has the following terms;

Open Market permits you to use, copy, modify, distribute, and license this Software and the Documentation solely for the purpose of implementing the FastCGI specification defined by Open Market or derivative specifications publicly endorsed by Open Market and promulgated by an open standards organization and for no other purpose, provided that existing copyright notices are retained in all copies and that this notice is included verbatim in any distributions. No written agreement, license, or royalty fee is required for any of the authorized uses. Modifications to this Software and Documentation may be copyrighted by their authors and need not follow the licensing terms described here, but the modified Software and Documentation must be used for the sole purpose of implementing the FastCGI specification defined by Open Market or derivative specifications publicly endorsed by Open Market and promulgated by an open standards organization and for no other purpose. If modifications to this Software and Documentation have new licensing terms, the new terms must protect Open Market's proprietary rights in the Software and Documentation to the same extent as these licensing terms and must be clearly indicated on the first page of each file where they apply.

I'd call that a category X license, submission denied.

So far as I know -- I'll check with the author -- mod_fcgid is a completely separate implementation from mod_fastcgi. I don't know of any generally shared or derived code, but I will check. The exception, I think, might be the FCGI protocol itself, which specifies the byte-level structure of the headers that are passed back and forth during communication with FCGI application processes. This definition lives in fcgi_protocol.h in mod_fastcgi, and appears in very similar forms in both mod_fcgid's fcgi_protocol.h and also httpd's modules/proxy/fcgi_protocol.h. 
Look for the fcgi_header or FCGI_Header structure, and the various FCGI_* #defines for type values, roles, etc. It's a little hard for me to see how alternate implementations of the FastCGI spec could get away without doing something like this at a minimum, though. Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
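For the curious, the shared definition in question is essentially the fixed eight-byte record header from the public FastCGI specification; restated from the spec (not copied from either module), it looks like:

```c
/* The FCGI_Header layout as given in the public FastCGI specification;
 * every record on the wire begins with these eight bytes. Multi-byte
 * fields are split into explicit high/low bytes, so the struct has no
 * alignment padding. */
typedef struct {
    unsigned char version;          /* FCGI_VERSION_1 */
    unsigned char type;             /* FCGI_BEGIN_REQUEST, FCGI_STDIN, ... */
    unsigned char requestIdB1;      /* request id, high byte */
    unsigned char requestIdB0;      /* request id, low byte */
    unsigned char contentLengthB1;  /* content length, high byte */
    unsigned char contentLengthB0;  /* content length, low byte */
    unsigned char paddingLength;
    unsigned char reserved;
} FCGI_Header;
```

Since the spec dictates this byte layout exactly, independent implementations inevitably converge on near-identical declarations, which is the point being made above.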
mod_fcgid license questions
Hi -- I believe Pan Qingfeng (潘庆峰), the developer of mod_fcgid, has joined this list for the time being while the possibility of mod_fcgid becoming a project in the Apache incubator is discussed. I'll use his English name of Ryan Pan from here on. I asked Ryan to join so that he could answer the couple of questions regarding the origin of the mod_fcgid code and his licensing intentions which have come up so far, and also to thank him publicly for being willing to consider contributing mod_fcgid to the ASF.

Ryan, the main question which has come up in the last couple of days seems to be this one: When you wrote mod_fcgid, was there any code which you borrowed from mod_fastcgi?

The other questions I had related to the existing license for mod_fcgid: Your current intention is for mod_fcgid to be available under the GPL version 2.0, correct? Could you confirm that you wanted the GPL to apply to all the mod_fcgid code? (I ask because the LICENSE file in mod_fcgid contains the GPL 2.0; however, the .c and .h files don't also include the usual GPL text.)

Finally, Ryan, would you mind re-stating for the record your interest in the idea of mod_fcgid becoming an Apache project? I'd like to personally extend my thanks to Ryan for developing mod_fcgid in the first place, for his interest in the idea of contributing it to the ASF, and for being willing to work through the licensing issues that will involve. Many thanks! (I'd also like to thank my colleague Sharon, or Xiaomei Ma (笑梅), an excellent developer in her own right, for her help translating some of the communications Ryan and I have already had.) Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: Authz directives
Roy T. Fielding wrote: I don't see a problem with RequireNone inverting the logic, and I think it would actually be useful for blocking a set of bad clients. Is it difficult to include that without MatchNotAny? Not at all difficult; trivial, in fact. The only reason I took it out as well was the name inversion issue. I'll put it back in and commit tomorrow. Might take a few days to deal with the doc adjustments, but otherwise a simple job. Thanks for the comments! Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: svn commit: r724745 - in /httpd/httpd/trunk: include/ap_socache.h modules/ssl/ssl_scache.c
Joe Orton wrote: Both modules look very neat! Are you going to commit them? I might debate the naming of mod_shmap ;)

Heh, thanks. I don't know, I hadn't really thought about committing them ... maybe the shmap one is more useful to other folks?

- have all providers consistently return APR_NOTFOUND from retrieve() when an item is not found
- pass a pool argument for temporary allocations to store() and remove()
- return an apr_status_t instead of void from remove() and maybe also destroy()

Those all seem like good ideas - thanks!

I saw the commit, thank you! I confess I'm not sure what I was thinking when I said we should pass a pool to remove(), since it already took one. Maybe destroy() should take a pool too? Perhaps that's overkill. One other thought ... does this change need an ap_mmn.h bump?

Well, I stand by my argument on this before: the interface is designed explicitly for abstraction of small object caching, emphasis both on the small and the caching. Sure, you can implement more general backends, but that doesn't change the design of the API.

OK, I'll buy that and be quiet ... except for one thing. :-) I feel like so is already quite overloaded with the meaning shared object, and especially so in this case, since we have the long-standing mod_so already in place. To my mind, mod_socache* could easily imply to others that these modules are somehow related to caching .so files in memory, or something like that. Is there anything we could come up with that doesn't imply any connection with mod_so and .so files? I hunted around for something in the past and came up with shmap, but I could live with dcache or scache or something else that seemed unique within the httpd module population. Any thoughts? Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: scoreboard.h/process_score question
Torsten Foertsch wrote: the struct process_score in scoreboard.h contains a sb_type member. Can anyone please explain what that item is for? I couldn't find any usage in the code. Looks like this was added back in r89115 along with a number of other things which were (mostly) quickly removed in r89554, but some have lingered on. I spotted a couple back in 2006 (see http://marc.info/?l=apache-httpd-dev&m=114676400704606&w=2) but not this one, which does indeed seem to be an unused structure member. Removed now in r726118 and, alas, r726120 (because I forgot to update CHANGES, sigh). Thanks for the suggestion! Chris.
Re: svn commit: r724745 - in /httpd/httpd/trunk: include/ap_socache.h modules/ssl/ssl_scache.c
Joe Orton wrote: mod_shmap would be useful at least in modules/test so I can write some perl-framework tests for mod_socache! OK, I'll think about doing that. The m4/dsp/NWGNU wizardry required makes me tired just thinking about it, though. :-) In the meantime, I think they compile again against the latest API. Ah, I was waiting for Ruediger to ask ;) Ho ho ho -- yes, he and Jim have caught me out on too many things to count. Personally I think it is redundant maintaining fine-grained API versions across changes in unreleased code. Also, if we are going to API version, arguably it should be done by bumping the provider version. But really, I don't care ;) Personally, I'd have to agree with you in principle about maybe not needing to track unreleased code so carefully. Interesting thought about provider versions. I wonder if there are other providers elsewhere in trunk which we should really bump. Personally, I don't think it's a big enough deal to repaint the bikeshed. It's an API for module developers, so, if someone gets confused with mod_so, what's the worst that could happen? cue disaster movie Well, there is that. If no one seems to think mod_so and mod_socache* conflict name-wise, I'll just hush up permanently on this point. Cheers, Chris.
Re: svn commit: r726082 - /httpd/httpd/trunk/modules/aaa/mod_authz_core.c
Ruediger Pluem wrote: Sorry, but I currently don't get the reason for moving the negate check down in the code. So far as I'm aware, there's no functional or even performance difference at the moment (not that performance really matters here, since this function runs at configuration time). But it occurred to me that I'd feel safer doing the optimization first, then the check -- otherwise, if something changes somewhere in the future such that, post-optimization, you wind up with a container full of negatives, you still get the right error. Call it defensive programming, I guess. Why do we still need AUTHZ_EXTRA_CONFIGS? It simply hides RequireNotAll, which people didn't like. Please feel free to rename it or use #if 0 or whatever if it seems wrong to you. Chris.
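The "optimization first, then the check" ordering above can be shown with a toy model. This is purely illustrative and not mod_authz_core's actual section-merge code: integers stand in for directives, negative values for negated ("Require not ...") entries, and zeros for no-op entries the optimizer strips out. The point is that the validity check runs on the optimizer's output, so a container that only *becomes* all-negative after optimization is still rejected.

```c
#include <stddef.h>

/* Drop no-op (zero) entries in place; return the new count.
 * Stands in for whatever rewriting the real optimizer might do. */
static size_t optimize(int *v, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] != 0)
            v[out++] = v[i];
    return out;
}

/* A non-empty container consisting only of negated entries is an
 * error: nothing positive can ever match. */
static int all_negative(const int *v, size_t n)
{
    if (n == 0)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] >= 0)
            return 0;
    return 1;
}

static const char *check_config(int *v, size_t n)
{
    n = optimize(v, n);          /* optimization first ... */
    if (all_negative(v, n))      /* ... then validate the final form */
        return "container holds only negated directives";
    return NULL;                 /* configuration is acceptable */
}
```

Running the check before the optimization would pass a container like {0, -1, -2}, since the zero makes it look mixed; checking afterwards catches it.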
Re: svn commit: r726113 - /httpd/httpd/trunk/server/mpm/worker/fdqueue.c
Ruediger Pluem wrote: Not quite sure if this is really correct because apr_atomic_casptr wants to have a (volatile void**) as first parameter. That's what made me less than sure ... but my gcc 4.1.2 -Wall definitely doesn't like that void** (dereferencing type-punned pointer ...). I think void* should be OK, because the APR function still gets a pointer-sized parameter and casts it to (volatile void**) locally, which is presumably what really matters: within the CAS function, the optimizer understands that the pointed-to value (another pointer, in this case) can be changed by something else it doesn't know about, and so won't optimize away any accesses. But ... I'm no expert on these kinds of compiler guts. Revert it if it seems wrong and it's better to spew some warnings instead. Chris.
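For reference, here is the same pattern sketched in portable C11 rather than taken from fdqueue.c -- a from-scratch illustration, not the actual code under discussion. The _Atomic qualifier gives the compiler the guarantee that the (volatile void**) in apr_atomic_casptr's signature is after: the pointer being compared-and-swapped may be changed by another thread, so its loads and stores must not be cached or optimized away.

```c
#include <stdatomic.h>
#include <stddef.h>

/* A node in a lock-free stack (illustrative only). */
typedef struct node {
    struct node *next;
} node;

/* _Atomic on the pointer plays the role of the volatile cast: the
 * compiler must assume head can change underneath it. */
static _Atomic(node *) head = NULL;

/* Push n with a compare-and-swap retry loop. */
static void push(node *n)
{
    node *old = atomic_load(&head);
    do {
        n->next = old;
        /* On CAS failure, old is refreshed to the current head
         * and the loop retries with the new value. */
    } while (!atomic_compare_exchange_weak(&head, &old, n));
}

/* Pop the top node, or return NULL if the stack is empty. */
static node *pop(void)
{
    node *old = atomic_load(&head);
    while (old != NULL &&
           !atomic_compare_exchange_weak(&head, &old, old->next))
        ;
    return old;
}
```

Without the atomic (or volatile) qualification, an optimizer would be entitled to hoist the load of head out of the retry loop, which is exactly the failure mode the APR signature is guarding against.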