Query on deletion of Request pool
Hi All, I am seeing a serious memory issue with my Apache webserver. Initially I was allocating buffers by using apr_palloc from the request pool, assuming the allocated memory would be released, but the memory grows without bound and I am not sure what the problem is.

I then tried my own malloc and added a cleanup function with apr_pool_cleanup_run. Debugging showed that free is being called for the allocated memory, but the behaviour is the same: on each request there is an increase in memory. Is there any way to release the memory of the request pool explicitly?

I tried apr_pool_cleanup_register with a cleanup function that internally called apr_pool_destroy(request_rec->pool). For the first request it worked correctly, but on later requests the Apache process crashed and restarted. It seems I may be doing something silly. Any help would be very much appreciated.

BTW my webserver is heavily loaded; MPM=worker, Apache version 2.2.8, OS Red-Hat 3.0. What are the other places a memory leak could be looked for? Looking forward to a response. Thanks -A
Re: Query on deletion of Request pool
On Wed, Mar 26, 2008 at 11:39 AM, Arnab Ganguly [EMAIL PROTECTED] wrote:
> Hi All, I am getting a serious memory issue with my Apache webserver.
> Initially I was allocating buffer by using apr_palloc from the request
> pool assuming the allocated memory is going to be released but not sure
> what is the problem the memory grows infinitely.

How are you measuring memory use? Have you tried MaxMemFree? By default, apache won't continuously return this storage to the native heap because it's likely going to be needed again anyway.

-- Eric Covener [EMAIL PROTECTED]
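For anyone searching the archives: the MaxMemFree directive Eric mentions goes in the server config and takes a per-allocator cap in KBytes (the value below is illustrative only):

```apache
# Allow each allocator to hold at most 2048 KB of free memory for
# reuse; anything above that is returned to the system.
MaxMemFree 2048
```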
Re: flood random subst patch
-----Original Message-----
From: Sander Temme
Sent: Wednesday, 26 March 2008 06:48
To: dev@httpd.apache.org
Subject: Re: flood random subst patch

> On Mar 25, 2008, at 9:32 PM, James M. Leddy wrote:
> > A Flood 'project', a httpd-test 'project' with a flood 'component',
> > or an httpd 'project' with new subprojects. Thank you for making this
> > clear. My vote is for choice #2.
>
> I'd also tend towards a httpd-test Product with perl-framework and
> Flood Components. I'll give it a day or so for folks to weigh in, then
> create it.

+1 to this solution.

Regards
Rüdiger
Dynamic configuration for the hackathon?
There seems to be a demand for dynamic per-request configuration, as evidenced by the number of users hacking it with mod_rewrite, and the other very limited tools available. Modern mod_rewrite usage commonly looks like programming, but it's not designed as a programming language. Result: confused and frustrated users.

We could make simple changes to mod_rewrite itself: for example, a RewriteCond container would at least bring users basic block structuring. But so long as the block context applies only to mod_rewrite, it remains ad-hoc tinkering with the problem. I'm wondering what it would take to get us to something like:

  <if [some-per-request-expr]>
    Directives applicable per-request
    (anything, subject to per-directive context checking)
  </if>

(ideally with else and elsif). Clearly an if container has to create a configuration record that'll be merged if and only if the condition is satisfied by a request. The condition should have access to headers_in and subprocess_env (with client info stuff included), as well as the request line. As a further step we could consider evaluating the if contents per-request, with support for variable interpolation and backreferences using ap_expr.

A noble objective: render mod_rewrite obsolete :-)

Anyone fancy spending some hackathon time on this in Amsterdam?

-- Nick Kew Application Development with Apache - the Apache Modules Book http://www.apachetutor.org/
Re: Dynamic configuration for the hackathon?
On 3/26/08 9:06 AM, Nick Kew [EMAIL PROTECTED] wrote:
> There seems to be a demand for dynamic per-request configuration, as
> evidenced by the number of users hacking it with mod_rewrite, and the
> other very limited tools available. Modern mod_rewrite usage commonly
> looks like programming, but it's not designed as a programming
> language. Result: confused and frustrated users.

This is what I had in mind when I suggested having Lua blocks of code. No need to invent a new language when a perfectly fine one exists...

-- Brian Akins Chief Operations Engineer Turner Digital Media Technologies
Re: Dynamic configuration for the hackathon?
Akins, Brian wrote:
> This is what I had in mind when I suggested having Lua blocks of code.
> No need to invent a new language when a perfectly fine one exists...

FWIW, it's done with Perl blocks too (I do some funky things that way), BUT I'm not sure if those are parsed per-request as I think Nick is suggesting. Also, many times people don't want to bloat their processes with a fully-fledged interpreter (again, I'm building on my mod_perl experience here - I know that the shared Perl objects are pretty clunky, and not sure if mod_wombat looks the same).

Right now it doesn't look like I'll be in Amsterdam...

Issac
Re: Dynamic configuration for the hackathon?
On Wed, 26 Mar 2008 15:39:53 +0200 Issac Goldstand [EMAIL PROTECTED] wrote:

> Akins, Brian wrote:
> > This is what I had in mind when I suggested having Lua blocks of code.
> > No need to invent a new language when a perfectly fine one exists...

I'm not talking about inventing a new language. Those who want one have some options already, as noted below ...

> FWIW, it's done with Perl blocks too (I do some funky things that way),
> BUT I'm not sure if those are parsed per-request as I think Nick is
> suggesting.

Neither am I, FWIW.

> Also, many times people don't want to bloat their processes with a
> fully-fledged interpreter

That is much more of a consideration. As I said, the basic idea is to provide a much simpler rationalisation for the kind of things people struggle to do with mod_rewrite et al.

> (again, I'm building on my mod_perl experience here - I know that the
> shared Perl objects are pretty clunky, and not sure if mod_wombat looks
> the same).

AFAICT mod_wombat just provides Lua bindings (some of them stubs) for hooks exported from the core. Nothing for configuration.

-- Nick Kew Application Development with Apache - the Apache Modules Book http://www.apachetutor.org/
Re: flood random subst patch
Sander Temme wrote:
> On Mar 25, 2008, at 9:32 PM, James M. Leddy wrote:
> > A Flood 'project', a httpd-test 'project' with a flood 'component',
> > or an httpd 'project' with new subprojects. Thank you for making this
> > clear. My vote is for choice #2.
>
> I'd also tend towards a httpd-test Product with perl-framework and
> Flood Components. I'll give it a day or so for folks to weigh in, then
> create it.

Rereading - I'd agree that #2 is best.
Re: Dynamic configuration for the hackathon?
On 3/26/08 9:53 AM, Nick Kew [EMAIL PROTECTED] wrote:
> I'm not talking about inventing a new language. Those who want one have
> some options already, as noted below ...

Right. I was just throwing it out there, so to speak. I'm not opposed to what you are saying, just wondering if we would/should take it to the next level.

As to your suggestion: So basically, the per_dir merge would use this mechanism instead of what it does now (file walk, location walk) (or in addition to??) Something like:

  If Directory == /www/stuff and Remote_IP =~ 10.189.
    SetEnv coolstuff
  Elsif HTTP_Host == www.domain.com or Local_Port == 8080
    Set something different
  Elsif ENV{blah} =~ foo or Cookie{baz} == iamset
    foo bar
  Else
    something completely different
  /endif

(Horrible example, I know.) If it were easy to extend the expressions (i.e., I want to implement (Cache == yes/no)) and stuff like ENV{key} were made to work, I'm all for it. It *should* be fairly easy to test this out with the current system (a la Proxy blocks).

-- Brian Akins Chief Operations Engineer Turner Digital Media Technologies
Re: Dynamic configuration for the hackathon?
On Wed, 26 Mar 2008 10:15:05 -0400 Akins, Brian [EMAIL PROTECTED] wrote:

> As to your suggestion: So basically, the per_dir merge would use this
> mechanism instead of what it does now (file walk, location walk) (or in
> addition to??) Something like:
>
>   If Directory == /www/stuff and Remote_IP =~ 10.189.
>     SetEnv coolstuff
>   Elsif HTTP_Host == www.domain.com or Local_Port == 8080
>     Set something different
>   Elsif ENV{blah} =~ foo or Cookie{baz} == iamset
>     foo bar
>   Else
>     something completely different
>   /endif

Sort-of. There's a question of ordering: merge-config happens before some of the vars we'd like to make available are available, so we'd have a bit more work to make Cookie{baz} work (as opposed to parsing a CGI-style HTTP_COOKIE variable). But that's basically the kind of thing. Oh, and there's no inherent reason it shouldn't apply to per-host config too.

> (Horrible example, I know.) If it were easy to extend the expressions
> (i.e., I want to implement (Cache == yes/no)) and stuff like ENV{key}
> were made to work, I'm all for it.

Straightforward: conditions on headers, method (obsoletes Limit), request line, env, CGI vars. With the option to disable conditional stuff for speed. Higher-level stuff like *evaluating* caching or cookie headers ... maybe sometime, but one could argue that's the point where Perl or Lua makes more sense.

> It *should* be fairly easy to test this out with the current system
> (a la Proxy blocks).

Hopefully, yes. Oh, and since ap_expr is a prerequisite for this, it would be great if folks could review at least the API part (ap_expr.h) of what I posted earlier.

-- Nick Kew Application Development with Apache - the Apache Modules Book http://www.apachetutor.org/
Re: mod_disk_cache and atimes
On Mar 25, 2008, at 8:04 PM, Akins, Brian wrote:

> Use really small files so you won't fill up the pipe. Using a 1x1 gif,
> I run out of CPU before I run out of bandwidth.

Agreed. This is what I was working on.

> My cache size is smaller and is in /dev/shm.

In that case - you are not going to see any delta with a mem-disk or real disk :). I am hoping that mod_mem_cache can be made faster though -- as it can in theory dispense with a lot more than just mod_memmap. But the footprint of the bucket-brigades seems such that it barely matters (and I am right now stuck on the CPU which has the ethernet card, its IRQ being way too affine).

Dw
Re: mod_disk_cache and atimes
Hi Dirk-Willem,

Can you please clarify your mentioning the bucket-brigade footprint? Are they so slow that they make a memory-based cache no more efficient than a disk-based one? Or the opposite: sendfile() works so well that serving content from memory is not any faster?

I'm developing an Apache output filter for highly loaded servers and proxies that juggles small-size buckets and brigades extensively. I'm not at the stage yet where I can do performance tests, but if I knew this would definitely impact performance, I would perhaps switch to fixed-size buffers straight away...

Thank you. KC

On 26 Mar 2008, at 14:43, Dirk-Willem van Gulik wrote:
> On Mar 25, 2008, at 8:04 PM, Akins, Brian wrote:
> > Use really small files so you won't fill up the pipe. Using a 1x1
> > gif, I run out of CPU before I run out of bandwidth.
>
> Agreed. This is what I was working on.
>
> > My cache size is smaller and is in /dev/shm.
>
> In that case - you are not going to see any delta with a mem-disk or
> real disk :). I am hoping that mod_mem_cache can be made faster though
> -- as it can in theory dispense with a lot more than just mod_memmap.
> But the footprint of the bucket-brigades seems such that it barely
> matters (and I am right now stuck on the CPU which has the ethernet
> card, its IRQ being way too affine).
>
> Dw
Proposal: a cron interface for httpd
Hi all,

On a number of occasions recently I have run into the need to run some kind of garbage collection within httpd, either in a dedicated process, or a dedicated thread. Attempts to solve this to date have involved setting up external tools to try and solve garbage collection problems, but this is generally less than ideal, as it amounts to stuff that the potential admin has to configure / get wrong. Ideally I want httpd to worry about its own garbage collection; I as an admin don't want it to be my problem.

The interface I had in mind was a set of hooks, as follows:

  ap_cron_per_second
  ap_cron_per_minute
  ap_cron_per_hour
  ap_cron_per_day
  ap_cron_per_week

It will be up to the relevant MPM to a) create the thread and/or process that is responsible for calling the hooks, and b) actually call the hooks. Modules just add a hook as needed, and take it for granted that code gets run within the limitations of the mechanism. While not very sophisticated, it works very well with the hook infrastructure we have now. I am operating on the assumption that one single thread and/or process running on a server that calls a possibly-empty hook once a second is cheap enough to not be a problem.

Before I code anything up, is this acceptable or are there glaring holes that I have not foreseen?

Regards, Graham

-- smime.p7s Description: S/MIME Cryptographic Signature
Re: Proposal: a cron interface for httpd
On Wed, 26 Mar 2008 17:55:43 +0200 Graham Leggett [EMAIL PROTECTED] wrote:
> Before I code anything up, is this acceptable or are there glaring
> holes that I have not foreseen?

ap_hook_monitor?

-- Nick Kew Application Development with Apache - the Apache Modules Book http://www.apachetutor.org/
Re: Proposal: a cron interface for httpd
Graham Leggett wrote:
> On a number of occasions recently I have run into the need to run some
> kind of garbage collection within httpd, either in a dedicated process,
> or a dedicated thread. [...]
> Before I code anything up, is this acceptable or are there glaring
> holes that I have not foreseen?

In general that would be helpful, e.g. mod_jk needs to check for idle connections to close them, and this check is decoupled from request processing. I guess the same could be true for dbd.

Of course things get harder if you want to provide timing guarantees to the modules using the hooks. You might end up using a thread per module and used hook in order to minimize interference of long-running methods with the hook timing.

Regards, Rainer
Re: Proposal: a cron interface for httpd
Nick Kew wrote:
> ap_hook_monitor?

A quick look found the hook, but no comments or other docs on how it works. The only code in the tree using the hook is mod_example_hooks, but it doesn't reveal any information either. Is this hook documented anywhere? I don't want to add something we already have.

Regards, Graham
Excessive chunking [was: mod_disk_cache and atimes]
Thanks for the clarification. A small correction: I meant writev() calls instead of sendfile() when working with small-size buckets.

The filter I'm developing provisionally splits the supplied buckets into relatively small buckets during content parsing. It then removes some of them and inserts some other buckets. Before passing the resulting brigade further down the filter chain, it merges all buckets that have their data in contiguous memory regions back together. So I guess I'm doing my bit in preventing excessive chunking.

I've done some research on the source files of httpd-2.2.6. The CORE filter seems to do de-chunking in the case when 16 or more buckets are passed to it (actually, the brigade is split if it contains flush buckets and each split part is checked for 16 buckets) AND the total amount of bytes in the 16 buckets does not exceed 8000. The filter then buffers the buckets together. Very clever.

KC

On 26 Mar 2008, at 15:22, Dirk-Willem van Gulik wrote:
> On Mar 26, 2008, at 4:15 PM, Konstantin Chuguev wrote:
> > Can you please clarify your mentioning the bucket-brigade footprint?
> > Are they so slow they make memory-based cache no more efficient than
> > disk-based one? Or the opposite: sendfile() works so well that
> > serving content from memory is not any faster?
>
> No - they are very fast (in an absolute sense) - and your approach is
> almost certainly the right one. However all-in-all there is a lot of
> logic surrounding them; and if you are trying to squeeze out the very
> last drop (e.g. the 1x1 gif example) - you run into all sorts of
> artificial limits, specifically on linux and 2x2 core machines; as the
> memory which needs to be accessed is just a little more scattered than
> one would prefer and all sorts of competition around the IRQ handling
> in the kernel and so on. Or in other words - in a pure static case
> where you are serving very small files which rarely if ever change,
> have no variance to any inbound headers, etc - things are not ideal.
> But that is a small price to pay - i.e. apache is more of a swiss army
> knife; which saws OK, but a proper hacksaw is 'better'.
>
> > I'm developing an Apache output filter for highly loaded servers and
> > proxies that juggles small-size buckets and brigades extensively.
> > I'm not at the stage yet where I can do performance tests but if I
> > knew this would definitely impact performance, I would perhaps
> > switch to fixed-size buffers straight away...
>
> I'd bet you are on the right track. However there is -one- small
> concern; sometimes if you have lots of buckets and very chunked output
> - then one gets lots and lots of 1-5 byte chunks; each prefixed by the
> length byte. And this can get really inefficient. Perhaps we need a
> de-bucketer to 'dechunk' when outputting chunked.
>
> Dw

Konstantin Chuguev Software Developer Mobile: +44 7734 955973 Fax: +44 20 7509 9600 Clickstream Technologies PLC, 58 Davies Street, London, W1K 5JF, Registered in England No. 3774129
Re: Proposal: a cron interface for httpd
Plüm wrote:
> What data do you supply to the hooks? What if the execution of the hook
> takes longer than the defined frequency of this hook?

That is something we decide, and code accordingly, depending on what we think we need. We could come up with something capable of spawning a dedicated process and/or thread every time there is a successful tick to run that tick, so that tick+1 isn't delayed if there is an overrun. It also helps prevent leaks, as the tick will have a pool that will be destroyed when the tick is complete and the thread/process terminates normally.

The question is, is this good enough?

Regards, Graham
Re: Excessive chunking [was: mod_disk_cache and atimes]
On Mar 26, 2008, at 5:23 PM, Konstantin Chuguev wrote:
> A small correction: I meant writev() calls instead of sendfile() when
> working with small-size buckets. The filter I'm developing
> provisionally splits the supplied buckets into relatively small buckets
> during content parsing. It then removes some of them and inserts some
> other buckets. Before passing the resulting brigade further down the
> filter chain, it merges all buckets that have their data in contiguous
> memory regions back together. So I guess I'm doing my bit in preventing
> excessive chunking. I've done some research on the source files of
> httpd-2.2.6. The CORE filter seems to do de-chunking in the case when
> 16 or more buckets are passed to it (actually, the brigade is split if
> it contains flush buckets and each split part is checked for 16
> buckets) AND the total amount of bytes in the 16 buckets does not
> exceed 8000. The filter then buffers the buckets together. Very clever.

Hmm - I am not sure that this always works - i.e. try this :)

$ cat test.shtml
<!--#set var="foo" value="bar" --><!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->

(make sure that all spaces between the directives are gone) and have the usual:

  AddType text/html .shtml
  AddOutputFilter INCLUDES .shtml
  <Directory ..>
    Options Includes
  </Directory>

in your config. You then get the output below.

Dw.
(echo "GET /test.shtml HTTP/1.1"; echo "Host: localhost"; echo; echo; sleep 10) | telnet localhost 80
Connected to localhost.
Escape character is '^]'.
HTTP/1.1 200 OK
Date: Wed, 26 Mar 2008 16:39:35 GMT
Server: Apache/2.2.8 (Unix) mod_ssl/2.2.8 OpenSSL/0.9.7l DAV/2 PHP/5.2.5
Accept-Ranges: bytes
Transfer-Encoding: chunked
Content-Type: text/html

1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
2
Re: Dynamic configuration for the hackathon?
On 3/26/08 10:31 AM, Nick Kew [EMAIL PROTECTED] wrote:
> Straightforward: conditions on headers, method (obsoletes Limit),
> request line, env, CGI vars. With the option to disable conditional
> stuff for speed.

In mod_include, we parse into a tree on every request. For the configuration, we should probably just parse it at startup and run it on every request.

Also, currently, ap_expr is string specific; it would be nice if this was provider based. Not sure of the exact interface, but it would be extendable for other types of comparisons, for example:

  typedef struct {
      apr_status_t (*init_expr)(apr_pool_t *p, const char *lvalue,
                                const char *rvalue, int expr, void **data);
      apr_status_t (*eval_expr)(request_rec *r, void *data);
  } ap_expr_provider_t;

So this expression, at startup:

  If Remote_IP =~ 10.189.
  ...
  /endif

Would call the provider registered for Remote_IP as:

  provider->init_expr(conf->pool, "Remote_IP", "10.189.", AP_EXPR_REGEX, &data);

The provider would construct whatever struct it needs, in this case to do partial ip address matching, shove that into a struct, and return it via the data argument. And then, at request time, we would run:

  provider->eval_expr(r, data)

Where data is what was returned by init. This returns basically true or false. The string stuff could be easily integrated and provided by default. The nice thing is all of this could then be used in mod_include as well, as well as any other modules that use ap_expr.

Thoughts?

-- Brian Akins Chief Operations Engineer Turner Digital Media Technologies
Re: Proposal: a cron interface for httpd
On Mar 26, 2008, at 5:35 PM, Graham Leggett wrote:
> We could come up with something capable of spawning a dedicated process
> and/or thread every time there is a successful tick to run that tick,
> so that tick+1 isn't delayed if there is an overrun.

Or reduce the interface to a simple 'register callback at-or-before, or at-or-after, this time' - and leave it up to the called entity to re-register itself.

> It also helps prevent leaks, as the tick will have a pool that will be
> destroyed when the tick is complete and the thread/process terminates
> normally.

Dw
Re: Dynamic configuration for the hackathon?
On 3/26/08 12:42 PM, Akins, Brian [EMAIL PROTECTED] wrote:
> Thoughts?

Of course, it will not work exactly as I have said because we have to take stuff like variable substitution into account, etc. Was just thinking out loud...

-- Brian Akins Chief Operations Engineer Turner Digital Media Technologies
Re: Dynamic configuration for the hackathon?
On Wed, 26 Mar 2008 12:42:51 -0400 Akins, Brian [EMAIL PROTECTED] wrote:

> In mod_include, we parse into a tree on every request. For the
> configuration, we should probably just parse it at startup and run it
> on every request.

Indeed - hence the parse/eval separation in the proposed API.

> Also, currently, ap_expr is string specific; it would be nice if this
> was provider based. Not sure of the exact interface, but it would be
> extendable for other types of comparisons, for example:

Well, we always start from a string. Later when it's tokenised we can, and indeed do, dispatch to a provider (in mod_include's case, functions called handle_foo for keyword foo).

>   typedef struct {
>       apr_status_t (*init_expr)(apr_pool_t *p, const char *lvalue,
>                                 const char *rvalue, int expr, void **data);
>       apr_status_t (*eval_expr)(request_rec *r, void *data);
>   } ap_expr_provider_t;

That's no use at top level, because

> So this expression, at startup:
>
>   If Remote_IP =~ 10.189.
>   ...
>   /endif
>
> Would call the provider registered for Remote_IP as:

we have to parse a string before we have Remote_IP. Once we have that, sure, our evaluation function can dispatch to the Remote_IP handler.

You seem to be looking a little further than my proposal went. Which is kind-of why it would be good to hackathonise this :-)

-- Nick Kew Application Development with Apache - the Apache Modules Book http://www.apachetutor.org/
Re: Dynamic configuration for the hackathon?
On 3/26/08 1:14 PM, Nick Kew [EMAIL PROTECTED] wrote:
> we have to parse a string before we have Remote_IP. Once we have that,
> sure, our evaluation function can dispatch to the Remote_IP handler.

Of course. I was getting ahead of myself...

> You seem to be looking a little further than my proposal went. Which is
> kind-of why it would be good to hackathonise this :-)

True. Let me digest this some more...

-- Brian Akins Chief Operations Engineer Turner Digital Media Technologies
Re: Proposal: a cron interface for httpd
On Mar 26, 2008, at 11:55 AM, Graham Leggett wrote:
> On a number of occasions recently I have run into the need to run some
> kind of garbage collection within httpd, either in a dedicated process,
> or a dedicated thread. [...]
> Before I code anything up, is this acceptable or are there glaring
> holes that I have not foreseen?

Sounds good... Why not have the parent process be the actual cron keeper and use some method to signal the child processes (something pod-like or maybe a shared memory struct) when something needs to be run?
Re: Proposal: a cron interface for httpd
Dirk-Willem van Gulik wrote:
> Or reduce the interface to a simple 'register callback at-or-before, or
> at-or-after, this time' - and leave it up to the called entity to
> re-register itself.

The Eclipse Job interface works like this: you basically say run this X ms from now, and if you want to run it again, you schedule it again before you're done. The catch is that the timings aren't that accurate, but then for most applications it doesn't need to be.

Regards, Graham
Re: Proposal: a cron interface for httpd
On Mar 26, 2008, at 6:45 PM, Graham Leggett wrote:
> The Eclipse Job interface works like this: you basically say run this X
> ms from now, and if you want to run it again, you schedule it again
> before you're done. The catch is that the timings aren't that accurate,
> but then for most applications it doesn't need to be.

But that is just a matter of having clear semantics; i.e. 'run on, or as soon as possible after', or 'try to run before'. And we should be fine. The remaining issue is fragility; i.e. a task which aborts and 'forgets' to re-register is in pain.

Dw
Re: Proposal: a cron interface for httpd
Graham Leggett wrote:
> On a number of occasions recently I have run into the need to run some
> kind of garbage collection within httpd, either in a dedicated process,
> or a dedicated thread.

I've also written a few modules where each child process runs a private thread in the background. I'd suggest there are perhaps three variants here (in a Unix-like context, anyway): one separate process, one thread in the master process, or one thread per child process. Presumably MPMs should somehow indicate which of these they support.

> ap_cron_per_second
> ap_cron_per_minute
> ap_cron_per_hour
> ap_cron_per_day
> ap_cron_per_week

I wonder if you could flatten these down to a single ap_monitor_interval() kind of thing, where the module specified the interval it wanted? I suppose one could go the other direction toward a full-blown scheduler, but that seems like a lot of extra effort for perhaps little gain.

It might be nice to also offer the option to randomly stagger the creation of processes/threads within some additional time interval -- especially for one-per-child threads, that could help avoid a thundering-herd kind of situation where a bunch of child processes start up together (e.g., at a restart), and later all kick off resource-intensive threads at nearly the same time. Of course a module's background threads could wait some random interval on their own after being started, but then they're eating up time sleeping while the invoking process thinks they're doing work.

My other thought is that it would be really nice to be able to track the status of these processes and threads in mod_status; e.g., in the scoreboard or a scoreboard-like utility. Moreover, it would be excellent if the scheduler/MPM could use this info to avoid spawning new processes/threads if the old ones were still executing ... for a per-second kind of interval, that might be quite important, especially if there's any chance the task could occasionally get stuck.
I feel this relates a bit to my continued interest (mod lack of time) in abstracting the scoreboard and shared-memory subsystems into a shared map facility. Joe Orton did a bunch of work on the SSL session cache which I think moves it in this direction; see his RFC and a couple of my responses:

http://marc.info/?l=apache-httpd-dev&m=120397759902722&w=2
http://marc.info/?l=apache-httpd-dev&m=120406346306713&w=2
http://marc.info/?l=apache-httpd-dev&m=120491055413781&w=2

I also put a larger outline of some of my notions in this regard into this document:

http://svn.apache.org/viewvc/httpd/sandbox/amsterdam/architecture/scoreboard.txt?view=markup

In particular, in thinking about background processes and threads both of the type you're proposing and those created by modules like mod_cgid and mod_fcgid, I had begun tossing around the notion of modules being able to register additional scoreboard states beyond those hard-coded now into mod_status:

- during pre/check/post-config phases, modules (including MPMs) may indicate if they need IPC space, what type, and how much:
  - private space in scoreboard table values
  - additional scoreboard states
- at startup, the master process initializes the storage provider:
  - MPM sizes scoreboard based on runtime process and thread limits, not compile-time maximums
  - assigns IDs for additional scoreboard states requested by modules
  - creates scoreboard state-to-ID hash mappings in regular memory as part of read-only configuration data inherited by children

We could then offer modules standard ways to ask for processes or threads to be spawned at startup/restart time, or on some schedule (as per your initial proposal), and for these processes/threads to update their status record in the scoreboard. Certain MPMs (e.g., worker, event) also spawn threads that aren't recorded in the scoreboard at the moment; it would be great to see them in the scoreboard too.
The administrator could then see everything at a glance in mod_status, including all the background tasks, and the scheduler/MPM would have a standard way to check if a background task was still running from a previous invocation. I feel like there's a certain serendipity in this proposal coming along around the same time as Joe Orton's work, which seems to me to be heading toward not so much of a cache in the traditional httpd sense (i.e., mod_cache and friends) as a generic shared map interface that would be useful in a wide variety of ways, including implementing a configurable scoreboard that could help track arbitrary background tasks. Thoughts, flames? Fire away! Thanks, Chris. -- GPG Key ID: 366A375B GPG Key Fingerprint: 485E 5041 17E1 E2BB C263 E4DE C8E3 FA36 366A 375B
Re: Proposal: a cron interface for httpd
The way I do this is simple and primitive. If I have an action I need to be run, I do something like:

  <Location /my-cron-job-thing>
    SetHandler MyCronJobThing
    Allow from 127.0.0.1
    Deny from All
  </Location>

And then in cron:

  * * * * * curl http://localhost/my-cron-job-thing > /logs/cronjob.log 2>&1

-- Brian Akins Chief Operations Engineer Turner Digital Media Technologies
Re: Proposal: a cron interface for httpd
On Mar 26, 2008, at 2:52 PM, Akins, Brian wrote:
> The way I do this is simple and primitive. If I have an action I need
> to be run, I do something like:
>
>   <Location /my-cron-job-thing>
>     SetHandler MyCronJobThing
>     Allow from 127.0.0.1
>     Deny from All
>   </Location>
>
> And then in cron:
>
>   * * * * * curl http://localhost/my-cron-job-thing > /logs/cronjob.log 2>&1

I've done something similar with mutexed internal subrequests that trigger a specific handler...