Query on deletion of Request pool

2008-03-26 Thread Arnab Ganguly
Hi All,
I am getting a serious memory issue with my Apache webserver.

Initially I was allocating the buffer using apr_palloc from the request
pool, assuming the allocated memory would be released, but for some reason
the memory grows without bound.

I then tried my own malloc and added a cleanup function via
apr_pool_cleanup_run. Debugging showed that free is being called for the
allocated memory, but the behavior is the same: on each request the
memory increases.

Is there any way to release the memory of the request pool explicitly? I
tried apr_pool_cleanup_register with a cleanup function that internally
called apr_pool_destroy(request_rec->pool). For the first request it
worked correctly, but on later requests the Apache process crashed and
restarted.
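(For reference, the usual pattern is to register the cleanup and let httpd destroy the pool itself; a sketch against the APR API, with the buffer name and size made up:)

```c
/* A malloc()ed buffer tied to the request pool's lifetime.  Note that
 * r->pool belongs to httpd: the server clears/destroys it itself after
 * the request, so a module must never call apr_pool_destroy(r->pool) --
 * doing so is what makes the later requests crash. */
static apr_status_t free_buffer(void *data)
{
    free(data);
    return APR_SUCCESS;
}

/* ... inside a handler, with request_rec *r ... */
char *buf = malloc(8192);                       /* size made up */
apr_pool_cleanup_register(r->pool, buf, free_buffer,
                          apr_pool_cleanup_null);
```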

It seems I may be doing something silly. Any help would be very much
appreciated. BTW my webserver is heavily loaded; it is the Worker MPM,
Apache version 2.2.8, and the OS is Red Hat 3.0.

Where else could the memory leak be coming from?
Looking forward to your response.
Thanks
-A


Re: Query on deletion of Request pool

2008-03-26 Thread Eric Covener
On Wed, Mar 26, 2008 at 11:39 AM, Arnab Ganguly [EMAIL PROTECTED] wrote:
 Hi All,
  I am getting a serious memory issue with my Apache webserver.

  Initially I was allocating the buffer using apr_palloc from the request
  pool, assuming the allocated memory would be released, but for some reason
  the memory grows without bound.

How are you measuring memory use?  Have you tried MaxMemFree?

By default, apache won't continuously return this storage to the
native heap because it's likely going to be needed again anyway.
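(For reference, MaxMemFree is specified in KBytes per allocator; the value below is only an illustration:)

```
# Allow each allocator to retain at most 2048 KB of free memory;
# anything beyond that is handed back to the C library with free().
MaxMemFree 2048
```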

-- 
Eric Covener
[EMAIL PROTECTED]


Re: flood random subst patch

2008-03-26 Thread Plüm, Rüdiger, VF-Group
 

 -----Original Message-----
 From: Sander Temme 
 Sent: Wednesday, 26 March 2008 06:48
 To: dev@httpd.apache.org
 Subject: Re: flood random subst patch
 
 
 On Mar 25, 2008, at 9:32 PM, James M. Leddy wrote:
 
  A Flood 'project', a httpd-test 'project' with a flood 'component', or
  an httpd 'project' with new subprojects.

 Thank you for making this clear.  My vote is for choice #2.
 
 
 I'd also tend towards a httpd-test Product with perl-framework and  
 Flood Components.  I'll give it a day or so for folks to weigh in,  
 then create it.

+1 to this solution.

Regards

Rüdiger



Dynamic configuration for the hackathon?

2008-03-26 Thread Nick Kew
There seems to be a demand for dynamic per-request configuration,
as evidenced by the number of users hacking it with mod_rewrite,
and the other very limited tools available.  Modern mod_rewrite
usage commonly looks like programming, but it's not designed as
a programming language.  Result: confused and frustrated users.

We could make simple changes to mod_rewrite itself: for example,
a RewriteCond container would at least bring users basic
block structuring.  But so long as the block context applies
only to mod_rewrite, it remains ad-hoc tinkering with the problem.
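To illustrate the kind of hack in question, here is a typical "conditional config" emulated with mod_rewrite today (the directives are real; host, path and variable names are made up):

```
RewriteEngine On
# "if the host is www.example.com and the URI is under /app/, set an env var"
RewriteCond %{HTTP_HOST}   ^www\.example\.com$ [NC]
RewriteCond %{REQUEST_URI} ^/app/
RewriteRule .* - [E=coolstuff:1]
```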

I'm wondering what it would take to get us to something like:

<if [some-per-request-expr]>
    Directives applicable per-request
    (anything, subject to per-directive context checking)
</if>

(ideally with <else> and <elsif>)

Clearly an <if> container has to create a configuration
record that'll be merged if and only if the condition is
satisfied by a request.  The condition should have access
to headers_in and subprocess_env (with client info stuff
included), as well as the request line.

As a further step we could consider evaluating the <if>
contents per-request, with support for variable
interpolation and backreferences using ap_expr.

A noble objective: render mod_rewrite obsolete :-)

Anyone fancy spending some hackathon time on this in Amsterdam?

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Akins, Brian
On 3/26/08 9:06 AM, Nick Kew [EMAIL PROTECTED] wrote:

 There seems to be a demand for dynamic per-request configuration,
 as evidenced by the number of users hacking it with mod_rewrite,
 and the other very limited tools available.  Modern mod_rewrite
 usage commonly looks like programming, but it's not designed as
 a programming language.  Result: confused and frustrated users.


This is what I had in mind when I suggested having Lua blocks of code.  No
need to invent a new language when a perfectly fine one exists...

-- 
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies



Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Issac Goldstand




Akins, Brian wrote:

On 3/26/08 9:06 AM, Nick Kew [EMAIL PROTECTED] wrote:


There seems to be a demand for dynamic per-request configuration,
as evidenced by the number of users hacking it with mod_rewrite,
and the other very limited tools available.  Modern mod_rewrite
usage commonly looks like programming, but it's not designed as
a programming language.  Result: confused and frustrated users.



This is what I had in mind when I suggested having Lua blocks of code.  No
need to invent a new language when a perfectly fine one exists...



FWIW, it's done with <Perl> blocks too (I do some funky things that 
way), BUT I'm not sure if those are parsed per-request as I think Nick 
is suggesting.  Also, many times people don't want to bloat their 
processes with a fully-fledged interpreter (again, I'm building on my 
mod_perl experience here - I know that the shared Perl objects are 
pretty clunky, and not sure if mod_wombat looks the same).


Right now it doesn't look like I'll be in Amsterdam...

 Issac


Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Nick Kew
On Wed, 26 Mar 2008 15:39:53 +0200
Issac Goldstand [EMAIL PROTECTED] wrote:

 
 
 
 Akins, Brian wrote:
  On 3/26/08 9:06 AM, Nick Kew [EMAIL PROTECTED] wrote:
  
  There seems to be a demand for dynamic per-request configuration,
  as evidenced by the number of users hacking it with mod_rewrite,
  and the other very limited tools available.  Modern mod_rewrite
  usage commonly looks like programming, but it's not designed as
  a programming language.  Result: confused and frustrated users.
  
  
  This is what I had in mind when I suggested having Lua blocks of
  code.  No need to invent a new language when a perfectly fine one
  exists...

I'm not talking about inventing a new language.  Those who want one
have some options already, as noted below ...

 
 FWIW, it's done with <Perl> blocks too (I do some funky things that 
 way), BUT I'm not sure if those are parsed per-request as I think
 Nick is suggesting.

Neither am I, FWIW.

 Also, many times people don't want to bloat
 their processes with a fully-fledged interpreter

That is much more of a consideration.  As I said, the basic idea is
to provide a much simpler rationalisation for the kind of things
people struggle to do with mod_rewrite et al.

(again, I'm building
 on my mod_perl experience here - I know that the shared Perl objects
 are pretty clunky, and not sure if mod_wombat looks the same).

AFAICT mod_wombat just provides Lua bindings (some of them stubs)
for hooks exported from the core.  Nothing for configuration.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: flood random subst patch

2008-03-26 Thread William A. Rowe, Jr.

Sander Temme wrote:


On Mar 25, 2008, at 9:32 PM, James M. Leddy wrote:


A Flood 'project', a httpd-test 'project' with a flood 'component', or
an httpd 'project' with new subprojects.

Thank you for making this clear.  My vote is for choice #2.



I'd also tend towards a httpd-test Product with perl-framework and Flood 
Components.  I'll give it a day or so for folks to weigh in, then create 
it.


Rereading - I'd agree that #2 is best.



Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Akins, Brian
On 3/26/08 9:53 AM, Nick Kew [EMAIL PROTECTED] wrote:

 I'm not talking about inventing a new language.  Those who want one
 have some options already, as noted below ...

Right.  I was just throwing it out there, so to speak.  I'm not opposed to
what you are saying, just wondering if we would/should take it to the next
level.

As to your suggestion:

So basically, the per_dir merge would use this mechanism instead of what it
does now (file walk, location walk) (or in addition to??)

Something like:

<If Directory == /www/stuff and Remote_IP =~ 10.189.>
    SetEnv coolstuff
<Elsif HTTP_Host == www.domain.com or Local_Port == 8080>
    Set something different
<Elsif ENV{blah} =~ foo or Cookie{baz} == iamset>
    foo bar
<Else>
    something completely different
</endif>


(Horrible example, I know.)  If it were easy to extend the expressions (i.e.
if I could implement Cache == yes/no, and stuff like ENV{key} were made to
work), I'm all for it.

It *should* be fairly easy to test this out with the current system (à la
<Proxy> blocks).


-- 
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies



Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Nick Kew
On Wed, 26 Mar 2008 10:15:05 -0400
Akins, Brian [EMAIL PROTECTED] wrote:

 As to your suggestion:
 
 So basically, the per_dir merge would use this mechanism instead of
 what it does now (file walk, location walk) (or in addition to??)
 
 Something like:
 
 <If Directory == /www/stuff and Remote_IP =~ 10.189.>
     SetEnv coolstuff
 <Elsif HTTP_Host == www.domain.com or Local_Port == 8080>
     Set something different
 <Elsif ENV{blah} =~ foo or Cookie{baz} == iamset>
     foo bar
 <Else>
     something completely different
 </endif>

Sort-of.  There's a question of ordering: merge-config happens
before some of the vars we'd like to make available are available,
so we'd have a bit more work to make Cookie{baz} work (as opposed
to parsing a CGI-style HTTP_COOKIE variable).  But that's basically
the kind of thing.

Oh, and there's no inherent reason it shouldn't apply per-host
config too.

 (Horrible example, I know.)  If it were easy to extend the expressions
 (i.e. if I could implement Cache == yes/no, and stuff like ENV{key}
 were made to work), I'm all for it.

Straightforward: conditions on headers, method (obsoletes Limit),
request line, env, CGI vars.  With the option to disable conditional
stuff for speed.  Higher-level stuff like *evaluating* caching or
cookie headers ... maybe sometime, but one could argue that's the
point where Perl or Lua makes more sense.

 It *should* be fairly easy to test this out with the current system
 (à la <Proxy> blocks).

Hopefully, yes.

Oh, and since ap_expr is a prerequisite for this, it would be great
if folks could review at least the API part (ap_expr.h) of what
I posted earlier.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: mod_disk_cache and atimes

2008-03-26 Thread Dirk-Willem van Gulik


On Mar 25, 2008, at 8:04 PM, Akins, Brian wrote:

Use really small files so you won't fill up the pipe.  Using a 1x1 gif,
I run out of CPU before I run out of bandwidth.


Agreed. This is what I was working on.


My cache size is smaller and is in /dev/shm.



In that case - you are not going to see any delta with a mem-disk or  
real disk :). I am hoping that mod_mem_cache can be made faster though  
-- as it can in theory dispense with a lot more than just mod_memmap.  
But the footprint of the bucket brigades seems such that it barely  
matters (and I am right now stuck on the CPU that has the ethernet  
card's IRQ being way too affine).


Dw


Re: mod_disk_cache and atimes

2008-03-26 Thread Konstantin Chuguev

Hi Dirk-Willem,

Can you please clarify your mention of the bucket-brigade footprint?  
Are they so slow that they make a memory-based cache no more efficient  
than a disk-based one? Or the opposite: does sendfile() work so well  
that serving content from memory is not any faster?


I'm developing an Apache output filter for highly loaded servers and  
proxies that juggles small-size buckets and brigades extensively. I'm  
not at the stage yet where I can do performance tests but if I knew  
this would definitely impact performance, I would perhaps switch to  
fixed-size buffers straight away...


Thank you.
KC


On 26 Mar 2008, at 14:43, Dirk-Willem van Gulik wrote:


On Mar 25, 2008, at 8:04 PM, Akins, Brian wrote:

Use really small files so you won't fill up the pipe.  Using a 1x1 gif,  
I run out of CPU before I run out of bandwidth.


Agreed. This is what I was working on.


My cache size is smaller and is in /dev/shm.



In that case - you are not going to see any delta with a mem-disk or  
real disk :). I am hoping that mod_mem_cache can be made faster  
though -- as it can in theory dispense with a lot more than just  
mod_memmap. But the footprint of the bucket brigades seems such that  
it barely matters (and I am right now stuck on the CPU that has the  
ethernet card's IRQ being way too affine).


Dw






Proposal: a cron interface for httpd

2008-03-26 Thread Graham Leggett

Hi all,

On a number of occasions recently I have run into the need to run some 
kind of garbage collection within httpd, either in a dedicated process, 
or a dedicated thread.


Attempts to solve this to date have involved setting up of external 
tools to try and solve garbage collection problems, but this is 
generally less than ideal, as it amounts to stuff that the potential 
admin has to configure / get wrong.


Ideally I want httpd to worry about its own garbage collection; as an 
admin, I don't want it to be my problem.


The interface I had in mind was a set of hooks, as follows:

ap_cron_per_second
ap_cron_per_minute
ap_cron_per_hour
ap_cron_per_day
ap_cron_per_week

It will be up to the relevant MPM to a) create the thread and/or process 
responsible for calling the hooks, and b) actually call the hooks.


Modules just add a hook as needed, and take it for granted that code 
gets run within the limitations of the mechanism.


While not very sophisticated, it works very well with the hook 
infrastructure we have now. I am operating on the assumption that a 
single thread and/or process running on a server that calls a 
possibly-empty hook once a second is cheap enough not to be a problem.


Before I code anything up, is this acceptable or are there glaring holes 
that I have not foreseen?


Regards,
Graham
--




Re: Proposal: a cron interface for httpd

2008-03-26 Thread Nick Kew
On Wed, 26 Mar 2008 17:55:43 +0200
Graham Leggett [EMAIL PROTECTED] wrote:

 Before I code anything up, is this acceptable or are there glaring
 holes that I have not foreseen?

ap_hook_monitor?
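(For context: the monitor hook is declared in ap_mpm.h and is run periodically, roughly once a second, from the parent process's maintenance loop. A sketch of a module wiring into it, using the 2.2.x-era signature; check your version's headers:)

```c
#include "httpd.h"
#include "http_config.h"
#include "ap_mpm.h"   /* AP_DECLARE_HOOK(int, monitor, (apr_pool_t *p)) */

static int my_monitor(apr_pool_t *p)
{
    /* periodic housekeeping goes here; keep it short-lived,
       since it runs in the parent process */
    return DECLINED;  /* allow other monitor hooks to run too */
}

static void register_hooks(apr_pool_t *p)
{
    ap_hook_monitor(my_monitor, NULL, NULL, APR_HOOK_MIDDLE);
}
```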

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: Proposal: a cron interface for httpd

2008-03-26 Thread Rainer Jung

Graham Leggett wrote:

Hi all,

On a number of occasions recently I have run into the need to run some 
kind of garbage collection within httpd, either in a dedicated process, 
or a dedicated thread.


Attempts to solve this to date have involved setting up of external 
tools to try and solve garbage collection problems, but this is 
generally less than ideal, as it amounts to stuff that the potential 
admin has to configure / get wrong.


Ideally I want httpd to worry about its own garbage collection, I as an 
admin don't want it to be my problem.


The interface I had in mind was a set of hooks, as follows:

ap_cron_per_second
ap_cron_per_minute
ap_cron_per_hour
ap_cron_per_day
ap_cron_per_week

It will be up to the relevant MPM to a) create the thread and/or process 
that is responsible for calling the hooks, and to actually call the hooks.


Modules just add a hook as needed, and take it for granted that code 
gets run within the limitations of the mechanism.


While not very sophisticated, it works very well with the hook 
infrastructure we have now. I am operating on the assumption that one 
single thread and/or process running on a server that calls a 
possible-empty hook once a second is cheap enough to not be a problem.


Before I code anything up, is this acceptable or are there glaring holes 
that I have not foreseen?


Regards,
Graham


In general that would be helpful, e.g. mod_jk needs to check for idle 
connections to close them, and this check is decoupled from request 
processing. I guess the same could be true for dbd.


Of course things get harder if you want to provide timing guarantees to 
the modules using the hooks. You might end up using a thread per module 
and hook in order to minimize interference of long-running methods with 
the hook timing.


Regards,

Rainer


Re: Proposal: a cron interface for httpd

2008-03-26 Thread Graham Leggett

Nick Kew wrote:


ap_hook_monitor?


A quick look found the hook, but no comments or other docs on how it 
works. The only code in the tree using the hook is mod_example_hooks, 
but it doesn't reveal any information either.


Is this hook documented anywhere?

I don't want to add something we already have.

Regards,
Graham
--




Excessive chunking [was: mod_disk_cache and atimes]

2008-03-26 Thread Konstantin Chuguev

Thanks for the clarification.

A small correction: I meant writev() calls instead of sendfile() when  
working with small-size buckets.


The filter I'm developing provisionally splits the supplied buckets  
into relatively small buckets during content parsing. It then removes  
some of them and inserts some other buckets. Before passing the  
resulting brigade further down the filter chain, it merges all buckets  
that have their data in contiguous memory regions back together. So I  
guess I'm doing my bit in preventing excessive chunking.


I've done some research on the source files of httpd-2.2.6. The CORE  
output filter seems to do de-chunking when 16 or more buckets are  
passed to it (actually, the brigade is split if it contains flush  
buckets, and each split part is checked for 16 buckets) AND the total  
number of bytes in the 16 buckets does not exceed 8000. The filter  
then buffers the buckets together. Very clever.


KC


On 26 Mar 2008, at 15:22, Dirk-Willem van Gulik wrote:


On Mar 26, 2008, at 4:15 PM, Konstantin Chuguev wrote:

Can you please clarify your mention of the bucket-brigade  
footprint? Are they so slow that they make a memory-based cache no more  
efficient than a disk-based one? Or the opposite: does sendfile() work  
so well that serving content from memory is not any faster?


No - they are very fast (in an absolute sense) - and your approach  
is almost certainly the right one.


However, all in all there is a lot of logic surrounding them; and if  
you are trying to squeeze out the very last drop (e.g. the 1x1 gif  
example) you run into all sorts of artificial limits, specifically  
on Linux and 2x2-core machines, as the memory which needs to be  
accessed is just a little more scattered than one would prefer, and  
there is all sorts of competition around the IRQ handling in the  
kernel and so on.


Or in other words - in a pure static case where you are serving very  
small files which rarely if ever change, have no variance on any  
inbound headers, etc. - things are not ideal.


But that is a small price to pay - i.e. apache is more of a swiss  
army knife: it saws OK, but a proper hacksaw is 'better'.


I'm developing an Apache output filter for highly loaded servers  
and proxies that juggles small-size buckets and brigades  
extensively. I'm not at the stage yet where I can do performance  
tests but if I knew this would definitely impact performance, I  
would perhaps switch to fixed-size buffers straight away...



I'd bet you are on the right track. However there is -one- small  
concern; sometimes if you have lots of buckets and very chunked  
output, one gets lots and lots of 1-5 byte chunks, each prefixed  
by a length line. And this can get really inefficient.


Perhaps we need a de-bucketer to 'dechunk' when outputting chunked.

Dw



Konstantin Chuguev
Software Developer

Mobile: +44 7734 955973
Fax: + 44 20 7509 9600
Clickstream Technologies PLC, 58 Davies Street, London, W1K 5JF,  
Registered in England No. 3774129





Re: Proposal: a cron interface for httpd

2008-03-26 Thread Graham Leggett

Plüm wrote:


What data do you supply to the hooks?
What if the execution of the hook takes longer then the defined frequency
of this hook?


That is something we decide, and code accordingly, depending on what we 
think we need.


We could come up with something capable of spawning a dedicated process 
and/or thread on each tick to run that tick's work, so that tick+1 isn't 
delayed if there is an overrun.


It also helps prevent leaks, as the tick will have a pool that will be 
destroyed when the tick is complete and the thread/process terminates 
normally.


The question is, is this good enough?

Regards,
Graham
--





Re: Excessive chunking [was: mod_disk_cache and atimes]

2008-03-26 Thread Dirk-Willem van Gulik


On Mar 26, 2008, at 5:23 PM, Konstantin Chuguev wrote:

A small correction: I meant writev() calls instead of sendfile()  
when working with small-size buckets.


The filter I'm developing provisionally splits the supplied buckets  
into relatively small buckets during content parsing. It then  
removes some of them and inserts some other buckets. Before passing  
the resulting brigade further down the filter chain, it merges all  
buckets that have their data in contiguous memory regions back  
together. So I guess I'm doing my bit in preventing excessive  
chunking.


I've done some research on the source files of httpd-2.2.6. The CORE  
filter seems to do de-chunking in the case when 16 or more buckets  
are passed to it (actually, the brigade is split if it contains  
flush buckets and each split part is checked for 16 buckets) AND the  
total amount of bytes in the 16 buckets does not exceed 8000. The  
filter then buffers the buckets together. Very clever.


Hmm - I am not sure that this always works - i.e. try this :)

$ cat test.shtml
<!--#set var="foo" value="bar" -->g<!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" -->


(make sure that all spaces between the --> and <!-- are gone) and have the  
usual:


AddType text/html .shtml
AddOutputFilter INCLUDES .shtml
<Directory ..>
Options Includes
..
</Directory>

in your config. You then get the output below.

Dw.

(echo "GET /test.shtml HTTP/1.1"; echo "Host: localhost"; echo; echo;  
sleep 10) | telnet localhost 80

Connected to localhost.
Escape character is '^]'.
HTTP/1.1 200 OK
Date: Wed, 26 Mar 2008 16:39:35 GMT
Server: Apache/2.2.8 (Unix) mod_ssl/2.2.8 OpenSSL/0.9.7l DAV/2 PHP/5.2.5
Accept-Ranges: bytes
Transfer-Encoding: chunked
Content-Type: text/html



1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

2





Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Akins, Brian
On 3/26/08 10:31 AM, Nick Kew [EMAIL PROTECTED] wrote:

 Straightforward: conditions on headers, method (obsoletes Limit),
 request line, env, CGI vars.  With the option to disable conditional
 stuff for speed. 

In mod_include, we parse into a tree on every request.  For the
configuration, we should probably just parse it at startup and run it on
every request.

Also, currently ap_expr is string-specific; it would be nice if this was
provider-based. Not sure of the exact interface, but it would be extensible
to other types of comparisons; for example:

typedef struct {
    apr_status_t (*init_expr)(apr_pool_t *p, const char *lvalue,
                              const char *rvalue, int expr, void **data);
    apr_status_t (*eval_expr)(request_rec *r, void *data);
} ap_expr_provider_t;

So this expression, at startup:

<If Remote_IP =~ 10.189.>
...
</endif>

Would call the provider registered for Remote_IP as:

provider->init_expr(conf->pool, "Remote_IP", "10.189.", AP_EXPR_REGEX,
                    &data);

The provider would construct whatever struct it needs (in this case, to do
partial IP address matching), shove that into a struct, and return it via
the data argument.

And then, at request time, we would run:

provider->eval_expr(r, data)

Where data is what was returned by init.  This returns basically true or
false.

The string stuff could be easily integrated and provided by default.  The
nice thing is that all of this could then be used in mod_include, as well
as in any other modules that use ap_expr.

Thoughts?


-- 
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies



Re: Proposal: a cron interface for httpd

2008-03-26 Thread Dirk-Willem van Gulik


On Mar 26, 2008, at 5:35 PM, Graham Leggett wrote:

We could come up with something capable of spawning a dedicated  
process and/or thread every time there is a successful tick to run  
that tick, so that tick+1 isn't delayed if there is an overrun.


Or reduce the interface to a simple 'register a callback at-or-before,  
or at-or-after, this time' - and leave it up to the called entity to  
re-register itself.


It also helps prevent leaks, as the tick will have a pool that will  
be destroyed when the tick is complete and the thread/process  
terminates normally.



Dw


Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Akins, Brian
On 3/26/08 12:42 PM, Akins, Brian [EMAIL PROTECTED] wrote:


 Thoughts?

Of course, it will not work exactly as I have said because we have to take
stuff like variable substitution into account, etc.  Was just thinking out
loud...



-- 
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies



Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Nick Kew
On Wed, 26 Mar 2008 12:42:51 -0400
Akins, Brian [EMAIL PROTECTED] wrote:

 On 3/26/08 10:31 AM, Nick Kew [EMAIL PROTECTED] wrote:
 
  Straightforward: conditions on headers, method (obsoletes Limit),
  request line, env, CGI vars.  With the option to disable conditional
  stuff for speed. 
 
 In mod_include, we parse into a tree on every request.  For the
 configuration, we should probably just parse it at startup and run
 it on every request.

Indeed - hence the parse/eval separation in the proposed API.

 Also, currently ap_expr is string-specific; it would be nice if this
 was provider-based. Not sure of the exact interface, but it would be
 extensible to other types of comparisons, for example.

Well, we always start from a string.  Later when it's tokenised
we can, and indeed do, dispatch to a provider (in mod_include's
case, functions called handle_foo for keyword foo).

 typedef struct {
     apr_status_t (*init_expr)(apr_pool_t *p, const char *lvalue,
                               const char *rvalue, int expr, void **data);
     apr_status_t (*eval_expr)(request_rec *r, void *data);
 } ap_expr_provider_t;

That's no use at top level, because

 So this expression, at startup:
 
 <If Remote_IP =~ 10.189.>
 ...
 </endif>
 
 Would call the provider registered for Remote_IP as:

we have to parse a string before we have Remote_IP.
Once we have that, sure, our evaluation function can dispatch
to the Remote_IP handler.

You seem to be looking a little further than my proposal went.
Which is kind-of why it would be good to hackathonise this:-)

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: Dynamic configuration for the hackathon?

2008-03-26 Thread Akins, Brian
On 3/26/08 1:14 PM, Nick Kew [EMAIL PROTECTED] wrote:

 we have to parse a string before we have Remote_IP.
 Once we have that, sure, our evaluation function can dispatch
 to the Remote_IP handler.

Of course.  I was getting ahead of myself...


 You seem to be looking a little further than my proposal went.
 Which is kind-of why it would be good to hackathonise this:-)

True.  Let me digest this some more...


-- 
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies



Re: Proposal: a cron interface for httpd

2008-03-26 Thread Jim Jagielski


On Mar 26, 2008, at 11:55 AM, Graham Leggett wrote:

Hi all,

On a number of occasions recently I have run into the need to run  
some kind of garbage collection within httpd, either in a dedicated  
process, or a dedicated thread.


Attempts to solve this to date have involved setting up of external  
tools to try and solve garbage collection problems, but this is  
generally less than ideal, as it amounts to stuff that the  
potential admin has to configure / get wrong.


Ideally I want httpd to worry about its own garbage collection, I as  
an admin don't want it to be my problem.


The interface I had in mind was a set of hooks, as follows:

ap_cron_per_second
ap_cron_per_minute
ap_cron_per_hour
ap_cron_per_day
ap_cron_per_week

It will be up to the relevant MPM to a) create the thread and/or  
process that is responsible for calling the hooks, and to actually  
call the hooks.


Modules just add a hook as needed, and take it for granted that code  
gets run within the limitations of the mechanism.


While not very sophisticated, it works very well with the hook  
infrastructure we have now. I am operating on the assumption that  
one single thread and/or process running on a server that calls a  
possible-empty hook once a second is cheap enough to not be a problem.


Before I code anything up, is this acceptable or are there glaring  
holes that I have not foreseen?




Sounds good... Why not have the parent process be the actual cron keeper
and use some method to signal the child processes (something pod-like
or maybe a shared memory struct) when something needs to be run?



Re: Proposal: a cron interface for httpd

2008-03-26 Thread Graham Leggett

Dirk-Willem van Gulik wrote:

Or reduce the interface to a simple 'register callback 'at or before', 
or 'at or after' this time - and leave it up to the called entity to 
re-register itself.


The Eclipse Job interface works like this: you basically say run this X 
ms from now, and if you want to run it again, you schedule it again 
before you're done. The catch is that the timings aren't that accurate, 
but for most applications they don't need to be.


Regards,
Graham
--





Re: Proposal: a cron interface for httpd

2008-03-26 Thread Dirk-Willem van Gulik


On Mar 26, 2008, at 6:45 PM, Graham Leggett wrote:

Dirk-Willem van Gulik wrote:

Or reduce the interface to a simple 'register callback 'at or  
before', or 'at or after' this time - and leave it up to the called  
entity to re-register itself.


The Eclipse Job interface works like this, you basically say run  
this X ms from now, and if you want to run it again, you schedule  
it again before you're done. The catch is that the timings aren't  
that accurate, but then for most applications it doesn't need to be.


But that is just a matter of having clear semantics; i.e. 'run on, or  
as soon as possible after', or 'try to run before'. And we should be  
fine. The remaining issue is fragility; i.e. a task which aborts and  
'forgets' to re-register is in pain.


Dw


Re: Proposal: a cron interface for httpd

2008-03-26 Thread Chris Darroch

Graham Leggett wrote:

On a number of occasions recently I have run into the need to run some 
kind of garbage collection within httpd, either in a dedicated process, 
or a dedicated thread.


  I've also written a few modules where each child process runs
a private thread in the background.  I'd suggest there are perhaps
three variants here (in a Unix-like context, anyway): one separate
process, one thread in the master process, or one thread per child
process.  Presumably MPMs should somehow indicate which of these they
support.


ap_cron_per_second
ap_cron_per_minute
ap_cron_per_hour
ap_cron_per_day
ap_cron_per_week


  I wonder if you could flatten these down to a single
ap_monitor_interval() kind of thing, where the module specified
the interval it wanted?

  I suppose one could go the other direction toward a full-blown
scheduler, but that seems like a lot of extra effort for perhaps
little gain.

  It might be nice to also offer the option to randomly stagger the
creation of processes/threads within some additional time interval --
especially for one-per-child threads, that could help avoid a
thundering herd kind of situation where a bunch of child processes
start up together (e.g., at a restart), and later all kick off
resource-intensive threads at nearly the same time.  Of course a
module's background threads could wait some random interval on their own
after being started, but then they're eating up time sleeping while the
invoking process thinks they're doing work.


  My other thought is that it would be really nice to be able to
track the status of these processes and threads in mod_status; e.g.,
in the scoreboard or a scoreboard-like utility.  Moreover, it would
be excellent if the scheduler/MPM could use this info to avoid spawning
new processes/threads if the old ones were still executing ... for a
per-second kind of interval, that might be quite important, especially
if there's any chance the task could occasionally get stuck.

  I feel this relates a bit to my continued interest (mod lack of time)
in abstracting the scoreboard and shared-memory subsystems into a
shared map facility.  Joe Orton did a bunch of work on the SSL
session cache which I think moves it in this direction; see his
RFC and a couple of my responses:

http://marc.info/?l=apache-httpd-dev&m=120397759902722&w=2
http://marc.info/?l=apache-httpd-dev&m=120406346306713&w=2
http://marc.info/?l=apache-httpd-dev&m=120491055413781&w=2

  I also put a larger outline of some of my notions in this regard into
this document:

http://svn.apache.org/viewvc/httpd/sandbox/amsterdam/architecture/scoreboard.txt?view=markup

  In particular, in thinking about background processes and threads
both of the type you're proposing and those created by modules like
mod_cgid and mod_fcgid, I had begun tossing around the notion of modules
being able to register additional scoreboard states beyond those
hard-coded now into mod_status:

 - during pre/check/post-config phases, modules (including MPMs)
   may indicate if they need IPC space, what type, and how much:

   - private space in scoreboard table values
   - additional scoreboard states

 - at startup, the master process initializes the storage provider:

   - MPM sizes scoreboard based on runtime process and thread limits,
 not compile-time maximums
   - assigns IDs for additional scoreboard states requested by modules
   - creates scoreboard state-to-ID hash mappings in regular memory
 as part of read-only configuration data inherited by children

  We could then offer modules standard ways to ask for processes or
threads to be spawned at startup/restart time, or on some schedule
(as per your initial proposal), and for these processes/threads to
update their status record in the scoreboard.  Certain MPMs (e.g.,
worker, event) also spawn threads that aren't recorded in the scoreboard
at the moment; it would be great to see them in the scoreboard too.

  The administrator could then see everything at a glance in
mod_status, including all the background tasks, and the scheduler/MPM
would have a standard way to check if a background task was still
running from a previous invocation.

  I feel like there's a certain serendipity in this proposal coming
along around the same time as Joe Orton's work, which seems to me to
be heading toward not so much of a cache in the traditional httpd sense
(i.e., mod_cache and friends) as a generic shared map interface that
would be useful in a wide variety of ways, including implementing a
configurable scoreboard that could help track arbitrary background tasks.

  Thoughts, flames?  Fire away!  Thanks,

Chris.

--
GPG Key ID: 366A375B
GPG Key Fingerprint: 485E 5041 17E1 E2BB C263  E4DE C8E3 FA36 366A 375B



Re: Proposal: a cron interface for httpd

2008-03-26 Thread Akins, Brian
The way I do this is simple and primitive.  If I have an action I need to be
run, I do something like:

<Location /my-cron-job-thing>
SetHandler MyCronJobThing
Allow from 127.0.0.1
Deny from All
</Location>


And then in cron

* * * * * curl http://localhost/my-cron-job-thing > /logs/cronjob.log 2>&1



-- 
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies



Re: Proposal: a cron interface for httpd

2008-03-26 Thread Jim Jagielski


On Mar 26, 2008, at 2:52 PM, Akins, Brian wrote:
The way I do this is simple and primitive.  If I have an action I  
need to be run, I do something like:

<Location /my-cron-job-thing>
   SetHandler MyCronJobThing
   Allow from 127.0.0.1
   Deny from All
</Location>


And then in cron

* * * * * curl http://localhost/my-cron-job-thing > /logs/cronjob.log 2>&1


I've done something similar with mutexed internal subrequests
that trigger a specific handler...