Re: what is in modules vs what is in the core

2009-03-30 Thread M. Brian Akins


On Mar 30, 2009, at 7:37 PM, Paul Querna wrote:


mod_watchdog is the latest offender in a series of modules that expose
additional functions to the API. (mod_proxy and mod_cache do too!)

What happened to all functions that are not inside server/* must be
either dynamic optional functions or hooks?



Some modules (mostly 3rd party??) allow it either way - optional  
function or just linkage.  I'm personally a fan of hooks and  
providers.  (With providers, I usually just do the lookup once in,  
say, post-config, and cache the results in the subscribing module  
- this saves some hash lookups on potentially every single request.)
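
The lookup-once idea can be sketched as below. This is a minimal, self-contained illustration, not httpd code: the registry, `cache_provider_t`, `lookup_provider`, `my_post_config` and `my_handler` are all hypothetical stand-ins (in httpd the lookup itself would go through `ap_lookup_provider`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider vtable; a stand-in for what a real module exports. */
typedef struct {
    const char *name;
    int (*get)(const char *key);
} cache_provider_t;

static int lookup_count = 0;   /* counts registry lookups, for illustration */

static int memcache_get(const char *key) { return (int)strlen(key); }

static const cache_provider_t registry[] = {
    { "memcache", memcache_get },
};

/* Stand-in for a (comparatively expensive) provider registry lookup. */
static const cache_provider_t *lookup_provider(const char *name)
{
    size_t i;
    lookup_count++;
    for (i = 0; i < sizeof(registry) / sizeof(registry[0]); i++) {
        if (strcmp(registry[i].name, name) == 0) {
            return &registry[i];
        }
    }
    return NULL;
}

/* Resolved once at startup; request handlers only dereference it. */
static const cache_provider_t *cached_provider = NULL;

static int my_post_config(void)
{
    cached_provider = lookup_provider("memcache");
    return cached_provider != NULL;
}

static int my_handler(const char *key)
{
    assert(cached_provider != NULL);  /* post-config must have run first */
    return cached_provider->get(key);
}
```

However many requests call `my_handler`, the registry is consulted only once.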


As I hack on some lua stuff, it's useful to have the symbols for  
functions.  That may just be because I'm lazy, because I could do  
optional function lookups in library opens, I suppose.  OT, but I like  
my Lua glue in a lua module and just use require  
'apache2.memcache' (or whatever) to do the linking.  This works  
really well with per thread lua states that are all loaded at  
startup... (hint, hint)


--Brian


Re: mod_serf: now somewhat working

2009-03-28 Thread M. Brian Akins


On Mar 28, 2009, at 11:09 AM, Paul Querna wrote:



- Much simpler configuration than mod_proxy, using location blocks
(or locationMatch), rather than ProxyPass' hacking of URI stuff way
earlier.



It'd be nice to be able to configure and do stuff at run time using Lua.

Someone had to ask...

--Brian


Re: mod_dbd analogue for memcached

2009-03-06 Thread M. Brian Akins


On Mar 6, 2009, at 1:25 AM, Kevac Marko wrote:


And after that you have to restart apache? It's not dynamic enough
for us.




graceful restart.



Re: mod_dbd analogue for memcached

2009-03-05 Thread Brian Akins
On 3/5/09 9:57 AM, Kevac Marko ma...@kevac.org wrote:

 The question I want to ask is should I base my module on apr_memcache
 or not? Is apr_memcache mature enough? Is it used by anyone?

Yes. Yes. Yes.

Is this based on the existing mod_memcache?

--Brian




Re: mod_dbd analogue for memcached

2009-03-05 Thread M. Brian Akins


On Mar 5, 2009, at 4:35 PM, Kevac Marko wrote:


mod_memcache, if we are talking about
http://code.google.com/p/modmemcache/, is too simple.



Works fine for me :)


I need multiple name pools to multiple servers. Something like




Will you implement a default pool like mod_dbd does?


This feature enables us to create dynamic, not static servers list.
For high availability clusters for example.



Sounds interesting, I suppose. I just generate my configs from  
templates and they generate the correct server list based on whatever







Re: [PATCH] mod_dbd with more than one pool

2009-03-05 Thread M. Brian Akins


On Mar 5, 2009, at 4:52 PM, Kevac Marko wrote:


80 columns is a rather ancient requirement in the world of 22"
LCDs, but OK, I'll fix that too.


Not when you have 4 code branches side by side...

Also, netbooks are pretty popular as well.  Most of my coding is done  
on a 15" laptop screen.


--Brian


Re: patch for handling headers_in and headers_out as tables in mod_lua

2009-03-02 Thread Brian Akins
On 2/28/09 8:37 PM, Brian McCallister bri...@skife.org wrote:

 It could be just:
    apr_hash_set(dispatch, "document_root", APR_HASH_KEY_STRING,
                 makefun(req_content_encoding_field, APL_REQ_FUNTYPE_STRING,
                         p));
 


Also, couldn't we build the dispatch hash once, and only once, and then just
associate it with each apache2.request?  This seems more efficient than
building this array every time.  It would also be nice to use a dispatch
hash for setters as well.
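
A minimal sketch of the build-once idea, in plain C with no APR: `request_t`, `dispatch_init` and `request_index` are hypothetical names, and a small array stands in for the apr_hash_t. The table is filled a single time (say, at server start) and every request then only reads it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for request_rec. */
typedef struct {
    const char *uri;
    const char *content_type;
} request_t;

typedef const char *(*field_getter)(const request_t *r);

typedef struct {
    const char *name;
    field_getter get;
} dispatch_entry;

static const char *get_uri(const request_t *r) { return r->uri; }
static const char *get_content_type(const request_t *r) { return r->content_type; }

static dispatch_entry dispatch[2];
static int dispatch_builds = 0;   /* how many times the table was built */

/* Build the table one time; later calls are no-ops. */
static void dispatch_init(void)
{
    if (dispatch_builds > 0) {
        return;
    }
    dispatch[0].name = "uri";
    dispatch[0].get = get_uri;
    dispatch[1].name = "content_type";
    dispatch[1].get = get_content_type;
    dispatch_builds++;
}

/* Per-request field lookup: reads the shared table, never rebuilds it. */
static const char *request_index(const request_t *r, const char *field)
{
    size_t i;
    assert(dispatch_builds == 1);
    for (i = 0; i < sizeof(dispatch) / sizeof(dispatch[0]); i++) {
        if (strcmp(dispatch[i].name, field) == 0) {
            return dispatch[i].get(r);
        }
    }
    return NULL;
}
```

A second table of setters, keyed the same way, would cover the newindex side.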

--Brian




Re: mod_wombat and mod_lua

2009-03-02 Thread Brian Akins
On 2/27/09 2:56 PM, Brian McCallister bri...@skife.org wrote:

 Maybe virtually joining :-)

It may be interesting to have a virtual extension to the hackathon.  IRC is
good, but misses a lot of discussion I think.  Some actual voice
interaction would be nice.  I've seen you all before, so I don't really want
video :P  Thoughts?  I'd really like to get mod_lua standardized, fast,
and in 2.2, if possible (or get 2.4 out very soon).  Willing to help, just
let me know how.

--Brian




Re: mod_wombat and mod_lua

2009-02-25 Thread Brian Akins
On 2/25/09 11:23 AM, Brian McCallister bri...@skife.org wrote:
It can use Brian's thread scope approach (though I
 actually want to change it to add the pool still, and backport all the
 rest of the trunk changes),

I was thinking about this, and I think having direct access to the thread
from request_rec is not needed.  If we manage the array of pools from within
mod_lua, then you can just handle it there.  This array only needs to get
created if we have thread/server scoped vm's.  My horrible patch I
submitted a while back did it that way.

Also, does the idea of a vm_spec still make sense?

I'd really like a buildable version for 2.2.

FWIW, we hacked up several modules to add Lua support to them and I'd like
to port these to the official mod_lua if possible, as we are using our
hacked (well, now completely rewritten) version of mod_wombat.  We have
memcache, geoip, and some of our internal modules all with lua support now.
Usually the lua glue code is less than 100 lines of C.

I wish we could have a hackathon.  We could knock out a bunch of these things
in one sitting, I think.

--
Brian Akins




Re: use of APR_SENDFILE_ENABLED in mod_disk_cache

2009-02-17 Thread Brian Akins
On 2/16/09 5:06 AM, Niklas Edmundsson ni...@acc.umu.se wrote:

 +core_dir_config *coreconf = ap_get_module_config(r->per_dir_config,
 + &core_module);


This is a perfect example of why we need a call to hide core_module stuff
from modules.  We talked about this before and we are still propagating
this, IMO, bad habit.

--bakins




Re: [PATCH] mod_dbd with more than one pool

2009-02-11 Thread Brian Akins
On 2/11/09 4:29 PM, Kevac Marko ma...@kevac.org wrote:


 What do you think?
 
 Patch is ready, but it needs some testing before posting.

+1

I was looking to do the same thing to mod_memcache (which should be imported
into trunk, IMO...)





Re: [PATCH] mod_dbd with more than one pool

2009-02-11 Thread M. Brian Akins


On Feb 11, 2009, at 6:22 PM, Graham Leggett wrote:


Would it make sense for mod_memcache to become a provider beneath  
mod_socache, or am I missing something?


mod_memcache really just provides the config glue for apr_memcache  
so that every module that wants to use apr_memcache doesn't have to  
write all the config glue themselves.  Yeah, it could probably add  
some socache hooks as well. I'm adding some Lua glue right now.   
mod_memcache is already Apache licensed.




Re: [mod_lua] vm management

2009-02-06 Thread Brian Akins
On 2/6/09 8:09 AM, Bertrand Mansion bmans...@mamasam.net wrote:

 What do you call slow ? Do you have benchmarks ?

Some round numbers:
No Lua at all: 40k/sec
Our hack version (based on older mod_wombat): 34k/sec
New mod_lua: 20k


Most of this it seems is from the lua states being created for every single
request pool.  This is some lua that runs on every single request.


  Do you have an
 example of applications that could benefit from this ?

Sure.  We have some lua that runs on (almost) every single request.  We do a
fair bit of requests a second.  The build up and tear down of states - while
cheaper than most scripting languages - is just unnecessary.  We are careful
about variable scoping, etc.

Isn't this a
 way to shoot yourself in the foot ?

We already have several custom C modules, so we are more than comfortable
with load once, run many times.

Slightly OT: I had to stop using my work email for the list bcs the mailing
list mail servers don't like our corporate mail server.  So, it looks like
I'm just another hack.  A few folks can vouch for what I do with apache in
my day job...




Re: [mod_lua] vm management

2009-02-06 Thread Brian Akins
On 2/6/09 12:17 AM, Brian McCallister bri...@skife.org wrote:

 * One entry point to obtain VMs
   This is the apl_get_lua_state(..) function. It is passed the
   information required to either find, or create the needed
   lua_State. This is:
 
   - the lifecycle pool to which it is bound
   - the file name to define stuff in the lua_State
   - package load paths for the lua_State
   - package load paths for lua cmodules
   - a callback and baton to be invoked if the lua_State is created


Do we really need paths and cpaths to be configurable in any way besides
globally?  In individual scripts, you can add more paths.

Also, we already have the lua_open hook as well as lua_request hook, so the
callback doesn't really need to be there either.

So, I'd propose that we change it to this:

 apl_get_lua_state(apr_pool_t *pool, const char *file, const char *data)

Pool is the lifecycle pool to which it is bound. File is, usually, the file
name to define stuff in the lua_State.  Data is a raw string to use (rather
than a file) and if present, then file is used as an identifier.  The whole
spec idea, then is only for internal mod_lua use, not for general
consumption.
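
A rough sketch of the find-or-create behaviour that signature implies, assuming states are keyed by the file argument. Everything here is hypothetical: `vm_t` stands in for a lua_State, `get_state` for apl_get_lua_state, and the pool argument is left out, so lifecycle binding is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cached lua_State. */
typedef struct {
    const char *id;      /* the file argument, used as the identifier */
    const char *source;  /* raw chunk if data was given, else the file name */
    int from_file;       /* 1 when the chunk is loaded from the file */
} vm_t;

#define MAX_VMS 8
static vm_t vms[MAX_VMS];
static size_t vm_count = 0;
static int vm_creates = 0;   /* how many states were actually created */

/* Find the state registered under file, or create it.
 * When data is non-NULL it is the chunk and file is only an identifier. */
static vm_t *get_state(const char *file, const char *data)
{
    size_t i;
    assert(file != NULL);
    for (i = 0; i < vm_count; i++) {
        if (strcmp(vms[i].id, file) == 0) {
            return &vms[i];
        }
    }
    if (vm_count == MAX_VMS) {
        return NULL;
    }
    vms[vm_count].id = file;
    vms[vm_count].source = data ? data : file;
    vms[vm_count].from_file = (data == NULL);
    vm_creates++;
    return &vms[vm_count++];
}
```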

Thoughts?





Re: [mod_lua] vm management

2009-02-06 Thread Brian Akins
On 2/6/09 9:40 AM, Bertrand Mansion bmans...@mamasam.net wrote:

  I remember having met a lot
 of problems with the old mod_wombat due to persistent states, even
 with cache set to never. In the end, I had to restart the server
 each time I modified a lua source file. It seems that you suggest to
 reintroduce features that would make this happen again.


Actually, that is exactly what I want.  I don't want it to be by default,
but I want it to be available to those who need it.
The overhead to creating a lua state is very low.  Most of the time is spent
parsing the file (compiling it if you will).  If you need to handle
several thousand requests per second, however, even this small amount of
overhead is too much.  I'd like to have the option - without maintaining a
hacked version - to have persistent per-thread lua states.

--bakins





Re: [mod_lua] vm management

2009-02-06 Thread M. Brian Akins


On Feb 6, 2009, at 12:17 AM, Brian McCallister wrote:



* VMs are attached to apr pools.


+1



* Concurrency is up to the client


What are you defining as the client?


 We should not expose configuration options which will create
 concurrency issues, such as attaching a VM to the server_rec
 pool. It is very possible for someone to programmatically use the
 module to do things like that, but if they do any locking or
 resource pooling is up to them.


Sure, bcs attaching to server_rec pool is silly.  However, we do, IMO,
need some option besides r->pool in stock mod_lua.
This is just too slow for a lot of things.




* One entry point to obtain VMs


+1.  I personally do not like exposing tons of struct info, but I'm -0  
on that.





We want to be able to get back to stock mod_lua, but it's just too  
darned slow right now :(  Having some type of pre-allocation and
long-lived LuaStates helps this a lot.  All the C stuff is done that way
(in memory once, run many times) and I'd like to be able to do the Lua
stuff the same way without having to bolt on yet another
bakins-specific module.


OT: toying with lua server pages :)



Mod_lua per thread states/scopes

2009-02-05 Thread Brian Akins
I'm hacking on getting the former work I did for mod_wombat to have
per-thread lua states/scopes into mod_lua.  I like the new pool approach.
My main reason for this is that we have some code that runs on almost every
request.  We are careful about variable scope, etc., and the overhead of
creating a new state every time would just kill performance.

Hopefully have some code in next couple of days...




Re: Mod_lua per thread states/scopes

2009-02-05 Thread Brian Akins
Here's a patch just to give you an idea of what I'm thinking.  It compiled.
This is more to get some ideas going to see how/if this fits into mod_lua.

The biggest issue is that the package_paths are per_dir and in post_config,
you can't really get at those in a useful way.  I precompile all of the
server scoped handlers in post_config so that there is no worry about that
once we start taking requests.

This just makes an array of pools sized at thread_limit, so that basically
each thread has its own pool.  I use the connection-id to determine what
thread a request is running in, bcs there is no real way to get at that.  I
used worker here, so this may not work anywhere else as it's tied to how
worker calculates the connection-id.  Since the pool is per thread, we
don't have to worry about any locking (in worker at least).
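
The indexing trick can be sketched like this. It assumes the connection id is laid out as child_num * thread_limit + thread_num, which is my reading of how worker builds it; treat that layout as an assumption. `lua_slot_t` is a hypothetical stand-in for the per-thread pool plus its lua_State.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-thread pool and its lua_State. */
typedef struct {
    int uses;
} lua_slot_t;

typedef struct {
    lua_slot_t *slots;
    long thread_limit;
} slot_table_t;

/* One slot per possible thread in this child process. */
static int slot_table_init(slot_table_t *t, long thread_limit)
{
    assert(thread_limit > 0);
    t->slots = calloc((size_t)thread_limit, sizeof(lua_slot_t));
    t->thread_limit = thread_limit;
    return t->slots != NULL;
}

/* ASSUMPTION: conn_id == child_num * thread_limit + thread_num,
 * so the remainder recovers the thread number. */
static long thread_index(const slot_table_t *t, long conn_id)
{
    assert(conn_id >= 0);
    return conn_id % t->thread_limit;
}

/* No locking: each slot is only ever touched by its own thread. */
static lua_slot_t *slot_for_connection(slot_table_t *t, long conn_id)
{
    lua_slot_t *s = &t->slots[thread_index(t, conn_id)];
    s->uses++;
    return s;
}
```

With another MPM the id layout may differ, in which case the modulo no longer identifies a thread.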

--bakins



mod_lua-server-scope.diff
Description: Binary data


Re: patch for handling headers_in and headers_out as tables in mod_lua

2009-02-05 Thread Brian Akins
Is this in trunk? I don't see it, but I've been known to overlook stuff.

On 1/26/09 1:45 PM, Brian McCallister bri...@skife.org wrote:

 For anyone following, it has been applied :-)
 

 
 -- NEW
 function handle(r)
 local host = r.headers_in['host']
 r:puts(host)
 
 -- and can now modify them
 r.headers_in['X-XX-Fake'] = 'rabbits!'
 r.headers_out['wombat'] = 'lua now!'
 end

Wasn't this the way it used to be?

--bakins




Re: patch for handling headers_in and headers_out as tables in mod_lua

2009-02-05 Thread Brian Akins
On 2/5/09 1:51 PM, Brian McCallister bri...@skife.org wrote:

 Yep, Paul changed the internal impl to be less gross, but in doing so
 changed the API, i changed the impl to be not gross and restored old
 API.

Okay I see it now.

I may take a crack at a little performance tuning.  Setting up the dispatch
table every single time doesn't seem necessary.  We should be able to just
do that once at httpd start time.


So, if I'm reading that correctly, this should work now as well??

 r.content_type 

And we would just hack up req_newindex to be able to do
 r.content_type = 'application/bakins'

I like the dispatch idea.  I'll think about the same for newindex as well...




Re: 3.0 - Introduction

2007-02-14 Thread Brian Akins

Make everything possible into a hook or use the provider model.

Simple example: the way we determine if a connection can be kept alive is a 
monolithic function.  This should be a hook.


Disk I/O (read/write/seek, etc.) could be abstracted by providers, for example. 
 Maybe we need full blown VFS??




--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: 3.0 - Proposed Goals

2007-02-14 Thread Brian Akins

Jim Jagielski wrote:



This makes a lot of sense, but please NOT AJP... It
seems to be that staying with HTTP is the most scalable,
easiest to debug and troubleshoot, and the most straightforward.



Would be nice if we could do HTTP over unix domain sockets, for example.  No 
need for full TCP stack just to pass things back and forth between Apache and 
back-end processes.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


mod_memcache??

2007-02-02 Thread Brian Akins
I have a need to write a generic way to integrate apr_memcache into httpd.
Basically, I have several other modules that use memcached as a backend and want
to combine the boring stuff into a central place, i.e. configuration, stats,
etc.  We talked a little on list about this a few months ago, but no one ever did
anything.  Is anyone else interested in this?  Has anyone done this?


Basically I was thinking there would be a single function:

apr_status_t ap_memcache_client(apr_memcache_t **mc)

which would simply give the user a client to use with normal apr_memcache
functions.  The module could create the underlying mc at post_config.


Basically, mod_memcache could have this config:

MemCacheServer memcache1.turner.com:9020 min=8 smax=16 max=64 ttl=5
MemCacheServer memcache4.turner.com:9020 min=8 smax=16 max=64 ttl=5
MemCacheServer memcache10.turner.com:9020 min=8 smax=16 max=64 ttl=5

or whatever.  This would end the config duplication between various modules. 
This module could also add memcache stats to /server-status


Comments?

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache+mod_rewrite behaviour

2007-01-23 Thread Brian Akins

Niklas Edmundsson wrote:
Since mod_cache runs as a quick handler, matching based on URL would 
probably be the easiest since you don't have the mime type info then.


Maybe something like

CacheEnable disk /special/path ignore_query

Could add other options in the future as well.
--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: httpd-proxy-scoreboard how to go on

2006-12-18 Thread Brian Akins

Jim Jagielski wrote:

I thought the whole idea was to abstract out the
scoreboard so that it was easier for people to add
and remove tables from the scoreboard... the so-called
generic scoreboard.


I donated the mod_slotmem code a few weeks ago that was a rather simple way to 
do just that.  Needs some work, but is at least a start.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Wrong etag sent with mod_deflate

2006-12-13 Thread Brian Akins

Henrik Nordstrom wrote:

But the unique identity of the response entity is defined by request-URI
+ ETag and/or Content-Location. The cache is not supposed to evaluate
Accept-* headers in determining the entity identity, only the origin
server.


However, on an initial request (ie, non-conditional) we do not have an etag from 
the client, we only have info like Host, URI, Accept-*, etc.  So, how would the 
cache identify which entity to serve in this case?



Please see RFC2616 13.6 Caching Negotiated Responses, it explains how
the RFC intends that caches should operate wrt Vary, ETag and
Content-Location in full detail.


I have read it many times. In our case - cnn.com, etc. - we have decided to
be RFC compliant from the client to the cache server.  From the cache to the 
origin, however, we are not as concerned.  In a reverse-proxy-cache, this is not 
a big deal. However, in a normal forward-proxy-cache, where one does not 
control both cache and origin, one must be more careful.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Wrong etag sent with mod_deflate

2006-12-12 Thread Brian Akins

Henrik Nordstrom wrote:

mån 2006-12-11 klockan 14:25 -0500 skrev Brian Akins:

So, multiple variants of the same object can have the same Etag, but still be 
different cached objects.


Your implementation ignores RFC 2616 13.6 Caching Negotiated Responses,
but is otherwise fine. It's functionally compliant but not as effective
as it could be.


That was a simplified explanation, we actually do not store a cache entry for 
every single variant.  In our case the only thing we actually ever care about is 
whether or not you support gzip.  So all the variants for Vary: User-Agent, 
Accept-Encoding actually boil down to 2 variants - gzip or no-gzip.


One of the major reasons we quit using squid was its support for Vary. (This
was pre-3.0, so things may have changed). Of course, at the time httpd wasn't
any better - but it was a lot easier to hack ;)



Variants are
identified by ETag or Content-Location. Only if there is neither ETag or
Content-Location in the response entity then is the response entity
identified by the Vary request headers.

Only conditional requests from clients, generally, have If-None-Match headers. 
So the only way for a cache, on an initial request from a client, to determine 
what object to serve is to use the Client supplied information - which doesn't 
include an Etag, so you have to, usually, rely on URI first, and then the Vary 
information.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Wrong etag sent with mod_deflate

2006-12-11 Thread Brian Akins
This is not a response to any post on this subject, but more of a comment.  Here 
is a real world example of how we use deflate and etags with our cache. (Note 
this is very similar to mod_cache, but I do not know the inner workings of it as 
well).


1. Generate key from URI and ap_get_servername
2. open cached object.  Is it Vary? no, goto step 5.
3. Is Vary. Generate new key.
4. Open cached object.
5. Check expiry time, exit if expired.
6. Load headers.
7. Call ap_meets_conditions (etags, IMS, etc.)  If yes, return 304 (or 
whatever).
8. If not meets_conditions, serve from cache.

So, multiple variants of the same object can have the same Etag, but still be 
different cached objects.
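
The two-level key from steps 1 and 3 can be sketched as below, assuming (as mentioned elsewhere in this thread) that all variants boil down to gzip or no-gzip. The key format and function names are made up for illustration, and the Accept-Encoding check is deliberately naive: it ignores q-values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Step 1: primary key from the server name and the URI. */
static void primary_key(char *out, size_t len,
                        const char *host, const char *uri)
{
    assert(out != NULL && len > 0);
    snprintf(out, len, "%s%s", host, uri);
}

/* Step 3: variant key for a Vary object.  Every Accept-Encoding value
 * collapses to one of two variants.  Naive: "gzip;q=0" would still
 * count as gzip here. */
static void variant_key(char *out, size_t len, const char *primary,
                        const char *accept_encoding)
{
    int gzip = accept_encoding != NULL
               && strstr(accept_encoding, "gzip") != NULL;
    assert(out != NULL && len > 0);
    snprintf(out, len, "%s#%s", primary, gzip ? "gzip" : "identity");
}
```

Both variants may carry the same Etag from the origin; they are told apart by the key, not by the Etag.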


This probably has no bearing on the current conversation, but perhaps I am not 
fully appreciating the core of the debate??


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


[PATCH] MaxKeepAliveConnections in http module

2006-11-21 Thread Brian Akins
Rather ugly, but in the http module rather than mpm.  Still needs some work but 
works in most cases here.  I think there are instances where an aborted 
connection will not decrement the count, maybe.  Had to add configuration stuff 
to http_core, since it, wrongly in my opinion, uses the core server config 
structures.
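
The counting idea, reduced to a self-contained sketch: it uses C11 atomics in place of apr_atomic, and the function names and the limit of 2 are hypothetical. It is not the patch itself, only the bookkeeping behind it.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_uint current_ka;   /* connections currently idle in keep-alive */
static unsigned max_ka = 2;      /* hypothetical limit; 0 means unlimited */

/* May one more connection be kept alive? */
static int can_keepalive(void)
{
    if (max_ka == 0) {
        return 1;
    }
    return atomic_load(&current_ka) < max_ka;
}

/* A request finished and its connection is being kept alive. */
static void enter_keepalive(void)
{
    atomic_fetch_add(&current_ka, 1);
}

/* A kept-alive connection became active again (or went away). */
static void leave_keepalive(void)
{
    unsigned prev = atomic_fetch_sub(&current_ka, 1);
    assert(prev > 0);   /* a decrement without a matching increment is a bug */
    (void)prev;
}
```

The aborted-connection case mentioned above corresponds to a missing `leave_keepalive` call, which would leave the count too high.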



Patch is against 2.2.3, since that's what we run mostly around here.

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies
diff -ur httpd-2.2.3/include/http_core.h httpd-2.2.3-keepalive/include/http_core.h
--- httpd-2.2.3/include/http_core.h 2006-07-11 23:38:44.0 -0400
+++ httpd-2.2.3-keepalive/include/http_core.h   2006-11-15 15:26:14.0 -0500
@@ -682,6 +682,8 @@
 
 /* -- */
 
+AP_DECLARE(int) ap_can_keepalive(request_rec *r);
+
 #ifdef __cplusplus
 }
 #endif
diff -ur httpd-2.2.3/modules/http/http_core.c httpd-2.2.3-keepalive/modules/http/http_core.c
--- httpd-2.2.3/modules/http/http_core.c    2006-07-24 09:34:19.0 -0400
+++ httpd-2.2.3-keepalive/modules/http/http_core.c  2006-11-15 17:20:25.0 -0500
@@ -35,6 +35,8 @@
 
 #include "mod_core.h"
 
+module AP_MODULE_DECLARE_DATA http_module;
+
 /* Handles for core filters */
 AP_DECLARE_DATA ap_filter_rec_t *ap_http_input_filter_handle;
 AP_DECLARE_DATA ap_filter_rec_t *ap_http_header_filter_handle;
@@ -42,6 +44,28 @@
 AP_DECLARE_DATA ap_filter_rec_t *ap_http_outerror_filter_handle;
 AP_DECLARE_DATA ap_filter_rec_t *ap_byterange_filter_handle;
 
+
+typedef struct {
+    int maxkaconn;
+} http_core_conf_t;
+
+static apr_uint32_t current_ka = 0;
+
+static void *create_http_conf(apr_pool_t * p, server_rec *s)
+{
+    http_core_conf_t *conf = apr_pcalloc(p, sizeof(http_core_conf_t));
+    return conf;
+}
+
+static const char *set_maxkaconn(cmd_parms *cmd, void *dummy,
+                                 const char *arg)
+{
+    http_core_conf_t *conf
+        = ap_get_module_config(cmd->server->module_config, &http_module);
+    conf->maxkaconn = atoi(arg);
+    return NULL;
+}
+
 static const char *set_keep_alive_timeout(cmd_parms *cmd, void *dummy,
   const char *arg)
 {
@@ -94,6 +118,9 @@
                   "or 0 for infinite"),
     AP_INIT_TAKE1("KeepAlive", set_keep_alive, NULL, RSRC_CONF,
                   "Whether persistent connections should be On or Off"),
+    AP_INIT_TAKE1("MaxKeepAliveConnections", set_maxkaconn, NULL, GLOBAL_ONLY,
+                  "Maximum number of Keep-Alive connections per child, "
+                  "or 0 for infinite"),
 { NULL }
 };
 
@@ -206,9 +233,43 @@
 return OK;
 }
 
+AP_DECLARE(int) ap_can_keepalive(request_rec *r) {
+    http_core_conf_t *conf
+        = ap_get_module_config(r->server->module_config, &http_module);
+    apr_uint32_t current;
+
+    if(conf->maxkaconn) {
+        current = apr_atomic_read32(&current_ka);
+
+        if(current >= conf->maxkaconn) {
+            return 0;
+        }
+    }
+
+    return 1;
+}
+
+static apr_status_t keepalive_cleanup(void *data)
+{
+    conn_rec *c = (conn_rec *)data;
+
+    /*need to check for abort??  Also, what happens when client closes and we are in keepalive?
+     * should we register a cleanup on connection pool?
+     */
+    if(c->keepalive == AP_CONN_KEEPALIVE) {
+        apr_atomic_inc32(&current_ka);
+    }
+
+    return APR_SUCCESS;
+}
+
 static int http_create_request(request_rec *r)
 {
     if (!r->main && !r->prev) {
+        conn_rec *c = r->connection;
+        http_core_conf_t *conf
+            = ap_get_module_config(r->server->module_config, &http_module);
+
         ap_add_output_filter_handle(ap_byterange_filter_handle,
                                     NULL, r, r->connection);
         ap_add_output_filter_handle(ap_content_length_filter_handle,
@@ -217,6 +278,15 @@
                                     NULL, r, r->connection);
         ap_add_output_filter_handle(ap_http_outerror_filter_handle,
                                     NULL, r, r->connection);
+
+        if(conf->maxkaconn) {
+            /*this connection had been kept alive, but it's now active again*/
+            if(c->keepalive == AP_CONN_KEEPALIVE) {
+                apr_atomic_dec32(&current_ka);
+            }
+            apr_pool_cleanup_register(r->pool, c, keepalive_cleanup,
+                                      keepalive_cleanup);
+        }
     }
 
     return OK;
@@ -265,7 +335,7 @@
 STANDARD20_MODULE_STUFF,
 NULL,  /* create per-directory config structure */
 NULL,  /* merge per-directory config structures */
-NULL,  /* create per-server config structure */
+create_http_conf,  /* create per-server config structure */
 NULL,  /* merge per-server config structures */
 http_cmds, /* command apr_table_t */
 register_hooks /* register hooks */
diff -ur httpd-2.2.3/modules/http/http_protocol.c

non-blocking file buckets, core output, and 2.2.3

2006-10-31 Thread Brian Akins

In reference to some mod_cache discussions:


It seems, after some testing, that in 2.2.3, the core output filters will
block when given file buckets, therefore, stalling the entire brigade (ie, 
slowing reads from proxy, cgi, etc.).  This was a somewhat artificial test I 
did, but can someone confirm if something changed in trunk that allows file 
buckets to be handled differently...



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: non-blocking file buckets, core output, and 2.2.3

2006-10-31 Thread Brian Akins

Brian Akins wrote:
  This was a 
somewhat artificial test I did, but can someone confirm if something 
changed in trunk that allows file buckets to be handled differently...


Actually got off my rear and looked at it myself :) From the looks of it,
core_output_filter in 2.2.3 does not have any special handling of file buckets 
and is in many ways different from trunk.  FWIW.






--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache summary and plan

2006-10-30 Thread Brian Akins

Davi Arnaut wrote:


The solution consists of using the cache file as a output buffer by
splitting the buckets into smaller chunks and writing them to disk. Once
written (apr_file_write_full) a new file bucket is created with offset
and size of the just written buffer. The old bucket is deleted.


Without having looked very much at the code, this approach sounds feasible.

I'm still confused as to why we need the temporary brigade???  Why not swap 
the buckets?



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_disk_cache summarization

2006-10-24 Thread Brian Akins

Niklas Edmundsson wrote:

The comparison of your and Brian's experience are two ends of extremes on
high volume caches, one low hits large files, the second high hits small
files. This should make for some useful tuning information.


The extreme difference is what makes me think that we should acknowledge 
that they exist and provide the relevant knobs where necessary. As it 
looks right now, those knobs tend to be more OS/filesystem specific, but 
that might change as this evolves.



My thought on this is that we use providers, so in theory, you could use a 
different provider for the different types:


CacheEnable /largecrap large_disk_with_stat_sleep_thing
CacheEnable /normalstuff normal_disk

Perhaps the only difference between these two is the CACHE_IN mechanism (ie, 
serving from cache is the same).


There is no reason, IMNSHO, to try to shove all the functionality under the sun 
into one cache provider.  The current mod_disk_cache (before the mass patches) 
works for a large percentage of cases.  I see no reason to mutilate it.  Just do 
a new provider...


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_disk_cache summarization

2006-10-24 Thread Brian Akins

Plüm wrote:


Agreed. If it turns out that the common code base between both
cases is only small and it is complex to do both things in one
provider just make two providers out of them. The remaining common
code could be factored out in a separate disk_cache_util c file
which is used by both providers.



It may be possible to even decide which provider to use based on various 
factors: object size (content-length), time of day, phase of moon, cpu usage, 
disk io, cache size etc.  Could be as simple as shoving the provider in an env 
like cache-provider = brians_wacky_memcache and mod_cache tries that one first.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: [PATCH] add MaxKeepAliveConns to event mpm

2006-10-24 Thread Brian Akins

Brian Akins wrote:
Allows you to limit number of keepalive connections per child.  Patched 
against 2.2.3 as trunk just seems broken right now.  Did some light 
testing, seems to work.




After some looking, this really belongs in the http module.  However, it is so 
intertwined with core server stuff, that it may be a little tricky to do...


Maybe I'll just submit a mod_max_keepalive...




--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


mod_disk_cache summarization

2006-10-23 Thread Brian Akins
Can someone please summarize the various patches for mod_disk_cache that have 
been floating around in last couple weeks?  I have looked at the patches but 
wasn't real sure of the general philosophy/methodology to them.


Others may find it useful as well

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_offline

2006-10-23 Thread Brian Akins

Graham Leggett wrote:
- Cache-Control and Pragma headers must be stripped from requests into 
the cache.


Stripped, or ignored.  Think CacheIgnoreCacheControl.  We can control this by
configuration on a per-virtual-host basis.  We also can ignore query strings in the same way.


- mod_cache must, if the backend response is 5xx, deliver the latest 
cached data the server has present, regardless of whether the cached 
entry is stale or not.


Yep.



Is there anything else it would need to do?


There is already a hook in mod_proxy that gets called when all origins are down 
(cache_request_status or something).  Just need an optional function in 
mod_cache, that says serve and don't check expires.  Also, don't delete expired 
data inside mod_cache.  htcache daemon should probably not delete expired stuff 
right away.  Expired objects should get a grace period of some configurable 
length.
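
The grace-period rule could look something like this. It is only a sketch: the enum, the function names and the idea of passing an origin_down flag are assumptions, not existing mod_cache API.

```c
#include <assert.h>
#include <time.h>

typedef enum { SERVE_FRESH, SERVE_STALE, REFETCH } cache_decision;

/* Decide what to do with a cached object.
 * expires: expiry time of the object
 * grace:   extra seconds an expired object stays servable
 * origin_down: non-zero when all origins are unavailable */
static cache_decision decide(time_t now, time_t expires, time_t grace,
                             int origin_down)
{
    assert(grace >= 0);
    if (now < expires) {
        return SERVE_FRESH;
    }
    if (origin_down && now < expires + grace) {
        return SERVE_STALE;   /* serve and don't check expires */
    }
    return REFETCH;
}

/* The cleaning daemon only removes objects whose grace period is over. */
static int can_purge(time_t now, time_t expires, time_t grace)
{
    return now >= expires + grace;
}
```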



Make sense?

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_offline

2006-10-23 Thread Brian Akins

Issac Goldstand wrote:


- Cache-Control and Pragma headers must be stripped from requests into
the cache.


Why?  Just because you are in offline mode doesn't mean other proxies
between you and the client (or the client itself) are.



Graham is talking about headers coming from client to the cache server, if I
read this correctly.




--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: httpd 2.2 cache - disable and enable

2006-10-12 Thread Brian Akins

 wrote:


Wouldn't it be easier to do a match on mime type, like ExpiresByType?


We do not know the mime type in quick_handler.  In theory, you could
disable/enable the CACHE_SAVE filter by mime type, but that seems a little
messy because we would check in quick_handler to see if it's cached.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: httpd 2.2 cache - disable and enable

2006-10-12 Thread Brian Akins

Matthieu Estrade wrote:


IMHO, a regexp based cache enable or disable could be very useful for a
default caching policy shipped with httpd.
We could do per default caching only on all images, css and all static
content.


Some random thoughts:

Personally, I think the cache rules matching methods should be provider based. 
 ie, something like:


CacheEnable disk /images prefix #prefix could be default
CacheEnable disk *.gif$ regex
CacheEnable mem cache_me match #use strmatch stuff, faster than regex
CacheEnable disk - all #matches every thing.  A little faster than prefix /
CacheDisable ftp protocol #enable/disable based on protocol
CacheDisable 1.2.3.4 host #don't cache things for this client


Of course, we already do almost all that stuff with everything else, but cache 
is a quick handler, so as Colm noted, Location, LocationMatch, etc. don't work. 
Unfortunately, if you move it to a normal handler, you lose a bit of performance. 
We cache a lot of stuff to avoid the mangled mess of rewrite rules that would 
run on every single request if mod_cache was not a quick handler.
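
A rough sketch of what provider-based matching could look like, in plain C.  Everything here is hypothetical: the cache_match_fn signature, the provider table and lookup_matcher() are invented for illustration, and strstr() only stands in for the strmatch routines.  A real version would register matchers through the httpd provider API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical matcher signature: does `url` satisfy `pattern`? */
typedef bool (*cache_match_fn)(const char *pattern, const char *url);

/* "prefix": url starts with pattern */
static bool match_prefix(const char *pattern, const char *url)
{
    return strncmp(url, pattern, strlen(pattern)) == 0;
}

/* "match": pattern occurs anywhere in url (strstr as a stand-in) */
static bool match_substring(const char *pattern, const char *url)
{
    return strstr(url, pattern) != NULL;
}

/* "all": matches every request, no comparison needed */
static bool match_all(const char *pattern, const char *url)
{
    (void)pattern;
    (void)url;
    return true;
}

struct cache_match_provider {
    const char *name;
    cache_match_fn fn;
};

static const struct cache_match_provider providers[] = {
    { "prefix", match_prefix },
    { "match",  match_substring },
    { "all",    match_all },
};

/* Look up a matcher by the name given as the last CacheEnable argument.
 * Returns NULL for an unknown name. */
static cache_match_fn lookup_matcher(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(providers) / sizeof(providers[0]); i++) {
        if (strcmp(providers[i].name, name) == 0) {
            return providers[i].fn;
        }
    }
    return NULL;
}
```

The lookup would happen once at config time, so the quick handler only pays for one function-pointer call per rule.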


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Coding style

2006-10-02 Thread Brian Akins

Garrett Rooney wrote:

Or the even more readable:

rv = do_something(args);
if (rv == APR_SUCCESS) {

}


yuck! Think of all the harmless newlines you are senselessly wasting. 
Our children will have to code with no newlines if we do not conserve 
them today.  Won't someone please think of the children.


Seriously, not a fan of that style.  Doesn't matter that much to me, 
however.


-0

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


[Fwd: Re: [PATCH] setenvif filter]

2006-09-29 Thread Brian Akins

Bringing this up again.

Adds a filter that allows mod_setenvif to act on response headers.



 Original Message 
Subject: Re: [PATCH] setenvif filter
Date: Wed, 31 May 2006 17:24:33 +0200
From: Francois Pesce [EMAIL PROTECTED]
Reply-To: dev@httpd.apache.org
To: dev@httpd.apache.org
References: [EMAIL PROTECTED] 
[EMAIL PROTECTED]	 [EMAIL PROTECTED]


These patches may fix the r->content_type behaviour. Are you OK with it?

--
*Francois Pesce*

2006/5/31, Brian Akins [EMAIL PROTECTED]:

Francois PESCE wrote:
 I've discussed about a patch for mod_setenvif 2 years ago, and have
 coded it at that time, it is successfully used on various host in
 production since.


You need to handle content type specially by checking r->content_type.
For some reason, just doing apr_table_get(r->headers_out,
"Content-type") would be null, but content_type would be set.

See the patch I posted a few days ago.


+1 in concept

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies




--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


mod_setenvif-2-2-x-2.patch
Description: Binary data


mod_setenvif-2-0-x-2.patch
Description: Binary data


Re: Regexp-based rewriting for mod_headers?

2006-09-29 Thread Brian Akins

+1 on the patch

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: [patch 09/16] simplify array and table serialization

2006-09-20 Thread Brian Akins

Davi Arnaut wrote:

Simplify the array and table serialization code, separating it from
the underlying I/O operations.



Probably faster to just put every thing in an iovec (think writev).

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-20 Thread Brian Akins

Niklas Edmundsson wrote:
don't care about performance...


Actually, cache on xfs mounted with atime doesn't seem to be a 
performance killer oddly enough... Our frontends had no problems 
surviving 1k requests/s during the latest mozilla-update-barrage.


1k requests/second is not really that much...  10k requests/second is 
more what I'm used to.  XFS sucks for us as a cache storage.  It tends 
to crock under some traffic patterns (reads vs writes).  ext3 is 
actually more reliable for us.  Reiserfs is interesting, but tends to go 
haywire from time to time.


We clean our cache often because we have a really quick way to find the 
size and remove the oldest expired objects first.  Every cache store 
gets recorded in SQLite with info about the object (size, mtime, expire 
time, url, key, etc.).  Makes it trivial to write cron jobs to do cache 
management.
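
To make the "oldest expired objects first" part concrete, here is a small sketch in plain C.  The cache_row struct is only an in-memory stand-in for a row in the index, and pick_purge_victims() is a made-up name; with a real SQLite index the same selection would be an ORDER BY on the expire column.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* In-memory stand-in for one index row (hypothetical layout). */
typedef struct {
    const char *key;
    long size;       /* bytes on disk */
    time_t expire;   /* expiry time */
} cache_row;

/* qsort comparator: earliest expiry first */
static int by_expire(const void *a, const void *b)
{
    const cache_row *x = a;
    const cache_row *y = b;
    return (x->expire > y->expire) - (x->expire < y->expire);
}

/* Sort rows by expiry, then count how many already-expired rows, oldest
 * first, must be removed to bring the cache down to max_bytes.  After the
 * call the rows to purge are rows[0] .. rows[n-1].  Rows that have not
 * expired are never selected, even if the cache stays over budget. */
static size_t pick_purge_victims(cache_row *rows, size_t nrows,
                                 time_t now, long max_bytes)
{
    long total = 0;
    size_t i, n = 0;

    assert(rows != NULL || nrows == 0);
    for (i = 0; i < nrows; i++) {
        total += rows[i].size;
    }
    qsort(rows, nrows, sizeof(*rows), by_expire);
    while (n < nrows && total > max_bytes && rows[n].expire < now) {
        total -= rows[n].size;
        n++;
    }
    return n;
}
```

A cron job built on this never has to walk the cache directory tree; it only reads the index and unlinks the selected files.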


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: [patch 09/16] simplify array and table serialization

2006-09-20 Thread Brian Akins

Davi Arnaut wrote:


On 20/09/2006, at 10:16, Brian Akins wrote:

  Davi Arnaut wrote:
  Simplify the array and table serialization code, separating it from
  the underlying I/O operations.
 
  Probably faster to just put every thing in an iovec (think writev).

Probably no, apr_brigade_writev does (quite) the same.


Doesn't mean apr_brigade_writev does it fast either...


If the serialization simply returned an iovec, mod_mem_cache could use 
apr_pstrcatv and mod_disk_cache could use apr_file_writev.
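
A small sketch of the iovec idea using plain POSIX writev() rather than the APR wrappers.  The function names and the two sample headers are made up; the point is only that serialization fills in pointers and lengths, and the I/O layer decides what to do with them.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define PIECES_PER_HEADER 4

/* Describe one "Key: value\r\n" line as four iovec pieces.  Nothing is
 * copied; the iovec only points at the caller's strings. */
static int header_to_iovec(const char *key, const char *val,
                           struct iovec *vec)
{
    vec[0].iov_base = (void *)key;    vec[0].iov_len = strlen(key);
    vec[1].iov_base = (void *)": ";   vec[1].iov_len = 2;
    vec[2].iov_base = (void *)val;    vec[2].iov_len = strlen(val);
    vec[3].iov_base = (void *)"\r\n"; vec[3].iov_len = 2;
    return PIECES_PER_HEADER;
}

/* Serialize two sample headers and hand them to the kernel in a single
 * writev() call.  Returns the byte count written, or -1 on error. */
static ssize_t write_sample_headers(int fd)
{
    struct iovec vec[2 * PIECES_PER_HEADER];
    int n = 0;

    n += header_to_iovec("Content-Type", "text/html", vec + n);
    n += header_to_iovec("ETag", "\"abc\"", vec + n);
    assert(n == 2 * PIECES_PER_HEADER);
    return writev(fd, vec, n);
}

/* Round-trip through a pipe so the serialized bytes can be inspected.
 * Returns the number of bytes read back, or -1 on error. */
static ssize_t sample_headers_roundtrip(char *out, size_t outlen)
{
    int fds[2];
    ssize_t got;

    if (pipe(fds) != 0) {
        return -1;
    }
    if (write_sample_headers(fds[1]) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);
    got = read(fds[0], out, outlen);
    close(fds[0]);
    return got;
}
```

A disk provider would pass the same array to its file write call, while a memory provider could concatenate the pieces into one allocation.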



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-20 Thread Brian Akins

Issac Goldstand wrote:

I don't understand why bother getting so complex.  Touch/truncate the
body file when storing the header, and then a missing body means things
have gone amok - retry the request.  Conversely, a zero-length, or < C-L,
body length means another thread is working on the body.



Unless 0 is a valid content-length, which it can be.  Also, what about 
when we are reading something in without a known C-L, for example from an 
origin doing chunks?




  You're right, this is a tricky one, but there is a solution out there.
Maybe we're attacking the problem from the wrong angle.  Rather than
modifying mod_cache, modify the garbage-collector (e.g., htcacheclean). 
Do a two pass cleanup. 


I think it's insane that it has to traverse the directory structure to 
find the objects.  There should be an index of objects.  Traversing 
the tree can be a huge hit on large, busy structures.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_slotmem

2006-09-19 Thread Brian Akins

Oden Eriksson wrote:

onsdag 30 augusti 2006 10:37 skrev Brian Akins:

With all the talk of a generic scoreboard, here's something I whipped
up that allows any other module to have some amount of memory per
worker slot.  We have a different module in-house at CNN which does
something similar. This one is a little rough around the edges, but
gives an idea of what I was thinking about doing.


Care to release this and with a license?


I guess the ASF can have it and use Apache license.  It is a little 
rough, but maybe some will find it useful.





--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-18 Thread Brian Akins

Niklas Edmundsson wrote:



Extra tracking sounds unnecessary if you can do it in a way that
doesn't need it.


It's not extra, it's just adding some tracking.  When an object gets 
cached, log (sql, db, whatever) that /blah/foo/bar.html is cached as 
/cache/x/y/something.meta.  Then it's very easy to ask the store "what 
is /blah/foo/bar.html cached as?"  There may be multiples because of vary.



* Clients read from the cache as files are being cached.


That's the hard one, IMO


* Only one session caches the same file.


Easy to do if we use deterministic tmp files and not the way we 
currently do it.  Then all you have to do is when creating temp files 
use O_EXCL.
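
A minimal sketch of that, using plain POSIX open().  The function name and the "<dir>/<key>.tmp" naming scheme are made up for illustration; the only real requirement is that every process derives the same temp path from the same cache key.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Try to become the single writer for `key`.  The temp name depends only
 * on the key, so every process computes the same path.  Returns an open
 * fd if we won the race, or -1 with errno == EEXIST if another session is
 * already caching this object. */
static int claim_cache_tmpfile(const char *dir, const char *key,
                               char *path, size_t pathlen)
{
    int n;

    assert(dir != NULL && key != NULL && path != NULL);
    n = snprintf(path, pathlen, "%s/%s.tmp", dir, key);
    if (n < 0 || (size_t)n >= pathlen) {
        errno = ENAMETOOLONG;
        return -1;
    }
    /* O_CREAT | O_EXCL makes "check and create" one atomic step */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

The loser of the race can either serve from the origin or wait for the winner to rename the finished file into place.  A stale temp file left by a crashed writer still needs some cleanup policy, which this sketch does not cover.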



* Header/Body updates.


Easier with separate files like mod_disk_cache does now.


* No index/files out-of-sync issues. Ever.


Hard to guarantee, but not impossible.  Always add to the index when storing 
a file and remove from it when deleting.  This should use something like 
providers so it's not in core cache code and can be easily modified.



With locks, yes it's possible but also a hassle to get right with
performance intact.


Not really that hard.  Trust me it has been done...



We, as a ftp mirror operated by a non-profit computer club, have a
slightly different usecase with single files larger than machine RAM
and a working set of approx 40 times larger than RAM. Some bad design
decisions in mod_disk_cache becomes really visible in this
environment.


Seems to me you should approach the problem differently, like rsyncing the 
mirrored content.  I don't know your environment, but that was just what I 
came up with off the top of my head.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-18 Thread Brian Akins

Graham Leggett wrote:


I have not seen inside the htcacheclean code, why is the code reading the
headers? In theory the cache should be purged based on last access time,
deleted as space is needed.


Everyone should be mounting cache directories noatime, unless they don't 
care about performance...



Your patch is battle tested, and fixes some specific problems, the only
issue that I think needs to be resolved is the question of whether single
file or multiple files are preferable, taking into account performance on
platforms other that Linux as well.


I'm very interested in this as well.  Very good ideas that just need a 
little refinement.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-18 Thread Brian Akins

Issac Goldstand wrote:


 I can see how other tracking information (like how often the
cached entity is accessed, last access time, etc) would be useful,



Also, those statistics could be updated asynchronously by using a queue 
so that statistics doesn't slow down a busy web server.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-15 Thread Brian Akins

Niklas Edmundsson wrote:

Will it be possible to do away with one file for headers and one file
for body in mod_disk_cache with this scheme?

The thing is that I've been pounding seriously at mod_disk_cache to
make it able to sustain rather heavy load on not-so-heavy equipment,
and part of that effort was to wrap headers and body into one file for
mainly the following purposes:


The separate header and body files work wonderfully for performance 
(filling multiple gig interfaces and/or 30k requests/sec. on rather 
modest hardware).  If you have them all in one, it can make the sendfile 
for the body cumbersome.


If you somehow track what entries are in the cache, it is very easy to 
purge entries.


At Apachecon, I'll talk some about our version of mod_cache. 
Unfortunately, I can't share code :( But I can tell you the separate 
files way is not a performance or housekeeping issue.




--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: mod_cache responsibilities vs mod_xxx_cache provider responsibilities

2006-09-15 Thread Brian Akins

Niklas Edmundsson wrote:

If I remember correctly the code in 2.2.3 only does whole-file 
revalidation,


No, it can have a stale handle that it makes fresh if it gets a 304.

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: svn commit: r440337 - in /httpd/httpd/trunk: ./ include/ modules/arch/netware/ modules/experimental/ modules/generators/ modules/http/ modules/mappers/ modules/proxy/ modules/ssl/ server/ server/m

2006-09-05 Thread Brian Akins

Ruediger Pluem wrote:


1. If we stick to

AP_DECLARE(const char *) ap_get_server_version(void);

and do

#define ap_get_server_banner ap_get_server_version



I hate macros.  Just do it like:

AP_DECLARE(const char *) ap_get_server_banner(void) {
    return ap_get_server_version();
}

That way, it gets a symbol rather than disappearing after compile.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


porstfs bug t2000 and 2.2

2006-08-30 Thread Brian Akins
We tested a Sun t2000 with httpd 2.2.  It did okay. Now, Sun says 
there is an issue with 2.2 and portfs on Solaris 10 on the t2000.  Not 
real sure what this means.  Anyone else heard this?



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


mod_slotmem

2006-08-30 Thread Brian Akins
With all the talk of a generic scoreboard, here's something I whipped 
up that allows any other module to have some amount of memory per 
worker slot.  We have a different module in-house at CNN which does 
something similar. This one is a little rough around the edges, but 
gives an idea of what I was thinking about doing.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies

/* mod_slotmem.h */
#ifndef __MOD_slotmem__
#define __MOD_slotmem__

typedef struct ap_slotmem_t ap_slotmem_t;

typedef apr_status_t ap_slotmem_callback_fn_t(void *mem, void *data,
                                              apr_pool_t *pool);

AP_DECLARE(apr_status_t) ap_slotmem_do(ap_slotmem_t *s,
                                       ap_slotmem_callback_fn_t *func,
                                       void *data, apr_pool_t *pool);

AP_DECLARE(apr_status_t) ap_slotmem_create(ap_slotmem_t **new,
                                           const char *name,
                                           apr_size_t item_size,
                                           apr_pool_t *pool);

AP_DECLARE(apr_status_t) ap_slotmem_mem(ap_slotmem_t *s, conn_rec *c,
                                        void **mem);

#endif

#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"
#include "http_connection.h"
#include "ap_config.h"
#include "http_log.h"
#include "scoreboard.h"
#include "apr_strings.h"
#include "apr_shm.h"
#include "ap_mpm.h"

#include <sys/types.h>
#include <unistd.h>

#include "mod_status.h"
#include "mod_slotmem.h"

module AP_MODULE_DECLARE_DATA slotmem_module;

static int server_limit = 0;
static int thread_limit = 0;
static int total_limit = 0;

static apr_array_header_t *slotmem_array = NULL;

struct ap_slotmem_t {
    apr_pool_t *pool;
    const char *name;
    apr_shm_t *shm;
    void *mem;
    apr_size_t item_size; /*size of each item*/
    apr_size_t total_size;
};


AP_DECLARE(apr_status_t) ap_slotmem_do(ap_slotmem_t *s,
                                       ap_slotmem_callback_fn_t *func,
                                       void *data, apr_pool_t *pool)
{
    apr_status_t rv = APR_SUCCESS;
    int i;
    void *mem;

    /* call func on every slot; stop at the first failure */
    for (i = 0; i < total_limit; i++) {
        /* cast to char *: arithmetic on void * is a GNU extension */
        mem = (char *)s->mem + (i * s->item_size);
        if ((rv = func(mem, data, pool)) != APR_SUCCESS) {
            return rv;
        }
    }
    return rv;
}

AP_DECLARE(apr_status_t) ap_slotmem_create(ap_slotmem_t **n, const char *name,
                                           apr_size_t item_size,
                                           apr_pool_t *pool)
{
    ap_slotmem_t *s, **new;

    s = apr_pcalloc(pool, sizeof(ap_slotmem_t));

    s->pool = pool;
    s->name = apr_pstrdup(pool, name);
    s->item_size = item_size;

    /* remember it; the shared memory itself is set up in post_config */
    new = (ap_slotmem_t **) apr_array_push(slotmem_array);
    (*new) = (ap_slotmem_t *) s;

    *n = s;

    return APR_SUCCESS;
}

AP_DECLARE(apr_status_t) ap_slotmem_mem(ap_slotmem_t *s, conn_rec *c,
                                        void **mem)
{
    /*this should work for all mpm's*/
    void *d = (char *)s->mem + (c->id * s->item_size);

    *mem = d;
    if (!d) {
        ap_log_cerror(APLOG_MARK, APLOG_ERR, 0, c,
                      "ap_slotmem_mem: d is NULL, %ld, %p",
                      c->id, s->mem);
        return APR_EGENERAL;
    }
    return APR_SUCCESS;
}

static int slotmem_status(request_rec *r, int flags)
{
    int i;
    ap_slotmem_t *sb, **list;

    if (!(flags & AP_STATUS_SHORT)) {
        ap_rputs("<hr /><b>slotmems</b>\n", r);
        ap_rputs("<table border=1><tr><td><b>Name</b></td><td><b>Item "
                 "Size</b></td><td><b>Total Size</b></td></tr>\n", r);
    }
    list = (ap_slotmem_t **)slotmem_array->elts;
    for (i = 0; i < slotmem_array->nelts; i++) {
        sb = list[i];
        if (!(flags & AP_STATUS_SHORT)) {
            ap_rprintf(r,
                       "<tr><td>%s</td><td>%" APR_SIZE_T_FMT
                       "</td><td>%" APR_SIZE_T_FMT "</td></tr>\n",
                       sb->name, sb->item_size, sb->total_size);
        } else {
            /* space between the two sizes so they do not run together */
            ap_rprintf(r, "slotmem: %s %" APR_SIZE_T_FMT " %" APR_SIZE_T_FMT "\n",
                       sb->name, sb->item_size, sb->total_size);
        }
    }

    if (!(flags & AP_STATUS_SHORT)) {
        ap_rputs("</table>\n", r);
    }
    return OK;
}

static int post_config(apr_pool_t *p, apr_pool_t *plog, apr_pool_t *ptemp,
                       server_rec *s)
{
    apr_size_t amt;
    apr_time_t now;
    pid_t pid;
    char *file;
    ap_slotmem_t *sb, **list;
    int i;
    apr_status_t rv;
    const char *temp_dir;
    char *data = NULL;

    /*have we been here before? */
    apr_pool_userdata_get((void *) &data, __FILE__,
                          s->process->pool);

    if (!data) {
        /* first pass of the config: just leave a marker and return */
        apr_pool_userdata_set((const void *) 1, __FILE__,
                              apr_pool_cleanup_null, s->process->pool);
        return OK;
    }

    ap_mpm_query(AP_MPMQ_HARD_LIMIT_THREADS, &thread_limit);
    ap_mpm_query(AP_MPMQ_HARD_LIMIT_DAEMONS, &server_limit);
    total_limit = thread_limit * server_limit;

    now = apr_time_now();
    pid = getpid();
    if ((rv = apr_temp_dir_get(&temp_dir, p)) != APR_SUCCESS) {
        ap_log_error(APLOG_MARK, APLOG_STARTUP, 0, s,
                     "apr_temp_dir_get failed");
        return rv;
    }

    /*XXX: currently, this uses one shm per slotmem.  We could try to pack them
    all into a few larger shm's*/
    list = (ap_slotmem_t **)slotmem_array->elts;
    for (i = 0; i < slotmem_array->nelts; i++) {
        sb = list[i];
        sb->total_size = amt = sb->item_size

Re: mod_slotmem

2006-08-30 Thread Brian Akins

Jean-frederic Clere wrote:

Nice stuff but I am not sure that having shared memory per slot scales
when having a lot of entries, but that makes sure that one
process/thread won't overwrite another one slot.


It scales very nicely.  We run with max clients set between 16k-32k with 
no issues.  Our item sizes range from 64-264 bytes.  At 264 bytes and 
24k max clients, it requires about 6.5 MB of shared memory and we 
usually have 8-12 slotmems of varying sizes, so we use about 32-64 MB 
of shared memory for all the slotmems.


It is very fast as there is no locking necessary as each slot is only 
ever written to by a single writer (it's tied to a connection id).
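
For anyone who wants to check the arithmetic or the addressing, here is a stand-alone sketch in plain C.  It uses calloc() instead of shared memory, and the slot_table names are invented for this example; the real module gets its slot count from the MPM limits and its index from the connection id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for one slotmem segment: one fixed-size item per worker slot. */
typedef struct {
    unsigned char *mem;
    size_t item_size;
    size_t nslots;
} slot_table;

/* Bytes needed for the whole table */
static size_t slot_table_bytes(size_t item_size, size_t nslots)
{
    return item_size * nslots;
}

/* Allocate a zeroed table.  Returns 0 on success, -1 on failure. */
static int slot_table_init(slot_table *t, size_t item_size, size_t nslots)
{
    assert(item_size > 0 && nslots > 0);
    t->mem = calloc(nslots, item_size);
    t->item_size = item_size;
    t->nslots = nslots;
    return t->mem ? 0 : -1;
}

/* Memory owned by slot `id`.  By convention only the worker that owns
 * `id` writes here, so no lock is taken; anyone may read. */
static void *slot_for(slot_table *t, size_t id)
{
    assert(id < t->nslots);
    return t->mem + id * t->item_size;
}

static void slot_table_free(slot_table *t)
{
    free(t->mem);
    t->mem = NULL;
}
```

Taking 24k to mean 24 * 1024 slots, slot_table_bytes(264, 24576) is 6,488,064 bytes, which is where the roughly 6.5 MB figure comes from.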




--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Memory usage in apache

2006-08-22 Thread Brian Akins

Ian Holsman wrote:

Hi.

now that I am a 'founder' of my own and using shared hosting, and no 
longer have millions of machines to play with I am starting to notice

how much memory apache is using.


We generally have 4GB of ram on our webservers and let Apache use almost 
every bit of it...


Could just go the virtual server route a la Xen or Virtuozzo. Lots of 
fairly reasonable plans out there.



so.. I was wondering if anybody has looked at a lightweight apache, or 
other ways to reduce it's memory footprint.


Once you load something like mod_python or mod_perl, it's not really 
Apache memory that's your problem.  Apache can be slimmed down to 8-32 
MB (depending on modules and config) fairly easily just by removing 
unwanted cruft.  Most mod_xxx programming stuff tends to cache a lot of 
stuff in RAM.




--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


cache and proxy

2006-08-02 Thread Brian Akins
If I am reading the code correctly, we only do not set Server header 
when r->proxyreq != PROXYREQ_NONE. So on a cached response, even if 
original was reverse proxy, we set our (Apache) server header??



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: svn commit: r426604 - in /httpd/httpd/branches/httpd-proxy-scoreboard: modules/proxy/ support/

2006-07-31 Thread Brian Akins

Jim Jagielski wrote:

I thought that this was about abstracting out scoreboard
so that other modules could have scoreboard-like
access without mucking around with the real scoreboard...


+1.  The proxy could just use this mechanism.  We need to separate the 
two issues.  I am all in favor of a generic scoreboard, that, in the 
future, the real scoreboard might use.



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: load balancer cluster set

2006-07-31 Thread Brian Akins

Guy Hulbert wrote:


However, you may not be able to wait until the linux router project
picks this up  (but it might be worth looking to see what is
available).


Most of the load-balancing we are discussing on this list is not for 
directly customer facing applications.  These are proxies for 
application servers generally, but they need to be highly available.  We 
are not trying to replace Cisco CSM's.  But a hardware HTTP-only aware 
$20k device is not needed when I just need to load balance an app across 
4 tomcat instances, for example.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Scoreboard was Re: load balancer cluster set

2006-07-31 Thread Brian Akins



I've seen all the traffic on the scoreboard and this is very useful
context ...


Also, I am using a similar scoreboard mechanism to collect lots of per 
 worker stats without the extendedstatus overhead.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: load balancer cluster set

2006-07-31 Thread Brian Akins

Guy Hulbert wrote:

That's the ultimate case, after all :-)


Not necessarily.  Google's answer is to throw tons of hardware at stuff. 
Which is great if you have unlimited space, power, and cooling.  Some 
other sites do some rather interesting things with a relatively small 
number of servers



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: load balancer cluster set

2006-07-31 Thread Brian Akins

Guy Hulbert wrote:


The point of contention was scalability ... from a human point of view
it is really annoying to have to solve a problem twice but from the
business pov, outgrowing your load balancer might only be a good thing.



Yes.  But most load balancers can only do layer 7 load balancing. 
Sometimes it is necessary to have very application specific routing. 
Also, in general, most hardware load balancers base their algorithms on 
things such as response time.  Sometimes, it is necessary to know the 
general health of the backend servers.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: svn commit: r423444 - in /httpd/httpd/branches/httpd-proxy-scoreboard/modules/mem: ./ Makefile.in config5.m4 mod_plainmem.c mod_scoreboard.c mod_sharedmem.c slotmem.h

2006-07-21 Thread Brian Akins

Jean-frederic Clere wrote:


Do you mean the proxy back-end connections?


I am thinking of a more general purpose slotmem not particularly tied 
to proxy.  Maybe have some wrapper functions that create a slotmem 
based on threads x procs and can be accessed using r->connection. 
(internally slotmem could use r->connection->id).


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: svn commit: r423444 - in /httpd/httpd/branches/httpd-proxy-scoreboard/modules/mem: ./ Makefile.in config5.m4 mod_plainmem.c mod_scoreboard.c mod_sharedmem.c slotmem.h

2006-07-20 Thread Brian Akins

Ruediger Pluem wrote:

  +static apr_status_t ap_slotmem_create(ap_slotmem_t **new, const char 
*name, apr_size_t item_size, int item_num, apr_pool_t *pool)


In my thought of a slotmem or scoreboard, item_num is max threads * 
max procs just like the normal scoreboard. Or has this morphed into 
something completely different?  I can see uses for this type as well. 
Would be nice to have a function somewhere to get the current 
connection's scoreboard slot perhaps...


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Additing a storage for the shared information of the worker in mod_proxy

2006-07-14 Thread Brian Akins

Jean-frederic Clere wrote:


Using the result of your ideas and the explaintions now I have mod_proxy
that uses the scoreboard via a scoreboard provider ;-)
Find enclosed the code.

Any comments?

The next step is to write shared memory scoreboard provider.




I was thinking the scoreboard code would be in something like 
mod_scoreboard that other modules (like mod_proxy) would use.


apr_shm provider is rather trivial. Left as exercise to the reader ;)



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Additing a storage for the shared information of the worker in

2006-07-14 Thread Brian Akins

Jim Jagielski wrote:
at

is the actual scoreboard and what is the proxy's segment
of scoreboard space... Eg the ap_scoreboard_* stuff implies
that it's Apache's real scoreboard.


Like I said, I think this scoreboard stuff should be more generic than 
just proxy.  It could be renamed to avoid confusion with the real 
scoreboard.  But, there is no reason the real scoreboard couldn't use 
mod_scoreboard itself.  It would just have to be core.  probably best 
to have mod_scoreboard now, and merge the two later.




--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Additing a storage for the shared information of the worker in mod_proxy

2006-07-13 Thread Brian Akins

Jim Jagielski wrote:


If this is data that needs to be accessed from non-proxy modules
then yes, I agree.


A basic API could look like the following.  By worker, I am thinking about 
the mpm sense, not the proxy sense.  I guess slot may be a better term:


/*used for ap_scoreboard_do. mem is the memory associated with a worker, 
data is what is passed to ap_scoreboard_do. pool is pool used to create 
scoreboard*/
typedef apr_status_t ap_scoreboard_callback_fn_t(void *mem, void *data,
                                                 apr_pool_t *pool);


/*call the callback on all worker slots*/
AP_DECLARE(apr_status_t) ap_scoreboard_do(ap_scoreboard_t *s,
                                          ap_scoreboard_callback_fn_t *func,
                                          void *data, apr_pool_t *pool);


/*create a new scoreboard where each item is item_size bytes.  name is a 
key used for debugging and in mod_status output. This would create shared 
memory, basically*/
AP_DECLARE(apr_status_t) ap_scoreboard_create(ap_scoreboard_t **new,
                                              const char *name,
                                              apr_size_t item_size,
                                              apr_pool_t *pool);


/*get the memory associated with this worker slot. use c->id or c->sbh 
to get offset into shared memory*/
AP_DECLARE(apr_status_t) ap_scoreboard_mem(ap_scoreboard_t *s, conn_rec *c,
                                           void **mem);



Thoughts?  Something very similar to this is used by several very busy 
web sites...


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Additing a storage for the shared information of the worker in

2006-07-13 Thread Brian Akins

Jim Jagielski wrote:



+1. For example, a memcached based scoreboard would be
pretty cool ;)



maybe in mod_scoreboard it may use a provider mechanism to actually 
implement the scoreboard.  Maybe have an ap_scoreboard_create_ex where 
you could explicitly name a provider.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Additing a storage for the shared information of the worker in

2006-07-13 Thread Brian Akins

Jim Jagielski wrote:


Yeah, that's what I was thinking as well!


default could just use apr_shm.

--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Additing a storage for the shared information of the worker in mod_proxy

2006-07-13 Thread Brian Akins

Jean-frederic Clere wrote:
With such an interface you assume only one process will access to one 
slot... That is what the scoreboard allows.

Allowing updates from different proccesses on the same slot.
Should we have an ap_slot_read_look() and an ap_slot_unlock() for that?


No, we don't have such for the built-in scoreboard.  Anything can read 
the scoreboard, only the current worker slot can change it.  That's why in 
the sample API, to get the memory you pass a conn_rec.


If it's slow, people won't use it.  Semaphores are generally slow. 
Enforcing it by convention like we currently do seems reasonable.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Additing a storage for the shared information of the worker in mod_proxy

2006-07-13 Thread Brian Akins

Jim Jagielski wrote:

Having some external (or even internal) process update
a slot that isn't its own is dangerous. And the required
locking would be slow.


In my own hacked proxy, an external healthchecker and the proxy share a 
piece of shared memory that is read-only by apache and read-write by 
the external health check.  This is not scoreboard info, just some 
health info.  The 2 are separate things.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: Additing a storage for the shared information of the worker in mod_proxy

2006-07-12 Thread Brian Akins

Jean-frederic Clere wrote:

Hi,

I am still trying to replace the scoreboard by shared memory to store
the shared information of the workers, I am now thinking to get this by
adding modules like the prototype I have enclosed (that is a patch
against trunk).


We really need a generic scoreboard type module with a relatively 
simple API to add and access per worker memory.  I have some ideas if 
anyone wants to hear them...



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: [PATCH] setenvif filter

2006-06-21 Thread Brian Akins

Anyone else care to vote on this so it can get, possibly, committed?


Francois Pesce wrote:

These patches may fix the r->content_type behaviour. Are you OK with it?

--
*Francois Pesce*

2006/5/31, Brian Akins [EMAIL PROTECTED]:
  Francois PESCE wrote:
   I've discussed about a patch for mod_setenvif 2 years ago, and have
   coded it at that time, it is successfully used on various host in
   production since.
 
 
  You need to handle content type specially by checking r->content_type.
  For some reason, just doing apr_table_get(r->headers_out,
  "Content-type") would be null, but content_type would be set.
 
  See the patch I posted a few days ago.
 
 
  +1 in concept
 
  --
  Brian Akins
  Lead Systems Engineer
  CNN Internet Technologies
 




--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: hook call tracing [was: debug apache]

2006-06-21 Thread Brian Akins

William A. Rowe, Jr. wrote:


This is definately useful for module developers, and should be added.


+1.  It is also useful when you have strange things happening to 
production web servers.  I, too, did a rather ugly hack that showed the 
current hook using the mod_status hook and it has come in handy.  It is, 
however, very ugly.


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: [PATCH] setenvif filter

2006-06-01 Thread Brian Akins

Francois Pesce wrote:

These patches may fix the r->content_type behaviour. Are you OK with it?



+1



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


[PATCH] add notfound to mod_rewrite rule

2006-06-01 Thread Brian Akins
Patch against 2.2.2 (but should work on most other versions).  Adds the 
notfound (NF) flag to RewriteRule that will cause it to return 
HTTP_NOT_FOUND for matches, similar to the gone and forbidden options.




--- mod_rewrite.c.bak   2006-06-01 15:25:48.0 -0400
+++ mod_rewrite.c   2006-06-01 15:29:18.0 -0400
@@ -3264,6 +3264,11 @@
 || !strcasecmp(key, "ocase")) {/* nocase */
 cfg->flags |= RULEFLAG_NOCASE;
 }
+else if (((*key == 'F' || *key == 'f') && !key[1])
+ || !strcasecmp(key, "otfound")) {   /* notfound */
+cfg->flags |= (RULEFLAG_STATUS | RULEFLAG_NOSUB);
+cfg->forced_responsecode = HTTP_NOT_FOUND;
+}
 else {
 ++error;
 }


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: [PATCH] add notfound to mod_rewrite rule

2006-06-01 Thread Brian Akins

André Malo wrote:

We don't need this, because you can use [R=404]. F and G are already just 
syntactic sugar.




From the docs:

'redirect|R  [=code]' (force redirect)
Prefix Substitution with http://thishost[:thisport]/ (which makes the 
new URL a URI) to force a external redirection. If no code is given, a 
HTTP response of 302 (MOVED TEMPORARILY) will be returned. If you want 
to use other response codes in the range 300-400, simply specify the 
appropriate number or use one of the following symbolic names: temp 
(default), permanent, seeother. Use this for rules to canonicalize the 
URL and return it to the client - to translate ``/~'' into ``/u/'', or 
to always append a slash to /u/user, etc.



So do the docs need to be updated to say: return arbitrary http code?

BTW, this does work.  patch withdrawn.  somebody needs to fix docs...

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: [PATCH] setenvif filter

2006-05-31 Thread Brian Akins

Francois PESCE wrote:

I've discussed about a patch for mod_setenvif 2 years ago, and have
coded it at that time, it is successfully used on various host in
production since.



You need to handle content type specially by checking r->content_type. 
For some reason, just doing apr_table_get(r->headers_out, 
"Content-type") would be null, but content_type would be set.


See the patch I posted a few days ago.


+1 in concept

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: mod_disk_cache read-while-caching patch (try2)

2006-05-30 Thread Brian Akins

Niklas Edmundsson wrote:


In short, the mod_disk_cache read-while-caching patch is usable now
and I'd like to contribute what's possible upstreams.



Does it make more sense at this time to do all these changes in another
module and leave mod_disk_cache as stable and usable?  Call it disk2 or
something...


CacheEnable disk2 /




--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: [PATCH] mod_disk_cache early size-check

2006-05-30 Thread Brian Akins

Niklas Edmundsson wrote:


This patch takes advantage of the possibility to do the size-check of 
the file to be cached early.

 obj->vobj = dobj = apr_pcalloc(r->pool, sizeof(*dobj));



Shouldn't this be in mod_cache so that all providers do not have to 
duplicate this logic?


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


[PATCH] setenvif filter

2006-05-23 Thread Brian Akins
This patch adds a filter to mod_setenvif that lets it match against
response headers as well as request headers.  It is probably a horrible
implementation, but submitted to encourage others to think of the idea.


This changes the configuration to allow another optional field to 
designate the mode of the match (default is request)


SetEnvIf response Content-Type text/* is_text=1

will match against the response header content-type.
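
The optional mode word can be sketched outside httpd roughly like this. It is an illustration only: next_word(), parse_mode_and_header() and the MODE_* names are invented here, and the check is for an empty word, as in the `if (!*fname)` line that the patch removes.

```c
#include <string.h>
#include <strings.h>

enum { MODE_REQUEST = 1, MODE_RESPONSE = 2 };

/*
 * Split the leading word off *args into buf (at most buflen - 1 chars)
 * and advance *args past it and any following spaces.
 */
static void next_word(const char **args, char *buf, size_t buflen)
{
    const char *p = *args;
    size_t n = 0;

    while (*p == ' ') {
        p++;
    }
    while (*p && *p != ' ') {
        if (n + 1 < buflen) {
            buf[n++] = *p;
        }
        p++;
    }
    buf[n] = '\0';
    while (*p == ' ') {
        p++;
    }
    *args = p;
}

/*
 * Read an optional "request"/"response" keyword, then the header name.
 * Returns the mode, or 0 if no header name is present.
 */
static int parse_mode_and_header(const char **args, char *name, size_t len)
{
    int mode = MODE_REQUEST;            /* default when no keyword is given */

    next_word(args, name, len);
    if (!strcasecmp(name, "request") || !strcasecmp(name, "response")) {
        mode = !strcasecmp(name, "response") ? MODE_RESPONSE : MODE_REQUEST;
        next_word(args, name, len);     /* the real header name follows */
    }
    return *name ? mode : 0;
}
```

One consequence of this syntax is that a header literally named "request" or "response" can only be matched by giving the mode word first.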

The main purpose of this is to allow configurations such as:

AddOutputFilterByType DEFLATE text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
SetEnvIf response Content-Type text/html user-agent-vary=1
Header append Vary User-Agent env=user-agent-vary

With this patch, the correct vary headers are added in a reverse proxy 
situation.


Most of the code was adapted from mod_headers.

Thoughts?

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies
--- mod_setenvif.c.bak  2006-05-23 10:08:56.0 -0400
+++ mod_setenvif.c  2006-05-23 11:03:06.0 -0400
@@ -94,6 +94,8 @@
 #include "http_log.h"
 #include "http_protocol.h"
 
+#define SETENVIF_REQUEST 1
+#define SETENVIF_RESPONSE 2
 
 enum special {
     SPECIAL_NOT,
@@ -113,12 +115,15 @@
     apr_table_t *features;      /* env vars to set (or unset) */
     enum special special_type;  /* is it a special header ? */
     int icase;                  /* ignoring case? */
+    int mode;                   /*request or response*/
 } sei_entry;
 
 typedef struct {
     apr_array_header_t *conditionals;
 } sei_cfg_rec;
 
+static ap_filter_rec_t *setenvif_output_filter_handle = NULL;
+
 module AP_MODULE_DECLARE_DATA setenvif_module;
 
 /*
@@ -249,7 +254,7 @@
 }
 
 static const char *add_setenvif_core(cmd_parms *cmd, void *mconfig,
-                                     char *fname, const char *args)
+                                     char *fname, int mode, const char *args)
 {
     char *regex;
     const char *simple_pattern;
@@ -304,6 +309,7 @@
 
         /* no match, create a new entry */
         new = apr_array_push(sconf->conditionals);
+        new->mode = mode;
         new->name = fname;
         new->regex = regex;
         new->icase = icase;
@@ -400,15 +406,35 @@
 static const char *add_setenvif(cmd_parms *cmd, void *mconfig,
                                 const char *args)
 {
-    char *fname;
-
+    char *fname = NULL;
+    int mode = SETENVIF_REQUEST;
+
     /* get header name */
     fname = ap_getword_conf(cmd->pool, &args);
-    if (!*fname) {
+    /*is this a mode?*/
+
+    if (!fname) {
+        return apr_pstrcat(cmd->pool, "Missing header-field name for ",
+                           cmd->cmd->name, NULL);
+    }
+
+    if(!strcasecmp(fname, "request")) {
+        mode = SETENVIF_REQUEST;
+        fname = NULL;
+    } else if (!strcasecmp(fname, "response")) {
+        mode = SETENVIF_RESPONSE;
+        fname = NULL;
+    }
+
+    if(!fname) {
+        fname = ap_getword_conf(cmd->pool, &args);
+    }
+
+    if (!fname) {
         return apr_pstrcat(cmd->pool, "Missing header-field name for ",
                            cmd->cmd->name, NULL);
     }
-    return add_setenvif_core(cmd, mconfig, fname, args);
+    return add_setenvif_core(cmd, mconfig, fname, mode, args);
 }
 
 /*
@@ -418,7 +444,7 @@
  */
 static const char *add_browser(cmd_parms *cmd, void *mconfig, const char *args)
 {
-    return add_setenvif_core(cmd, mconfig, "User-Agent", args);
+    return add_setenvif_core(cmd, mconfig, "User-Agent", SETENVIF_REQUEST, args);
 }
 
 static const command_rec setenvif_module_cmds[] =
@@ -444,7 +470,7 @@
  * signal which call it is by having the earlier one pass a flag to the
  * later one.
  */
-static int match_headers(request_rec *r)
+static int match_headers(request_rec *r, int mode)
 {
     sei_cfg_rec *sconf;
     sei_entry *entries;
@@ -454,7 +480,14 @@
     int i, j;
     char *last_name;
     ap_regmatch_t regm[AP_MAX_REG_MATCH];
-
+    apr_table_t *headers;
+
+    if(SETENVIF_RESPONSE == mode) {
+        headers = r->headers_out;
+    } else {
+        headers = r->headers_in;
+    }
+
     if (!ap_get_module_config(r->request_config, &setenvif_module)) {
         ap_set_module_config(r->request_config, &setenvif_module,
                              SEI_MAGIC_HEIRLOOM);
@@ -468,9 +501,17 @@
     entries = (sei_entry *) sconf->conditionals->elts;
     last_name = NULL;
     val = NULL;
+
     for (i = 0; i < sconf->conditionals->nelts; ++i) {
         sei_entry *b = &entries[i];
-
+
+        if(b->mode != mode) {
+            continue;
+        }
+
+        ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
+                     "setenvif: trying %s", b->name);
+
         /* Optimize the case where a bunch of directives in a row use the
          * same header.  Remember we don't need to strcmp the two header
          * names because we made sure the pointers were equal during
@@ -505,7 +546,7 @@
  * headers

Re: [PATCH] setenvif filter

2006-05-23 Thread Brian Akins
Here's a newer version with some special handling for content-type.  In
response headers, we need to match r->content_type rather than the header.



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies
--- mod_setenvif.c.bak  2006-05-23 10:08:56.0 -0400
+++ mod_setenvif.c  2006-05-23 16:55:50.0 -0400
@@ -94,6 +94,8 @@
 #include "http_log.h"
 #include "http_protocol.h"
 
+#define SETENVIF_REQUEST 1
+#define SETENVIF_RESPONSE 2
 
 enum special {
     SPECIAL_NOT,
@@ -102,7 +104,8 @@
     SPECIAL_REQUEST_URI,
     SPECIAL_REQUEST_METHOD,
     SPECIAL_REQUEST_PROTOCOL,
-    SPECIAL_SERVER_ADDR
+    SPECIAL_SERVER_ADDR,
+    SPECIAL_CONTENT_TYPE
 };
 typedef struct {
     char *name;                 /* header name */
@@ -113,12 +116,15 @@
     apr_table_t *features;      /* env vars to set (or unset) */
     enum special special_type;  /* is it a special header ? */
     int icase;                  /* ignoring case? */
+    int mode;                   /*request or response*/
 } sei_entry;
 
 typedef struct {
     apr_array_header_t *conditionals;
 } sei_cfg_rec;
 
+static ap_filter_rec_t *setenvif_output_filter_handle = NULL;
+
 module AP_MODULE_DECLARE_DATA setenvif_module;
 
 /*
@@ -249,7 +255,7 @@
 }
 
 static const char *add_setenvif_core(cmd_parms *cmd, void *mconfig,
-                                     char *fname, const char *args)
+                                     char *fname, int mode, const char *args)
 {
     char *regex;
     const char *simple_pattern;
@@ -304,6 +310,7 @@
 
         /* no match, create a new entry */
         new = apr_array_push(sconf->conditionals);
+        new->mode = mode;
         new->name = fname;
         new->regex = regex;
         new->icase = icase;
@@ -345,6 +352,9 @@
         else if (!strcasecmp(fname, "server_addr")) {
             new->special_type = SPECIAL_SERVER_ADDR;
         }
+        else if ((SETENVIF_RESPONSE == mode) && !strcasecmp(fname, "content-type")) {
+            new->special_type = SPECIAL_CONTENT_TYPE;
+        }
         else {
             new->special_type = SPECIAL_NOT;
             /* Handle fname as a regular expression.
@@ -400,15 +410,35 @@
 static const char *add_setenvif(cmd_parms *cmd, void *mconfig,
                                 const char *args)
 {
-    char *fname;
-
+    char *fname = NULL;
+    int mode = SETENVIF_REQUEST;
+
     /* get header name */
     fname = ap_getword_conf(cmd->pool, &args);
-    if (!*fname) {
+    /*is this a mode?*/
+
+    if (!fname) {
+        return apr_pstrcat(cmd->pool, "Missing header-field name for ",
+                           cmd->cmd->name, NULL);
+    }
+
+    if(!strcasecmp(fname, "request")) {
+        mode = SETENVIF_REQUEST;
+        fname = NULL;
+    } else if (!strcasecmp(fname, "response")) {
+        mode = SETENVIF_RESPONSE;
+        fname = NULL;
+    }
+
+    if(!fname) {
+        fname = ap_getword_conf(cmd->pool, &args);
+    }
+
+    if (!fname) {
         return apr_pstrcat(cmd->pool, "Missing header-field name for ",
                            cmd->cmd->name, NULL);
     }
-    return add_setenvif_core(cmd, mconfig, fname, args);
+    return add_setenvif_core(cmd, mconfig, fname, mode, args);
 }
 
 /*
@@ -418,7 +448,7 @@
  */
 static const char *add_browser(cmd_parms *cmd, void *mconfig, const char *args)
 {
-    return add_setenvif_core(cmd, mconfig, "User-Agent", args);
+    return add_setenvif_core(cmd, mconfig, "User-Agent", SETENVIF_REQUEST, args);
 }
 
 static const command_rec setenvif_module_cmds[] =
@@ -444,7 +474,7 @@
  * signal which call it is by having the earlier one pass a flag to the
  * later one.
  */
-static int match_headers(request_rec *r)
+static int match_headers(request_rec *r, int mode)
 {
     sei_cfg_rec *sconf;
     sei_entry *entries;
@@ -454,7 +484,14 @@
     int i, j;
     char *last_name;
     ap_regmatch_t regm[AP_MAX_REG_MATCH];
-
+    apr_table_t *headers;
+
+    if(SETENVIF_RESPONSE == mode) {
+        headers = r->headers_out;
+    } else {
+        headers = r->headers_in;
+    }
+
     if (!ap_get_module_config(r->request_config, &setenvif_module)) {
         ap_set_module_config(r->request_config, &setenvif_module,
                              SEI_MAGIC_HEIRLOOM);
@@ -468,9 +505,17 @@
     entries = (sei_entry *) sconf->conditionals->elts;
     last_name = NULL;
     val = NULL;
+
     for (i = 0; i < sconf->conditionals->nelts; ++i) {
         sei_entry *b = &entries[i];
-
+
+        if(b->mode != mode) {
+            continue;
+        }
+
+        ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
+                     "setenvif: trying %s", b->name);
+
         /* Optimize the case where a bunch of directives in a row use the
          * same header.  Remember we don't need to strcmp the two header
          * names because we made sure the pointers were equal during
@@ -498,6 +543,10 @@
             case SPECIAL_REQUEST_PROTOCOL:
                 val = r->protocol;
                 break

bug in ap_set_content_type or somewhere

2006-05-23 Thread Brian Akins

If I have this configured:

AddOutputFilterByType DEFLATE text/html

but request something like host/server-status?auto, the response still
gets gzipped.  It looks like this happens because mod_status.c does this:


    ap_set_content_type(r, "text/html");

and a little later:

    case STAT_OPT_AUTO:
        ap_set_content_type(r, "text/plain");


and in ap_set_content_type we call ap_add_output_filters_by_type(r);

so, even though we change the content type, DEFLATE has already been
added for this request.
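
A toy model of the behaviour described above, assuming that setting the type only ever adds filters. struct request, set_content_type() and the other names are hypothetical and do not mirror the httpd implementation:

```c
#include <string.h>

#define MAX_FILTERS 8

/* Hypothetical model of a request with an output filter chain. */
struct request {
    const char *content_type;
    const char *filters[MAX_FILTERS];
    int nfilters;
};

/* Stand-in for "AddOutputFilterByType DEFLATE text/html". */
static void add_filters_by_type(struct request *r)
{
    if (!strcmp(r->content_type, "text/html") && r->nfilters < MAX_FILTERS) {
        r->filters[r->nfilters++] = "DEFLATE";
    }
}

/* Setting the type adds matching filters but never removes earlier ones. */
static void set_content_type(struct request *r, const char *ct)
{
    r->content_type = ct;
    add_filters_by_type(r);
}

/* Report whether a named filter is in the chain. */
static int has_filter(const struct request *r, const char *name)
{
    int i;

    for (i = 0; i < r->nfilters; i++) {
        if (!strcmp(r->filters[i], name)) {
            return 1;
        }
    }
    return 0;
}
```

Calling set_content_type() with text/html and then text/plain leaves DEFLATE in the chain of this model, which matches the symptom seen with server-status?auto.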


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


mod_mime, filters and proxy

2006-05-17 Thread Brian Akins

Any reason why we do this check a lot in mod_mime.c:
 r->proxyreq == PROXYREQ_NONE

Why can't we add by type to proxy requests?  This is at top:

/* X - fix me - See note with NOT_PROXY
 */
 but I don't see a note.

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


move ap_sb_handle_t def

2006-05-15 Thread Brian Akins
Any objections to these two small diffs?  Basically, it just moves the
definition of ap_sb_handle_t from being private in scoreboard.c to being
available to everyone in scoreboard.h.  This allows modules to get a
scoreboard handle for themselves using r->connection->sbh.


Thoughts?

--- include/scoreboard.h.bak    2006-05-15 07:44:02.0 -0400
+++ include/scoreboard.h    2006-05-15 07:44:47.0 -0400
@@ -169,7 +169,10 @@
     lb_score *balancers;
 } scoreboard;

-typedef struct ap_sb_handle_t ap_sb_handle_t;
+typedef struct {
+    int child_num;
+    int thread_num;
+} ap_sb_handle_t;

 AP_DECLARE(int) ap_exists_scoreboard_image(void);
 AP_DECLARE(void) ap_increment_counts(ap_sb_handle_t *sbh, request_rec *r);


--- server/scoreboard.c.bak 2006-05-15 07:42:39.0 -0400
+++ server/scoreboard.c 2006-05-15 07:43:53.0 -0400
@@ -63,11 +63,6 @@
 static APR_OPTIONAL_FN_TYPE(ap_proxy_lb_workers)
                                 *proxy_lb_workers;

-struct ap_sb_handle_t {
-    int child_num;
-    int thread_num;
-};
-
-
 static int server_limit, thread_limit, lb_limit;
 static apr_size_t scoreboard_size;
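
What the change buys a module can be sketched as follows. sb_handle is a local stand-in for the public struct, and sb_slot() with its flat slot layout is purely hypothetical:

```c
/*
 * With the opaque declaration
 *     typedef struct ap_sb_handle_t ap_sb_handle_t;
 * a module can pass the pointer around but cannot read its fields.
 * Once the full definition is in the header, the fields are visible.
 */
typedef struct {
    int child_num;
    int thread_num;
} sb_handle;

/* Hypothetical use: turn a handle into a flat slot index. */
static int sb_slot(const sb_handle *sbh, int thread_limit)
{
    return sbh->child_num * thread_limit + sbh->thread_num;
}
```

The trade-off is the usual one for opaque types: modules gain direct access, and the struct layout becomes part of the public API.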

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-04 Thread Brian Akins

Graham Leggett wrote:


I think in the long run, a dedicated process is the way to go.


I think using a provider architecture would be best and keep complexity 
out of mod_cache.  Some module(s) would implement the necessary cache 
management functions and mod_cache would push/pull/probe the manager 
using this interface.  The manager may or may not be tied to the storage 
provider.  We may have enough generic interfaces already to allow 
completely stand alone cache managers.
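
Roughly what I have in mind, as a standalone sketch. The struct, the tiny registry and the "mem" manager are all invented for illustration; in httpd the registry role would be played by ap_register_provider/ap_lookup_provider.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cache-manager interface; none of these names are httpd APIs. */
typedef struct {
    const char *name;
    int  (*probe)(const char *key);          /* 1 if the key is managed */
    void (*push)(const char *key);           /* record a stored entry */
} cache_manager_provider;

#define MAX_PROVIDERS 4
static const cache_manager_provider *registry[MAX_PROVIDERS];
static int nregistered;

/* Add a provider to the registry; returns -1 when the registry is full. */
static int register_manager(const cache_manager_provider *p)
{
    if (nregistered >= MAX_PROVIDERS) {
        return -1;
    }
    registry[nregistered++] = p;
    return 0;
}

/* Look a provider up by name; callers can do this once and keep the pointer. */
static const cache_manager_provider *lookup_manager(const char *name)
{
    int i;

    for (i = 0; i < nregistered; i++) {
        if (!strcmp(registry[i]->name, name)) {
            return registry[i];
        }
    }
    return NULL;
}

/* A trivial in-memory manager that remembers only the last key pushed. */
static char last_key[64];

static void mem_push(const char *key)
{
    strncpy(last_key, key, sizeof(last_key) - 1);
    last_key[sizeof(last_key) - 1] = '\0';
}

static int mem_probe(const char *key)
{
    return !strcmp(last_key, key);
}

static const cache_manager_provider mem_manager = { "mem", mem_probe, mem_push };
```

The caching module only ever sees the function pointers, so a manager tied to a storage provider and a standalone one look the same from its side.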


At least, that's how I would do it...


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-03 Thread Brian Akins

Graham Leggett wrote:

Moving towards and keeping with the above goals is a far higher priority 
than simplifying the generic backend cache interface.


This response was a perfect summation of why we do *not* run the stock 
mod_cache here...



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-03 Thread Brian Akins

Graham Leggett wrote:


Seriously, please move this off list to keep the noise out of people's
inboxes.


Does this discussion belong off-list?  I would think this is the type of 
thing we need to discuss on this list.


Is there any consensus as to how to move forward?  Do we just leave it 
as it is currently?


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-03 Thread Brian Akins

Roy T. Fielding wrote:


For the record, Graham's statements were entirely correct,
Brian's suggested architecture would slow the HTTP cache,


No. It would simplify the existing implementation.  The existing 
implementation, as Graham has noted, is not fully functional.  Graham 
argues - and I'm still mulling it over - that a generic cache 
architecture would get in the way of making a fully functional http cache.


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


RFC: rename mod_cache to mod_http_cache

2006-05-03 Thread Brian Akins
Not wanting to stir the huge pot o' stuff that is going on here, but
what are the thoughts on renaming mod_cache to mod_http_cache?
mod_cache is http specific.  This would follow the general idea that
mod_proxy uses.


I am not suggesting changing any functionality at this time, simply 
renaming it to a more suitable name.


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: RFC: rename mod_cache to mod_http_cache

2006-05-03 Thread Brian Akins

William A. Rowe, Jr. wrote:

Not in 2.2 branch, but in trunk?  The issue is that it's half httpd, and
half generic.  Let me mull this over.


Can we separate out the http specific parts without violating Graham's
concerns?  My whole original idea was to just do that... I was not fully
aware of the issues in the current mod_cache.




--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Generic cache architecture

2006-05-03 Thread Brian Akins
Is anyone else interested in having a generic cache architecture?  (not
http).  I have plenty of cases where I re-invent the wheel for caching
various things (IP's, sessions, whatever, etc.).  It would be nice to
have a provider based architecture for such things.




--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Generic cache architecture

2006-05-03 Thread Brian Akins

Gonzalo Arana wrote:


I am. How about adding it to apr?


How about someone figuring out how to get providers into apr?  Doesn't 
look horribly hard.  Perhaps I should ask on apr-devel?



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-03 Thread Brian Akins

Roy T. Fielding wrote:

That is a heck of a lot easier than convincing everyone to dump
the current code based on an untested theory.


I think the idea may be a lot more tested than you think.  Most things I 
suggest have had an incubation period somewhere...



I'm fine with not screwing with current mod_cache.  I just think it
should be either: renamed or made generic.  We may or may not need a
generic mod_backend_cache.  I have posted a pseudo-implementation that
got lost in the latest thread bloat.  I can repost if anyone is interested.


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Generic cache architecture

2006-05-03 Thread Brian Akins

Roy T. Fielding wrote:
  provide this functionality once, and reuse

On the contrary, it makes no sense whatsoever to use a generic
storage facility for cached HTTP responses in a front-end cache
because those responses can only be delivered at maximum speed
through a single system call IFF they are not generic.  That is
why our front-end cache is not, and has never needed to be, a
generic cache.


A generic cache can deliver objects in a single system call.  Think
VFS.  The generic storage facility may be only a thin wrapper around
something like current mod_disk_cache or it may be a memcache frontend,
or something completely different.


Trust me, I am extremely concerned about performance.




--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-02 Thread Brian Akins

Graham Leggett wrote:

- the cache says cool, will send my copy upstream. Oops, where has my 
data gone?.





So, the cache says, okay must get content the old fashioned way (proxy, 
filesystem, magic fairies, etc.).


Where's the issue?



--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-02 Thread Brian Akins

Graham Leggett wrote:


The way HTTP caching works is a lot more complex than in your example, you
haven't taken into account conditional HTTP requests.
...


Still not sure how this is different from what we are proposing.  We
really want to separate protocol from cache stuff.  If we have a
revalidate for the generic cache it should address all your concerns.  ???





--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Possible new cache architecture

2006-05-02 Thread Brian Akins

Graham Leggett wrote:


To be HTTP compliant, and to solve thundering herd, we need the following
from a cache:



This seems more like a wish list.  I just want to separate out the cache 
and protocol stuff.




- The ability to amend a subkey (the headers) on an entry that is already
cached.


mod_http_cache should handle this.  To the new mod_cache, it's just
another key/value.



- The ability to invalidate a particular cached variant (ie headers +
data) in one atomic step, without affecting threads that hold that cached
entry open at the time.


mod_http_cache should handle this.  Keep a list of variants cached - this
should use a provider interface as well.  mod_cache would handle
whatever locking, ref counting, etc, needs to be done, if any.



- The ability to read from a cached object that is still being written to.


Nice to have.  Out of scope for what I am proposing.  The new mod_cache
should be the place to implement this if the underlying provider supports it.




- A guarantee that the result of a broken write (segfault, timeout,
connection reset by peer, whatever) will not result in a broken cached
entry (ie that the cached entry will eventually be invalidated, and all
threads trying to read from it will eventually get an error).


Agreed.  The new mod_cache should handle this.


Certainly separate the protocol from the physical cache, just make sure
the physical cache delivers the shopping list above :)


Most seem like protocol specific stuff.
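
To illustrate the "just another key/value" split from the replies above: a toy generic store with an HTTP layer on top that keeps headers and body under separate keys. Everything here (kv_set, kv_get, http_store, the "#headers"/"#body" key suffixes) is made up for the example, and it does not attempt the atomicity or read-while-write items on the list.

```c
#include <stdio.h>
#include <string.h>

#define KV_SLOTS 16
#define KV_LEN 128

/* Hypothetical generic cache: flat string keys to string values. */
struct kv { char key[KV_LEN]; char val[KV_LEN]; int used; };
static struct kv store[KV_SLOTS];

/* Insert or overwrite a key; returns -1 when the store is full. */
static int kv_set(const char *key, const char *val)
{
    int i, free_slot = -1;

    for (i = 0; i < KV_SLOTS; i++) {
        if (store[i].used && !strcmp(store[i].key, key)) {
            break;                              /* overwrite existing key */
        }
        if (!store[i].used && free_slot < 0) {
            free_slot = i;
        }
    }
    if (i == KV_SLOTS) {
        i = free_slot;
    }
    if (i < 0) {
        return -1;
    }
    snprintf(store[i].key, KV_LEN, "%s", key);
    snprintf(store[i].val, KV_LEN, "%s", val);
    store[i].used = 1;
    return 0;
}

/* Return the stored value, or NULL on a miss. */
static const char *kv_get(const char *key)
{
    int i;

    for (i = 0; i < KV_SLOTS; i++) {
        if (store[i].used && !strcmp(store[i].key, key)) {
            return store[i].val;
        }
    }
    return NULL;
}

/* HTTP layer: headers and body live under separate keys for one URL. */
static int http_store(const char *url, const char *hdrs, const char *body)
{
    char k[KV_LEN];

    snprintf(k, sizeof k, "%s#headers", url);
    if (kv_set(k, hdrs) != 0) {
        return -1;
    }
    snprintf(k, sizeof k, "%s#body", url);
    return kv_set(k, body);
}

/* Amending headers touches only the header key; the body is left alone. */
static int http_amend_headers(const char *url, const char *hdrs)
{
    char k[KV_LEN];

    snprintf(k, sizeof k, "%s#headers", url);
    return kv_set(k, hdrs);
}
```

The generic layer never needs to know that one of the values happens to be a set of HTTP headers.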


--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

