Re: Re: CVE-2013-5704 fix breaks mod_wsgi

2015-01-13 Thread Graham Dumpleton
If interested, my initial blog post about the issue in relation to mod_wsgi
is now posted at:

* http://blog.dscpl.com.au/2015/01/important-modwsgi-information-about.html

The link to that has also been posted on the mod_wsgi mailing list and
Twitter.

Graham

On 13 January 2015 at 16:34, Graham Dumpleton grah...@apache.org wrote:

  But the damage has been done for some months on 2.2, and we are noticing
 this, now?

 All distros still shipping Apache 2.2 are using older mod_wsgi 3.X
 versions, which I don't at this point believe are affected by this issue.

 People who build stuff from source code themselves would be using the latest
 Apache 2.4.

 So the big hit on mod_wsgi will come with Apache 2.4.11.

 Graham



Re: Re: Re: CVE-2013-5704 fix breaks mod_wsgi

2015-01-13 Thread Graham Dumpleton
On 14 January 2015 at 09:10, wr...@rowe-clan.net wrote:




 - Original Message -
 Subject: Re: Re: CVE-2013-5704 fix breaks mod_wsgi
 From: Graham Dumpleton grah...@apache.org
 Date: 1/12/15 11:34 pm
 To: dev@httpd.apache.org dev@httpd.apache.org

  But the damage has been done for some months on 2.2, and we are noticing
 this, now?
  All distros still shipping Apache 2.2 still are using older mod_wsgi 3.X
 versions


 Makes sense...

which I don't at this point believe are affected by this issue.

  And why not?

 https://github.com/GrahamDumpleton/mod_wsgi/blob/stable/3.X/mod_wsgi.c

 /* Create and populate our own request object. */
 apr_pool_create(&p, c->pool);
 r = apr_pcalloc(p, sizeof(request_rec));



Because the code which is doing this is not running inside of the normal
Apache child worker processes but a separate managed process that mod_wsgi
creates just to run the WSGI application.

In that separate process things are much more controlled and arbitrary
Apache modules don't get to run their hooks and handlers.

The only parts of the Apache code base that touch the new structure members
that I can find are the HTTP input filter, the proxy modules, sub requests
and logging.

In this separate managed process the proxy modules are never used, nor are
sub requests. The log functions which use the new structure members are
also not triggered, as that only occurs in the Apache child worker processes.

The HTTP input filter is used, but due to the specific way that mod_wsgi
transfers data from the Apache child process to the separate managed
process, that request content is never chunked. As a consequence
the read_chunked_trailers() function which updates the structure members is
never called.

So although the request_rec size is going to be wrong, nothing ever
attempts to read or write past the short allocation, as happens in the
case of mod_wsgi 4.4.0, and so the old versions aren't crashing when tested.

I agree this isn't ideal and users should update in case some other change
is made to Apache down the track which may change this, but right now it at
least means those older versions will not crash as mod_wsgi 4.4.0+ does.

FWIW, version 4.4.0 was only released on November 28th 2014 and so not many
people are actually likely to be using it.

Graham


Re: CVE-2013-5704 fix breaks mod_wsgi

2015-01-12 Thread Graham Dumpleton
BTW. I need to go back and check, but I actually suspect that the crash
will only occur where mod_wsgi 4.4.0 or later is being used.
It was only in 4.4.0 that content started to be passed between the Apache
child worker processes and the mod_wsgi daemon process using chunking.

The WSGI specification doesn't actually permit chunked request content, nor
the idea of Apache input filters that can change the length of the request
content without updating the Content-Length header.

I did have a WSGIChunkedRequests directive to allow you to side step the
WSGI specification and allow your application to work with these, but it
was broken for mod_wsgi daemon mode and only worked for embedded mode. The
change in 4.4.0 fixed that and at the same time had to start using chunking
so it could detect when request content had been truncated before all of it
had been received.

If this turns out to be the case, that may limit the damage a bit, as most
distros are still on ancient mod_wsgi versions and not using mod_wsgi 4.X
versions at all. I don't think they would therefore need to patch the older
versions, nor possibly even recompile them.

I will do some checking with older versions tomorrow, as well as work on
the hack that tries to infer the request_rec size to work out if the CVE
change has been back ported.

Graham

On 12 January 2015 at 23:20, Graham Dumpleton grah...@apache.org wrote:

 On 12 January 2015 at 22:27, Joe Orton jor...@redhat.com wrote:

 On Sat, Jan 10, 2015 at 09:04:12AM +1100, Graham Dumpleton wrote:
  1. Verify that recompiling mod_wsgi is actually sufficient given that my
  direct use of request_rec isn't going to populate the extra fields and they
  will remain NULL still. As trailers shouldn't be expected in the context
  where the request_rec is being used directly by mod_wsgi, those attributes
  shouldn't be touched, but if that is the case, why would it be crashing
  without recompilation happening. So I need to also actually verify whether
  it can't limp on as is for now if it isn't crashing.

 Yup, I should have mentioned that too.  You are right, we had to patch
 mod_wsgi to fix the issue properly as well:


 http://pkgs.fedoraproject.org/cgit/mod_wsgi.git/plain/mod_wsgi-4.4.3-trailers.patch?h=f21

 that can/should be surrounded with

 #if AP_MODULE_MAGIC_AT_LEAST(20120211, 37)
 ...
 #endif

 ...to make it conditional on an httpd with those fields.  (Hadn't
 submitted that back upstream yet sorry - we wanted to find a proper
 solution httpd-side for this.)


 The problem I have is that Linux distros who are back porting the change
 aren't going to be updating MODULE_MAGIC_NUMBER.

 So if I add:

 #if AP_MODULE_MAGIC_AT_LEAST(20120211, 37)
 r->trailers_in = apr_table_make(r->pool, 5);
 r->trailers_out = apr_table_make(r->pool, 5);
 #endif

 This only helps if someone is using Apache 2.4.11 or later where
 MODULE_MAGIC_NUMBER has been updated.

 Someone who takes the latest mod_wsgi code with the above change and
 compiles it against an Apache with the back ported change will still find that
 mod_wsgi will crash as that code will never be compiled in.

 In short, I can't see that I have any way of detecting that an Apache
 instance which has back ported the change is being used at compile time.

 The only hack I can think of is that, where
 AP_MODULE_MAGIC_AT_LEAST(20120211, 37) doesn't succeed, I try to calculate
 whether the 'useragent_ip' member of the structure is still likely to be
 the last thing that can fit in the amount of memory allocated for the
 request_rec. If it isn't, and the following memory is enough to hold the
 trailers_in and trailers_out pointers, I somehow work out where they
 would fall and set them. This could be fiddly because of struct packing
 issues.

 I know it is asking a lot and likely too late, but what would have helped
 immensely in such cases where back ports occur which change structure sizes
 is that a #define was added within the struct where the new members were
 added.

 char *useragent_ip;

 #define CVE_2013_5704 1

 /** MIME trailer environment from the request */
 apr_table_t *trailers_in;
 /** MIME trailer environment from the response */
 apr_table_t *trailers_out;
 };

 I know this goes against the MODULE_MAGIC_NUMBER idea, but the magic
 number doesn't help with back ported changes like this.

 With such #define's then I could have had:

 #if AP_MODULE_MAGIC_AT_LEAST(20120211, 37) || defined(CVE_2013_5704)
 r->trailers_in = apr_table_make(r->pool, 5);
 r->trailers_out = apr_table_make(r->pool, 5);
 #endif

 So right now it looks like I have to use the rather fragile approach of
 trying to work out whether the size of request_rec is now larger, without
 actually being able to access the members, which if they didn't exist would
 cause a compile time failure.

 Graham





Re: CVE-2013-5704 fix breaks mod_wsgi

2015-01-12 Thread Graham Dumpleton
On 12 January 2015 at 22:27, Joe Orton jor...@redhat.com wrote:

 On Sat, Jan 10, 2015 at 09:04:12AM +1100, Graham Dumpleton wrote:
  1. Verify that recompiling mod_wsgi is actually sufficient given that my
  direct use of request_rec isn't going to populate the extra fields and they
  will remain NULL still. As trailers shouldn't be expected in the context
  where the request_rec is being used directly by mod_wsgi, those attributes
  shouldn't be touched, but if that is the case, why would it be crashing
  without recompilation happening. So I need to also actually verify whether
  it can't limp on as is for now if it isn't crashing.

 Yup, I should have mentioned that too.  You are right, we had to patch
 mod_wsgi to fix the issue properly as well:


 http://pkgs.fedoraproject.org/cgit/mod_wsgi.git/plain/mod_wsgi-4.4.3-trailers.patch?h=f21

 that can/should be surrounded with

 #if AP_MODULE_MAGIC_AT_LEAST(20120211, 37)
 ...
 #endif

 ...to make it conditional on an httpd with those fields.  (Hadn't
 submitted that back upstream yet sorry - we wanted to find a proper
 solution httpd-side for this.)


The problem I have is that Linux distros who are back porting the change
aren't going to be updating MODULE_MAGIC_NUMBER.

So if I add:

#if AP_MODULE_MAGIC_AT_LEAST(20120211, 37)
r->trailers_in = apr_table_make(r->pool, 5);
r->trailers_out = apr_table_make(r->pool, 5);
#endif

This only helps if someone is using Apache 2.4.11 or later where
MODULE_MAGIC_NUMBER has been updated.

Someone who takes the latest mod_wsgi code with the above change and
compiles it against an Apache with the back ported change will still find that
mod_wsgi will crash as that code will never be compiled in.

In short, I can't see that I have any way of detecting that an Apache
instance which has back ported the change is being used at compile time.

The only hack I can think of is that, where
AP_MODULE_MAGIC_AT_LEAST(20120211, 37) doesn't succeed, I try to calculate
whether the 'useragent_ip' member of the structure is still likely to be
the last thing that can fit in the amount of memory allocated for the
request_rec. If it isn't, and the following memory is enough to hold the
trailers_in and trailers_out pointers, I somehow work out where they
would fall and set them. This could be fiddly because of struct packing
issues.
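
To make that concrete, a rough sketch of the sort of probe being described
might look like the following. This is purely illustrative and not released
code: it assumes offsetof() from stddef.h is available, that 'r' is the
freshly allocated request_rec, and, as noted, struct padding makes the size
comparison a heuristic at best.

/* Hypothetical probe for a back ported CVE-2013-5704 fix whose headers
 * gained trailers_in/trailers_out without a MODULE_MAGIC_NUMBER bump.
 * If request_rec extends at least two pointers past useragent_ip, assume
 * the trailer tables are present and initialise them through that extra
 * space rather than by name. */
size_t old_end = offsetof(request_rec, useragent_ip) + sizeof(char *);

if (sizeof(request_rec) >= old_end + 2 * sizeof(apr_table_t *)) {
    apr_table_t **trailers = (apr_table_t **)((char *)r + old_end);
    trailers[0] = apr_table_make(r->pool, 5);    /* trailers_in */
    trailers[1] = apr_table_make(r->pool, 5);    /* trailers_out */
}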

I know it is asking a lot and likely too late, but what would have helped
immensely in such cases, where back ports occur which change structure sizes,
is if a #define were added within the struct where the new members were
added.

char *useragent_ip;

#define CVE_2013_5704 1

/** MIME trailer environment from the request */
apr_table_t *trailers_in;
/** MIME trailer environment from the response */
apr_table_t *trailers_out;
};

I know this goes against the MODULE_MAGIC_NUMBER idea, but the magic number
doesn't help with back ported changes like this.

With such #define's then I could have had:

#if AP_MODULE_MAGIC_AT_LEAST(20120211, 37) || defined(CVE_2013_5704)
r->trailers_in = apr_table_make(r->pool, 5);
r->trailers_out = apr_table_make(r->pool, 5);
#endif

So right now it looks like I have to use the rather fragile approach of
trying to work out whether the size of request_rec is now larger, without
actually being able to access the members, which if they didn't exist would
cause a compile time failure.

Graham


Re: Re: CVE-2013-5704 fix breaks mod_wsgi

2015-01-12 Thread Graham Dumpleton
 But the damage has been done for some months on 2.2, and we are noticing
this, now?

All distros still shipping Apache 2.2 are using older mod_wsgi 3.X
versions, which I don't at this point believe are affected by this issue.

People who build stuff from source code themselves would be using the latest
Apache 2.4.

So the big hit on mod_wsgi will come with Apache 2.4.11.

Graham


Re: CVE-2013-5704 fix breaks mod_wsgi

2015-01-09 Thread Graham Dumpleton
FWIW, there is potentially another issue for mod_wsgi coming up as well.
It seems that I was using an APR function which was tagged as internal, and
in APR trunk the header file that function is defined in is no longer
installed, or at least not when built within srclib of httpd, so mod_wsgi
will no longer build against APR trunk. Luckily this function was only used
in some old dead code for Apache 1.3, support for which has since been
dropped, so I can remove the dependency. It will still require an update to
mod_wsgi to ensure that it can compile with the latest APR. I am not sure
if there is an intention to update APR soon as well as httpd.

Graham

On 10 January 2015 at 09:04, Graham Dumpleton grah...@apache.org wrote:

 Thanks for the heads up and I appreciate very much the steps you are
 taking to limit possible effects.

 What I will do is the following:

 1. Verify that recompiling mod_wsgi is actually sufficient given that my
 direct use of request_rec isn't going to populate the extra fields and they
 will remain NULL still. As trailers shouldn't be expected in the context
 where the request_rec is being used directly by mod_wsgi, those attributes
 shouldn't be touched, but if that is the case, why would it be crashing
 without recompilation happening. So I need to also actually verify whether
 it can't limp on as is for now if it isn't crashing.

 2. Publicise upcoming problem on my blog, mod_wsgi mailing list and doc
 sites and repo.

 3. As a hack to try and ease the transition for anyone compiling from
 source code themselves, make a quick patch release of mod_wsgi which pads
 the size of request_rec when it is allocated, with at least 2 * size of a
 pointer. This way people can recompile this patched mod_wsgi now in advance
 and they shouldn't have an issue when the httpd binaries themselves are
 updated.

 4. Work on more permanent solution. The possibility of API functions for
 creating the structures has been suggested but not ideal as it is catering
 for an obscure case where mod_wsgi may be the only transgressor. I have
 contemplated doing away with using the request_rec in the mod_wsgi daemon
 mode, but it was attractive for a few reasons. I will need to reassess how
 much I do need it and whether I can eliminate it and find other ways to do
 the things I was dependent on it for. One of the main things from memory
 was actually related to logging, so it may be possible to do away with it.

 Thanks again for giving consideration for the problem I have caused.

 Graham


 On 10 January 2015 at 07:55, Ruediger Pluem rpl...@apache.org wrote:



 On 01/09/2015 09:48 PM, Jeff Trawick wrote:
  On Fri, Jan 9, 2015 at 3:23 PM, Joe Orton jor...@redhat.com wrote:
 
  Since Jim is talking 2.4.11, I should report this now.  We
 discovered
  this week in Fedora: mod_wsgi does some interesting things in daemon
  mode, notably that it allocates a request_rec internally which ends
 up
  getting used by httpd.
 
  Reason is, the fix for CVE-2013-5704 extends the request_rec:
 
  http://svn.apache.org/r1619884
 
  A mod_wsgi built against <= 2.4.10 will allocate a request_rec using the
  old, smaller wrong size, and hence, if such a build is used with >=
  2.4.11, it passes in the wrong-sized request_rec and that breaks later
  when httpd tries to access r->trailers_*.
 
  It's one of those fuzzy boundaries in the API, you can argue
 mod_wsgi is
  wrong, but, I could argue it back; the struct *is* public, not got a
  strong opinion on this personally.
 
  Either way, the fix for CVE-2013-5704 ends up breaking backwards
  compatibility with existing 2.4.x builds of mod_wsgi, which is kind
 of
  Bad.  I don't have a good proposal for how to fix or avoid this.
 Worst
  case, we make clear the mod_wsgi case is API/ABI abuse and warn
 binary
  distributors they have to handle this by rebuilding.
 
  Regards, Joe
 
 
  * One-time only: Make clear in announcement that mod_wsgi has to be
 rebuilt.
  * Add helper functions to allocate a request_rec, conn_rec,
 server_rec.  It doesn't solve all possible problems of
  course but can drastically reduce the frequency of needing to recompile
 a module that needs to do such things.
  * Module authors who allocate structures generally created by httpd own
  the monitoring and announcement, or should just document "You must
  recompile this module every time you update httpd."
 

 +1

 Regards

 Rüdiger





Re: CVE-2013-5704 fix breaks mod_wsgi

2015-01-09 Thread Graham Dumpleton
Okay, I screwed up that analysis a bit. It is APR 1.X to 2.X which is the
issue and I can fix by having:

#if APR_MAJOR_VERSION < 2
#include "apr_support.h"
#endif

The specific code was:

#if APR_MAJOR_VERSION < 2
rv = apr_wait_for_io_or_timeout(NULL, sock, 0);
#else
rv = apr_socket_wait(sock, APR_WAIT_WRITE);
#endif

Either way, a minor tweak to mod_wsgi code.

Graham


On 10 January 2015 at 14:28, Graham Dumpleton grah...@apache.org wrote:

 FWIW, there is potentially another issue for mod_wsgi coming up as well.
 Seems that I was using an APR function which was tagged as internal and in
 trunk of APR the header file that function is defined in is no longer
 installed, or at least when within srclib of httpd, thus mod_wsgi will no
 longer build against APR trunk. Luckily this function was only used in some
 old dead code for Apache 1.3 and support for it has been dropped, so I can
 remove the dependency. Will still require an update to mod_wsgi to ensure
 that it can compile with latest APR. I am not sure if there is an intention
 to update APR soon as well as httpd.

 Graham

 On 10 January 2015 at 09:04, Graham Dumpleton grah...@apache.org wrote:

 Thanks for the heads up and I appreciate very much the steps you are
 taking to limit possible effects.

 What I will do is the following:

 1. Verify that recompiling mod_wsgi is actually sufficient given that my
 direct use of request_rec isn't going to populate the extra fields and they
 will remain NULL still. As trailers shouldn't be expected in the context
 where the request_rec is being used directly by mod_wsgi, those attributes
 shouldn't be touched, but if that is the case, why would it be crashing
 without recompilation happening. So I need to also actually verify whether
 it can't limp on as is for now if it isn't crashing.

 2. Publicise upcoming problem on my blog, mod_wsgi mailing list and doc
 sites and repo.

 3. As a hack to try and ease the transition for anyone compiling from
 source code themselves, make a quick patch release of mod_wsgi which pads
 the size of request_rec when it is allocated, with at least 2 * size of a
 pointer. This way people can recompile this patched mod_wsgi now in advance
 and they shouldn't have an issue when the httpd binaries themselves are
 updated.

 4. Work on more permanent solution. The possibility of API functions for
 creating the structures has been suggested but not ideal as it is catering
 for an obscure case where mod_wsgi may be the only transgressor. I have
 contemplated doing away with using the request_rec in the mod_wsgi daemon
 mode, but it was attractive for a few reasons. I will need to reassess how
 much I do need it and whether I can eliminate it and find other ways to do
 the things I was dependent on it for. One of the main things from memory
 was actually related to logging, so it may be possible to do away with it.

 Thanks again for giving consideration for the problem I have caused.

 Graham


 On 10 January 2015 at 07:55, Ruediger Pluem rpl...@apache.org wrote:



 On 01/09/2015 09:48 PM, Jeff Trawick wrote:
  On Fri, Jan 9, 2015 at 3:23 PM, Joe Orton jor...@redhat.com wrote:
 
  Since Jim is talking 2.4.11, I should report this now.  We
 discovered
  this week in Fedora: mod_wsgi does some interesting things in
 daemon
  mode, notably that it allocates a request_rec internally which
 ends up
  getting used by httpd.
 
  Reason is, the fix for CVE-2013-5704 extends the request_rec:
 
  http://svn.apache.org/r1619884
 
  A mod_wsgi built against <= 2.4.10 will allocate a request_rec using the
  old, smaller wrong size, and hence, if such a build is used with >=
  2.4.11, it passes in the wrong-sized request_rec and that breaks later
  when httpd tries to access r->trailers_*.
 
  It's one of those fuzzy boundaries in the API, you can argue
 mod_wsgi is
  wrong, but, I could argue it back; the struct *is* public, not got
 a
  strong opinion on this personally.
 
  Either way, the fix for CVE-2013-5704 ends up breaking backwards
  compatibility with existing 2.4.x builds of mod_wsgi, which is
 kind of
  Bad.  I don't have a good proposal for how to fix or avoid this.
 Worst
  case, we make clear the mod_wsgi case is API/ABI abuse and warn
 binary
  distributors they have to handle this by rebuilding.
 
  Regards, Joe
 
 
  * One-time only: Make clear in announcement that mod_wsgi has to be
 rebuilt.
  * Add helper functions to allocate a request_rec, conn_rec,
 server_rec.  It doesn't solve all possible problems of
  course but can drastically reduce the frequency of needing to
 recompile a module that needs to do such things.
  * Module authors who allocate structures generally created by httpd own
  the monitoring and announcement, or should just document "You must
  recompile this module every time you update httpd."
 

 +1

 Regards

 Rüdiger






Re: CVE-2013-5704 fix breaks mod_wsgi

2015-01-09 Thread Graham Dumpleton
Thanks for the heads up and I appreciate very much the steps you are taking
to limit possible effects.

What I will do is the following:

1. Verify that recompiling mod_wsgi is actually sufficient given that my
direct use of request_rec isn't going to populate the extra fields and they
will remain NULL still. As trailers shouldn't be expected in the context
where the request_rec is being used directly by mod_wsgi, those attributes
shouldn't be touched, but if that is the case, why would it be crashing
without recompilation happening. So I need to also actually verify whether
it can't limp on as is for now if it isn't crashing.

2. Publicise upcoming problem on my blog, mod_wsgi mailing list and doc
sites and repo.

3. As a hack to try and ease the transition for anyone compiling from
source code themselves, make a quick patch release of mod_wsgi which pads
the size of request_rec when it is allocated by at least twice the size of
a pointer (a rough sketch of the idea is shown after this list). This way
people can recompile this patched mod_wsgi now in advance and they
shouldn't have an issue when the httpd binaries themselves are updated.

4. Work on a more permanent solution. The possibility of API functions for
creating the structures has been suggested, but that is not ideal as it is
catering for an obscure case where mod_wsgi may be the only transgressor. I
have contemplated doing away with using the request_rec in mod_wsgi daemon
mode, but it was attractive for a few reasons. I will need to reassess how
much I do need it and whether I can eliminate it and find other ways to do
the things I was dependent on it for. One of the main things from memory
was actually related to logging, so it may be possible to do away with it.
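
As a rough sketch of what point 3 amounts to (purely illustrative, not the
actual released patch; the pool and connection variables mirror the fragment
quoted elsewhere in the thread):

/* Over-allocate the request_rec so an httpd carrying the CVE-2013-5704
 * fix can safely touch the extra trailer fields, even though this module
 * was compiled against the older, smaller structure definition. Nothing
 * else about how the request object is populated changes. */
apr_pool_create(&p, c->pool);
r = apr_pcalloc(p, sizeof(request_rec) + 2 * sizeof(apr_table_t *));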

Thanks again for giving consideration to the problem I have caused.

Graham


On 10 January 2015 at 07:55, Ruediger Pluem rpl...@apache.org wrote:



 On 01/09/2015 09:48 PM, Jeff Trawick wrote:
  On Fri, Jan 9, 2015 at 3:23 PM, Joe Orton jor...@redhat.com wrote:
 
  Since Jim is talking 2.4.11, I should report this now.  We discovered
  this week in Fedora: mod_wsgi does some interesting things in daemon
  mode, notably that it allocates a request_rec internally which ends
 up
  getting used by httpd.
 
  Reason is, the fix for CVE-2013-5704 extends the request_rec:
 
  http://svn.apache.org/r1619884
 
  A mod_wsgi built against <= 2.4.10 will allocate a request_rec using the
  old, smaller wrong size, and hence, if such a build is used with >=
  2.4.11, it passes in the wrong-sized request_rec and that breaks later
  when httpd tries to access r->trailers_*.
 
  It's one of those fuzzy boundaries in the API, you can argue
 mod_wsgi is
  wrong, but, I could argue it back; the struct *is* public, not got a
  strong opinion on this personally.
 
  Either way, the fix for CVE-2013-5704 ends up breaking backwards
  compatibility with existing 2.4.x builds of mod_wsgi, which is kind
 of
  Bad.  I don't have a good proposal for how to fix or avoid this.
 Worst
  case, we make clear the mod_wsgi case is API/ABI abuse and warn
 binary
  distributors they have to handle this by rebuilding.
 
  Regards, Joe
 
 
  * One-time only: Make clear in announcement that mod_wsgi has to be
 rebuilt.
  * Add helper functions to allocate a request_rec, conn_rec, server_rec.
 It doesn't solve all possible problems of
  course but can drastically reduce the frequency of needing to recompile
 a module that needs to do such things.
  * Module authors who allocate structures generally created by httpd own
  the monitoring and announcement, or should just document "You must
  recompile this module every time you update httpd."
 

 +1

 Regards

 Rüdiger




Re: mod_fcgid kill all subprocesses in reload

2014-12-25 Thread Graham Dumpleton
Sounds like it would perhaps be for the same reason as mod_wsgi has issues
with that sort of thing.

Only Apache child worker processes get special dispensation as far as
graceful shutdowns or reloads are concerned. If instead a module creates
additional processes using the other child API calls in APR:

http://apr.apache.org/docs/apr/1.4/group__apr__thread__proc.html#gaf8d2be452a819161aa4cd6205a17761e

then when a shutdown or restart occurs they get a hard 3 seconds to shut
down or they will be killed with SIGKILL. For up to those 3 seconds they
will be sent a normal SIGTERM each second to try and get them to exit.

Thus there is no facility, that I know of, to allow them to linger longer,
unless something has been added in more recent times.
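
For what it's worth, the registration involved typically looks something
like the fragment below. It is only a sketch: it assumes 'proc' was just
created with apr_proc_create() and that 'my_maintenance' is a callback the
module provides; the hard kill behaviour on shutdown is what the paragraph
above describes.

/* Note the subprocess against a pool so it is cleaned up when the pool is
 * destroyed (for this policy: SIGTERM, a short wait, then SIGKILL), and
 * register it as an "other child" so the module receives maintenance
 * callbacks (death, restart and so on) for it. */
apr_pool_note_subprocess(p, &proc, APR_KILL_AFTER_TIMEOUT);
apr_proc_other_child_register(&proc, my_maintenance, &proc, NULL, p);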

Graham


On 26 December 2014 at 16:57, Stefan Priebe - Profihost AG 
s.pri...@profihost.ag wrote:

 Hi list,

 I like mod_fcgid a lot but there's one bug which makes me crazy.

 On DSO unload (Apache reload) all children get killed no matter whether they
 are processing requests or not. This makes no sense to me; httpd processes
 themselves are also kept until all requests are served.

 Stefan

 Excuse my typo sent from my mobile phone.



Re: commercial support

2014-11-23 Thread Graham Dumpleton
On 24 November 2014 at 04:59, Jeff Trawick traw...@gmail.com wrote:


 If you're doing Python web apps it would be cool to pip install httpd
 FRAMEWORK-httpd-wiring and have a command that wires it up based on
 framework settings and a bit of other declarative configuration.  (similar
 for other ecosystems with a packaging/build infrastructure)  mod_wsgi
 actually has a version in PyPI that works like this, although it doesn't
 bring httpd with it.


Downloading and compiling the whole of httpd as a side effect of doing a
Python pip install isn't really practical. The process would just take too
long for a start, plus it doesn't solve the problem that many systems will
not have the dependencies installed in order to compile it. You don't want
to have to also be separately downloading and compiling APR, APR-UTIL and
PCRE now that they aren't bundled with the Apache source code.

I have tried going that path, albeit not triggered by pip, in trying to
create a build pack for Heroku which could be used to bring mod_wsgi to
that PaaS and it was a right pain, especially since the resulting size of
all the compiled components would chew up a significant part of the image
slug allowance that Heroku gave you. In the end I gave up on it because it
was so customised and unsupported by Heroku that no one would be likely to
use it.

So the pip installable mod_wsgi does at least rely on you having the
httpd and httpd-dev packages installed, plus any dependencies for those.

This still doesn't help with PaaS services which have such a narrow view of
what they want to allow you to do. For example, Heroku will not provide the
httpd and httpd-dev packages in the operating system image they use, which
would allow people to run it using their own custom configurations and
compile and use their own Apache modules. It even took me a couple of years
at least to get Heroku to update their Python installations so they provided
shared libraries, and so allow any sort of dynamically loaded embedded
system, such as the mod_wsgi module inside of Apache, to be able to use their
Python installations. Before that I had to also compile the Python
source code from scratch as well.

Heroku isn't the only PaaS which has gone down a path that makes it near on
impossible to use them with Apache and a customised setup. OpenShift does
actually provide an Apache/Python/mod_wsgi cartridge, but they hardwire the
Apache configuration and you cannot change it. The particular configuration
actually has various problems in the way it is done and so provides a sub
optimal experience. They also use the very old mod_wsgi version which the
RHEL version they use ships. Even if you could get around that, you can't
change the Apache configuration or even the startup command, and it isn't
even possible to build an Apache module from scratch as they don't install
the httpd-dev package for RHEL.

The only PaaS where I could do what I want and use the pip installable
mod_wsgi was dotcloud. This was because it was what became docker, and so
allowed a user to install the missing httpd-dev package in their own space,
making it possible to then actually compile custom Apache modules.

So for me and turning around the rapid decline in mod_wsgi usage caused by
the narrow options most PaaS providers give you, docker is definitely the
way forward.

The idea of a pip installable mod_wsgi is therefore twofold.

The first is to work around the fact that Linux distributions ship very out
of date versions of packages. Most Linux distributions are over a dozen
releases behind on mod_wsgi.

The second is that the pip installable mod_wsgi does more than just compile
the mod_wsgi Apache module. It also installs a script called
mod_wsgi-express that automatically generates an Apache configuration for
you which is setup properly for mod_wsgi. This is what Jeff is alluding to
in saying 'a command that wires it up based on framework settings and a bit
of other declarative configuration'.

This solves another serious problem that mod_wsgi has had over the years:
the default Apache configuration isn't particularly appropriate. This is
especially the case for prefork MPM, where Python code is run in embedded
mode inside of the Apache child worker process rather than in mod_wsgi
daemon mode, whereby the Python code runs in separate processes. This isn't
aided by what I would argue is a somewhat flawed child worker dynamic
scaling algorithm in Apache which causes too much process churn, negatively
affecting embedded systems which have a large startup cost.

So what mod_wsgi-express does is provide a turnkey solution for setting up
Apache with Python as a form of appliance which is going to suit the
majority of cases where users are just running a single Python web
application. I can take all the knowledge I have accumulated over the years
as to what is the best way of setting up Apache for Python web applications
to avoid problems and distil that into a custom streamlined Apache

Re: MAJOR SECURITY-PROBLEM Apache 2.4.6

2014-10-21 Thread Graham Dumpleton
On 22 October 2014 13:51, Yehuda Katz yeh...@ymkatz.net wrote:

 On Wed, Oct 1, 2014 at 2:19 PM, Eric Covener cove...@gmail.com wrote:


 On Wed, Oct 1, 2014 at 2:16 PM, Eric Covener cove...@gmail.com wrote:

 To me, this does not exonerate mod_php, it implicates it.  I suspect
 your source code is served because PHP swallowed the LimitRequestBody​ and
 then passed control back to Apache.  I'm fairly certain I responded to you
 privately with similar information already.


 ​I should add that I don't understand your scenario completely, where the
 file is not processed.​ I think my own test result was the same as Yehuda
 ITT which is not the same as what I just described with the default handler
 taking over.


 1. Is this result (PHP executed) still a bug (could be in mod_php)? If a
 413 comes up, shouldn't no other content be returned?
 I am considering setting up a new VM to do some testing, but I want to
 make sure this is not the expected behavior (whether the PHP is executed or
 not).

 2. Is there another module that hooks in with a similar way to mod_php
 that might also show this behavior (mod_lua for example)?


 FWIW, I noted similar behaviour in implementing mod_wsgi many years ago.
Since then I have code in mod_wsgi in the handler before anything is done
which specifically does:

/*
 * Check to see if the request content is too large if the
 * Content-Length header is defined then end the request here. We do
 * this as otherwise it will not be done until first time input data
 * is read in by the application. Problem is that underlying HTTP
 * output filter will also generate a 413 response and the error
 * raised from the application will be appended to that. The call to
 * ap_discard_request_body() is hopefully enough to trigger sending
 * of the 413 response by the HTTP filter.
 */

lenp = apr_table_get(r->headers_in, "Content-Length");

if (lenp) {
    char *endstr;
    apr_off_t length;

    if (wsgi_strtoff(&length, lenp, &endstr, 10)
            || *endstr || length < 0) {

        wsgi_log_script_error(r, apr_psprintf(r->pool,
                "Invalid Content-Length header value of '%s' was "
                "supplied.", lenp), r->filename);

        return HTTP_BAD_REQUEST;
    }

    limit = ap_get_limit_req_body(r);

    if (limit && limit < length) {
        ap_discard_request_body(r);
        return OK;
    }
}

So in the case of mod_wsgi it wasn't that the source code was being
appended, but that any error response from the hosted Python WSGI
application, generated in reaction to the reading of the request content
failing because of the length check by the input filter, got appended
to the end of the 413 error response that the HTTP filter had already
caused to be delivered back.

Graham


Re: [RFC] CGIPassHeader Authorization|Proxy-Authorization|...

2014-08-18 Thread Graham Dumpleton
The problem is sys admins who don't know what they are doing as far as
administering Apache.

I used to work in a corporate environment where they allowed everyone a
~username directory for placing stuff. As they wanted to allow people to
set up certain types of scripts in their directory, they allowed .htaccess
files.

Only thing wrong with that is that they also required you to authenticate
using HTTP basic authentication with your own credentials to access anyone's
~username directory.

Using any of the backdoors you could harvest credentials by getting others
in the company to go to your personal ~username area.

So in some respects if a directive is going to be provided, it has to be
something that can be consulted by any module, and all core modules which
have such backdoor ways of getting at the authorisation header should be
changed to block access if the directive hasn't been set to allow access to
that header.

If this were possible then I would certainly change mod_wsgi to be gated by
the directive as an extra step, as I can't say that I am happy that I
allowed myself to be pushed into allowing the mod_wsgi option to be enabled
in .htaccess files.

Graham



On 19 August 2014 06:29, André Malo n...@perlig.de wrote:

 Hi,

 only short notes from me. I'd appreciate such a directive very much. I
 think, allowing it in .htaccess won't hurt. I can't come up with a use
 case, where the person behind the script doesn't have access to the
 credentials anyway.

 As for the passing right now, you don't need the whole mod_rewrite
 machinery
 for this:

 SetEnvIf Authorization (.+) HTTP_AUTHORIZATION=$1

 that's, what I've been using so far :)

 nd

 * Graham Dumpleton wrote:

  A few comments on this.
 
  The first is that mod_wsgi originally never allowed its
  WSGIPassAuthorization directive in a htaccess file, and then when it
  did first allow it, it was only honoured if AuthConfig was allowed for
  that context.
 
  I kept having people who needed that ability when they had a htaccess
  file, but didn't have AuthConfig.
 
  One of the things that was pointed out was that if you have htaccess
  enabled, and mod_rewrite was being loaded into Apache, you could get
  access to the Authorization header anyway.
 
  RewriteEngine on
  RewriteBase /
  RewriteCond %{HTTP:Authorization}  ^(.*)
  RewriteRule ^(.*)$ $1 [e=HTTP_AUTHORIZATION:%1]
 
  In the end, since mod_wsgi arguably shouldn't ever be used in shared
  environments anyway, I ended up caving in and allowing
  WSGIPassAuthorization in htaccess to make it convenient for the more
  typical scenario that kept coming up.
 
  Now, having the ability within mod_wsgi for a web application to handle
  authorisation showed up a couple of problems in the
  ap_scan_script_header_err_core()
  function. This arose because the response data, when it comes back from a
  mod_wsgi daemon process, just uses the CGI way of doing things and so
  used that function.
 
  The first problem was that when WWW-Authenticate headers came back in a
  response, the ap_scan_script_header_err_core() function would merge the
  values of multiple instances of such headers with a comma in between the
  values. As a result, a single header would get returned back to the HTTP
  client. As with Set-Cookie header, this would cause problems with some
  HTTP clients and they would fail due to the merging. More recent versions
  of mod_wsgi therefore no longer use ap_scan_script_header_err_core() and
  have had to duplicate what it does so as to prevent merging of
  WWW-Authenticate headers.
 
  The second problem was the size limitation on the values of the headers
  coming back from a CGI script. As is, the complete header name and value
  must fit within MAX_STRING_LEN (8192). If the size didn't fit under that,
  you would from memory get the cryptic error message of 'Premature end of
  script header'. In recent times, applications such as OpenStack Keystone
  have been generating values for the WWW-Authenticate header which are
  larger than MAX_STRING_LEN and so they were failing when used with
  mod_wsgi because of the use of ap_scan_script_header_err_core(). In
  recent versions of mod_wsgi, the buffer size used to read the header name
  and value defaults to 8192, but can be overridden through a configuration
  option to allow a larger value to come back.
 
  So irrespective of where you are going to allow this CGIPassHeader
  directive, you might want to look at these other two issues, and if not
  both, certainly the issue of WWW-Authenticate being merged as it will
  cause issues for some browsers if someone ends up passing back multiple
  WWW-Authenticate headers, as I am told an application can if it supports
  a choice of authentication schemes.
 
  Graham
 
  On 17 August 2014 06:16, Jeff Trawick traw...@gmail.com wrote:
   This core directive would be used to modify the processing of
   ap_add_common_vars() to pass through Authorization or
   Proxy-Authorization

Re: [RFC] CGIPassHeader Authorization|Proxy-Authorization|...

2014-08-16 Thread Graham Dumpleton
A few comments on this.

The first is that mod_wsgi originally never allowed its
WSGIPassAuthorization directive in a htaccess file, and then when it did
first allow it, it was only honoured if AuthConfig was allowed for that
context.

I kept having people who needed that ability when they had a htaccess file,
but didn't have AuthConfig.

One of the things that was pointed out was that if you have htaccess
enabled, and mod_rewrite was being loaded into Apache, you could get access
to the Authorization header anyway.

RewriteEngine on
RewriteBase /
RewriteCond %{HTTP:Authorization}  ^(.*)
RewriteRule ^(.*)$ $1 [e=HTTP_AUTHORIZATION:%1]

In the end, since mod_wsgi arguably shouldn't ever be used in shared
environments anyway, I ended up caving in and allowing
WSGIPassAuthorization in htaccess to make it convenient for the more
typical scenario that kept coming up.

Now, having the ability within mod_wsgi for a web application to handle
authorisation showed up a couple of problems in the
ap_scan_script_header_err_core()
function. This arose because the response data, when it comes back from a
mod_wsgi daemon process, just uses the CGI way of doing things and so used
that function.

The first problem was that when WWW-Authenticate headers came back in a
response, the ap_scan_script_header_err_core() function would merge the
values of multiple instances of such headers with a comma in between the
values. As a result, a single header would get returned back to the HTTP
client. As with Set-Cookie header, this would cause problems with some HTTP
clients and they would fail due to the merging. More recent versions of
mod_wsgi therefore no longer use ap_scan_script_header_err_core() and have
had to duplicate what it does so as to prevent merging of WWW-Authenticate
headers.
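
Purely as an illustration of the difference (this is not mod_wsgi or
util_script code), the APR table call chosen is what decides whether
repeated values are folded into one header field or sent separately:

/* apr_table_mergen() folds repeated values for a key into a single entry
 * joined with ", ", which is the merging behaviour described above and
 * which some clients cannot parse for WWW-Authenticate. */
apr_table_mergen(r->err_headers_out, "WWW-Authenticate", "Basic realm=\"example\"");
apr_table_mergen(r->err_headers_out, "WWW-Authenticate", "Negotiate");
/* Client receives: WWW-Authenticate: Basic realm="example", Negotiate */

/* apr_table_addn() keeps each value as a distinct header field instead. */
apr_table_addn(r->err_headers_out, "WWW-Authenticate", "Basic realm=\"example\"");
apr_table_addn(r->err_headers_out, "WWW-Authenticate", "Negotiate");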

The second problem was the size limitation on the values of the headers
coming back from a CGI script. As is, the complete header name and value
must fit within MAX_STRING_LEN (8192). If the size didn't fit under that,
you would from memory get the cryptic error message of 'Premature end of
script header'. In recent times, applications such as OpenStack Keystone
have been generating values for the WWW-Authenticate header which are
larger than MAX_STRING_LEN and so they were failing when used with mod_wsgi
because of the use of ap_scan_script_header_err_core(). In recent versions
of mod_wsgi, the buffer size used to read the header name and value
defaults to 8192, but can be overridden through a configuration option to
allow a larger value to come back.

So irrespective of where you are going to allow this CGIPassHeader
directive, you might want to look at these other two issues, and if not
both, certainly the issue of WWW-Authenticate being merged as it will cause
issues for some browsers if someone ends up passing back multiple
WWW-Authenticate headers, as I am told an application can if it supports a
choice of authentication schemes.

Graham



On 17 August 2014 06:16, Jeff Trawick traw...@gmail.com wrote:

 This core directive would be used to modify the processing of
 ap_add_common_vars() to pass through Authorization or Proxy-Authorization
 as HTTP_foo.  (Nothing else is currently blocked, so any other header name
 wouldn't make sense.)

 This directive would be configurable at the directory level, but not in
 htaccess.

 Various mods (mod_fastcgi, mod_fcgid, mod_wsgi, etc.) have ways to pass
 this information through; bug 56855 has a patch to add it to mod_proxy_fcgi
 too.  With that patch in place, at least mod_proxy_scgi in our tree still
 couldn't front an app that wants to handle Basic auth.  It would be good to
 consolidate over time the code/documentation around suppressing
 *Authorization.

 Some concerns: Processing it in ap_add_common_vars() is not finely scoped
 to natural users of the data; e.g., mod_include and mod_ext_filter would
 see it.  At the same time, not allowing it in htaccess may negate its
 usefulness in some environments.

 Thoughts?

 --
 Born in Roswell... married an alien...
 http://emptyhammock.com/




Re: Apache2 crashes with segmentation fault

2014-07-17 Thread Graham Dumpleton
Since you don't say what version of mod_wsgi you are using, or what version
of Apache, the only other thing I can suggest right now is to ensure
that you are using the latest mod_wsgi version.

The latest version of mod_wsgi is version 4.2.6. Pretty well all Linux
distributions are still shipping version 3.3.

Older versions may exhibit segmentation faults with Apache 2.4 in some
situations, although this is generally only on process shutdown, and I have
never heard of mod_wsgi itself causing a hang when using Apache 2.4,
although lxml is very much known to in some cases.

Either way, please take this discussion now to the mod_wsgi mailing list as
described in the mod_wsgi documentation.

http://code.google.com/p/modwsgi/wiki/WhereToGetHelp?tm=6#Asking_Your_Questions

When you post to the mod_wsgi mailing list, please provide proper Apache
and mod_wsgi version details as described in:

http://code.google.com/p/modwsgi/wiki/CheckingYourInstallation#Apache_Build_Information
http://code.google.com/p/modwsgi/wiki/CheckingYourInstallation#Apache_Modules_Loaded

As well as results of verifying Python installation in use:

http://code.google.com/p/modwsgi/wiki/CheckingYourInstallation#Python_Shared_Library
http://code.google.com/p/modwsgi/wiki/CheckingYourInstallation#Python_Installation_In_Use

and confirmation that your application is indeed running in the main
interpreter and daemon mode.

http://code.google.com/p/modwsgi/wiki/CheckingYourInstallation#Embedded_Or_Daemon_Mode
http://code.google.com/p/modwsgi/wiki/CheckingYourInstallation#Sub_Interpreter_Being_Used

See you in the mod_wsgi mailing list.

Graham


On 17 July 2014 01:32, Elhadi Falah hadi.fa...@gmail.com wrote:

 Hello,

 I use mod_wsgi to run processes and not mod_python.

   WSGIDaemonProcess p1 threads=25
 python-path=/opt/appengine/google_appengine:/opt/appengine/google_appengine/lib/django:/opt/appengine/google_appengine/lib/webob:/opt/appengine/google_appengine/lib/yaml/lib
   WSGIProcessGroup p1

   WSGIScriptAlias / /var/www/application/rep/handler.wsgi

 I already used the directive WSGIApplicationGroup %{GLOBAL} to run in the
 main interpreter context but the issue persists after executing apache
 graceful or reload.

 Regards



 2014-07-16 13:44 GMT+00:00 Graham Dumpleton grah...@apache.org:

 It is well known that the lxml package doesn't work properly in a Python
 sub interpreter context. Force it to run in the main interpreter context.

 See:


 http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API

 In other words look at using:

 WSGIApplicationGroup %{GLOBAL}

 as documented.

 Graham


 On 16 July 2014 03:11, Elhadi Falah hadi.fa...@gmail.com wrote:

 Hello,

 We are using lxml in several of our applications with Python 2.6 and
 from time to time, the application stops responding after a segmentation
 fault error ( [notice] child pid 10544 exit signal Segmentation fault
 (11)), and this kind of backtrace:

 Jul 1 15:24:48 server1 httpd: *** glibc detected *** /usr/sbin/apache2:
 munmap_chunk(): invalid pointer: 0x7f6468bf2c00 ***

 Jul 1 15:24:48 server1 httpd: === Backtrace: =

 Jul 1 15:24:48 server1 httpd: /lib/libc.so.6(+0x78bf6)[0x7f64767ecbf6]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(xmlCopyError+0xd1)[0x7f6473311801]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(__xmlRaiseError+0x30b)[0x7f6473312ecb]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(+0x393e5)[0x7f64733173e5]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(xmlParseDocument+0x2dc)[0x7f647332e5cc]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(+0x50895)[0x7f647332e895]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/python2.6/dist-packages/lxml/etree.so(+0x8cbc2)[0x7f645691cbc2]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/python2.6/dist-packages/lxml/etree.so(+0x2c7cf)[0x7f64568bc7cf]

 After trying several versions of lxml we are still facing the issue. I've
 checked the system memory consumption but everything looks fine to me,
 plenty of memory available, I don't see any process consuming abnormally.

 The issue is reproducible every time we execute the apache command
 (apache2 reload or apache2 graceful). As a workaround for this issue we
 execute apache2 restart.

 We've followed recommendations defined on these 2 links but we're still
 facing the issue.


 http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API


 http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Multiple_Python_Sub_Interpreters

 Library version:

print("%-20s: %s" % ('Python',   sys.version_info))

 Python  : (2, 6, 5, 'final', 0)

print("%-20s: %s" % ('lxml.etree',   etree.LXML_VERSION))

 lxml.etree  : (2, 3, 5, 0)

print("%-20s: %s" % ('libxml used',  etree.LIBXML_VERSION))

 libxml used : (2, 7, 6)

print("%-20s: %s" % ('libxml compiled',
  etree.LIBXML_COMPILED_VERSION))

 libxml compiled

Re: Apache2 crashes with segmentation fault

2014-07-16 Thread Graham Dumpleton
It is well known that the lxml package doesn't work properly in a Python
sub interpreter context. Force it to run in the main interpreter context.

See:

http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API

In other words look at using:

WSGIApplicationGroup %{GLOBAL}

as documented.

Graham


On 16 July 2014 03:11, Elhadi Falah hadi.fa...@gmail.com wrote:

 Hello,

 We are using lxml in several of our applications with Python 2.6 and from
 time to time, the application stops responding after a segmentation fault
 error ( [notice] child pid 10544 exit signal Segmentation fault (11)), and
 this kind of backtrace:

 Jul 1 15:24:48 server1 httpd: *** glibc detected *** /usr/sbin/apache2:
 munmap_chunk(): invalid pointer: 0x7f6468bf2c00 ***

 Jul 1 15:24:48 server1 httpd: === Backtrace: =

 Jul 1 15:24:48 server1 httpd: /lib/libc.so.6(+0x78bf6)[0x7f64767ecbf6]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(xmlCopyError+0xd1)[0x7f6473311801]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(__xmlRaiseError+0x30b)[0x7f6473312ecb]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(+0x393e5)[0x7f64733173e5]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(xmlParseDocument+0x2dc)[0x7f647332e5cc]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/libxml2.so.2(+0x50895)[0x7f647332e895]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/python2.6/dist-packages/lxml/etree.so(+0x8cbc2)[0x7f645691cbc2]

 Jul 1 15:24:48 server1 httpd:
 /usr/lib/python2.6/dist-packages/lxml/etree.so(+0x2c7cf)[0x7f64568bc7cf]

 After trying several versions of lxml we are still facing the issue. I've
 checked the system memory consumption but everything looks fine to me,
 plenty of memory available, I don't see any process consuming abnormally.

 The issue is reproducible every time we execute the apache command
 (apache2 reload or apache2 graceful). As a workaround for this issue we
 execute apache2 restart.

 We've followed recommendations defined on these 2 links but we're still
 facing the issue.


 http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API


 http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Multiple_Python_Sub_Interpreters

 Library version:

print("%-20s: %s" % ('Python',   sys.version_info))

 Python  : (2, 6, 5, 'final', 0)

print("%-20s: %s" % ('lxml.etree',   etree.LXML_VERSION))

 lxml.etree  : (2, 3, 5, 0)

print("%-20s: %s" % ('libxml used',  etree.LIBXML_VERSION))

 libxml used : (2, 7, 6)

print("%-20s: %s" % ('libxml compiled',  etree.LIBXML_COMPILED_VERSION))

 libxml compiled : (2, 7, 6)

print("%-20s: %s" % ('libxslt used', etree.LIBXSLT_VERSION))

 libxslt used: (1, 1, 26)

print("%-20s: %s" % ('libxslt compiled',
 etree.LIBXSLT_COMPILED_VERSION))

 libxslt compiled: (1, 1, 26)

 Apache 2.2.14

 Here is the source code that generate the issue:

 ID_TRANSFORM =
 os.environ['APPLICATION_WORKING_PATH']+'/statics/xsl/list.xsl'

 styledoc = lxml.etree.parse(ID_TRANSFORM)

 transform = lxml.etree.XSLT(styledoc)

 doc_root = lxml.etree.XML(str(atom))

 Could you help us on this case?

 Regards




Re: Issue with connect() call made in mod_proxy_fdpass?

2014-06-01 Thread Graham Dumpleton
What I don't quite understand is why the Linux manual pages:

  http://man7.org/linux/man-pages/man7/unix.7.html

are even promoting the style:

  offsetof(struct sockaddr_un, sun_path) + strlen(sun_path) + 1

That would produce a length which is technically 1 greater than what the
size of sockaddr_un can be. I can only imagine that within connect() it
must be deducting 1 for any possible null byte and so allowing the last
byte of sun_path to be a non null byte and never actually checking the last
byte as specified by the length, which would be beyond the valid length of
the structure.

In the Apache code for mod_proxy_fdpass at least it doesn't really matter,
as when copying into sun_path it ensures that the last byte that could ever
be written is the last one possible in sun_path and that will have a null
byte. So there is no risk the Apache code could overrun sun_path.

I expect therefore that the Apache code could have simply used:

  sizeof(struct sockaddr_un)
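
In other words, something along these lines would have done. This is only a
sketch: 'rawsock', 'path' and 'rv' are assumed to already exist, and the
point is the NUL-safe copy described above.

struct sockaddr_un sa;

memset(&sa, 0, sizeof(sa));
sa.sun_family = AF_UNIX;
apr_cpystrn(sa.sun_path, path, sizeof(sa.sun_path)); /* always NUL terminated */

rv = connect(rawsock, (struct sockaddr *)&sa, sizeof(sa));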

As for the original issue with mod_proxy_fdpass, that was plain wrong and
on some Linux systems could result in connect() failing with error:

  (22)Invalid argument

as it would regard the supplied length as being over some internal limit it
checks. Doesn't happen on all Linux versions though based on what I have
had reported for some code of my own where I happened to replicate what
mod_proxy_fdpass was doing in that function.

Graham



On 1 June 2014 16:56, Christophe JAILLET christophe.jail...@wanadoo.fr
wrote:

 Fix in r1598946.

 CJ



Re: Issue with connect() call made in mod_proxy_fdpass?

2014-06-01 Thread Graham Dumpleton
Ahh, I am partly being a goose. I kept reading that strlen() as sizeof()
when reading the manual page. :-(

Graham


On 1 June 2014 21:44, Jeff Trawick traw...@gmail.com wrote:

 On Sun, Jun 1, 2014 at 3:10 AM, Graham Dumpleton grah...@apache.org
 wrote:

 What I don't quite understand is why the Linux manual pages:

   http://man7.org/linux/man-pages/man7/unix.7.html

 are even promoting the style:

   offsetof(struct sockaddr_un, sun_path) + strlen(sun_path) + 1

  That would produce a length which is technically 1 greater than what the
 size of sockaddr_un can be.


 If sun_path has non-\0 in every position, it can be even larger depending
 on where the first \0 is ;)  But see below.


  I can only imagine that within connect() it must be deducting 1 for any
 possible null byte and so allowing the last byte of sun_path to be a non
 null byte and never actually checking the last byte as specified by the
 length, which would be beyond the valid length of the structure.


 It is possible (or potentially required) to manage the sockaddr_un storage
 allocation yourself and support longer filesystem paths than will fit in
 the preallocated sun_path field.  Exact limits vary by platform, but
 sun_path is generally declared smaller than the max supported path length.

 (See http://bugs.python.org/issue8882 for just one example discussion.)

 Thus the calculation of the sockaddr length based on actual data instead
 of the size of the structure, which may have to be mapped over a larger
 buffer in order to accommodate a more realistic path limit...

 (I think I alluded to this on the APR list.  APR really isn't helping
 appropriately if it doesn't handle this ugly issue.)


 In the Apache code for mod_proxy_fdpass at least it doesn't really
 matter, as when copying into sun_path it ensures that the last byte that
 could ever be written is the last one possible in sun_path and that will
 have a null byte. So there is no risk the Apache code could overrun
 sun_path.

 I expect therefore that the Apache code could have simply used:

   sizeof(struct sockaddr_un)

 As for the original issue with mod_proxy_fdpass, that was plain wrong and
 on some Linux systems could result in connect() failing with error:

   (22)Invalid argument

 as it would regard the supplied length as being over some internal limit
 it checks. Doesn't happen on all Linux versions though based on what I have
 had reported for some code of my own where I happened to replicate what
 mod_proxy_fdpass was doing in that function.

 Graham



 On 1 June 2014 16:56, Christophe JAILLET christophe.jail...@wanadoo.fr
 wrote:

 Fix in r1598946.

 CJ





 --
 Born in Roswell... married an alien...
 http://emptyhammock.com/
 http://edjective.org/




Issue with connect() call made in mod_proxy_fdpass?

2014-05-30 Thread Graham Dumpleton
In mod_proxy_fdpass there is a function socket_connect_un():


https://svn.apache.org/repos/asf/httpd/httpd/trunk/modules/proxy/mod_proxy_fdpass.c

which contains the code:

rv = connect(rawsock, (struct sockaddr*)sa,
   sizeof(*sa) + strlen(sa->sun_path));

Can someone explain to me why it is using:

sizeof(*sa) + strlen(sa->sun_path)

rather than just:

sizeof(*sa)

It just doesn't seem right.

One does find on the Internet examples which use:

#define SERV_PATH "./serv.path"

struct sockaddr_un serv_addr;
int servlen;

bzero((char *) &serv_addr, sizeof(serv_addr));
serv_addr.sun_family = AF_UNIX;
strcpy(serv_addr.sun_path, SERV_PATH);
servlen = strlen(serv_addr.sun_path) + sizeof(serv_addr.sun_family);
connect(sockfd, (struct sockaddr *) &serv_addr, servlen);

That is, the sockaddr_un structure length is calculated as:

servlen = strlen(serv_addr.sun_path) + sizeof(serv_addr.sun_family)

It almost looked like someone started with something similar, but rather
than replace the whole thing with:

sizeof(*sa)

replaced just the part:

sizeof(serv_addr.sun_family)

and then wrongly still added the length of the sun_path member of the
struct to that.

Any comments? Is there something else funny going on with mod_proxy_fdpass
that requires it be done this way?

Graham


Re: modules calling ap_lingering_close()!!!

2014-02-20 Thread Graham Dumpleton
On 21 February 2014 02:23, Joe Orton jor...@redhat.com wrote:

 On Thu, Feb 20, 2014 at 07:52:34AM -0500, Jeff Trawick wrote:
  WSGI 3.4 daemon mode crashing with httpd 2.4.x...
 
  Program received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0xaef17b70 (LWP 32761)]
  0x08078a32 in update_child_status_internal ()
  (gdb) where
  #0  0x08078a32 in update_child_status_internal ()
  #1  0x0809952d in ap_start_lingering_close ()
  #2  0x080995a9 in ap_lingering_close ()

 Fixed in:


 http://code.google.com/p/modwsgi/source/detail?path=/mod_wsgi.c&name=mod_wsgi-3.X&r=bdbeacb88f348909845445e9d52eb7be401abaf1

 mod_wsgi does some surprising things with httpd interfaces which should
 probably be internal-only, or at least better documented API!


Crap. I thought those httpd 2.4 fixes were already in mod_wsgi 3.4.

Another reason I have to get off my backside and release an updated
version. Has been too long.

And yes, mod_wsgi does lots of things which no doubt would be regarded as
evil. Most come from daemon mode. One day I will simply rewrite daemon
mode request handling to not be dependent on Apache request structures and
then it will not be so bad.

Graham


Re: triggering a process recreation of a child process

2013-11-17 Thread Graham Dumpleton
On 17 November 2013 22:05, jean-frederic clere jfcl...@gmail.com wrote:

 Hi,

 Is there a way to trigger a clean recreation of a child from a  module?


See the apr_proc_other_child_*() family of functions.

For an example, go look at the implementation of mod_cgid.
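
For a rough idea of the shape of it, a minimal sketch (the registration
helper below uses hypothetical names and is not code copied from mod_cgid):

#include <signal.h>

#include "apr_thread_proc.h"

/* Maintenance callback the parent invokes for the managed child. The
 * child's apr_proc_t is passed back through the data pointer. */
static void child_maintenance(int reason, void *data, int status)
{
    apr_proc_t *proc = data;

    switch (reason) {
    case APR_OC_REASON_DEATH:
    case APR_OC_REASON_LOST:
        /* Child exited or went missing: drop the registration; the
         * module's own respawn logic would then recreate the process
         * and register the new one. */
        apr_proc_other_child_unregister(data);
        break;
    case APR_OC_REASON_RESTART:
    case APR_OC_REASON_UNREGISTER:
        /* Apache restart/shutdown: ask the child to terminate. */
        kill(proc->pid, SIGTERM);
        break;
    default:
        break;
    }
}

/* Call this after creating the child (e.g. with apr_proc_create()) so the
 * parent's periodic maintenance cycle invokes child_maintenance(). */
static void watch_child(apr_proc_t *proc, apr_pool_t *pconf)
{
    apr_proc_other_child_register(proc, child_maintenance, proc,
                                  NULL, pconf);
}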

Graham


Re: Can a module find out, whether another module is present?

2013-02-05 Thread Graham Dumpleton
Don't know if will be applicable in the case of those modules or not, but
mod_python and mod_wsgi have similar conflicts over Python interpreter
initialisation and destruction and have had to do a little dance over who
gets precedence to ensure things don't crash.

In the next version of mod_wsgi though I am dropping support for
coexistence. I want to flag that fact with a big error message and refuse
to start up if both loaded.

What I have done is relied on the fact that mod_python
uses apr_pool_userdata_set() to set a module specific key in the module
init function to avoid doing certain interpreter initialisation on first
run through the config when Apache is started.

In other words, in mod_wsgi it will look for the mod_python key and
complain.

/*
 * No longer support using mod_python at the same time as
 * mod_wsgi as becoming too painful to hack around
 * mod_python's broken usage of threading APIs when align
 * code to the stricter API requirements of Python 3.2.
 */

    userdata_key = "python_init";

    apr_pool_userdata_get(&data, userdata_key, s->process->pool);
    if (data) {
        ap_log_error(APLOG_MARK, APLOG_CRIT, 0, NULL,
                     "mod_wsgi (pid=%d): The mod_python module can "
                     "not be used on conjunction with mod_wsgi 4.0+. "
                     "Remove the mod_python module from the Apache "
                     "configuration.", getpid());

        return HTTP_INTERNAL_SERVER_ERROR;
    }

Don't know if the modules you are worried about use this convention of
using apr_pool_userdata_set() to flag whether module init has already been
run for the configuration, so as to avoid doing things twice that shouldn't
be done twice.
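
As a rough sketch of that convention (hypothetical module and key names,
not code taken from either Tcl module):

#include "httpd.h"
#include "http_config.h"
#include "apr_pools.h"

static int example_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                               apr_pool_t *ptemp, server_rec *s)
{
    void *data = NULL;
    const char *userdata_key = "example_module_init";

    apr_pool_userdata_get(&data, userdata_key, s->process->pool);

    if (data == NULL) {
        /* First pass over the configuration: just record that we ran,
         * using a key another module could also look for. */
        apr_pool_userdata_set((const void *)1, userdata_key,
                              apr_pool_cleanup_null, s->process->pool);
        return OK;
    }

    /* Second (real) pass: do the expensive initialisation here. */
    return OK;
}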

Graham



On 6 February 2013 08:48, Mikhail T. mi+t...@aldan.algebra.com wrote:

 On 05.02.2013 16:37, Nick Kew wrote:

 But in general, querying another module, or knowing anything about
 its cleanups, would be a violation of modularity.  If it's legitimate
 for a module to expose its inner workings, it can do so by exporting
 an API.

 Why the questions?  Are you writing two modules that relate closely
 to each other?

 I'm not writing them -- they already exist. The two Tcl-modules (rivet and
 websh) both destroy the Tcl-interpreter at exit. The module, that gets to
  run the clean up last usually causes a crash:
  https://issues.apache.org/bugzilla/show_bug.cgi?id=54162

 If each module could query, whether the other one is loaded too, the first
 one could skip destroying the interpreter -- leaving the task to the last
 one. This approach would work even if only one of them has been patched to
 do this.

 The modularity is a great thing, of course, but when the modules use
 shared data-structures (from another library -- such as libtcl), they
 better cooperate, or else...

 Yours,

-mi




Re: Can a module find out, whether another module is present?

2013-02-05 Thread Graham Dumpleton
The mod_python project is no longer developed and was moved into the ASF
attic. It is no longer recommended that it be used and the last official
release will not compile on current Apache versions. It only continues in
any form because some Linux distros are making their own patches so it will
compile. They can only ever keep this up for Apache 2.2 though, as 2.4
differences were too great and minor patches will not make it work there.

Graham


On 6 February 2013 09:30, Mikhail T. mi+t...@aldan.algebra.com wrote:

  On 05.02.2013 17:14, Graham Dumpleton wrote:

 In the next version of mod_wsgi though I am dropping support for
 coexistence. I want to flag that fact with a big error message and refuse
 to start up if both loaded.

 I'm not sure, how Python-users will react, but, as a Tcl-user, I'd hate to
 be forced to choose one of the two modules. I'm hosting to completely
 unrelated vhosts, which use the two Tcl-using modules.


 On 05.02.2013 17:20, Jeff Trawick wrote:

 module *modp;
 for (modp = ap_top_module; modp; modp = modp->next) {
    foo(modp->name);
 }

 Cool! I thought of relying on the fact that server_rec's module_config is
 an array of module-pointers, but the above seems more reliable. Thank you!

 -mi




Re: Can a module find out, whether another module is present?

2013-02-05 Thread Graham Dumpleton
Is this being done in the Apache parent process or only in the child
processes?

If in the Apache parent process, you would still have to call Tcl_Finalize()
at some point wouldn't you to ensure that all memory is reclaimed?

One of the flaws early on in mod_python was that it didn't destroy the
Python interpreter. When an Apache restart was done, mod_python and the
Python library would be unloaded from memory. When the in process startup
was done after rereading the configuration Apache would load them again.
Because it was reloaded it was a completely clean set of static variables
holding interpreter state and so interpreter had to be reinitialised.

In other words, the unload/load that happens for modules on a restart meant
that it leaked memory into the Apache parent process, resulting in the
parent process continually growing over time when restarts were done.

Even after mod_python was fixed to destroy the interpreter, Python itself
still didn't always clean up memory completely and left some static data in
place, on the basis that if the interpreter was reinitialised in the same
process it would just reuse that data to avoid creating it again.
Unfortunately the unload/load cycle of modules still meant that memory
leaked and so mod_python as a result still leaks memory into the Apache
parent process.

In the end in mod_wsgi, because of Python leaking memory in this way, I had
to defer initialisation of the interpreter until child processes were forked,
as it simply wasn't possible to get Python to change what it did.

Graham




On 6 February 2013 10:11, Mikhail T. mi+t...@aldan.algebra.com wrote:

  On 05.02.2013 18:01, William A. Rowe Jr. wrote:

 What if both attempt to register an identical apr_optional_fn for
 tcl_destroy.  That way you will never have both optional functions
 called.

  My plan was for each of the modules to skip the destruction, if the OTHER
 module is registered to run clean-up AFTER it.

 This way the last module in the list will always run the destructor.

  FWIW I would call that function as a destructor of the process_pool,
 which you can find by walking the config pool's parents.

  That's an idea... But, I think, I found a Tcl-specific solution for this
 particular problem -- instead of calling Tcl_Finalize(), which ruins libtcl
 for everyone in the same process, mod_rivet should simply delete the
 Tcl-interpreter it created (websh does limit itself to exactly that
 already).

 Let's see, what mod_rivet maintainers have to say (
 https://issues.apache.org/bugzilla/attachment.cgi?id=29923&action=diff).

 But this was a very educating thread nonetheless. Thank you, everyone.
 Yours,

 -mi




Re: Can a module find out, whether another module is present?

2013-02-05 Thread Graham Dumpleton
On 6 February 2013 10:53, Mikhail T. mi+t...@aldan.algebra.com wrote:

  On 05.02.2013 18:25, Graham Dumpleton wrote:

 If in the Apache parent process, you would still have to call Tcl_Finalize()
 at some point wouldn't you to ensure that all memory is reclaimed?

 I don't think so. If only because after calling Tcl_Finalize(), any other
 calls into libtcl are undefined -- not supposed to happen. So, it can not
 be done on graceful restart anyway. From Tcl's man-page:

 Tcl_Finalize is similar to Tcl_Exit except that it does not  exit  from
 the  current  process.   It is useful for cleaning up when a process is
 finished using Tcl but wishes to continue executing, and  when  Tcl  is
 used  in  a  dynamically loaded extension that is about to be unloaded.
 Your code should always invoke Tcl_Finalize when Tcl is being unloaded,
 to  ensure  proper cleanup. Tcl_Finalize can be safely called more than
 once.

  One of the flaws early on in mod_python was that it didn't destroy the
 Python interpreter. When an Apache restart was done, mod_python and the
 Python library would be unloaded from memory. When the in process startup
 was done after rereading the configuration Apache would load them again.
 Because it was reloaded it was a completely clean set of static variables
 holding interpreter state and so interpreter had to be reinitialised.

 websh is already doing just the Tcl_DeleteInterpreter -- for the
 interpreter *it created*. That seems like the right thing to do anyway.

 If websh is wrong (and mod_rivet is right) in that an explicit call to
 Tcl_Finalize is needed for an exiting process,


It is not for an exiting process that is the problem. It is the module
cleanup, unloading and then reloading that occurs of the module within the
same Apache parent process when an Apache restart/graceful is done. The
main Apache parent process isn't actually killed in this situation.

So the section of documentation you quote appears to support what I am
saying that Tcl_Finalize() still needs to be called. After the module is
loaded and initialised again, then Tcl_Init(), or whatever is used to
create it again, would be called to start over and allow new instance of
interpreter to be setup in parent process before new child processes are
forked.

As I asked before, is this being done in the Apache parent process or only
in the child processes? If it is all only going on in the child processes,
the point I am making is moot, but if the interpreter is being initialised
in the Apache parent process before the fork, then it would be relevant.

Graham


Re: The Case for a Universal Web Server Load Value

2012-11-12 Thread Graham Dumpleton
You say:

I have traditional Unix-type load-average and the percentage of how
idle and busy the web-server is. But is that enough info? Or is that
too much? How much data should the front-end want or need? Maybe a single
agreed-upon value (ala load average) is best... maybe not. These are the
kinds of questions to answer.

How are the 'idle' and 'busy' measures being calculated?

Now to deviate a bit into a related topic.

One of the concerns I have had when looking over how MPMs work of late is
that the measure of how many threads are busy used to determine whether
processes should be created or destroyed is a spot measure. At least that
is how I interpret the code and I could well be wrong, so please correct me
if I am :-)

That is, only the number of threads in use at the time the maintenance cycle
is run is taken into consideration.

In the Python world where one cannot preload in the Apache parent the
Python interpreter or your application for various reasons, and need to
defer it to child worker processes, recycling processes can be an expensive
exercise as everything is done in the child after the fork.

What worries me is that the current MPM calculation with using a spot
measure isn't really a true indication of how much the server is being
utilised over time. Imagine the worst case where you were under load and
had a large number of concurrent requests and a commensurate number of
processes, but a substantial number finished just before the maintenance
cycle ran. The spot measure could use a quite low number which doesn't
truly reflect the request load on the server in the period just before
that, and what may therefore come after.

As a result of a low number for a specific maintenance cycle, it could
think it had more idle threads than needed and kill off one process. On
next cycle one second later the maintenance cycle may hit again when high
number of concurrent request and think it has to create a process again.

Another case is where you had a momentary network issue and so requests
weren't getting through, meaning that for a short period the busy measure was
low and the number of processes progressively got killed off at a rate of one
a second.

Using a spot measure rather than looking at busyness over an extended
window of time, especially when killing processes, could cause process
recycling when not warranted or when it would be better that it simply
didn't do it.

The potential for this is in part avoided by what the min/max idle threads
is set to. That is, it effectively smooths out small fluctuations, but
because the busy measure is a spot metric, am still concerned that the
randomness of when requests run means that the spot metric could still jump
around quite a lot between maintenance cycles to the extent that could
exceed min/max levels and so kill off processes.

Now for a Python site where recycling processes is expensive, the solution
is to reconfigure the MPM settings to start more servers at the outset and
allow a lot more idle capacity. But we know how many people actually bother
to tune these settings properly.

Anyway, that has had me wondering, and why I ask how you are calculating
'idle' and 'busy', whether such busy measures should not perhaps be done
differently so that it can look back in time at prior traffic during the
period since last maintenance cycle or even beyond that.

One way of doing this is looking at one I call thread utilisation or what
some also refer to as instance busy.

At this point it is going to be easier for me to refer to:

http://blog.newrelic.com/2012/09/11/introducing-capacity-analysis-for-python/

which has some nice pictures and description to help explain this thread
utilisation measure.

The thread utilisation over time since last maintenance cycle could
therefore be used, perhaps weighted in some way with current spot busy
value and also prior time periods to better smooth the value being used in
the decision.
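
To make that concrete, a minimal sketch of the smoothing idea (purely
illustrative, not existing MPM code; the names and the 0.3 weighting are
assumptions):

/* Blend the spot busy-thread count sampled each maintenance cycle into
 * an exponentially weighted moving average, and base kill/spawn
 * decisions on the smoothed value rather than the raw spot value. */
#define BUSY_ALPHA 0.3   /* weight given to the newest spot sample */

static double smoothed_busy = 0.0;

static int effective_busy_threads(int spot_busy)
{
    smoothed_busy = BUSY_ALPHA * spot_busy
                    + (1.0 - BUSY_ALPHA) * smoothed_busy;

    /* Round to nearest so a single quiet cycle does not immediately
     * look like a large amount of idle capacity. */
    return (int)(smoothed_busy + 0.5);
}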

I am guessing perhaps that some systems do have something more elaborate
than the simplistic mechanism that MPM appears to use by my reading of the
code. So what for example does mod_fcgid do?

Even using thread utilisation, one thing it cannot capture is queueing
time. That is, how long was a request sitting in the listener queue waiting
to be accepted.

Unfortunately I don't know of any way to calculate this directly from the
operating system and so it generally relies on some front end sticking in a
header with a time stamp and looking at the elapsed time when it hits the
backend server. If the front end and backend are on different machines
though then you have issues of clock skew to deal with.
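
A hypothetical sketch of that front end timestamp idea (the header name and
its format are assumptions, and clock skew between hosts is ignored here):

#include "httpd.h"
#include "apr_strings.h"
#include "apr_time.h"

/* Returns the time the request spent queued in front of this server,
 * assuming the front end set X-Request-Start to microseconds since
 * the epoch. */
static apr_interval_time_t queue_time(request_rec *r)
{
    const char *stamp = apr_table_get(r->headers_in, "X-Request-Start");

    if (stamp == NULL)
        return 0;

    return apr_time_now() - (apr_time_t)apr_atoi64(stamp);
}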

Anyway, sorry for the long ramble.

I guess I am just curious how busy is being calculated. Are there better ways
of calculating what busy is which are more accurate? Or does it mostly not
matter, because when you start to reach higher levels of utilisation the
spot metric will tend towards becoming more reflective of actual
utilisation. Can additional measures, if 

Re: Survery: how do you use httpd?

2011-10-31 Thread Graham Dumpleton
On 31 October 2011 18:24, William A. Rowe Jr. wr...@rowe-clan.net wrote:
 On 10/31/2011 2:19 AM, Sander Temme wrote:
 Dear Apache developers/users,

 I have created a quick survey to see how YOU use Apache and what is 
 important to you:

 http://www.surveymonkey.com/s/HFGDY3C

 It's only eight questions, and there's only one matrix!  Shouldn't take but 
 a minute to fill out.  Nothing official: I'm just curious.

 Under 4. you missed fcgid... and next to mod_python, mod_wsgi.

Since mod_python has been sent to the attic, is no longer maintained,
hasn't had a release in almost 5 years and would have to be patched to
even build with latest Apache versions, one wonders why it is even
listed. Listing it in same category next to mod_wsgi, which is
actively looked after, is going to give importance to mod_python that
probably it shouldn't get.

Graham


Re: PHP5.3.6

2011-03-18 Thread Graham Dumpleton
On 18 March 2011 07:24, Rich Bowen rbo...@rcbowen.com wrote:
 I wanted to be sure that folks are aware of what's going on in the 
 Windows/PHP world. I know that, in one sense, it's not our problem, but it 
 *feels* like our problem to me, and to many of our users.

 PHP5.3.6 was just released, and the Windows binaries are built with VC9, 
 meaning that it won't work with our Windows binaries. I know that it's been 
 discussed before, and there's a plan to move to VC9, but as of last week, the 
 official PHP build doesn't run with the official Apache httpd build. The PHP 
 website recommends that folks use the Apache Lounge build.

 This sucks.

 It sucks that our users have to jump through additional hoops. It sucks even 
 more that there wasn't (or at least, it appears to me that there wasn't) 
 conversation between the two communities prior to this happening. The folks 
 in php-land are aware that it's a problem, but don't see to really think that 
 it's *their* problem. For our part, we seem to be unaware that anything 
 happened.

 I don't know that the relationship between Apache httpd and php communities 
 is anybody's *fault*, but it's long struck me as a great shame that there 
 isn't closer cooperation between the two communities.

 I'm not sure exactly what I'm suggesting we do about this. It would be nice 
 if we could provide binaries built with VC9, or if we could recommend on the 
 download site that people get binaries from ApacheLounge. I don't know if 
 either of these is really an option. How would folks feel about our download 
 site encouraging folks to use ApacheLounge's version of 2.2? I suspect that 
 there'd be some resistance to this, based on our previous interactions with 
 them.

 I have a foot in the documentation team of both projects, so I tend to hear 
 both sides of the conversation at least from that perspective. I'd like for 
 us to be more proactive about strengthening the community bond between us and 
 what is probably the most important third-party Apache httpd module. There 
 seems to be a pretty strong they don't ever listen to us attitude on both 
 sides, and I'm not sure that it's really warranted.

If I read this right, this is a similar issue to what we have in the
Python world with some Python extension modules on Windows.

One discussion thread about it can be found at:

  http://psycopg.lighthouseapp.com/projects/62710/tickets/20

Scan down towards end of discussion for overview.

They have solved this problem in Python world by having the affected
package reinsert the missing VC runtime reference into the manifest
file used with the extension.

So, as far as I can see, PHP has a way of solving this themselves
without requiring a change in Apache.

Graham


Re: Inspiration for mod_lua

2010-12-30 Thread Graham Dumpleton
On 31 December 2010 07:37, Brian McCallister bri...@skife.org wrote:
 2010/12/28 Igor Galić i.ga...@brainsware.org:
 Hey folks,

 I'm looking for some inspiration on how to make good use of
 mod_lua. Those familiar with its documentation, might find
 it a little bit lacking in this regard.

 My original aim (and what I still use mod_wombat for) is various small
 modules I don't want to be bothered using C for, but which need to run
 in a threaded MPM (making mod_python/mod_perl not viable options).

Ignoring the fact that mod_python is now dead, there was never a
restriction on using mod_python in a threaded MPM.

Graham

 Auth against a remote service, interaction with shared memory for some
 other things on the box, graylisting access control, etc.

 I know it has also been used in the wild for browser sniffing for
 mobile devices, which means updating things frequently.

 A very nice use case I have seen is basically using it as the initial
 impl for things that will eventually become conventional C modules,
 once the functionality is understood and stabilises enough to make it
 worth it.

 -Brian

 -Brian


 I've got an elaborate project plotted out, which I'll implement
 real soon now, but what I'm chiefly looking for is what mod_lua
 has promised us: A cure for mod_rewrite.

 Please share your particularly ugly, involved, unaesthetic or
 otherwise /wrong/ solutions done with mod_rewrite because it
 was the only hammer available for the screws at that time ;)

 To extend our gallery:
 http://wiki.apache.org/httpd/WhenNotToUseRewrite
 on one hand, and on the other to find inspiration to solve the
 (more involved?) problems - if possible or sensible - in mod_lua.

 So long,
 i

 --
 Igor Galić

 Tel: +43 (0) 664 886 22 883
 Mail: i.ga...@brainsware.org
 URL: http://brainsware.org/




Re: Inspiration for mod_lua

2010-12-30 Thread Graham Dumpleton
On 31 December 2010 10:56, William A. Rowe Jr. wr...@rowe-clan.net wrote:
 On 12/30/2010 3:25 PM, Graham Dumpleton wrote:
 On 31 December 2010 07:37, Brian McCallister bri...@skife.org wrote:
 2010/12/28 Igor Galić i.ga...@brainsware.org:
 Hey folks,

 I'm looking for some inspiration on how to make good use of
 mod_lua. Those familiar with its documentation, might find
 it a little bit lacking in this regard.

 My original aim (and what I still use mod_wombat for) is various small
 modules I don't want to be bothered using C for, but which need to run
 in a threaded MPM (making mod_python/mod_perl not viable options).

 Ignoring the fact that mod_python is now dead, there was never a
 restriction on using mod_python in a threaded MPM.

 Nor for properly deployed mod_perl, but either is far more heavyweight than
 lua.  And when you multiply interpreter contexts across worker threads, both
 mod_perl and mod_python suffer huge bloat.  I'm hoping we see much different
 results with lua as the 'defacto' scripting engine.

The problem with mod_python was that it was poorly implemented. If you
were to start from scratch and do it over, it would be possible to make
it much more lightweight. The mod_wsgi module has shown this can be
the case. Problem is that mod_python's failings have resulted in this
overall perception that embedding Python inside of an Apache module is
bad, when it doesn't need to be. The problems with mod_python weren't
made any better through poor Python installations which didn't provide
a shared Python library. End result was that library was linked
statically and each process ended up with its own copy because of
forced address relocations, thus contributing to the perception of
memory bloat.

Anyway, all too late now as the perception that Python as scripting
language inside of Apache is bad is too prevalent and people continue
to propagate this even though in reality they are really misinformed.
:-(

Graham


Re: rational behind not checking the return value of apr_palloc and apr_pcalloc

2010-09-01 Thread Graham Dumpleton
On 1 September 2010 20:15, Graham Leggett minf...@sharp.fm wrote:
 On 01 Sep 2010, at 6:07 AM, dave b wrote:

 What is the rational behind not checking the return value of
 apr_palloc and apr_pcalloc?

 The rationale is to not be forced to check for and handle hundreds of
 potential failure cases when you're probably doomed anyway.

 The APR pools API gives you the apr_pool_abort_set() function, which
 specifies a function to call if the memory allocation fails. In the case of
 httpd, a function is registered which gracefully shuts down that particular
 server process if the allocation fails, and apr_palloc() is in the process
 guaranteed to never return NULL.

Noting that apr_pool_abort_set() is only setup in Apache 2.3 and not
in Apache 2.2.16 or earlier. Not being in 2.X explains why I couldn't
find it before.

Any reason why setting up apr_pool_abort_set() wasn't back ported to Apache 2.2?

Graham

 Obviously if you're not using APR from httpd, or if you're writing a library
 that depends on APR, and you haven't set an abort function, NULL will
 potentially be returned and you should check for and handle that case.
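
As a minimal standalone illustration of setting an abort function (a
sketch, not the httpd code):

#include <stdio.h>
#include <stdlib.h>

#include "apr_general.h"
#include "apr_pools.h"

static int my_abort(int retcode)
{
    fprintf(stderr, "out of memory (APR status %d), exiting\n", retcode);
    exit(1);    /* a server would do a graceful process shutdown here */
    return 0;   /* not reached */
}

int main(void)
{
    apr_pool_t *pool;

    apr_initialize();
    apr_pool_create(&pool, NULL);
    apr_pool_abort_set(my_abort, pool);

    /* From here on, apr_palloc()/apr_pcalloc() against this pool call
     * my_abort() on allocation failure instead of returning NULL. */
    char *buf = apr_palloc(pool, 64);
    (void)buf;

    apr_pool_destroy(pool);
    apr_terminate();
    return 0;
}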

 Regards,
 Graham
 --




Re: rational behind not checking the return value of apr_palloc and apr_pcalloc

2010-08-31 Thread Graham Dumpleton
On 1 September 2010 14:07, dave b db.pub.m...@gmail.com wrote:
 What is the rational behind not checking the return value of
 apr_palloc and apr_pcalloc?

Specifically here talking about why HTTPD code doesn't check. Ie.,
core server code and modules supplied with HTTPD.

I am clarifying this because he is hitting up on me as to why mod_wsgi
doesn't do it, yet the HTTPD code itself doesn't do it and I am just
following that precedent. So suggested he ask here why there is no
practice of checking for NULL values in HTTPD code when doing
allocations against pools. :-)

Graham

 code memory/unix/apr_pools.c from apr-1.4.2
 APR_DECLARE(void *) apr_pcalloc(apr_pool_t *pool, apr_size_t size);
 APR_DECLARE(void *) apr_pcalloc(apr_pool_t *pool, apr_size_t size)
 {
    void *mem;

    if ((mem = apr_palloc(pool, size)) != NULL) {
        memset(mem, 0, size);
    }

    return mem;
 }

 and
 apr_palloc can return NULL.
 So I modified the code and the testdir test failed in one place -

    node = active->next;
    if (size <= node_free_space(node)) {
        list_remove(node);
    }
    else {
        if ((node = allocator_alloc(pool->allocator, size)) == NULL) {
            if (pool->abort_fn)
                pool->abort_fn(APR_ENOMEM); /* HERE */

            return NULL;
        }
    }

 When you run the testdir (test). If you change the above to be:


 .
        if ((node = allocator_alloc(pool->allocator, size)) == NULL) {
            if (!   pool->abort_fn) /* note the ! added */
                pool->abort_fn(APR_ENOMEM);

            return NULL; /* you end up here */
        }
    }
 and you will fail one of the tests. This to me suggests that this scenario is
 possible if the pool is like that one failed test *but* pool->abort_fn is not
 true :)
 

 So what is the rationale behind most users of these methods *not*
 checking the return code - because from what I have seen / know it is
 possible for them to return NULL.

 Also see:  https://issues.apache.org/bugzilla/show_bug.cgi?id=49847



Re: HTTPD upgraded on eos - 2.3.8

2010-08-24 Thread Graham Dumpleton
On 25 August 2010 10:10, Tony Stevenson pct...@apache.org wrote:
 On Wed, Aug 25, 2010 at 01:04:01AM +0100, Tony Stevenson wrote:

 Had to comment out an output filter line in the main httpd.conf (line 117)

 More specifically had to disable deflate -  AddOutputFilterByType DEFLATE 
 text/html text/plain text/xml application/xml application/xml+rss text/css 
 application/x-javascript
 The deflate module is loaded, and seemingly not causing any outwardly obvious 
 issues

 Thoughts?

What is the actual problem that caused you to remove it?

If you are seeing problems when this output filter is being used, is
it for URLs which are being handled by whatever Python web application
you are hosting via mod_wsgi?

I ask as I have had two reports of people having issues when using
DEFLATE on mod_wsgi responses. Feedback suggests that responses are
being delayed in being sent, giving appearance of slow response. The
information I have is confusing however, as one person suggested it
only happened for mod_wsgi daemon mode and the other thought it was
only mod_wsgi embedded mode. I have not got enough information back
from anyone to try and work out what issue may be and could not
duplicate a problem myself in testing with standard Apache on Mac OS
X. But then, I may well not have configured it in same way as people
having problem.

Graham

 --
 Cheers,
 Tony

 
 Tony Stevenson

 t...@pc-tony.com - pct...@apache.org
 pct...@freenode.net - t...@caret.cam.ac.uk

 http://blog.pc-tony.com

 1024D/51047D66
 



Re: Failing startup for vhost configuration problems

2010-08-05 Thread Graham Dumpleton
On Thursday, August 5, 2010, Niklas Edmundsson ni...@acc.umu.se wrote:
 On Thu, 5 Aug 2010, Graham Dumpleton wrote:


 On Thursday, August 5, 2010, Stefan Fritsch s...@sfritsch.de wrote:

 On Tuesday 03 August 2010, Dan Poirier wrote:

 I'd like to propose that in 2.3/2.4, we fail startup for any of the
 virtual host misconfigurations for which behavior is undefined but
 right now we only issue a warning.

 snip

 Perhaps warnings around bad MPM configuration can be reviewed as well.

 From memory it doesn't at the moment warn you when something like

 StartServers is set over what it can be based on MaxClients and
 ThreadsPerChild. I think it may just silently reduce the value. Since
 MPM settings are often misunderstood, better hints about bad
 configuration there would be useful.


 +1 on both suggestions. This would be a good time to change/improve on this 
 front.

Another thing which would be nice related to MPM settings and other
core settings is the ability to do:

  httpd -t -DDUMP_CORE_CONFIG

The idea being that in doing this it would log what all the MPM
related settings are, i.e. the final result of taking user settings,
applying defaults where there are no user supplied settings, plus any
adjustments due to stupid settings. Add to that critical core settings
which come into play as to performance of a site such as Timeout,
KeepAlive, KeepAliveTimeout, ListenBacklog, EnableSendFile etc.

This way when trying to assist someone to understand whether their
configuration is appropriate one can get all the key important
settings easily and not have to get them to troll through their
configuration files, work out whether they are even being included,
adding defaults for things not set etc.

All up this would certainly make the job of helping newbies work out
how they screwed up their site by changing these settings in the first
place, a lot easier.

Graham


Re: [PATCH] tproxy2 patch to the apache 2.2.15

2010-08-03 Thread Graham Dumpleton
2010/8/4 Daniel Ruggeri drugg...@primary.net:
 On 8/3/2010 9:57 AM, JeHo Park wrote:
 hello ~
 it's my first mail to apache dev .. and i am beginner of the apache. :-)
 Anyway ... recently, i wrote transparent proxy [tproxy2] patch to the
 httpd-2.2.15
 because i needed web proxy and needed to know the source address of
 any client who try to connect to my web server
 and after all, i tested the performance of my patched tproxy with
 AVALANCHE 2900. if anyone ask me the performance result, i will send
 it to him [the size of the test result pdf is big size]
 *- here is the platform infomation this patch applied ---*
 1. OS
 CentOS release 5.2 (Final)
 2. KERNEL
 Linux version 2.6.18-194.el5-tproxy2 (r...@localhost.localdomain
 mailto:r...@localhost.localdomain)
 (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46))
 #10 SMP Wed May 26 17:35:19 KST 2010
 3. iptables
 iptables-1.3.8 + tproxy2 supporting patch
 *-- here is the usage of tproxy2 patched httpd configuration ---*
 httpd.conf
 <VirtualHost 192.168.200.1:80>
 ProxyTproxy On # On/Off flag
 ProxyTPifaddr 192.168.200.1 # IP address of bridge interface br0.
 example) br0 = eth0 + eth1 
 </VirtualHost>
 i attach the kernel tproxy2 patch to the kernel
 above[2.6.18-194.el5-tproxy2 ], httpd-2.2.15 tproxy2 patch and kernel
 configuration for tproxy2
 above all, i want to know my patch is available or not .. and want
 feedback from anyone :-)

 JeHo;
 Hi, can you help me understand what the usage case is for this patch?
 What service or capability does it provide that is not currently available?

In particular, how is X-Forwarded-For not going to provide the
information required?

http://en.wikipedia.org/wiki/X-Forwarded-For

Graham


Re: 2.3 upgrade on apache.org

2010-07-18 Thread Graham Dumpleton
On Monday, July 19, 2010, William A. Rowe Jr. wr...@rowe-clan.net wrote:
 On 7/18/2010 12:58 PM, Paul Querna wrote:

 We have now disabled Sendfile on apache.org, and the load average
 dropped from ~80 to 0.35.

 Wow.

 Is it unreasonable for us to change the API to disable sendfile as the default
 from 2.3-alphas forward?

 The feature is loaded with simply too many gotchas - NFS mounts, broken 
 kernels,
 and although we don't need to remove it, and can encourage people to use it 
 with
 caution, it doesn't seem rational to leave such an unintuitive choice up to 
 the
 novice/beginning user.

 Comments?

Can someone comment on whether the minimal list of potential sendfile
problems listed against the EnableSendfile directive documentation is
up to date with all known potential problems, or have other issues
been found since the documentation was written?

Graham


Re: Problem with mod_fcgid handling ErrorDocuments

2010-07-06 Thread Graham Dumpleton
On 6 July 2010 22:56, Edgar Frank ef-li...@email.de wrote:
 Hi mod_fcgid developers,

 I'm currently exploring a potential problem with mod_fcgid.
 Let's assume a setup with mod_security and mod_fcgid
 (has nothing to do with mod_security itself - it just helps to
 trigger the problem).

 Now we have a large POST request which mod_security blocks
 (by SecRequestBodyLimit) with 413 Request Entity Too Large.

Presumably it might also occur with LimitRequestBody and not even need
mod_security. Unless, that is, this is a problem with mod_security
itself and how it handles a 413.

So, is the issue really specific to mod_fcgid? If there is an issue here with
ErrorDocument for a 413 where the handler is a proxy of some form,
then it could likely affect other modules besides mod_fcgid.

I would be investigating where ErrorDocument for 413 is handed off to
URL implemented by CGI or even mod_proxy to see what happens.

Graham

 The ErrorDocument for 413 is configured to a Location which
 mod_fcgid serves. (Please don't argue that it's this way - I know
 the problems and I'm not happy with it, but it's not my decision
 to do it that way.)

 HTTPD issues a GET subrequest for the ErrorDocument and
 mod_fcgid kicks in. But now it starts consuming the request body
 we just blocked - or if the request body size is larger than
 FcgidMaxRequestLen, ErrorDocument generation fails.

 I wonder how to circumvent this. In fcgid_bridge.c:bridge_request
 I found:

 if (role == FCGI_RESPONDER) {
  rc = add_request_body( [...] );
 }

 Could one change this to something like the following without
 causing trouble?

 if (role == FCGI_RESPONDER && !ap_is_HTTP_ERROR(r->status)) {
  rc = add_request_body( [...] );
 }

 Or maybe something like a HTTP method check? (Is there a
 reliable way to detect if we're in ErrorDocument generation
 anyway?) But at this point we have put the Content-Length header
 already into the stream to the FCGI backend, so one would also
 have to take action earlier.

 What do you think in general of handling this? I'd really
 appreciate an elaborate answer - if you find it fix-worthy,
 first ideas how to fix it - and if not, why not.

 Regards,
 Edgar

 FYI:
 mod_fcgid 2.3.5
 with httpd 2.2.15
 on CentOS 5.4 x64
 built from source with gcc 4.2.4



Re: Problem with mod_fcgid handling ErrorDocuments

2010-07-06 Thread Graham Dumpleton
On 7 July 2010 11:43, Graham Dumpleton graham.dumple...@gmail.com wrote:
 On 6 July 2010 22:56, Edgar Frank ef-li...@email.de wrote:
 Hi mod_fcgid developers,

 I'm currently exploring a potential problem with mod_fcgid.
 Let's assume a setup with mod_security and mod_fcgid
 (has nothing to do with mod_security itself - it just helps to
 trigger the problem).

 Now we have a large POST request which mod_security blocks
 (by SecRequestBodyLimit) with 413 Request Entity Too Large.

 Presumably it might also occur with LimitRequestBody and not even need
 mod_security. Unless, that is, this is a problem with mod_security
 itself and how it handles a 413.

 So, is the issue really specific to mod_fcgid? If there is an issue here with
 ErrorDocument for a 413 where the handler is a proxy of some form,
 then it could likely affect other modules besides mod_fcgid.

 I would be investigating where ErrorDocument for 413 is handed off to
 URL implemented by CGI or even mod_proxy to see what happens.

 Graham

 The ErrorDocument for 413 is configured to a Location which
 mod_fcgid serves. (Please don't argue that it's this way - I know
 the problems and I'm not happy with it, but it's not my decision
 to do it that way.)

 HTTPD issues a GET subrequest for the ErrorDocument and
 mod_fcgid kicks in. But now it starts consuming the request body
 we just blocked - or if the request body size is larger than
 FcgidMaxRequestLen, ErrorDocument generation fails.

 I wonder how to circumvent this. In fcgid_bridge.c:bridge_request
 I found:

 if (role == FCGI_RESPONDER) {
  rc = add_request_body( [...] );
 }

 Could one change this to something like the following without
 causing trouble?

 if (role == FCGI_RESPONDER && !ap_is_HTTP_ERROR(r->status)) {
  rc = add_request_body( [...] );
 }

 Or maybe something like a HTTP method check? (Is there a
 reliable way to detect if we're in ErrorDocument generation
 anyway?)

Addressing this question, the r->prev attribute should refer to the original
request and so you can possibly check r->prev->status.

Also, the following gets set:

apr_table_setn(new->subprocess_env, "REDIRECT_STATUS",
               apr_itoa(r->pool, r->status));

before internal redirect. This is more of interest where CGI, or
FASTCGI script actually gets executed as that wouldn't have access to
the original request object.
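
For illustration only, a rough sketch of that check as it might sit around
the quoted bridge code (not actual mod_fcgid code; the surrounding variables
are assumed from the snippets above):

int forward_body = 1;

/* If this request is an internal redirect generated for an
 * ErrorDocument, r->prev points at the original (failed) request, so
 * skip replaying its body to the FastCGI backend. */
if (r->prev != NULL && ap_is_HTTP_ERROR(r->prev->status)) {
    forward_body = 0;
}

if (role == FCGI_RESPONDER && forward_body) {
    rc = add_request_body( [...] );
}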

Graham

 But at this point we have put the Content-Length header
 already into the stream to the FCGI backend, so one would also
 have to take action earlier.

 What do you think in general of handling this? I'd really
 appreciate an elaborate answer - if you find it fix-worthy,
 first ideas how to fix it - and if not, why not.

 Regards,
 Edgar

 FYI:
 mod_fcgid 2.3.5
 with httpd 2.2.15
 on CentOS 5.4 x64
 built from source with gcc 4.2.4




Re: What's next for 2.2 and 2.3/trunk?

2010-06-02 Thread Graham Dumpleton
On 3 June 2010 10:40, Sander Temme scte...@apache.org wrote:

 On Jun 1, 2010, at 9:08 AM, Jim Jagielski wrote:

 Considering that 2.3/trunk is back to limbo-land, I'd like
 to propose that we be more aggressive is backporting some
 items. Even if under experimental, it would be nice if slotmem
 and socache were backported. I also like the refactoring of
 the providers for proxy in trunk as compared to 2.2, but
 last time I suggested it, it looked like 2.3/2.4 was close(r)
 to reality...

 comments...?

 Amusingly (at least to me), I happened upon an old post by Joel Spolsky from 
 2002:

 http://www.joelonsoftware.com/articles/PickingShipDate.html

 For Systems With Millions of Customers and Millions of Integration Points, 
 Prefer Rare Releases.  You can do it like Apache: one release at the 
 beginning of the Internet Bubble, and one release at the end.  Perfect.

 I personally think we have enough to release for users to chew on:

 http://httpd.apache.org/docs/trunk/new_features_2_4.html

 PHP should largely move to FastCGI, so module compatibility should not be a 
 problem.  Any idea about other popular modules?  WSGI?  mod_perl?  Are they 
 ready for HEAD?

By you mentioning WSGI, are you asking whether mod_wsgi works against 2.3 trunk?

If you are, the answer is that mod_wsgi trunk should work. Ie.,
unreleased version.

Official tar ball releases of mod_wsgi will not work because of
changes a while back in 2.3 to eliminate ap_accept_lock_mech from
public API.

If 2.3 is going to start progressing again before mod_wsgi 4.0 is
released, I can always backport workaround for that to mod_wsgi 3.X
branch.

FWIW, although I haven't tried it, I suspect that even mod_python
trunk will not build against Apache 2.3 as it makes use of ap_requires
which vanished in 2.3.

Graham


Re: detecting .htaccess in a per-dir directive handler (control mod_fcgid FcgidWrapper use in htaccess via per-server config)

2010-05-17 Thread Graham Dumpleton
On 18 May 2010 05:13, Jeff Trawick traw...@gmail.com wrote:
 mod_fcgid unfortunately allows the FcgidWrapper directive to be
 overridden in htaccess when AllowOverride FileInfo is declared.  In
 all likelihood some users need that (the feature was contributed and
 added in mod_fcgid 2.1, it is especially handy to tweak PHP settings),
 but definitely some admins do not want them to use it.

 There's no obvious AllowOverride control for this directive, and
 there's the legacy compatibility concern too.  Given this, the best
 way to solve the problem AFAIK is to detect htaccess mode and consult
 a per-server setting to see if the directive should be allowed.
 BETTER SUGGESTIONS?

 The best way to detect htaccess mode that I know of is to maintain a
 flag in pre-config and post-config hooks which indicate whether we're
 processing the main config; if we're not processing the main config
 then assume we're processing htaccess.  BETTER SUGGESTIONS?

Could be wrong, but I was under the impression that cmd->config_file
is NULL when processing the main Apache configuration file.

It was a long time ago, but the comment I wrote in mod_python in
respect of this was:

} else if (cmd->config_file != NULL) {
/* cmd->config_file is NULL when in main Apache
 * configuration file as the file is completely
 * read in before the directive is processed as
 * EXEC_ON_READ is not set in req_override field
 * of command_struct table entry. Thus know then
 * we are being used in a .htaccess file. */
...
}

To be honest though, I can't remember what I even meant by the latter
part of the comment.
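
For what it's worth, a minimal sketch of how a directive handler might apply
that cmd->config_file check (hypothetical directive and function names; it
relies on the observation above, which may not hold in every case):

#include "httpd.h"
#include "http_config.h"

static const char *set_example_wrapper(cmd_parms *cmd, void *dconf,
                                       const char *arg)
{
    if (cmd->config_file != NULL) {
        /* Being read from .htaccess (or a similar per directory file)
         * rather than from the pre-parsed main configuration. */
        return "ExampleWrapper is not allowed in .htaccess files";
    }

    /* ... normal handling of the directive argument ... */
    return NULL;
}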

Anyway, the issue you are dealing with is an issue I am having to deal
with at present in mod_wsgi. That is, I don't want to always allow stuff
if AuthConfig or FileInfo is overridden, as allowing people to hook Python
script code in .htaccess can be a security issue in itself. I don't
though want it to be allowed or disallowed across the whole of Apache, but
still be able to say it is allowed for certain directory contexts if need be
where users are trusted.

I was looking at a new directive for mod_wsgi which could only be
specified in main Apache configuration, but still within Directory
context, which would allow one to say whether or not can do stuff in
.htaccess. Problem as I saw it, although haven't looked further, is
that to allow it to be done selectively, can only enforce it at time
of handling request and not at time of reading in configuration from
.htaccess.

Graham


Re: apache module's privileges

2009-12-15 Thread Graham Dumpleton
2009/12/16 Dan Poirier poir...@pobox.com:
 Jordi Prats jordi.pr...@gmail.com writes:

 If you start apache with root as usual, you realize that every module
 is able to run code with root privileges:
 ...
 Why is coded this way? Shouldn't run with lower privileges?

 No.  That's not the purpose of apache modules.

There is a lot more to it than that.

Parts of the code of an Apache module that are run in Apache parent
process will run as that user, normally root, but handling of actual
requests in an Apache worker process are done as less privileged user.

Suggest OP read:

  
http://www.fmc-modeling.org/category/projects/apache/amp/Apache_Modeling_Project.html

 to understand the whole life cycle of Apache configuration and
initialisation, and of separate per request life cycle.

Graham


Re: [mod_fcgid] Feedback / Suggestions

2009-11-25 Thread Graham Dumpleton
2009/11/25 Edgar Frank ef-li...@email.de:
 On Tue, Nov 24, 2009 at 05:07 PM, Jeff Trawick traw...@gmail.com wrote:
  Or otherwise, can someone explain the details to me why it is as it is?
  Especially in terms of not pipeling data directly (maybe after a little
  buffering to build proper FCGI packets)? The comment in
  fcgid_bridge.c:452 (add_request_body) left me clueless. Why would this
  keep the server in processing too long? Processing takes its time either
  way, I'd assume. Looking forward to enlightment. :)
 
  I can only guess that the problem at hand when this was implemented
  was that some backend application processes were so expensive that
  that they couldn't be tied up until all data had been read from slow
  clients.
 
  Yes, Jeff is right :)

 This is a reasonable feature; once streaming to the app is implemented
 this alternate mechanism can be enabled with a per-request envvar
 (e.g., SetEnv in the directory or location).

 Thanks for explaining this to me.

 While delving into the FCGI and CGI spec, I encountered another reason not to
 stream client data directly. CGI wants an explicitly set CONTENT_LENGTH and
 FCGI enforces, rather than obsoletes, this (last sentence in 6.2 of the FCGI
 spec).
 If the client sends for any reason a message body with no CONTENT_LENGTH set
 or CONTENT_LENGTH to be ignored as defined by RFC2616, you have to read the
 full message body to determine the correct content length which should be
 transferred to the backend.

Things can get worse. Even if CONTENT_LENGTH is sent, if you have
requests with compressed content which is decompressed by mod_deflate,
the amount of content will not actually match what CONTENT_LENGTH says
there will be as it reflects how things are before content is
decompressed.

Don't know about FASTCGI in general, but for WSGI (Python higher level
interface that can sit on CGI or FASTCGI) they have the stupid
requirement that you take CONTENT_LENGTH as being precise and that you
must not read more than CONTENT_LENGTH. If CONTENT_LENGTH isn't
provided, WSGI says you are supposed to take it as meaning no data.

For WSGI at least, means you can't have mutating input filters unless
the input filter buffers up all the request content after doing what
it does and recalculates CONTENT_LENGTH and sends through modified
value. In practice input filters don't do this.

Anyway, don't know if this is at all relevant to FASTCGI. As you point
out though, the CONTENT_LENGTH requirement does at least prevent
FASTCGI from handling chunked request content. WSGI specification has
same stupid limitation.

If things were defined so as to simply read until all input exhausted
and for CONTENT_LENGTH really only to be used as a hint or in
determining if original request body may be too large, wouldn't be
such a pain.

Graham


Re: MPM-Module perchild

2009-11-23 Thread Graham Dumpleton
2009/11/23  christian4apa...@lists.muthpartners.de:
 Hello,

 We have an internal project where we need the MPM module perchild. The
 Apache 2.0 documentation says that the development is not completed. I
 talked to my boss and he says I could take maybe any necessary residual
 activities, (depending on the size). Therefore, the following questions:

 * What is currently state of this module?
 * What would a collaboration?
 * How is the planning of this module in Apache 2.2. The link of 'user'
 (http://httpd.apache.org/docs/2.2/mod/mpm_common.html#user) and 'group'
 (http://httpd.apache.org/docs/2.2/mod/mpm_common.html#group) only brings
 a 404 (http://httpd.apache.org/docs/2.2/mod/perchild.html).

First off I would be asking what specific code are you wanting to run
which requires this MPM. There are other means of achieving process
separation and dropping of privileges to different users than this
MPM. Whether other solutions are suitable really depends on what you
are wanting to do though.

So, explain what the actual requirement is rather than than your
suspected solution and may be can save you some time by suggesting
other ways you can achieve the same which doesn't require as much
work.

Graham


Re: MPM-Module perchild

2009-11-23 Thread Graham Dumpleton
2009/11/23 Jeff Trawick traw...@gmail.com:
 On Mon, Nov 23, 2009 at 4:40 AM,
 christian4apa...@lists.muthpartners.de wrote:
 Hello,

 We have an internal project where we need the MPM module perchild. The
 Apache 2.0 documentation says that the development is not completed. I
 talked to my boss and he says I could take maybe any necessary residual
 activities, (depending on the size). Therefore, the following questions:

 * What is currently state of this module?
 * What would a collaboration?
 * How is the planning of this module in Apache 2.2. The link of 'user'
 (http://httpd.apache.org/docs/2.2/mod/mpm_common.html#user) and 'group'
 (http://httpd.apache.org/docs/2.2/mod/mpm_common.html#group) only brings
 a 404 (http://httpd.apache.org/docs/2.2/mod/perchild.html).

 perchild is no longer maintained here.

 See

 http://httpd.apache.org/docs/2.3/mod/mod_privileges.html (in future httpd 2.4)

FWIW, contrary to what is suggested by documentation for
mod_privileges, I would anticipate that modules which embed a Python
interpreter such as mod_python and mod_wsgi are not going to be
compatible with at least SECURE mode of mod_privileges. This is
because after a fork of a Python process a special Python interpreter
core function has to be called to do some fixups. This is fine if the fork
is done from Python code, as it will be done automatically, but not if it is
done from external C code in the same process. Not sure how well things
will work if that fixup function isn't called.

So, in order for it to work, there would need to be optional hook
functions exposed by mod_privileges which would allow other modules to
run special actions after the fork. This though means that the
distinct modules would need to be customised to know about
mod_privileges.

BTW, what operating system feature does this use that means it is only
usable on Solaris?

Graham


Re: [VOTE] release 2.3.3 as alpha

2009-11-12 Thread Graham Dumpleton
2009/11/12 Paul Querna p...@querna.org:
 On Wed, Nov 11, 2009 at 10:33 PM, Graham Dumpleton
 graham.dumple...@gmail.com wrote:
 2009/11/12 Paul Querna p...@querna.org:
 Test tarballs for Apache httpd 2.3.3-alpha are available at:
    http://httpd.apache.org/dev/dist/

 Your votes please;

  +/- 1
  [  ]  Release httpd-2.3.3 as Alpha

 Vote closes at 18:00 UTC on Sunday November 15 2009.

 Thanks,

 Paul


 What APR/APR-UTIL/PCRE versions are supposed to be used with this?

 Failing to build on MacOS X 10.5.8.


 Any modern 1.4.x APR should work.

 You can use the -deps version, by using '--with-included-apr'

Huh. There is no bundled APR/APR-UTIL in the alpha tarball, so how is
--with-included-apr going to work?

 re: your build errors, it seems like your install of APR is kinda
 busted.  I'd guess it must be too old, and something about the build
 system isn't picking up everyone correctly.

Or too new. It was actually the head from subversion versions of 1.4
for both APR and APR-UTIL.

http://svn.apache.org/repos/asf/apr/apr/branches/1.4.x
http://svn.apache.org/repos/asf/apr/apr-util/branches/1.4.x

I'll try with last official tarballs of both instead.

Graham

 Configure line:

 ./configure --prefix=/usr/local/apache-2.3
 --with-apr=/usr/local/apr-1.4/bin/apr-1-config
 --with-apr-util=/usr/local/apr-util-1.4/bin/apu-1-config
 --with-pcre=/usr/local/pcre-8.00/bin/pcre-config

 Build error:

 /usr/local/apr-1.4/build-1/libtool --silent --mode=compile gcc -g -O2
  -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp    -I.
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/os/unix
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/include
 -I/usr/local/apr-1.4/include/apr-1
 -I/usr/local/apr-util-1.4/include/apr-1 -I/usr/local/pcre-8.00/include
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/aaa
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/cache
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/core
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/database
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/filters
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ldap
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/loggers
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/lua
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/proxy
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/session
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ssl
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/test
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/arch/unix
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/dav/main
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/generators
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/mappers
 -prefer-non-pic -static -c htpasswd.c  touch htpasswd.lo
 /usr/local/apr-1.4/build-1/libtool --silent --mode=link gcc -g -O2
   -o htpasswd  htpasswd.lo
 /usr/local/apr-util-1.4/lib/libaprutil-1.la @LDADD_dbm_db@
 @LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@ -lexpat -liconv
 /usr/local/apr-1.4/lib/libapr-1.la -lpthread
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_db@: No such file or directory
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_gdbm@: No such file or directory
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_ndbm@: No such file or directory

 The config.log shows:

 AP_LIBS='$(MOD_AUTHN_FILE_LDADD) $(MOD_AUTHN_CORE_LDADD)
 $(MOD_AUTHZ_HOST_LDADD) $(MOD_AUTHZ_GROUPFILE_LDADD)
 $(MOD_AUTHZ_USER_LDADD) $(MOD_AUTHZ_CORE_LDADD)
 $(MOD_ACCESS_COMPAT_LDADD) $(MOD_AUTH_BASIC_LDADD)
 $(MOD_AUTH_FORM_LDADD) $(MOD_SO_LDADD) $(MOD_BUFFER_LDADD)
 $(MOD_RATELIMIT_LDADD) $(MOD_REQTIMEOUT_LDADD) $(MOD_REQUEST_LDADD)
 $(MOD_INCLUDE_LDADD) $(MOD_FILTER_LDADD) $(MOD_LOG_CONFIG_LDADD)
 $(MOD_ENV_LDADD) $(MOD_SETENVIF_LDADD) $(MOD_VERSION_LDADD)
 $(MOD_HTTP_LDADD) $(MOD_MIME_LDADD) $(MOD_UNIXD_LDADD)
 $(MOD_STATUS_LDADD) $(MOD_AUTOINDEX_LDADD) $(MOD_ASIS_LDADD)
 $(MOD_CGID_LDADD) $(MOD_NEGOTIATION_LDADD) $(MOD_DIR_LDADD)
 $(MOD_ACTIONS_LDADD) $(MOD_USERDIR_LDADD) $(MOD_ALIAS_LDADD)
 /usr/local/apr-util-1.4/lib/libaprutil-1.la @LDADD_dbm_db@
 @LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@ -lexpat -liconv
 /usr/local/apr-1.4/lib/libapr-1.la -lpthread'

 but no other mention of those non substituted values.

 The config.status just reflects same thing:

 S[AP_LIBS]=$(MOD_AUTHN_FILE_LDADD) $(MOD_AUTHN_CORE_LDADD)
 $(MOD_AUTHZ_HOST_LDADD) $(MOD_AUTHZ_GROUPFILE_LDADD)
 $(MOD_AUTHZ_USER_LDADD) $(MOD_AUTHZ_CORE_LDADD)\
  $(MOD_ACCESS_COMPAT_LDADD) $(MOD_AUTH_BASIC_LDADD)
 $(MOD_AUTH_FORM_LDADD) $(MOD_SO_LDADD) $(MOD_BUFFER_LDADD)
 $(MOD_RATELIMIT_LDADD) $(MOD_REQTIMEO\
 UT_LDADD) $(MOD_REQUEST_LDADD) $(MOD_INCLUDE_LDADD)
 $(MOD_FILTER_LDADD) $(MOD_LOG_CONFIG_LDADD) $(MOD_ENV_LDADD)
 $(MOD_SETENVIF_LDADD) $(MOD_VERSION\
 _LDADD) $(MOD_HTTP_LDADD) $(MOD_MIME_LDADD) $(MOD_UNIXD_LDADD)
 $(MOD_STATUS_LDADD) $(MOD_AUTOINDEX_LDADD) $(MOD_ASIS_LDADD)
 $(MOD_CGID_LDADD) $(MOD_\
 NEGOTIATION_LDADD) $(MOD_DIR_LDADD) $(MOD_ACTIONS_LDADD)
 $(MOD_USERDIR_LDADD) $(MOD_ALIAS_LDADD)
 /usr/local

Re: [VOTE] release 2.3.3 as alpha

2009-11-12 Thread Graham Dumpleton
2009/11/12 Graham Dumpleton graham.dumple...@gmail.com:
 2009/11/12 Paul Querna p...@querna.org:
 On Wed, Nov 11, 2009 at 10:33 PM, Graham Dumpleton
 graham.dumple...@gmail.com wrote:
 2009/11/12 Paul Querna p...@querna.org:
 Test tarballs for Apache httpd 2.3.3-alpha are available at:
    http://httpd.apache.org/dev/dist/

 Your votes please;

  +/- 1
  [  ]  Release httpd-2.3.3 as Alpha

 Vote closes at 18:00 UTC on Sunday November 15 2009.

 Thanks,

 Paul


 What APR/APR-UTIL/PCRE versions are supposed to be used with this?

 Failing to build on MacOS X 10.5.8.


 Any modern 1.4.x APR should work.

 You can use the -deps version, by using '--with-included-apr'

 Huh. There is no bundled APR/APR-UTIL in the alpha tarball, so how is
 --with-included-apr work going to work?

 re: your build errors, it seems like your install of APR is kinda
 busted.  I'd guess it must be too old, and something about the build
 system isn't picking up everyone correctly.

 Or too new. It was actually the head from subversion versions of 1.4
 for both APR and APR-UTIL.

 http://svn.apache.org/repos/asf/apr/apr/branches/1.4.x
 http://svn.apache.org/repos/asf/apr/apr-util/branches/1.4.x

 I'll try with last official tarballs of both instead.

Hmmm, no tarballs.

I tried trashing my APR/APR-UTIL install directories, validated I was
up to date from those subversion branches (which I was), did make
distclean, ran configure again from scratch and reinstalled. Then I
tried rebuilding 2.3 again and still had the same problem.

Given that there are no 1.4 tarballs for APR/APR-UTIL, which
subversion trunk/branch are you meant to use?

Are the branches I used the correct ones to use? I noticed there was
no trunk for apr-util:

$ svn list http://svn.apache.org/repos/asf/apr/apr-util
branches/
tags/

and since I had to use a branch for that, I assumed I was supposed to
use the branch for apr as well even though it has a trunk.

$ svn list http://svn.apache.org/repos/asf/apr/apr/
branches/
tags/
trunk/

Graham


 Configure line:

 ./configure --prefix=/usr/local/apache-2.3
 --with-apr=/usr/local/apr-1.4/bin/apr-1-config
 --with-apr-util=/usr/local/apr-util-1.4/bin/apu-1-config
 --with-pcre=/usr/local/pcre-8.00/bin/pcre-config

 Build error:

 /usr/local/apr-1.4/build-1/libtool --silent --mode=compile gcc -g -O2
  -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp    -I.
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/os/unix
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/include
 -I/usr/local/apr-1.4/include/apr-1
 -I/usr/local/apr-util-1.4/include/apr-1 -I/usr/local/pcre-8.00/include
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/aaa
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/cache
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/core
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/database
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/filters
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ldap
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/loggers
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/lua
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/proxy
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/session
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ssl
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/test
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/arch/unix
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/dav/main
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/generators
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/mappers
 -prefer-non-pic -static -c htpasswd.c  touch htpasswd.lo
 /usr/local/apr-1.4/build-1/libtool --silent --mode=link gcc -g -O2
   -o htpasswd  htpasswd.lo
 /usr/local/apr-util-1.4/lib/libaprutil-1.la @LDADD_dbm_db@
 @LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@ -lexpat -liconv
 /usr/local/apr-1.4/lib/libapr-1.la -lpthread
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_db@: No such file or directory
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_gdbm@: No such file or directory
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_ndbm@: No such file or directory

 The config.log shows:

 AP_LIBS='$(MOD_AUTHN_FILE_LDADD) $(MOD_AUTHN_CORE_LDADD)
 $(MOD_AUTHZ_HOST_LDADD) $(MOD_AUTHZ_GROUPFILE_LDADD)
 $(MOD_AUTHZ_USER_LDADD) $(MOD_AUTHZ_CORE_LDADD)
 $(MOD_ACCESS_COMPAT_LDADD) $(MOD_AUTH_BASIC_LDADD)
 $(MOD_AUTH_FORM_LDADD) $(MOD_SO_LDADD) $(MOD_BUFFER_LDADD)
 $(MOD_RATELIMIT_LDADD) $(MOD_REQTIMEOUT_LDADD) $(MOD_REQUEST_LDADD)
 $(MOD_INCLUDE_LDADD) $(MOD_FILTER_LDADD) $(MOD_LOG_CONFIG_LDADD)
 $(MOD_ENV_LDADD) $(MOD_SETENVIF_LDADD) $(MOD_VERSION_LDADD)
 $(MOD_HTTP_LDADD) $(MOD_MIME_LDADD) $(MOD_UNIXD_LDADD)
 $(MOD_STATUS_LDADD) $(MOD_AUTOINDEX_LDADD) $(MOD_ASIS_LDADD)
 $(MOD_CGID_LDADD) $(MOD_NEGOTIATION_LDADD) $(MOD_DIR_LDADD)
 $(MOD_ACTIONS_LDADD) $(MOD_USERDIR_LDADD) $(MOD_ALIAS_LDADD)
 /usr/local/apr-util-1.4/lib/libaprutil-1.la @LDADD_dbm_db@
 @LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@ -lexpat -liconv
 /usr/local/apr-1.4/lib/libapr-1.la -lpthread'

 but no other mention

Re: [VOTE] release 2.3.3 as alpha

2009-11-12 Thread Graham Dumpleton
FWIW, the @??@ symbols are coming from apu-1-config because they are
never expanded by the configure script for apr-util. I.e., a snippet
from apu-1-config is:

LIBS=-lexpat -liconv
INCLUDES=
LDFLAGS=
LDAP_LIBS=
DBM_LIBS=@LDADD_dbm_db@ @LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@

APRUTIL_LIBNAME=aprutil-${APRUTIL_MAJOR_VERSION}

APU_SOURCE_DIR=/Users/grahamd/Projects/apr-util-1.4-trunk
APU_BUILD_DIR=/Users/grahamd/Projects/apr-util-1.4-trunk
APR_XML_EXPAT_OLD=@APR_XML_EXPAT_OLD@
APU_DB_VERSION=0

This has occurred because autoconf hadn't been run to regenerate the
configure script the last time I updated from subversion.

Must have missed it. Remember to do it for apr. :-(

Trying again now.

Graham

2009/11/12 Graham Dumpleton graham.dumple...@gmail.com:
 2009/11/12 Graham Dumpleton graham.dumple...@gmail.com:
 2009/11/12 Paul Querna p...@querna.org:
 On Wed, Nov 11, 2009 at 10:33 PM, Graham Dumpleton
 graham.dumple...@gmail.com wrote:
 2009/11/12 Paul Querna p...@querna.org:
 Test tarballs for Apache httpd 2.3.3-alpha are available at:
    http://httpd.apache.org/dev/dist/

 Your votes please;

  +/- 1
  [  ]  Release httpd-2.3.3 as Alpha

 Vote closes at 18:00 UTC on Sunday November 15 2009.

 Thanks,

 Paul


 What APR/APR-UTIL/PCRE versions are supposed to be used with this?

 Failing to build on MacOS X 10.5.8.


 Any modern 1.4.x APR should work.

 You can use the -deps version, by using '--with-included-apr'

 Huh. There is no bundled APR/APR-UTIL in the alpha tarball, so how is
 --with-included-apr work going to work?

 re: your build errors, it seems like your install of APR is kinda
 busted.  I'd guess it must be too old, and something about the build
 system isn't picking up everyone correctly.

 Or too new. It was actually the head from subversion versions of 1.4
 for both APR and APR-UTIL.

 http://svn.apache.org/repos/asf/apr/apr/branches/1.4.x
 http://svn.apache.org/repos/asf/apr/apr-util/branches/1.4.x

 I'll try with last official tarballs of both instead.

 Hmmm, no tarballs.

 I tried trashing my APR/APR-UTIL install directories. Validated I was
 up to date from those subversion branches, which I was, did make
 distclean, ran configure again from scratch and reinstalled. Then
 tried rebuilding 2.3 again and still had same problem.

 Given that there are no 1.4 tarballs for APR/APR-UTIL, which
 subversion trunk/branch are you meant to use?

 Are the branches I used the correct thing to use. I noticed there was
 no trunk for apr-util:

 $ svn list http://svn.apache.org/repos/asf/apr/apr-util
 branches/
 tags/

 and since had to use branch for that, assumed was supposed to use
 branch for apr as well even though it has a trunk.

 $ svn list http://svn.apache.org/repos/asf/apr/apr/
 branches/
 tags/
 trunk/

 Graham


 Configure line:

 ./configure --prefix=/usr/local/apache-2.3
 --with-apr=/usr/local/apr-1.4/bin/apr-1-config
 --with-apr-util=/usr/local/apr-util-1.4/bin/apu-1-config
 --with-pcre=/usr/local/pcre-8.00/bin/pcre-config

 Build error:

 /usr/local/apr-1.4/build-1/libtool --silent --mode=compile gcc -g -O2
  -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp    -I.
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/os/unix
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/include
 -I/usr/local/apr-1.4/include/apr-1
 -I/usr/local/apr-util-1.4/include/apr-1 -I/usr/local/pcre-8.00/include
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/aaa
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/cache
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/core
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/database
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/filters
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ldap
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/loggers
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/lua
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/proxy
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/session
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ssl
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/test
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/arch/unix
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/dav/main
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/generators
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/mappers
 -prefer-non-pic -static -c htpasswd.c  touch htpasswd.lo
 /usr/local/apr-1.4/build-1/libtool --silent --mode=link gcc -g -O2
   -o htpasswd  htpasswd.lo
 /usr/local/apr-util-1.4/lib/libaprutil-1.la @LDADD_dbm_db@
 @LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@ -lexpat -liconv
 /usr/local/apr-1.4/lib/libapr-1.la -lpthread
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_db@: No such file or directory
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_gdbm@: No such file or directory
 i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_ndbm@: No such file or directory

 The config.log shows:

 AP_LIBS='$(MOD_AUTHN_FILE_LDADD) $(MOD_AUTHN_CORE_LDADD)
 $(MOD_AUTHZ_HOST_LDADD) $(MOD_AUTHZ_GROUPFILE_LDADD)
 $(MOD_AUTHZ_USER_LDADD

Re: [VOTE] release 2.3.3 as alpha

2009-11-12 Thread Graham Dumpleton
2009/11/12 Graham Dumpleton graham.dumple...@gmail.com:
 FWIW, the @??@ symbols are coming from apu-1-config because they are
 never expanded by configure script for apr-util. Ie., snippet from
 apu-1-config is:

 LIBS=-lexpat -liconv
 INCLUDES=
 LDFLAGS=
 LDAP_LIBS=
 DBM_LIBS=@LDADD_dbm_db@ @LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@

 APRUTIL_LIBNAME=aprutil-${APRUTIL_MAJOR_VERSION}

 APU_SOURCE_DIR=/Users/grahamd/Projects/apr-util-1.4-trunk
 APU_BUILD_DIR=/Users/grahamd/Projects/apr-util-1.4-trunk
 APR_XML_EXPAT_OLD=@APR_XML_EXPAT_OLD@
 APU_DB_VERSION=0

 This has occurred because autoconf hadn't been run to regenerate
 configure script last time I updated from subversion.

 Must of missed it. Remember to do it for apr. :-(

 Trying again now.

Got past the problem with the @??@ variables, but still get linking issues with:

/usr/local/apr-1.4/build-1/libtool --silent --mode=link gcc -g -O2
   -o ab  ab.lo
/usr/local/apr-util-1.4/lib/libaprutil-1.la -lexpat -liconv
/usr/local/apr-1.4/lib/libapr-1.la -lpthread
Undefined symbols:
  _apr_pollset_create, referenced from:
  _main in ab.o
  _apr_pollset_remove, referenced from:
  _set_conn_state in ab.o
  _apr_pollset_poll, referenced from:
  _main in ab.o
  _apr_pollset_add, referenced from:
  _set_conn_state in ab.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

There are undefined references to them in libapr, but no actual code for them.

No more time to look at it today.

Graham

 2009/11/12 Graham Dumpleton graham.dumple...@gmail.com:
 2009/11/12 Graham Dumpleton graham.dumple...@gmail.com:
 2009/11/12 Paul Querna p...@querna.org:
 On Wed, Nov 11, 2009 at 10:33 PM, Graham Dumpleton
 graham.dumple...@gmail.com wrote:
 2009/11/12 Paul Querna p...@querna.org:
 Test tarballs for Apache httpd 2.3.3-alpha are available at:
    http://httpd.apache.org/dev/dist/

 Your votes please;

  +/- 1
  [  ]  Release httpd-2.3.3 as Alpha

 Vote closes at 18:00 UTC on Sunday November 15 2009.

 Thanks,

 Paul


 What APR/APR-UTIL/PCRE versions are supposed to be used with this?

 Failing to build on MacOS X 10.5.8.


 Any modern 1.4.x APR should work.

 You can use the -deps version, by using '--with-included-apr'

 Huh. There is no bundled APR/APR-UTIL in the alpha tarball, so how is
 --with-included-apr work going to work?

 re: your build errors, it seems like your install of APR is kinda
 busted.  I'd guess it must be too old, and something about the build
 system isn't picking up everyone correctly.

 Or too new. It was actually the head from subversion versions of 1.4
 for both APR and APR-UTIL.

 http://svn.apache.org/repos/asf/apr/apr/branches/1.4.x
 http://svn.apache.org/repos/asf/apr/apr-util/branches/1.4.x

 I'll try with last official tarballs of both instead.

 Hmmm, no tarballs.

 I tried trashing my APR/APR-UTIL install directories. Validated I was
 up to date from those subversion branches, which I was, did make
 distclean, ran configure again from scratch and reinstalled. Then
 tried rebuilding 2.3 again and still had same problem.

 Given that there are no 1.4 tarballs for APR/APR-UTIL, which
 subversion trunk/branch are you meant to use?

 Are the branches I used the correct thing to use. I noticed there was
 no trunk for apr-util:

 $ svn list http://svn.apache.org/repos/asf/apr/apr-util
 branches/
 tags/

 and since had to use branch for that, assumed was supposed to use
 branch for apr as well even though it has a trunk.

 $ svn list http://svn.apache.org/repos/asf/apr/apr/
 branches/
 tags/
 trunk/

 Graham


 Configure line:

 ./configure --prefix=/usr/local/apache-2.3
 --with-apr=/usr/local/apr-1.4/bin/apr-1-config
 --with-apr-util=/usr/local/apr-util-1.4/bin/apu-1-config
 --with-pcre=/usr/local/pcre-8.00/bin/pcre-config

 Build error:

 /usr/local/apr-1.4/build-1/libtool --silent --mode=compile gcc -g -O2
  -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp    -I.
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/os/unix
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/include
 -I/usr/local/apr-1.4/include/apr-1
 -I/usr/local/apr-util-1.4/include/apr-1 -I/usr/local/pcre-8.00/include
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/aaa
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/cache
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/core
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/database
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/filters
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ldap
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/loggers
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/lua
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/proxy
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/session
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ssl
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/test
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/arch/unix
 -I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/dav/main
 -I/Users/grahamd/Packages/httpd

Re: [VOTE] release 2.3.3 as alpha

2009-11-11 Thread Graham Dumpleton
2009/11/12 Paul Querna p...@querna.org:
 Test tarballs for Apache httpd 2.3.3-alpha are available at:
    http://httpd.apache.org/dev/dist/

 Your votes please;

  +/- 1
  [  ]  Release httpd-2.3.3 as Alpha

 Vote closes at 18:00 UTC on Sunday November 15 2009.

 Thanks,

 Paul


What APR/APR-UTIL/PCRE versions are supposed to be used with this?

Failing to build on MacOS X 10.5.8.

Configure line:

./configure --prefix=/usr/local/apache-2.3
--with-apr=/usr/local/apr-1.4/bin/apr-1-config
--with-apr-util=/usr/local/apr-util-1.4/bin/apu-1-config
--with-pcre=/usr/local/pcre-8.00/bin/pcre-config

Build error:

/usr/local/apr-1.4/build-1/libtool --silent --mode=compile gcc -g -O2
  -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp-I.
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/os/unix
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/include
-I/usr/local/apr-1.4/include/apr-1
-I/usr/local/apr-util-1.4/include/apr-1 -I/usr/local/pcre-8.00/include
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/aaa
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/cache
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/core
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/database
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/filters
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ldap
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/loggers
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/lua
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/proxy
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/session
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/ssl
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/test
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/arch/unix
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/dav/main
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/generators
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/mappers
-prefer-non-pic -static -c htpasswd.c  touch htpasswd.lo
/usr/local/apr-1.4/build-1/libtool --silent --mode=link gcc -g -O2
   -o htpasswd  htpasswd.lo
/usr/local/apr-util-1.4/lib/libaprutil-1.la @LDADD_dbm_db@
@LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@ -lexpat -liconv
/usr/local/apr-1.4/lib/libapr-1.la -lpthread
i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_db@: No such file or directory
i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_gdbm@: No such file or directory
i686-apple-darwin9-gcc-4.0.1: @LDADD_dbm_ndbm@: No such file or directory

The config.log shows:

AP_LIBS='$(MOD_AUTHN_FILE_LDADD) $(MOD_AUTHN_CORE_LDADD)
$(MOD_AUTHZ_HOST_LDADD) $(MOD_AUTHZ_GROUPFILE_LDADD)
$(MOD_AUTHZ_USER_LDADD) $(MOD_AUTHZ_CORE_LDADD)
$(MOD_ACCESS_COMPAT_LDADD) $(MOD_AUTH_BASIC_LDADD)
$(MOD_AUTH_FORM_LDADD) $(MOD_SO_LDADD) $(MOD_BUFFER_LDADD)
$(MOD_RATELIMIT_LDADD) $(MOD_REQTIMEOUT_LDADD) $(MOD_REQUEST_LDADD)
$(MOD_INCLUDE_LDADD) $(MOD_FILTER_LDADD) $(MOD_LOG_CONFIG_LDADD)
$(MOD_ENV_LDADD) $(MOD_SETENVIF_LDADD) $(MOD_VERSION_LDADD)
$(MOD_HTTP_LDADD) $(MOD_MIME_LDADD) $(MOD_UNIXD_LDADD)
$(MOD_STATUS_LDADD) $(MOD_AUTOINDEX_LDADD) $(MOD_ASIS_LDADD)
$(MOD_CGID_LDADD) $(MOD_NEGOTIATION_LDADD) $(MOD_DIR_LDADD)
$(MOD_ACTIONS_LDADD) $(MOD_USERDIR_LDADD) $(MOD_ALIAS_LDADD)
/usr/local/apr-util-1.4/lib/libaprutil-1.la  @LDADD_dbm_db@
@LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@ -lexpat -liconv
/usr/local/apr-1.4/lib/libapr-1.la -lpthread'

but no other mention of those non-substituted values.

The config.status just reflects the same thing:

S[AP_LIBS]=$(MOD_AUTHN_FILE_LDADD) $(MOD_AUTHN_CORE_LDADD)
$(MOD_AUTHZ_HOST_LDADD) $(MOD_AUTHZ_GROUPFILE_LDADD)
$(MOD_AUTHZ_USER_LDADD) $(MOD_AUTHZ_CORE_LDADD)\
 $(MOD_ACCESS_COMPAT_LDADD) $(MOD_AUTH_BASIC_LDADD)
$(MOD_AUTH_FORM_LDADD) $(MOD_SO_LDADD) $(MOD_BUFFER_LDADD)
$(MOD_RATELIMIT_LDADD) $(MOD_REQTIMEO\
UT_LDADD) $(MOD_REQUEST_LDADD) $(MOD_INCLUDE_LDADD)
$(MOD_FILTER_LDADD) $(MOD_LOG_CONFIG_LDADD) $(MOD_ENV_LDADD)
$(MOD_SETENVIF_LDADD) $(MOD_VERSION\
_LDADD) $(MOD_HTTP_LDADD) $(MOD_MIME_LDADD) $(MOD_UNIXD_LDADD)
$(MOD_STATUS_LDADD) $(MOD_AUTOINDEX_LDADD) $(MOD_ASIS_LDADD)
$(MOD_CGID_LDADD) $(MOD_\
NEGOTIATION_LDADD) $(MOD_DIR_LDADD) $(MOD_ACTIONS_LDADD)
$(MOD_USERDIR_LDADD) $(MOD_ALIAS_LDADD)
/usr/local/apr-util-1.4/lib/libaprutil-1.la  @LDAD\
D_dbm_db@ @LDADD_dbm_gdbm@ @LDADD_dbm_ndbm@ -lexpat -liconv
/usr/local/apr-1.4/lib/libapr-1.la -lpthread

I had noticed this problem in subversion trunk a couple of days before
you rolled this tarball but didn't have a chance to post anything.

If I remove those values from build/config_vars.mk and continue the
build, then it stops with:

/usr/local/apr-1.4/build-1/libtool --silent --mode=compile gcc -g -O2
  -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp-I.
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/os/unix
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/include
-I/usr/local/apr-1.4/include/apr-1
-I/usr/local/apr-util-1.4/include/apr-1 -I/usr/local/pcre-8.00/include
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/aaa
-I/Users/grahamd/Packages/httpd-2.3.3-alpha/modules/cache

Re: mod_fcgid: different instances of the same program

2009-11-09 Thread Graham Dumpleton
2009/11/10 Jeff Trawick traw...@gmail.com:
 On Mon, Nov 9, 2009 at 5:16 PM, Danny Sadinoff danny.sadin...@gmail.com 
 wrote:
 Here are two details of mod_fcgid process management that I've just
 learned after a long debug session and squinting at the mod_fcgid
 code.

 1) symlinks  you.
 It seems that mod_fcgid identifies fcgid programs by inode and device,
 not by filename.  So two fcgid programs invoked by the webserver
 along different paths will be counted as the same if the two paths are
 hardlinks or softlinks to each other.

 Mostly yes.

 The path to the file doesn't matter; it is the file itself that matters.

 There are different requirements for how programs are distinguished.
 One possibility is changing from stat() to lstat() (i.e., distinguish
 symlinks but not hard links).  Another possibility is looking only at
 the basename.  This was discussed in this thread:
 http://www.mail-archive.com/dev@httpd.apache.org/msg45516.html

FWIW, in the mod_wsgi module for Python applications, by default
applications are distinguished based on the mount point and the
host/port they are running under. That is, the combination of the
SERVER_HOST, SERVER_PORT and SCRIPT_NAME values.

Well, actually it is a little bit more complicated than that because
ports 80/443 are treated effectively the same given usage would
normally be paired.

What it means is that you can have one script file mounted multiple
times, with each mount treated as a separate instance of the application.
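
Purely as an illustrative sketch of the idea (this is not actual
mod_wsgi code; the helper name and the exact key format are just
assumptions for the example), the key could be derived roughly like
this:

#include "httpd.h"
#include "http_core.h"
#include "apr_strings.h"

/* Hypothetical helper: derive an application group key from the
 * virtual host, port and mount point, folding ports 80 and 443
 * together so the http/https pair maps to the same application. */
static const char *wsgi_application_group(request_rec *r)
{
    const char *host = ap_get_server_name(r);
    apr_port_t port = ap_get_server_port(r);
    const char *mount = apr_table_get(r->subprocess_env, "SCRIPT_NAME");

    if (!mount)
        mount = r->uri;

    if (port == 80 || port == 443)
        return apr_pstrcat(r->pool, host, "|", mount, NULL);

    return apr_psprintf(r->pool, "%s:%u|%s", host, (unsigned)port, mount);
}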

In mod_wsgi the separation by default is done based on Python sub
interpreters within a process rather than actual processes. This is
because mod_wsgi supports running in embedded mode, ie., within Apache
server child processes, or as distinct daemon processes like with
FASTCGI.

There is flexibility in mod_wsgi though to override this and manually
specify the named application group (Python sub interpreter within a
process) to use, whether embedded or daemon mode is used for processes,
and, if daemon mode, which named daemon process group.

Anyway, I thought the strategy of using the SERVER_HOST, SERVER_PORT
and SCRIPT_NAME values may be of interest as an alternative to
distinguishing based on the path to the script.

Graham


Re: Httpd 3.0 or something else

2009-11-05 Thread Graham Dumpleton
2009/11/5 Graham Leggett minf...@sharp.fm:
 Jim Jagielski wrote:

 Let's get 2.4 out. And then let's rip it to shreds and drop
 buckets/brigades and fold in serf.

 I think we should decide on exactly what problem we're trying to solve,
 before we start thinking about how it is to be solved.

 I'm keen to teach httpd v3.0 to work asynchronously throughout - still
 maintaining the prefork behaviour as a sensible default[1], but being
 asynchronous and non blocking throughout.

 [1] The fact that dodgy module code can leak, crash and be otherwise
 unsociable, and yet the server remains functional, is one of the key
 reasons why httpd still endures.

Sorry, long post but it was inevitable that I was going to air all
this at some point. Now seems a good as time as any.

I'd like to see a more radical architecture change, one that
recognises that it isn't just about serving static files any more and
provides much better builtin support for safe hosting of content
generating web applications constructed using alternate languages.

Before anyone jumps to the conclusion that I want to start seeing even
more heavy weight applications being run direct in the Apache server
child processes that accept initial requests, know that I don't want
that and that I actually want to promote a model which is the opposite
and which would encourage people not to do that.

As a first step, like Jim, I would like to see the current Apache
server child processes (workers) become asynchronous. In addition to
that though, I would like to see, as part of core Apache and running in
the parent process, a means for spawning and monitoring distinct
processes outside of the set of worker processes.

There is currently support in APR, and in part in Apache, for 'other'
processes via the 'apr_proc_other_child_???()' functions, but this is
quite basic and you still, to a large degree, need to roll your own
management routines around it for (re)spawning etc. As a result, you
see modules such as mod_cgid, mod_fastcgi, mod_fcgid and mod_wsgi all
having their own process management code for managing their daemon
processes and/or manager process.
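
For example (only a rough sketch of how the existing API gets used; the
callback body and function names here are made up for illustration), a
module registers its spawned daemon and then has to supply all the
respawn policy itself in the maintenance callback:

#include "apr_thread_proc.h"

/* Sketch only: the respawn/cleanup policy is entirely up to the module. */
static void daemon_maintenance(int reason, void *data, int status)
{
    switch (reason) {
    case APR_OC_REASON_DEATH:
    case APR_OC_REASON_LOST:
        /* ... decide whether and when to respawn the daemon ... */
        break;
    case APR_OC_REASON_RESTART:
    case APR_OC_REASON_UNREGISTER:
        /* ... clean up; the process is about to be dropped ... */
        break;
    default:
        break;
    }
}

/* Called after apr_proc_create() has spawned the daemon process. */
static void register_daemon(apr_proc_t *proc, apr_pool_t *p)
{
    apr_proc_other_child_register(proc, daemon_maintenance, proc, NULL, p);
}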

Technically one could implement this as a distinct module called
mod_procd which had an API which could be utilised by other modules
and so stop duplication of all this stuff, but perhaps it needs to go a
step further than that and be integrated into core. This is because at
present any 'other' processes are dealt with rather harshly on graceful
restarts, in that they are still simply killed off after a few seconds
if they don't shut down. Being able to extend graceful restart
semantics into other processes may be worthwhile for some applications.

The next thing I want to see is for the whole FASTCGI type ecosystem to
be revisited and for a better version of this concept for hosting web
applications in disparate languages to be developed, one which
modernises it and brings it in as a core feature of Apache. The intent
here is to simplify the task for implementers as well as for those
wishing to deploy applications.

An important part of this would be to switch away from the interface
being a socket protocol. Instead, let the web server control both
halves of the communication channel between the Apache worker process
and the application daemon process. What would replace the socket
protocol as the interface would be a C API, and instead of the
application having to implement the socket protocol as a foreign
process, specific language support would be provided by way of a
dynamically loaded plugin. That plugin would then use embedding to
access support for a particular language and just execute code in the
file that the enclosing web server code told it to execute.
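
To make that concrete, here is a purely hypothetical sketch of what
such a plugin interface might look like (none of these names exist in
httpd; they are only illustration):

#include "httpd.h"
#include "apr_pools.h"

/* Hypothetical only: a language plugin vtable that the daemon process
 * would load after fork/exec, keeping the language runtime out of the
 * Apache parent process entirely. */
typedef struct lang_plugin {
    const char *name;                   /* e.g. "python", "perl", "php" */

    /* Initialise the embedded language runtime in the daemon process. */
    int (*init)(apr_pool_t *p, const char *app_script);

    /* Handle one request the worker process has handed over; the web
     * server owns both halves of the channel, so the application never
     * has to implement a wire protocol itself. */
    int (*handle_request)(request_rec *r, const char *app_script);

    /* Shut the runtime down cleanly on (graceful) restart or stop. */
    void (*shutdown)(apr_pool_t *p);
} lang_plugin;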

By way of example, imagine languages such as Python, Perl or Ruby
which in turn now have simplified web server interfaces in the form of
WSGI, PSGI and RACK, or even PHP. In the Apache configuration one
would simply say that a specific file extension is implemented by a
specific named language plugin. One would also indicate that a
separate manager process should be started up for managing processes
for handling any requests for that language.

Only after that separate manager process had been spawned, be it by
just a straight fork or preferably fork/exec, would the specific
language plugin be loaded. This eliminates the problems caused by
complex language modules being preloaded into the Apache parent process
and causing conflicts with other languages. The existing mod_php module
is a good example, causing lots of problems because it drags in
libraries which aren't multithread safe.

That manager process would then spawn its own language specific worker
processes, as configured, for handling actual requests. When the main
asynchronous Apache worker processes receive a request and determine
that the target resource file is related to a specific language, they
then determine how to connect to those language specific worker
processes and proxy the request to them for 

Re: [mod_fcgid] How to share between vhosts (and extensions)

2009-09-23 Thread Graham Dumpleton
FWIW, the Python specific hosting module called mod_wsgi for Apache
implements named daemon process groups, with the ability to control how
WSGI applications are delegated to which process group. This includes
being able to optionally have the process group selected based on the
value of an environment variable set by mod_rewrite. Various directives
can optionally be used to limit which process groups can be delegated
to when using this.

Anyway, not necessarily relevant but may still be of interest. Links
to some of the documentation are:

  http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIProcessGroup
  
http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIDaemonProcess
  
http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIRestrictProcess

Graham

2009/9/23 Rainer Jung rainer.j...@kippdata.de:
 Sorry for the long mail, especially in case all is well-known.

 While looking for the right way to implement something like FCGIDGroup I
 stumbled about something a bit strange in mod_fcgid.

 When looking for an appropriate existing process to handle a request the
 following data is used by mod_fcgid:

 - inode
  inode num of the wrapper if one is used,
  of the request filename otherwise
 - deviceid
  device id of the wrapper if one is used,
  of the request filename otherwise
 - share_grp_id
  Always 0 if no wrapper is used,
  -1 for one of the auth wrappers,
  otherwise an incrementing id that is
  unique across each wrapper with the same path
  (but different vhosts or even only different extensions)
  ids between different paths can overlap
 - virtualhost: the pointer (!) to
  r-server-server_hostname
 - uid: retrieved via ap_run_get_suexec_identity
 - gid: retrieved via ap_run_get_suexec_identity

 If no process is found, a new one is created and carries the above data
 coming from the request (and wrapper) which ran when it was created.

 Going a bit through the history of mod_fcgid, it seems that share_grp_id
 was added to support a directive FCGIWrapperGroup (sic!). Later the
 directive was removed, and the id was automatically incremented to
 ensure separation.

 Again later there was a bug w.r.t. incomplete isolation of the default
 env between vhosts, so virtualhost was added, but seems only necessary
 for the non-wrapper case.

 It seems there's some potential for improvement concerning sharing of
 fastcgi applications.

 1) Non-wrapper case

 The applications are given by the filenames the requests resolve to. So
 the most likely thing that could happen, is that someone wants to share
 the process pools for those between vhosts. I guess for this it would be
 enough to add FCGIDGroup. It would be optionally used once in vhosts to
 set a symbolic name, which will be used to seperate the pools between
 vhosts instead of the server_hostname. By choosing the same FCGIDGroup
 in several vhosts, one can share the process pool.

 2) Wrapper case

 The wrapper is configured via its path name and the optional extension
 it applies to. At the moment independent pools are used for different
 vhosts, but also (!) for different extensions.

 So if we would add FCGIDGroup to a vhost, it would stop the extension
 separation. Another possibility would be, to apply FCGIDGroup only to
 the non-wrapper case and add some syntax to FCGIWrapper to allow
 expressing, which group it belongs to. Unfortunately that directive
 already has two optional arguments, one can be detected, because it
 starts with a dot, the other one is the fixed string virtual. The
 syntax might get ugly, if we need to safely distinguish three optional
 arguments

 Something like

 FCGIDWrapper wrapperpath [groupname] [.extension] [+flag ...]

 and the only flag implemented at the moment is virtual.

 Regards,

 Rainer




Re: DO NOT REPLY [Bug 47087] Incorrect request body handling with Expect: 100-continue if the client does not receive a transmitted 300 or 400 response prior to sending its body

2009-08-30 Thread Graham Dumpleton
2009/8/30 Nick Kew n...@webthing.com:

 On 27 Aug 2009, at 17:22, bugzi...@apache.org wrote:

 It appears that Apache is violating this paragraph from RFC 2616:

      - Upon receiving a request which includes an Expect request-header
        field with the 100-continue expectation, an origin server MUST
        either respond with 100 (Continue) status and continue to read
        from the input stream, or respond with a final status code. The
        origin server MUST NOT wait for the request body before sending
        the 100 (Continue) response. If it responds with a final status
        code, it MAY close the transport connection or it MAY continue
        to read and discard the rest of the request.  It MUST NOT
        perform the requested method if it returns a final status code.

 Looks like we have a problem with the sequence:
 Client asks for 100-continue
 We reply with a final status - e.g. 3xx
 [delay somewhere on the wire]
 Client sends a request body
 We read body as a new request - oops!

 It seems to me that keeping the connection open in this
 instance means inevitable ambiguity over interpretation
 of subsequent data, and the safe course of action is to
 close it.  Otherwise we can read subsequent data line-
 by-line and discard anything that isn't a valid request
 line, at the risk of encountering a false positive in a
 request body.

 +1 for closing the connection.  Any divergent opinions?

FWIW, this area of code was changed somewhere between 2.2.6 and 2.2.9
already. Prior to the change, if a handler sent a 20x response and only
after sending it attempted to read request input, the 100 was being
emitted as part of the response content.

The old code in http_filters.c was:

/* Since we're about to read data, send 100-Continue if needed.
 * Only valid on chunked and C-L bodies where the C-L is > 0. */
if ((ctx->state == BODY_CHUNK ||
    (ctx->state == BODY_LENGTH && ctx->remaining > 0)) &&
    f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1)) {
    char *tmp;
    apr_bucket_brigade *bb;

    tmp = apr_pstrcat(f->r->pool, AP_SERVER_PROTOCOL, " ",
                      ap_get_status_line(100), CRLF CRLF, NULL);
    bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
    e = apr_bucket_pool_create(tmp, strlen(tmp), f->r->pool,
                               f->c->bucket_alloc);
    APR_BRIGADE_INSERT_HEAD(bb, e);
    e = apr_bucket_flush_create(f->c->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, e);

    ap_pass_brigade(f->c->output_filters, bb);
}

and is now:

/* Since we're about to read data, send 100-Continue if needed.
 * Only valid on chunked and C-L bodies where the C-L is > 0. */
if ((ctx->state == BODY_CHUNK ||
    (ctx->state == BODY_LENGTH && ctx->remaining > 0)) &&
    f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1) &&
    !(f->r->eos_sent || f->r->bytes_sent)) {
    if (!ap_is_HTTP_SUCCESS(f->r->status)) {
        ctx->state = BODY_NONE;
        ctx->eos_sent = 1;
    } else {
        char *tmp;

        tmp = apr_pstrcat(f->r->pool, AP_SERVER_PROTOCOL, " ",
                          ap_get_status_line(100), CRLF CRLF, NULL);
        apr_brigade_cleanup(bb);
        e = apr_bucket_pool_create(tmp, strlen(tmp), f->r->pool,
                                   f->c->bucket_alloc);
        APR_BRIGADE_INSERT_HEAD(bb, e);
        e = apr_bucket_flush_create(f->c->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bb, e);

        ap_pass_brigade(f->c->output_filters, bb);
    }
}

The fix was needed for the case where a handler needs to start
streaming a response before it starts processing request content.

If there is no valid reason why, for a non 20x response, a handler
should be able to read request content after having sent a response,
then closing the connection seems the logical thing to do, to avoid
Apache having to read the whole request content and discard it just to
handle a potential subsequent request over the same connection.

Graham


Re: Catching graceful restart in apache2 module

2009-08-04 Thread Graham Dumpleton
2009/8/4 Petr Hracek phrac...@gmail.com:
 I have found in following link: (http://wiki.apache.org/httpd/ModuleLife)

 Race conditions during graceful restart

 During a graceful restart, old children are still serving old requests while
 new children are serving new requests. If the same lock must be used by old
 and new children, then the lock name must be the same and cannot be
 generated with tmpnam() or similar functions in the post_config hook.

 Which lock is means there. I have already found the in the post_config I
 have cleanuped procedure, but in the post_config is already mentioned
 function for killing all session.
 Is there any way how to detect if the restart of apache has been done as
 gracefull or as hard restart?

/**
 * predicate indicating if a graceful stop has been requested ...
 * used by the connection loop
 * @return 1 if a graceful stop has been requested, 0 otherwise
 * @deffunc int ap_graceful_stop_signalled(*void)
 */
AP_DECLARE(int) ap_graceful_stop_signalled(void);
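
A minimal sketch of how a module's own loop might poll it (illustrative
only; the surrounding function here is made up):

#include "ap_mpm.h"
#include "apr_thread_proc.h"
#include "apr_time.h"

/* Sketch: periodic work loop that bails out once a graceful stop
 * has been signalled to this child process. */
static void periodic_work_loop(void)
{
    while (!ap_graceful_stop_signalled()) {
        /* ... do one round of housekeeping ... */
        apr_sleep(apr_time_from_sec(1));
    }
}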

If you haven't already seen it, see:

http://www.fmc-modeling.org/category/projects/apache/amp/4_3Multitasking_server.html

and everything else in that document.

Graham

 best regards
 Petr Hracek

 2009/8/1 Petr Hracek phrac...@gmail.com

 As you mentioned:
 The request pool is no good, because that's cleaned up at the end of the
 request. The connection pool is also no good, because that gets cleaned
 up after the connection dies. You're probably after the pool you're
 given during the post_config hook, which gets destroyed on server
 shutdown (graceful or otherwise).

 It means that in post_config can be handled the server has been shutdown
 with either restart or graceful command for specific pool?
 If I understand right then if pool is opened then it would not end because
 of apache2 has been restarted with option graceful, right?
 Is it behaviour the same when the server is going down in shel with the
 gracefull command?
 Is there any example how to implement in the post_config handler?

 Best regards
 Petr

 2009/7/31 Graham Leggett minf...@sharp.fm

 Petr Hracek wrote:

  Thank for the answer.
 
  Could you please explain in details how to do register save-sessions
  as
  a pool cleanup.

 You call a function that looks like this to register your cleanup:

    apr_pool_cleanup_register(pool, (void *) foo, foo_cleanup,
                foo_cleanup);

 The function foo_cleanup() is a function you write yourself, that does
 whatever you want the cleanup to do:

 static apr_status_t foo_cleanup(void *dummy) {
    foo_t *foo = (foo_t *)dummy;

    ... do stuff using foo ...

    return APR_SUCCESS;
 }

 The variable foo is a void pointer that points to whatever you want your
 cleanup to operate on, such as a pointer to your session config, or
 whatever you want.

 The cleanup gets run when the pool is deleted, ie when someone calls
 apr_pool_destroy() on that pool.

 What you need to do at this point is decide which pool you attach your
 cleanup to.

 The request pool is no good, because that's cleaned up at the end of the
 request. The connection pool is also no good, because that gets cleaned
 up after the connection dies. You're probably after the pool you're
 given during the post_config hook, which gets destroyed on server
 shutdown (graceful or otherwise).

 Regards,
 Graham
 --





Re: Catching graceful restart in apache2 module

2009-08-04 Thread Graham Dumpleton
2009/8/4 Ruediger Pluem rpl...@apache.org:


 On 08/04/2009 09:02 AM, Graham Dumpleton wrote:
 2009/8/4 Petr Hracek phrac...@gmail.com:
 I have found in following link: (http://wiki.apache.org/httpd/ModuleLife)

 Race conditions during graceful restart

 During a graceful restart, old children are still serving old requests while
 new children are serving new requests. If the same lock must be used by old
 and new children, then the lock name must be the same and cannot be
 generated with tmpnam() or similar functions in the post_config hook.

 Which lock is means there. I have already found the in the post_config I
 have cleanuped procedure, but in the post_config is already mentioned
 function for killing all session.
 Is there any way how to detect if the restart of apache has been done as
 gracefull or as hard restart?

 /**
  * predicate indicating if a graceful stop has been requested ...
  * used by the connection loop
  * @return 1 if a graceful stop has been requested, 0 otherwise
  * @deffunc int ap_graceful_stop_signalled(*void)
  */
 AP_DECLARE(int) ap_graceful_stop_signalled(void);

 Is this also true for graceful restarts?
 The comment only talks about graceful stops.

Hmmm, I presumed that the server child process wouldn't know the
difference and that 'stop' here meant 'stop' of an individual process
and not the server as a whole. I guess a bit of digging through the
code is necessary to verify what actually happens.

I could also possibly be wrong in assuming they were wanting to know
about detecting this in a server child process and not in the Apache
parent process. I haven't exactly been following the discussion in detail.

Graham


Re: Catching graceful restart in apache2 module

2009-08-04 Thread Graham Dumpleton
2009/8/4 Graham Dumpleton graham.dumple...@gmail.com:
 2009/8/4 Ruediger Pluem rpl...@apache.org:


 On 08/04/2009 09:02 AM, Graham Dumpleton wrote:
 2009/8/4 Petr Hracek phrac...@gmail.com:
 I have found in following link: (http://wiki.apache.org/httpd/ModuleLife)

 Race conditions during graceful restart

 During a graceful restart, old children are still serving old requests 
 while
 new children are serving new requests. If the same lock must be used by old
 and new children, then the lock name must be the same and cannot be
 generated with tmpnam() or similar functions in the post_config hook.

 Which lock is means there. I have already found the in the post_config I
 have cleanuped procedure, but in the post_config is already mentioned
 function for killing all session.
 Is there any way how to detect if the restart of apache has been done as
 gracefull or as hard restart?

 /**
  * predicate indicating if a graceful stop has been requested ...
  * used by the connection loop
  * @return 1 if a graceful stop has been requested, 0 otherwise
  * @deffunc int ap_graceful_stop_signalled(*void)
  */
 AP_DECLARE(int) ap_graceful_stop_signalled(void);

 Is this also true for graceful restarts?
 The comment only talks about graceful stops.

 Hmmm, I presumed that the server child process wouldn't know the
 difference and that 'stop' here meant 'stop' of an individual process
 and not the server as a whole. I guess a bit of digging through code
 is necessary to verify what actually happens.

 I could also possibly be wrong in assuming they were wanting to know
 about detecting in a server child process and not Apache parent
 process. I haven't exactly been following the discussion in detail.

In prefork that function returns false all the time anyway. :-(

Graham


Re: Events, Destruction and Locking

2009-07-08 Thread Graham Dumpleton
2009/7/8 Graham Leggett minf...@sharp.fm:
 Paul Querna wrote:

 It breaks the 1:1: connection mapping to thread (or process) model
 which is critical to low memory footprint, with thousands of
 connections, maybe I'm just insane, but all of the servers taking
 market share, like lighttpd, nginx, etc, all use this model.

 It also prevents all variations of the slowaris stupidity, because its
 damn hard to overwhelm the actual connection processing if its all
 async, and doesn't block a worker.

 But as you've pointed out, it makes our heads bleed, and locks slow us down.

 At the lowest level, the event loop should be completely async, and be
 capable of supporting an arbitrary (probably very high) number of
 concurrent connections.

 If one connection slows or stops (deliberately or otherwise), it won't
 block any other connections on the same event loop, which will continue
 as normal.

But that, for a multiprocess web server, screws up if you then have a
blocking type model for an application running on top. Specifically,
the greedy nature of accepting connections may mean a process accepts
more connections than it has high level threads to handle. If the
high level threads end up blocking, then any accepted connections for
the blocking high level application, for which request headers are
still being read, or are pending, will be blocked as well even though
another server process may be idle. In the current Apache model a
process will only accept connections if it knows it is able to process
them at that time. If a process doesn't have the threads available,
then a different process would pick them up instead. I have previously
commented on how this causes problems with nginx for potentially
blocking applications running in nginx worker processes. See:

  http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html

To prevent this you are forced to run an event driven system for
everything, and blocking type applications can't be run in the same
process. Thus, anything like that has to be shoved out into a separate
process. FASTCGI was mentioned for that, but frankly I believe
FASTCGI is getting a bit crufty these days. It perhaps really needs to
be modernised, with the byte protocol layout simplified to get rid of
those varying size length indicator bytes. This may have been
warranted when networks were slower and the amount of body data being
passed around smaller, but I can't see that the extra complexity is
warranted any more. FASTCGI also can't handle things like end to end
100-continue processing and perhaps has other problems as well in
respect of handling logging outside of request context etc etc.
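
For reference, the length indicator bytes complained about here are the
FastCGI name/value pair encoding, where each length is one byte if
under 128 and otherwise four bytes with the high bit set. A rough
sketch of the encoder side (illustrative only, not taken from any
existing module):

#include "apr.h"

/* Sketch of the FastCGI name/value pair length encoding (spec section
 * 3.4): lengths < 128 take one byte, anything larger takes four bytes
 * with the top bit of the first byte set. */
static unsigned char *fcgi_encode_length(unsigned char *p, apr_uint32_t len)
{
    if (len < 128) {
        *p++ = (unsigned char)len;
    }
    else {
        *p++ = (unsigned char)(((len >> 24) & 0x7f) | 0x80);
        *p++ = (unsigned char)((len >> 16) & 0xff);
        *p++ = (unsigned char)((len >> 8) & 0xff);
        *p++ = (unsigned char)(len & 0xff);
    }
    return p;
}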

So, I personally would really love to see a good review of FASTCGI,
AJP and any other similar/pertinent protocols done to distill what in
these modern times is required and what would be a better mechanism.
The implementations of FASTCGI could also perhaps be modernised. Of
course, even though FASTCGI may not be the most elegant of systems, it
is probably too entrenched to get rid of. The only way perhaps might
be if an improved version formed the basis of any internal
communications for a completely restructured internal model for Apache
3.0 based on serf, which had segregation between processes handling
static files and applications, with user separation etc etc.

Graham


Re: Help with worker.c

2009-07-08 Thread Graham Dumpleton
In case you haven't already found it, ensure you have a read of:

  
http://www.fmc-modeling.org/category/projects/apache/amp/4_3Multitasking_server.html

It may not address the specific question, but certainly will give you
a better overall picture.

The rest of that book is also worth reading as well.

Graham

2009/7/8 ricardo13 ricardoogra...@gmail.com:

 Hi,

 I'm trying understand worker.c module.
 My doubt is about operation push() and pop().

 Push() add a socket in array fd_queue_t-data and Pop() retrieve a socket
 for processing.

 But what's the order of PUSH() ?? It adds in final queue ??
 And POP() ?? Retrieve a socket only before (elem =
 queue-data[--queue-nelts];) ??

 Thank you
 Ricardo
 --
 View this message in context: 
 http://www.nabble.com/Help-with-worker.c-tp24389140p24389140.html
 Sent from the Apache HTTP Server - Dev mailing list archive at Nabble.com.




Re: Events, Destruction and Locking

2009-07-08 Thread Graham Dumpleton
2009/7/9 Rainer Jung rainer.j...@kippdata.de:
 On 08.07.2009 15:55, Paul Querna wrote:
 On Wed, Jul 8, 2009 at 3:05 AM, Graham
 Dumpletongraham.dumple...@gmail.com wrote:
 2009/7/8 Graham Leggett minf...@sharp.fm:
 Paul Querna wrote:

 It breaks the 1:1: connection mapping to thread (or process) model
 which is critical to low memory footprint, with thousands of
 connections, maybe I'm just insane, but all of the servers taking
 market share, like lighttpd, nginx, etc, all use this model.

 It also prevents all variations of the slowaris stupidity, because its
 damn hard to overwhelm the actual connection processing if its all
 async, and doesn't block a worker.
 But as you've pointed out, it makes our heads bleed, and locks slow us 
 down.

 At the lowest level, the event loop should be completely async, and be
 capable of supporting an arbitrary (probably very high) number of
 concurrent connections.

 If one connection slows or stops (deliberately or otherwise), it won't
 block any other connections on the same event loop, which will continue
 as normal.
 But which for a multiprocess web server screws up if you then have a
 blocking type model for an application running on top. Specifically,
 the greedy nature of accepting connections may mean a process accepts
 more connections which it has high level threads to handle. If the
 high level threads end up blocking, then any accepted connections for
 the blocking high level application, for which request headers are
 still being read, or are pending, will be blocked as well even though
 another server process may be idle. In the current Apache model a
 process will only accept connections if it knows it is able to process
 it at that time. If a process doesn't have the threads available, then
 a different process would pick it up instead. I have previously
 commented how this causes problems with nginx for potentially blocking
 applications running in nginx worker processes. See:

  http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html

 To prevent this you are forced to run event driven system for
 everything and blocking type applications can't be run in same
 process. Thus, anything like that has to be shoved out into a separate
 process. FASTCGI was mentioned for that, but frankly I believed
 FASTCGI is getting a bit crufty these days. It perhaps really needs to
 be modernised, with the byte protocol layout simplified to get rid of
 these varying size length indicator bytes. This may have been
 warranted when networks were slower and amount of body data being
 passed around less, but I can't see that that extra complexity is
 warranted any more. FASTCGI also can't handle things like end to end
 100-continue processing and perhaps has other problems as well in
 respect of handling logging outside of request context etc etc.

 So, I personally would really love to see a good review of FASTCGI,
 AJP and any other similar/pertinent protocols done to distill what in
 these modern times is required and would be a better mechanism. The
 implementations of FASTCGI could also perhaps be modernised. Of
 course, even though FASTCGI may not be the most elegant of systems,
 probably too entrenched to get rid of it. The only way perhaps might
 be if a improved version formed the basis of any internal
 communications for a completely restructured internal model for Apache
 3.0 based on serf which had segregation between processes handling
 static files and applications, with user separation etc etc.

 TBH, I think the best way to modernize FastCGI or AJP is to just proxy
 HTTP over a daemon socket, then you solve all the protocol issues...
 and just treat it like another reverse proxy.  The part we really need
 to write is the backend process manager, to spawn/kill more of these
 workers.

 Though there is one nice feature in the AJP protocol: since it knows
 it's serving via a reverse proxy, the back end patches some
 communication data like it were the front end. So if the context on the
 back end asks for port, protocol, host name etc. it automatically gets
 the data that looks like the one of the front end. That way cookies,
 self-referencing links etc. work right.

 Most of that can be simulated by appropriate configuration with HTTP to
 (yes, there are a lot of proxy options for this), but in AJP its
 automatic. Some parts are not configurable right now, like e.g. the
 client IP. You always have to introduce something that's aware e.g. of
 the X-Forwarded-For header. Another example would be whether the
 communication to the reverse proxy was via https. You can transport all
 that info va custom headers, but the backend usually doesn't know how to
 handle it.

Yes, these are the sort of things which it would be nice to have
transparent. Paul's comment is valid though in that HTTP itself could
be used as the protocol. Right now you couldn't do that over a UNIX
socket though for a local backend process, and you lose the ability to
feed back 

Re: Where Do I Create Queues in MPM Worker

2009-07-07 Thread Graham Dumpleton
2009/7/7 ricardo13 ricardoogra...@gmail.com:

 Hi,

 Sorry, I didn't know that was in wrong forum. What's the best list to write
 this doubt ??

You may well be on the right list, but right now it isn't too clear
that you really need to be modifying the actual MPM code.

 I want to modify MPM Worker (worker.c) to develop some scheduling
 algorithms.

 A first scheduling algorithm would be implement priority. Two queues
 (worker_queue1 and worker_queue2) of sockets where threads (workers) get
 all requests from worker_queue1 first, afterget all requests from
 worker_queue2.

By what criteria would requests get delegated to each queue? In other
words, what is the high level outcome you are trying to achieve? For
example, are you trying to give priority to certain virtual hosts or
listener ports?

Graham

 That is what I wanted to do.

 Thank you
 Ricardo



 Graham Dumpleton-2 wrote:

 Rather than keep demanding an answer to how to do whatever it is you
 want, that you explain why you want to do it in the first place. Given
 what looks like a rather inadequate knowledge of Apache, it is quite
 likely you are going about it all the completely wrong way. So, give
 some context about why you need it and people may be able to give more
 informed answers. At which point we may also be able to suggest you
 are in the wrong forum anyway and that you can do it as a module and
 so should use modules-dev list and not the list for development of the
 core of httpd.

 Graham

 2009/7/7 ricardo13 ricardoogra...@gmail.com:

 Hi all,

 Can anybody explain what's doing the function worker_thread in worker.c ?

 I dont't know APR and don't undestood the following lines:

        worker_sockets[thread_slot] = csd;
        bucket_alloc = apr_bucket_alloc_create(ptrans);
        process_socket(ptrans, csd, process_slot, thread_slot,
 bucket_alloc); // Here processing the csd socket ??
        worker_sockets[thread_slot] = NULL;
        requests_this_child--; /* FIXME: should be synchronized - aaron */

 I need know it.

 Thank you
 Ricardo


 ricardo13 wrote:

 Anyone ??


 ricardo13 wrote:

 Hi all,

 I would like to know how I create other queue of requests ?? Where I
 create ?? worker.c ??

 Thank you
 Ricardo





 --
 View this message in context:
 http://www.nabble.com/Where-Do-I-Create-Queues-in-MPM-Worker-tp24354526p24357634.html
 Sent from the Apache HTTP Server - Dev mailing list archive at
 Nabble.com.





 --
 View this message in context: 
 http://www.nabble.com/Where-Do-I-Create-Queues-in-MPM-Worker-tp24354526p24370202.html
 Sent from the Apache HTTP Server - Dev mailing list archive at Nabble.com.




Re: Where Do I Create Queues in MPM Worker

2009-07-07 Thread Graham Dumpleton
2009/7/7 ricardo13 ricardoogra...@gmail.com:



 Graham Dumpleton-2 wrote:

 2009/7/7 ricardo13 ricardoogra...@gmail.com:

 Hi,

 Sorry, I didn't know that was in wrong forum. What's the best list to
 write
 this doubt ??

 You may well be on the right list, but right now it isn't too clear
 that you really need to be modifying the actual MPM code.

 I want to modify MPM Worker (worker.c) to develop some scheduling
 algorithms.

 A first scheduling algorithm would be implement priority. Two queues
 (worker_queue1 and worker_queue2) of sockets where threads (workers)
 get
 all requests from worker_queue1 first, afterget all requests from
 worker_queue2.

 By what criteria would requests get delegated to each queue? In other
 words, what is the high level outcome you are trying to achieve. For
 example, are you trying to give priority to certain virtual hosts or
 listener ports???

 Firstly, The requests would be classified (module of classify) by IP.

 For example:
    If IP = x then forward_queue_1();
    else if IP = y then forward_queue_2();


 I want explain. I'm studying graduate and my final test is a project about
 webservers.
 I choose subject about QoS in webservers (application-level). The concepts
 about QoS are apply in network-level.

Have you seen:

http://mod-qos.sourceforge.net/

Not sure how much it overlaps what you are wanting to do.

Graham

 Thank you
 Ricardo


 Graham

 That is what I wanted to do.

 Thank you
 Ricardo



 Graham Dumpleton-2 wrote:

 Rather than keep demanding an answer to how to do whatever it is you
 want, that you explain why you want to do it in the first place. Given
 what looks like a rather inadequate knowledge of Apache, it is quite
 likely you are going about it all the completely wrong way. So, give
 some context about why you need it and people may be able to give more
 informed answers. At which point we may also be able to suggest you
 are in the wrong forum anyway and that you can do it as a module and
 so should use modules-dev list and not the list for development of the
 core of httpd.

 Graham

 2009/7/7 ricardo13 ricardoogra...@gmail.com:

 Hi all,

 Can anybody explain what's doing the function worker_thread in worker.c
 ?

 I dont't know APR and don't undestood the following lines:

        worker_sockets[thread_slot] = csd;
        bucket_alloc = apr_bucket_alloc_create(ptrans);
        process_socket(ptrans, csd, process_slot, thread_slot,
 bucket_alloc); // Here processing the csd socket ??
        worker_sockets[thread_slot] = NULL;
        requests_this_child--; /* FIXME: should be synchronized - aaron
 */

 I need know it.

 Thank you
 Ricardo


 ricardo13 wrote:

 Anyone ??


 ricardo13 wrote:

 Hi all,

 I would like to know how I create other queue of requests ?? Where I
 create ?? worker.c ??

 Thank you
 Ricardo





 --
 View this message in context:
 http://www.nabble.com/Where-Do-I-Create-Queues-in-MPM-Worker-tp24354526p24357634.html
 Sent from the Apache HTTP Server - Dev mailing list archive at
 Nabble.com.





 --
 View this message in context:
 http://www.nabble.com/Where-Do-I-Create-Queues-in-MPM-Worker-tp24354526p24370202.html
 Sent from the Apache HTTP Server - Dev mailing list archive at
 Nabble.com.





 --
 View this message in context: 
 http://www.nabble.com/Where-Do-I-Create-Queues-in-MPM-Worker-tp24354526p24370640.html
 Sent from the Apache HTTP Server - Dev mailing list archive at Nabble.com.




Re: Where Do I Create Queues in MPM Worker

2009-07-06 Thread Graham Dumpleton
Rather than keep demanding an answer to how to do whatever it is you
want, it would help if you explained why you want to do it in the first
place. Given
what looks like a rather inadequate knowledge of Apache, it is quite
likely you are going about it all the completely wrong way. So, give
some context about why you need it and people may be able to give more
informed answers. At which point we may also be able to suggest you
are in the wrong forum anyway and that you can do it as a module and
so should use modules-dev list and not the list for development of the
core of httpd.

Graham

2009/7/7 ricardo13 ricardoogra...@gmail.com:

 Hi all,

 Can anybody explain what's doing the function worker_thread in worker.c ?

 I dont't know APR and don't undestood the following lines:

        worker_sockets[thread_slot] = csd;
        bucket_alloc = apr_bucket_alloc_create(ptrans);
        process_socket(ptrans, csd, process_slot, thread_slot,
 bucket_alloc); // Here processing the csd socket ??
        worker_sockets[thread_slot] = NULL;
        requests_this_child--; /* FIXME: should be synchronized - aaron */

 I need know it.

 Thank you
 Ricardo


 ricardo13 wrote:

 Anyone ??


 ricardo13 wrote:

 Hi all,

 I would like to know how I create other queue of requests ?? Where I
 create ?? worker.c ??

 Thank you
 Ricardo





 --
 View this message in context: 
 http://www.nabble.com/Where-Do-I-Create-Queues-in-MPM-Worker-tp24354526p24357634.html
 Sent from the Apache HTTP Server - Dev mailing list archive at Nabble.com.




Re: httpd initd daemon

2009-06-29 Thread Graham Dumpleton
2009/6/29 Yahav bi...@lucent.com:

 i would like to set the httpd instance to run as standard linux daemon. the
 daemon should be controlled by the init daemon. the problem is that the
 apachectl that runs the httpd is starting the main server process then
 forking N StarServers and return 0 or something else. I would like it to be
 hang while it run i.e. right before exiting addin select command that will
 listen on some signal, like SIGTERM.
 is there any way to add it? if so can somebody recomands what is the best
 place to make the change? is there allready such feature?

Have you tried:

  httpd -DFOREGROUND

instead of apachectl?

Read the httpd manual page and Google search on that for more information.

Graham


Re: Mitigating the Slowloris DoS attack

2009-06-24 Thread Graham Dumpleton
2009/6/24 Kevin J Walters kevin.walt...@morganstanley.com:

 M == Matthieu Estrade mestr...@apache.org writes:

 M More granular timeout and maybe adaptative timeout is also IMHO a good
 M way to improve resistance to this kind of attack.

 The current 1.3, 2.0 and 2.2 documentation is in agreement too!

 I believe the ssl module also takes its timeout value from this
 setting. It would be great if that was separately configurable too to
 cater for those intent on doing partial ssl handshakes.


  The TimeOut directive currently defines the amount of time Apache will wait 
 for three things:

   1. The total amount of time it takes to receive a GET request.
   2. The amount of time between receipt of TCP packets on a POST or PUT 
 request.
   3. The amount of time between ACKs on transmissions of TCP packets in 
 responses.

  We plan on making these separately configurable at some point down the
  road. The timer used to default to 1200 before 1.2, but has been
  lowered to 300 which is still far more than necessary in most
  situations. It is not set any lower by default because there may still
  be odd places in the code where the timer is not reset when a packet
  is sent.

From what I understand, the server timeout value is also used to break
deadlocks in mod_cgi due to POST data being greater than the UNIX
socket buffer size and the CGI script not reading the POST data before
returning a response greater than the UNIX socket buffer size. In
other words, the CGI script blocks because the Apache server child
process isn't reading the response, while the Apache server child
process is blocked still waiting for the CGI script to consume the
POST data. The timeout value breaks the deadlock. In this context,
making the timeout too small a value may have unintended consequences
and affect how CGI scripts work, so a separate timeout for mod_cgi
would be preferable.
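
To make the deadlock concrete, a made up CGI program along these lines
(not anything from the Apache tree) should trigger it when sent a large
enough POST body:

  #include <stdio.h>

  int main(void)
  {
      int i;

      printf("Content-Type: text/plain\r\n\r\n");

      /* Never read the POST data from stdin, and write a response
         larger than the UNIX socket buffer. Apache blocks writing the
         POST data to the script, and the script blocks writing the
         response back to Apache. */
      for (i = 0; i < 500000; i++)
          putchar('x');

      return 0;
  }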

FWIW, the mod_cgid module doesn't appear to have this deadlock
detection so in practice this issue could in itself be used as a
denial of service vector when mod_cgid is used as it will completely
lock up the Apache child server thread with no failsafe to unblock it.
I have brought this issue up before on the list to get someone else to
analyse mod_cgid code and see if what I see is correct or not, but no
one seemed interested at the time, so took it that people didn't see
it as important. It may not have been seen as such a big issue as on
Linux systems the UNIX socket buffer size is in the order of 220KB. On
MacOS X though, the UNIX socket buffer size is only 8KB, so it is much
easier to trigger. Unlike SendBufferSize and ReceiveBufferSize, there
are no directives to override these buffer sizes for mod_cgi and
mod_cgid.
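
If such directives did exist, presumably they would amount to little
more than a setsockopt() call on the UNIX socket, something like this
purely illustrative helper (not existing mod_cgi/mod_cgid code):

  #include <sys/socket.h>

  static int set_unix_socket_buffers(int fd, int bytes)
  {
      /* Enlarge both the send and receive buffers on the socket. */
      if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) == -1)
          return -1;
      return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes));
  }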

Graham


Re: Mitigating the Slowloris DoS attack

2009-06-22 Thread Graham Dumpleton
2009/6/23 Weibin Yao nbubi...@gmail.com:
 William A. Rowe, Jr. at 2009-6-23 2:00 wrote:

 Andreas Krennmair wrote:


 * Guenter Knauf fua...@apache.org [2009-06-22 04:30]:


 wouldnt limiting the number of simultanous connections from one IP
 already help? F.e. something like:
 http://gpl.net.ua/modipcount/downloads.html


 Not only would this be futile against the Slowloris attack (imagine n
 connections from n hosts instead of n connections from 1 host), it would
 also potentially lock out groups of people behind the same NAT gateway.


 FWIW mod_remoteip can be used to partially mitigate the weakness of this
 class of solutions.

 However, it only works for known, trusted proxies, and can only be safely
 used for those with public IP's.  Where the same 10.0.0.5 on your private
 NAT backed becomes the same 10.0.0.5 within the apache server's DMZ, the
 issues like Allow from 10.0.0.0/8 become painfully obvious.  I haven't
 found a good solution, but mod_remoteip still needs one, eventually.



 I have an idea to mitigate the problem: put the Nginx as a reverse proxy
 server in the front of apache.

Although your comment is perhaps heresy here, it does highlight one of
the things that nginx is good at, even if you don't use it to serve
static files with Apache handling just the dynamic web application.
That is, that it can isolate Apache from slow clients, whether that be
an attack as in this case, or just normal users using slow networks.
The proxy module of nginx in the way it will buffer up request content
to disk before actually sending the request onto the backend also
helps by not tying up Apache's limited request handler threads until
the request content is completely available, although, nginx does have
an upper limit on this at some point and will still stream when the
post content is large enough.

The nginx server works better at avoiding problems with slow clients
because it is event driven rather than threaded and so can handle more
connections without needing to tie up expensive threads.
Unfortunately, trying to make socket accept handling in Apache be
event driven and for requests to only be handed off to a thread for
processing when ready can introduce its own problems. This is because
an event driven system can tend to greedily accept new socket
connections. In a multiprocess server configuration this can mean that
a single process may accept more than its fair share of socket
connections and by the time it has read the initial request headers,
may not have enough available threads to handle the requests. In the
mean time, another server process, which did not get in quick enough
to accept some of the connections, could be sitting there idle. How you
mediate between multiple servers to avoid this sort of problem would
be tricky, if it can be done at all.

Anyway, now for a harebrained suggestion that could bring some of
this nginx goodness to Apache. Although no doubt it would have various
limitations which to solve properly and be integrated seamlessly into
Apache would require some changes in the core.

The idea here is to have an Apache module which spawns off its own
child process which implements a very small lightweight event driven
proxy that listens on the real listener sockets you want to expose.
This processes sole job would then be to handle reading in the request
headers, and perhaps optionally buffering up request content, and then
squirt it across to real Apache child server processes to be handled
when it has all the information it needs. To that end, it wouldn't be
a general purpose proxy but quite customised. As such, it could even
perhaps be made more efficient than nginx in the way it is used to
protect Apache from such things as slow clients.
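
Stripped of all the hard parts, the core of that front process is just
a callback in an event loop which buffers until the request header
block is complete before handing anything on. A rough sketch, with made
up names and none of the error handling a real implementation would
need:

  #include <string.h>
  #include <unistd.h>

  struct conn {
      int fd;            /* client socket, in non blocking mode */
      char buf[8192];    /* accumulated request headers */
      size_t len;
  };

  /* Called by the event loop whenever the client socket is readable. */
  static void on_client_readable(struct conn *c)
  {
      ssize_t n = read(c->fd, c->buf + c->len, sizeof(c->buf) - 1 - c->len);

      if (n <= 0) {
          close(c->fd);  /* client gone, or headers too big for this sketch */
          c->fd = -1;
          return;
      }

      c->len += (size_t)n;
      c->buf[c->len] = '\0';

      if (strstr(c->buf, "\r\n\r\n") != NULL) {
          /* Headers are complete. Only now would the socket and the
             buffered data be passed across to an Apache child worker
             process, so a slow client never ties up a worker thread. */
      }
  }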

For HTTP at least, this probably wouldn't be too hard to do and
wouldn't likely need any changes to the core. You could even have
whether you use it be optional to the extent of it only applying to
certain virtual hosts. Where it does all get a lot harder though is
with virtual hosts which use HTTPS.

So, that is my crazy thought for the day and I am sure that it will be
derided for what it is worth.

I still find the thought interesting though and it falls into that
class of things I find interesting due to the challenge it presents.
:-)

Graham


Re: Mitigating the Slowloris DoS attack

2009-06-21 Thread Graham Dumpleton
2009/6/22 Guenter Knauf fua...@apache.org:
 Hi Andreas,
 Andreas Krennmair schrieb:
 For those who are still unaware of the Slowloris attack, it's a
 denial-of-service attack that consumes Apache's resources by opening up
 a great number of parallel connections and slowly sending partial
 requests, never completing them. Since Apache limits the number of
 parallel clients it serves (the MaxClients setting), this blocks further
 requests from being completed. Unlike other traditional TCP DoS
 attacks, this HTTP-based DoS attack requires only very little network
 traffic in order to be effective.  Information about the Slowloris
 attack including a PoC tool was published here:
 http://ha.ckers.org/slowloris/

 I thought for some time about the whole issue, and then I developed a
 proof-of-concept patch for Apache 2.2.11 (currently only touches the
 prefork MPM), which you can download here:
 http://synflood.at/tmp/anti-slowloris.diff
 wouldnt limiting the number of simultanous connections from one IP
 already help? F.e. something like:
 http://gpl.net.ua/modipcount/downloads.html

Not if the attack is launched from a botnet, which is the more likely
scenario for people who really want to hide their tracks.

BTW, focus here seems to be on the reading of the request headers
themselves. Can't trickling of actual request content data to a URL
equally tie up handler threads? Either in the case where the request
handler is doing the reads of request content, or, for the case of a
success status, by ap_discard_request_body() at the end of the request
where HTTP/1.1 and keep alive are requested.

The only difference really is that if done with request headers,
nothing would be logged about it in access logs, so not easy to track.

Graham


Re: Some ramblings on httpd config

2009-06-09 Thread Graham Dumpleton
2009/6/9 Akins, Brian brian.ak...@turner.com:
 On 6/5/09 11:31 PM, Graham Dumpleton graham.dumple...@gmail.com wrote:

 This last example wasn't even related to driving configuration. It was
 in practice an actual handler hook implementation for request
 processing, not configuration phases.

 The way I see it, we have artificially separated configuration from request
 processing.

Well, I believe that separation already exists. How you would do
things in each context would be different. See mod_perl for example:

  http://perl.apache.org/docs/2.0/api/Apache2/PerlSections.html

How you write its configuration stuff is quite different to an actual handler.

 If you squint and tilt your head just right, you can see that
 virtualhosts today are really just syntactical sugar over the if/else logic
 inside of core:

Except that for a handler the calculation is more dynamic and not driven
through statically defined data structures set up in the configuration phase.

 Some pseudo request processing code to do same thing:
  if listening_port == 80 then
     if r.hostname == 'www.foo.com' then
         
     elseif r.hostname =~ /www\d.bar.[org|net]/
     end
  end


 Of course this could be further hidden from users with
 macros/functions/libraries/modules...

 Now, on the practical side, do we completely ditch the current config
 system.  Part of me says yes, but I know that will be -1'd to death.  So,
 I'd just like the ability to do something like this:

 LoadModule lua_module mod_lua.so
 Listen 80
 LuaRequestHandler /path/to/my/lua/handler.lua

Huh, are you sure you can't do that now somehow? The vhost module uses
the translate name hook, so if you use LuaHookTranslateName I would presume
it would be possible to do something equivalent in lua.

 (or it can be inline Lua but have found that to be somewhat cumbersome)

 Because I don't want to rewrite mod_proxy in lua, it'd be nice to have just
 a little bit of glue that would allow me to use it in a more scripty sort
 of way:

 LoadModule proxy_module mod_proxy.so
 LoadModule proxy_http_module mod_proxy_http.so
 
  require httpd.proxy -- provided by mod_proxy glue

  p = httpd.proxy.get_url('http://blah')

Again, are you sure there aren't already ways of doing that with
mod_lua? Setting up proxying is simple enough to do in mod_python and
would be possible with mod_perl as well. For mod_python see example
in:

  http://issues.apache.org/jira/browse/MODPYTHON-141

If mod_lua provides equivalent wrappers for request object, would be
done in same way.

 (Of course, that example could be handled like we do in mod_rewrite)

 Currently, we can sorta do most request processing in lua. (FWIW, do the
 request phases make any sense in a world where the entire request process is
 handled by a script??)  What is missing is the glue to the other, useful
 parts of httpd - like cache, mod_dbd, proxy, etc.

Getting a bit confused here. You acknowledge you know about request
processing phases, but at the same time say you would like to see
stuff that I would have thought was already possible.

Note that I haven't used mod_lua/mod_wombat, so maybe it doesn't give you
the level of programmability that modules such as mod_perl and mod_python do.

 Sure, one of us could hack together some example glue here and there, but
 until we as a whole get why this is useful/important, it will be just
 another list of patches waiting to be reviewed.

 --
 Brian Akins
 Chief Operations Engineer
 Turner Digital Media Technologies




Re: Some ramblings on httpd config

2009-06-09 Thread Graham Dumpleton
2009/6/9 Jorge Schrauwen jorge.schrau...@gmail.com:
 As long as the current system isn't replaced by an entire runtime like
 program approach I'd be okay with it.

 But why not take it a step further than just lua?
 Wouldn't it be possible to expose a standardized set of commands,
 functions, objects, whatnot to any language?

The standard objects are things like request_rec, server_rec etc etc.
Ie., Apache's own internal data structures. Modules such as mod_perl
and mod_python provide wrappers for these. The standardised set of
commands are the Apache and APR functions which are also wrapped.

Or are you talking about higher level functionality? A lot of higher
level functionality is, I understand, provided in modules associated
with mod_perl already. In other words, at the moment any higher level
encapsulation of functionality is up to the specific scripting language
module. Some provide more than others.

Graham

 That start with mod_lua as the initial implementation but if at a
 later date someone makes mod_lisp/mod_java/.. they all share about the
 same objects where just the syntax would be different. Also the glue
 would then also extend to not just lua.

 That would also help to make it documenting it language independently
 more doable. (wow thats a word?)

 you could then talk about a request object having properties x,y and
 z and it doesn't matter if you manipulate it via lua,java,perl

 Then again this would probably cause a whole lot of overhead and would
 force mod_lua to be rewriting a lot I guess.


 ~Jorge



 On Tue, Jun 9, 2009 at 2:49 PM, Akins, Brianbrian.ak...@turner.com wrote:
 On 6/5/09 11:31 PM, Graham Dumpleton graham.dumple...@gmail.com wrote:

 This last example wasn't even related to driving configuration. It was
 in practice an actual handler hook implementation for request
 processing, not configuration phases.

 The way I see it, we have artificially separated configuration from request
 processing.  If you squint and tilt your head just right, you can see that
 virtualhosts today are really just syntactical sugar over the if/else logic
 inside of core:

 Some pseudo request processing code to do same thing:
  if listening_port == 80 then
     if r.hostname == 'www.foo.com' then
         
     elseif r.hostname =~ /www\d.bar.[org|net]/
     end
  end


 Of course this could be further hidden from users with
 macros/functions/libraries/modules...

 Now, on the practical side, do we completely ditch the current config
 system.  Part of me says yes, but I know that will be -1'd to death.  So,
 I'd just like the ability to do something like this:

 LoadModule lua_module mod_lua.so
 Listen 80
 LuaRequestHandler /path/to/my/lua/handler.lua

 (or it can be inline Lua but have found that to be somewhat cumbersome)

 Because I don't want to rewrite mod_proxy in lua, it'd be nice to have just
 a little bit of glue that would allow me to use it in a more scripty sort
 of way:

 LoadModule proxy_module mod_proxy.so
 LoadModule proxy_http_module mod_proxy_http.so
 
  require httpd.proxy -- provided by mod_proxy glue

  p = httpd.proxy.get_url('http://blah')


 (Of course, that example could be handled like we do in mod_rewrite)

 Currently, we can sorta do most request processing in lua. (FWIW, do the
 request phases make any sense in a world where the entire request process is
 handled by a script??)  What is missing is the glue to the other, useful
 parts of httpd - like cache, mod_dbd, proxy, etc.

 Sure, one of us could hack together some example glue here and there, but
 until we as a whole get why this is useful/important, it will be just
 another list of patches waiting to be reviewed.

 --
 Brian Akins
 Chief Operations Engineer
 Turner Digital Media Technologies





Re: Some ramblings on httpd config

2009-06-05 Thread Graham Dumpleton
2009/6/6 Rich Bowen rbo...@rcbowen.com:

 On Jun 4, 2009, at 22:53, Graham Leggett wrote:

 This approach doesn't require any changes to httpd itself as the
 ability to do this becomes a feature of just the module supporting
 that scripting language, eg, mod_lua. The same could also be done for
 other scripting languages.

 So, if your aim is to be able to do everything within the one Apache
 configuration file, rather than out in separate scripts, this would
 seem in part to satisfy the requirement.

 I like this.

 In theory, you could have a mod_java, or anything really.


 As one of the folks who answers the How Do I questions every day, while
 that would indeed be neato and nifty, it behooves us to pick a configuration
 file syntax, not say do it in whatever language amuses you.

 We've had configuration in Perl for years, but we don't push it because most
 of our audience looks at us like aliens when we suggest it. The folks savvy
 enough to use the Perl configuration can go find it and do it that way, and
 can indeed do powerful things with it.

 But the vast majority of the folks that actually admin the server don't want
 to be told to script their configuration in the programming language of
 their choice. They want a howto recipe, and they want it to work without
 having fiddle about with learning complicated syntax.

 I'd also humbly request that we *not* put Lua in the configuration
 directives. If folks are configuring a virtual host, they aren't going to be
 looking for directives starting with Lua*. Over the years we seem to have
 moved from giving configuration directives whatever name sprang to mind, to
 giving them function-based names that people will actually find in the
 documentation. Let's not scrap that. Our users don't care that it's
 implemented in Lua, and shouldn't have to care.

This last example wasn't even related to driving configuration. It was
in practice an actual handler hook implementation for request
processing, not configuration phases.

The intent is not to replace current Apache configuration mechanism
but leave it as is. I was just highlighting, for the convenience factor,
that for simple stuff scripting modules could allow handler implementations
to be defined in the Apache configuration file itself rather than in a
separate file/module. Whether a particular scripting module does that
would be up to that module. No intention to change the core of Apache.

Thus, you might have this ability for request handler phase hook
implementations to be in configuration file with mod_lua, mod_wsgi,
mod_python, mod_tcl, mod_other_scripting language.

I don't use mod_perl, but if I remember correctly, mod_perl only
allows this for configuration generation and not for request handler
phase hook implementations as I am talking about here.

Anyway, this would all be a power user thing and not the only way of
doing things and most certainly kept away from inexperienced people.

Graham


Re: Some ramblings on httpd config

2009-06-04 Thread Graham Dumpleton
2009/6/4 Akins, Brian brian.ak...@turner.com:
 On 6/3/09 7:50 PM, Plüm, Rüdiger, VF-Group ruediger.pl...@vodafone.com
 wrote:

 1. There are many and large and complex configurations out in the world.

 Which is exactly why I want/need a better way to do them.  I'm currently
 using a template system to generate them.  However I wind up with dozens
 (and dozens) of vhosts that sometimes only vary by a statement or too.

Would mod_macro help?

http://www.coelho.net/mod_macro/

Graham

 A contrived example:

  -- lots of stuff
  if r.hostname == www.domain.com then
     -- a couple of things specific to this domain
  elseif r.hostname == www.domain2.com then
 ...


 Of course, this example may could have a little lua glue to handle this
 situation.


 Also, I'm not just talking about lua being the config language, I want lua
 to drive the httpd process.  Ie, the above code gets ran on every request.

 2. I admit that some improvements are needed. How about an approach that
 allows
    to embed a macro / scripting language into the current configuration 
 system
    that allows you to do more advanced things if you need to.

 If we provided enough glue within our modules for lua, this this would be
 fairly easy.  I already fake this a bit with mod_lua and handlers that do
 most of the work.
 --
 Brian Akins
 Chief Operations Engineer
 Turner Digital Media Technologies




Re: Some ramblings on httpd config

2009-06-04 Thread Graham Dumpleton
2009/6/4 Akins, Brian brian.ak...@turner.com:
 On 6/4/09 8:14 AM, Jorge Schrauwen jorge.schrau...@gmail.com wrote:

 Like Graham mentioned mod_macro can be of some use here. however since
 I'm looping in  perl I may as well push the 4 lines needed to httpd
 instead of a one line macro replacemen.

 Okay, I'm not explaining my self well.  I'm not just talking about
 generating the internal configuration from lua (or xml, or macro, etc.)

 But actually running httpd using lua.  The request handling is done in lua
 - or rather driven by lua.

 See http://redmine.lighttpd.net/projects/lighttpd/wiki/AbsoLUAtion

 A side note: I think the vhost concept we have now is lacking.  The
 separation that we have just isn't necessary and makes some common tasks
 hard.  I'm not really willing to fight for this one though ;)  I can fake
 what I want if I can just load up a lua handler to handle the request from
 post_read to handler.

Since you are talking here about runtime decisions based on a specific
request, rather than auto generation of the static configuration in
once off phase, it almost sounds like all you need is a way of having
the lua code which would be associated with a specific handler in the
configuration file rather than having to be in a separate file. So,
instead of:

LuaHookTranslateName lib/hooks.lua trans_name

allowing something like:

  <LuaHookTranslateName trans_name>
  function trans_name(r)
     ...
  end
  </LuaHookTranslateName>

This approach doesn't require any changes to httpd itself as the
ability to do this becomes a feature of just the module supporting
that scripting language, eg, mod_lua. The same could also be done for
other scripting languages.

So, if your aim is to be able to do everything within the one Apache
configuration file, rather than out in separate scripts, this would
seem in part to satisfy the requirement.

Graham


Re: Some ramblings on httpd config

2009-06-03 Thread Graham Dumpleton
2009/6/4 Plüm, Rüdiger, VF-Group ruediger.pl...@vodafone.com:
 2. I admit that some improvements are needed. How about an approach that 
 allows
   to embed a macro / scripting language into the current configuration system
   that allows you to do more advanced things if you need to.
   (OK, yes this proposal contradicts some of the downsides I mentioned in
    1.1. So I am not consistent here :-)).

My memory could be rusty, but I thought mod_perl had a way of doing
that already, at least for when mod_perl is being used. Thus, there is
perhaps precedent for that unless I am going senile.

Graham


Re: [RFC] A new hook: invoke_handler and web-application security

2009-04-09 Thread Graham Dumpleton
2009/4/9 KaiGai Kohei kai...@ak.jp.nec.com:
 William A. Rowe, Jr. wrote:
 KaiGai Kohei wrote:
 However, SElinux does not allow to revert its privilege (security context)
 unconditionally, even if it is dynamically changed.
 If we want to revert it, the security policy has to allow B->A in addition
 to A->B, but it is generally nonsense.
 It is also the reason why we need a one-time thread or process to assign
 individual privileges for each requests.

 Sounds like it's time for you to hack up an alternate, selinux based mpm.

 I also think a selinux based (or possible for other secure os) mpm
 is a reasonable candidate.

 Due to the above limitation, this mpm need to create a process or
 thread for each requests, and not to allow keep-alive mode.

 If the approach can be acceptable, I will switch to develop the new
 mpm approach.

Which gets back to the old perchild MPM perhaps being in part
relevant. The difference is that you need a more dynamic system
whereby which specific user process is used might be based on URL or
authentication credentials as well as host. Another aspect worth
consideration is a means to dynamically create additional processes
for new users, rather than everything being static, with an idle
timeout mechanism to shutdown user processes which haven't had to
handle requests for some amount of time. This approach obviously need
not even involve SELinux specifically as separation is achieved at the
process level.

FWIW, this dynamic user process creation is something which is being
implemented in an Apache module I develop. That though is being done at
higher level and only applies to the web applications written in the
specific scripting language that the module supports, and isn't a
generic mechanism applicable to all Apache modules.

Graham


Re: [RFC] A new hook: invoke_handler and web-application security

2009-04-09 Thread Graham Dumpleton
2009/4/9 KaiGai Kohei kai...@ak.jp.nec.com:
 Graham Dumpleton wrote:
 2009/4/9 KaiGai Kohei kai...@ak.jp.nec.com:
 William A. Rowe, Jr. wrote:
 KaiGai Kohei wrote:
 However, SElinux does not allow to revert its privilege (security context)
 unconditionally, even if it is dynamically changed.
 If we want to revert it, the security policy has to allow B->A in addition
 to A->B, but it is generally nonsense.
 It is also the reason why we need a one-time thread or process to assign
 individual privileges for each requests.
 Sounds like it's time for you to hack up an alternate, selinux based mpm.
 I also think a selinux based (or possible for other secure os) mpm
 is a reasonable candidate.

 Due to the above limitation, this mpm need to create a process or
 thread for each requests, and not to allow keep-alive mode.

 If the approach can be acceptable, I will switch to develop the new
 mpm approach.

 Which gets back to the old perchild MPM perhaps being in part
 relevant. The difference is that you need a more dynamic system
 whereby which specific user process is used might be based on URL or
 authentication credentials as well as host. Another aspect worth
 consideration is a means to dynamically create additional processes
 for new users, rather than everything being static, with an idle
 timeout mechanism to shutdown user processes which haven't had to
 handle requests for some amount of time. This approach obviously need
 not even involve SELinux specifically as separation achieved at
 process
 level.

 I also think the mpm is not necessary to focus on SELinux.
 If it just create a one-time thread or process for each request,
 an pluggable module can set privileges of the execution context.
 It gives a chance for users to assign SELinux's privileges.
 In addition, someone may choose other operating system.

 FWIW, this dynamic user process creation is something which is being
 implemented in Apache module I develop. That though is being done at
 higher level and only applies to the web applications written in the
 specific scripting language that the module supports, and isn't a
 generic mechanism applicable to all Apache modules.

 Hmm... what I would like to achieve is a bit different.

Although what I am working on is for a specific scripting language,
the concept is just as applicable at the MPM level and so would
achieve what you want if implemented at that level. The existing
perchild MPM already in part illustrates that.

 The reason why I would like to set privilege prior to the invocation
 of contents handler is to apply consistent access controls independent
 from what kind of script languages are used.

I understand that, but you seem to be focused on the idea of using
threads within a process and thus require SELinux security contexts,
with its limitations. If distinct processes are used, and they need
not be created for just a single request but held around while ever
there is activity related to that user, then you do not need SELinux
and so it is portable across more platforms. SELinux security contexts
would only be relevant to a process oriented solution if for some
reason you wanted to set greater constraints than what a user would
normally have imposed on them for a normal process run as that user.

Graham


Re: [RFC] A new hook: invoke_handler and web-application security

2009-04-09 Thread Graham Dumpleton
2009/4/9 KaiGai Kohei kai...@ak.jp.nec.com:
 Graham Dumpleton wrote:
 2009/4/9 KaiGai Kohei kai...@ak.jp.nec.com:
 The reason why I would like to set privilege prior to the invocation
 of contents handler is to apply consistent access controls independent
 from what kind of script languages are used.
 I understand that, but you seem to be focused on the idea of using
 threads within a process and thus require SELinux security contexts,
 with its limitations. If distinct processes are used, and they need
 not be created for just a single request but held around while ever
 their is activity related to that user, then you do not need SELinux
 and so it is portable across more platforms. SELinux security contexts
 would only be relevant to a process oriented solution if for some
 reason you wanted to set greater constraints than what a user would
 normally have imposed on them for a normal process run as that user.
 Yes, the thread level privilege is a characteristic in SELinux,
 thus it may prevent to port the feture to other platforms.
 At least, I don't have any preference between them to implement
 the security focused mpm. I can agree the mpm should create
 a process rather than a thread.

 However, I don't think it is reasonably possible a process to handle
 multiple requests, because authentication header is come for each
 times, so we cannot assume the second request should be handled with
 same privilege of the first one.

 I am not talking about successive requests over a keepalive socket,
 but totally distinct requests, where distinct decision is made as
 whether those requests should be passed through to the appropriate
 user process.

 You really need to look at how perchild is implemented.

 Now I'm looking at the perchild implementation...
 Since I could not find it at the 2.2.x tree, I refer the 2.0.x tree.
 Is it correct for what you intend?

 From the quick overview, if I can understand correctly, it seems to me
 the perchild uses longjmp() to rewind the steps when an unexpected process
 receives a request for other virtual host. It makes decision at the
 post_read_request hook after the ap_update_vhost_from_headers(), but
 we need to do same thing at more deep stage (fixups?).

It has been a very long time since I looked at the code that
implements it. Either way, the intent in pointing it out was more the
architecture it uses rather than exactly how it is implemented.

 It is unclear for me whether it really has an advantage rather than
 per-request creation design.

If by per request you mean a new process for every request, then it
should have an advantage as far as performance is concerned, as the
cost of a CGI like model of a new process per request is significantly
higher than using a persistent process. This is one of the reasons
FASTCGI came about in the first place.

 Please tell me, if I'm looking at something pointless.

Only you would know that. But then, I could be pointing you at the
wrong MPM. There is, from memory, another by a different name developed
outside of the ASF which intends to do the same thing. The way it is
implemented is probably going to be different and it may be the one I am
actually thinking of. I can't remember the name of it right now.

Graham


Re: [RFC] A new hook: invoke_handler and web-application security

2009-04-08 Thread Graham Dumpleton
2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 Graham Dumpleton wrote:
 2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 Graham Dumpleton wrote:
 Explain first why using FASTCGI and suexec wouldn't be a better option?
 Thease are limited to cgi applications, so we cannot apply such kind
 of restriction on the built-in script languages and references on
 static documents (like *.html).

 FASTCGI is not restricted to CGI applications. At least in the sense
 that FASTCGI allows persistent processes rather than one off processes
 like CGI. FASTCGI bindings are available for many different languages,
 including scripting languages, so what 'built-in script languages' are
 you talking about? The suexec mechanism comes into play as it allows
 FASTCGI processes to run as a different user than Apache process.

 Hmm... I'll try to search for more details of features of FastCGI.

 If you have a hint, could you tell for the questions currently I have?
 IIRC, the CGI version of PHP cannot handle applications which write
 out special HTTP headers, such as WWW-Authenticate: or Location:.
 Is it possible to handle correctly in FastCGI?
 I could not find FastCGI support for WebDav. Is it possible to control
 accesses on files using SELinux?

FASTCGI is effectively a wire protocol. Something like WebDav wouldn't
target FASTCGI directly. Instead, WebDav would be implemented on top
of some web framework system. That web framework system just may so
happen to support use of FASTCGI for hosting. For example, there are
Python modules available for doing WebDav stuff and these might
technically be used in a WSGI application hosted on top of FASTCGI
using the flup adapter. I wouldn't be surprised if there was WebDav stuff
available for Perl as well.

Suggest you go and read about FASTCGI and get a clearer understanding
of what it is and isn't.

Graham


Re: [RFC] A new hook: invoke_handler and web-application security

2009-04-08 Thread Graham Dumpleton
2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 KaiGai Kohei wrote:
 Graham Dumpleton wrote:
 2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 Graham Dumpleton wrote:
 Explain first why using FASTCGI and suexec wouldn't be a better option?
 Thease are limited to cgi applications, so we cannot apply such kind
 of restriction on the built-in script languages and references on
 static documents (like *.html).
 FASTCGI is not restricted to CGI applications. At least in the sense
 that FASTCGI allows persistent processes rather than one off processes
 like CGI. FASTCGI bindings are available for many different languages,
 including scripting languages, so what 'built-in script languages' are
 you talking about? The suexec mechanism comes into play as it allows
 FASTCGI processes to run as a different user than Apache process.

 Hmm... I'll try to search for more details of features of FastCGI.

 If you have a hint, could you tell for the questions currently I have?
 IIRC, the CGI version of PHP cannot handle applications which write
 out special HTTP headers, such as WWW-Authenticate: or Location:.
 Is it possible to handle correctly in FastCGI?
 I could not find FastCGI support for WebDav. Is it possible to control
 accesses on files using SELinux?

 Hmm... It seems to me FastCGI has same limitation.
 The online document introduces that an authenticator program can
 be performed to handle authentication phase, but it may require
 web applications to be modified.
  http://fastcgi.coremail.cn/configuration.htm#Authenticator

 If we don't hesitate to create a new process for each requests,
 I have one another idea which does not require new hooks.
 In the traditional client-server model, the server process forks
 a child process to handle a request come from clients.
 If we have such kind of MPM module, a security module can set
 an individual privilege at the head of ap_run_handler hook.

 Needless to say, it has performance tradeoff, but we assume users
 don't give the highest priority on the performance.

See the experimental perchild MPM from Apache 2.0.

  http://httpd.apache.org/docs/2.0/mod/perchild.html

Didn't get carried through to later Apache versions.

Graham


Re: [RFC] A new hook: invoke_handler and web-application security

2009-04-08 Thread Graham Dumpleton
2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 Graham Dumpleton wrote:
 2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 KaiGai Kohei wrote:
 Graham Dumpleton wrote:
 2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 Graham Dumpleton wrote:
 Explain first why using FASTCGI and suexec wouldn't be a better option?
 Thease are limited to cgi applications, so we cannot apply such kind
 of restriction on the built-in script languages and references on
 static documents (like *.html).
 FASTCGI is not restricted to CGI applications. At least in the sense
 that FASTCGI allows persistent processes rather than one off processes
 like CGI. FASTCGI bindings are available for many different languages,
 including scripting languages, so what 'built-in script languages' are
 you talking about? The suexec mechanism comes into play as it allows
 FASTCGI processes to run as a different user than Apache process.
 Hmm... I'll try to search for more details of features of FastCGI.

 If you have a hint, could you tell for the questions currently I have?
 IIRC, the CGI version of PHP cannot handle applications which write
 out special HTTP headers, such as WWW-Authenticate: or Location:.
 Is it possible to handle correctly in FastCGI?
 I could not find FastCGI support for WebDav. Is it possible to control
 accesses on files using SELinux?
 Hmm... It seems to me FastCGI has same limitation.
 The online document introduces that an authenticator program can
 be performed to handle authentication phase, but it may require
 web applications to be modified.
  http://fastcgi.coremail.cn/configuration.htm#Authenticator

 If we don't hesitate to create a new process for each requests,
 I have one another idea which does not require new hooks.
 In the traditional client-server model, the server process forks
 a child process to handle a request come from clients.
 If we have such kind of MPM module, a security module can set
 an individual privilege at the head of ap_run_handler hook.

 Needless to say, it has performance tradeoff, but we assume users
 don't give the highest priority on the performance.

 See experimental MPM from Apache 2.0.

   http://httpd.apache.org/docs/2.0/mod/perchild.html

 Didn't get carried through to later Apache versions.

 If I can understand correctly, the perchild mpm assigns individual
 userid per virtual host, so it means all the requests handled by
 a certain virtual host shares same privilege set.
 The purpose of my efforts is to set individual privileges for each
 web users of the given request.

 Thanks for your information, but it is not suitable for us...

Rather than saying what changes you think need to be done at a low
level, it would be better if you explained at a high level what you are
trying to do. It is really hard to respond to you when we have no real
idea of the outcome you are trying to achieve.

Specifically, what Apache modules are you trying to use where you
want requests to be handled as separate users? You
mentioned 'built in scripting languages' before, but when I asked you
what scripting languages, you didn't answer. As I have already pointed
out, if you are trying to provide scripting language support for
writing Python web applications then FASTCGI can be used, but then you
have dismissed that for authentication reasons without really
explaining what the problem is.

Graham


Re: [RFC] A new hook: invoke_handler and web-application security

2009-04-07 Thread Graham Dumpleton
Explain first why using FASTCGI and suexec wouldn't be a better option?

It concerns me that in your plans, even though you are changing the
security context of a single thread within an existing process, that
thread may still have access to all the process memory and so
could read or modify memory in use by threads running in a different
security context. I am assuming here that SELinux cannot prevent that
happening.

Graham

2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 Hello,

 I've posted my idea to improve web-application security a few times
 however, it could not interest folks unfortunatelly. :(
 So, I would like to offer another approach for the purpose.
 The attached patch is a proof of the concept of newer idea.
 Any comments are welcome, and please feel free.


 The attached patch adds the following hook:
  AP_DECLARE_HOOK(int,invoke_handler,(request_rec *r))

 The server/core.c registers the ap_invoke_handler() as a default
 fallback, and all the ap_invoke_handler() invocations are replaced
 by ap_run_invoke_handler(), so we don't have any compatibility
 issue as far as no modules uses the new hooks.

 The purpose of this new hooks is to give modules a chance to assign
 an appropriate privilege set before contents handler launched.

 The mod_selinux.c is a typical example.
 It acquires a control via the invoke_handler hook whenever someone
 tries to invoke contents handler, then it compute what privilege
 (called as security context) should be assigned during the contents
 handler execution. If the computed privilege is same as the current
 one, it just returns DECLINES. But, if the computed one is different
 from the current, it creates a one-time worker thread and wait for
 its completion. The worker thread set a new privilege on itself and
 invokes ap_invoke_handler() with restricted privilege.

 In the previous design proposal, I added hooks just before
 ap_process_(async_)request(), but I noticed it cannot handle a case
 of internal redirection.

 BTW, Please note that the purpose of our efforts is to launch web
 applications with individual privilege set, not to add new hooks.
 Now I think the idea is the shortest distance to the goal, but
 is there any other ideas? If you have anything, I would like to
 see it.

 Thanks,
 --
 OSS Platform Development Division, NEC
 KaiGai Kohei kai...@ak.jp.nec.com



Re: [RFC] A new hook: invoke_handler and web-application security

2009-04-07 Thread Graham Dumpleton
2009/4/8 KaiGai Kohei kai...@ak.jp.nec.com:
 Graham Dumpleton wrote:
 Explain first why using FASTCGI and suexec wouldn't be a better option?

 Thease are limited to cgi applications, so we cannot apply such kind
 of restriction on the built-in script languages and references on
 static documents (like *.html).

FASTCGI is not restricted to CGI applications. At least in the sense
that FASTCGI allows persistent processes rather than one off processes
like CGI. FASTCGI bindings are available for many different languages,
including scripting languages, so what 'built-in script languages' are
you talking about? The suexec mechanism comes into play as it allows
FASTCGI processes to run as a different user than the Apache process.

The only reason for doing what you want in the Apache server child
processes is if they need to work directly with the internal Apache C
APIs to do stuff. You haven't yet demonstrated that that is what you
really need though and why FASTCGI couldn't be used instead.

Graham


Using unicode host names with Apache.

2009-04-02 Thread Graham Dumpleton
Is Apache capable of hosting sites with a unicode host name? Is it
just a matter of listing the IDNA(RFC3490) variant of the name in
ServerName or ServerAlias?

Is this the only way it can be done or if configuration files are
written as UTF-8, could the host name be listed in its UTF-8 form?

Graham


Re: Problems with EOS optimisation in ap_core_output_filter() and file buckets.

2009-02-17 Thread Graham Dumpleton
2009/2/17 Mladen Turk mt...@apache.org:
 Graham Dumpleton wrote:

 2009/2/17 Joe Orton jor...@redhat.com:

 I did used to perform a dup, but was told that this would cause
 problems with file locking. Specifically was told:

 I'm getting lost here.  What has file locking got to do with it?  Does
 mod_wscgi rely on file locking somehow?


 I'm lost as well :)

Consider:

  fd1 = 

  lock(fd1)

  fd2 = dup(fd1)

  close(fd2) # will release the lock under some lock APIs even though
not last reference to underlying file object

  write(fd1) # lock has already been released so not guaranteed that only writer

  close(fd1)

At least that is how I understand it from what is being explained to
me and pointed out in various documentation.

So, if fd2 is the file descriptor created for the file bucket in Apache,
and it gets closed before the application later wants to write to the
file through fd1, then the application has lost the exclusive ownership
acquired by way of the lock, and something else could have acquired the
lock and started modifying the file on the basis that it has exclusive
ownership at that time.

 In WSGI applications, it is possible for the higher level Python web
 application to pass back a file object reference for the response with
 the intent that the WSGI adapter use any optimised methods available
 for sending it back as response. This is where file buckets come into
 the picture to begin with.

 Now it looks that you are trying to intermix the third party
 maintained native OS file descriptors and file buckets.
 You can create the apr_file_t from apr_os_file_t

Which is what it does. Simplified code below:

  apr_os_file_t fd = -1;
  apr_file_t *tmpfile = NULL;

  fd = PyObject_AsFileDescriptor(filelike);

   apr_os_file_put(&tmpfile, &fd, APR_SENDFILE_ENABLED, self->r->pool);

 (Think you'll have platform portability issues there)

The optimisation is only supported on UNIX systems.

 but the major problem would be to ensure the life cycle
 of the object, since Python has it's own GC and httpd has
 it's pool.
 IMHO you will need a new apr_bucket provider written in
 Python and C for something like that.

CPython uses reference counting. What is referred to as GC in Python
is actually just a mechanism that kicks in under certain circumstances
to break cycles between reference counted objects.

Having a special bucket type which holds a reference to the Python
file object will not help anyway. This is because the close() method
of the Python file object can be called prior to the file bucket being
destroyed. This closing of the Python file object would occur before
the delayed write of file bucket resulting due to the EOS
optimisation. So, same problem as when using naked file descriptor.

Also, using a special bucket type opens another can of worms. This is
because multiple interpreters are supported as well as multithreading.
Thus it would be necessary to track the named interpreter in use
within the bucket and have to reacquire the lock on the interpreter
being used and ensure thread state is correctly reinstated. Although
possible to do, it gets a bit messy.

Holding onto the file descriptor to allow the optimisation isn't
really desirable for other reasons as well. This is because the WSGI
specification effectively requires the response content to have been
flushed out to the client before the final call back into the
application to clean up things. In the final call back into the
application to perform cleanup and close stuff like files, it could
technically rewrite the content of the file. If Apache has not
finished writing out the contents of the file, presuming the Python
file object hadn't been closed, then Apache would end up writing
different content to what was expected and possibly truncated content
if file resized.

Summary, you need to have a way of knowing that when you flush
something that it really has been flushed and that Apache is all done
with it.

Graham


Re: Problems with EOS optimisation in ap_core_output_filter() and file buckets.

2009-02-17 Thread Graham Dumpleton
2009/2/17 Mladen Turk mt...@apache.org:
 Graham Dumpleton wrote:

 2009/2/17 Mladen Turk mt...@apache.org:

 Graham Dumpleton wrote:

 2009/2/17 Joe Orton jor...@redhat.com:

 I did used to perform a dup, but was told that this would cause
 problems with file locking. Specifically was told:

 I'm getting lost here.  What has file locking got to do with it?  Does
 mod_wscgi rely on file locking somehow?

 I'm lost as well :)

 Consider:

  fd1 = 

  lock(fd1)

  fd2 = dup(fd1)

  close(fd2) # will release the lock under some lock APIs even though
 not last reference to underlying file object

  write(fd1) # lock has already been released so not gauranteed that only
 writer

  close(fd1)

 At least that is how I understand it from what is being explained to
 me and pointed out in various documentation.

 So, if fd2 is the file descriptor created for file bucket in Apache,
 if it gets closed before application later wants to write to file
 through fd1, then application has lost its exclusive ownership
 acquired by way of the lock and something else could have acquired
 lock and started modifying it on basis that it has exclusive onwership
 at that time.


 Well, like said that won't work, neither is portable
 (eg, apr_os_file_t is HANDLE on win32)

I already said I only support the optimisation on UNIX. I don't care
about Windows.

 What you will need is the code that will take the Python
 object and invoke Python file api feeding the apr_bucket.
 (Basically writing the apr_bucket_python_file).

As I already tried to explain, even for the case of the bucket being
used to hold a reference to the Python object, that will not work
because of the guarantees that WSGI applications require regarding
data needing to be flushed.

 However the simplest thing might be an intermediate temp file, in
 which case httpd could reference the file name not the file
 object itself.

Which would likely be slower than using the existing fallback streaming
mechanism, which reads the file into memory in blocks and pushes
them through as transient buckets.
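
To be clear about what that fallback looks like, it is roughly the
following pattern (a simplified sketch rather than the actual mod_wsgi
code), with a flush after each block so the memory behind the transient
bucket can safely be reused:

  #include "httpd.h"
  #include "util_filter.h"
  #include "apr_buckets.h"
  #include "apr_file_io.h"

  /* Simplified sketch: stream an already open file as transient buckets. */
  static apr_status_t stream_file_blocks(request_rec *r, apr_file_t *file)
  {
      char block[8192];
      apr_bucket_alloc_t *ba = r->connection->bucket_alloc;
      apr_bucket_brigade *bb = apr_brigade_create(r->pool, ba);
      apr_status_t rv = APR_SUCCESS;

      for (;;) {
          apr_size_t n = sizeof(block);

          rv = apr_file_read(file, block, &n);
          if (rv != APR_SUCCESS || n == 0)
              break;

          APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_transient_create(block, n, ba));

          /* The flush bucket forces the data out before 'block' is
             reused on the next iteration, which is what makes a
             transient bucket safe to use here. */
          APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_flush_create(ba));

          rv = ap_pass_brigade(r->output_filters, bb);
          apr_brigade_cleanup(bb);
          if (rv != APR_SUCCESS)
              break;
      }

      return rv;
  }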

 Not sure how woule that work with dynamic file
 since apr and python might use different platform locking
 mechanisms.

Python uses operating system locking mechanisms, just like the APR library would.

Graham


Re: Problems with EOS optimisation in ap_core_output_filter() and file buckets.

2009-02-16 Thread Graham Dumpleton
2009/2/16 Joe Orton jor...@redhat.com:
 On Sat, Feb 14, 2009 at 10:25:08AM +1100, Graham Dumpleton wrote:
 ...
 What the end result of the code is, is that if you have a file bucket
 getting this far where length of file is less than 8000 and an EOS
 follows it, then the actual file bucket is held over rather than data
 being read and buffered. This is as commented is to avoid doing an
 mmap+memcpy. What it means though is that the file descriptor within
 the file bucket must be maintained and cannot be closed as soon as
 ap_pass_brigade() has been called.

 The call to:

   ap_save_brigade(f, &ctx->b, &b, ctx->deferred_write_pool);

 in that code path should result in the FILE bucket and the contained fd
 being dup()ed.  (Though if that is failing, you wouldn't know because of
 the lack of error checking)

 You say:

 For me this is an issue as the file descriptor has been supplied from
 a special object returned by a higher level application and it would
 be hard to maintain the file as open beyond the life of the request,
 up till end of keep alive or a subsequent request over same
 connection. Doing a dup on the file decriptor is also not necessarily
 an option.

 can you explain why a dup shouldn't work?

I did used to perform a dup, but was told that this would cause
problems with file locking. Specifically was told:

I am not sure, but it looks like mod_wsgi duplicates the original file
descriptor, sends the file's data, closes the duplicated file
descriptor, and then calls the iterable's close method. That may be
problematic for applications that are using fcntl-based locking; the
closing of the duplicate file descriptor will release the lock before
the iterable's close method gets a chance to execute, so the close
method can no longer depend on the file remaining locked. Ideally, the
duplicate file descriptor shouldn't be closed until *after* the
iterable's close() method has been called.

I never researched it properly as didn't have time at that point, so
simply believed what was being told. A quick check of flock manual
page says:

Locks created by flock() are associated with an open file table
entry. This means that duplicate file descriptors (created by, for
example, fork(2) or dup(2)) refer to the same lock, and this lock may
be modified or released using any of these descriptors. Furthermore,
the lock is released either by an explicit LOCK_UN operation on any of
these duplicate descriptors, or when all such descriptors have been
closed.

So for flock() there is no problem. As to the fcntl locking which the
person mentioned, I haven't yet been able to find any description of
the behaviour of locks in the context of dup'd file descriptors.

If you know of a source of information which explains what happens for
fcntl locking and dup'd file descriptors, that would help clear things
up. In the meantime I'll keep looking for any information about it.
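
For reference, a minimal standalone sketch of the behaviour in question,
assuming POSIX fcntl() record-lock semantics (under which the lock is
dropped as soon as the process closes any descriptor for the file,
including a dup()'d one); the path used is purely illustrative:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd1 = open("/tmp/example.lock", O_RDWR | O_CREAT, 0600);
    struct flock fl = { 0 };
    int fd2;

    fl.l_type = F_WRLCK;     /* exclusive write lock */
    fl.l_whence = SEEK_SET;  /* whole file: l_start = l_len = 0 */
    fcntl(fd1, F_SETLKW, &fl);

    fd2 = dup(fd1);
    close(fd2);              /* fcntl lock acquired via fd1 is released here */

    write(fd1, "data", 4);   /* no longer protected by the lock */
    close(fd1);
    return 0;
}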

Graham


Re: Problems with EOS optimisation in ap_core_output_filter() and file buckets.

2009-02-16 Thread Graham Dumpleton
2009/2/17 Joe Orton jor...@redhat.com:
 On Mon, Feb 16, 2009 at 10:52:15PM +1100, Graham Dumpleton wrote:
 2009/2/16 Joe Orton jor...@redhat.com:
  You say:
 
  For me this is an issue as the file descriptor has been supplied from
  a special object returned by a higher level application and it would
  be hard to maintain the file as open beyond the life of the request,
  up till end of keep alive or a subsequent request over same
   connection. Doing a dup on the file descriptor is also not necessarily
  an option.
 
  can you explain why a dup shouldn't work?

 I did used to perform a dup, but was told that this would cause
 problems with file locking. Specifically was told:

 I'm getting lost here.  What has file locking got to do with it?  Does
 mod_wsgi rely on file locking somehow?

The mod_wsgi package is a gateway for hosting Python web applications.
Modern version of mod_python, but implementing the generic/portable Python
WSGI interface rather than being Apache specific. It is what all the
Python people ditching mod_python are moving to.

In WSGI applications, it is possible for the higher level Python web
application to pass back a file object reference for the response with
the intent that the WSGI adapter use any optimised methods available
for sending it back as response. This is where file buckets come into
the picture to begin with.

Now, this file object was created by the Python application and it is
still the owner of it. If it is a file that it had first been
modifying and it needed exclusivity on that, it could well have used
file locks on it. Because locks are involved, the order in which the
file's contents are used for the response and the closure and unlocking
of the file are important.

It appears that fcntl locking on some platforms has the behaviour that
if a file descriptor is dup'd, closure of the first reference to the
file will cause release of the lock. That is, the lock is not released
only when the last reference to the file is closed. Problems can
therefore arise if you have to dup the file descriptor, because if
the dup'd file descriptor gets closed before the Python application has
finished with the file object, possibly involving it having to modify
the file after the contents are sent, something else could lock it and
start modifying the file before it was done.

In simple terms, the mod_wsgi module doesn't internally open the file
in the first place. Instead it comes from a higher level application
and one can't do things at the lower level that would change the state
of something like the locks associated with the file.

Graham


Problems with EOS optimisation in ap_core_output_filter() and file buckets.

2009-02-13 Thread Graham Dumpleton
In ap_core_output_filter() there exists the code starting with:

/* Completed iterating over the brigade, now determine if we want
 * to buffer the brigade or send the brigade out on the network.
 *
 * Save if we haven't accumulated enough bytes to send, the connection
 * is not about to be closed, and:
 *
 *   1) we didn't see a file, we don't have more passes over the
 *  brigade to perform,  AND we didn't stop at a FLUSH bucket.
 *  (IOW, we will save plain old bytes such as HTTP headers)
 * or
 *   2) we hit the EOS and have a keep-alive connection
 *  (IOW, this response is a bit more complex, but we save it
 *   with the hope of concatenating with another response)
 */
    if (nbytes + flen < AP_MIN_BYTES_TO_WRITE
        && !AP_BUCKET_IS_EOC(last_e)
        && ((!fd && !more && !APR_BUCKET_IS_FLUSH(last_e))
            || (APR_BUCKET_IS_EOS(last_e)
                && c->keepalive == AP_CONN_KEEPALIVE))) {

This is some sort of optimisation which, in the case of a keep alive
connection, will hold over sending data out on the connection until
later if an EOS is present and the amount of data is less than the
nominal minimum bytes to send.

Later in this section of code it has:

    /* Do a read on each bucket to pull in the
     * data from pipe and socket buckets, so
     * that we don't leave their file descriptors
     * open indefinitely.  Do the same for file
     * buckets, with one exception: allow the
     * first file bucket in the brigade to remain
     * a file bucket, so that we don't end up
     * doing an mmap+memcpy every time a client
     * requests a <8KB file over a keepalive
     * connection.
     */
    if (APR_BUCKET_IS_FILE(bucket) && !file_bucket_saved) {
        file_bucket_saved = 1;
    }
    else {
        const char *buf;
        apr_size_t len = 0;
        rv = apr_bucket_read(bucket, &buf, &len,
                             APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            ap_log_cerror(APLOG_MARK, APLOG_ERR, rv,
                          c, "core_output_filter:"
                          " Error reading from bucket.");
            return HTTP_INTERNAL_SERVER_ERROR;
        }
    }

The end result of the code is that if you have a file bucket
getting this far where the length of the file is less than 8000 bytes
and an EOS follows it, then the actual file bucket is held over rather
than the data being read and buffered. This, as commented, is to avoid
doing an mmap+memcpy. What it means though is that the file descriptor
within the file bucket must be maintained and cannot be closed as soon
as ap_pass_brigade() has been called.

For me this is an issue as the file descriptor has been supplied from
a special object returned by a higher level application and it would
be hard to keep the file open beyond the life of the request,
up until the end of keep alive or a subsequent request over the same
connection. Doing a dup on the file descriptor is also not necessarily
an option.

The end result then is that later when file bucket is processed, the
file descriptor has already been closed and one gets the error:

  (9)Bad file descriptor: core_output_filter: writing data to the network

I know that I can circumvent the EOS optimisation by inserting a flush
bucket, but based on the documentation it isn't guaranteed that a flush
bucket will always propagate down the filter chain and actually push
out data.

/**
 * Create a flush  bucket.  This indicates that filters should flush their
 * data.  There is no guarantee that they will flush it, but this is the
 * best we can do.
 * @param list The freelist from which this bucket should be allocated
 * @return The new bucket, or NULL if allocation failed
 */
APU_DECLARE(apr_bucket *) apr_bucket_flush_create(apr_bucket_alloc_t *list);

How can one guarantee that the file bucket will actually be flushed
and not held over by a filter?
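
For concreteness, a hedged sketch of the flush-bucket workaround
mentioned above; the brigade, bucket, file and length variable names
are illustrative:

    apr_bucket *e;

    e = apr_bucket_file_create(fd, 0, len, r->pool, c->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, e);

    /* Insert a FLUSH ahead of the EOS so the core output filter writes
     * the file bucket out now rather than holding it over in the hope
     * of concatenating it with a later response. */
    e = apr_bucket_flush_create(c->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, e);

    e = apr_bucket_eos_create(c->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, e);

    rv = ap_pass_brigade(r->output_filters, bb);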

If it gets to the core output filter, another way to avoid EOS
optimisation is to forcibly set keepalive to AP_CONN_CLOSE, which for
the application concerned is probably reasonable, but is obviously a
bit of a hack.

Finally, my problem only arises because I insert an EOS after the file
bucket and before calling ap_pass_brigade(). If one uses ap_send_fd(),
it doesn't insert an EOS before calling ap_pass_brigade(), with something
else obviously later inserting the EOS. If ap_pass_brigade() is called
without an EOS the first time and only later with an EOS, will that help in
ensuring it is flushed out, or can 

Re: changing mod_wombat's name

2008-12-16 Thread Graham Dumpleton
2008/12/17 Brian McCallister bri...@skife.org:
 Actually, -1

 Calling it luau is begging for mass user confusion via misspellings in
 the LoadModule directive.

 How about:

 ap_lua, moon, or just bite the bullet and use mod_lua

Given that there could be a class of such scripting language modules
over time, why not:

  mod_script_lua

to make it clearer what purpose it serves. It then fits in with the
auth and proxy related modules having a common prefix.

Graham

 -Brian

 2008/12/8 Justin Erenkrantz jus...@erenkrantz.com:
 On Mon, Dec 8, 2008 at 5:04 PM, Roy T. Fielding field...@gbiv.com wrote:
 mod_luau  http://en.wikipedia.org/wiki/Luau

 +1.  -- justin




Re: [VOTE] Release Apache HTTP server 2.3.0-alpha

2008-12-08 Thread Graham Dumpleton
2008/12/9 William A. Rowe, Jr. [EMAIL PROTECTED]:
 Paul Querna wrote:

 The change fixed velocity.apache.org, but broke www.apache.org.

 All of this sub-request + output filter stuff started in r620133 kinda
 needs some more thought.

 My thought is that fast_internal_subrequest (which I last refactored, but
 was bogusly introduced for mod_negotiation) must die, now.

 Votes?

 +1 here to kill fast_internal_subrequest and provide the one fastest
 mechanism that we can safely craft.

Are you referring to ap_internal_fast_redirect()?

Still also used by mod_dir, with its use in the latter in the past
causing odd issues with duplication of information in tables. For
where this showed up in mod_python see:

  http://issues.apache.org/jira/browse/MODPYTHON-146

Was that duplication of information ever addressed?

Graham


Re: Dyanamic usage of Apache hash table.

2008-12-02 Thread Graham Dumpleton
2008/12/3 Jayasingh Samuel [EMAIL PROTECTED]:
 Hai,

 I have a hash map which takes its input and key from a file.. The file
 content will be changed automatically and i want to reload the hash map
 automatically after hitting some handler.. What i see is, after reloading
 the hash map, iam able to see the new Contents and the old contents
 alternatively..
 May be the old threads is having the old Contents. Not sure.

 Please reply me if you have solution for this to reload the hash map
 automatically. also how to kill the old threads which is having the old hash
 map.

Is your issue that you aren't considering that Apache is multi-process,
and so your trigger request is only handled by one process and not all
active processes?

You may be better off looking at how the underlying tables for RewriteMaps
in mod_rewrite are implemented. These are cached but reread when a file
change is detected.

BTW, this is not really the right list for this. Perhaps use
modules-dev list instead.

Graham


Re: [VOTE] move all mod_*.h with public APIs to ./include folder

2008-04-12 Thread Graham Dumpleton
2008/4/13 Guenter Knauf [EMAIL PROTECTED]:
 Hi,

  Please specify which headers specifically you consider to be public.
  at least:

  mod_cache.h
  mod_core.h
  mod_dav.h
  mod_dbd.h
  mod_proxy.h
  mod_session.h

Also:

  mod_auth.h

So it doesn't get missed out of Windows installers like it has in the past.

Graham


Re: overview of MPMs?

2008-04-10 Thread Graham Dumpleton
2008/4/11 Geoff Thorpe [EMAIL PROTECTED]:
 Hi all,

  Just wondering if anyone has a link or howto that would give me some
  background info on the interface with the different MPM
  modes/implementations? I'm not even sure where the different
  implementations are in the source tree, but I'm curious to take a look
  if someone could point me in the right direction. TIA.

If you want to understand how each works, then you might start by looking at:

  
http://www.fmc-modeling.org/category/projects/apache/amp/4_3Multitasking_server.html

This was based on Apache 2.0, but should in principle be the same.

Graham


Re: Question: how to change the request in input filter and pass it to proxy

2008-04-02 Thread Graham Dumpleton
On 03/04/2008, Olexandr Prokhorenko [EMAIL PROTECTED] wrote:
 Hi everyone,

  I am working on the input filter which is going to catch on input requests,
  find the bucket with Host: , modify it and pass it through.  I will modify
  it to something that does not belong to my httpd server, so I need to pass
  it through the proxy module (my guess ;).  I can't use either the static
  ProxyPass or ProxyReversePass, because the host will be modified dynamically
  and it will depend on what is called and substitute it from the database
  call.

  It wasn't a big deal to catch on the Host: (well, I may also need to look
  for something like GET http://blablabla.com/, but this is not the highest
  priority now).  I have created a new HEAP bucket, put it instead of an
  original one, however, a) it looks to me that Apache makes a call and gives
  an error saying file wasn't found, however the Web page displayed is the
  correct one, like not being rewritten, and the httpd child crashes; and b) I
  need to send it to proxy somehow and pass the call to it.

  I am not very good on concept, my book on Apache modules is still on the
  way, but I'd very appreciate any hints on this.

  Thank you.  I'd very thankful for cc: me as well.

I think you may perhaps be going about this the wrong way. One can
cause a request to be proxied by doing something like the following.
This example uses mod_python, but could be done in C code or mod_perl
as well.

import posixpath

from mod_python import apache

def fixuphandler(req):

  if req.proxyreq:
return apache.DECLINED

  normalised_uri = posixpath.normpath(req.uri)

  if normalised_uri:
if normalised_uri != '/' and req.uri[-1] == '/':
  normalised_uri += '/'

  length = len(req.filename)
  length -= len(req.hlist.directory) - 1
  length += len(req.path_info or '')

  baseurl = normalised_uri[:-length]
  path = normalised_uri[len(baseurl):]

  # THIS IS THE IMPORTANT BIT WHICH SETS UP PROXY.

  req.proxyreq = apache.PROXYREQ_REVERSE
  req.uri = 'http://www.dscpl.com.au' + path
  req.filename = 'proxy:%s' % req.uri
  req.handler = 'proxy-server'

  return apache.OK

If you didn't want to proxy a particular request, just return DECLINED
when you know so.

Graham


Re: Question: how to change the request in input filter and pass it to proxy

2008-04-02 Thread Graham Dumpleton
This technique is taken from modules/proxy/mod_proxy.c in Apache
source code. See the proxy_detect() function in that file for a C code
example.

One could change request headers on the way through in the same handler
function as sets up the proxy. You would still need an output filter if
you want to change headers in the response. I haven't used input/output
filters that modify headers, so can't comment on that.
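
A hedged C sketch of the same technique from a fixups hook, modelled
loosely on what proxy_detect() does; the function name is illustrative
and the target host is just the one used in the earlier example:

static int example_proxy_fixup(request_rec *r)
{
    if (r->proxyreq)
        return DECLINED;

    /* Hand the request to mod_proxy as a reverse proxy request. */
    r->proxyreq = PROXYREQ_REVERSE;
    r->uri = apr_pstrcat(r->pool, "http://www.dscpl.com.au", r->uri, NULL);
    r->filename = apr_pstrcat(r->pool, "proxy:", r->uri, NULL);
    r->handler = "proxy-server";

    return OK;
}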

Graham

On 03/04/2008, Olexandr Prokhorenko [EMAIL PROTECTED] wrote:
 Hi,

  Do I understand you right and you're proposing to have just an Apache module
  (not a hook on either output or input filter) and modify the request then?

  I'd be very thankful if you can point me to any typical C code example doing
  that.  I'm not very good at Python and never used to code with it.  While I
  think that I understand what you're talking about, I'd be much comfortable
  with being able to sneak at the example ;)

  What I will also need is to rewrite cookies.   Will I be able to do that
  this way as well?  I decided to try on input filter just because I thought I
  may get in troubles rewriting the cookies.  I'll also need to modify the
  proxy response, so it's a place for output filter, isn't it?

  Thanks, your reply was very quick.

  On Wed, Apr 2, 2008 at 9:39 PM, Graham Dumpleton [EMAIL PROTECTED]
  wrote:


   On 03/04/2008, Olexandr Prokhorenko [EMAIL PROTECTED] wrote:
Hi everyone,
   
 I am working on the input filter which is going to catch on input
   requests,
 find the bucket with Host: , modify it and pass it through.  I will
   modify
 it to something that does not belong to my httpd server, so I need to
   pass
 it through the proxy module (my guess ;).  I can't use either the
   static
 ProxyPass or ProxyReversePass, because the host will be modified
   dynamically
 and it will depend on what is called and substitute it from the
   database
 call.
   
 It wasn't a big deal to catch on the Host: (well, I may also need to
   look
 for something like GET http://blablabla.com/, but this is not the
   highest
 priority now).  I have created a new HEAP bucket, put it instead of an
 original one, however, a) it looks to me that Apache makes a call and
   gives
 an error saying file wasn't found, however the Web page displayed is
   the
 correct one, like not being rewritten, and the httpd child crashes; and
   b) I
 need to send it to proxy somehow and pass the call to it.
   
 I am not very good on concept, my book on Apache modules is still on
   the
 way, but I'd very appreciate any hints on this.
   
 Thank you.  I'd very thankful for cc: me as well.
  
   I think you may perhaps be going about this the wrong way. One can
   cause a request to be proxied by doing something like the following.
   This example uses mod_python, but could be done in C code or mod_perl
   as well.
  
   import posixpath
  
   from mod_python import apache
  
   def fixuphandler(req):
  
if req.proxyreq:
  return apache.DECLINED
  
normalised_uri = posixpath.normpath(req.uri)
  
if normalised_uri:
  if normalised_uri != '/' and req.uri[-1] == '/':
normalised_uri += '/'
  
length = len(req.filename)
length -= len(req.hlist.directory) - 1
length += len(req.path_info or '')
  
baseurl = normalised_uri[:-length]
path = normalised_uri[len(baseurl):]
  
# THIS IS THE IMPORTANT BIT WHICH SETS UP PROXY.
  
req.proxyreq = apache.PROXYREQ_REVERSE
req.uri = 'http://www.dscpl.com.au' + path
req.filename = 'proxy:%s' % req.uri
req.handler = 'proxy-server'
  
return apache.OK
  
   If you didn't want to proxy a particular request, just return DECLINED
   when you know so.
  
   Graham
  




 --
  Alexander Prohorenko.



Re: Reading of input after headers sent and 100-continue.

2008-01-30 Thread Graham Dumpleton
On 31/01/2008, Brian Smith [EMAIL PROTECTED] wrote:


  -Original Message-
  From: Graham Dumpleton [mailto:[EMAIL PROTECTED]
  Sent: Tuesday, January 29, 2008 4:29 PM
  To: modules-dev@httpd.apache.org
  Subject: Reading of input after headers sent and 100-continue.
 
  The HTTP output filter will send a 100 result back to a
  client when the first attempt to read input occurs and an
  Except header with 100-continue was received. Ie., from
  http_filters.c we have:
 
  if ((ctx->state == BODY_CHUNK ||
      (ctx->state == BODY_LENGTH && ctx->remaining > 0)) &&
      f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1)) {

 This is from ap_http_filter(). If you look at http_core.c, you can see
 that it is registered as an input filter, not an output filter.

I knew what I meant, it just didn't come out right. I blame the keyboard. :-)

 So, if
 you never read from the input brigade, the 100 continue will never be
 sent. I'm not sure if the module needs to just ignore the input brigade,
 or actively throw it away, though.

  The problem then is if only after having sent some response
  content and triggering the response headers to be sent one
  actually goes to read the input, then the HTTP output filter
  above is still sending the 100 status response string. In
  other words, the 100 response status string is appearing in
  the middle of the actual response content.

 Doctor, it hurts when I do this! :)

 If a module is sending a response before a 100 continue has been sent,
 then it shouldn't read from the input brigade, because it is going
 against the HTTP spec.

Can you point to the specific bit of the HTTP specification which says that?

Section 8.2.3 would to me appear to have slightly conflicting statements.

In particular it says:

Because of the presence of older implementations, the protocol
allows ambiguous situations in which a client may send Expect: 100-
continue without receiving either a 417 (Expectation Failed) status
or a 100 (Continue) status. Therefore, when a client sends this header
field to an origin server (possibly via a proxy) from which it has
never seen a 100 (Continue) status, the client SHOULD NOT wait for an
indefinite period before sending the request body.

Effectively, if a 200 response came back, it seems to suggest that the
client still should send the request body, just that it 'SHOULD NOT
wait for an indefinite period'. It doesn't say explicitly for the
client that it shouldn't still send the request body if another
response code comes back.

This is what I have seen with curl as a client. If one sends back a
200 response without reading any input, curl still sends the request
content, but one does notice a slight pause as some timeout occurs,
at which point it sends the request content. In other words, curl
doesn't send it as soon as it sees the 200 response, but it does still
send it.

So technically, if the client has to still send the request content,
something could still read it. It would not be ideal that there is a
delay depending on what the client does, but would still be possible
from what I read of this section.

But then, later it says:

Upon receiving a request which includes an Expect request-header
field with the 100-continue expectation, an origin server MUST
either respond with 100 (Continue) status and continue to read
from the input stream, or respond with a final status code. The
origin server MUST NOT wait for the request body before sending
the 100 (Continue) response. If it responds with a final status
code, it MAY close the transport connection or it MAY continue
to read and discard the rest of the request.  It MUST NOT
perform the requested method if it returns a final status code.

The critical bit here I guess is:

   If it responds with a final status
code, it MAY close the transport connection or it MAY continue
to read and discard the rest of the request.

This suggests that the server can discard the request body if handler
didn't try and read it before returning a response. What it means by:

 It MUST NOT
perform the requested method if it returns a final status code.

I am not quite sure, because if the response headers were returned by
the handler you are already in the process of performing the requested
method, so how can you not now do it?

What is also a bit worrying to me is that what might be allowed by a
handler for a request can be changed based on the presence of
100-continue, something which is out of the control of the handler and
the web server receiving the request.

Specifically, if 100-continue is not present and the client therefore
sent the request body anyway, then technically nothing to stop the
handler reading the input after the response headers have been sent.
For example, the handler may generate response headers for the same
content length and only then start reading input and returning it
as the response body

Re: Reading of input after headers sent and 100-continue.

2008-01-30 Thread Graham Dumpleton
For those on the Python web sig who might be thinking they missed part
of the conversation, you have. This is the second half of a
conversation started on Apache modules-dev list about Apache
100-continue processing. If interested, you can see the first half of
the conversation at:

  http://mail-archives.apache.org/mod_mbox/httpd-modules-dev/200801.mbox/browser

Graham

On 31/01/2008, Brian Smith [EMAIL PROTECTED] wrote:
 Graham Dumpleton wrote:
  Effectively, if a 200 response came back, it seems to suggest
  that the client still should send the request body, just that
  it 'SHOULD NOT wait for an indefinite period'. It doesn't say
  explicitly for the client that it shouldn't still send the
  request body if another response code comes back.

 This behavior is to support servers that don't understand the Expect:
 header.

 Basically, if the server responds with a 100, the client must send the
 request body. If the server responds with a 4xx or 5xx, the client must
 not send the request body. If the server responds with a 2xx or a 3xx,
 then the client must send (the rest of) the request body, on the
 assumption that the server doesn't understand Expect:. To be
 completely compliant, a server should always respond with a 100 in front
 of a 2xx or 3xx, I guess. Thanks for clarifying that for me. I guess the
 rules make sense after all.

  So technically, if the client has to still send the request
  content, something could still read it. It would not be ideal
  that there is a delay depending on what the client does, but
  would still be possible from what I read of this section.

 You are right. To avoid confusion, you should probably force mod_wsgi to
 send a 100-continue in front of any 2xx or 3xx response.

  It MUST NOT perform the requested method if it returns a final status
 code.

 The implication is that the only time it will avoid sending a 100 is
 when it is sending a 4xx, and it should never perform the requested
 method if it already said the method failed. The only excuse for not
 sending a 100 is that you don't know about Expect: 100-continue. But,
 that can't be true if you are reading this part of the spec!

 If it responds with a final status
  code, it MAY close the transport connection or it MAY continue
  to read and discard the rest of the request.

 If the client receives a 2xx or 3xx without a 100 first, it has to send
 the request body (well, depending on which 3xx it is, that is not true).
 But, the server doesn't have to read it! But, again, the assumption is
 that the server will only send a response without a 100 if it is a 4xx
 or 5xx.

  It seems by what you are saying that if 100-continue is
  present this wouldn't be allowed, and that to ensure correct
  behaviour the handler would have to read at least some of the
  request body before sending back the response headers.

 You are right, I was wrong.

   Since ap_http_filter is an input filter only, it should be
  enough to
   just avoid reading from the input brigade. (AFAICT, anyway.)
 
  In other words block the handler from reading, potentially
  raise an error in the process. Except to be fair and
  consistent, you would have to apply the same rule even if
  100-continue isn't present. Whether that would break some
  existing code in doing that is the concern I have, even if it
  is some simple test program that just echos back the request
  body as the response body.

 Technically, even if the server returns a 4xx, it can still read the
 request body, but it might not get anything or it might only get part of
 it. I guess, the change to the WSGI spec that is needed is to say that
 the gateway must not send the 100 continue if it has already sent some
 headers, and that it should send a 100 continue before any 2xx or 3xx
 code, which is basically what James Knight suggested (sorry James). The
 gateway must indicate EOF if only a partial request body was received. I
 don't think the gateway should be required to provide any of the partial
 request content on a 4xx, though.

 - Brian




Reading of input after headers sent and 100-continue.

2008-01-29 Thread Graham Dumpleton
A question about HTTP output filter and 100-continue.

The HTTP output filter will send a 100 result back to a client when
the first attempt to read input occurs and an Expect header with
100-continue was received. Ie., from http_filters.c we have:

    /* Since we're about to read data, send 100-Continue if needed.
     * Only valid on chunked and C-L bodies where the C-L is > 0. */
    if ((ctx->state == BODY_CHUNK ||
        (ctx->state == BODY_LENGTH && ctx->remaining > 0)) &&
        f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1)) {
        char *tmp;
        apr_bucket_brigade *bb;

        tmp = apr_pstrcat(f->r->pool, AP_SERVER_PROTOCOL, " ",
                          ap_get_status_line(100), CRLF CRLF, NULL);
        bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
        e = apr_bucket_pool_create(tmp, strlen(tmp), f->r->pool,
                                   f->c->bucket_alloc);
        APR_BRIGADE_INSERT_HEAD(bb, e);
        e = apr_bucket_flush_create(f->c->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bb, e);

        ap_pass_brigade(f->c->output_filters, bb);
    }

Now, if one is generating response content prior to having read any
input, if one hasn't buffered the response output, by virtue of
explicitly flushing it in some way, this will trigger any response
headers which have been set up to be sent.

The problem then is if only after having sent some response content
and triggering the response headers to be sent one actually goes to
read the input, then the HTTP output filter above is still sending the
100 status response string. In other words, the 100 response status
string is appearing in the middle of the actual response content.

My question then is, what should a handler do if it is trying to
generate response content (unbuffered) before having attempted to
read any input, ie., what is the correct way to stop Apache still
sending the 100 status response for the 100-continue header? I know
that setting r->expecting_100 to 0 at the time the first response
content is being sent will prevent it, but is there something else that
should be done instead.
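
A minimal sketch of the r->expecting_100 workaround being described, as
it might appear in handler code; the response text is illustrative:

    /* Clear the flag before any unbuffered response output is generated
     * so ap_http_filter() will not later emit "100 Continue" in the
     * middle of the response body. */
    if (r->expecting_100) {
        r->expecting_100 = 0;
    }

    ap_rputs("first part of the response\n", r);
    ap_rflush(r);   /* headers plus this content go out now */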

BTW, this is partly theoretical in that have no actual code that is
doing this, but technically in systems like mod_python or mod_wsgi
where one doesn't know what the Python application code running on top
is doing, a user could trigger this situation.

This is occurring when testing with Apache 2.2.4.

Any ideas appreciated.

Graham


Re: Is there are any way to know if the request is regular (http) or SSL (https) in a module?

2007-12-17 Thread Graham Dumpleton
On 18/12/2007, Sander Temme [EMAIL PROTECTED] wrote:

 On Dec 17, 2007, at 6:36 PM, Eric Covener wrote:

  I would like to know the request type in my module
  (handler/filter), is there any way to know that (HTTP
  vs HTTPS)?
 
  apr_table_get(r->subprocess_env, "HTTPS") might be what you want

 That gets set in the Fixup hook, relatively late in the request
 processing process.  You could call ap_http_scheme(r) to run the
 request scheme hook, which will return https if SSL is enabled or
 optional on the vhost that accepted the request.

Or better still, use the function registered by the ssl module for the purpose:

#if AP_SERVER_MAJORVERSION_NUMBER >= 2
APR_DECLARE_OPTIONAL_FN(int, ssl_is_https, (conn_rec *));
static APR_OPTIONAL_FN_TYPE(ssl_is_https) *wsgi_is_https = NULL;
#endif

    ...

    if (!wsgi_is_https)
        wsgi_is_https = APR_RETRIEVE_OPTIONAL_FN(ssl_is_https);

    if (wsgi_is_https && wsgi_is_https(r->connection))
        apr_table_set(r->subprocess_env, "HTTPS", "1");

Ie., HTTPS is set by the above sort of process, but as pointed out
that is only done in the fixup phase; you can call the ssl_is_https()
function directly if in an earlier phase.

Graham


Where is Timeout configuration directive value stored?

2007-11-16 Thread Graham Dumpleton
The function in server/core.c called for the Timeout directive is:

static const char *set_timeout(cmd_parms *cmd, void *dummy, const char *arg)
{
    const char *err = ap_check_cmd_context(cmd,
                                           NOT_IN_DIR_LOC_FILE|NOT_IN_LIMIT);

    if (err != NULL) {
        return err;
    }

    cmd->server->timeout = apr_time_from_sec(atoi(arg));
    return NULL;
}

Ie., the Timeout directive value is stored in the 'timeout' member of
the 'server_rec' for the host context that the directive appears in.

Presumably, the Timeout directive in global part of Apache
configuration would result in that being put in server_rec for main
server context.

When ap_fixup_virtual_hosts() is later run, presumably that is then
copied into vhost server_rec presuming that a vhost didn't contain its
own Timeout directive setting. Ie., that function contains:

    if (virt->timeout == 0)
        virt->timeout = main_server->timeout;

Based on that I am presuming that if either r->server->timeout or
r->connection->base_server->timeout is accessed, they should have a
non-zero value corresponding to what the Timeout directive was set
to, or the default of 300 seconds if not set, but that isn't what
I am seeing in my own handler code; instead 'timeout' is 0.

I am a bit confused at this point as mod_cgi uses:

apr_file_pipe_timeout_set(*script_out, r->server->timeout);

but if r->server->timeout is 0, then a write would return straight
away if it blocked, instead of waiting the default of 300 seconds.

What am I missing? I have seen this on Apache 2.2.4 and 2.2.6 on two
different Mac OS X systems.

BTW, a long time ago, either here or on modules-dev, I queried about code
in mod_cgid and whether it was correct, namely it says:

/* Keep writing data to the child until done or too much time
 * elapses with no progress or an error occurs.
 */
rv = apr_file_write_full(tempsock, data, len, NULL);

My concern was that this wouldn't actually ever timeout as tempsock
never had a timeout value set for it. I never got an answer at the
time

In some code I modeled on how mod_cgid was working, am now seeing
problems which suggests that not only is mod_cgid possibly broken in
not having a timeout, but one can actually deadlock the whole CGI
process and the thread in the Apache child.

I'll admit that my own code doesn't exactly mirror how the cgid
process launcher works, but the issue I see is that if a request has
POST content which is greater in size than the socket buffers can
hold, and a CGI script doesn't consume the POST content before sending
a response, also greater than the socket buffer sizes, then
cgid_handler() can block indefinitely in apr_file_write_full().

This will occur because the CGI process will in turn be blocked in
trying to send its response. It can't return until some data is read
by cgid_handler() in the Apache child process, and that can't happen as
reading the response is only done after the POST content is sent by calls
to apr_file_write_full(). Because a 'timeout' wasn't set on tempsock
though, it will also never actually return, even after the default 300
seconds, and will just deadlock.
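
A hedged sketch of the kind of change that would at least bound the
block, mirroring what mod_cgi does for its pipes ('tempsock' here as in
cgid_handler(); the error handling is only indicative):

    /* Give the socket to the cgid daemon a timeout so the write can
     * fail with APR_TIMEUP rather than blocking forever. */
    apr_file_pipe_timeout_set(tempsock, r->server->timeout);

    rv = apr_file_write_full(tempsock, data, len, NULL);
    if (rv != APR_SUCCESS) {
        /* Could not feed the whole request body; give up on it and fall
         * through to reading whatever response the script produces. */
    }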

Now, in comparing mod_cgi to mod_cgid, in mod_cgi it does make a call to
set a timeout on the pipe for the forked process, but as shown above it
uses r->server->timeout, which keeps showing as 0 for me in my own
handler code.

Anyone like to comment on whether my analysis is correct so as to help
me understand the code and determine if mod_cgi and/or mod_cgid is
broken?

BTW, using a simple test CGI script which returns 16Kb response
without consuming POST content, and using POST with 16Kb of data
against it, I have now duplicated with CGI module what I was seeing in
my own code. Namely, the CGI script and Apache child process thread
both block indefinitely, not even recovering after timeout specified
by Timeout directive. My server for this test was configured with
static mod_cgid module.

Graham


Re: Where is Timeout configuration directive value stored?

2007-11-16 Thread Graham Dumpleton
Hmmm, sorted out why the timeout was showing as 0. Should have just used
APR_TIME_T_FMT in the first place instead of guessing what the format
was supposed to be.
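
For reference, a small sketch of logging the value with the correct
apr_time_t format specifier (where the log call is placed is
illustrative):

    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r,
                  "server timeout = %" APR_TIME_T_FMT " usec (%d sec)",
                  r->server->timeout,
                  (int)apr_time_sec(r->server->timeout));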

Anyway, the issue still stands as detailed at the end of the mail: mod_cgid
seems to lack setting of a timeout on the socket, plus there is the issue
of deadlock with both mod_cgi and mod_cgid. After recompiling Apache to
use mod_cgi instead of mod_cgid, the deadlock still occurs, but at
least for that case the timeout occurs and it recovers after the period
specified by the Timeout directive.

Is it just accepted that CGI scripts will always just consume all
their input, or that when they don't they will not generate a response
larger than the socket buffer size?

Graham

On 17/11/2007, Graham Dumpleton [EMAIL PROTECTED] wrote:
 The function in server/core.c called for the Timeout directive is:

 static const char *set_timeout(cmd_parms *cmd, void *dummy, const char *arg)
 {
     const char *err = ap_check_cmd_context(cmd,
                                            NOT_IN_DIR_LOC_FILE|NOT_IN_LIMIT);

     if (err != NULL) {
         return err;
     }

     cmd->server->timeout = apr_time_from_sec(atoi(arg));
     return NULL;
 }

 Ie., the Timeout directive value is stored in 'timeout' parameter of
 the 'server_rec' for host context that directive appears in.

 Presumably, the Timeout directive in global part of Apache
 configuration would result in that being put in server_rec for main
 server context.

 When ap_fixup_virtual_hosts() is later run, presumably that is then
 copied into vhost server_rec presuming that a vhost didn't contain its
 own Timeout directive setting. Ie., that function contains:

  if (virt->timeout == 0)
      virt->timeout = main_server->timeout;

 Based on that I am presuming that if either r->server->timeout or
 r->connection->base_server->timeout is accessed, they should have a
 non zero value corresponding to what the Timeout directive was being
 set to, or the default of 300 seconds if not set, but that isn't want
 I am seeing in my own handler code and instead 'timeout' is 0.

 I am a bit confused at this point as mod_cgi uses:

 apr_file_pipe_timeout_set(*script_out, r->server->timeout);

 but if r->server->timeout is 0, then a write would return straight
 away if it blocked instead of waiting default of 300 seconds.

 What am I missing? I have seen this on Apache 2.2.4 and 2.2.6 on two
 different Mac OS X systems.

 BTW, a long time ago, either here on modules-dev, I queried about code
 in mod_cgid and whether it was correct, namely it says:

 /* Keep writing data to the child until done or too much time
  * elapses with no progress or an error occurs.
  */
 rv = apr_file_write_full(tempsock, data, len, NULL);

 My concern was that this wouldn't actually ever timeout as tempsock
 never had a timeout value set for it. I never got an answer at the
 time

 In some code I modeled on how mod_cgid was working, am now seeing
 problems which suggests that not only is mod_cgid possibly broken in
 not having a timeout, but one can actually deadlock the whole CGI
 process and the thread in the Apache child.

 I'll admit that my own code doesn't exactly mirror how the cgid
 process launcher works, but the issue I see is that if a request has
 POST content which is greater in size than the socket buffers can
 hold, and a CGI script doesn't consume the POST content before sending
 a response, also greater than the socket buffer sizes, then
 cgid_handler() can block indefinitely in apr_file_write_full().

 This will occur because the CGI process will in turn be blocked in
 trying to send its response. It can't return until some data is read
 by cgid_handler() in Apache child process, that can't happen as
 reading response is only done after the POST content is sent by calls
 to apr_file_write_full(). Because a 'timeout' wasn't set on tempsock
 though, it will also never actually return, even after default 300
 seconds, and will just deadlock.

 Now, in comparing mod_cgi to mod_cgid, in mod_cgi it does make call to
 set timeout on the pipe for forked process, but as shown above it uses
  r->server->timeout, which keeps showing as 0 for me in my own handler
 code.

 Anyone like to comment on whether my analysis is correct so as to help
 me understand the code and determine if mod_cgi and/or mod_cgid is
 broken?

 BTW, using a simple test CGI script which returns 16Kb response
 without consuming POST content, and using POST with 16Kb of data
 against it, I have now duplicated with CGI module what I was seeing in
 my own code. Namely, the CGI script and Apache child process thread
 both block indefinitely, not even recovering after timeout specified
 by Timeout directive. My server for this test was configured with
 static mod_cgid module.

 Graham



Re: repeatable SystemError: bad argument to internal function

2007-10-06 Thread Graham Dumpleton
On 07/10/2007, Aaron Swartz [EMAIL PROTECTED] wrote:
 re: http://www.modpython.org/pipermail/mod_python/2007-June/023854.html

 I've found a way to make this happen repeatedly. Occurs in both 2.4.2
 and 2.5.1. I have a file where every time I read it in, I get it.

If it isn't too large, could you attach the file to:

  http://issues.apache.org/jira/browse/MODPYTHON-234

The only problem I can see though is that so far this seems to relate more
to what size reads are occurring, and so I suspect that on a different OS
or machine setup it may not trigger the same problem for someone else.
You might at least therefore give some details on the machine setup, but
also, if possible, details of what is happening on the client/server side
for the request which is causing it, so a test harness can be produced to
try and trigger the same situation.

Maybe even just discuss it on mod_python developers list first where I
have cc'd this mail.

Thanks.

Graham

