Re: apache rewrite and reverse proxy help

2012-10-14 Thread Ben Noordhuis
On Sun, Oct 14, 2012 at 1:12 AM, Abdel  wrote:
> I have back end website running on Tomcat with the following url
> http://local.domain.com/app. External user access the website through apache
> proxy with the following url http://www.domain.com/user1 (user1, user2,
> etc... It’s uri specific to each user). I want to use apache  rewrite or/
> and reverse proxy  directive to translate the url like
> http://www.domain.com/user1 into http://local.domain.com/app?user=user1
> Please can someone help me please?

You're on the wrong mailing list. Try users.


Re: RPC over HTTP

2012-10-03 Thread Ben Noordhuis
On Wed, Oct 3, 2012 at 2:06 PM, Evgeny Shvidky  wrote:
> Hi,
>
> ap_setup_client_block() returns OK.
>
> I think the problem is that "Content-Length" header value is 1073741824 (1 
> GB) and probably apache tries to receive the whole content before it passed 
> to my module.
> Am I right?
> If yes, is there any way to tell apache to send all received data till now?

ap_get_client_block() tries to read up to the number of bytes you
requested. Now, if you passed in the value of the Content-Length
header, then it will either stall for a long time or simply fail.
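
For illustration, a minimal sketch of the usual read loop, asking for a small fixed-size buffer on each call instead of the Content-Length value. The helper name `read_body_in_pieces` and the 4096-byte buffer are assumptions made for this example, and it only builds inside a module with the httpd headers available. Whether each call returns as soon as the 104-byte message arrives depends on the input filters in play, so treat it as a starting point rather than a confirmed fix.

```c
#include "httpd.h"
#include "http_protocol.h"

/* Hypothetical helper: drain the request body in fixed-size pieces. */
static int read_body_in_pieces(request_rec *r)
{
    char buf[4096];
    long n;
    int rv = ap_setup_client_block(r, REQUEST_CHUNKED_ERROR);

    if (rv != OK)
        return rv;

    if (ap_should_client_block(r)) {
        /* Ask for at most sizeof(buf) bytes per call, never Content-Length. */
        while ((n = ap_get_client_block(r, buf, sizeof(buf))) > 0) {
            /* process the n bytes now in buf */
        }
        if (n < 0)
            return HTTP_INTERNAL_SERVER_ERROR;
    }
    return OK;
}
```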


Re: RPC over HTTP

2012-10-03 Thread Ben Noordhuis
On Wed, Oct 3, 2012 at 11:57 AM, Evgeny Shvidky  wrote:
> Hi,
>
> I am developing a new module on C.
> One of the requirements of my module is to receive and handle RPC over HTTP 
> protocol.
> RPC over HTTP opens two HTTP/1.1 requests:
> One with request method RPC_IN_DATA to send data to the server, and second 
> one with method RPC_OUT_DATA to send data back to the client. The body 
> consists of raw binary data, and the connections are apparently re-used for 
> several RPCs.
> Here's an example of an IN connection header:
> RPC_IN_DATA /rpc/rpcproxy.dll?:6002 HTTP/1.1
> Content-Length: 1073741824
> ...
> After connections are established client sends on "IN" channel RPC message 
> with 104 bytes.
>
> I use the following apache API's in order to read these client message.
> ap_setup_client_block(userReq, REQUEST_CHUNKED_ERROR);
> ap_should_client_block(userReq)
> ap_get_client_block(userReq, buf, size)
>
> The problem is "ap_should_client_block" function returns "1" (means there is 
> message to read) but "ap_get_client_block" returns error (-1) and nothing 
> read.

Do you check that ap_setup_client_block() returns OK?

Does ap_get_client_block() return -1 on the first call? If yes, you
may want to step through the function in a debugger to see what the
error condition is.

> How should I read client's data?
> Is there any other API for it?

The filter API?


Re: aprmemcache question

2012-09-27 Thread Ben Noordhuis
On Thu, Sep 27, 2012 at 4:29 PM, Joshua Marantz  wrote:
> Thanks Ben,
>
> That might be an interesting hack to try, although I wonder whether some of
> our friends running mod_pagespeed on FreeBSD might run into trouble with
> it.  I did confirm that my prefork build has APR built with
> APR_HAS_THREADS, which for some reason I had earlier thought was not the
> case.

It should work, provided you linked against libapr. The FreeBSD man
page says this:

  If dlsym() is called with the special handle NULL, it is interpreted as a
  reference to the executable or shared object from which the call is being
  made.  Thus a shared object can reference its own symbols.

And that's how it works on Linux, Solaris, NetBSD and probably OpenBSD as well.

> Do you have a feel for the exact meaning of that TTL parameter to
> apr_memcache_server_create?

You mean what units it uses? Microseconds (at least, in 2.4).


Re: aprmemcache question

2012-09-27 Thread Ben Noordhuis
On Thu, Sep 27, 2012 at 4:05 AM, Joshua Marantz  wrote:
> RE "failing the build of my module" -- the dominant usage is via
> precompiled binaries we supply.  Is there an apr query for determining
> whether apr was compiled with threads I could do on startup?

I don't think there's an official way but you know apr was compiled
with APR_HAS_THREADS when dlsym(NULL, "apr_os_thread_current") !=
NULL.

Using dlsym() like that is not quite compatible with POSIX but it
works on all the major Unices.
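
A minimal sketch of that check, assuming a dynamically linked process on a platform where a NULL handle is accepted by dlsym(). The helper names `symbol_is_loaded` and `apr_has_threads_at_runtime` are made up for this example.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if `name` resolves somewhere in the running process, else 0.
 * Passing NULL as the handle is the not-quite-POSIX trick described above. */
static int symbol_is_loaded(const char *name)
{
    return dlsym(NULL, name) != NULL;
}

/* Hypothetical helper: guess whether the loaded libapr has thread support. */
static int apr_has_threads_at_runtime(void)
{
    return symbol_is_loaded("apr_os_thread_current");
}
```

A module could call `apr_has_threads_at_runtime()` once at startup and fall back to a non-threaded code path when it returns 0. On older glibc versions the program may need to be linked with -ldl.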


Re: Input filters that alter content-length

2012-09-25 Thread Ben Noordhuis
On Tue, Sep 25, 2012 at 2:51 PM, Ivan Prostran  wrote:
> Do you suggest chunked transfer encoding?

Yep. There's nary a proxy problem you can't solve with chunked encoding.
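
To show why chunked encoding sidesteps the Content-Length question, here is a small sketch of the HTTP/1.1 chunk framing: each chunk carries its own hex length, so the total body size never has to be known up front. `write_chunk` is a hypothetical helper written for this example, not an httpd or APR function.

```c
#include <stdio.h>
#include <string.h>

/* Frame `len` bytes of `data` as one HTTP/1.1 chunk in `out`:
 *   <hex length> CRLF <data> CRLF
 * Returns the number of bytes written, or 0 if `out` is too small.
 * The output is not NUL-terminated. */
static size_t write_chunk(char *out, size_t outsz, const void *data, size_t len)
{
    int hdr = snprintf(out, outsz, "%zx\r\n", len);

    if (hdr < 0 || (size_t)hdr + len + 2 > outsz)
        return 0;
    memcpy(out + hdr, data, len);
    memcpy(out + hdr + len, "\r\n", 2);
    return (size_t)hdr + len + 2;
}
```

Calling it with `len == 0` produces `0\r\n\r\n`, which is the terminating chunk (with no trailers) that ends the body.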


Re: Input filters that alter content-length

2012-09-25 Thread Ben Noordhuis
On Tue, Sep 25, 2012 at 1:15 PM, Ivan Prostran  wrote:
> Hi,
>
> I have the following scenario :
>
> - Apache/2.2.19 (Solaris 10 SPARC)
>
>  SetInputFilter alterxmlbody (AP_FTYPE_RESOURCE)
>  SetHandler weblogic-handler
>
>
> The handler forwards requests to multiple weblogic servers
> and the filter itself analyzes and/or changes the POST data
> in a way that the length of the incoming request may be changed.
>
> The question is :
>
> What is the proper and legitimate way to deal with this situation?
>
>
> Should the input filter set the new CL value in "headers_in" before
> returning from the callback, or just remove it from the list?
>
>
> apr_off_t length;
> apr_brigade_length (bb,1,&length);
> apr_table_setn(f->r->headers_in, "Content-Length", apr_off_t_toa(f->r->pool,
> length));
>
> Unfortunately, this approach breaks the pipeline because sometimes
> the filter needs to buffer data over more than one call, but I do not
> consider this as a problem.
>
> or
>
> apr_table_unset(f->r->headers_in, "Content-Length");

You didn't tell how your handler works (how it forwards the request)
but removing the Content-Length header is probably easiest. Does the
upstream server understand HTTP/1.1 or HTTP/1.0 with a TE header?


Re: Bucket brigade & filter thread safety

2012-09-11 Thread Ben Noordhuis
On Mon, Sep 10, 2012 at 3:53 PM, Alex Bligh  wrote:
> Ben,
>
>
>> No, but the documentation omits some crucial details.
>> apr_pool_create() is thread-safe only if:
>>
>> 1. libapr is compiled with APR_HAS_THREADS
>> 2. APR_POOL_DEBUG is turned off
>> 3. the parent pool has a thread-safe allocator (which is true for the
>> global allocator that is used when parent=NULL, provided conditions #1
>> and #2 are satisfied)
>>
>> The pools you get from httpd core satisfy #3 but a module may replace
>> e.g. r->pool with another pool that doesn't. Ergo, don't rely on a
>> pool being thread safe unless you explicitly make it so.
>
>
> Ah, OK. I had thought that was only an issue as when the pool create
> was running (at which point I am single threaded). But I can fix
> that - thanks.
>
>
>>>> That won't solve all your problems though. Bucket brigades are not
>>>> thread safe, you will need something to synchronize on.
>>>
>>>
>>>
>>> So what I was trying to do was to use
>>> a) the input bucket brigade in thread #1 (main thread)
>>> b) the output bucket brigade in thread #2
>>> in an attempt to avoid synchronization
>>>
>>> But what I don't understand is whether thread #2, in writing
>>> to the output filters (which presumably have a reference
>>> to r->pool) will need synchronisation.
>>
>>
>> Yes. It's not just because r->pool may or may not be synchronized, the
>> internal structure of the bucket brigade is not protected by any locks
>> either.
>
>
> Oh I understand that, but I thought in the example above only
> thread #1 would be accessing the input bucket brigade and
> only thread #2 would be accessing the output bucket brigade,
> so there would be no need for synchronisation as they were
> thread local.
>
>
>>> And if I have to synchronize, how do I do that in practice?
>>> Thread #2 does an ap_fwrite/ap_flush so I can hold a mutex
>>> there. But what do I do in thread #1, which calls ap_brigade_get
>>> and blocks? I can't hold a mutex during that. I can make it
>>> a non-blocking ap_brigade_get (if I understood how to do it)
>>
>>
>> Non-blocking reads are pretty straightforward:
>>
>>   apr_thread_mutex_lock(&mutex);
>>   rv = ap_get_brigade(f->next, bb, AP_MODE_READBYTES, APR_NONBLOCK_READ, len);
>>   if (APR_STATUS_IS_EAGAIN(rv)) apr_thread_cond_wait(&cond, &mutex);
>>   rv = ap_get_brigade(f->next, bb, AP_MODE_READBYTES, APR_NONBLOCK_READ, len);
>>   apr_thread_mutex_unlock(&mutex);
>>
>> The other thread wakes up this thread with apr_thread_cond_signal(&cond).
>
>
> I think I may not have explained what I am doing clearly. One thread
> is doing input (from the apache client), and the other output (to
> the apache client). The ap_brigade_get is in the input thread (the
> main thread) and is blocking on the client sending more data. So
> the thing that would need to wake the thread up is more data becoming
> ready from the client - nothing to do with the other
> thread. I don't know how to detect that.
>
>
>>> but what I really need is the equivalent of a select() which
>>> I can do with the mutex not held (or some way to drop the mutex
>>> during the raw reads). Any ideas?
>>
>>
>> You could set up a pollset in the main thread and funnel incoming data
>> into your bucket brigade. Not terribly efficient (lots of context
>> switches) but the real world impact may very well be negligible and
>> you can support multi-process setups with zero changes to your code.
>
>
> Hmm, well if I could use a pollset in my main thread I wouldn't
> need bucket brigades at all. But as this is data coming from the
> client, surely it's going to be in a bucket brigade already as
> it will have passed through all the input filters etc., having
> been read by apache itself?
>
> Diagrammatically:
>
>
>              | APACHE  |   ... MODULE ... |
>
>  Client <==> Apache > ap_get_brigade > do_something  [thread1]
>                 ^
>                 |-- ap_fwrite <--- do_something_else   [thread2]
>
> Each thread has a different bucket brigade with a different allocator.
>
> ap_get_brigade either needs to block, or if it's non-blocking it
> needs to wait on a condwait or something for /apache/ to produce
> more data from the client, not on the other thread.

Right, I think I see what you mean.

Apache may not be a perfect fit. I once had to solve a similar issue
and I eventually settled on sending the socket to another process.
Managing mostly dormant connections turned out to be too much of a
headache to do from within httpd.


Re: Bucket brigade & filter thread safety

2012-09-10 Thread Ben Noordhuis
On Mon, Sep 10, 2012 at 10:47 AM, Alex Bligh  wrote:
> Ben,
>
> Thanks for your reply.
>
>
>>> I am suffering from very occasional corruption of the bucket brigade
>>> which normally shows up as a corrupted pool pointer or a bogus bucket
>>> entry in thread #1 (for instance a SEGV in apr_brigade_length).
>>> Interestingly this is the relatively quiet input brigade which is only
>>> ever touched by the main apache thread. It's almost as if an allocator
>>> is not thread safe.
>>
>>
>> That's because it isn't unless you explicitly make it so (which no MPM
>> does).
>
>
> What I'm trying to do is to use a separate allocator per thread.
>
>
>>> However, I'm using a separate bucket allocator (see code above) and (at
>>> least in my code) a separate pool.
>>
>>
>> They're not really separate, the sub pool is created off r->pool. You
>> should probably use apr_pool_create_ex() here with parent=NULL and an
>> allocator that you created with apr_allocator_create() +
>> apr_allocator_mutex_set().
>
>
> OK I based this on the docs for apr_pool_create which says at
> http://apr.apache.org/docs/apr/1.4/group__apr__pools.html#ga918adf3026c894efeae254a0446aed3b
>
> : This function is thread-safe, in the sense that multiple threads can
> : safely create subpools of the same parent pool concurrently. Similarly, a
> : subpool can be created by one thread at the same time that another thread
> : accesses the parent pool.
>
> Have I understood that wrong?

No, but the documentation omits some crucial details.
apr_pool_create() is thread-safe only if:

1. libapr is compiled with APR_HAS_THREADS
2. APR_POOL_DEBUG is turned off
3. the parent pool has a thread-safe allocator (which is true for the
global allocator that is used when parent=NULL, provided conditions #1
and #2 are satisfied)

The pools you get from httpd core satisfy #3 but a module may replace
e.g. r->pool with another pool that doesn't. Ergo, don't rely on a
pool being thread safe unless you explicitly make it so.
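
A sketch of what "explicitly make it so" can look like: a pool with its own mutex-protected allocator, created with parent=NULL instead of as a child of r->pool. The helper name `create_locked_pool` is an assumption for this example. It needs an APR built with APR_HAS_THREADS and only compiles with the APR headers available.

```c
#include "apr_allocator.h"
#include "apr_pools.h"
#include "apr_thread_mutex.h"

/* Hypothetical helper: create a pool whose allocator is mutex-protected. */
static apr_status_t create_locked_pool(apr_pool_t **out)
{
    apr_allocator_t *allocator;
    apr_thread_mutex_t *mutex;
    apr_pool_t *pool;
    apr_status_t rv;

    if ((rv = apr_allocator_create(&allocator)) != APR_SUCCESS)
        return rv;

    /* parent == NULL: the new pool is not a child of r->pool. */
    rv = apr_pool_create_ex(&pool, NULL, NULL, allocator);
    if (rv != APR_SUCCESS) {
        apr_allocator_destroy(allocator);
        return rv;
    }
    /* From here on, destroying the pool also destroys the allocator. */
    apr_allocator_owner_set(allocator, pool);

    rv = apr_thread_mutex_create(&mutex, APR_THREAD_MUTEX_DEFAULT, pool);
    if (rv != APR_SUCCESS) {
        apr_pool_destroy(pool);
        return rv;
    }
    apr_allocator_mutex_set(allocator, mutex);

    *out = pool;
    return APR_SUCCESS;
}
```

This only protects allocation. Any brigade built on top of such a pool still needs its own synchronization.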

>> That won't solve all your problems though. Bucket brigades are not
>> thread safe, you will need something to synchronize on.
>
>
> So what I was trying to do was to use
> a) the input bucket brigade in thread #1 (main thread)
> b) the output bucket brigade in thread #2
> in an attempt to avoid synchronization
>
> But what I don't understand is whether thread #2, in writing
> to the output filters (which presumably have a reference
> to r->pool) will need synchronisation.

Yes. It's not just because r->pool may or may not be synchronized, the
internal structure of the bucket brigade is not protected by any locks
either.

> And if I have to synchronize, how do I do that in practice?
> Thread #2 does an ap_fwrite/ap_flush so I can hold a mutex
> there. But what do I do in thread #1, which calls ap_brigade_get
> and blocks? I can't hold a mutex during that. I can make it
> a non-blocking ap_brigade_get (if I understood how to do it)

Non-blocking reads are pretty straightforward:

  apr_thread_mutex_lock(&mutex);
  rv = ap_get_brigade(f->next, bb, AP_MODE_READBYTES, APR_NONBLOCK_READ, len);
  if (APR_STATUS_IS_EAGAIN(rv)) apr_thread_cond_wait(&cond, &mutex);
  rv = ap_get_brigade(f->next, bb, AP_MODE_READBYTES, APR_NONBLOCK_READ, len);
  apr_thread_mutex_unlock(&mutex);

The other thread wakes up this thread with apr_thread_cond_signal(&cond).

> but what I really need is the equivalent of a select() which
> I can do with the mutex not held (or some way to drop the mutex
> during the raw reads). Any ideas?

You could set up a pollset in the main thread and funnel incoming data
into your bucket brigade. Not terribly efficient (lots of context
switches) but the real world impact may very well be negligible and
you can support multi-process setups with zero changes to your code.


Re: Bucket brigade & filter thread safety

2012-09-09 Thread Ben Noordhuis
On Sun, Sep 9, 2012 at 2:31 PM, Alex Bligh  wrote:
> I am trying to work out how to develop a thread-safe module with two
> threads, one thread reading and one thread writing. I'm using mpm-prefork
> on apache 2.2.14-5ubuntu8.9 in case that matters. The module is a websocket
> proxy.
>
> My main thread (#1) is doing (in essence - error handling removed)
>
> apr_pool_create(&opool, r->pool);
> apr_bucket_alloc_t *oallocator = apr_bucket_alloc_create(opool);
> apr_bucket_brigade *obb = apr_brigade_create(opool, oallocator);
> apr_thread_create (... otherthread ...);
> while (1) {
>    bb = apr_brigade_create(r->pool, r->connection->bucket_alloc);
>    ap_get_brigade(r->input_filters, bb, AP_MODE_READBYTES,
>                   APR_BLOCK_READ, bufsiz);
>    apr_brigade_flatten(bb, buffer, &bufsiz);
>    apr_brigade_destroy(bb);
>    do_something_with(buffer);
> }
>
> The other thread (#2) is doing (in essence)
>
> while (1) {
>   apr_pollset_poll(recvpollset, timeout, ... );
>   apr_socket_recv(socket, buf, ...)
>   do_something_else_with(buf);
>   ap_fwrite(r->connection->output_filters, obb, ...);
>   ap_fflush(r->connection->output_filters, obb, ...);
> }
>
> I am suffering from very occasional corruption of the bucket brigade which
> normally shows up as a corrupted pool pointer or a bogus bucket entry in
> thread #1 (for instance a SEGV in apr_brigade_length). Interestingly this
> is the relatively quiet input brigade which is only ever touched by the
> main apache thread. It's almost as if an allocator is not thread safe.

That's because it isn't unless you explicitly make it so (which no MPM does).

> However, I'm using a separate bucket allocator (see code above) and (at
> least in my code) a separate pool.

They're not really separate, the sub pool is created off r->pool. You
should probably use apr_pool_create_ex() here with parent=NULL and an
allocator that you created with apr_allocator_create() +
apr_allocator_mutex_set().

That won't solve all your problems though. Bucket brigades are not
thread safe, you will need something to synchronize on.

> Could the output filter chain somehow be using the allocator attached to
> the request (i.e. thread #1)? If so, how can I stop this? I can't run the
> ap_fwrite/ap_flush under a mutex and have the same mutex held during
> ap_get_brigade, as the latter blocks (and I can't see how to use the
> non-blocking version without spinning).
>
> [apologies for the partial dupe on the apache-users mailing list - I think
> I've got nearer the problem since then and this list seems more
> appropriate]
>
> --
> Alex Bligh


Re: problem with ap_md5 in custom module

2012-08-15 Thread Ben Noordhuis
On Wed, Aug 15, 2012 at 5:13 PM, nik600  wrote:
> Dear all
>
> i'm having a problem with ap_md5, i just want to write a custom module
> that computes the md5 checksum of the requested url and give it back to
> the user.
>
> This is my code:
> static int kcache_handler(request_rec* r)
> {
>     if (!r->handler || strcmp(r->handler, "kcache"))
>         return DECLINED;
>
>     if (r->method_number != M_GET)
>         return HTTP_METHOD_NOT_ALLOWED;
>
>     char* kcache_md5;
>     kcache_md5 = (char *)ap_md5(r->pool,r->unparsed_uri);
>
>     ap_set_content_type(r, "text/html;charset=ascii");
>     ap_rputs("", r);
>     ap_rputs("K-Hello World!", r);
>     ap_rprintf(r,"K-Hello World!%s=%s", r->unparsed_uri,kcache_md5);
>     return OK;
> }
>
> i've got a warning during compilation:
>
> src/mod_kcache.c:18:15: warning: cast to pointer from integer of
> different size [-Wint-to-pointer-cast]
>
> Is quite strange to me that ap_md5 returns an int, as in the
> documentation it is reported to return a char *
>
> http://ci.apache.org/projects/httpd/trunk/doxygen/group__APACHE__CORE__MD5.html
>
> By the way, if i try to run it i get a segfault, if i comment the line
> that prints kcache_md5 with   ap_rprintf the module doesn't segfault.
>
> So, where i'm wrong?

It seems you don't include util_md5.h so the compiler defaults the
function prototype to `int ap_md5()`. Compile with -Wall -Wextra and
you'll get warnings about things like that.


Re: setting proxypass value

2012-08-15 Thread Ben Noordhuis
On Wed, Aug 15, 2012 at 10:11 AM, Ian B  wrote:
> I have a module that retrieves virtualhost config from LDAP backend
> (mod_vhost_ldap by Ondrej Sury). It works well however I'd like to extend it
> to also get and set a ProxyPass / ProxyPassMatch value for each virtualhost.
>
> Reading the value from LDAP is the easy part, however I'm stuck knowing how
> to set the value within Apache (I'm new to Apache development). I'm guessing
> that I will need to link to mod_proxy somehow and call one of it's
> functions??
>
> Could someone please point me in the right direction?
>
> Thanks,
> Ian

I don't think mod_proxy lets you do that, you'll have to patch it. The
relevant code is in add_pass() in modules/mod_proxy/mod_proxy.c.


Re: Anyone have some example code doing simple HTTP GET request from within a module?

2012-06-22 Thread Ben Noordhuis
On Sat, Jun 23, 2012 at 4:47 AM,   wrote:
> Per earlier threads on this list, I've been working on an Apache module.  For 
> the time being, I'm kind of stuck because of the problems that I've run into 
> with trying to integrate my module with a 3rd party library, so just for my 
> module, which is mainly a proof-of-concept, I'd like to have my module do an 
> HTTP GET request.
>
> So, I was wondering if anyone has some simple example code for doing that 
> from within a module, maybe using libcurl, or just natively using sockets?
>
> I'm trying to do this myself, and I've been looking at using libcurl, but 
> most of the examples that I've seen use the "easy" setup, so if someone has 
> something like that that can be shared, it'd be a big help.  Conversely, if I 
> figure it out, I'll post some working snippets here :)...

I suggest you reuse the existing infrastructure where possible. Have a
look at how mod_proxy makes HTTP requests.


Re: UNSOLVED was Re: SOLVED was Re: How to compiling/link/use Apache module that uses shared library?

2012-06-21 Thread Ben Noordhuis
On Fri, Jun 22, 2012 at 3:32 AM,   wrote:
> Program received signal SIGSEGV, Segmentation fault.
> 0x003518d6c1e1 in BN_num_bits () from /lib64/libcrypto.so.4
>
> So, it's actually blowing up in "BN_num_bits()" in /lib64/libcrypto.so.4?

Type `bt full` and you'll get a backtrace + locals. That should tell
you where the error originates from and maybe why it happens.


Re: How to compiling/link/use Apache module that uses shared library?

2012-06-21 Thread Ben Noordhuis
On Thu, Jun 21, 2012 at 8:43 PM,   wrote:
> I tried that, which allowed me to start Apache, but am getting a segfault.

Run it through gdb and inspect the backtrace. Compiling with debug
symbols and optimizations disabled (-g -O0) will help.


Re: Best (safest) way to edit char string (from envvars)?

2012-06-20 Thread Ben Noordhuis
On Wed, Jun 20, 2012 at 4:35 PM,   wrote:
> Hi,
>
> I am working on a module, and I get one of the SSL envvars, SSL_CLIENT_CERT, 
> using apr_table_get() into a const char *.
>
> The client cert char string returned has the extra beginning line (-----BEGIN 
> CERTIFICATE-----) and ending line (-----END CERTIFICATE-----), but I need to 
> remove both of those lines for a call that I need to make.
>
> I have to admit, I'm a bit (a lot) rusty with 'C', and I guess I could do 
> something like:
>
> strcpy(original_cert, original_cert+27);
>
> and then set the ending position to \0 (to terminate the char string early), 
> but since with this is a module, and I'm working with a pointer to the memory 
> pool, I'm kind of worried that doing stuff like that would mess things up 
> (e.g., garbage collection, since the string is now shorter by 'x' bytes).
>
> So, from an Apache module development standpoint, what would be the safest 
> way to do this (strip a string of chars from the beginning and end)?

Make a copy with apr_pstrdup(), then mutate the copy.

APR has utility functions for manipulating strings, like apr_strtok().
Have a look at apr_strings.h.
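
A minimal sketch of the copy-then-trim approach in plain C, so it can be tried outside httpd. `pem_body_dup` is a hypothetical helper, and malloc() stands in for the pool allocation; inside a module the copy would more likely come from apr_pstrmemdup(r->pool, body, len) so the pool owns the memory.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of the text between the "-----BEGIN ...-----"
 * line and the "-----END" marker, including the newline before the marker.
 * Returns NULL if either marker is missing. The input is never modified. */
static char *pem_body_dup(const char *pem)
{
    const char *begin = strstr(pem, "-----BEGIN");
    const char *body, *end;
    size_t len;
    char *copy;

    if (begin == NULL || (body = strchr(begin, '\n')) == NULL)
        return NULL;
    body++;                          /* skip past the header line */
    if ((end = strstr(body, "-----END")) == NULL)
        return NULL;

    len = (size_t)(end - body);
    copy = malloc(len + 1);          /* module code: use the request pool */
    if (copy == NULL)
        return NULL;
    memcpy(copy, body, len);
    copy[len] = '\0';
    return copy;
}
```

Searching for the markers avoids hard-coding an offset like 27, and the string returned by apr_table_get() stays untouched.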


Re: How to access client certificate PEM and incoming request headers in a module?

2012-06-18 Thread Ben Noordhuis
On Mon, Jun 18, 2012 at 8:53 AM,   wrote:
> I added a call to header_request_env_var(r, "REMOTE_URI"), just to see what 
> it got (running Apache in single-process mode):
>
> printf("REMOTE_URI=[%s]\n", header_request_env_var(r, "REMOTE_URI") );
>
> Then I pointed a browser to http:///test, where /test was a 
>  with a RequestHeader (to trigger mod_headers) but I got:
>
> REMOTE_URI=[(null)]
>
> Shouldn't that be showing:
>
> REMOTE_URI=[/test]
>
> ??

Did you mean REMOTE_USER or REQUEST_URI? I don't think there's such a
thing as REMOTE_URI.


Re: How to access client certificate PEM and incoming request headers in a module?

2012-06-17 Thread Ben Noordhuis
On Mon, Jun 18, 2012 at 5:45 AM,   wrote:
> I haven't actually tried your suggestion yet, but, re. the SSL variables, I 
> was looking at mod_headers.c, and in there, there are two separate functions:
>
> static const char *header_request_env_var(request_rec *r, char *a)
> {
>    const char *s = apr_table_get(r->subprocess_env,a);
>
>    if (s)
>        return unwrap_header(r->pool, s);
>    else
>        return "(null)";
> }
>
> static const char *header_request_ssl_var(request_rec *r, char *name)
> {
>    if (header_ssl_lookup) {
>        const char *val = header_ssl_lookup(r->pool, r->server,
>                                            r->connection, r, name);
>        if (val && val[0])
>            return unwrap_header(r->pool, val);
>        else
>            return "(null)";
>    }
>    else {
>        return "(null)";
>    }
> }
>
> So, it seems like the method to get the SSL variables is different than the 
> other environment variables?
>
> Or, does setting SSLOptions the way that you suggested cause the SSL variable 
> to also exist in apr_table_get(r->subprocess_env, )?

Oh, I forgot about that. It's the ssl_var_lookup optional function,
that might even work without having to tweak SSLOptions.
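
A sketch of how a module might pick up that optional function, modelled on what the quoted mod_headers code does. The names `lookup_ssl_var`, `my_post_config` and `client_cert_pem` are made up for this example, and it only builds inside a module with the httpd and mod_ssl headers available.

```c
#include "httpd.h"
#include "http_config.h"
#include "apr_optional.h"
#include "mod_ssl.h"

static APR_OPTIONAL_FN_TYPE(ssl_var_lookup) *lookup_ssl_var = NULL;

/* Runs after all modules are loaded, so mod_ssl has registered its function. */
static int my_post_config(apr_pool_t *pconf, apr_pool_t *plog,
                          apr_pool_t *ptemp, server_rec *s)
{
    lookup_ssl_var = APR_RETRIEVE_OPTIONAL_FN(ssl_var_lookup);
    return OK;
}

/* Hypothetical helper: the client certificate PEM, or NULL without mod_ssl. */
static const char *client_cert_pem(request_rec *r)
{
    if (lookup_ssl_var == NULL)
        return NULL;   /* mod_ssl is not loaded */
    return lookup_ssl_var(r->pool, r->server, r->connection, r,
                          (char *)"SSL_CLIENT_CERT");
}

static void register_hooks(apr_pool_t *p)
{
    ap_hook_post_config(my_post_config, NULL, NULL, APR_HOOK_MIDDLE);
}
```

As in mod_headers, the result should be checked for NULL and for an empty string before use.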


Re: How to access client certificate PEM and incoming request headers in a module?

2012-06-17 Thread Ben Noordhuis
On Sun, Jun 17, 2012 at 9:46 PM,   wrote:
> Hi,
>
> I am starting to look into implementing an Apache module that can use 
> information from an incoming request, including several headers and the 
> subject string from a client certificate to do authentication.
>
> I've been looking at the source for mod_auth_certificate, from 
> https://modules.apache.org/, as a starting point.
>
> However, it looks like the way that mod_auth_certificate works is that it 
> requires that there's an SSLUserName directive to put the client certificate 
> DN into the Apache REMOTE_USER attribute, whereas I need the entire PEM for 
> the client cert to do authentication that I'm trying to do.
>
> So I was wondering if it's possible for a module to access the 
> SSL_CLIENT_S_DN and SSL_CLIENT_CERT environment variables, and if so, how?

They should be set in r->subprocess_env provided `SSLOptions
+StdEnvVars +ExportCertData` is set in the server or per-directory
config.

> Also, as mentioned my module would need to access several HTTP headers that 
> are in the incoming requests.  How can it do that?

Look them up with `apr_table_get(r->headers_in, "X-Header-Name")`.


Re: Get the directory of the module

2012-06-11 Thread Ben Noordhuis
On Mon, Jun 11, 2012 at 10:10 PM, Bart Wiegmans  wrote:
> Hello everybody,
>
> For a project I'm doing, I need to install a few bytecode files
> alongside my module. I was planning on placing them in the modules
> directory but I realised that at runtime I do not know where that is.
> What is more, that directory may (and will, as a matter of fact) vary
> during installation and testing. But most importantly, the server
> knows where the module is kept as it specified by LoadModule.
>
> So in short, how can I determine the directory of my module at runtime?

You can't, the file path is internal to mod_so.c. I don't think it
even stores it.


Re: Change Request-Header before mod_rewrite

2012-06-04 Thread Ben Noordhuis
On Mon, Jun 4, 2012 at 10:53 PM, Marc apocalypse17  wrote:
> Hi all,
>
> I just developed my first apache module following the tutorial on the apache 
> website. The module is responsible for adding one header value to the active 
> request which must be checked in a mod_rewrite ReWriteCondition.
> The problem is, that this value never reaches the mod_rewrite Rule. The 
> Header just behaves the same as the original request. Does anyone know why? 
> What am I doing wrong?
>
> My module looks like this:
>
> static int helloworld_handler(request_rec* r){
>    if (!r->main) {
>        apr_table_setn(r->headers_in, "X-CUSTOM-HEADER", "1");
>    }
>    return DECLINED;
> }
>
> static void register_hooks(apr_pool_t* pool){
>    ap_hook_handler(helloworld_handler, NULL, NULL, APR_HOOK_FIRST);
> }
>
> module AP_MODULE_DECLARE_DATA helloworld_module = {
>    STANDARD20_MODULE_STUFF,
>    NULL,
>    NULL,
>    NULL,
>    NULL,
>    example_directives,
>    register_hooks
> };
>
> The .htaccess file looks like this:
>
> RewriteEngine on
> RewriteCond %{HTTP:X-CUSTOM-HEADER} 1 [NC]
> RewriteRule from.html to.html
>
> The RewriteRule is never executed. It always shows the content of 
> from.html.
>
>
> Thank you in advance,
> Marc

mod_rewrite.c does most of the interesting work in a translate_name
hook. By the time your handler hooks run, it's probably too late.
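
One way to get in earlier, as a sketch: set the header from a post_read_request hook, which runs right after the request headers are read and before URI translation. The function name `helloworld_post_read` is an assumption for this example. It only builds inside a module with the httpd headers available and has not been tested against the .htaccess setup above.

```c
#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"
#include "apr_tables.h"

/* Runs right after the request headers are read, before URI translation. */
static int helloworld_post_read(request_rec *r)
{
    if (!r->main)
        apr_table_setn(r->headers_in, "X-CUSTOM-HEADER", "1");
    return DECLINED;   /* let other modules' hooks run too */
}

static void register_hooks(apr_pool_t *pool)
{
    ap_hook_post_read_request(helloworld_post_read, NULL, NULL, APR_HOOK_FIRST);
}
```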


Re: mod_ssl ignores connection->aborted & eos_sent

2012-03-07 Thread Ben Noordhuis
On Tue, Mar 6, 2012 at 13:27, Daniil A Megrabjan
 wrote:
> Anyway, as far as I understood I'm not allowed to change the default
> behavior of mod_ssl. In this case there is the other question -  how to
> register my hook to be really before  mod_ssl? and even if request has
> been received on 443 TCP port process the request as usual HTTP.
>
> Something like:
> static const char * const aszPre[] = { "mod_ssl.c", NULL };
>
> ap_hook_handler(ixcell_init_handler, aszPre, NULL, APR_HOOK_REALLY_FIRST);
>
> doesn't help.

Try ap_hook_pre_connection().


Re: mod_ssl ignores connection->aborted & eos_sent

2012-03-05 Thread Ben Noordhuis
On Mon, Mar 5, 2012 at 22:34, Daniil A Megrabjan
 wrote:
> Hello,
>
> I'm writing a module which serves a special URL.
> In cases when URL-string matches the special pattern my module sends the 
> connection(SCM_RIGHTS) between HTTP client and Apache to another process. 
> Furthermore, Apache child has been told to forget about this connection in 
> this way:
> r->connection->aborted = 1;
> r->eos_sent = 1;
>
> After that my process communicates with HTTP-client by itself without 
> Apache's assistance.
>
> Everything is fine with this scheme inside basic HTTP, but when I'm switching 
> to HTTPS I can guess that mod_ssl ignores "aborted" and "eos_sent" properties 
> and eventually drops the connection.
>
> How to persuade mod_ssl not to touch the connection?

I don't think you can - or should. How will you decrypt the traffic?
The SSL/TLS session parameters are private to mod_ssl.


Re: thread ID

2012-03-01 Thread Ben Noordhuis
On Thu, Mar 1, 2012 at 17:29,   wrote:
> Hello,
>
> I would need a memory buffer associated per worker thread (in the worker
> MPM) or to each process (in the prefork MPM).
>
> In order to do that, I would need a map thread<->buffer. So, I would
> need a sort of thread ID/key/handle that stays the same during the
> lifetime of the thread and no two threads in the same process can have
> the same ID/key/handle.
>
> What is the most portable way to get this thread ID?
>
> I thought of r->connection->id. It works but it is not very portable as
> it is not guaranteed that two connections created by the same thread
> will have the same id. They do for now.
>
> If r->connection->sbh was not opaque it would be great, because
> sbh->thread_num would be exactly what I need.
>
> I could also use pthread_self. It works too but, in general, it is not
> guaranteed that the worker threads are pthreads.
>
>
> Thank you for your help.
>
> Sorin

What about apr_os_thread_current()? It returns an opaque value that's a
pthread_t on Unices and a pseudo-HANDLE on Windows. Read this[1] to
understand what that means.

As a recovering standards lawyer I should probably point out that
pthread_t is an opaque type that's not guaranteed to be convertible to
a numeric value (or to anything, really). That said, I've never seen a
pthreads implementation where that wasn't the case.

[1] 
http://msdn.microsoft.com/en-us/library/windows/desktop/ms683182%28v=vs.85%29.aspx
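
A sketch of the thread<->buffer map using plain pthreads, where pthread_self() stands in for apr_os_thread_current() (the same value on Unix, per the above). The ids are compared with pthread_equal() rather than cast to integers, which keeps it within what the standard guarantees. `my_buffer`, `MAX_SLOTS` and the 256-byte buffer size are all assumptions made for this example.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_SLOTS 64

struct slot {
    int       used;
    pthread_t owner;      /* what apr_os_thread_current() yields on Unix */
    char      buf[256];   /* the per-thread buffer */
};

static struct slot slots[MAX_SLOTS];
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find (or claim) the calling thread's buffer; NULL when the table is full. */
static char *my_buffer(void)
{
    pthread_t self = pthread_self();
    char *result = NULL;
    int i;

    pthread_mutex_lock(&slots_lock);
    /* First pass: does this thread already own a slot? */
    for (i = 0; i < MAX_SLOTS && result == NULL; i++) {
        if (slots[i].used && pthread_equal(slots[i].owner, self))
            result = slots[i].buf;
    }
    /* Second pass: claim the first free slot. */
    for (i = 0; i < MAX_SLOTS && result == NULL; i++) {
        if (!slots[i].used) {
            slots[i].used = 1;
            slots[i].owner = self;
            result = slots[i].buf;
        }
    }
    pthread_mutex_unlock(&slots_lock);
    return result;
}
```

Thread ids can be reused once a thread exits, so a long-running server would also need to release slots when worker threads go away.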


Re: Performance Evaluation of a Module

2012-02-08 Thread Ben Noordhuis
On Wed, Feb 8, 2012 at 14:01, Oğuzhan TOPGÜL  wrote:
> Hi all,
> I have developed an apache module and i want to evaluate the performance of
> my module.
> I want to see how my module increases the load. I want to measure the
> effect of my module on processor and memory.
> I decided to set an evaluation environment using snmp and cacti.
> I'm sending thousands of request from a laptop to the server which my
> module is installed by using apache benchmark (ab). And i'm measuring load
> average, memory usage data from cacti in two different cases (my module is
> active and passive)
>
> Do you think this is a good evaluation environment or do you have any
> ideas, suggestions???
>
> sincerely
>
> Oguzhan

That's a pretty sensible approach. Measuring CPU usage is usually
better done with a profiler unless a coarse-grained indicator like the
load average is good enough.


Re: Developing Authn/Authz Modules

2011-10-01 Thread Ben Noordhuis
On Sat, Oct 1, 2011 at 23:05, Suneet Shah  wrote:
> Hello,
>
> I am trying to build my apache module which needs to carry out
> authentication and authorization functions based on the value of a cookie.
> To start with, I have just created a shell with the intent that I wanted the
> functions for authentication and authorization being called.
> However, it does not appear that these functions are being called. I have
> pasted my configuration and code below.
>
> When I try to access  http://localhost/test_rpc/ I get the login.html that
> is defined in my ErrorDocument below.
> But when I look in the log file, I see the following.
> Since it's looking for a userId, I am wondering if there is an error in my
> configuration
>
> [Sat Oct 01 16:37:29 2011] [debug] prefork.c(996): AcceptMutex: sysvsem
> (default: sysvsem)
> [Sat Oct 01 16:38:08 2011] [error] [client 127.0.0.1] access to
> /test_rpc/header.jsp failed, reason: verification of user id '' not
> configured
>
> Any guidance on what I am doing wrong would be greatly appreciated.
>
> Regards
> Suneet
>
>
> -- Configuration in Httpd.conf
>
> 
>   IAM_CookieName IAM_PARAM
>   IAM_TokenParam tkn
>   IAM_Service_base_url "http://localhost:8080/"
>   ErrorDocument 401 "/login.html"
>   AuthType IAMToken
>   AuthName "IAM Login"
>   AuthCookie_Authoritative On
>  
>
> 
>    ProxyPass http://localhost:9080/test_rpc
>
>    require tkn
> 
>
> - Module Code
> static int authz_dbd_check(request_rec *r) {
>
>    ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server, "authz_dbd_check
> called");
>    return HTTP_OK;
> }
>
> static int check_token(request_rec *r) {
>
>     ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server, "chedk_token
> called.");
>    return OK;
> }
>
> static void authz_dbd_hooks(apr_pool_t *p)
> {
>    ap_hook_auth_checker(check_token, NULL, NULL, APR_HOOK_MIDDLE);
>    ap_hook_auth_checker(authz_dbd_check, NULL, NULL, APR_HOOK_MIDDLE);
> }
> module AP_MODULE_DECLARE_DATA authz_dbd_module =
> {
>    STANDARD20_MODULE_STUFF,
>    authz_dbd_cr_cfg,
>    NULL,
>    NULL,
>    NULL,
>    authz_dbd_cmds,
>    authz_dbd_hooks
> };

You probably need a `Satisfy all` in your httpd config.


Re: cross-process and cross-thread file locking

2011-09-29 Thread Ben Noordhuis
On Thu, Sep 29, 2011 at 16:54, thomas bonfort  wrote:
> Hi all, sorry in advance if this is a dumb question.
>
> The apr documentation for apr_file_lock states "Locks are established
> on a per-thread/process basis; a second lock by the same thread will
> not block." but this is not the behavior I am seeing. As apr_file_lock
> on unix uses fcntl by default, a second lock by another thread of the
> same process will not lock either.

You're probably running into (POSIX mandated!) behaviour that requires
that when a process closes a file descriptor for file X, *all* locks
for X held by that process are released.

Absolutely brain dead. I can't begin to fathom the mind that thought it up.
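
For anyone who wants to see the close() behaviour for themselves, here is a minimal sketch in plain POSIX C (no APR). It takes an fcntl() write lock through one descriptor, opens and closes a second descriptor for the same file, and asks a forked child before and after whether the file still looks locked. The function names and the path argument are made up for the demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask from a separate process whether `path` is write-locked.
 * Returns 1 if locked, 0 if not, 2 on error. */
static int locked_for_others(const char *path)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return 2;
    if (pid == 0) {
        struct flock fl = {0};
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;     /* l_start = 0, l_len = 0: whole file */
        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return 2;
    return WEXITSTATUS(status);
}

/* Returns 1 if closing an unrelated descriptor dropped our lock,
 * 0 if it did not, -1 if the lock could not be taken at all. */
static int lock_lost_on_close(const char *path)
{
    struct flock fl = {0};
    int before, after, fd1, fd2;

    assert(path != NULL);
    fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd1, F_SETLK, &fl) < 0) {
        close(fd1);
        return -1;
    }
    before = locked_for_others(path);

    fd2 = open(path, O_RDONLY);     /* second descriptor, same file */
    if (fd2 >= 0)
        close(fd2);                 /* drops the lock taken via fd1 */
    after = locked_for_others(path);

    close(fd1);
    unlink(path);
    return before == 1 && after == 0;
}
```

On a POSIX-conforming system lock_lost_on_close() should return 1. In a threaded server the stray close() can come from any other thread that happens to touch the same file.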

> I was using apr_file_lock as I need all my httpd threads/process to be
> synchronized on a named resource, and chose to create a lockfile
> whose filename matches my named resource. This does not work as with
> a multi-threaded mpm the threads of the same process that created the
> lockfile will not block on the call to apr_file_lock.
>
> From my readings, it seems that file locking is a hazardous task to
> get right, so what are my options to attain my goal:
>
> - use my own implementation mimicking apr_file_lock, but that
> unconditionally uses flock() instead of fcntl() ? I suspect that this
> would not be a safe solution as some platforms fall back to fcntl for
> flock.

flock() is not available on SunOS and it has nasty fork() semantics:
process acquires lock, forks, child releases lock, parent loses lock
(without getting told). Once again, brain dead.

You also cannot rely on it working correctly (or at all) on NFS
mounts. That's not really flock()'s fault, it's a shortcoming of the
NFS protocol. fcntl() and lockf() have the same issue.

In my experience, the most reliable and portable approach is to create
a lock file with open(O_CREAT|O_EXCL) that you unlink() afterwards. On
EEXIST, sleep for a bit and try again.
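
A minimal sketch of that lock-file approach, in plain POSIX C. The names lockfile_acquire() and lockfile_release(), the retry count and the delay are all illustrative choices, not part of APR or httpd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Try to take the lock by creating `path` exclusively.
 * Makes up to `max_tries` attempts, sleeping `delay_ms` between them.
 * Returns 0 on success, -1 on timeout or error (errno is set). */
static int lockfile_acquire(const char *path, int max_tries, long delay_ms)
{
    struct timespec delay;
    int i, fd;

    assert(path != NULL && max_tries > 0 && delay_ms >= 0);
    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    for (i = 0; i < max_tries; i++) {
        fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
        if (fd >= 0) {
            close(fd);              /* the file's existence is the lock */
            return 0;
        }
        if (errno != EEXIST)
            return -1;              /* real error, e.g. EACCES */
        if (i + 1 < max_tries)
            nanosleep(&delay, NULL);    /* held elsewhere: wait, retry */
    }
    errno = EEXIST;
    return -1;
}

/* Release the lock by removing the file. */
static int lockfile_release(const char *path)
{
    return unlink(path);
}
```

The same caveat as for the semaphore applies: if the holder is killed before it calls lockfile_release(), the file stays behind and somebody has to clean it up.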

> - I tried using a posix semaphore which worked quite well, except in
> the cases where either the process crashed or was terminated by httpd
> because of a Timeout, and in that case the semaphore is never released
> until a server reboot or manually messing in /dev/shm. If I attach a
> cleanup call to the request pool, will it be called in the case where
> the process is terminated after the Timeout delay ?

I don't think you can guarantee that your cleanup action always runs.
If a worker process hangs, the master will eventually send it a
SIGKILL.


Re: Infinite data stream from a non-HTTPD external process via HTTPD

2011-09-20 Thread Ben Noordhuis
On Tue, Sep 20, 2011 at 11:13, Henrik Strand  wrote:
> I would like to send an infinite data stream from a non-HTTPD external
> process via HTTPD to the client connection. Both HTTP and HTTPS must be
> supported.

What kind of external process are we talking here? Something that
prints to stdout, listens on a UNIX/TCP socket, something else?


Re: Question about malloc / realloc in module

2011-09-14 Thread Ben Noordhuis
On Wed, Sep 14, 2011 at 14:01, Christoph Gröver  wrote:
> In a module I have to allocate and reallocate a chunk of memory.
> Because (AFAIK) Apache or its libraries do not support reallocation
> I use the standard functions malloc/realloc (and free), of course.

Don't do that. Use apr_palloc() and copy over the data.

If you're worried about memory wastage, create a child pool and
apr_palloc() from there. apr_pool_destroy() the pool to reclaim the
memory.


Re: po

2011-09-01 Thread Ben Noordhuis
On Thu, Sep 1, 2011 at 13:52, Joshua Marantz  wrote:
> Hi,
>
> I've been load-testing our module (mod_pagespeed) with httpd 2.2.16 built
> with these options:
>     --enable-pool-debug --with-mpm=worker
> I've been getting periodic aborts from apr_table_addn that don't look like
> they are from my module.  These do not occur when using 'prefork'.
>
> Here's a stack-trace recovered from a core file:
>
> Program terminated with signal 6, Aborted.
> #0  0x7fdd3bbd9a75 in raise (sig=) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:64
> 64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>  in ../nptl/sysdeps/unix/sysv/linux/raise.c
> (gdb) where
> #0  0x7fdd3bbd9a75 in raise (sig=) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:64
> #1  0x7fdd3bbdd5c0 in abort () at abort.c:92
> #2  0x7fdd3c9a2e57 in apr_table_addn (t=0xe84980, key=0xd51d60
> "Accept-Encoding", val=0xd51d71 "identity") at tables/apr_tables.c:813
> #3  0x00433e36 in ap_get_mime_headers_core (r=0xf27de0, bb=0xdda3a0)
> at protocol.c:799
> #4  0x0043456b in ap_read_request (conn=0xe51620) at protocol.c:918
> #5  0x0047f772 in ap_process_http_connection (c=0xe51620) at
> http_core.c:183
> #6  0x00446e28 in ap_run_process_connection (c=0xe51620) at
> connection.c:43
> #7  0x004b3297 in process_socket (thd=,
> dummy=) at worker.c:544
> #8  worker_thread (thd=, dummy=)
> at worker.c:894
> #9  0x7fdd3c1339ca in start_thread (arg=) at
> pthread_create.c:300
> #10 0x7fdd3bc8c70d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
>
>
> Questions
>
> 1. Is this a bug in httpd?

Probably not.

> 2. Or could I somehow have caused this with a programming error in my
> module?

That seems more likely.

> 3. Or is enable-pool-debug simply incompatible with the Worker MPM?

Not that I know.

That assertion is triggered when you add a string from pool A to a
table in pool B where A is a child of B (adding headers from the
request_rec to a conn_rec table, for example). It's a lifecycle issue.


Re: mutex permission denied

2011-08-17 Thread Ben Noordhuis
On Wed, Aug 17, 2011 at 15:20, Jason Funk  wrote:
> I am trying to implement an apr proc mutex in my module. When I created the
> mutex with APR_LOCK_DEFAULT the mutex is successfully created but I am
> getting "Permission Denied" when I try to acquire the lock. I ran
> "apr_proc_mutex_defname" to get the name of the default mutex type and it
> is APR_LOCK_SYSVSEM. Without changing anything else, I changed
> APR_LOCK_DEFAULT to APR_LOCK_POSIXSEM when creating the mutex, just to see
> what happens and everything works fine. Of course, this isn't portable. Any
> ideas why APR_LOCK_SYSVSEM doesn't work, but APR_LOCK_POSIXSEM does work?

I suspect that you see semop() failing with EACCES when you strace apache?
That's what happens when you create the mutex as root and try to
acquire it after httpd has dropped privileges: apr creates the semaphore
with mode 0600.


Re: mod_proxy and modified headers in filters.

2011-08-08 Thread Ben Noordhuis
On Mon, Aug 8, 2011 at 10:29, Zaid Amireh  wrote:
> I'm writing a module for Apache 2.2 that changes the content and thus needs 
> to set a new C-L header, all is working perfectly for static files and 
> content generated from content handlers (PHP & Ruby Passenger Phusion), an 
> issue arose when testing with mod_proxy, it seems that any changes the module 
> does to the HTTP headers are being ignored by mod_proxy.
>
> mod_proxy keeps serving the headers it first got from the backend source and 
> disregards any changes my module does, is it possible to change the headers 
> in this case?

Yes. Have a look at proxy_hook_fixups() in mod_proxy.h.


Re: Sharing information between threads and processes.

2011-07-21 Thread Ben Noordhuis
On Thu, Jul 21, 2011 at 13:25, Zaid Amireh  wrote:
> On Jul 21, 2011, at 1:53 PM, Nick Kew wrote:
>> How near do the socache modules come to meeting your needs?
>
> mod_disk_cache would unfortunately make my code pretty complex and maybe slow 
> as I'm not caching documents but rather tokens and strings.
> mod_mem_cache is a per-process cache which simply doesn't meet the 
> requirements.

Not the same thing. The socache API is a 2.3.0 addition that allows
you to store blobs in shared memory, memcache, distcache or DBM.


Re: creating reverse proxy workers dynamically

2011-07-07 Thread Ben Noordhuis
On Thu, Jul 7, 2011 at 07:19, Jodi Bosa  wrote:
> It seems I may need to create HTTPS reverse proxy workers DYNAMICALLY - what
> is best way to do this?
>
> In other words, from manual I see config directive:
>
>    ProxyPass /example http://backend.example.com connectiontimeout=5
> timeout=30
>
> However, I will have several origin servers that aren't necessarily known
> during config or startup.  How can I create such workers as needed?

I don't think there is a reliable way to do that right now.


Re: illegal instruction 4

2011-07-07 Thread Ben Noordhuis
On Thu, Jul 7, 2011 at 16:33, MK  wrote:
> I have a mod_perl based module running a service on an openVZ slice.
> It was working fine for a few weeks, but when I went to use it today I
> get delivered an empty page and in the apache error.log:
>
> child exit signal Illegal instruction (4)
>
> Which AFAIK is a very strange thing (SIGILL); actual perl errors are
> usually explicit, and passed on from the interpreter. To make sure the
> problem wasn't in my code, I replaced the module with a one liner:
>
> sub handler {
>       return SERVER_ERROR;
> }
>
> Same thing.  Ie, suddenly no perl modules are working.  I did not compile
> apache or mod_perl myself.
>
> Anyone have any ideas about how I can solve this or debug it further?
> I have been playing around with small max stack sizes (ulimit -s 256),
> but resetting that to 8192 and restarting apache did not alleviate the
> problem.

I'm not sure if this is the right mailing list for you but if you want
to debug Apache, start it in single-process mode (`httpd -X`) and
attach `gdb` to it. That may be only nominally useful if your Apache
is compiled without debug symbols (unless you get a kick out of
stepping through assembly code).


Re: proxying another protocol to http/https

2011-07-07 Thread Ben Noordhuis
On Thu, Jul 7, 2011 at 02:56, Jodi Bosa  wrote:
> I would like to leverage mod_proxy and mod_proxy_http to proxy client
> requests (from another protocol).
> Assuming I have input & output filters that handle the other protocol with
> the client, shouldn't I simply be able to:
>
>
> Handler
> {
>    r->filename = apr_psprintf(r->pool, "proxy:https://%s", hostname);
>    r->proxyreq = PROXYREQ_PROXY;
>    return DECLINED; /* to allow mod_proxy to kick in and do its thing */
> }

That should work if your handler runs before mod_proxy. Hook it at
APR_HOOK_REALLY_FIRST.


Re: Module External Configuration

2011-06-21 Thread Ben Noordhuis
On Tue, Jun 21, 2011 at 23:26, Jason Funk  wrote:
> One last question about shared memory...
>
> I have my configuration now being loaded successfully into a shared memory
> segment.. now my problem is that someone could change the config so that the
> resulting structure wouldn't fit in the shared memory segment. Is it
> possible to in the child replace my current shared memory segment with a
> bigger one? I tried destroy()ing and then create()ing but that resulted in a
> segfault. Should it have worked? Is there a different way?

As I've said before, no, you cannot portably resize a shared memory
segment. APR doesn't even expose that functionality.

If you're targeting Unices only, you can use Sys V or POSIX IPC: you
open() or shm_open() a memory segment, then ftruncate() it to the
desired size. Make sure to wrap the call to ftruncate() in an
exclusive lock or bad things will happen.
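
A rough sketch of the Unix-only route, assuming the segment is backed by a descriptor from open() or shm_open(). grow_segment() and segment_size() are hypothetical names. The fcntl() lock used here only serializes separate processes, not threads inside one process, and every process still has to mmap() again after the size changes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Grow the object behind `fd` to at least `new_size` bytes.
 * A blocking whole-file write lock serializes concurrent resizers.
 * Returns 0 on success, -1 on error. Never shrinks the segment. */
static int grow_segment(int fd, off_t new_size)
{
    struct flock fl = {0};
    struct stat st;
    int rc = -1;

    assert(fd >= 0 && new_size >= 0);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;         /* l_start = 0, l_len = 0: whole file */
    if (fcntl(fd, F_SETLKW, &fl) < 0)
        return -1;

    if (fstat(fd, &st) == 0) {
        if (st.st_size >= new_size)
            rc = 0;                 /* someone else already grew it */
        else
            rc = ftruncate(fd, new_size);
    }

    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);
    return rc;
}

/* Current size of the object behind `fd`, or -1 on error. */
static off_t segment_size(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 ? st.st_size : (off_t)-1;
}
```

Re-checking the size under the lock is what keeps two children from both deciding to resize and one of them truncating the other's work.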


Re: Socket transfer from Apache httpd to a non-httpd process

2011-06-16 Thread Ben Noordhuis
On Thu, Jun 16, 2011 at 10:32, Henrik Strand  wrote:
> I've tried writing data to the socket directly after my non-httpd daemon
> process receives the socket descriptor and this results in that the
> client receives this data. However, very shortly afterwards the
> connection is closed and I'm not able to write to the socket anymore.

You probably want to dup() the socket fd before passing it to the
external process.
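
What dup() buys you can be shown with a few lines of POSIX C: the copy keeps working after the original descriptor is closed. In this sketch a pipe stands in for the client socket, and dup_outlives_original() is a made-up name. It only demonstrates descriptor semantics; if httpd calls shutdown() on the socket rather than just close(), every copy is affected and dup() alone will not help.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Returns 1 if a dup()ed descriptor still works after the original
 * is closed, 0 if not, -1 on setup failure. A pipe stands in for
 * the client socket. */
static int dup_outlives_original(void)
{
    int p[2], copy, ok;
    char c = 0;

    if (pipe(p) < 0)
        return -1;
    copy = dup(p[1]);
    assert(copy != p[1]);           /* dup() never returns the same fd */
    if (copy < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(p[1]);                    /* the original owner lets go */
    ok = write(copy, "x", 1) == 1 && read(p[0], &c, 1) == 1 && c == 'x';
    close(copy);
    close(p[0]);
    return ok;
}
```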


Re: Vary:User-Agent, best practices, and making the web faster.

2011-06-06 Thread Ben Noordhuis
On Sun, Jun 5, 2011 at 21:37, Joshua Marantz  wrote:
> Does Magento actually vary the content of CSS & JS based on user-agent?  Or
> does it only vary the content of HTML?

I don't know. I'm by no means a Magento expert, I only run into it
from time to time. That site I broke? That was in the summer of 2009
while beefing up the security and performance of a large retailer's
web shop, mostly by putting stuff behind reverse proxies.


Re: Add a variable on http header

2011-06-06 Thread Ben Noordhuis
On Mon, Jun 6, 2011 at 14:15, Maurizio Totti  wrote:
> I would try to set an http header variable "on the fly" but without success 
> :-(
> I write this simple module to test the functionality
>
> http://pastebin.com/b8hcZTtb
>
> I don't understand my error, someone can help me?

- Is your module supposed to be a filter or a handler? Your example is
a bit of both.
- What error (if any) do you get and when?
- Have you looked at the mod_headers source?


Re: Vary:User-Agent, best practices, and making the web faster.

2011-06-05 Thread Ben Noordhuis
On Sun, Jun 5, 2011 at 13:42, Joshua Marantz  wrote:
> This is a case where the content varies based on user-agent.  The
> recommendation on the mod_deflate doc page is add vary:user-agent for any
> non-image.  Can you think of a case where the absence of a vary:user-agent
> header causes broken behavior when the content doesn't vary?
>
> I'm not objecting to setting vary:user-agent when content varies: that's
> what it's for.  I'm objecting to setting vary:user-agent when content does
> *not* vary.  The mod_deflate documentation unambiguously recommends setting
> vary:user-agent, and my feeling is that this is to work around a bug that
> exists only in IE5 or pre-2007 patch of IE6.

Sorry, Joshua, we're conflating things. You raised two issues in your
original post:

1. Updating the mod_deflate documentation. Seems reasonable. The Vary:
UA recommendation was added in 2002 during a general clean-up of the
mod_deflate documentation and the commit log doesn't tell why. You
could open a bugzilla issue or raise it on the httpd-dev mailing list
(the former is the proper channel but the bugzilla is something of a
graveyard).

2. mod_pagespeed second-guessing the user's intent. That still seems
like an unambiguously bad idea. To touch on Magento again, its
documentation links (or linked) directly to that section of the
mod_deflate docs and people are using that. If your module scans for
and neutralizes that Header directive, you will break someone's site.


Re: Vary:User-Agent, best practices, and making the web faster.

2011-06-05 Thread Ben Noordhuis
On Sun, Jun 5, 2011 at 02:15, Joshua Marantz  wrote:
> On Sat, Jun 4, 2011 at 7:58 PM, Ben Noordhuis  wrote:
>> Some popular OSS packages depend on Vary: User-Agent to make
>> downstream proxies (reverse or forward) do the right thing.
>
> I'm pretty interested in deconstructing this further.  Can you be more
> specific?   Which OSS packages?  Under what scenario would a proxy do the
> wrong thing in the absence of Vary:User-Agent (other than, obviously, when
> the content actually varies based on user-agent)?

From first-hand experience (because I broke it): Magento, a popular
PHP e-commerce framework. Magento (or one of its plug-ins) generates
browser-tailored HTML and sets the Vary header to ensure that
downstream proxies send the right HTML to the right client. If you
remove or ignore the header, the layout of your site breaks.

There are CPAN modules and Rack middleware that do similar things and
no doubt other software too.


Re: Vary:User-Agent, best practices, and making the web faster.

2011-06-04 Thread Ben Noordhuis
On Sun, Jun 5, 2011 at 00:34, Joshua Marantz  wrote:
> It was with some reluctance that I brought this up.  It occurs to me that
> this idea propagates the sort of spec violations that led to this issue
> (inappropriate use of Vary:User-Agent) in the first place.   However, I'm
> trying to figure out how to improve compliance to support legitimate uses of
> Vary:User-Agent without causing mod_pagespeed to become significantly less
> effective across a broad range of sites.
>
> We have found that putting complaints in Apache logs mostly causes disks to
> fill and servers to crash -- although that does get it noticed :).  The
> problem, put another way, is that mod_pagespeed cannot distinguish
> legitimate uses of Vary:User-Agent, so it really has no business complaining
> in logs.  Complaining in docs is fine; but some existing mod_pagespeed users
> that simply type "sudo yum update" will later notice a performance-drop and
> may not consult the docs to figure out why.
>
> I'm also trying to grok the first response from Eric:
>
> It's because of the other (dated) canned exceptions that set/unset
> no-gzip/gzip-only-text/html based on the User-Agent, to second-guess
> browsers that send AE:gzip but can't properly deal with it.
>
>
> Going backwards:  which browsers send AE:gzip but can't properly deal with
> it?   Does IE6 have that issue or is it only true of IE5?   I know that IE6
> has had issues with compression in the past but they appear to be addressed
> by patches issued by Microsoft four and a half years ago:
> http://support.microsoft.com/default.aspx?scid=kb;en-us;Q312496.  Moreover
> IE6 is shrinking in market share (~10%) and IE5 does not appear in the
> pie-chart at all.

This was indeed a (since fixed) problem with IE6. I haven't seen the
gzip issue crop up since but that is purely anecdotal.

> And I still don't understand how that relates to Vary:User-Agent.  What's
> really at issue here seems more related to proxies; is that right?  That
> proxies were not respecting Accept-Encoding, but sending gzipped content to
> browsers that did not want it?  Is that still a problem?  Which proxies were
> broken?  Are they still broken?

Some popular OSS packages depend on Vary: User-Agent to make
downstream proxies (reverse or forward) do the right thing.

> And, while I understand the reluctance to help me figure out from our module
> what values were passed to SetEnvIfNoCase and Header, I would like to see
> whether there's agreement that the Apache 2.2 docs for mod_deflate are no
> longer appropriate -- and in fact harmful.

I've been mulling it over for 10 minutes and I can't decide. It's
harmful because it leads to a proliferation of cached objects (bad)
but removing it from the documentation will break things for someone
somewhere (also bad).


Re: Vary:User-Agent, best practices, and making the web faster.

2011-06-04 Thread Ben Noordhuis
On Sat, Jun 4, 2011 at 21:26, Joshua Marantz  wrote:
> I think what we'd do is basically let mod_pagespeed ignore "Vary:User-Agent"
> if we saw that it was inserted per this exact pattern.  This would, to be

This seems like a stupendously bad idea. Warn about it in your docs,
complain about it in the logs but don't willy-nilly override people's
settings.


Re: mod_gnutls and mod_proxy (TLS termination)

2011-05-04 Thread Ben Noordhuis
On Wed, May 4, 2011 at 17:50, Hardy Griech  wrote:
> Sorry, my fault.  I focused on ssl_proxy_enable() which is not called in my
> case.  ssl_engine_disable() does the job.
>
> So my problem is hopefully solved.
>
> Disadvantage of this solution is, that mod_ssl and mod_gnutls cannot be
> loaded simultaneously.

I think you can work around this by chaining the optional functions.

In your pre_config hook, look up and store the mod_ssl functions, then
register your own. Your functions do their thing when it's mod_gnutls
handling the connection and delegate to their mod_ssl counterparts
otherwise.
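
The chaining idea, reduced to plain C so it can be compiled on its own. In a real module the slot would be filled and read with APR_REGISTER_OPTIONAL_FN and APR_RETRIEVE_OPTIONAL_FN from apr_optional.h; here a single function pointer plays that role. struct conn, its fields and every function name below are stand-ins invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for conn_rec; carries only what the sketch needs. */
struct conn {
    int owned_by_gnutls;    /* hypothetical "this is ours" flag */
    int disabled_by;        /* 0 = nobody, 1 = ssl, 2 = gnutls */
};

typedef int (*engine_disable_fn)(struct conn *c);

/* Plays the role of the optional-function registry entry. */
static engine_disable_fn registered_engine_disable = NULL;

/* Pretend mod_ssl implementation. */
static int ssl_impl_engine_disable(struct conn *c)
{
    c->disabled_by = 1;
    return 1;
}

/* Whatever was registered before we took over the slot. */
static engine_disable_fn prev_engine_disable = NULL;

/* Our replacement: handle our own connections, delegate the rest. */
static int gnutls_engine_disable(struct conn *c)
{
    assert(c != NULL);
    if (c->owned_by_gnutls) {
        c->disabled_by = 2;
        return 1;
    }
    return prev_engine_disable ? prev_engine_disable(c) : 0;
}

/* Call once, e.g. from pre_config: remember the old function,
 * then register ours in its place. */
static void gnutls_chain_install(void)
{
    if (registered_engine_disable == gnutls_engine_disable)
        return;                     /* already installed, avoid a loop */
    prev_engine_disable = registered_engine_disable;
    registered_engine_disable = gnutls_engine_disable;
}
```

Callers keep going through the registered slot and never need to know that two TLS modules are loaded.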


Re: mod_gnutls and mod_proxy (TLS termination)

2011-05-03 Thread Ben Noordhuis
On Tue, May 3, 2011 at 21:10, Hardy Griech  wrote:
> On 03.05.2011 00:13, Ben Noordhuis wrote:
>>
>> On Mon, May 2, 2011 at 20:51, Hardy Griech  wrote:
>>>
>>> Now my concern is, how can I reliably catch the condition that the
>>> connection has been initiated by mod_proxy.  Any ideas?
>>
>> r->proxyreq != PROXYREQ_NONE? Does 'initiated' mean 'request from an
>> external reverse proxy' or 'request handled by mod_proxy'?
>
> Sorry, I forgot to mention that the code is in the pre-connection hook.  So
> no proxyreq available :-(
>
> Also my previous patch does not work, if the destination server is on
> another machine.
>
> Currently I'm checking (c->sbh == NULL) to detect the mod_proxy request
> (yes, I meant a mod_proxy request).
>
> In mod_ssl they seem to have a similar problem with mod_proxy: mod_proxy
> calls some mod_ssl functions (ssl_proxy_enable() and ssl_engine_disable())
> to signal a request handled by mod_proxy.
>
> I've also tried to implement these two functions, without success: they are
> never called, although I've tried to register them just like mod_ssl does
> (mod_ssl is not loaded BTW).

Hardy, when and where are you registering your optional functions?
mod_proxy looks them up in the post_config phase so they must have
been registered by then. register_hooks is a good place for it.

Can you perhaps post or link to your code?


Re: mod_gnutls and mod_proxy (TLS termination)

2011-05-02 Thread Ben Noordhuis
On Mon, May 2, 2011 at 20:51, Hardy Griech  wrote:
> Now my concern is, how can I reliably catch the condition that the
> connection has been initiated by mod_proxy.  Any ideas?

r->proxyreq != PROXYREQ_NONE? Does 'initiated' mean 'request from an
external reverse proxy' or 'request handled by mod_proxy'?


Re: mod_gnutls and mod_proxy (TLS termination)

2011-04-29 Thread Ben Noordhuis
> - mod_ssl (openssl?) does not obey the maximum fragmentation
>  length requested by the clients

I think that this has been fixed in openssl 1.0.0a.

Monkey curiosity: why do you need it?

> - install 'apache2-dbg'
> - enter gdb with the above command line
> - run (in gdb)
> - break gdb when the modules have been loaded
> - set the breakpoint and continue with debugging
>
> PS: anyway to automate the above procedure, esp. a breakpoint
>    when all modules have been loaded?

$ cat > gdb.conf
b ap_run_post_config
r
b your_function
c
$ gdb -x gdb.conf --args /path/to/httpd -X

But mostly you place the breakpoint once manually, then restart the
server (`r` or `run`) when you recompile your module.


Re: mod_gnutls and mod_proxy (TLS termination)

2011-04-29 Thread Ben Noordhuis
On Fri, Apr 29, 2011 at 10:27, Hardy Griech  wrote:
> I'm trying to use mod_gnutls for TLS termination without success.

My first suggestion would be to use mod_ssl.

Alternatively, compile Apache and mod_gnutls with -g -O0 and run it
with `gdb --args httpd -X -e debug`. Put a breakpoint on the
pre_connection hook and take it from there.


Re: KeepAlive -- why is it off by default?

2011-04-11 Thread Ben Noordhuis
On Tue, Apr 12, 2011 at 02:18, Joshua Marantz  wrote:
> What are the reasons someone might wish to turn KeepAlive off?  The only one
> I can think of is in single-process mode (httpd -X) it can be a drag to
> refresh a page with lots of resources; but this seems like a secondary issue
> that could be worked around if needed.

Lower throughput under heavy load.

With KeepAlive on, each client gets allocated a process (or thread) to
serve the requests. That process isn't doing anything useful between
the client's requests so you end up with lots of idle processes, even
when it's peak traffic.

The event MPM is purposely written to deal with this issue: it puts
the idle connection in a pool so the worker can serve other requests.


Re: Changing request method

2011-04-04 Thread Ben Noordhuis
On Tue, Apr 5, 2011 at 00:10, Jason Cwik  wrote:
> I'm trying to write a module that will improve REST compatibility with Flex
> & JS by implementing support for X-Http-Method-Override to change the
> request method when only GET/POST are the only methods supported from the
> client.
>
> I've read the modules book and looked through lots of other modules, but I
> can't figure out the right way to do this.  I've implemented a input filter,
> but it always seems like the request has been read by the time my filter
> runs and I only get body content (I've tried registering it at
> AP_FTYPE_CONNECTION and even AP_FTYPE_CONNECTION+1, but never see the raw
> request)
>
> Looking at hooks, it seems like the earliest I can hook in is
> post_read_request; also too late.

Well... you can add your input filter at the end of the chain, buffer
everything until you see the X-Http-Method-Override header and then
pass on the buffered input with a modified request line. But that's a
hack and probably a can of worms, I really wouldn't recommend it.

> Another direction I went was to create a subrequest and execute that instead
> of the current request.  That worked fine, but ended up executing my current
> request twice.  Is this the right way to go?  If so,
> 1) how do I stream data to the subrequest (e.g. POST and PUT)
> 2) how do I terminate the current processing chain so it stops after my
> subrequest (and doesn't execute again)?  I tried ap_internal_fast_redirect,
> but that ended up crashing mod_php in the subrequest.

Depends on what you want to do. To trick mod_php into thinking that a
POST is really a PUT, add a fixup hook where you update r->method and
r->method_number. The drawback is that you break <Limit> directives
this way.

Out of curiosity, why don't you handle this from within PHP itself?

  if (isset($_SERVER['HTTP_X_HTTP_METHOD_OVERRIDE'])) {
    $_SERVER['REQUEST_METHOD'] = $_SERVER['HTTP_X_HTTP_METHOD_OVERRIDE'];
  }

That should be all it takes, right?


Re: module interaction

2011-03-26 Thread Ben Noordhuis
On Sun, Mar 27, 2011 at 00:23, Simone Caruso  wrote:
> I think provider or apr_optional are what i'm looking for,
> i read mod_authnz_ldap for a provider sample and mod_ldap for apr_optional,
> maybe i have understood the logic, but... i don't understand the difference
> between these two approaches..
> there is a doc or someone can explain it ?

Providers are for when you have some common functionality that has
more than one implementation. Example: a session cache mechanism with
an in-memory and an on-disk implementation.

Optional functions let you expose your functions to other modules.
Example: mod_foo exports a function get_num_processed_requests() that
mod_bar calls when it generates a status page.

I wager that optional functions are what you're looking for.
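
To make the difference concrete, here is a toy model of both ideas in plain C. The real interfaces are ap_register_provider()/ap_lookup_provider() in ap_provider.h and the APR_REGISTER_OPTIONAL_FN/APR_RETRIEVE_OPTIONAL_FN macros in apr_optional.h; everything named below is invented for the illustration and is not the httpd API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Provider: one interface, several implementations chosen by name. */
struct cache_provider {
    const char *name;
    const char *(*describe)(void);
};

static const char *mem_describe(void)  { return "in-memory cache"; }
static const char *disk_describe(void) { return "on-disk cache"; }

static const struct cache_provider providers[] = {
    { "memory", mem_describe },
    { "disk",   disk_describe },
};

static const struct cache_provider *lookup_provider(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof providers / sizeof providers[0]; i++)
        if (strcmp(providers[i].name, name) == 0)
            return &providers[i];
    return NULL;                    /* no such implementation */
}

/* Optional function: one module exports one function; callers must
 * cope with it being absent when that module is not loaded. */
typedef int (*num_requests_fn)(void);

static num_requests_fn optional_num_requests = NULL;

/* What "mod_foo" would export. */
static int foo_num_processed_requests(void) { return 42; }

/* What "mod_bar" would do when building its status page. */
static int status_page_count(void)
{
    return optional_num_requests ? optional_num_requests() : -1;
}
```

With a provider the caller picks among implementations; with an optional function the caller only asks whether the one implementation is there.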


Re: module interaction

2011-03-26 Thread Ben Noordhuis
On Sat, Mar 26, 2011 at 16:09, Brian McQueen  wrote:
> If you want to share functions then put them into a library and they
> will be shared in the usual way like normal c functions. If you want
> to pass data between modules there are notes and environment.

This. And there is also the provider API and the optional function
stuff in ap_provider.h and apr_optional.h.


Re: how to parse html content in handler

2011-03-24 Thread Ben Noordhuis
On Thu, Mar 24, 2011 at 13:10, Whut  Jia  wrote:
> Hi,all
> I want to parse HTML content and extract some elements in my own Apache
> handler. How can I do that?
> Thanks,
> Jia

Hey, have a look at how mod_proxy_html[1] does it.

[1] http://apache.webthing.com/mod_proxy_html/


Re: ordering output filters

2011-03-14 Thread Ben Noordhuis
On Mon, Mar 14, 2011 at 16:54, Joshua Marantz  wrote:
> Even in the absence of 'remove_comments', it would be preferable to have
> mod_pagespeed run after mod_includes so that it has an opportunity to
> optimize the included text.  The user can achieve this by putting this line
> into his config file:
>
>    AddOutputFilter INCLUDES;MOD_PAGESPEED_OUTPUT_FILTER html
>
> But this is not desirable for a couple of reasons.  We'd like to force the
> correct order automatically if possible.

> We also have a constraint that mod_pagespeed must run before mod_deflate.
>  Actually mod_pagespeed already inserts mod_deflate in the filter-chain to
> run downstream of it:
>
>  ap_add_output_filter("DEFLATE", NULL, request, request->connection);

mod_include runs at AP_FTYPE_RESOURCE, mod_deflate at AP_FTYPE_CONTENT_SET.

If you register your filter at AP_FTYPE_RESOURCE + 1 or
AP_FTYPE_CONTENT_SET - 1, it will run after mod_include but before
mod_deflate.
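
A toy model of why the numeric offset works: output filters run in ascending order of their type value. The constants below mirror AP_FTYPE_RESOURCE (10) and AP_FTYPE_CONTENT_SET (20) from util_filter.h, but struct filter and order_chain() are invented for the sketch. In a real module the type is simply the last argument to ap_register_output_filter().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the values of AP_FTYPE_RESOURCE and AP_FTYPE_CONTENT_SET. */
enum { FT_RESOURCE = 10, FT_CONTENT_SET = 20 };

/* Hypothetical stand-in for a registered output filter. */
struct filter {
    const char *name;
    int ftype;
};

static int by_ftype(const void *a, const void *b)
{
    const struct filter *fa = a, *fb = b;
    return (fa->ftype > fb->ftype) - (fa->ftype < fb->ftype);
}

/* Put the chain into the order in which the filters would run. */
static void order_chain(struct filter *chain, size_t n)
{
    assert(chain != NULL);
    qsort(chain, n, sizeof chain[0], by_ftype);
}
```

Registered at FT_RESOURCE + 1, a filter sorts after INCLUDES (10) and before DEFLATE (20) no matter in which order the modules were loaded.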


Re: Saving the original request URI ahead of a mod_rewrite

2011-03-13 Thread Ben Noordhuis
On Sun, Mar 13, 2011 at 13:15, Eric Covener  wrote:
> r->main doesn't change on an internal redirect AFAICT.

You're right. And there is ap_internal_fast_redirect() that works
differently still. The only thing I can think of that should work for
all three is to follow r->main until it's NULL, then follow r->prev
until you're at the root request.
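
The walk described above, sketched against a cut-down stand-in for request_rec so it compiles without the httpd headers. Only the main and prev members correspond to real request_rec fields; struct req, its uri member and root_request() are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for request_rec with just the links we follow. */
struct req {
    struct req *main;       /* parent request if this is a sub-request */
    struct req *prev;       /* previous request if internally redirected */
    const char *uri;
};

/* Follow main until it is NULL, then prev until it is NULL. */
static const struct req *root_request(const struct req *r)
{
    assert(r != NULL);
    while (r->main != NULL)
        r = r->main;
    while (r->prev != NULL)
        r = r->prev;
    return r;
}
```

Calling root_request(r)->uri from a handler-like context would then give the URI the client originally asked for, under the assumption that those two links cover every case.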


Re: Saving the original request URI ahead of a mod_rewrite

2011-03-12 Thread Ben Noordhuis
On Sun, Mar 13, 2011 at 04:02, Eric Covener  wrote:
> OP specifically mentions "internal redirect" and rewrite-in-htaccess.

Hah, the moment I fired off that email I thought "oh wait, mod_rewrite
*does* do an internal redirect somewhere".

Internal redirects share a pool so your suggestion would work but
following r->main is still the better solution, I think, if only
because it works for both redirects and sub-requests.


Re: POST body in request_rec

2011-03-12 Thread Ben Noordhuis
Hi Jodi,

On Sun, Mar 13, 2011 at 01:29, Jodi Bosa  wrote:
> how do I access the BODY contents for a POST request?

You need to register an input filter for that.

> Also, r->method shows "POST" and apr_table_get(r->headers_in,
> "Content-Length") returns 300 bytes, but yet r->clength says 0.

r->clength is the size of the response body.


Re: Saving the original request URI ahead of a mod_rewrite

2011-03-12 Thread Ben Noordhuis
On Sun, Mar 13, 2011 at 02:48, Eric Covener  wrote:
>> So I like #1 best.  Any other opinions or ideas?
>
> I solved a similar problem recently by using apr_pool_userdata_set on
> r->pool which you can still find after the internal redirects of
> rewrite in htaccess / with PT flag.

What Joshua is describing sounds like a sub-request and those get
their own pool, without the user data of the parent pool. I would
probably follow r->main.


Re: APR global mutexes

2011-03-08 Thread Ben Noordhuis
2011/3/9 Massimo Manghi :
> the subject might suggest the message is an off-topic
> for the list.

Technically it is, d...@apr.apache.org would have been a better place for it.

> To put it simple my question is: can every module in APR be used
> also to build standalone applications? More specifically: can
> apr_global_mutex_* calls be used this way?

Yes to both (first-hand experience talking here).

> The very first version of this program creates a global mutex
> and wants to use it to synchronize access to a resource shared
> between parent and child processes (a shared memory area)
>
> I've found out that, when used this way,
> apr_global_mutex_[lock|unlock|trylock] block forever. Why?
> I also tried to unlock the mutex after it was created,
> surmising the mutex could be created by _mutex_create
> as locked, but also in this case
> apr_global_mutex_unlock blocks and never returns.

Your code snippet looks sound, I can't find fault with it. You should
post it to apr-dev (preferably as a complete, compilable test case),
you'll get more and better feedback there. Mention on what platform
and with what flags you compiled APR.


Re: Converting a 16-bit string to 8-bit?

2011-03-05 Thread Ben Noordhuis
On Sat, Mar 5, 2011 at 02:11, Adelle Hartley  wrote:
> Thanks for the feedback.  Is there any documentation for the apr_xlate
> functions?

My pleasure, Adelle. As for documentation: I don't think there is any,
save for the source itself[1].

[1] https://github.com/apache/apr/blob/trunk/xlate/xlate.c


Re: Converting a 16-bit string to 8-bit?

2011-03-04 Thread Ben Noordhuis
On Fri, Mar 4, 2011 at 02:24, Adelle Hartley  wrote:
> This is a helper class I wrote for the module I'm working on.  It assumes
> the native wide encoding is UTF-32.  To make it cross platform, you'd have
> to check what the correct wide encoding is.
>
> This is my first apache module, so any corrections welcome.
>
> class our_response_t
> {
> protected:
>        request_rec* m;
>        apr_xlate_t* m_convset;
>        char m_bufferBytes[CHARSET_CONVERSION_BUFFER_SIZE];
> public:
>        our_response_t(request_rec* request)
>                : m(request), m_convset(NULL), m_html(NULL), m_json(NULL)
>        {
>                apr_pool_t* pool = m->pool;
>                apr_status_t status = apr_xlate_open(&m_convset, "UTF-8",
>                                "UTF-32", pool);
>
>                if (m_convset)
>                {
>                        ap_set_content_type(m, "text/html;charset=UTF-8");
>                }
>        }
>
>        ~our_response_t()
>        {
>                if (m_convset)
>                {
>                        apr_xlate_close(m_convset);
>                }
>        }
>
>        void append_chars(const wchar_t* str, size_t num_chars)
>        {
>                apr_size_t inbytes_left = num_chars*sizeof(wchar_t);
>                apr_size_t outbytes_left = CHARSET_CONVERSION_BUFFER_SIZE-1;
>                apr_status_t status = apr_xlate_conv_buffer(m_convset,
>                                (const char*)str, &inbytes_left,
>                                m_bufferBytes, &outbytes_left);
>                m_bufferBytes[CHARSET_CONVERSION_BUFFER_SIZE-outbytes_left-1] = 0;
>                ap_rputs(m_bufferBytes, m);
>        }
>
> };

Adelle, your code doesn't appear to be handling errors. A number of
things can go wrong here:

1. The conversion may not be supported.

2. Partial character sequences (not an issue here since the input is
UTF-32 but I mention it for posterity's sake), reported as
APR_EINCOMPLETE.

3. Illegal character sequences, reported as APR_EINVAL.

4. Output buffer too short. Reported as APR_SUCCESS but with inbytes_left > 0.


Re: Converting a 16-bit string to 8-bit?

2011-03-03 Thread Ben Noordhuis
On Thu, Mar 3, 2011 at 17:12, Sam Carleton  wrote:
> I am looking at using a 3rd party library that only operates on 16-bit
> strings.  Are there built-in functions to convert the strings back to
> 8-bit?  I am currently on Windows, which has built-in functions I could
> use, but I would prefer to use Apache functions if they exist.

Sam, you want either apr_xlate.h or apr-iconv / native iconv.
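
For the native iconv route, a minimal sketch might look like the one below.
It assumes the 16-bit strings are UTF-16LE (typical on Windows, but that is
an assumption about the library) and that the target is UTF-8.
utf16le_to_utf8() is a hypothetical helper name. The apr_xlate functions
follow a similar open / convert / close pattern.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Convert UTF-16LE bytes to a NUL-terminated UTF-8 string.
 * Returns 0 on success, otherwise an errno value:
 *   E2BIG  - output buffer too small
 *   EILSEQ - invalid input sequence
 *   EINVAL - input ends in the middle of a character
 * or whatever iconv_open() reported if the conversion is unsupported. */
int utf16le_to_utf8(const char *in, size_t in_len, char *out, size_t out_len)
{
    iconv_t cd;
    char *src = (char *)in;  /* iconv() wants char **, the input is not modified */
    char *dst = out;
    size_t out_left;
    int err = 0;

    if (out_len == 0)
        return E2BIG;
    out_left = out_len - 1;  /* reserve one byte for the terminator */

    cd = iconv_open("UTF-8", "UTF-16LE");  /* (to, from) */
    if (cd == (iconv_t)-1)
        return errno;

    if (iconv(cd, &src, &in_len, &dst, &out_left) == (size_t)-1)
        err = errno;

    iconv_close(cd);
    *dst = '\0';  /* terminate whatever was converted so far */
    return err;
}
```

On error the output still holds the part that converted cleanly, so the
caller can decide whether to grow the buffer and retry or give up.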


Re: Help with apr mutex

2011-03-01 Thread Ben Noordhuis
On Tue, Mar 1, 2011 at 17:32, Joshua Marantz  wrote:
> Is there any design doc or examples on how these can be used to, say,
> construct a server-wide cache that can be accessed from any of the child
> processes in (say) prefork mode?

I don't know of any design documents or tutorials but as to examples:

* slotmem is for fixed size caches, it's used by mod_proxy_balancer.c
to store state about its backends.

* socache is for dynamic caches, mod_ssl and mod_authn_socache.c
respectively cache SSL sessions and user credentials with it.

mod_authn_socache.c is a good starting point, it's less than 500 lines of code.


Re: Help with apr mutex

2011-02-28 Thread Ben Noordhuis
On Mon, Feb 28, 2011 at 18:26, Simone Caruso  wrote:
> I wrote a simple cache inside my module with apr_shm and apr_rmm

Simone, have a look at ap_socache.h and ap_slotmem.h, they're two
simple cache facilities that were added in 2.3.0. Might save you some
work. :)


Re: mod_ruby on Apache for Windows 2.2.17

2011-02-11 Thread Ben Noordhuis
On Fri, Feb 11, 2011 at 14:11, Zeno Davatz  wrote:
>> Apache on Windows serves all requests from a single process.
>
> Apache on Linux does not do that?

Nope. The worker and event MPMs are hybrids: serving requests from
many processes, where each process has many threads. And if all
processes are busy, Apache will simply spin up more.

>> The Ruby interpreter is not thread-safe so mod_ruby creates a Big Mutex
>> whenever it needs to run. Thus on Windows, with its single-process
>> model, mod_ruby can only serve one request at a time.
>
> So you say, that mod_ruby on Windows can _not_ leverage its power
> because Apache on Windows is a single thread process?

Effectively single-threaded when mod_ruby is serving a request, yes.


Re: mod_ruby on Apache for Windows 2.2.17

2011-02-11 Thread Ben Noordhuis
On Fri, Feb 11, 2011 at 08:25, Zeno Davatz  wrote:
> I am trying to debug mod_ruby to load in Apache for Windows. So far
> Apache for Windows does start with mod_ruby.so but it seems that httpd
> does not start correctly with mod_ruby enabled in Apache for Windows.

I don't have a solution for you but I would suggest not doing this
(running mod_ruby on Windows, that is).

Apache on Windows serves all requests from a single process. The Ruby
interpreter is not thread-safe so mod_ruby creates a Big Mutex
whenever it needs to run. Thus on Windows, with its single-process
model, mod_ruby can only serve one request at a time.


Re: Filter to modify request headers

2011-01-25 Thread Ben Noordhuis
On Tue, Jan 25, 2011 at 23:22, Jodi Bosa  wrote:
> What would be a good early hook to modify request headers that is _AFTER_
> mod_ssl is finished decrypting request?
>
> When I do a ap_add_input_filter() from a ap_hook_insert_filter() seems to
> trigger really late (e.g. after quick_handler, post_read, etc...).

post_read_request is a good place to tamper with headers, it's what
mod_headers uses.

Unconditional filters should be registered with
ap_register_input_filter(), conditional filters with
ap_hook_insert_filter() and ap_hook_insert_error_filter()* +
ap_add_input_filter().

mod_ssl's input filter runs at AP_FTYPE_CONNECTION + 5 by the way, so
you're good to go if you hook at AP_FTYPE_CONNECTION + 4. But you
probably don't need to do that.

* Something of a misnomer - error_filter also runs for 204 and 3xx responses.


Re: Re: How to send a jpeg-file in Handler

2011-01-19 Thread Ben Noordhuis
2011/1/19 Whut  Jia :
> Can I avoid using a sub-request?
> I only want to send a single picture to the client, the same way as sending text:
> r->content_type="text/html";
> ap_rputs("helloworld",r);
> return OK;

Not sure what you mean. If it's a single static image, convert it to a
C byte array and send it with ap_rwrite().


Re: How to send a jpeg-file in Handler

2011-01-19 Thread Ben Noordhuis
2011/1/19 Whut  Jia :
> I want to return a local jpeg-file to the client when the requested url is
> /image/metto. How should I write this in my handler module?

ap_sub_req_lookup_uri() or ap_sub_req_lookup_file()?


Re: module configuration kill

2011-01-12 Thread Ben Noordhuis
On Wed, Jan 12, 2011 at 22:56, Peter Janovsky  wrote:
> that is definitely of use.  thank you.  where would i call
> apr_pool_register_cleanup?  originally i thought it would be in register_hooks

From your child_init hook. Register it with ap_hook_child_init() from
your register_hooks function.


Re: mod_sflow

2011-01-09 Thread Ben Noordhuis
On Sat, Jan 8, 2011 at 00:06, Neil McKee  wrote:
> This module is designed to work in both "prefork" and "worker" models.  I
> would really appreciate it if someone could review the design to make sure
> I made appropriate choices about where to use pipes, shared-memory, mutex
> locking, and so on(!)  These choices are documented in the comment at the
> top of the mod_sflow.c file, here:
> http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=14

Neil, two points of critique:

1. You are doing way too much in your critical section, including
potentially blocking actions like logging.

2. Assuming it's safe to write up to 4K to the pipe is dangerous for
several reasons: PIPE_BUF may be < 4096, the pipe may not be empty,
etc. This ties in with #1 since you are doing it from within the
critical section.
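
On point 2, a small sketch of asking the system for the limit instead of
assuming 4K. It uses plain POSIX calls. pipe_atomic_limit() and
write_record() are hypothetical helper names, and the fallback to
_POSIX_PIPE_BUF (512) is my own choice for the case where no limit is
reported.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <sys/types.h>
#include <unistd.h>

/* Largest write that is guaranteed to be atomic on this pipe, or -1 on error. */
long pipe_atomic_limit(int fd)
{
    errno = 0;
    long limit = fpathconf(fd, _PC_PIPE_BUF);
    if (limit == -1 && errno == 0)
        limit = _POSIX_PIPE_BUF;  /* nothing reported: use the POSIX minimum */
    return limit;
}

/* Write one record only if it fits in a single atomic write.
 * Returns the bytes written, or -1 with errno set (EMSGSIZE if too large). */
ssize_t write_record(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    long limit = pipe_atomic_limit(fd);
    if (limit < 0)
        return -1;
    if (len > (size_t)limit) {
        errno = EMSGSIZE;
        return -1;
    }
    return write(fd, buf, len);
}
```

Note that staying under the limit only keeps records from interleaving. The
write can still block when the pipe is full, which is why it should not
happen while holding the mutex.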


Re: Help trying to figure out why an output_filter is not called.

2011-01-05 Thread Ben Noordhuis
On Wed, Jan 5, 2011 at 22:03, Joshua Marantz  wrote:
> Right you are.  That's much simpler then.  Thanks!

My pleasure, Joshua.

Two quick questions, hope you don't mind: Is mod_pagespeed an official
Google project? Or is it something you guys do on your day off? And
are there plans for an nginx port?


Re: Help trying to figure out why an output_filter is not called.

2011-01-05 Thread Ben Noordhuis
On Wed, Jan 5, 2011 at 20:40, Joshua Marantz  wrote:
> So if I try to remove the 'expires' filter from my handler (which runs
> early) then mod_expires will have a handler that runs later that inserts it
> after my module has completed.

No, it's the other way around. mod_expires uses the insert_filter hook
to insert its filter before your handler is run (and how could it be
otherwise? Output filters are there to post-process the content your
handler generates).

Have a look at ap_invoke_handler() in config.c, that should give you a
handle on how the filter chain works. But don't hesitate to post your
questions if you have them, of course. :)


Re: Help trying to figure out why an output_filter is not called.

2011-01-05 Thread Ben Noordhuis
On Wed, Jan 5, 2011 at 15:54, Joshua Marantz  wrote:
> Can you elaborate?   Is this a common practice, to write bytes directly to
> the network from an output filter?  What should I look for?  The owner of

Not common but sometimes you can't avoid it (search the mailing list,
there are a few examples).

> LoadModule perl_module modules/mod_perl.so

mod_perl allows scripts to write directly to the socket.

> I guess we should eliminate FIXUP_HEADERS_OUT, FIXUP_HEADERS_ERR, and
> MOD_EXPIRES.  Are there any other similar header-mucking-filters I need to
> kill?  I don't mind squirreling through the source code to find these names
> (all are string literals in .c files) but I'm nervous they could change
> without warning in a future version.

That's very unlikely to happen with Apache core modules.

> Moreover, expires_insert_filter runs as APR_HOOK_MIDDLE which means it runs
> after my content-generator, which means that it won't have been inserted by
> the time when I want to set my caching headers.

You can remove it from your handler, scan
r->output_filters->frec->name to find the filter.
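
A sketch of that scan, written against stand-in structs so it compiles
without the httpd headers. struct filter, struct filter_rec and
find_filter() are hypothetical. Only the frec->name path and the next link
mirror ap_filter_t. The case-insensitive compare is a precaution, since I
believe httpd normalizes registered filter names to lower case.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* Stand-ins for ap_filter_rec_t and ap_filter_t, reduced to what the scan needs. */
struct filter_rec {
    const char *name;
};

struct filter {
    struct filter_rec *frec;
    struct filter *next;
};

/* Return the first filter in the chain whose registered name matches, or NULL. */
struct filter *find_filter(struct filter *chain, const char *name)
{
    assert(name != NULL);
    for (struct filter *f = chain; f != NULL; f = f->next) {
        if (f->frec != NULL && f->frec->name != NULL
            && strcasecmp(f->frec->name, name) == 0)
            return f;
    }
    return NULL;
}
```

In a real handler the chain starts at r->output_filters and the filter that
is found would be handed to ap_remove_output_filter().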

> I guess that means I have to insert a new late-running hook that kills
> undesirable output filters.  Does that wind up being simpler?

The above is probably easier but whatever ends up being the most
readable / maintainable, right?


Re: Help trying to figure out why an output_filter is not called.

2011-01-05 Thread Ben Noordhuis
On Wed, Jan 5, 2011 at 14:45, Joshua Marantz  wrote:
> other filter be somehow finding our filter and killing it?  Or sending the
> bytes directly to the network before our filter has a chance to run?

Possibly, yes.

By the way, why the complex setup? If you don't want the mod_headers
filter to run, insert your filter before it, then remove it for each
request that you handle.


Re: Overriding mod_rewrite from another module

2011-01-03 Thread Ben Noordhuis
On Mon, Jan 3, 2011 at 23:19, Joshua Marantz  wrote:
> My goal is not to remove authentication from the server; only from messing
> with my module's rewritten resource.  The above statement is just observing
> that, while it's possible to shunt off mod_rewrite by returning OK from an
> upstream handler, the same is not true of mod_authz_host because it's
> invoked with a different magic macro.

My bad, I parsed your post as 'mod_authz_host is a core module and
cannot be removed' which is obviously false but not what you meant.

Yes, all auth_checker hooks are run. You can't prevent it but you can
catch the 403 on the rebound and complain loudly in the logs.
Actually, that's a lie. You can prevent it and that might also answer
this next bit...

> There may exist some buffer in Apache that's 8k.  But I have traced through
> failing requests earlier that were more like 256 bytes.  This was reported
> as mod_pagespeed Issue 9 and resolved by limiting the number of css files
> that could be combined together
> so that we did not exceed the pathname limitations.  I'm pretty sure it was
> due to some built-in filter or core element in httpd trying to map the URL
> to a filename (which is not necessary as far as mod_pagespeed is concerned)
> and bumping into an OS path limitation (showing up as 403 Forbidden).

This might be the doing of core_map_to_storage(). Never run into it
myself (with URLs up to 4K, anyway) but there you go.

Okay, here is a dirty secret: if you hook map_to_storage and return
DONE, you bypass Apache's authentication stack - and nearly all other
hooks too. Probably an exceedingly bad idea.

You can however use it to prevent core_map_to_storage() from running.
Just return OK and you're set.

> I'm still interested in your opinion on my solution where I (inspired by
> your hack) save the original URL in request->notes and then use *that* in my
> resource handler in lieu of request->unparsed_uri.  This change is now
> committed to svn trunk (but not released in a formal patch) as
> http://code.google.com/p/modpagespeed/source/detail?r=348 .

Sounds fine, that's the kind of stuff request notes are for.


Re: Overriding mod_rewrite from another module

2011-01-01 Thread Ben Noordhuis
On Sat, Jan 1, 2011 at 00:16, Joshua Marantz  wrote:
> Thanks for the quick response and the promising idea for a hack.  Looking at
> mod_rewrite.c this does indeed look a lot more surgical, if, perhaps,
> fragile, as mod_rewrite.c doesn't expose that string-constant in any formal
> interface (even as a #define in a .h).  Nevertheless the solution is
> easy-to-implement and easy-to-test, so...thanks!

You're welcome, Joshua. :)

You could try persuading a core committer to add this as a
(semi-)official extension. Nick Kew reads this list, Paul Querna often
idles in #node.js at freenode.net.

> I'm also still wondering if there's a good source of official documentation
> for the detailed semantics of interfaces like ap_hook_translate_name.
> Neither a Google Search, a stackoverflow.com search, nor the Apache
> Modules book offer much detail.
> code.google.com fares a little better but just points to 4 existing usages.

This question comes up often. In my experience the online
documentation is almost always outdated, incomplete or outright wrong.
I don't bother looking things up, I go straight to the source.

It's a kind of job security, I suppose. There are only a handful of
people that truly and deeply understand Apache. We can ask any hourly
rate we want!


Re: Overriding mod_rewrite from another module

2010-12-31 Thread Ben Noordhuis
On Fri, Dec 31, 2010 at 18:17, Joshua Marantz  wrote:
> Is there a better way to solve the original problem: preventing mod_rewrite
> from corrupting mod_pagespeed's resources?

From memory and from a quick peek at mod_rewrite.c: in your
translate_name hook, set a "mod_rewrite_rewritten" note in r->notes
with value "0" and return DECLINED. That'll trick mod_rewrite into
thinking that it has already processed the request.


Re: Re: compile a file written by C++ into apache

2010-11-30 Thread Ben Noordhuis
2010/11/30 whut_jia :
> In Apache 2.2, I compile a C++ source file with g++ as below:
> g++ -fPIC -shared -o mod_validate.so mod_validate.cpp -I/usr/include/httpd
> -I/usr/include/apr-1 -I/opt/opensaml/include
> After that, I copy mod_validate.so into the Apache module location, and
> the module works well.
> But now, in Apache 2.3, compiling this file the same way produces the
> following error:
>     /apache2.3/include/http_config.h:989: error: expected "," or "..." before
> ‘new’
> (In the header file, line 989 is:
> AP_DECLARE(void) ap_merge_log_config(const struct ap_logconf *old,
>                                     struct ap_logconf *new);
> )
> I don't see an error in this line, so why can't I compile it
> successfully?

new is a C++ keyword. Three solutions.

1. Rename the parameter in http_config.h to new_conf. Bad.
2. At the top of your source file add "#define new new_". Bad.
3. Make your module C only. Split off the C++ code into a separate file. Good.


Re: compile a file written by C++ into apache

2010-11-30 Thread Ben Noordhuis
2010/11/30 whut_jia :
> I wrote a module in C++ that generates SAML assertions, so my module calls
> the OpenSAML library. The question is how to compile this C++ source file
> into my Apache server.

Like you compile any mixed project: define a function in your C++ code
with C linkage[1] and call it from your C code. Compile .c and .cc
files to object files, then link them into a shared object.

apxs2 probably won't be much help here, it doesn't seem to handle C++
source files (at least, it never does in my projects).

[1] extern "C" void entrypoint(void *data) { /* ... */ }


Re: Can i send multi-request(with Range:bytes=start-lenth) on a single connection?

2010-11-23 Thread Ben Noordhuis
2010/11/24 zhoubug :
> Can I send multiple requests (with Range: bytes=start-length) on a single
> connection?
> I want to reuse a connection with keep-alive and send a second request
> after receiving the first response, but Apache responds with a 501 error.
> What should I do if I want to reuse the connection?

Are you sure you are on the right mailing list? This is modules-dev,
maybe you're looking for users[1]?

[1] http://httpd.apache.org/userslist.html


Re: Shared memory ?

2010-11-15 Thread Ben Noordhuis
On Mon, Nov 15, 2010 at 17:12, Rémy Sanchez  wrote:
> I'm coding a module to somehow replace/complement mod_security (it's more a
> proof of concept than a real project for now). The first thing that I'd like
> to have is a DNSBL, so that detected intruders are instantly banned when
> added to the blacklist. Because doing a DNS query for each HTTP request
> might be a bit heavy, I'd like to keep the results in cache.

The stuff in apr_shm.h is what you want.

> I guess that if I create something from the config pool, it will be
> duplicated between processes. But another security I want is to check URL
> against regexps commonly used by botnets/script kiddies. Then, if an IP is
> blacklisted, I want its state to be changed instantly in all caches. Which,
> if data is duplicated, is not possible. Would there be a simple way to
> achieve this ? Or would it be more clever to move to another solution, like
> using a common redis datastore for blacklist/whitelist/rules lookup ?

I would probably take this direction (store it in a database,
relational or otherwise). Works across multiple nodes and is
scriptable from outside.


Re: ownership & mmaped files - I have to be missing something...

2010-11-11 Thread Ben Noordhuis
On Thu, Nov 11, 2010 at 08:28, Mike Meyer
 wrote:
> Is there a hook that runs after config in the parent, but as the
> unprivileged id that I should be using? I couldn't find one (either in

There isn't one, setuid() is called right before the child_init hook.

Having said that, can't you open() the file in the parent and have the
children mmap() the fd into memory? Just make sure it isn't marked as
FD_CLOEXEC.
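
A self-contained sketch of that idea outside of Apache: the parent opens the
file, a forked child maps the inherited descriptor. All three function names
are hypothetical, and the scratch file under /tmp merely stands in for
whatever file the module wants to share.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Parent side: create a scratch file holding data and return its descriptor. */
int open_scratch_file(const void *data, size_t len)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);  /* the open descriptor keeps the file alive */
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if FD_CLOEXEC is clear on fd, 0 if it is set, -1 on error. */
int fd_is_inheritable(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 0 : 1;
}

/* Fork a child that maps the inherited fd and reports the first byte of the
 * file through its exit status (255 is reserved for "mmap failed").
 * Returns that byte, or -1 on failure. */
int first_byte_via_child(int fd, size_t len)
{
    assert(len > 0);
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            _exit(255);
        _exit(p[0]);
    }
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 255 ? -1 : WEXITSTATUS(status);
}
```

In a module, the open() would happen in a hook that still runs with the
parent's privileges and the mmap() in child_init, after the switch to the
unprivileged id.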


Re: Determining if a bucket is created by

2010-10-29 Thread Ben Noordhuis
On Sat, Oct 30, 2010 at 00:24, Travis Bassetti
 wrote:
> #include directive via mod_include. Is there a way to tell when a bucket is
> created via #include? I want to exclude processing the bucket if it was
> created by the #echo directive. I can't tell if there is a difference in the
> bucket type when created by #include or #echo. Is there some other flag or
> way to differentiate these buckets?

I don't think so. You could override the #echo directive with
APR_RETRIEVE_OPTIONAL_FN(ap_register_include_handler) and tag the
bucket in some way.


Re: How to init an mmaped file?

2010-10-25 Thread Ben Noordhuis
Mike, is your code available anywhere?


Re: How to access client socket from a protocol handler

2010-10-24 Thread Ben Noordhuis
On Sun, Oct 24, 2010 at 11:09, Alexander Farber
 wrote:
> So for content handlers the convention is
> to use "SetHandler XXX" in httpd.conf and
> then at the runtime they check for that string with
>
> if (!r->handler || (strcmp(r->handler, "XXX") != 0)) {
>    return DECLINED;
> }
>
> But for protocol handlers there is no such convention.
> You have to introduce some keyword for httpd.conf
> and check for it. Or in my case you could just:
>
>        if (conn->base_server->port != 843)
>                return DECLINED;
>
> at the beginning? (seems to work)

Yes, that is correct. A config directive is the cleaner solution, though.


Re: How to access client socket from a protocol handler

2010-10-23 Thread Ben Noordhuis
On Sun, Oct 24, 2010 at 00:00, Alexander Farber
 wrote:
> I've created a module using bb (the source code at the bottom)
> and it suffers from the same problem (hijacks the port 80 too).
> Could it be that "SetHandler" is a wrong directive for protocol handler?

The wrong directive, yes. SetHandler handlers work at the request
level, protocol handlers at the connection level.

> Also, I do not know, how to check that the
> "handler is enabled for the current vhost".

From mod_echo.c:

  static int process_echo_connection(conn_rec *c)
  {
      EchoConfig *pConfig =
          ap_get_module_config(c->base_server->module_config, &echo_module);

      if (!pConfig->bEnabled) {
          return DECLINED;
      }
      /* ... */
  }

Hope that helps.


Re: How to access client socket from a protocol handler

2010-10-23 Thread Ben Noordhuis
Alexander, take a look at mod_echo.c (included in the source tarball).
It's a great example of how a protocol handler should work and it just
might convince you to use bucket brigades after all. :)

You need to check if your handler is enabled for the current vhost. If
it's not, return DECLINED. If it is, look up the client socket and go
from there.


Re: How to access client socket from a protocol handler

2010-10-23 Thread Ben Noordhuis
On Sat, Oct 23, 2010 at 10:13, Alexander Farber
 wrote:
> I wonder why my mod_perl module works and the C one not.

Your connection handler should return DECLINED for vhosts it doesn't
handle (I wager mod_perl did this for you).

You can get the vhost with conn->base_server and your module's
per-server config with
ap_get_module_config(conn->base_server->module_config, &your_module).


Re: How to access client socket from a protocol handler

2010-10-22 Thread Ben Noordhuis
On Sat, Oct 23, 2010 at 00:15, Alexander Farber
 wrote:
> Should I maybe try
> apr_socket_t *socket = conn->cs->desc->s
> or something similar instead?

Probably not, the conn_config solution is the most portable across Apache
versions.

> And what do you mean by &core_module
> in my case (source code below)?

That's the reference to Apache itself, the core is a module too.
Elegant, isn't it?

> And why is direct socket I/O bad idea,
> isn't this how protocol handling modules (like mod_ftp, mod_smtp)
> are supposed to work?

There is no yes or no to this question, mostly it depends.

You should strive to use what is already in place, if only because it
will make your life easier down the road. Upsides to using the bucket
brigade and the filter chain:

* cross-platform
* published and supported APIs (will work with future releases of Apache)
* fairly straight-forward and transparent SSL/TLS integration

Downsides:

* overhead (slower)
* higher learning curve

So consider the pros and cons and pick the best solution. And don't
hesitate to ask questions if you have them. :)


Re: How to access client socket from a protocol handler

2010-10-22 Thread Ben Noordhuis
On Sat, Oct 23, 2010 at 00:01, Mike Meyer
 wrote:
> I use that to get the socket so I can poll for it to have data in it,
> and do other things while I'm waiting. Is there a better alternative
> for that, or is this an exception?

You could do it through apr_bucket_read(APR_NONBLOCK_READ) but polling
on the socket is probably simpler, especially if you are polling on
more than one fd.

Just don't read or write data directly, that would bypass the filter
chain (and break logging, for starters).
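
A minimal sketch of the "poll, but don't read" approach using plain POSIX
poll(). wait_readable() and demo() are hypothetical names, and a pipe stands
in for the client socket. APR also ships its own poll wrappers in
apr_poll.h, which would be the more portable choice inside a module.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error (including EINTR). */
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0);
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}

/* Usage sketch: returns 1 if the pipe reports "nothing yet" before the
 * write and "readable" after it. */
int demo(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    int before = wait_readable(fds[0], 0);   /* nothing written yet */
    ssize_t written = write(fds[1], "x", 1);
    int after = wait_readable(fds[0], 1000);
    close(fds[0]);
    close(fds[1]);
    return before == 0 && written == 1 && after == 1;
}
```

Once the descriptor reports readable, the data itself should still be
fetched through the input filter chain rather than with a direct read().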


Re: How to access client socket from a protocol handler

2010-10-22 Thread Ben Noordhuis
On Fri, Oct 22, 2010 at 23:08, Alexander Farber
 wrote:
> Unfortunately there aren't many example for the protocol handlers
> on the web or in Nick's book. I've come up with the following,
> but don't know how to get the client socket via conn_rec?

apr_socket_t *client  = ap_get_module_config(conn->conn_config, &core_module);

But note that in most cases direct socket I/O is a bad idea.


Re: How do I get hold of session information?

2010-10-20 Thread Ben Noordhuis
Most people hack on Apache in their own time and nobody likes writing
documentation so yes, what documentation there is, is often sparse.

"Use the source, Luke" is the best advice I can give you.


Re: How do I get hold of session information?

2010-10-19 Thread Ben Noordhuis
On Tue, Oct 19, 2010 at 17:30, Paul Donaldson
 wrote:
> Thank you. I will take a look at mod_session. Will my module be able to
> check if mod_session is "enabled" (sorry, I don't know the Apache
> terminology) and, if it is, talk to it and ask it for what it has stored in
> its session?

Yes. It exports an environment variable, see the SessionEnv directive
for details.


Re: How do I get hold of session information?

2010-10-19 Thread Ben Noordhuis
On Tue, Oct 19, 2010 at 17:05, Paul Donaldson
 wrote:
> I assume that if I were to make a request to a web site hosted on Apache then
> the capability exists for one of the server side web pages to create a session
> and store some piece of data in it. What I want to do in my module is get hold
> of that session (if it exists) and read data from it.

Apache core doesn't have a concept of sessions (or state as such) but
take a look at mod_session.


Re: Memory Pool

2010-10-12 Thread Ben Noordhuis
Martin, if you are working in a constrained environment, then you are
probably better off using something like libmicrohttpd[1] or
libevent's evhttp interface[2]. Apache has a rather heavy resource
footprint.

[1] http://www.gnu.org/software/libmicrohttpd/
[2] http://monkey.org/~provos/libevent/doxygen/evhttp_8h.html


Re: Memory Pool

2010-10-11 Thread Ben Noordhuis
On Mon, Oct 11, 2010 at 16:40, Martin Townsend
 wrote:
> use, or should I set a flag and then use a hook like fix-ups that will check
> this flag and then call  apr_pool_clear()?

This. You can use a request note for a flag.


Re: Memory Pool

2010-10-11 Thread Ben Noordhuis
On Mon, Oct 11, 2010 at 16:14, Martin Townsend
 wrote:
> I have created a pool from the child pool for storing warning messages that
> can live across requests, the final request will insert the warnings into
> the response.  How do I ensure that this pool is cleared at the end of the
> final request?

By calling apr_pool_clear() or apr_pool_destroy()?


Re: New module for anonymous ip logging

2010-10-06 Thread Ben Noordhuis
On Wed, Oct 6, 2010 at 09:35, Franz Schwartau  wrote:
> But how exactly can I "abort"? If NULL is returned from log_ip_hash() a
> '-' is printed for the % directive from mod_log_iphash only.

Return HTTP_INTERNAL_SERVER_ERROR from your post_config hook. Or
anything but OK or DECLINED, really.

