On Sun, 29 Nov 2020 17:01:42 +0000
Nick Kew <n...@apache.org> wrote:

> > On 29 Nov 2020, at 14:48, Florian Wagner <flor...@wagner-flo.net> wrote:
> > 
> > Hi everyone!
> > 
> > I was wondering if someone with a better understanding of httpd and
> > mod_proxy could review my module idea and prototype implementation and
> > warn me of any unforeseen pitfalls looming ahead before I commit to
> > implementing this fully.  
> The main thing that springs to mind is security when you spawn a process.
> CGI and fastcgi are battle-hardened by decades of operational use.

Hey Nick, thanks for your comments.

I'd never want to build the process management and supervision part
completely from the ground up. Especially since the tried and true
implementation in mod_fcgid seems to be rather easy to lift out.

> [...]
> > While fully understanding that one can be of the opinion that process
> > management is best kept out of httpd, I personally like the convenience
> > and more importantly clarity offered by having the complete command,
> > arguments and environment required to run the backend application in
> > the httpd configuration. Authentication, URL rewriting and whatelse
> > will already be setup there, anyway.  
> If it's in the config then at least it's [probably] coming from a
> trusted source, but then why run per-request?

Let me apologize for the choice of function name (start_process) in my
prototype. That probably made you think I'd want to start the backend
process once for each request (CGI style). Rather I'm aiming at long
running backends. That function should probably be called


instead... :-D

Is that what you were hinting at with your question?

> > So I took a shot at seeing if I could implement a module to do just
> > that. My current idea/prototype:
> > 
> > 1. Register a hook to run before mod_proxy.c:proxy_handler and have a
> >    look at the request filename and handler to see if they start with
> >    "proxy:spawn://".  
> Big red flag for security there.  Hope you're paying very careful attention
> to your input: there's nothing to that effect in what you attached.

My naive understanding of the httpd request pipeline made me try

  curl 'http://.../proxy:spawn://${MALICIOUS}|http://localhost/'

For this, r->filename ended up being "${DocumentRoot}/proxy:spawn:" as
there is no matching ProxyPass/RewriteRule. Both mod_proxy and my
proxy_spawn_handler skip this request as the filename prefix is not
right, and finally the file handler gets invoked, returning a 404. Next:

  curl 'http://.../pass/proxy:spawn://${MALICIOUS}|http://localhost/'

with "ProxyPass /pass spawn://${CMD}|http://localhost/" in httpd.conf.
That got rewritten by proxy_trans to


which also doesn't seem like an issue. That would spawn ${CMD} and
simply pass the "proxy:spawn:${MALICIOUS}" part on to the backend
service as the request URL.
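
To make the prefix handling above a bit more concrete, here is a rough
sketch in C of how a filename of the form "proxy:spawn://CMD|BACKEND"
might be recognised and split. This is not the code from the prototype:
split_spawn_filename and SPAWN_PREFIX are hypothetical names, and
splitting at the first '|' is an assumption about the syntax. It only
uses the C standard library, so it shows the string handling and nothing
about the httpd API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix constant, taken from the "proxy:spawn://" check
 * described in the thread. */
#define SPAWN_PREFIX "proxy:spawn://"

/* Hypothetical helper (not part of httpd or the prototype).
 * Splits "proxy:spawn://CMD|BACKEND" into CMD and BACKEND.
 * Returns 0 on success, -1 if the prefix or separator is missing,
 * if either part is empty, or if an output buffer is too small. */
static int split_spawn_filename(const char *filename,
                                char *cmd, size_t cmd_size,
                                char *backend, size_t backend_size)
{
    size_t prefix_len = strlen(SPAWN_PREFIX);
    const char *rest, *sep;
    size_t cmd_len, backend_len;

    assert(cmd != NULL && backend != NULL);

    /* Anything not starting with the exact prefix is left alone,
     * e.g. "${DocumentRoot}/proxy:spawn:" from the first curl test. */
    if (filename == NULL || strncmp(filename, SPAWN_PREFIX, prefix_len) != 0)
        return -1;

    rest = filename + prefix_len;
    sep = strchr(rest, '|');  /* assumption: the first '|' ends the command */
    if (sep == NULL)
        return -1;

    cmd_len = (size_t)(sep - rest);
    backend_len = strlen(sep + 1);
    if (cmd_len == 0 || backend_len == 0)
        return -1;
    if (cmd_len >= cmd_size || backend_len >= backend_size)
        return -1;

    memcpy(cmd, rest, cmd_len);
    cmd[cmd_len] = '\0';
    memcpy(backend, sep + 1, backend_len + 1);  /* copies the trailing NUL */
    return 0;
}
```

Note that this only decides whether a filename has the expected shape.
It says nothing about whether the command is trustworthy; that still
depends on the filename having been produced by the configuration
rather than by the client.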

For sure there are RewriteRules that could lead to external clients
being able to execute arbitrary commands, like

  RewriteRule ^(.*)$ spawn://$1|http://localhost/

but to me these all seem like they would only be caused by erroneous
httpd configurations.

I probably missed something to consider here. Help and pointers would
be appreciated!

> Also I'd consider hooking it earlier in the request cycle

Yeah, since rewriting the name is what this does, it's probably a
better fit for a translate_name hook, registered with APR_HOOK_FIRST
and aszPred = { "mod_proxy.c", NULL } so that it gets called after
mod_proxy, once proxy_trans has had the chance to match the filename
against the ProxyPass rules.

> or into mod_proxy instead.

That's what I tried first. After reading the code for half a day I gave
up on finding a place to hook this in. Explaining the various ways I
tried and failed would be half a novel, so I'll spare you that.

But if anyone has a hint on how I could accomplish it that way, I'll
happily have a look and try again.

> How does mod_proxy_fcgi fit your vision?

Like any of the protocol implementations of mod_proxy that support
AF_UNIX sockets it should work nicely:

  ProxyPass / spawn://...|fcgi://localhost/

The process management/supervision being separate from the actual
protocol proxying is rather important to me. That way I can continue to
deploy legacy FastCGI services through mod_proxy_fcgi while newer apps
that use WebSockets use mod_proxy_http and mod_proxy_wstunnel.

And all the while the backend processes are started and supervised by
my module.


