I am trying to understand how path stripping works in httpd(8),
particularly how FastCGI's SCRIPT_NAME parameter gets filled.
The rule about whether it has a trailing slash or not seems
inconsistent. I would really appreciate some extra eyes to work through this.
I don't know if httpd is at fault, my app, or my understanding of CGI.

I am giving a webapp[1] a mountpoint on my site, using
`request strip 3` to hide the mountpoint from the app.

```
# /etc/httpd.conf
server "default" {
        listen on localhost port 80
        directory auto index

        location "/path/to/app" {
            request strip 3
            fastcgi socket :5232
        }

        location "/path/to/app/*" {
            request strip 3
            fastcgi socket :5232
        }

        log syslog
}
```

With this, I see:

http://localhost/path/to/app =>
 'DOCUMENT_URI': '/path/to/app',
 'PATH_INFO': '',
 'REQUEST_URI': '/path/to/app',
 'SCRIPT_NAME': '/path/to/app',

http://localhost/path/to/app/ =>
 'DOCUMENT_URI': '/path/to/app/',
 'PATH_INFO': '',
 'REQUEST_URI': '/path/to/app/',
 'SCRIPT_NAME': '/path/to/app/',

http://localhost/path/to/app/login =>
 'DOCUMENT_URI': '/path/to/app/login',
 'PATH_INFO': '/login',
 'REQUEST_URI': '/path/to/app/login',
 'SCRIPT_NAME': '/path/to/app',

http://localhost/path/to/app/posts/1 =>
 'DOCUMENT_URI': '/path/to/app/posts/1',
 'PATH_INFO': '/posts/1',
 'REQUEST_URI': '/path/to/app/posts/1',
 'SCRIPT_NAME': '/path/to/app',

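When the request stops exactly at the stripped prefix (/path/to/app),
SCRIPT_NAME has no trailing slash. When it continues past the prefix
(/path/to/app/login), it has none either. But when it is the prefix
plus a bare "/" (/path/to/app/), SCRIPT_NAME keeps the slash and
PATH_INFO is empty.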
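
To make the pattern easier to see, here is a small Python sketch that tabulates the four results above. `OBSERVED` only copies the values already shown; the helper names are made up for illustration and none of this is httpd's code.

```python
# Values copied from the four requests shown above.
OBSERVED = [
    {"DOCUMENT_URI": "/path/to/app", "PATH_INFO": "",
     "REQUEST_URI": "/path/to/app", "SCRIPT_NAME": "/path/to/app"},
    {"DOCUMENT_URI": "/path/to/app/", "PATH_INFO": "",
     "REQUEST_URI": "/path/to/app/", "SCRIPT_NAME": "/path/to/app/"},
    {"DOCUMENT_URI": "/path/to/app/login", "PATH_INFO": "/login",
     "REQUEST_URI": "/path/to/app/login", "SCRIPT_NAME": "/path/to/app"},
    {"DOCUMENT_URI": "/path/to/app/posts/1", "PATH_INFO": "/posts/1",
     "REQUEST_URI": "/path/to/app/posts/1", "SCRIPT_NAME": "/path/to/app"},
]


def has_trailing_slash(params):
    """True if SCRIPT_NAME ends in '/' (hypothetical helper)."""
    return params["SCRIPT_NAME"].endswith("/")


def reassembles(params):
    """True if SCRIPT_NAME + PATH_INFO gives back DOCUMENT_URI."""
    return params["SCRIPT_NAME"] + params["PATH_INFO"] == params["DOCUMENT_URI"]


if __name__ == "__main__":
    for p in OBSERVED:
        print(p["REQUEST_URI"], has_trailing_slash(p), reassembles(p))
```

Every row reassembles correctly, so the split is self-consistent; the only odd one out is the `/path/to/app/` row, where the slash lands in SCRIPT_NAME instead of PATH_INFO.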

This is causing me angst because I *want* to use the simpler

```
# /etc/httpd.conf
server "default" {
        listen on localhost port 80
        directory auto index

        location "/path/to/app" {
            request rewrite "$DOCUMENT_URI/"
        }

        location "/path/to/app/*" {
            request strip 3
            fastcgi socket :5232
        }

        log syslog
}
```

but with this

http://localhost/path/to/app =>
 'DOCUMENT_URI': '/path/to/app/',
 'PATH_INFO': '',
 'REQUEST_URI': '/path/to/app',
 'SCRIPT_NAME': '/path/to/app/',

**which gives this warning**

"WARNING: SCRIPT_NAME does not match REQUEST_URI"

which is complaining that SCRIPT_NAME is not a prefix of REQUEST_URI.
SCRIPT_NAME shouldn't have been touched, imo; my goal in
`request rewrite "$DOCUMENT_URI/"` was to append to PATH_INFO -- and if
I `request rewrite "$DOCUMENT_URI/login"` instead that's exactly what
happens, PATH_INFO gets "/login" -- it's only when I add a single "/"
that this problem crops up.
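
For reference, a prefix check of roughly this shape is enough to trigger that warning. This is a sketch of the idea only, not a copy of flup's source; `script_name_matches` is a made-up name.

```python
def script_name_matches(environ):
    """Return True if SCRIPT_NAME is a prefix of REQUEST_URI.

    Hypothetical helper illustrating the check behind the warning.
    """
    return environ["REQUEST_URI"].startswith(environ["SCRIPT_NAME"])


# Parameters seen with the rewrite config for GET /path/to/app.
rewritten = {"REQUEST_URI": "/path/to/app", "SCRIPT_NAME": "/path/to/app/"}

# Parameters seen with the first config for GET /path/to/app/.
direct = {"REQUEST_URI": "/path/to/app/", "SCRIPT_NAME": "/path/to/app/"}

if __name__ == "__main__":
    for env in (rewritten, direct):
        if not script_name_matches(env):
            print("WARNING: SCRIPT_NAME does not match REQUEST_URI")
```

Only the rewritten request fails the check, because the rewrite changes DOCUMENT_URI (and with it SCRIPT_NAME) while REQUEST_URI stays as the client sent it.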

Unrelated to the rewrite, the same underlying issue, that /path/to/app/
sets PATH_INFO="", also causes Radicale to mistakenly redirect
/path/to/app/ to /path/to/app/app/.web [2], because it thinks that means
it's being called as /path/to/app/. I don't know if httpd or Radicale is
at fault here.
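
The doubled `app/app` is what ordinary relative-URL resolution produces if the redirect target is a relative reference such as `app/.web`. That the `Location` is relative is an assumption here (the linked Radicale source [2] is the authority); the sketch below only shows how a client would resolve such a reference against the two request URLs.

```python
from urllib.parse import urljoin

# Assumed relative redirect target; see [2] for what Radicale really sends.
LOCATION = "app/.web"


def resolve(request_url, location=LOCATION):
    """Resolve a relative Location header the way a client would."""
    return urljoin(request_url, location)


if __name__ == "__main__":
    # Without a trailing slash, "app" is replaced by the relative target.
    print(resolve("http://localhost/path/to/app"))
    # With a trailing slash, the target is appended below "app/".
    print(resolve("http://localhost/path/to/app/"))
```

So a redirect that is correct for `/path/to/app` becomes wrong for `/path/to/app/` when the app cannot tell the two apart, which is the case when both arrive with PATH_INFO="".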

I suspect this is an off-by-one in httpd [3] but I'd like to know if
there's a better explanation for this behaviour.
I think the better behaviour is

http://localhost/path/to/app/ =>
 'DOCUMENT_URI': '/path/to/app/',
 'PATH_INFO': '/',
 'REQUEST_URI': '/path/to/app/',
 'SCRIPT_NAME': '/path/to/app',

but I am second-guessing myself a lot.
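
Spelled out as code, the split I have in mind looks roughly like this. It is a sketch of the proposal only, not how httpd's server_fcgi.c works; `proposed_split` is a made-up name, and it assumes an absolute path with no repeated slashes.

```python
def proposed_split(document_uri, strip):
    """Split a URI into (SCRIPT_NAME, PATH_INFO) after `strip` components.

    SCRIPT_NAME is the first `strip` path components with no trailing
    slash; everything after that, including a bare "/", is PATH_INFO.
    """
    # The leading "/" yields an empty first element, hence strip + 1.
    parts = document_uri.split("/")
    script_name = "/".join(parts[:strip + 1])
    path_info = document_uri[len(script_name):]
    return script_name, path_info


if __name__ == "__main__":
    for uri in ("/path/to/app", "/path/to/app/",
                "/path/to/app/login", "/path/to/app/posts/1"):
        print(uri, proposed_split(uri, 3))
```

With this rule SCRIPT_NAME is the same for all four requests, and `/path/to/app` and `/path/to/app/` can be told apart by PATH_INFO alone.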

Thank you for your time, and any clues you can toss my way
-Nick

[1] It's Radicale. But see below for my testing webapp that isolated the issue.
[2] https://github.com/Kozea/Radicale/blob/db7587c59335fa00580ce88d583419ce45594143/radicale/app/get.py#L64-L69
[3] https://github.com/openbsd/src/blob/4564063e97c6de536114caf655a9e16da7a4259f/usr.sbin/httpd/server_fcgi.c#L215


# Appendix: Reproduction (OpenBSD 6.6)

```
$ doas pkg_add py3-flup
$ cat app.fcgi
#!/usr/bin/env python3

"""
Python FastCGI example.

Opens a FastCGI socket on localhost:5232 that just returns "Hello, World!"
but while logging the FastCGI parameters.

"""

from flup.server.fcgi import WSGIServer
from pprint import pprint
import sys

def application(environ, start_response):
    pprint(environ, stream=sys.stderr)
    start_response('200 OK', [('Content-Type', 'text/html')])
    yield 'Hello, World!\r\n'

if __name__ == "__main__":
    WSGIServer(application, bindAddress=("localhost", 5232)).run()
$ chmod +x app.fcgi
```

```
$ cat /etc/httpd.conf
server "default" {
        listen on localhost port 80
        directory auto index

        # Add a trailing slash so the app recognizes /base as its own name
        # as in https://wordpress.org/support/article/htaccess/
        #    or https://radicale.org/proxy/
        location "/path/to/app" {
            request rewrite "$DOCUMENT_URI/"
        }

        location "/path/to/app/*" {
            request strip 3
            fastcgi socket :5232
        }

        log syslog
}
```

Client:

```
$ curl -s -v http://localhost/path/to/app 
*   Trying 127.0.0.1:80...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /path/to/app HTTP/1.1
> Host: localhost
> User-Agent: curl/7.66.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Connection: keep-alive
< Content-Type: text/html
< Date: Mon, 17 Feb 2020 10:09:29 GMT
< Server: OpenBSD httpd
< Transfer-Encoding: chunked
< 
Hello, World!
* Connection #0 to host localhost left intact
```

Server:

```
$ doas httpd -d -v
startup
WARNING: SCRIPT_NAME does not match REQUEST_URI
default 127.0.0.1 - - [17/Feb/2020:05:09:29 -0500] "GET /path/to/app HTTP/1.1" 200 0
```

App:

```
$ ./app.fcgi
WARNING: SCRIPT_NAME does not match REQUEST_URI{'DOCUMENT_ROOT': '/htdocs',
 'DOCUMENT_URI': '/path/to/app/',
 'GATEWAY_INTERFACE': 'CGI/1.1',
 'HTTP_ACCEPT': '*/*',
 'HTTP_HOST': 'localhost',
 'HTTP_USER_AGENT': 'curl/7.66.0',
 'PATH_INFO': '',
 'QUERY_STRING': '',
 'REMOTE_ADDR': '127.0.0.1',
 'REMOTE_PORT': '20707',
 'REQUEST_METHOD': 'GET',
 'REQUEST_URI': '/path/to/app',
 'SCRIPT_FILENAME': '/htdocs/',
 'SCRIPT_NAME': '/path/to/app/',
 'SERVER_ADDR': '127.0.0.1',
 'SERVER_NAME': 'default',
 'SERVER_PORT': '80',
 'SERVER_PROTOCOL': 'HTTP/1.1',
 'SERVER_SOFTWARE': 'OpenBSD httpd',
 'wsgi.errors': <flup.server.fcgi_base.TeeOutputStream object at 0xe9a4835d690>,
 'wsgi.input': <flup.server.fcgi_base.InputStream object at 0xe9a48355b50>,
 'wsgi.multiprocess': False,
 'wsgi.multithread': True,
 'wsgi.run_once': False,
 'wsgi.url_scheme': 'http',
 'wsgi.version': (1, 0)}
```
