On 5/9/07, Shannon -jj Behrens <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I could use a bit of help.
>
> I'm looking at the code for PrefixMiddleware:
>
>     http://pythonpaste.org/deploy/paste/deploy/config.py.html?f=192&l=277#192
>
> If X_FORWARDED_SERVER is set and self.force_port is None,
> environ['SERVER_PORT'] should be updated intelligently.  It should
> contain either the port specified in X_FORWARDED_SERVER, 80, or 443.
>
> Back in my Aquarium days, this worked.  The code looked like:
>
>         defaultReturnValue = (cgiEnv["SERVER_NAME"], cgiEnv["SERVER_PORT"])
>         (cgiEnv["SERVER_NAME"], cgiEnv["SERVER_PORT"]) = parseHost(
>             header=cgiEnv.get("HTTP_X_FORWARDED_HOST"),
>             defaultReturnValue=defaultReturnValue, isSecure=isSecure)
>
> where parseHost was defined in:
>
> http://aquarium.cvs.sourceforge.net/aquarium/aquarium/aquarium/parse/Host.py?view=markup
>
> Since environ['SERVER_PORT'] isn't being updated properly, I'm forced
> to specify force_port.  That's a pain because the value differs
> between production and development.  Ideally, this should just come
> from X_FORWARDED_SERVER.
>
> What's worse is that I can't currently specify force_port because
> specifying the following in the .ini file breaks "paster setup-app":
>
> filter-with = proxy-prefix
>
> [filter:proxy-prefix]
> use = egg:PasteDeploy#prefix
> force_port =
> prefix =
>
> I brought up this issue here:
> http://groups.google.com/group/pylons-discuss/browse_frm/thread/1da07543512792
>
> Thanks for your help!
> -jj

I broke down and wrote my own proxy middleware, which I'm happy to
contribute back to Paste.

In my app, I don't need to change the URL prefix since I'm proxying
the entire subdomain.  Nor do I need to force a port; I'd rather it
be pulled from the X_FORWARDED_HOST header.  Finally, I wanted the
same middleware to work in all cases, with no configuration, whether
or not a proxy is actually in use.

Here's the middleware:

================================================================================
import re

__docformat__ = "restructuredtext"

HTTP_PORT = 80
HTTPS_PORT = 443
MIN_PORT = 1
MAX_PORT = 2 ** 16 - 1  # 65535 is the highest valid TCP port.

MODIFIED_HEADERS = ('SERVER_NAME', 'SERVER_PORT', 'HTTP_HOST',
                    'HTTP_X_FORWARDED_HOST')

# This is a regex for a valid hostname.
#
# There's not much point in being overly strict about what you'll accept in
# the HOST header, so I'll accept anything that's even vaguely similar to a
# hostname or IP.

hostname_regex = re.compile(r"^[a-zA-Z0-9_\-\.]+$")


class ProxyHandler:

    """Automatically do the RIGHT THING if a proxy is used.

    HACK: Unfortunately, PasteDeploy's PrefixMiddleware is failing in
    these ways:

     * If you put the following in your .ini file, "paster setup-app" will
       fail::

           filter-with = proxy-prefix

           [filter:proxy-prefix]
           use = egg:PasteDeploy#prefix
           force_port =
           prefix =

     * If you don't specify force_port at all, the SERVER_PORT doesn't get
       set correctly.

     * If you try to shove PrefixMiddleware directly into middleware.py
       in Pylons, you must pick a port, but that port differs between
       dev and production.

    Ideally, using a proxy should JUST WORK without needing to configure
    anything.  Hopefully, this middleware will make that possible.

    """

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):

        """Temporarily modify the HTTP headers if a proxy is being used.

        If present, use the X_FORWARDED_HOST header in order to override
        SERVER_NAME and SERVER_PORT.

        """

        is_proxied = 'HTTP_X_FORWARDED_HOST' in environ
        saved = {}
        try:
            if is_proxied:

                # Save the original headers.  HTTP_HOST is optional in
                # WSGI, so only save the headers that are present.

                environ['proxy_handler_saved_headers'] = saved
                for header in MODIFIED_HEADERS:
                    if header in environ:
                        saved[header] = environ[header]

                # Modify the headers, but only if everything is well
                # formed.

                default_return_value = (environ['SERVER_NAME'],
                                        environ['SERVER_PORT'])
                is_secure = environ['wsgi.url_scheme'] == 'https'
                returned = (environ['SERVER_NAME'], environ['SERVER_PORT']) = \
                    parse_host_header(
                        header=environ['HTTP_X_FORWARDED_HOST'],
                        default_return_value=default_return_value,
                        is_secure=is_secure)
                if returned != default_return_value:
                    environ['HTTP_HOST'] = environ.pop('HTTP_X_FORWARDED_HOST')

            return self.app(environ, start_response)

        finally:
            if is_proxied:

                # Restore the original headers, removing any that were
                # not there to begin with.

                for header in MODIFIED_HEADERS:
                    if header in saved:
                        environ[header] = saved[header]
                    else:
                        environ.pop(header, None)


def parse_host_header(header=None, default_return_value=(None, None),
                      is_secure=False):
    """Parse the ``Host`` header.

    header
      This is the value of the ``Host`` header or None.

    default_return_value
      Return this if the header is absent or malformed.

    is_secure
      Is the current connection over SSL?  This will tell me the default port.

    Return a tuple ``(server_name, server_port)``.

    We're not very strict about requiring the ``Host`` header.

    See also: Host_

    .. _Host:
       http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.23

    """
    if header is None:
        return default_return_value
    if is_secure:
        default_port = HTTPS_PORT
    else:
        default_port = HTTP_PORT
    # Appending the default port means a header without ":port" still
    # yields two items.
    (server_name, server_port) = \
        (header.split(":") + [str(default_port)])[:2]
    if not hostname_regex.match(server_name):
        return default_return_value
    try:
        port = int(server_port)
    except ValueError:
        return default_return_value
    if port < MIN_PORT or MAX_PORT < port:
        return default_return_value
    return (server_name, server_port)
================================================================================
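
To make the default-port trick in parse_host_header easier to see in
isolation, here is a minimal sketch.  The helper name split_host_port
is made up for this illustration and is not part of the middleware; it
only shows the split, with none of the validation.

```python
def split_host_port(header, default_port):
    """Hypothetical helper: split "host[:port]" into (host, port) strings."""
    # Appending the default guarantees at least two items, so a header
    # without ":port" falls back to the default port.
    parts = header.split(":") + [str(default_port)]
    return tuple(parts[:2])


with_port = split_host_port("foo:800", 80)
without_port = split_host_port("foo", 443)
# Anything after a second colon is silently dropped by the slice.
extra_colon = split_host_port("a:b:c", 80)
```

Note that the slice quietly discards anything after a second colon, so
a bracketed IPv6 literal such as [::1]:8080 does not survive the split.
In the middleware that case fails the hostname check ("[" is not an
allowed character), so the default value is returned.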

Here are the tests:

================================================================================
"""Test ``lookandfeel.lib.proxyhandler``."""

from copy import deepcopy
from pprint import pprint
from unittest import TestCase

from lookandfeel.lib.proxyhandler import ProxyHandler, parse_host_header, \
    HTTP_PORT, HTTPS_PORT, MAX_PORT


class Test(TestCase):

    def test_parse_host_header(self):
        DEFAULT = ('default', '7777')
        for (header, is_secure, expected) in (
            (None, False, DEFAULT),
            ('foo:800', False, ('foo', '800')),
            ('foo', False, ('foo', str(HTTP_PORT))),
            ('foo', True, ('foo', str(HTTPS_PORT))),
            ('&garbage', False, DEFAULT),
            ('foo:a', False, DEFAULT),
            ('foo:%d' % (MAX_PORT + 1), False, DEFAULT),
            ('foo:0', False, DEFAULT)
        ):
            value = parse_host_header(header, default_return_value=DEFAULT,
                                      is_secure=is_secure)
            self.assertEqual(value, expected)

    def test_not_proxied(self):
        orig_environ = dict(SERVER_NAME='localhost', SERVER_PORT='8000',
                            HTTP_HOST='localhost:8000')
        self._test_proxy_handler(orig_environ, orig_environ)

    def test_proxied(self):
        orig_environ = dict(SERVER_NAME='localhost', SERVER_PORT='8000',
                            HTTP_HOST='localhost:8000',
                            HTTP_X_FORWARDED_HOST='example.com')
        expected_environ = dict(SERVER_NAME='example.com', SERVER_PORT='80',
                                HTTP_HOST='example.com',
                                proxy_handler_saved_headers=orig_environ)
        self._test_proxy_handler(orig_environ, expected_environ)

    def test_invalid_proxy(self):
        orig_environ = dict(SERVER_NAME='localhost', SERVER_PORT='8000',
                            HTTP_HOST='localhost:8000',
                            HTTP_X_FORWARDED_HOST='&garbage')
        expected_environ = orig_environ.copy()
        expected_environ['proxy_handler_saved_headers'] = orig_environ
        self._test_proxy_handler(orig_environ, expected_environ)

    def _test_proxy_handler(self, orig_environ, expected_environ):

        # The environs should effectively be pass by value, or else
        # things get too confusing.

        orig_environ = deepcopy(orig_environ)
        expected_environ = deepcopy(expected_environ)

        def app(environ, start_response):
            pprint(('environ:', environ))
            pprint(('expected_environ:', expected_environ))
            self.assertEqual(environ, expected_environ)

        for env in (orig_environ, expected_environ):
            env.setdefault('wsgi.url_scheme', 'http')
        ProxyHandler(app)(orig_environ, start_response=lambda *args: None)
================================================================================

To use the middleware, which happens to live in
lookandfeel.lib.proxyhandler in my application, add the following to
``yourpackage/config/middleware.py`` before the RegistryManager
middleware::

        from lookandfeel.lib.proxyhandler import ProxyHandler

        # Proxies should *just work*.
        app = ProxyHandler(app)

I've tested it, and h.url_for(action='whatever', qualified=True) *just
works* whether or not I'm using a proxy.
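
As a rough illustration of why rewriting these keys is enough for
qualified URLs: WSGI URL reconstruction prefers HTTP_HOST and falls
back to SERVER_NAME and SERVER_PORT.  The sketch below uses the
standard library's wsgiref.util.application_uri rather than url_for
(that url_for reads the same environ keys is an assumption here), and
forwarded_host_middleware is a hypothetical, stripped-down stand-in
for ProxyHandler with none of its validation or header restoration.

```python
from wsgiref.util import application_uri


def forwarded_host_middleware(app):
    """Hypothetical stand-in for ProxyHandler (no validation, no restore)."""
    def wrapper(environ, start_response):
        forwarded = environ.get('HTTP_X_FORWARDED_HOST')
        if forwarded:
            environ = dict(environ)  # Work on a copy of the environ.
            name, _, port = forwarded.partition(':')
            is_secure = environ.get('wsgi.url_scheme') == 'https'
            environ['SERVER_NAME'] = name
            environ['SERVER_PORT'] = port or ('443' if is_secure else '80')
            environ['HTTP_HOST'] = forwarded
        return app(environ, start_response)
    return wrapper


def show_url_app(environ, start_response):
    """Respond with the application's base URL as the app sees it."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [application_uri(environ).encode('utf-8')]


def call(app, environ):
    """Invoke a WSGI app with a no-op start_response; return the body."""
    return b''.join(app(environ, lambda status, headers: None)).decode('utf-8')


base = {'wsgi.url_scheme': 'http', 'SERVER_NAME': 'localhost',
        'SERVER_PORT': '8000', 'HTTP_HOST': 'localhost:8000',
        'SCRIPT_NAME': ''}

direct_url = call(show_url_app, dict(base))
proxied_url = call(forwarded_host_middleware(show_url_app),
                   dict(base, HTTP_X_FORWARDED_HOST='example.com'))
```

Without the wrapper the app sees the backend's own address
(localhost:8000); with it, the URL is built from the forwarded host.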

Happy Hacking!
-jj

-- 
http://jjinux.blogspot.com/
