Ian Bicking wrote:

This is something messed up with CGI on NT, and whatever server you are using, and perhaps the CGI adapter (maybe there's a way to get the raw environment without any encoding, for example?)

Python decodes the environ to its own copy (wrapped in os.environ) at interpreter startup time; there's no way to query the real ‘live’ environment that I know of. It'd require a C extension.

Honestly I don't know if anyone is doing anything with WSGI and Python 3.

I know Graham has done some work on mod_wsgi for 3.0, but no, I don't know anyone using it in anger.

Is it worth submitting patches to simple_server to make it run on 3.0? Is it too late to include at this stage anyway? Shipping 3.0 with a non-functional wsgiref is a bit embarrassing.

I assume there is some way to get at the bytes in the environment, if not then that is a Python 3 bug.

There is not, and this appears to be deliberate.

I think it might be feasible to support an encoded version of SCRIPT_NAME and PATH_INFO for WSGI 2.0 (creating entirely new key names, and I don't know of any particular standard to base those names on), but moving from the two keys to a single REQUEST_URI is not feasible.

That's certainly a possibility, but I feel it's easier to hitch a ride on the existing header, which despite being non-standard is still quite widely used.

I guess you'd probably count segments, try to catch %2f (where the
segments won't match up), and then double check that the decoded
REQUEST_URI matches SCRIPT_NAME+PATH_INFO.

I'm currently testing with just the segment counting. It's only necessary that the segments from SCRIPT_NAME are matched and stripped, and those are extremely unlikely to contain ‘%2F’ because:

  - there aren't many filesystems that can accept ‘/’ as a filename
    character. RISC OS is the only one I can think of, and it by
    convention swaps ‘/’ and ‘.’ to compensate as it is, so even
    there you couldn't use ‘%2F’;
  - there aren't many webservers that can map a file or alias to a
    path containing ‘%2F’;
  - no-one wants to mount a webapp alias at such a weird name — it's
    only in the section corresponding to PATH_INFO that ‘%2F’ might
    ever be of use in practice.

In the worst case, many applications already know and can strip the URL at which they're mounted, but unless there's a legitimate ‘%2F’ in their SCRIPT_NAME it doesn't actually matter.
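
For illustration, the segment-counting idea could look roughly like the sketch below. This is not the code under test, only a minimal guess at the approach. The function name raw_path_info is hypothetical, and it assumes REQUEST_URI holds a path (plus optional query string) rather than an absolute URI, and that SCRIPT_NAME itself contains no '%2F'.

```python
def raw_path_info(request_uri, script_name):
    """Return the still-encoded PATH_INFO part of REQUEST_URI.

    Hypothetical helper: strips as many leading path segments from
    REQUEST_URI as SCRIPT_NAME contains, without decoding anything,
    so a '%2F' in the remainder survives intact.
    """
    # Only the path matters; drop any query string.
    path = request_uri.split('?', 1)[0]

    # '/app/run.py' has two segments, '' (mounted at root) has none.
    n_script_segments = script_name.count('/')

    # The first element is the empty string before the leading '/'.
    segments = path.split('/')
    rest = segments[1 + n_script_segments:]

    # Nothing left means the request hit the script itself.
    if not rest:
        return ''
    return '/' + '/'.join(rest)
```

With SCRIPT_NAME '/app/run.py', a REQUEST_URI of '/app/run.py/a%2Fb/c?x=1' yields '/a%2Fb/c', whereas the decoded PATH_INFO would have been '/a/b/c' with the distinction lost.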

frankly IIS is probably less relevant to most developers than CGI.

Er... really?

You and I may not favour it, but it's ≈35% of the world out there, not something we can afford to ignore IMO.

So if IIS has problems with PATH_INFO, the WSGI adapter (be it CGI or otherwise) should be configured to fix those problems up front.

What I'm saying is that neither Apache's nor IIS's behaviour can be considered clearly correct or wrong at this point, and there is no way a WSGI adapter living underneath them *can* fix up the differences.

(There is a problem with PATH_INFO that a WSGI adapter *could* clear up, which is that IIS makes PATH_INFO the entire path including SCRIPT_NAME. I'm not sure whether it's worth fixing that up in the adapter layer though... it's possible some frameworks are already dealing with it, and might even be relying on it!)
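
If an adapter did want to fix that up, a rough sketch might be as follows. The function name fix_iis_path_info is made up for illustration. The check on SERVER_SOFTWARE is an assumption about how to detect IIS, and the prefix test would misfire for an application whose genuine PATH_INFO happens to begin with its own SCRIPT_NAME, so treat it as a starting point only.

```python
def fix_iis_path_info(environ):
    """Strip a duplicated SCRIPT_NAME prefix from PATH_INFO under IIS.

    Hypothetical adapter-level fix-up; modifies and returns environ.
    """
    # Assumption: IIS identifies itself as 'Microsoft-IIS/<version>'.
    if not environ.get('SERVER_SOFTWARE', '').startswith('Microsoft-IIS'):
        return environ

    script_name = environ.get('SCRIPT_NAME', '')
    path_info = environ.get('PATH_INFO', '')

    # Only strip on a whole-segment match, so '/app' does not
    # accidentally match a PATH_INFO of '/application/x'.
    if script_name and (path_info == script_name
                        or path_info.startswith(script_name + '/')):
        environ['PATH_INFO'] = path_info[len(script_name):]
    return environ
```

A framework that already compensates for the IIS behaviour would then be stripping the prefix twice, which is exactly the risk mentioned above.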

--
And Clover
http://www.doxdesk.com/