Hello, it's me again,
Phillip J. Eby wrote:
> MoinMoin, for example, has its own encoding scheme for handling
> pseudo-slashes in paths, and IMO it's a better way to handle it than
> trying to rely on finding a server that supports *not* decoding URLs.

I had the abstract knowledge that CGI is still used for deployment, but
growing up with application servers must have spoiled me. Still, I think
nothing but adherence to the CGI spec stops mod_wsgi from passing an
encoded URL down to my apps. I've never checked that, nor the ajp + flup
combination. Something more for the todo pile.

In the short run I'll $2F my slashes. I can't actually use %252F,
because everyone seems to think they'll either get an encoded URL to
unquote() or that unquote(unquote()) is a no-op: Routes was not alone in
this.

Blake Winton wrote:
> I respectfully disagree. I've been using %-escapes in urls for years,
> intending that they get unescaped before being passed to
> applications... %7E instead of ~ mainly.
>
> in XML you can't tell the difference between <![CDATA[<]]> and &lt;
> and &#60;

You've given an example of separate ways to escape the same '<'
character, and I agree that you shouldn't have to distinguish between
them. But XML does treat '<' differently from '&lt;': if you just want
to write a '<' instead of starting a tag, you need to escape it.

I don't want my SAX code[*] to deal with all the different ways to write
a literal '<'. But I expect a "<tag" to generate a start_tag event, and
"&lt;43" to be decoded into '<' in some element's text property, *not*
to generate a start_43 event. I think the same reasoning applies to '/'.
Would it apply to '~' and ';' too?

[*] I've never actually written SAX-structured code; please pardon any
mistaeks.

> in urls I would expect the url parser to unescape things, and pass you
> the unescaped data.

Yeah, me too. I just don't want to lose information: "this was a literal
slash, not a hierarchy delimiter".
But if the framework splits on the real slashes and *then* unquotes each
segment, I'd be happy to get that list of unquoted segments. This way,
my URLs use the obvious way to escape slashes, and by the time the data
gets to my code it is already unescaped.

This could be "dealt with" by using a REQUEST_URI instead. But then I
have to manually trim the components that URL dispatching moved into
SCRIPT_NAME. And I don't actually *have* a REQUEST_URI in the environ.

Ian Bicking wrote:
> distinguishing %2f and / is more of a corner case

I'll call it a canary in the URL mine. Should you have to balance '{'
and '}' to find the quoted namespaces for GData terms? I haven't touched
GData, but .split('/') and *then* unquoting looks like exactly what's
needed in that case.

Thank you,
-- 
Luis Bruno
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
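
The split-then-unquote idea in the message above can be sketched in a
few lines. This is only an illustration under assumptions: it uses
Python 3's urllib.parse (not necessarily the unquote() the post had in
mind), the helper name split_path and the sample path are made up for
the example, and it presumes the application somehow receives the
still-encoded path, which the post notes CGI-style servers may not
provide.

```python
from urllib.parse import quote, unquote


def split_path(raw_path):
    """Split a still-encoded path on literal slashes, then unquote each segment.

    Hypothetical helper: it assumes raw_path has NOT been decoded yet.
    """
    return [unquote(segment) for segment in raw_path.strip("/").split("/")]


# A page name that contains a literal slash, escaped the obvious way.
raw = "/wiki/" + quote("AC/DC", safe="") + "/edit"   # "/wiki/AC%2FDC/edit"

# Splitting first keeps "AC/DC" together as one segment.
segments = split_path(raw)

# Unquoting first loses the distinction between "%2F" and "/".
lossy = unquote(raw).strip("/").split("/")

# Double encoding only survives if exactly one layer is removed:
# unquote(unquote(x)) is not a no-op.
double = quote(quote("AC/DC", safe=""), safe="")     # "AC%252FDC"
once = unquote(double)                               # "AC%2FDC"
twice = unquote(once)                                # "AC/DC"

print(segments)
print(lossy)
print(double, once, twice)
```

With the sample path, splitting first yields three segments, while
unquoting first yields four, because the escaped slash has become a
hierarchy delimiter.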