> I just compared urlencode/urldecode with Java, which by its nature uses
> Unicode:
>
> http://java.sun.com/javase/6/docs/api/java/net/URLEncoder.html
>
> The input parameter is a *String* (which is by definition Unicode in
> Java). The output is an ASCII-only string with the URL-encoded values.
> The character set for the output is chosen by an additional parameter
> (that could be done in PHP, too), or the platform default is used (in
> PHP that would be the encoding used to write strings to files or to the
> web output). This default encoding would fully conform to what is
> currently done when creating web pages: web browsers encode the entered
> values in the encoding of the web page that contains the form. A PHP
> script that generates URLs should act in the same way, so the
> URL-encoded values in a URL should be encoded using the character set
> that is used for text output.
>
> A special case is the HTTP/URI standard, which states that URLs should
> always be encoded as UTF-8 (but no browser does this for form values).
> Browsers do, however, use UTF-8 when encoding path components that
> contain special characters, and web servers expect it that way when
> mapping the path component to the local filesystem. So the default
> encoding when using rawurlencode (which is normally used NOT for forms
> but rather for paths, DOIs, URNs, ...) should be UTF-8. But I think it
> would be counterproductive to differentiate between rawurlencode and
> urlencode. For the case where a user wants to encode a string to a
> different encoding, he could use an optional second parameter to
> (raw)urlencode (like in Java).
>
> If (raw)urlencode is given a *binary* string, the second parameter
> should be disallowed and a warning or similar should be raised. A
> binary string should be encoded byte by byte as before!
>
> I think this would make a lot of applications more backwards compatible
> and the code simpler.
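To illustrate the Java behaviour referred to in the quoted text, here is a minimal sketch. `URLEncoder.encode(String, String)` is the real Java API. The class name `UrlEncodeSketch` and the helpers `encodeText` and `encodeBytes` are made up for this example. `encodeBytes` only approximates the "binary string" case, with a simplified set of characters left unescaped; it is not a copy of what rawurlencode does internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class UrlEncodeSketch {

    // Unicode string in, ASCII-only string out; the charset is chosen
    // by the second parameter, as in java.net.URLEncoder.
    static String encodeText(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Hypothetical helper for the "binary string" case: no charset is
    // involved, each byte is either copied or written as %XX.
    // Assumption: only letters, digits, '-', '_' and '.' stay unescaped.
    static String encodeBytes(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (byte b : data) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean unescaped = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || v == '-' || v == '_' || v == '.';
            if (unescaped) {
                out.append((char) v);
            } else {
                out.append(String.format("%%%02X", v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "M\u00fcller"; // "Müller" as a Unicode string
        System.out.println(encodeText(text, "UTF-8"));      // M%C3%BCller
        System.out.println(encodeText(text, "ISO-8859-1")); // M%FCller
        // The same name as raw ISO-8859-1 bytes, encoded byte by byte.
        System.out.println(encodeBytes(new byte[] {'M', (byte) 0xFC})); // M%FC
    }
}
```

The same Unicode string gives different URL-encoded output depending on the charset, which is why the choice of default matters. The byte-oriented helper takes no charset at all, matching the idea that a second parameter makes no sense for binary input.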
I forgot: the same should apply the other way round. urldecode should
accept only a binary string (its input is always ASCII-only and "encoded")
and convert it to a decoded Unicode string using the default or the
supplied encoding:

http://java.sun.com/javase/6/docs/api/java/net/URLDecoder.html

-----
Uwe Schindler
[EMAIL PROTECTED] - http://www.php.net
NSAPI SAPI developer
Bremen, Germany