On Oct 6, 2006, at 10:15 AM, Toby Rush wrote:
> That's probably what I'll end up doing... but it's going to be a
> speed hit, I'm guessing. Shouldn't decodeURLComponent do this, or
> at least have a setting to indicate how the %xx entities are encoded?

I am sure that you are right and that it is a bug...
However, I encountered a bug with this before (on Linux) and it was a
trivial exercise to write your own version which handles encodings
properly. Just use MemoryBlocks (or the in-memory BinaryStream),
copying byte values until you find a % character, and then convert
the following two bytes (the hex digits) into a single byte.
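
The loop described above can be sketched roughly as follows. This is
Python rather than REALbasic, purely to show the logic; the function
name percent_decode is made up for illustration, and leaving a
malformed escape (a % not followed by two hex digits) untouched is an
assumption, not something decodeURLComponent is known to do. The same
byte-by-byte loop should carry over to a MemoryBlock or BinaryStream.

```python
def percent_decode(raw: bytes) -> bytes:
    """Decode %xx escapes into raw bytes; no text encoding is assumed."""
    hex_digits = b"0123456789abcdefABCDEF"
    out = bytearray()
    i = 0
    n = len(raw)
    while i < n:
        # A valid escape is '%' followed by exactly two hex digits.
        if (raw[i] == 0x25 and i + 2 < n
                and raw[i + 1] in hex_digits
                and raw[i + 2] in hex_digits):
            out.append(int(raw[i + 1:i + 3], 16))
            i += 3
        else:
            # Ordinary byte, or a malformed escape: copy it unchanged.
            out.append(raw[i])
            i += 1
    return bytes(out)
```

The result is deliberately a byte string, not text: deciding which
encoding those bytes are in is a separate step.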
Just remember that a generic URLComponent string is supposed to be
encoding-free (undefined). Even though you looked up the proper way
to encode UTF-8 text, there is no encoding tag to identify it as
such. Therefore it is up to you to identify the text as UTF-8, which
is pretty hard to do without a validator... unless you are in a
closed loop system where you know all data is being included as
UTF-8. I would guess that web browsers send data in the encoding
defined by the web page, but I wouldn't be surprised if some browsers
are not UTF-8 aware.
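
To illustrate the point about needing a validator: one common approach
is to try a strict UTF-8 decode and fall back to another encoding if
it fails. Again this is a Python sketch, not REALbasic; the name
decode_text and the Latin-1 fallback are assumptions chosen for the
example, and a successful UTF-8 decode only shows that the bytes are
valid UTF-8, not that the sender intended them as such.

```python
def decode_text(data: bytes, fallback: str = "latin-1") -> str:
    """Treat data as UTF-8 if it validates, otherwise use the fallback."""
    try:
        # A strict decode doubles as a validator: it raises on any
        # byte sequence that is not well-formed UTF-8.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback; a real application should use whatever
        # encoding the page or form actually declared.
        return data.decode(fallback)
```

In a closed-loop system where everything is known to be UTF-8, the
fallback is unnecessary and a failed decode can simply be reported as
an error.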