On Wed, Dec 2, 2009 at 8:44 PM, Maciej Stachowiak wrote:
>
> On Dec 2, 2009, at 8:14 PM, Darin Fisher wrote:
>
> What about Maciej's comment. JS strings are often used to store binary
> values. Obviously, if people stick to octets, then it should be fine, but
> perhaps some folks leverage all 16 bits?
>
>
> I think some people do use JavaScript strings this way, though not
> necessarily with LocalStorage. This kind of use will probably become
> obsolete when we add a proper way to store binary data from the platform.
>
> Most Web-related APIs are fully accepting of JavaScript strings that are
> not proper UTF-16. I don't see a strong reason to make LocalStorage an
> exception. It does make sense for WebSocket to be an exception, since in
> that case charset transcoding is required by the protocol, and since it is
> desirable in that case to prevent any funny business that may trip up the
> server.
>
> Also, looking at UTF-16 more closely, it seems like all UTF-16 can be
> transcoded to UTF-8 and round-tripped if one is willing to allow technically
> invalid UTF-8 that encodes unpaired characters in the surrogate range as if
> they were characters. It's not clear to me why Firefox or IE choose to
> reject instead of doing this. This also removes my original objection to
> storing strings as UTF-8.
>

I think it is typical for UTF-16 to UTF-8 conversion to involve the
intermediate step of forming a Unicode code point. If that cannot be done,
then conversion fails.

This may actually be a security thing. If something expects UTF-8, it is
safer to ensure that it gets valid UTF-8 (even if that involves loss of
information).

-Darin

> Regards,
> Maciej
>
>
> -Darin
>
> On Wed, Dec 2, 2009 at 5:03 PM, Ian Hickson wrote:
>
>> On Wed, 2 Dec 2009, Michael Nordman wrote:
>> >
>> > Arguably, seems like a bug that invalid string values are let thru the
>> > door to start with?
>>
>> Yeah, I should make the spec throw SYNTAX_ERR if there are any unpaired
>> surrogates, the same way WebSocket does. I'll file a bug.
>>
>> --
>> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
>> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
>> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>> _______________________________________________
>> webkit-dev mailing list
>> [email protected]
>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>>
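To make the two positions in this thread concrete, here is a rough sketch in
Python (not taken from WebKit, Firefox, or IE; the function names and the
allow_lone_surrogates flag are made up for illustration). It models a JS
string as a list of 16-bit code units, forms a code point as the intermediate
step, and either fails on an unpaired surrogate (the strict behaviour) or
encodes it as if it were a character (the lenient, technically invalid UTF-8
that round-trips). It assumes Python 3 and its standard "surrogatepass" codec
error handler.

```python
def utf16_units_to_utf8(units, allow_lone_surrogates=False):
    """Transcode a list of 16-bit code units to UTF-8 bytes.

    Hypothetical helper: strict mode raises on an unpaired surrogate,
    lenient mode encodes it as a 3-byte sequence (not valid UTF-8).
    """
    out = bytearray()
    i = 0
    while i < len(units):
        u = units[i]
        has_next = i + 1 < len(units)
        if 0xD800 <= u <= 0xDBFF and has_next and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # Proper surrogate pair: combine into one supplementary code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate: no code point can be formed.
            if not allow_lone_surrogates:
                raise ValueError(f"unpaired surrogate U+{u:04X} at index {i}")
            cp = u
            i += 1
        else:
            cp = u
            i += 1
        # "surrogatepass" only matters for the lenient lone-surrogate case.
        out += chr(cp).encode("utf-8", "surrogatepass")
    return bytes(out)


def utf8_to_utf16_units(data, allow_lone_surrogates=False):
    """Inverse of the above: UTF-8 bytes back to 16-bit code units."""
    errors = "surrogatepass" if allow_lone_surrogates else "strict"
    units = []
    for ch in data.decode("utf-8", errors):
        cp = ord(ch)
        if cp >= 0x10000:
            # Split a supplementary code point back into a surrogate pair.
            cp -= 0x10000
            units += [0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]
        else:
            units.append(cp)
    return units
```

With this sketch, a string containing a lone 0xD800 is rejected in strict
mode, while lenient mode round-trips it at the cost of producing bytes that a
strict UTF-8 consumer would refuse, which is the trade-off discussed above.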

