The HTMLTokenizer still works in UChars. There's likely some performance to be gained by moving it to an 8-bit character type. There's some trickiness involved because HTML entities can expand to characters outside of Latin-1. Also, it's unclear if we want two tokenizers (one that's 8 bits wide and another that's 16 bits wide) or if we should find a way for the 8-bit tokenizer to handle, for example, UTF-16 encoded network responses.
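To make the entity problem concrete, here is a rough sketch of one way a buffer could stay 8-bit until something like `&#x2603;` forces it wider. This is not WebKit code: `NarrowFirstBuffer` and all of its members are hypothetical names made up for illustration, and it assumes characters arrive one UTF-16 code unit at a time.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical buffer, not a WebKit class: stores text as 8-bit Latin-1
// until a character above U+00FF forces a one-time widening to 16 bits.
class NarrowFirstBuffer {
public:
    void append(char16_t c)
    {
        if (m_is8Bit && c <= 0xFF) {
            // Common case: the character fits in Latin-1, stay narrow.
            m_narrow.push_back(static_cast<std::uint8_t>(c));
            return;
        }
        if (m_is8Bit)
            widen(); // First character outside Latin-1, e.g. from an entity.
        m_wide.push_back(c);
    }

    bool is8Bit() const { return m_is8Bit; }

    std::size_t length() const
    {
        return m_is8Bit ? m_narrow.size() : m_wide.size();
    }

    char16_t at(std::size_t i) const
    {
        return m_is8Bit ? static_cast<char16_t>(m_narrow[i]) : m_wide[i];
    }

private:
    // Copy the existing Latin-1 bytes into 16-bit storage. Latin-1 maps
    // directly onto U+0000..U+00FF, so this is a plain zero extension.
    void widen()
    {
        m_wide.reserve(m_narrow.size() + 1);
        for (std::uint8_t byte : m_narrow)
            m_wide.push_back(static_cast<char16_t>(byte));
        m_narrow.clear();
        m_is8Bit = false;
    }

    bool m_is8Bit = true;
    std::vector<std::uint8_t> m_narrow;
    std::u16string m_wide;
};
```

The widening copy is paid at most once per buffer, so all-Latin-1 input never touches 16-bit storage. Whether that would be a net win for the tokenizer is exactly the open question; the sketch only shows that a single code path can handle both widths.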
Adam


On Thu, Mar 7, 2013 at 10:11 AM, Darin Adler <da...@apple.com> wrote:
> No. I retract my question. Sounds like we already have it right! thanks for
> setting me straight.
>
> Maybe some day we could make a non copying code path that points directly at
> the data in the SharedBuffer, but I have no idea if that'd be beneficial.
>
> -- Darin
>
> Sent from my iPhone
>
> On Mar 7, 2013, at 10:01 AM, Michael Saboff <msab...@apple.com> wrote:
>
>> There is an all-ASCII case in TextCodecUTF8::decode(). It should be keeping
>> all ASCII data as 8 bit. TextCodecWindowsLatin1::decode() has not only an
>> all-ASCII case, but it only up converts to 16 bit in a couple of rare cases.
>> Is there some other case you don't think we are handling?
>>
>> - Michael
>>
>> On Mar 7, 2013, at 9:29 AM, Darin Adler <da...@apple.com> wrote:
>>
>>> Hi folks.
>>>
>>> Today, bytes that come in from the network get turned into UTF-16 by the
>>> decoding process. We then turn some of them back into Latin-1 during the
>>> parsing process. Should we make changes so there’s an 8-bit path? It might
>>> be as simple as writing code that has more of an all-ASCII special case in
>>> TextCodecUTF8 and something similar in TextCodecWindowsLatin1.
>>>
>>> Is there something significant to be gained here? I’ve been wondering this
>>> for a while, so I thought I’d ask the rest of the WebKit contributors.
>>>
>>> -- Darin
>>> _______________________________________________
>>> webkit-dev mailing list
>>> webkit-dev@lists.webkit.org
>>> https://lists.webkit.org/mailman/listinfo/webkit-dev