Hey Evan,

I apologize for missing this discussion; I'm sure that I'm not seeing the entire picture or all the pros of this argument. I mentioned before that I'm in support of utf-8 everywhere we can get it.

We are obviously going to have platform-specific code for the UI (win32 / cocoa/objective-c / gtk), and it makes sense to use the native UI string type there. However, I think it should be possible for all "non-platform" common code and interfaces to be in utf-8, and I feel like this would be a more logical design with equivalent performance.

I just wanted to point out a few concerns I have with using string16 in general.

- Another string type.
It's already a bit confusing with WebKit strings, StringPiece, std::string, std::wstring, and string16. I feel like making the UI be string16 is going to prevent us from ever really pushing one string encoding everywhere.

- WebKit strings are not an argument for string16
We don't have to interact with WebKit from the UI, and we have a very nice interface there forced onto us by the IPC. So I don't think WebKit using utf-16 is an argument for our UI code. WebKit's use of utf-16 is forced by the JavaScript standard.

- std::wstring == string16, only on Windows
I think this will cause some confusion and likely a few bugs, where strings are improperly converted/confused between the two.

- You can't have string16 literals on Mac / Linux.
On Windows, L"foo" will be a 16-bit string, making it fine as a std::wstring or string16. On Mac and Linux these will be 32-bit, unless we compile with -fshort-wchar, but I'm not sure that's a good idea. This means any string literals will need to be stored in another encoding (ascii, utf-8, wchar_t), and then converted to UTF-16. This isn't so strange until you think of what will happen on Linux, when we have utf-8 -> utf-16 -> utf-8 -> gtk.

- We don't have good library functions for string16
We have a lot of great things in string_util, and most operate on std::string / ascii / utf8 / std::wstring. We would need to add string16 versions for all of these (at least it would be really nice to be able to use them).

- Memory / speed
You pointed out originally this isn't a big deal, and that we don't have many UI strings. (This will later be my argument for why paying a utf-8 -> native conversion isn't a problem.) utf-8 is a more compact encoding in memory: for the very common ASCII case we save a byte per character, and for non-ASCII cases it's probably the same, since the utf-8 encoding would probably only take 2 bytes. This also makes a difference in performance, since memory is a bottleneck, and you have to deal with less of it. Probably not really worth evaluating in this setting, but I just wanted to point out that I feel like utf-8 is the superior encoding here.

I'm definitely looking forward to the other side of the picture, and why using string16 will make our UI code simpler on Mac and Linux.

On Wed, Feb 4, 2009 at 3:11 AM, Evan Martin <[email protected]> wrote:
>
> [A bunch of the team met up today to hammer out some decisions.]
>
> In brief: for strings that are known to be Unicode (that is, not
> random byte strings read from a file), we will migrate towards using
> string16. This means all places we use wstring should be split into
> the appropriate types:
> - byte strings should be string or vectors of chars
> - paths should be FilePath
> - urls should be GURL
> - UI strings, etc. should be string16.
>
> string16 uses UTF-16 underneath. It's equivalent to wstring on
> Windows, but wstring involves 4-byte characters on Linux/Mac.
>
> Some important factors were:
> - we don't have too many strings in this category (with the huge
> exception of WebKit), so memory usage isn't much of an issue
> - it's the native string type of Windows, Mac, and WebKit
> - we want it to be explicit (i.e. a compile error) when you
> accidentally use a byte string in a place where we should know the
> encoding (which std::string and UTF-8 doesn't allow)
> - we still use UTF-8 in some places (like the history full-text
> database) where space is more of a concern
>
>
>
--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: [email protected]
View archives, change email options, or unsubscribe:
http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
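
To make the string-literal and memory points from the reply concrete, here is a minimal sketch. It is not Chromium code: String16, Utf8ToUtf16, Utf8Bytes, Utf16Bytes, WideCharBytes and Example are hypothetical names chosen for illustration, char16_t and u"" literals are C++11 features that postdate this thread, and the converter assumes well-formed UTF-8 and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for string16: a string of 16-bit code units.
using String16 = std::basic_string<char16_t>;

// Size of one wchar_t, i.e. of each unit in an L"..." literal. This is
// platform-specific: 2 on Windows, typically 4 on Linux/Mac unless the
// code is built with -fshort-wchar.
std::size_t WideCharBytes() { return sizeof(wchar_t); }

// Hypothetical helper: decodes UTF-8 and re-encodes it as UTF-16.
// Assumes well-formed input; malformed input is not detected.
String16 Utf8ToUtf16(const std::string& in) {
  String16 out;
  std::size_t i = 0;
  while (i < in.size()) {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    // Number of continuation bytes implied by the lead byte.
    int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
    // Payload bits carried by the lead byte itself.
    std::uint32_t cp = extra == 0 ? lead : lead & (0x3F >> extra);
    ++i;
    // Each continuation byte contributes six more bits.
    for (; extra > 0 && i < in.size(); --extra, ++i) {
      cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
    }
    if (cp >= 0x10000) {
      // Outside the Basic Multilingual Plane: emit a surrogate pair.
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    } else {
      out.push_back(static_cast<char16_t>(cp));
    }
  }
  return out;
}

// Bytes of character data held by each representation.
std::size_t Utf8Bytes(const std::string& s) { return s.size(); }
std::size_t Utf16Bytes(const String16& s) {
  return s.size() * sizeof(char16_t);
}

// Usage example: the ASCII literal "foo" from the reply.
void Example() {
  const std::string utf8 = "foo";
  const String16 utf16 = Utf8ToUtf16(utf8);
  assert(utf16 == u"foo");
  assert(Utf8Bytes(utf8) == 3);    // one byte per ASCII character
  assert(Utf16Bytes(utf16) == 6);  // two bytes per ASCII character
}
```

Every literal that starts life as a narrow string pays one such conversion before it can be used as a 16-bit string, and on Linux a second conversion back to utf-8 would follow before the text reaches gtk.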
