On Wed, May 13, 2009 at 2:20 PM, Greg Spencer <[email protected]> wrote:
> On Wed, May 13, 2009 at 2:05 PM, Darin Fisher <[email protected]> wrote:
>>
>> That conversion is not defined. If you are on Linux, the contents of the
>> file path is just an array of bytes. It might be UTF-8, in which case you
>> can convert to UTF-16. However, it may also be some crazy encoding or it
>> may not match any encoding. This OS does not require it to match an
>> encoding.
>>
>> When we need to convert a FilePath to Unicode, we use the
>> SysWideToNativeMB and SysNativeMBToWide functions from base. This works by
>> inspecting what the system thinks the current multi-byte encoding is. On
>> Mac that is UTF-8. On Linux, it depends on the value of $LANG. Each time
>> we do such a conversion, we are introducing a potential bug in the product
>> (on Linux at least), so we try hard to avoid them.
>>
>
> Yes, I know that this is how it works (see earlier messages in this
> thread), but can you tell me if there are any Linux apps that manage to do
> this correctly (e.g. without having this bug), and how they do it?
>
> I can't see how any Linux app can do any better than looking at LANG and
> LC_CHAR and hoping that they're set correctly. Certainly there's no way to
> decode a pathname that includes multiple encodings, and I have no idea what
> happens with NFS mounts between machines with different settings.
>
> I'm just saying why not just do as well as can be done by the best app out
> there, and punt after that?
>
> -Greg.
>

Sorry to repeat information. This is a long thread!

The "solution" is to not convert to UTF-16 unless you are trying to
generate a string to display to the user. Then you should use the LANG
information to determine how best to render the text for display to the
user.

The program should try its best to preserve the file path in the original
form and not try to convert to UTF-16 and back again since that conversion
may be lossy.

I know this doesn't really help.

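To illustrate the lossy part, here is a minimal sketch. It is not code from
base: PathBytesForDisplay is a hypothetical name, and for simplicity it
assumes the locale encoding is UTF-8 rather than consulting LANG. It turns
the raw path bytes into a display-only string, replacing every byte that is
not part of well-formed UTF-8 with U+FFFD.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only; not part of Chromium's base.
// Assumption: the display encoding is UTF-8. Real code would look at the
// locale (LANG and friends) to pick the encoding.
// The result is for showing to the user only. It is lossy, so it must never
// be converted back and handed to open(), stat(), and so on.
std::string PathBytesForDisplay(const std::string& raw) {
  static const char kReplacement[] = "\xEF\xBF\xBD";  // U+FFFD as UTF-8
  std::string out;
  std::size_t i = 0;
  while (i < raw.size()) {
    const unsigned char lead = static_cast<unsigned char>(raw[i]);

    // Expected sequence length from the lead byte; 0 means invalid lead.
    std::size_t len = 0;
    if (lead < 0x80) len = 1;
    else if (lead >= 0xC2 && lead <= 0xDF) len = 2;
    else if (lead >= 0xE0 && lead <= 0xEF) len = 3;
    else if (lead >= 0xF0 && lead <= 0xF4) len = 4;

    // The whole sequence must fit and all trailing bytes must be 10xxxxxx.
    bool ok = len != 0 && i + len <= raw.size();
    for (std::size_t k = 1; ok && k < len; ++k) {
      const unsigned char c = static_cast<unsigned char>(raw[i + k]);
      ok = (c & 0xC0) == 0x80;
    }

    // Reject overlong forms, encoded surrogates, and values above U+10FFFF.
    if (ok && len >= 3) {
      const unsigned char second = static_cast<unsigned char>(raw[i + 1]);
      if ((lead == 0xE0 && second < 0xA0) || (lead == 0xED && second > 0x9F) ||
          (lead == 0xF0 && second < 0x90) || (lead == 0xF4 && second > 0x8F)) {
        ok = false;
      }
    }

    if (ok) {
      out.append(raw, i, len);  // Copy the well-formed sequence unchanged.
      i += len;
    } else {
      out += kReplacement;      // Substitute one bad byte and resynchronize.
      ++i;
    }
  }
  return out;
}
```

Two different on-disk names such as "a\xE9" and "a\xFF" both come out as
"a" followed by U+FFFD, so there is no way to get back from the display
string to the file. That is why the original bytes need to be kept around
for anything that touches the file system.
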
I think it is reasonable to have a utility somewhere to perform a
conversion to UTF-16 (or UTF-8), but it should come with a stern warning,
and I kind of prefer it not being a method on FilePath since I would prefer
people not be tempted to overuse it.

-Darin

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: [email protected]
View archives, change email options, or unsubscribe:
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
