I was about to chime in that UTF-8 has been the preferred encoding for (stored) text on Mac OS X as long as I've been hacking on it (think "Rhapsody"), so why is this even an issue?
Judging from the docs, nl_langinfo seems like a Unix portability function (something likely to be happier with ASCII in a terminal), not something to be used by a native Cocoa application.

<vote> Set it to UTF-8 and forget about it </vote>

-DrD-

> Apple is highly unlikely to change the behavior of nl_langinfo().
>
> There is already code in the JDK that calls into JRSCopyPrimaryLanguage(),
> JRSCopyCanonicalLanguageForPrimaryLanguage(), and JRSSetDefaultLocalization()
> for exactly this purpose.
>
> Please proceed with setting the encoding to UTF-8. It is the de-facto
> standard for every Cocoa application I have ever seen. US-ASCII is always the
> wrong choice for a graphical app on OS X.
>
> Regards,
> Mike Swingler
> Apple Inc.
>
> On Jul 30, 2013, at 9:05 AM, Francis Devereux <[email protected]> wrote:
>
>> I suspect that Apple might be unlikely to change the value that nl_langinfo
>> returns when LANG is unset.
>>
>> However, it might be possible to fix this issue without second-guessing the
>> character set reported by the OS by calling [NSLocale currentLocale] (or the
>> CFLocale equivalent) instead of nl_langinfo. I think (although I haven't
>> checked) that [NSLocale currentLocale] determines the current locale
>> using a mechanism other than environment variables, because LANG is usually
>> unset for GUI apps on OS X.
>>
>> On 30 Jul 2013, at 15:56, Scott Palmer <[email protected]> wrote:
>>
>>> Then shouldn't you be complaining to Apple that the value returned by
>>> nl_langinfo needs to be changed?
>>> David's point seems to be that second guessing the character set reported
>>> by the OS is likely to cause a different set of problems.
>>>
>>> Scott
>>>
>>>
>>> On Tue, Jul 30, 2013 at 10:14 AM, Johannes Schindelin <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Tue, 30 Jul 2013, David Holmes wrote:
>>>>
>>>>> On 30/07/2013 5:54 AM, Brent Christian wrote:
>>>>>> On 7/28/13 10:13 PM, David Holmes wrote:
>>>>>>> On 27/07/2013 3:53 AM, Brent Christian wrote:
>>>>>>>> Please review my fix for 8011194 : "Apps launched via double-clicked
>>>>>>>> .jars have file.encoding value of US-ASCII on Mac OS X"
>>>>>>>>
>>>>>>>> http://bugs.sun.com/view_bug.do?bug_id=8011194
>>>>>>>>
>>>>>>>> In most cases of launching a Java app on Mac (from the cmdline, or
>>>>>>>> from a native .app bundle), reading and displaying UTF-8
>>>>>>>> characters beyond the standard ASCII range works fine.
>>>>>>>>
>>>>>>>> A notable exception is the launching of an app by double-clicking
>>>>>>>> a .jar file. In this case, file.encoding defaults to US-ASCII,
>>>>>>>> and characters outside of the ASCII range show up as garbage.
>>>>>>>
>>>>>>> Why does this occur? What sets the encoding to US-ASCII?
>>>>>>
>>>>>> "US-ASCII" is the answer we get from nl_langinfo(CODESET) because no
>>>>>> values for LANG/LC* are set in the environment when double-clicking a
>>>>>> .jar.
>>>>>>
>>>>>> We get "UTF-8" when launching from the command line because the
>>>>>> default Terminal.app setup on Mac will setup LANG for you (to
>>>>>> "en_US.UTF-8" in the US).
>>>>>
>>>>> Sounds like a user environment error to me. This isn't my area but I'm
>>>>> not convinced we should be second guessing what we think the encoding
>>>>> should be.
>>>>
>>>> Except that that is not the case here, of course. The user did *not* set
>>>> any environment variable in this case.
>>>>
>>>> So we are not talking about "second guessing" or "user environment error"
>>>> but about a sensible default.
>>>>
>>>> As to US-ASCII, sorry to say: the seventies called and want their
>>>> character set back.
>>>>
>>>> There can be no question that UTF-8 is the best default character
>>>> encoding, or are you even going to question *that*?
>>>>
>>>>> What if someone intends for it to be US-ASCII?
>>>>
>>>> Then LANG would not be unset, would it.
>>>>
>>>> Hth,
>>>> Johannes
>>>>
>>>
>>
>
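
For anyone who wants to poke at the behavior discussed in this thread, here is a minimal sketch in Python (not the JDK's actual fix, which lives in native code). It asks nl_langinfo(CODESET) what the environment reports, then applies the default argued for above: fall back to UTF-8 only when none of LC_ALL, LC_CTYPE, or LANG is set. The names reported_codeset and choose_encoding are made up for illustration, and the policy itself is my assumption about what "a sensible default" would look like.

```python
import locale
import os

# Variables that select the character set, in POSIX precedence order.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")


def reported_codeset():
    """Return what nl_langinfo(CODESET) reports for the environment's locale."""
    try:
        # Adopt whatever LANG/LC_* say, as a C program would with setlocale(LC_CTYPE, "").
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        pass  # unsupported locale name: keep the locale already in effect
    return locale.nl_langinfo(locale.CODESET)


def choose_encoding(reported, env):
    """Hypothetical policy: override the reported codeset only when no locale variable is set."""
    if any(env.get(name) for name in LOCALE_VARS):
        return reported  # the user (or the terminal) picked a locale; honour it
    return "UTF-8"  # nothing was chosen, so use the platform's de-facto standard


if __name__ == "__main__":
    print(choose_encoding(reported_codeset(), os.environ))
```

With this policy, someone who really does want US-ASCII can still get it by setting LANG or LC_ALL explicitly, which is the point made above about LANG not being unset in that case.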
