Then shouldn't you be complaining to Apple that the value returned by nl_langinfo needs to be changed? David's point seems to be that second guessing the character set reported by the OS is likely to cause a different set of problems.
Scott

On Tue, Jul 30, 2013 at 10:14 AM, Johannes Schindelin <[email protected]> wrote:

> Hi,
>
> On Tue, 30 Jul 2013, David Holmes wrote:
>
> > On 30/07/2013 5:54 AM, Brent Christian wrote:
> > > On 7/28/13 10:13 PM, David Holmes wrote:
> > > > On 27/07/2013 3:53 AM, Brent Christian wrote:
> > > > > Please review my fix for 8011194 : "Apps launched via double-clicked
> > > > > .jars have file.encoding value of US-ASCII on Mac OS X"
> > > > >
> > > > > http://bugs.sun.com/view_bug.do?bug_id=8011194
> > > > >
> > > > > In most cases of launching a Java app on Mac (from the cmdline, or
> > > > > from a native .app bundle), reading and displaying UTF-8
> > > > > characters beyond the standard ASCII range works fine.
> > > > >
> > > > > A notable exception is the launching of an app by double-clicking
> > > > > a .jar file. In this case, file.encoding defaults to US-ASCII,
> > > > > and characters outside of the ASCII range show up as garbage.
> > > >
> > > > Why does this occur? What sets the encoding to US-ASCII?
> > >
> > > "US-ASCII" is the answer we get from nl_langinfo(CODESET) because no
> > > values for LANG/LC* are set in the environment when double-clicking a
> > > .jar.
> > >
> > > We get "UTF-8" when launching from the command line because the
> > > default Terminal.app setup on Mac will setup LANG for you (to
> > > "en_US.UTF-8" in the US).
> >
> > Sounds like a user environment error to me. This isn't my area but I'm
> > not convinced we should be second guessing what we think the encoding
> > should be.
>
> Except that that is not the case here, of course. The user did *not* set
> any environment variable in this case.
>
> So we are not talking about "second guessing" or "user environment error"
> but about a sensible default.
>
> As to US-ASCII, sorry to say: the seventies called and want their
> character set back.
>
> There can be no question that UTF-8 is the best default character
> encoding, or are you even going to question *that*?
>
> > What if someone intends for it to be US-ASCII?
>
> Then LANG would not be unset, would it.
>
> Hth,
> Johannes
>
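
For anyone following the thread, here is a rough sketch of the policy being argued over. It is not the proposed JDK fix, only an illustration in Python. The function names and the choice of which variables to check (LC_ALL, LC_CTYPE, LANG) are my own assumptions; the only calls into a real library are Python's `locale.setlocale` and `locale.nl_langinfo`, which are Unix-only.

```python
import locale

# Environment variables that can select the character set.
# Checking exactly these three is an assumption made for this sketch.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")


def locale_env_is_unset(env):
    """Return True if none of the locale variables has a non-empty value."""
    return not any(env.get(name) for name in LOCALE_VARS)


def reported_codeset():
    """Ask the C library which codeset the current environment selects."""
    locale.setlocale(locale.LC_CTYPE, "")  # adopt the environment's locale
    return locale.nl_langinfo(locale.CODESET)


def choose_default_encoding(codeset, env):
    """Hypothetical policy: fall back to UTF-8 only when the OS reports
    US-ASCII *and* the user has set no locale variable at all."""
    if codeset.upper() == "US-ASCII" and locale_env_is_unset(env):
        return "UTF-8"
    # Anything the user asked for explicitly is left alone.
    return codeset
```

On a Unix system this could be called as `choose_default_encoding(reported_codeset(), os.environ)`. Under this policy, someone who really wants US-ASCII gets it by setting LANG (for example to "C"), which is the point of "Then LANG would not be unset". It still overrides what the OS reports in the unset case, which is the concern raised at the top of this message.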
