Seppo,

On May 16, 2007, at 11:39 AM, Seppo Nyrkkö wrote:

> Dear mac & R users,
>
> Returning to this issue, Antti and I found that this particular
> problem with R.app and Scandinavian characters was triggered by
> Mac OS X's system-wide language locale being set to "C" (POSIX) in
> the OS X installation phase.
>

Did you even have a look? If you did, you'd see that pretty much nothing of what you (or Antti) said is true. For example, in the US locale you get:

 > Sys.getlocale()
 [1] "en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8"

Please read the documentation, especially the R for Mac OS X FAQ: "9 Internationalization of the R.app". R.app sets the locale according to the user's preferences. For proper operation you *must* use a UTF-8 locale, because that is what all system calls expect. You can run R in the C locale if you insist, but then you lose the ability to display non-ASCII characters.

This has nothing to do with reading and writing files - note the "encoding" option in all read/write functions, which allows you to read/write files in various encodings. Please read the R documentation on encodings.

Cheers,
Simon

> (details follow)
>
>
> On June 22, 2006 at 19:43, Antti Arppe wrote:
>
>> Dear colleagues,
>>
>> With the help of a colleague of mine here in Helsinki (Seppo Nyrkkö)
>> who looked at the innards of the R source code for Mac, it turned out
>> that this was in the end indeed an issue concerning the Mac locale
>> and its settings and not R.
>>
>> Though we had tried this earlier by changing the LANG variable to
>> 'fi_FI', we hadn't looked hard enough in the available encodings
>> (with locale -a) to select the exactly correct value, being:
>>
>> LANG=fi_FI.ISO8859-1; export LANG;
>>
>> With this configuration R was able to happily read in my original
>> table with the Scandinavian characters in the header, without any
>> fuss.
>>
>> Thanks for your advice, and wishing all a good Midsummer,
>>
>> -Antti Arppe
>>
>
>
> At startup, R checks whether it is running in an international
> character set locale or not. The locale information is inherited from
> the parent process, i.e. the OS X window server, which reads locale
> settings from the system-wide settings. This information describes
> which characters are printable, and which should be displayed as
> substituted characters during the whole R session. The POSIX C locale
> allows only displaying 7-bit ASCII characters, and disables any
> printing of the Scandinavian characters (ä,ö,å) in R.app.
>
> The first step of recovery is to change the system from the C locale
> to an international locale which allows UTF-8 character sequences
> (can be done through System Preferences). This enables proper output
> of Unicode characters in the R.app terminal.
>
> Then, to read and write files in the Latin-1 (ISO-8859-1) character
> set (note that the system does UTF-8 by default now), one should
> change the default encoding for file operations by entering
> 'options(encoding="iso-8859-1")'
> at the command prompt. It is also possible to add this setting in the
> startup file ".Rprofile" in the project startup directory.
>
> Changing the locale in the command-line shell session (either by hand
> or in the shell profile script) might not be the best solution here,
> since other locale-aware OS X applications, launched from the window
> manager, would remain in the C locale.
>
> with best regards,
> Seppo
>
> _______________________________________________
> R-SIG-Mac mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
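
The point in the reply above about declaring an encoding per read call, rather than changing the locale, can be illustrated with a minimal sketch. The temporary file and its bytes are made up for the example and are not from the thread. It uses the `encoding` argument of `readLines()`, which only marks the strings as Latin-1; for whole tables, `read.table()` has a `fileEncoding` argument that re-encodes on input instead.

```r
# Hypothetical input: a temporary file holding the Latin-1 bytes of the
# characters ä, ö, space, å (0xE4 0xF6 0x20 0xE5) followed by a newline.
# Writing raw bytes keeps the example independent of the session locale.
path <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xE4, 0xF6, 0x20, 0xE5, 0x0A)), path)

# Show the character-type locale of the current session.
print(Sys.getlocale("LC_CTYPE"))

# Declare the file's encoding for this one call. With readLines(), the
# `encoding` argument marks the strings as Latin-1 without re-encoding them.
x <- readLines(path, encoding = "latin1")

# Convert explicitly to UTF-8 for use in the rest of the session.
x_utf8 <- enc2utf8(x)
print(Encoding(x_utf8))

unlink(path)
```

Whether the characters then display correctly still depends on the session running in a UTF-8 locale, as the reply says.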
