Hi!

I'm having trouble importing SPSS files containing non-ASCII characters (R 2.4.1, Debian Linux, i386). To reproduce, download the following file:

http://statmath.wu-wien.ac.at/data/spss/de/comphomeneu.sav

Then run:

require (foreign)
Sys.setlocale (locale="C")
read.spss("comphomeneu.sav")$ARBEIT[1]
# prints:
# [1] im B\374ro
# Levels: im B\374ro zuhause

\374 is, of course, actually a u-umlaut. However, I guess in the C locale it's not expected to print as such. But now try this (use any UTF-8 locale you may have installed):

Sys.setlocale (locale="de_DE.UTF-8")
read.spss("comphomeneu.sav")$ARBEIT[1]
# prints:
# [1]Error in print.default(xx, quote = quote, ...) :
#         invalid multibyte string

To me it looks like read.spss () would probably need an encoding parameter, and / or some iconv () magic. Now, locale conversion always makes my head spin, so I thought I'd better post here before calling this a bug in R.

Two questions:
1) Is there some way to work around this, i.e. make sure the data is converted to proper UTF-8 while importing? Am I missing something obvious?
2) Should I submit this as a bug report?

Thanks!
Thomas Friedrichsmeier
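
For illustration, here is a minimal sketch of the kind of iconv () workaround asked about in question 1. It assumes the strings in the file are ISO-8859-1 (latin1), where byte \374 (0xFC) is u-umlaut; that encoding is a guess based on the output above, not something the file declares here. The helper name recode_to_utf8 is hypothetical and not part of the foreign package, and the hand-built factor merely stands in for what read.spss () returned, so the sketch runs without the download.

```r
# Hypothetical helper (not part of 'foreign'): re-encode strings or factor
# levels from an assumed source encoding to UTF-8.
recode_to_utf8 <- function(x, from = "latin1") {
  if (is.factor(x)) {
    # Only the level labels carry text; the integer codes stay untouched.
    levels(x) <- iconv(levels(x), from = from, to = "UTF-8")
  } else if (is.character(x)) {
    x <- iconv(x, from = from, to = "UTF-8")
  }
  x  # anything else (numeric, logical, ...) is returned unchanged
}

# Stand-in for read.spss("comphomeneu.sav")$ARBEIT[1]: a factor whose
# levels contain the raw latin1 byte 0xFC (octal \374).
raw_levels <- c("im B\xfcro", "zuhause")
arbeit <- structure(1L, levels = raw_levels, class = "factor")

fixed <- recode_to_utf8(arbeit)
print(fixed)
```

Applied to a whole import, the same helper could be mapped over every element, e.g. lapply (read.spss ("comphomeneu.sav"), recode_to_utf8), again under the assumption that latin1 is the right source encoding.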
