On 15/04/2010 12:22 PM, Michael Stegh wrote:
Dear List,

I have data which contain the special German characters "ä", "ö", "ü" etc. After reading the text files into R those characters are displayed strangely, e.g. "ä" is shown as "Ã¤". The first step is to replace those with their typical transcription, e.g. "ä" becomes "ae", by using the gsub command.

Seeing "Ã¤" where you expect "ä" is what you would get if the file was stored in UTF-8 encoding and then read as Latin1. So I suspect you need to declare the encoding of the files before reading them. You can do this as follows:

con <- file("foo.txt", encoding = "UTF-8", open = "r")  # declare the file's encoding
txt <- readLines(con)                                   # keep the lines that were read
close(con)
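
To check that diagnosis on your own machine, here is a minimal sketch. It assumes a session in a UTF-8 or Latin1 locale; the temporary file and the names bytes, wrong, tmp and txt are only for illustration and are not from your data.

```r
# The two bytes that UTF-8 uses for "ä" (U+00E4).
bytes <- as.raw(c(0xc3, 0xa4))

# Misread as Latin-1, each byte becomes its own character: "Ã" and "¤".
wrong <- iconv(rawToChar(bytes), from = "latin1", to = "UTF-8")

# Write a tiny UTF-8 file ("März" plus a newline) to a temporary path.
tmp <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x4d, 0xc3, 0xa4, 0x72, 0x7a, 0x0a)), tmp)

# Read it back through a connection that declares the encoding.
con <- file(tmp, encoding = "UTF-8", open = "r")
txt <- readLines(con)
close(con)
unlink(tmp)

nchar(txt)  # 4 characters when the text was decoded correctly
```

If nchar(txt) reports 5 instead, the file was probably still read byte by byte as Latin1.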

By default, R assumes the encoding of files matches the default encoding on your system.

Until I upgraded to version 2.10.1 (from 2.8.0) this worked perfectly for all characters. Now it works for all characters but "Ü".

temp1<-gsub("Ãoe","Ue",temp1)

You might want to try perl=TRUE in the gsub() call; it seems to handle strange characters in regular expressions better than the default TRE library does.
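
Once the text is read with the right encoding, the replacement itself might look like the sketch below. Since the patterns are literal strings, I have used fixed=TRUE in the loop, which is my own substitution and bypasses the regular expression engine entirely; perl=TRUE is shown on the last line for comparison. The names umlauts, ascii, transliterate and words are hypothetical, and the \u escapes are only there so the code does not depend on the encoding of the script file.

```r
# German special characters and their usual ASCII transcriptions.
umlauts <- c("\u00e4", "\u00f6", "\u00fc", "\u00c4", "\u00d6", "\u00dc", "\u00df")
ascii   <- c("ae", "oe", "ue", "Ae", "Oe", "Ue", "ss")

# Replace each character in turn, matching literally rather than as a regex.
transliterate <- function(x) {
  for (i in seq_along(umlauts)) {
    x <- gsub(umlauts[i], ascii[i], x, fixed = TRUE)
  }
  x
}

words <- c("\u00dcbung", "M\u00e4rz", "Stra\u00dfe")
transliterate(words)

# The same single replacement using the PCRE engine instead.
gsub("\u00dc", "Ue", words, perl = TRUE)
```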

Duncan Murdoch

This letter is displayed as "Ãoe" (as before), but R is no longer able to find this character. The problem seems to be linked to the "oe" part, since I could substitute for "Ã" without a problem. Strangely, if I extract the two characters with the substr command into a variable and then use that variable, I am able to substitute without a problem. Any ideas what I am missing?

Thanks,

Michael


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

