Hi,

I usually do not give a second thought to accented vowels, and R handles
everything fine thanks to UTF-8 being used in my R scripts. But today I have a
problem: accented vowels do not behave properly when they have been imported
into R using list.files().

Maybe this is because OS X (I'm using 10.6.8) still uses MacRoman for file
names, although the names appear to have been read correctly into R.

An example is better than words:

sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     


This does not cause a problem:
a = c("1_MO2 crevettes po2crit.Rda", "1_MO2 soles Sète sda.Rda",
      "1_MO2 turbots po2crit.Rda"); a
[1] "1_MO2 crevettes po2crit.Rda" "1_MO2 soles Sète sda.Rda"
[3] "1_MO2 turbots po2crit.Rda"

a2 = gsub(" Sète", "S", a); a2
[1] "1_MO2 crevettes po2crit.Rda" "1_MO2 solesS sda.Rda"
[3] "1_MO2 turbots po2crit.Rda"


But if, instead of creating the vector within the R script, I read it as a
series of file names, the substitution does not work. I am sorry that I cannot
make this a fully reproducible example, as it requires the 3 files to exist on
your computer, but you could create 3 dummy files with the same names in the
directory of your choice.

don = file.path("données/")
b = list.files(path = don, pattern = "1_MO2"); b
[1] "1_MO2 crevettes po2crit.Rda" "1_MO2 soles Sète sda.Rda"
[3] "1_MO2 turbots po2crit.Rda"

b2 = gsub(" Sète", "S", b); b2
[1] "1_MO2 crevettes po2crit.Rda" "1_MO2 soles Sète sda.Rda"
[3] "1_MO2 turbots po2crit.Rda"
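
One way to check whether the two vectors really hold the same characters is to
compare their code points rather than their printed form. The sketch below
rests on an assumption that is not confirmed anywhere above: that the file
system returns the name with "è" stored as "e" followed by a combining grave
accent (U+0300), while the script contains the single character U+00E8. The
strings `typed` and `listed` are hypothetical stand-ins built with \u escapes,
so no files are needed to run it.

```r
# Hypothetical stand-ins; the \u escapes make the difference explicit.
typed  <- "1_MO2 soles S\u00e8te sda.Rda"   # precomposed è (U+00E8)
listed <- "1_MO2 soles Se\u0300te sda.Rda"  # e + combining grave (U+0300), assumed

# Both may print alike in a UTF-8 terminal, yet they are different strings.
same_string <- identical(typed, listed)

# The code point counts differ by one: the combining accent is its own code point.
n_typed  <- length(utf8ToInt(typed))
n_listed <- length(utf8ToInt(listed))

# A pattern written with the precomposed è leaves the decomposed name unchanged.
unchanged <- identical(gsub(" S\u00e8te", "S", listed), listed)
```

Running utf8ToInt() on the element of b that contains the accent would show
whether the value 768 (U+0300) is actually present in the name returned by
list.files().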

I am puzzled and also "stuck". For now I'll modify the file name, but I need to 
be able to handle such names at some point.
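
A possible stopgap, under the unconfirmed assumption that the names coming
from the file system store "è" as "e" plus a combining grave accent (U+0300)
instead of the single character U+00E8, might be a pattern that accepts both
spellings. The helper name `strip_sete` and the two test strings are
hypothetical and serve only as illustration.

```r
# Hypothetical helper: replaces " Sète" whether è is precomposed or decomposed.
# Assumes decomposed names use e + U+0300; other accented letters would need
# their own alternatives in the pattern.
strip_sete <- function(x) {
  gsub(" S(\u00e8|e\u0300)te", "S", x)
}

typed  <- "1_MO2 soles S\u00e8te sda.Rda"   # as written in a script
listed <- "1_MO2 soles Se\u0300te sda.Rda"  # as assumed to come from list.files()
fixed  <- strip_sete(c(typed, listed))
```

This only covers the one accented letter in this file name; it is not a
general answer for arbitrary accented names.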

Any advice?

thanks in advance,

Denis

_______________________________________________
R-SIG-Mac mailing list
R-SIG-Mac@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mac
