Re: [R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-06-01 Thread Hans-Jörg Bibiko
On 31.05.2008, at 00:11, Prof Brian Ripley wrote:
> On Fri, 30 May 2008, Duncan Murdoch wrote:
>> But I think with Brian Ripley's work over the last while, R for Windows actually handles utf-8 pretty well. (It might not guess at that encoding, but if you tell it that's what you're using...)
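To illustrate the remark quoted above about telling R which encoding is in use, here is a minimal sketch. It is not taken from the thread: the sample word and the temporary file are made up for illustration, and it assumes a reasonably recent R in which `readLines()` accepts an `encoding` argument.

```r
# Sketch only: the sample text and temp file are made up for illustration.
# Build a Cyrillic word from code points so this script stays pure ASCII.
code_points <- c(1087, 1088, 1080, 1074, 1077, 1090)  # U+043F U+0440 U+0438 U+0432 U+0435 U+0442
word <- intToUtf8(code_points)

# Write the raw UTF-8 bytes so nothing is converted through the locale on output.
path <- tempfile(fileext = ".txt")
writeBin(charToRaw(word), path)

# Declare the encoding explicitly instead of letting R assume the native one.
lines <- readLines(path, encoding = "UTF-8", warn = FALSE)

Encoding(lines)               # declared encoding of the string that was read
nchar(lines, type = "chars")  # length in characters
nchar(lines, type = "bytes")  # length in bytes
```

Note that `encoding = "UTF-8"` here only marks the strings as UTF-8; it does not re-encode them. Whether functions such as grep or strsplit then behave as hoped on a given Windows setup is exactly what the thread is discussing, so treat this as a starting point rather than a guarantee.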

[R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-05-30 Thread Stefan Th. Gries
Hi all
Four questions regarding Unicode. Three Windows questions. I am using
- a PC with Windows XP (Build 20600.xpsp080413-2111 (Service Pack 3));
- the following R version:
R.version
platform i386-pc-mingw32
arch i386
os mingw32
system i386, mingw32
status

Re: [R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-05-30 Thread Hans-Jörg Bibiko
Hi, to put it simply: Windows cannot handle utf-8 data. There is no utf-8 locale available. If your corpus contains only Russian data, perhaps with English glosses etc., you can try setting the language of Rgui.exe to Russian. Then at least you can use grep and strsplit, because they depend on the

Re: [R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-05-30 Thread Duncan Murdoch
On 5/30/2008 12:58 PM, Hans-Jörg Bibiko wrote:
> Hi, to put it simply. Windows cannot handle utf-8 data. There is no utf-8 locale available.
Code page 65001 is utf-8. Most text editors (including Notepad) include an option to save in the UTF-8 encoding. Some programs don't fully support

Re: [R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-05-30 Thread Hans-Joerg Bibiko
Quoting Duncan Murdoch:
> On 5/30/2008 12:58 PM, Hans-Jörg Bibiko wrote:
>> to put it simply. Windows cannot handle utf-8 data. There is no utf-8 locale available.
> Code page 65001 is utf-8. Most text editors (including Notepad) include an option to save in the UTF-8

Re: [R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-05-30 Thread Duncan Murdoch
On 5/30/2008 4:12 PM, Hans-Joerg Bibiko wrote:
> Quoting Duncan Murdoch:
>> On 5/30/2008 12:58 PM, Hans-Jörg Bibiko wrote:
>>> to put it simply. Windows cannot handle utf-8 data. There is no utf-8 locale available.
>> Code page 65001 is utf-8. Most text editors (including Notepad)

Re: [R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-05-30 Thread Prof Brian Ripley
On Fri, 30 May 2008, Duncan Murdoch wrote:
> On 5/30/2008 4:12 PM, Hans-Joerg Bibiko wrote:
>> Quoting Duncan Murdoch:
>>> On 5/30/2008 12:58 PM, Hans-Jörg Bibiko wrote:
>>>> to put it simply. Windows cannot handle utf-8 data. There is no utf-8 locale available.
>>> Code page 65001 is