I have exactly the same problem as you with my Icelandic letters.

As an example, I create a file in Notepad containing "þæö". It can be stored using a codepage or in Unicode. As it happens, Notepad lets you save the file as Unicode, Unicode Big Endian and UTF-8, as well as ANSI (using codepages). With ANSI and codepages there are a number of places where "þæö" and the rest of the Icelandic characters can be.

Let me create a file with only "þæö" in it and save it with each of the options Notepad allows:

   require'files'
   a=:fread'c:\temp\thorn.txt'
   a. i. a
254 230 246
   'þæö'fwrite'c:\temp\thornuni.txt'
6
   b=:fread'c:\temp\thornuni.txt'
   a. i. b
195 190 195 166 195 182
   c=:fread'c:\temp\thornuninote.txt'
   a. i. c
255 254 254 0 230 0 246 0
   d=:fread'c:\temp\thornunibignote.txt'
   a. i. d
254 255 0 254 0 230 0 246
   e=:fread'c:\temp\thornutfnote.txt'
   a. i. e
239 187 191 195 190 195 166 195 182

In Notepad they all look the same. In J only b and e look the same, but their a. i. displays are not the same. b was created from J, as you saw above; the rest come from the different encodings written by Notepad.

   a
   b
þæö
   c
   d
   e
þæö

I am sure you may be even more confused by the above.

How this issue is to be resolved - wellllll.....

2006/10/12, Ronan Reilly <[EMAIL PROTECTED]>:
Thanks Chris, Björn and Bill. That clarifies things.

However, I'm finding that the text strings I read from the file are not getting converted by applying utf8. It seems that they have to be datatype unicode for this to work, but they are read in and stored as literals. Is there any way of coercing datatypes in J to get around this?

Thanks again,
Ronan

PS: I'm running J on a Mac, if that's relevant.

On 11/10/2006 23:59, "Chris Burke" <[EMAIL PROTECTED]> wrote:

> Ronan Reilly wrote:
>> I'm trying to display some German text using plot (J601). The text has been
>> read in from a file using fread, decoded using ". , and stored in a table.
>>
>> When I extract a word with an umlaut from the table (e.g., Aktualität),
>> assign it to W1, and plot it using
>>
>>    pd 'reset;text 0 _1x ',W1,';show'
>>
>> the letter with an umlaut does not display correctly.
>>
>> However, if I evaluate W1 in the jwd, like so:
>>
>>    W1
>> Aktualität
>>
>> and then edit and evaluate it like so
>>
>>    W2 =: Aktualität
>>
>> and then display W2 using
>>
>>    pd 'reset;text 0 _1x ',W2,';show'
>>
>> W2 displays correctly.
>>
>> Also
>>
>>    $W1
>> 10
>>    $W2
>> 11
>>
>> What is going on here? How can I display the unicode characters using plot?
>
> In general, J assumes incoming text is in utf8 format. J also supports a
> "unicode" data type, which is 2-byte unicode, see the help for u: .
>
> Text as either utf8 or unicode will display correctly in the session,
> but only utf8 will work in plot.
>
> In this example, W2 is in utf8 format, and W1 in 2-byte unicode. You
> need to convert W1 to utf8, using the utf8 verb.
>
> Here is what is happening:
>
>    #W2=: 'Aktualität'
> 11
>    #W1=: ucp W2
> 10
>
>    W2 -: utf8 W1
>
>    datatype W2
> literal
>    datatype W1
> unicode
>
>    a.i.W2
> 65 107 116 117 97 108 105 116 195 164 116
>    a.i.W1
> 65 107 116 117 97 108 105 116 228 116
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>

--
Professor Ronan Reilly
Head of Department
Department of Computer Science
NUI Maynooth
Maynooth
Co. Kildare
IRELAND

t: +353-1-7083847
e: [EMAIL PROTECTED]
w: http://www.cs.nuim.ie; http://cortex.cs.nuim.ie
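
For anyone who wants to cross-check the byte values in this thread outside J, here is a minimal Python sketch. It rebuilds the five byte sequences that the a. i. displays show for "þæö", then converts each one to plain UTF-8 by looking for a byte order mark first, trying strict UTF-8 second, and falling back to a single-byte codepage last. Several things here are assumptions rather than anything stated in the thread: that Notepad's ANSI option used a Latin-1 compatible codepage, that strings read in as literals with one byte per accented letter (such as a 10-byte "Aktualität") are in such a codepage, and the helper names sniff_and_decode and to_utf8, which are hypothetical Python functions and not J's utf8 verb.

```python
import codecs

TEXT = "þæö"

# Byte sequences keyed by the J variable names used in the thread.
# Assumption: Notepad's "ANSI" option wrote a Latin-1 compatible codepage.
SAMPLES = {
    "a": TEXT.encode("latin-1"),                            # ANSI
    "b": TEXT.encode("utf-8"),                              # UTF-8 without BOM
    "c": codecs.BOM_UTF16_LE + TEXT.encode("utf-16-le"),    # "Unicode"
    "d": codecs.BOM_UTF16_BE + TEXT.encode("utf-16-be"),    # "Unicode big endian"
    "e": TEXT.encode("utf-8-sig"),                          # UTF-8 with BOM
}


def sniff_and_decode(raw, fallback="latin-1"):
    """Decode bytes to text: BOM first, then strict UTF-8, then a codepage.

    Hypothetical helper. The fallback codepage is a guess, since a
    single-byte file carries no marker saying which codepage wrote it.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


def to_utf8(raw, fallback="latin-1"):
    """Return the same text as UTF-8 bytes without a BOM."""
    return sniff_and_decode(raw, fallback).encode("utf-8")


if __name__ == "__main__":
    for name, raw in SAMPLES.items():
        print(name, list(raw), "->", list(to_utf8(raw)))
```

All five inputs come out as the byte values shown for b. The BOM check is only a heuristic: a codepage file that happens to begin with the bytes 254 255 or 255 254 would be mistaken for 2-byte Unicode, so it is safer to pass the encoding explicitly whenever it is known.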

--
Björn Helgason, Engineer
Fugl&Fiskur ehf, Þerneyjarsund 23, Box 127
801 Grímsnes
e-mail: [EMAIL PROTECTED]
Skype: gosiminn, gsm: +3546985532
Landscaping and ornamental garden construction, excavator service
http://groups.google.com/group/J-Programming

Technical skill handles the complex; creativity is the master of simplicity.
A good teacher can step on toes without taking the shine off the shoes.

With a light heart, today will be even better than yesterday.
