There is no change in the way J6/J7/J8 handle this. Try it in J6! What is happening is that you are writing a unicode datatype to file, and reading it back as characters, i.e.
   ]l=:(u:16b2211),' 1 2 3' NB. 16b2211 is Unicode sigma.
∑ 1 2 3
   l fwrite 'test.txt'
7
   l-:6 u:fread 'test.txt' NB. 6 u: converts chars to unicode
1
   A=: utf8 l
   A fwrite 'test.txt'
9
   A -: fread 'test.txt'
1

On Tue, Mar 26, 2013 at 3:28 AM, John Baker <[email protected]> wrote:

> I definitely second this. The low level 1!:2 write should simply put out
> an uninterpreted stream of bytes. This is the simplest and fastest thing to
> do and is usually what you want 99% of the time. Any more complex
> transformations should be left to utility verbs or new primitives.
>
>
> On Mon, Mar 25, 2013 at 2:00 PM, Don Guinn <[email protected]> wrote:
>
> > Looks like fwrite (1!:2) has been modified after J6 to convert unicode
> > (DBCS) text to UTF-8. It doesn't do it quite right. If it encounters a
> > character that needs to be converted to UTF-8 it does so properly; however,
> > ASCII characters (128{.a) are padded to two characters with a zero byte.
> > The ASCII characters should be written out without padding. Or the
> > non-ASCII characters should be written out as-is like in J6.
> >
> > This makes a confusing mess as fread does not convert the UTF-8 characters
> > automatically. And it shouldn't as it should be able to read any file type
> > where bytes may look like UTF-8 but are not.
> >
> > fwrite should not attempt to convert unicode to UTF-8 as it writes as one
> > may really want to create a DBCS file. unicode text can still be written
> > out as UTF-8 if the user so chooses by simply applying 8&u: before writing.
> >
> > If it is felt that people should be able to automatically convert between
> > unicode and UTF-8 when reading and writing files then there should be new
> > read and write options added to the file conjunction leaving the old ones
> > alone.
> >
> > This fails in J8 64 bit and J7 64 bit under Windows 7. Have not tried 32
> > bit.
> >
> > JVERSION
> > Engine: j701/2011-01-10/11:25
> > Library: 8.01.008
> > Qt IDE: 1.0.3
> > Platform: Win 64
> > Installer: j801 beta install
> > InstallPath: c:/j/j64-801a
> >
> > ]l=:(u:16b2211),' 1 2 3' NB. 16b2211 is Unicode sigma.
> > ∑ 1 2 3
> > $l
> > 7
> > l fwrite 'test.txt'
> > 7
> > fread 'test.txt'
> > ┬" 1 2 3
> > 3 u: fread 'test.txt'
> > 17 34 32 0 49 0 32 0 50 0 32 0 51 0
> > 3 u: l
> > 8721 32 49 32 50 32 51
> > 3 u: 8 u: l
> > 226 136 145 32 49 32 50 32 51
> >
> > By the way. I copied and pasted the above from the term window where all
> > input lines were indented 3 spaces. For some reason the indention is lost
> > in the paste after the first line.
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
>
> --
> John D. Baker
> [email protected]
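
The byte values quoted in this thread can also be checked outside J. The following is only a minimal sketch in Python (not J, and not part of the original exchange); the names `text`, `utf16le_bytes` and `utf8_bytes` are made up for illustration. It assumes the 2-byte unicode data is stored low byte first, which is what the reported bytes 17 34 for 16b2211 suggest. Under that assumption the "padded" file contents are simply the raw 2-byte characters, not a partial UTF-8 conversion.

```python
# Hypothetical check of the byte values reported in the thread.
# text is the same string as l in the J session: U+2211 followed by ' 1 2 3'.
text = "\u2211 1 2 3"

# Raw 2-byte characters, low byte first (assumption: little-endian storage).
utf16le_bytes = list(text.encode("utf-16-le"))

# Explicit UTF-8 conversion, the analogue of applying 8 u: before writing.
utf8_bytes = list(text.encode("utf-8"))

print(utf16le_bytes)  # compare with: 3 u: fread 'test.txt'
print(utf8_bytes)     # compare with: 3 u: 8 u: l

# Reading the raw bytes back as 2-byte characters recovers the original text,
# the analogue of l -: 6 u: fread 'test.txt'.
round_trip = bytes(utf16le_bytes).decode("utf-16-le")
print(round_trip == text)
```

The first list has 14 entries (two bytes per character) and the second has 9, which lines up with the 9 returned by A fwrite 'test.txt' above.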
