I definitely second this. The low-level 1!:2 write should simply put out an uninterpreted stream of bytes. This is the simplest and fastest thing to do, and it is what you want 99% of the time. Any more complex transformations should be left to utility verbs or new primitives.
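
As a minimal sketch of what such utility verbs could look like, the definitions below wrap the library fwrite and fread with the 8&u: conversion Don mentions and its inverse, 7 u:. The names fwriteutf8 and freadutf8 are hypothetical, not part of the J library. The sketch assumes fwrite passes byte (literal) data through unchanged, and it does no error handling.

```j
NB. Hypothetical helper names; not part of the J library.
NB. x fwriteutf8 y : encode text x as UTF-8 bytes, then write them to file y.
fwriteutf8 =: 4 : '(8 u: x) fwrite y'

NB. freadutf8 y : read the raw bytes of file y, then decode them as UTF-8.
NB. No error handling: assumes the read succeeds.
freadutf8 =: 3 : '7 u: fread y'

NB. Usage, with the string from the session quoted below:
l =: (u: 16b2211),' 1 2 3'
l fwriteutf8 'test.txt'
3 u: fread 'test.txt'      NB. should match 3 u: 8 u: l if the bytes are written as-is
```

Keeping the conversion in named verbs like these leaves 1!:2 itself a plain byte writer, so anyone who really wants a DBCS file can still write one.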

On Mon, Mar 25, 2013 at 2:00 PM, Don Guinn <[email protected]> wrote:

> Looks like fwrite (1!:2) has been modified after J6 to convert unicode
> (DBCS) text to UTF-8. It doesn't do it quite right. If it encounters a
> character that needs to be converted to UTF-8 it does so properly; however,
> ASCII characters (128{.a.) are padded to two characters with a zero byte.
> The ASCII characters should be written out without padding. Or the
> non-ASCII characters should be written out as-is like in J6.
>
> This makes a confusing mess as fread does not convert the UTF-8 characters
> automatically. And it shouldn't as it should be able to read any file type
> where bytes may look like UTF-8 but are not.
>
> fwrite should not attempt to convert unicode to UTF-8 as it writes as one
> may really want to create a DBCS file. unicode text can still be written
> out as UTF-8 if the user so chooses by simply applying 8&u: before writing.
>
> If it is felt that people should be able to automatically convert between
> unicode and UTF-8 when reading and writing files then there should be new
> read and write options added to the file conjunction leaving the old ones
> alone.
>
> This fails in J8 64 bit and J7 64 bit under Windows 7. Have not tried 32
> bit.
>
> JVERSION
> Engine: j701/2011-01-10/11:25
> Library: 8.01.008
> Qt IDE: 1.0.3
> Platform: Win 64
> Installer: j801 beta install
> InstallPath: c:/j/j64-801a
> ]l=:(u:16b2211),' 1 2 3' NB. 16b2211 is Unicode sigma.
> ∑ 1 2 3
> $l
> 7
> l fwrite 'test.txt'
> 7
> fread 'test.txt'
> ┬" 1 2 3
> 3 u: fread 'test.txt'
> 17 34 32 0 49 0 32 0 50 0 32 0 51 0
> 3 u: l
> 8721 32 49 32 50 32 51
> 3 u: 8 u: l
> 226 136 145 32 49 32 50 32 51
>
> By the way. I copied and pasted the above from the term window where all
> input lines were indented 3 spaces. For some reason the indention is lost
> in the paste after the first line.
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm

--
John D. Baker
[email protected]
