It looks like fwrite (1!:2) was modified after J6 to convert unicode
(DBCS) text to UTF-8, but it doesn't do so quite right. A character that
needs converting to UTF-8 is converted properly; however, ASCII characters
(128{.a.) are each padded to two bytes with a zero byte. Either the ASCII
characters should be written out without padding, or the non-ASCII
characters should be written out as-is, as in J6.

This makes a confusing mess, because fread does not convert UTF-8
characters automatically. Nor should it: fread must be able to read any
type of file, including ones whose bytes look like UTF-8 but are not.

fwrite should not attempt to convert unicode to UTF-8 as it writes, since
one may really want to create a DBCS file. Unicode text can still be
written out as UTF-8, if the user so chooses, simply by applying 8&u:
before writing.
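
A minimal sketch of that explicit route, assuming the standard library's
fwrite and fread are loaded. The file name test8.txt is a scratch name
chosen for illustration only.

```j
NB. Sketch only: 'test8.txt' is a scratch file name chosen for illustration.
l =: (u: 16b2211) , ' 1 2 3'     NB. unicode text, 7 characters
(8 u: l) fwrite 'test8.txt'      NB. convert to UTF-8 explicitly, then write
3 u: fread 'test8.txt'           NB. inspect the bytes that reached the file
l -: 7 u: fread 'test8.txt'      NB. 7&u: converts the UTF-8 back to unicode
```

If fwrite leaves byte data untouched, the byte list should match the
result of 3 u: 8 u: l shown in the session at the end of this message.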

If it is felt that people should be able to convert automatically between
unicode and UTF-8 when reading and writing files, then new read and write
options should be added to the file conjunction, leaving the old ones
alone.
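
As a rough illustration of what opt-in conversion could look like, here is
a sketch written as library-level verbs rather than as new options on the
file conjunction. The names fwriteutf8 and freadutf8 and the file name
opt.txt are invented for this example and are not part of J.

```j
NB. Hypothetical opt-in helpers: fwriteutf8 and freadutf8 are invented
NB. names for illustration, not part of the J library.
fwriteutf8 =: 4 : '(8 u: x) fwrite y'   NB. x: text, y: file name
freadutf8  =: 3 : '7 u: fread y'        NB. assumes the read succeeds

NB. Usage: 'opt.txt' is a scratch file name.
((u: 16b2211) , ' 1 2 3') fwriteutf8 'opt.txt'
freadutf8 'opt.txt'
```

The plain fread and fwrite stay byte-for-byte in this arrangement, and
the conversion happens only when the caller asks for it.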

This fails in J8 64-bit and J7 64-bit under Windows 7. I have not tried
32-bit.

   JVERSION

Engine: j701/2011-01-10/11:25

Library: 8.01.008

Qt IDE: 1.0.3

Platform: Win 64

Installer: j801 beta install

InstallPath: c:/j/j64-801a

   ]l=:(u:16b2211),' 1 2 3' NB. 16b2211 is the Unicode summation sign.

∑ 1 2 3

$l

7

l fwrite 'test.txt'

7

fread 'test.txt'

┬" 1 2 3

3 u: fread 'test.txt'

17 34 32 0 49 0 32 0 50 0 32 0 51 0

3 u: l

8721 32 49 32 50 32 51

3 u: 8 u: l

226 136 145 32 49 32 50 32 51
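
As a cross-check on the byte dump above, the sketch below rebuilds the 14
bytes and pairs them up with 6&u:. It assumes a little-endian machine, and
the names raw and w are used only for this example. If the final line
returns 1, the file holds the characters of l as raw 2-byte values rather
than as UTF-8.

```j
NB. Sketch: rebuild the 14 bytes from the dump and pair them up.
NB. Assumes a little-endian machine (low byte first).
raw =: 17 34 32 0 49 0 32 0 50 0 32 0 51 0 { a.
w =: 6 u: raw                    NB. two bytes per unicode character
#w                               NB. number of characters after pairing
w -: (u: 16b2211) , ' 1 2 3'     NB. 1 if the bytes are l as raw 2-byte characters
```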


By the way, I copied and pasted the above from the term window, where all
input lines were indented 3 spaces. For some reason the indentation is lost
in the paste after the first line.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
