There is no change in the way J6/J7/J8 handle this. Try it in J6!

What is happening is that you are writing data of unicode datatype (two
bytes per character) to the file and reading it back as 1-byte characters, e.g.

   ]l=:(u:16b2211),' 1 2 3'        NB. 16b2211 is the Unicode summation sign (looks like sigma).

∑ 1 2 3


   l fwrite 'test.txt'

7


   l-:6 u:fread 'test.txt'             NB. 6 u: reads each pair of bytes as one unicode char

1


   A=: utf8 l


   A fwrite 'test.txt'

9


   A -: fread 'test.txt'

1
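
For anyone who wants to check the byte counts outside J, here is a small
Python sketch. It is my own illustration, not part of the J library, and it
assumes the unicode datatype is stored as two bytes per character with the low
byte first, which is what the 3 u: fread output quoted below suggests.

```python
# The same seven-character text used in the J session above.
text = "\u2211 1 2 3"  # U+2211 is N-ARY SUMMATION

# Assumption: two bytes per character, low byte first (UTF-16LE).
wide = text.encode("utf-16-le")
print(list(wide))  # [17, 34, 32, 0, 49, 0, 32, 0, 50, 0, 32, 0, 51, 0]

# UTF-8: three bytes for U+2211, one byte for each ASCII character.
utf8 = text.encode("utf-8")
print(list(utf8))  # [226, 136, 145, 32, 49, 32, 50, 32, 51]

# Either file content round-trips as long as it is decoded the way it was encoded.
assert wide.decode("utf-16-le") == text
assert utf8.decode("utf-8") == text
```

The 14 raw bytes only look padded when they are read back one byte per
character; the 9 UTF-8 bytes match the count fwrite reports for A above.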


On Tue, Mar 26, 2013 at 3:28 AM, John Baker <[email protected]> wrote:

> I definitely second this.  The low level 1!:2 write should simply put out
> an uninterpreted stream of bytes. This is the simplest and fastest thing to
> do and is usually what you want 99% of the time. Any more complex
> transformations should be left to utility verbs or new primitives.
>
>
> On Mon, Mar 25, 2013 at 2:00 PM, Don Guinn <[email protected]> wrote:
>
> > Looks like fwrite (1!:2) has been modified after J6 to convert unicode
> > (DBCS) text to UTF-8. It doesn't do it quite right. If it encounters a
> > character that needs to be converted to UTF-8 it does so properly; however,
> > ASCII characters (128{.a.) are padded to two characters with a zero byte.
> > The ASCII characters should be written out without padding. Or the
> > non-ASCII characters should be written out as-is like in J6.
> >
> > This makes a confusing mess as fread does not convert the UTF-8 characters
> > automatically. And it shouldn't as it should be able to read any file type
> > where bytes may look like UTF-8 but are not.
> >
> > fwrite should not attempt to convert unicode to UTF-8 as it writes, since
> > one may really want to create a DBCS file. Unicode text can still be written
> > out as UTF-8 if the user so chooses by simply applying 8&u: before writing.
> >
> > If it is felt that people should be able to automatically convert between
> > unicode and UTF-8 when reading and writing files then there should be new
> > read and write options added to the file conjunction leaving the old ones
> > alone.
> >
> > This fails in J8 64 bit and J7 64 bit under Windows 7. Have not tried 32
> > bit.
> >
> >    JVERSION
> >
> > Engine: j701/2011-01-10/11:25
> >
> > Library: 8.01.008
> >
> > Qt IDE: 1.0.3
> >
> > Platform: Win 64
> >
> > Installer: j801 beta install
> >
> > InstallPath: c:/j/j64-801a
> >
> >    ]l=:(u:16b2211),' 1 2 3' NB. 16b2211 is Unicode sigma.
> >
> > ∑ 1 2 3
> >
> > $l
> >
> > 7
> >
> > l fwrite 'test.txt'
> >
> > 7
> >
> > fread 'test.txt'
> >
> > ┬" 1 2 3
> >
> > 3 u: fread 'test.txt'
> >
> > 17 34 32 0 49 0 32 0 50 0 32 0 51 0
> >
> > 3 u: l
> >
> > 8721 32 49 32 50 32 51
> >
> > 3 u: 8 u: l
> >
> > 226 136 145 32 49 32 50 32 51
> >
> >
> > By the way. I copied and pasted the above from the term window where all
> > input lines were indented 3 spaces. For some reason the indention is lost
> > in the paste after the first line.
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
>
>
>
>
> --
> John D. Baker
> [email protected]
>
