Problem solved!

Many thanks for the help.

Ronan

On 12/10/2006 13:30, "Chris Burke" <[EMAIL PROTECTED]> wrote:

> Ronan Reilly wrote:
>> That clarifies things.  However, I'm finding that the text strings I read
>> from the file are not getting converted by applying utf8.  It seems that
>> they have to be datatype unicode for this to work, but they are read in and
>> stored as literals.  Is there any way of coercing datatypes in J to get
>> around this?   
> 
> Your data is probably in the old 8-bit ANSI encoding (also known as
> ISO-8859-1 or Latin-1). To convert it to UTF-8, use the verb fix below.
> 
> For example:
> 
> NB. a umlaut is 228 { a. in ISO-8859-1
> 
>    a=. 65 107 116 117 97 108 105 116 228 116 { a.
> 
> NB. a umlaut is 195 164 { a. in utf8
> 
>    a. i. fix a
> 65 107 116 117 97 108 105 116 195 164 116
> 
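> fix=: 3 : 0
> val=. a. i. y                          NB. character codes of the input
> msk=. 127 < val                        NB. mark bytes above 7-bit ASCII
> uni=. 192 128 +"1 [ 0 64 #: msk # val  NB. utf8 byte pair for each marked byte
> val=. val #~ 1 j. msk                  NB. add a 0 filler after each marked byte
> ndx=. I. 127 < val                     NB. positions of marked bytes after expansion
> dat=. a. {~ uni (ndx +/ 0 1) } val     NB. overwrite byte and filler; result is returned
> )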
> 
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
> 
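
The arithmetic inside fix can be checked by hand in a short J session. This is only a sketch: it assumes the fix verb from the quoted message has already been defined, and the sample input ('caf' followed by byte 233, which is e acute in ISO-8859-1) is an illustration that was not part of the original exchange. Every ISO-8859-1 byte from 128 to 255 becomes two UTF-8 bytes: 192 plus the quotient by 64, then 128 plus the remainder.

```j
NB. Sketch only: assumes fix from the quoted message is already defined.
NB. Split byte 228 (a umlaut in ISO-8859-1) into quotient and remainder by 64.
   0 64 #: 228
3 36
NB. Add the utf8 lead-byte and continuation-byte markers.
   192 128 + 0 64 #: 228
195 164
NB. Illustrative input (an assumption): 'caf' followed by byte 233.
   a. i. fix 'caf' , 233 { a.
99 97 102 195 169
```

Bytes of 127 or below pass through unchanged, so plain ASCII text is left as it is.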

--
Professor Ronan Reilly
Head of Department
Department of Computer Science
NUI Maynooth
Maynooth
Co. Kildare
IRELAND

t: +353-1-7083847
e: [EMAIL PROTECTED]
w: http://www.cs.nuim.ie; http://cortex.cs.nuim.ie


