Re: [Jbeta] Definitive UTF-8 to Unicode conversion for u:

Chris Burke Tue, 06 Feb 2007 21:51:06 -0800

Oleg Kobchenko wrote:
> Despite the multitude of function selectors in the u: verb,
> it's not clear how it is possible to convert from UTF-8
> to wchar (Unicode) in one step.
> 
> 7 u: converts to EITHER char OR wchar.
> 
> But a fairly common use seems to be supplying a wchar (Unicode)
> argument to a DLL call, or obtaining the binary form of wchar
> (a byte array of wchars, which in itself is not obvious how to get),
> where the argument is UTF-8 but could be plain ASCII, like the word "Test".
> 
> But for the word "Test", 7 u: does not yield wchar:
> 
>    datatype 7 u: 'Test'   NB. Western
> literal
>    datatype 7 u: 'Тест'   NB. Cyrillic
> unicode
> 
> So the workaround is 4 u: 3 u: 7 u: (three verbs)
> 
>    datatype 4 u: 3 u: 7 u: 'Test'
> unicode
> 
>    4 u: 3 u: 7 u: 'Тест'
> Тест
> 
> Is this EITHER / OR in 7 u: really needed?
> 
> Why not just always yield wchar?


It is useful because the result of 7 u: is in its simplest form, i.e.
the conversion to 2-byte Unicode takes place only if necessary: an
argument that is pure ASCII stays char. Bill's solution is recommended.
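
For reference, the three-step workaround quoted above can be wrapped in
a single verb. This is only a sketch: the name toWchar is hypothetical
(it is not a standard library name), and it is not necessarily the same
as Bill's solution. It simply composes the three u: selectors from
Oleg's message, so the result is wchar whether or not the argument
contains non-ASCII characters.

   NB. toWchar is a hypothetical name, used here for illustration only
   NB. 7 u: UTF-8 -> char or wchar; 3 u: -> integers; 4 u: -> wchar
   toWchar=: 4&u:@(3&u:)@(7&u:)

   datatype toWchar 'Test'
unicode

The intermediate 3 u: step is what removes the EITHER / OR: it turns
whichever type 7 u: produced into integer code points, and 4 u: then
always builds wchar from those integers.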

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
