Oleg Kobchenko wrote:
> Despite the multitude of function selectors in u: verb,
> it's not clear how it is possible to convert from UTF-8
> to wchar (Unicode) in one step.
>
> 7 u: converts to EITHER char OR wchar.
>
> But it seems that a fairly common use is to supply wchar (Unicode)
> argument to DLL call or to obtain a binary form of wchar
> (which per se is not clear how to get: byte array of wchars),
> where the argument is UTF-8, but could be ASCII, like word "Test".
>
> But for word "Test", 7 u: does not work:
>
>    datatype 7 u: 'Test'  NB. Western
> literal
>    datatype 7 u: 'Тест'  NB. Cyrillic
> unicode
>
> So the workaround is 4 u: 3 u: 7 u: (three verbs)
>
>    datatype 4 u: 3 u: 7 u: 'Test'
> unicode
>
>    4 u: 3 u: 7 u: 'Тест'
> Тест
>
> Is this EITHER / OR in 7 u: really needed?
>
> Why not just always yield wchar?

The either/or behavior is useful because 7 u: returns its result in
the simplest form: the conversion to 2-byte unicode takes place only
if necessary, i.e. only when the text contains non-ASCII characters.
Bill's solution is recommended.
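
For readers without the earlier messages at hand, here is a minimal
sketch of one way to always get wchar in a single verb. It is not
necessarily the solution referred to above. It assumes that monadic u:
widens a literal argument to 2-byte characters and leaves a unicode
argument unchanged. The name towchar is hypothetical; the standard
library's uucp appears to be defined the same way.

```j
NB. towchar is a hypothetical name, used here for illustration only.
NB. 7&u: decodes UTF-8 (pure ASCII stays literal); monadic u: then
NB. widens a literal result and leaves a unicode result unchanged.
towchar=: u:@(7&u:)

datatype towchar 'Test'    NB. ASCII input
datatype towchar 'Тест'    NB. UTF-8 Cyrillic input
datatype 7 u: 'Test'       NB. for comparison: plain 7 u: stays literal
```

Under these assumptions the first two lines should both report
unicode, so the result can be passed on as wchar without checking
whether the input happened to be pure ASCII.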
