Oleg Kobchenko wrote:
> --- Chris Burke <[EMAIL PROTECTED]> wrote:
>
>> Oleg Kobchenko wrote:
>>> Despite the multitude of function selectors in u: verb,
>>> it's not clear how it is possible to convert from UTF-8
>>> to wchar (Unicode) in one step.
>>>
>>> 7 u: converts to EITHER char OR wchar.
>>>
>>> But it seems that a fairly common use is to supply wchar (Unicode)
>>> argument to DLL call or to obtain a binary form of wchar
>>> (which per se is not clear how to get: byte array of wchars),
>>> where the argument is UTF-8, but could be ASCII, like word "Test".
>>>
>>> But for word "Test", 7 u: does not work:
>>>
>>>    datatype 7 u: 'Test' NB. Western
>>> literal
>>>    datatype 7 u: 'Тест' NB. Cyrillic
>>> unicode
>>>
>>> So the workaround is 4 u: 3 u: 7 u: (three verbs)
>>>
>>>    datatype 4 u: 3 u: 7 u: 'Test'
>>> unicode
>>>
>>>    4 u: 3 u: 7 u: 'Тест'
>>> Тест
>>>
>>> Is this EITHER / OR in 7 u: really needed?
>>>
>>> Why not just always yield wchar?
>> It is useful because the result of 7 u: is in its simplest form, i.e.
>> the conversion to 2 byte unicode takes place only if necessary. Bill's
>> solution is recommended.
>
> Using monadic u: is dummy appending of zeros to each char,
> which does not work with UTF-8.
>
> As a result we have an interface that is too smart
> and makes complex decisions in its simplest form,
> and for a simple direct transformation it requires
> a complex application of 3 verbs.
>
> The default behavior of u: is unpredictable:
> depending on incoming data, it can return two
> possible datatypes.
>
> It's OK to use the workaround, but
> it's not easily evident where such flexibility
> would be useful.

The problem is that the u: family was already defined to map between
8-bit char and 16-bit wchar. Had UTF-8 been adopted earlier, the 8-bit
char probably would not be there. Given that 7 u: does what is needed
to convert UTF-8 to unicode, it was thought unnecessary to provide
another x u: to do this. Perhaps there should be another stdlib verb
for it?
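
As a rough idea of what such a verb might look like, here is a minimal
sketch that just wraps the three-step workaround quoted above. The name
towchar is made up for illustration and is not an existing stdlib verb.

```j
NB. Hypothetical helper (the name towchar is an assumption, not stdlib).
NB. Takes a UTF-8 (or plain ASCII) char argument and always yields wchar,
NB. using the workaround from the quoted message: 7 u: then 3 u: then 4 u:
towchar =: 3 : '4 u: 3 u: 7 u: y'

NB. Per the quoted session, both of these should report unicode:
datatype towchar 'Test'
datatype towchar 'Тест'
```

With something like this, a caller preparing a wchar argument for a DLL
call would not need to care whether the input happened to be pure ASCII.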
