--- Chris Burke <[EMAIL PROTECTED]> wrote:
> Oleg Kobchenko wrote:
> > Despite the multitude of function selectors in u: verb,
> > it's not clear how it is possible to convert from UTF-8
> > to wchar (Unicode) in one step.
> >
> > 7 u: converts to EITHER char OR wchar.
> >
> > But it seems that a fairly common use is to supply wchar (Unicode)
> > argument to DLL call or to obtain a binary form of wchar
> > (which per se is not clear how to get: byte array of wchars),
> > where the argument is UTF-8, but could be ASCII, like word "Test".
> >
> > But for word "Test", 7 u: does not work:
> >
> >    datatype 7 u: 'Test' NB. Western
> > literal
> >    datatype 7 u: 'Тест' NB. Cyrillic
> > unicode
> >
> > So the workaround is 4 u: 3 u: 7 u: (three verbs)
> >
> >    datatype 4 u: 3 u: 7 u: 'Test'
> > unicode
> >
> >    4 u: 3 u: 7 u: 'Тест'
> > Тест
> >
> > Is this EITHER / OR in 7 u: really needed?
> >
> > Why not just always yield wchar?
>
> It is useful because the result of 7 u: is in its simplest form, i.e.
> the conversion to 2 byte unicode takes place only if necessary. Bill's
> solution is recommended.

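For reference, here is a minimal sketch of how the three-verb workaround quoted above could be wrapped in a single verb, so that the caller always gets wchar. The name towchar is hypothetical; it is not part of J or of anything proposed in this thread.

```j
NB. Hypothetical helper: always yield wchar (unicode) from a UTF-8
NB. argument, whether or not it happens to be pure ASCII.
NB. It only composes the three steps of the workaround: 4 u: 3 u: 7 u:
towchar =: 4&u: @: (3&u:) @: (7&u:)

datatype towchar 'Test'    NB. unicode, as in the session quoted above
```

This only packages the workaround; it does not address how to get a byte array of wchars.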
Monadic u: just appends a zero byte to each char, which does not work with UTF-8. As a result, we have an interface that is too smart: it makes complex decisions "in its simplest form", yet a simple, direct transformation requires a complex application of three verbs. The behavior of 7 u: is unpredictable: depending on the incoming data, it can return either of two datatypes.

Using the workaround is OK, but it is not evident where such flexibility would be useful.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
