Oleg Kobchenko wrote:
> --- Chris Burke <[EMAIL PROTECTED]> wrote:
> 
>> Oleg Kobchenko wrote:
>>> Despite the multitude of function selectors in u: verb,
>>> it's not clear how it is possible to convert from UTF-8
>>> to wchar (Unicode) in one step.
>>>
>>> 7 u: converts to EITHER char OR wchar.
>>>
>>> But it seems that a fairly common use is to supply wchar (Unicode)
>>> argument to DLL call or to obtain a binary form of wchar
>>> (which per se is not clear how to get: byte array of wchars),
>>> where the argument is UTF-8, but could be ASCII, like word "Test".
>>>
>>> But for word "Test", 7 u: does not work:
>>>
>>>    datatype 7 u: 'Test'   NB. Western
>>> literal
>>>    datatype 7 u: 'Тест'   NB. Cyrillic
>>> unicode
>>>
>>> So the workaround is 4 u: 3 u: 7 u: (three verbs)
>>>
>>>    datatype 4 u: 3 u: 7 u: 'Test'
>>> unicode
>>>
>>>    4 u: 3 u: 7 u: 'Тест'
>>> Тест
>>>
>>> Is this EITHER / OR in 7 u: really needed?
>>>
>>> Why not just always yield wchar?
>> It is useful because the result of 7 u: is in its simplest form, i.e.
>> the conversion to 2 byte unicode takes place only if necessary. Bill's
>> solution is recommended.
> 
> Monadic u: is a dumb appending of a zero byte to each char,
> which does not work with UTF-8.
> 
> As a result we have an interface that is too smart
> and makes complex decisions in its simplest form,
> and for a simple direct transformation it requires
> a complex application of 3 verbs.
> 
> The default behavior of u: is unpredictable:
> depending on incoming data, it can return either
> of two possible datatypes.
> 
> It's OK to use the workaround, but
> it's not easily evident where such flexibility
> would be useful.

The problem is that the u: family was already defined to map between
8-bit char and 16-bit wchar. Had utf8 been adopted earlier, the 8-bit
char probably would not be there.

Given that 7 u: does what is needed to convert utf8 to unicode, it was
thought unnecessary to provide another X u: to do this. Perhaps there
should be another stdlib verb for it?
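
Such a verb could be sketched as below. This is only an illustration:
the name toucs is hypothetical (it is not an existing stdlib name), and
the definition assumes nothing beyond the 4 u: 3 u: 7 u: workaround
quoted above.

```j
NB. toucs: hypothetical name, not an existing stdlib verb.
NB. Composes the three steps of the quoted workaround so that the
NB. result is always wchar, whether the argument is ASCII or UTF-8.
toucs =: 4&u: @: (3&u:) @: (7&u:)

datatype toucs 'Test'   NB. expected: unicode, as in the quoted session
```

With something like this, the caller would no longer need to care
whether 7 u: happened to return char or wchar for a given argument.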



----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
