The default conversion of literal to unicode (char to wchar) does not work
as one would expect. Simply setting the high-order byte to zero is like
converting an integer to floating point by copying the bits as is.
J favors representing Unicode as U8, not unicode (wchar). Not a problem.
And normally, if a literal contains characters in the range _128{.a. they
represent U8 characters. Why not have the default conversion of literal to
unicode treat the literal as if it might be U8? That is, make the default of
monadic u: be 7&u: when applied to a literal, and when concatenating
literal and unicode, assume that the literal is U8.
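To make the difference concrete, here is a small session sketch that applies
both conversions to the U8 bytes of a single character. The name s is only
for illustration; the results are what I would expect from 2&u: widening
each byte and 7&u: decoding the bytes as U8.

   s=.226 136 145{a.   NB. the three U8 bytes of U+2211 (illustrative name)
   3 u: 2 u: s         NB. widen each byte to its own wchar
226 136 145
   3 u: 7 u: s         NB. decode the bytes as U8
8721
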
That monadic u: defaults to 2&u: is not really a problem, but
concatenation can easily produce wrong results.
   z=.u:16b2211
   3!:0 z
131072
   3!:0 ":z
2
   z,":z
∑â
   3 u: z,":z
8721 226 136 145
   3 u: z,7 u: ":z
8721 8721
It seems to me that the last result is what one would expect from
concatenation.
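Until the defaults change, one workaround is to decode literals explicitly
before appending. A minimal sketch, assuming z as defined above; u8w and
cat8 are hypothetical names of my own, not part of J.

   u8w=: 7&u: ^: (2 = 3!:0)   NB. hypothetical: apply 7&u: only to a literal (type 2)
   cat8=: ,&u8w               NB. hypothetical: append, treating literal arguments as U8
   3 u: z cat8 ":z
8721 8721

A unicode argument passes through u8w unchanged, so only the literal side
is reinterpreted.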
J8 is soon to be released. Although the way J handles U8 and unicode has
been around quite a while, this might be a good time to change the
defaults, if you agree that the change would be good.