What you asked about is not just append but promotion in general. For
promotion, I think it should meet some requirements:
1. preserve semantics
2. same shape
3. round-trip conversion

For promotion from integer to double:
1. still a number, and the same number
2. same shape
3. demotes to the same number

Promotion from literal to wide using u: satisfies these requirements,
whereas using 7&u: would break them. Users have the best knowledge of the
domain/meaning/encoding of the literals they use, and should therefore be
responsible for doing the conversion themselves.

On Mon, 11 Jul 2016, Don Guinn wrote:
> Thank you, Raul, I appreciate your questions on my reasoning. That said, I
> am quite pleased with J when working with Unicode. The tools provided make
> it easy.
>
> > Can you give me an example where this would give a different result from:
> >
> > append=: dyad define
> >   if. 131074 = x +&(3!:0) y do. x ,&(7&u:) y else. x, y end.
> > )
> >
> > For that matter, is there some reason you would not want to use
> > ,&(7&u:) if you are mixing utf-16 and utf-8 characters?
> >
>
> Sorry for the delay in responding. This is how I was asking for the default
> to work. That is what I do when there is a possibility for UTF-8 to be in
> data. 7&u: is a pretty powerful verb.
>
> > > However, I feel that the current standard of converting
> > > with u: monadic should not be allowed at all. It should be an error
> > > period.
> >
> > Why is that?
> >
> > Is this because that is the only use you have? Is this because you
> > believe this would break no existing code? Or is this because you
> > believe that no one should ever use a 16 bit literal for non-unicode
> > data in J? (For example, when dealing with binary files representing
> > music, or for representing pixels?)
> >
>
> I feel that the default action for combining char with wide is that <7f
> data is not UTF-8 is no longer a good choice. Most of the time it is UTF-8.
> And I suspect that Unicode in the form of UTF-8 will grow. For that reason
> I felt that it should be the default action.
> However, if one should make a
> conscious decision as to how char maps to wide then there should be no
> default. Although browser data is really strange in how it supports
> Unicode. Fortunately most of that disappears before we see it in J
>
> > > In the current world one never really can predict when some data may
> > > appear with UTF-8 characters unexpectedly. This would force manual
> > > conversion insuring that the proper conversion from char to wide as
> > > required by the application is done. Otherwise testing with only ASCII
> > > char would not catch the possible error.
> >
> > I feel that you have not encountered enough problems with "almost
> > utf-8 data", or "utf-8 data mixed in with other binary data in a file"
> > if you are saying stuff like this.
> >
>
> True. Char and wide
> can be used for all sorts of things. Right now, wide being relatively new I
> thought that it would not be used for other things. But 16 bit audio files
> do fit nicely in wide to save space.
>
> > For that matter, if by "manual conversion" you mean using 7&u: then I
> > do not see that why this should be a problem.
> >
> > > It seems to me that automatic conversion from char to wide assume UTF-8
> > > is a proper choice now. It is possible that one could run into a need
> > > to leave the conversion as it is now, but where would that data come
> > > from?
> >
> > A file, most likely. Or a network stream.
> >
> > > And it would really be a pain do view given that J is so insistent to
> > > treat char as UTF-8 when displaying.
> >
> > Usually you convert such data to numbers (possibly hexadecimal) when
> > you want to inspect it. But you expect J to function in a transparent
> > and predictable fashion, to get there.
> >
> > > J automatically converts integer (64 bit) into float when it can cause a
> > > loss of accuracy and we accept that. How is this different?
> >
> > This conversion changes the shape of the data.
> >
>
> Yes it does.
> One of the major problems with dealing with UTF-8. Alignment
> problems, working with char with UTF-8 where the number of bytes do not
> agree with the number of characters is difficult. Converting to wide avoids
> these problems. When through simply convert back.
>
> > Thanks,
> >
> > --
> > Raul
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
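
P.S. The three requirements at the top of this message can be tried out in a J session. What follows is only a minimal sketch: the names s, w and d are made up for illustration, and the two bytes are chosen only because they happen to form a valid UTF-8 sequence (U+00E9). I would expect the shape and round-trip tests after u: to hold, and the shape test after 7&u: to fail, but please run it on your own J version rather than take that on trust.

```j
NB. Two bytes that also form the UTF-8 encoding of U+00E9 (illustrative data).
s =: 195 169 { a.

NB. Byte-wise promotion with monadic u: (each byte becomes one wide char).
w =: u: s
3!:0 w               NB. datatype code of the promoted value
($ s) -: $ w         NB. requirement 2: is the shape unchanged?
s -: a. {~ 3 u: w    NB. requirement 3: does demotion recover the same bytes?

NB. UTF-8 decoding with 7&u: (a multi-byte sequence becomes one wide char).
d =: 7 u: s
($ s) -: $ d         NB. is the shape unchanged here?
s -: 8 u: d          NB. re-encoding to UTF-8, for comparison with s
```

The demotion step goes through code points (3 u:) and indexes a. so that it stays a plain byte-for-byte mapping, with no encoding assumed. Whether 8&u: recovers the original bytes depends on the input being valid UTF-8 in the first place, which is exactly the knowledge only the user has.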