Ugh -- good point. So string concatenation needs to be a library utility (perhaps homebrewed, because we're not very good at looking things up...).
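For illustration only, here is a rough sketch (in Python rather than J, since Python is one of the languages Ian asks about) of the pitfall Don describes in the quoted thread and of what such a utility might do. The names `widen` and `concat_text` are made up for this example and are not part of J or any library, and `widen` only mimics the "put a zero high-order byte in front" behaviour as Don describes it.

```python
def widen(data: bytes) -> str:
    """Zero-extend each byte to one code point (mimics the naive byte-to-wide step)."""
    # latin-1 maps byte value N to code point N, i.e. a zero high-order byte.
    return data.decode("latin-1")


def concat_text(left, right) -> str:
    """Hypothetical utility: decode byte operands as UTF-8 before joining."""
    def as_text(x):
        return x.decode("utf-8") if isinstance(x, bytes) else x
    return as_text(left) + as_text(right)


raw = "\u00e9".encode("utf-8")   # e-acute as UTF-8: two bytes, 0xC3 0xA9
wide = "\u00fc"                  # u-umlaut, already a wide character

naive = widen(raw) + wide        # each UTF-8 byte becomes its own character
fixed = concat_text(raw, wide)   # UTF-8 is decoded first

print(len(raw), len(naive), len(fixed))  # prints "2 3 2"
```

Note that the length of `fixed` (2) is not the sum of the argument lengths (2 + 1), which is exactly bill's objection: the shape of the result depends on the content of the arguments. That seems tolerable in a utility, less so in the primitive.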
Thanks,

--
Raul

On Thursday, September 19, 2019, bill lam <[email protected]> wrote:
> That will make J language become inconsistent, eg the shape of the result
> of concatenation will be unpredictable from the shape of left and right
> arguments alone because it will depend on the content data of arguments.
>
> On Fri, Sep 20, 2019, 12:06 AM Raul Miller <[email protected]> wrote:
>
> > This is an unpleasant issue, but I can imagine that we might
> > eventually need an incompatible change to the J engine for this.
> >
> > There are contexts (like pulling data out of foreign file formats)
> > where we want literals of different sizes which are not unicode
> > literals. But the typical case of concatenating literals involves
> > unicode. And unicode is a type of sequence. So it seems like it would
> > make sense for automatic coercion to use the unicode conversion
> > instead of what's essentially a numeric conversion.
> >
> > This won't be without problems, but it seems to be where we are heading.
> >
> > Thanks,
> >
> > --
> > Raul
> >
> > On Wednesday, September 18, 2019, Don Guinn <[email protected]> wrote:
> > > On the section on mixing text types - that is, byte concatenated with
> > > unicode. It should be mentioned that the conversion from byte to unicode
> > > simply puts a high-order byte of zeros in front of the byte. This works
> > > fine for ASCII characters but is incorrect for any utf-8 characters. One
> > > needs to be careful if utf-8 characters are in the literal. A similar
> > > problem can occur between unicode and unicode-32. If any utf-8 characters
> > > are in the literal the literal must be converted to unicode before
> > > concatenating.
> > >
> > > On Wed, Sep 18, 2019 at 5:18 PM Ian Clark <[email protected]> wrote:
> > >
> > > > Well done, Bob.
> > > >
> > > > I've read the "differences between revisions" and that's a mean task
> > > > you've completed.
> > > >
> > > > I have to confess I find the new stuff totally baffling. I wrote the
> > > > original article 2 years ago and I still have the bruises on my
> > > > forehead :)
> > > > I was ignorant of how J901 supports the newer code pages until I read
> > > > it on this thread.
> > > >
> > > > Some helpful(?) questions:
> > > > ++ How does Dyalog APL do it?
> > > > ++ How does Swift 5.1 do it?
> > > > ++ How does Python 3.7 do it?
> > > > ++ How does Javascript do it?
> > > > …All are languages with serious pretensions to manipulating text
> > > > containing UCPs. Maybe over 90% of application code being written in
> > > > these languages does just that, and mostly on webpages. The writer of
> > > > the Swift manuals published by iBooks delights in showing emojis
> > > > between quotes in code samples. Smart stuff – but only a GUI coder or
> > > > indie publisher would know it.
> > > >
> > > > In my day-to-day programming I have little or no use for any greater
> > > > precision than utf-8 and wide characters (…are we still calling them
> > > > that? – how about mega-wide and giga-wide for the new precisions?)
> > > > Just about the only use I'd have for the newer UCPs is to embed them
> > > > in a PDF document via copy-paste. Nowadays that's more likely to be a
> > > > layman's review blog than a learned paper. In which case I'd be at the
> > > > mercy of my WP vendor to get it right when coding the copy/paste.
> > > >
> > > > On past form, the omens are not good. From 1999 to the present day, as
> > > > an indie publisher of books with fancy fonts, I watched Microsoft and
> > > > Adobe completely foul-up the introduction of utf-8 to their products,
> > > > notably export to PDF. Assuming it won't take them another 20 years to
> > > > migrate to utf-32, I guess I can look forward to running sequential
> > > > machines on emojis in my care home.
> > > >
> > > > Ian
> > > >
> > > > On Wed, 18 Sep 2019 at 20:45, 'robert therriault' via Programming <
> > > > [email protected]> wrote:
> > > >
> > > > > Hi Henry, Bill and Ian
> > > > >
> > > > > I have edited the wiki for the UCP page.
> > > > >
> > > > > The synopsis is that I included some information on how literals and
> > > > > utf-8 are related and a section on surrogate pairs. I hope I got
> > > > > most of this right, but if I didn't please make the necessary
> > > > > changes and/or correct me.
> > > > >
> > > > > Ian, I hope that I was able to retain the spirit of what you
> > > > > established with your excellent foundation.
> > > > >
> > > > > https://code.jsoftware.com/wiki/Vocabulary/UnicodeCodePoint
> > > > >
> > > > > Cheers, bob
> > > > >
> > > > > > On Sep 13, 2019, at 10:59 AM, Henry Rich <[email protected]> wrote:
> > > > > >
> > > > > > Detail is great, but put it towards the end of the page if
> > > > > > possible.
> > > > >
> > > > > ----------------------------------------------------------------------
> > > > > For information about J forums see http://www.jsoftware.com/forums.htm
