Looks like Swift 5.0 moved from utf-16 to utf-8 as the preferred encoding for its strings in almost all cases. This link provides some insight into their decision:
https://swift.org/blog/utf8-string/

Cheers, bob

> On Sep 19, 2019, at 12:46 AM, Marshall Lochbaum <[email protected]> wrote:
>
> Dyalog stores code points directly using 1-, 2-, or 4-byte unsigned
> integers. The type for a given array has to fit all the characters, and
> we try to choose the smallest possible. I'm not sure how good our
> surrogate pair handling is, but I think they are supposed to be combined
> into single characters on input.
>
> Marshall
>
> On Thu, Sep 19, 2019 at 12:18:13AM +0100, Ian Clark wrote:
>> Well done, Bob.
>>
>> I've read the "differences between revisions" and that's a mean task you've
>> completed.
>>
>> I have to confess I find the new stuff totally baffling. I wrote the
>> original article 2 years ago and I still have the bruises on my forehead :)
>> I was ignorant of how J901 supports the newer code pages until I read it on
>> this thread.
>>
>> Some helpful(?) questions:
>> ++ How does Dyalog APL do it?
>> ++ How does Swift 5.1 do it?
>> ++ How does Python 3.7 do it?
>> ++ How does Javascript do it?
>> …All are languages with serious pretensions to manipulating text containing
>> UCPs. Maybe over 90% of application code being written in these languages
>> does just that, and mostly on webpages. The writer of the Swift manuals
>> published by iBooks delights in showing emojis between quotes in code
>> samples. Smart stuff – but only a GUI coder or indie publisher would know
>> it.
>>
>> In my day-to-day programming I have little or no use for any greater
>> precision than utf-8 and wide characters (…are we still calling them that?
>> – how about mega-wide and giga-wide for the new precisions?) Just about the
>> only use I'd have for the newer UCPs is to embed them in a PDF document via
>> copy-paste. Nowadays that's more likely to be a layman's review blog than a
>> learned paper. In which case I'd be at the mercy of my WP vendor to get it
>> right when coding the copy/paste.
>>
>> On past form, the omens are not good. From 1999 to the present day, as an
>> indie publisher of books with fancy fonts, I watched Microsoft and Adobe
>> completely foul-up the introduction of utf-8 to their products, notably
>> export to PDF. Assuming it won't take them another 20 years to migrate to
>> utf-32, I guess I can look forward to running sequential machines on emojis
>> in my care home.
>>
>> Ian
>>
>> On Wed, 18 Sep 2019 at 20:45, 'robert therriault' via Programming <[email protected]> wrote:
>>
>>> Hi Henry, Bill and Ian
>>>
>>> I have edited the wiki for the UCP page.
>>>
>>> The synopsis is that I included some information on how literals and utf-8
>>> are related and a section on surrogate pairs. I hope I got most of this
>>> right, but if I didn't please make the necessary changes and/or correct me.
>>>
>>> Ian, I hope that I was able to retain the spirit of what you established
>>> with your excellent foundation.
>>>
>>> https://code.jsoftware.com/wiki/Vocabulary/UnicodeCodePoint
>>>
>>> Cheers, bob
>>>
>>>> On Sep 13, 2019, at 10:59 AM, Henry Rich <[email protected]> wrote:
>>>>
>>>> Detail is great, but put it towards the end of the page if possible.
>>>
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
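
P.S. To make the storage scheme Marshall describes above a bit more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not Dyalog's actual implementation: the function names (combine_surrogates, smallest_width, pack) are made up for this example, and the little-endian byte order is an arbitrary choice.

```python
def combine_surrogates(units):
    """Merge UTF-16 surrogate pairs in a list of integers into single code points.

    Unpaired surrogates are passed through unchanged (an assumption for this sketch).
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard UTF-16 decoding of a high/low surrogate pair.
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            out.append(u)
            i += 1
    return out


def smallest_width(code_points):
    """Return 1, 2 or 4: the smallest unsigned width (in bytes) that fits every code point."""
    top = max(code_points, default=0)
    if top <= 0xFF:
        return 1
    if top <= 0xFFFF:
        return 2
    return 4


def pack(code_points):
    """Store all code points at one shared width; returns (width, raw bytes)."""
    width = smallest_width(code_points)
    # Little-endian is an arbitrary choice for this illustration.
    data = b"".join(cp.to_bytes(width, "little") for cp in code_points)
    return width, data


if __name__ == "__main__":
    for text in ["abc", "a€", "a😀"]:
        width, data = pack([ord(c) for c in text])
        print(repr(text), "width:", width, "bytes stored:", len(data))
```

The point of the sketch is that a single character outside the 1-byte range widens the whole array, which is the trade-off of storing code points directly instead of using a variable-length encoding like utf-8.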
