On Wed, Apr 12, 2017 at 3:37 PM, Tony Harminc <[email protected]> wrote:
> On 12 April 2017 at 15:22, Paul Gilmartin <[email protected]> wrote:
>
> > I see that:
> >     CONVERT UTF-8 TO UTF-16
> >     CONVERT UTF-8 TO UNICODE
> >     ...
> >     'B2A7'
> >     ...
> >     The one-, two-, three-, or four-byte UTF-8 characters of the second
> >     operand are converted to two-byte Unicode characters and placed
> >     at the first operand location.
> >
> > But Wikipedia, which is always right, tells me that UTF-16 is not a
> > two-byte representation but a variable-length representation.
> >
> > RCF in order?
>
> Probably. I don't know if the Unicode entities that make up surrogate pairs
> are legitimately called "characters". I think not.

I think not either, per http://unicode.org/faq/utf_bom.html

[quote]
Q: What are surrogates?

A: Surrogates are code points from two special ranges of Unicode values,
reserved for use as the leading, and trailing values of paired code units
in UTF-16. Leading, also called high, surrogates are from D800₁₆ to
DBFF₁₆, and trailing, or low, surrogates are from DC00₁₆ to DFFF₁₆. They
are called surrogates, since they do not represent characters directly,
but only as a pair.
[/quote]

> Tony H.

-- 
"Irrigation of the land with seawater desalinated by fusion power is
ancient. It's called 'rain'." -- Michael McClary, in alt.fusion

Maranatha! <><
John McKown
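
As a rough illustration of the surrogate ranges quoted from the FAQ, the sketch below uses Python's standard UTF-16 codec to list the 16-bit code units for a few sample characters. It shows only the encoding itself and says nothing about how the CONVERT UTF-8 TO UTF-16 instruction behaves; the helper names utf16_code_units and is_surrogate are made up for this example.

```python
# Illustrative sketch only: helper names are hypothetical, not from any manual.

def utf16_code_units(ch):
    """Return the 16-bit code units that encode `ch` in big-endian UTF-16."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_surrogate(unit):
    """True if `unit` lies in the surrogate range D800..DFFF (hex)."""
    return 0xD800 <= unit <= 0xDFFF


# One-, three-, and four-byte UTF-8 characters, respectively.
for ch in ("A", "\u20ac", "\U0001F600"):
    units = utf16_code_units(ch)
    print(
        f"U+{ord(ch):04X}: {len(ch.encode('utf-8'))} UTF-8 byte(s), "
        f"UTF-16 units {[f'{u:04X}' for u in units]}, "
        f"surrogates: {any(is_surrogate(u) for u in units)}"
    )
```

The first two samples each take a single 16-bit unit, while the four-byte UTF-8 sample takes two units, both inside the surrogate ranges, which is the sense in which UTF-16 is variable-length.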
