From Wikipedia:

Number    Bits for    First       Last        Byte 1    Byte 2    Byte 3    Byte 4
of bytes  code point  code point  code point
1          7          U+0000      U+007F      0xxxxxxx
2         11          U+0080      U+07FF      110xxxxx  10xxxxxx
3         16          U+0800      U+FFFF      1110xxxx  10xxxxxx  10xxxxxx
4         21          U+10000     U+10FFFF    11110xxx  10xxxxxx  10xxxxxx  10xxxxxx
So the UTF-8 byte string can be cut with dyadic <;.1, using as the mask the OR (+./) of the byte 1 ranges of the 4 cases.

On Sun, Sep 15, 2019, 8:45 AM bill lam wrote:

> Yes, this situation needs handling. The value of the leading byte of a
> multi-byte UTF-8 sequence tells its length: 2, 3 or 4 bytes.
>
> On Sun, Sep 15, 2019, 8:39 AM Raul Miller wrote:
>
>> And if there are several adjacent characters with representations in that
>> range?
>>
>> Thanks,
>>
>> --
>> Raul
>>
>> On Saturday, September 14, 2019, bill lam wrote:
>>
>> > That's the easiest way.
>> >
>> > A harder way, without conversion to unicode, is to keep bytes above 127
>> > in sequence during reversing, but beware of contiguous multi-byte
>> > characters. This requires studying the UTF-8 encoding RFC or whatnot.
>> >
>> > On Fri, Sep 13, 2019, 6:30 AM Henry Rich wrote:
>> >
>> > > Raul said what I was thinking.
>> > >
>> > > The point is, a literal string is UTF-8 (i.e. bytes), NOT unicode.
>> > > When you reverse the bytes in a UTF-8 string you get garbage. You
>> > > need to convert to Unicode first, as Raul showed.
>> > >
>> > > Henry Rich
>> > >
>> > > On 9/12/2019 4:54 PM, Mark Linton wrote:
>> > > > I’ve used
>> > > > a.
>> > > > and
>> > > > u: i. 255
>> > > > to create a list of characters (ASCII alphabet).
>> > > >
>> > > > Is there a way to change character sets, code pages or fonts to
>> > > > change the letters that are displayed when a J sentence is executed?
>> > > >
>> > > > Would it be a setting in the terminal or a verb in J?
>> > > >
>> > > > I’ve also noticed that |. has problems reversing two-byte UTF-8
>> > > > characters. What is the preferred method for reversing such
>> > > > characters in a string?
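
The cut described at the top of this message might look roughly like the sketch below. The verb name revutf8 is made up for illustration, and the sketch assumes y is a well-formed UTF-8 literal that starts on a character boundary. Rather than OR-ing the four byte 1 ranges separately, it uses the equivalent test that a byte is not a continuation byte (10xxxxxx, i.e. 128 to 191).

```j
NB. Reverse a UTF-8 byte string character by character, without
NB. converting to unicode. revutf8 is a hypothetical name; y is
NB. assumed to be well-formed UTF-8.
revutf8 =: 3 : 0
b    =. a. i. y                  NB. byte values 0..255
lead =. (b < 128) +. b >: 192    NB. 1 at each byte that starts a character
; |. lead <;.1 y                 NB. cut at leading bytes, reverse boxes, raze
)

s =: 104 195 169 108 108 111 { a.   NB. bytes of 'héllo'; é is 195 169
a. i. revutf8 s                     NB. 111 108 108 195 169 104, i.e. 'olléh'
```

Because each box holds one whole character, adjacent multi-byte characters are handled the same way as isolated ones: only the order of the boxes is reversed, never the bytes inside a box.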
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
