On 2015-12-10 16:06, Mike Schwab wrote:
> https://en.wikipedia.org/wiki/UTF-8
> B'0.......' is an 8 bit ASCII character.
>
ITYM 7 bit.  (Well, maybe.)

> B'110.....' is a 16 bit UTF character.
>
(Or, perhaps, only 11 bits of Unicode.)

> B'1110....' is a 24 bit UTF character.
>
(Or, perhaps, only 16 bits of Unicode.)  Etc.

> B'11110...' is a 32 bit UTF character.
> B'111110..' could be a 40 bit UTF character (none established).
> B'1111110.' could be a 48 bit UTF character (none established).
> B'11111110' could be a 56 bit UTF character (none established).
> B'11111111' could be a 64 bit UTF character (none established).
> B'10......' is a continuation UTF character after a previous leading
> character.
> B'10000000' is a padding UTF character and should be removed.

-- gil
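To make the bit counting concrete, here is a rough Python sketch that classifies a byte by its leading bit pattern and adds up how many payload bits a whole sequence carries. The function names (classify_utf8_byte, sequence_payload_bits) are made up for illustration, and the sketch assumes the RFC 3629 form of UTF-8, which stops at 4-byte sequences. It looks only at prefix bits, so it does not reject overlong or out-of-range lead bytes such as X'C0' or X'F5'.

```python
def classify_utf8_byte(b: int) -> tuple[str, int]:
    """Return (kind, payload_bits) for one byte value in 0..255.

    Hypothetical helper: it inspects the prefix bits only and does not
    validate overlong forms or out-of-range lead bytes (0xC0, 0xC1, 0xF5+).
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    if b & 0x80 == 0x00:        # B'0.......'  one-byte (ASCII) character
        return ("ascii", 7)
    if b & 0xC0 == 0x80:        # B'10......'  continuation byte
        return ("continuation", 6)
    if b & 0xE0 == 0xC0:        # B'110.....'  lead byte of a 2-byte sequence
        return ("lead-2", 5)
    if b & 0xF0 == 0xE0:        # B'1110....'  lead byte of a 3-byte sequence
        return ("lead-3", 4)
    if b & 0xF8 == 0xF0:        # B'11110...'  lead byte of a 4-byte sequence
        return ("lead-4", 3)
    return ("invalid", 0)       # B'11111...'  not used under RFC 3629


def sequence_payload_bits(lead: int) -> int:
    """Total code point bits carried by the sequence that starts with `lead`."""
    kind, bits = classify_utf8_byte(lead)
    if kind == "ascii":
        return bits
    if kind.startswith("lead-"):
        length = int(kind[-1])              # 2, 3 or 4 bytes in the sequence
        return bits + 6 * (length - 1)      # each continuation byte adds 6 bits
    raise ValueError("not a lead byte")


if __name__ == "__main__":
    for ch in ("A", "é", "€", "😀"):
        encoded = ch.encode("utf-8")
        kinds = [classify_utf8_byte(b)[0] for b in encoded]
        print(ch, encoded.hex(), kinds, sequence_payload_bits(encoded[0]))
```

By this count the 1-, 2-, 3- and 4-byte forms carry 7, 11, 16 and 21 bits respectively. Note that the last byte of the 4-byte example is X'80': it is an ordinary continuation byte whose six payload bits happen to be zero, so stripping it would corrupt the character.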
