Mano wrote:
>I am sure this is a stupid question...
>but hey!  why break with tradition.
>
>why is the first byte of my Greek character 206 (hex CE)
>when on the unicode chart Greek (not polytonic.. just simple) the first
>byte is 3 (hex 03 :)
>
>I may be further than I thought.
>
>Mano

This is UTF-8; see http://en.wikipedia.org/wiki/UTF-8 or google for
"UTF-8 encoding" or similar. UTF-8 is a variable-length encoding that
allows multi-byte characters to be embedded in 8-bit data. The 7-bit
ASCII characters (where $a(char)<128) are each represented unchanged in
a single byte, and bytes with $a(char)>127 indicate either the
introduction or the continuation of a multi-byte sequence.
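
To make the byte roles concrete, here is a minimal sketch in Python
(not M, just for illustration). The function name classify_byte is my
own invention, and it looks only at the leading bit pattern; it does
not reject the few lead values that real UTF-8 forbids.

```python
def classify_byte(b: int) -> str:
    """Name the role a byte value (0-255) plays, judged by its high bits only."""
    if b < 0x80:
        return "ascii"         # 0xxxxxxx: plain 7-bit ASCII, stands alone
    if b < 0xC0:
        return "continuation"  # 10xxxxxx: carries 6 more bits of a sequence
    if b < 0xE0:
        return "lead-2"        # 110xxxxx: introduces a 2-byte sequence
    if b < 0xF0:
        return "lead-3"        # 1110xxxx: introduces a 3-byte sequence
    if b < 0xF8:
        return "lead-4"        # 11110xxx: introduces a 4-byte sequence
    return "invalid"           # 11111xxx never appears in UTF-8


# One ASCII letter followed by one Greek letter.
for b in "aΛ".encode("utf-8"):
    print(b, format(b, "08b"), classify_byte(b))
# 97 01100001 ascii
# 206 11001110 lead-2
# 155 10011011 continuation
```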

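The 206 is 11001110 in binary. That breaks down to 110_01110 (192 + 14
in decimal), where the 110 prefix (192 decimal) introduces a 2-byte
character sequence and the 01110 (14 decimal) holds the high five bits
of the decoded Unicode character number. The second byte of each pair
has high bits 10 (128 decimal), leaving six bits to add to the five
bits from the first byte. The bits from the first byte get multiplied
by 64 to effect a 6-bit shift.

For Λ the two bytes are 206 and 155, so the character number is
(206-192)*64 + (155-128) = 896 + 27 = 923, which is hex 039B. The 03 on
the Unicode chart is the high byte of that character number, not the
first byte of its UTF-8 encoding.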
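
The same arithmetic as a small Python sketch (again Python rather than
M, purely as an illustration). The helper name decode_two_byte is
hypothetical, and it handles only 2-byte sequences.

```python
def decode_two_byte(lead: int, cont: int) -> int:
    """Combine a 110xxxxx lead byte and a 10xxxxxx continuation byte."""
    if not 0xC0 <= lead <= 0xDF:
        raise ValueError("lead byte does not start with bits 110")
    if not 0x80 <= cont <= 0xBF:
        raise ValueError("continuation byte does not start with bits 10")
    # Strip the 110 marker (192), shift left 6 bits (*64),
    # then add the low six bits of the second byte (strip 128).
    return (lead - 192) * 64 + (cont - 128)


lead, cont = "Λ".encode("utf-8")
print(lead, cont, decode_two_byte(lead, cont))  # 206 155 923
```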

Here is another decoding of your previous Greek example:

s z="ΛΚΞΔΣΛΦΚ" f i=1:1:$l(z) w !,i s c=$e(z,i),a=$a(c) s:a>127 i=i+1,c=c_$e(z,i),a=a-192*64+($a(z,i)-128) w *9,c,*9,a,*9,$s(a>127:"&#"_a_";",1:a)

1       Λ      923     &#923;
3       Κ      922     &#922;
5       Ξ      926     &#926;
7       Δ      916     &#916;
9       Σ      931     &#931;
11      Λ      923     &#923;
13      Φ      934     &#934;
15      Κ      922     &#922;
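
For anyone who wants to check the table outside of M, here is a rough
Python rendering of the same loop. It is a sketch under the same
assumption as the one-liner (no sequences longer than two bytes), and
the name decode_rows is made up for this example.

```python
def decode_rows(data: bytes):
    """Return (position, character, number, HTML entity) for each character.

    Positions are 1-based byte offsets, matching $e/$a in the M version.
    Only 1-byte and 2-byte UTF-8 sequences are handled.
    """
    rows = []
    i = 0
    while i < len(data):
        pos = i + 1
        a = data[i]
        if a > 127:
            # 2-byte sequence: fold in the six low bits of the next byte.
            i += 1
            a = (a - 192) * 64 + (data[i] - 128)
        entity = f"&#{a};" if a > 127 else str(a)
        rows.append((pos, chr(a), a, entity))
        i += 1
    return rows


for row in decode_rows("ΛΚΞΔΣΛΦΚ".encode("utf-8")):
    print(*row, sep="\t")
```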


---------------------------------------
Jim Self
Systems Architect, Lead Developer
VMTH Computer Services, UC Davis
(http://www.vmth.ucdavis.edu/us/jaself)

