Carl W. Brown wrote:
> I think that the bigger issue might be how do you extend Morse code to
> incorporate the Unicode character set.
> [...]

Carl, this is unfair!! You spoiled my April 1st joke in mid-November!

Ciao.
Marco :-)



----------------------------------------------------------------------
UTF-Morse - "Bringing Unicode into the telegraph age!"
    

1. Unicode characters U+0020..U+007E are encoded according to the
following table:

Code:  UTF-Morse:  Character name:
------ ----------- --------------------------
U+0020 /           SPACE
U+0021 -----.      EXCLAMATION MARK [1]
U+0022 .-..-.      QUOTATION MARK
U+0023 .-.-..      NUMBER SIGN [1]
U+0024 ..-...      DOLLAR SIGN [1]
U+0025 ..-..-      PERCENT SIGN [1]
U+0026 ..-.-.      AMPERSAND [1]
U+0027 .----.      APOSTROPHE
U+0028 -.--.-      LEFT PARENTHESIS
U+0029 -.---.      RIGHT PARENTHESIS [1]
U+002A -.----      ASTERISK [1]
U+002B --....      PLUS SIGN [1]
U+002C --..--      COMMA
U+002D -....-      HYPHEN-MINUS
U+002E .-.-.-      FULL STOP
U+002F -..-.       SOLIDUS [1]
U+0030 -----       DIGIT ZERO
U+0031 .----       DIGIT ONE
U+0032 ..---       DIGIT TWO
U+0033 ...--       DIGIT THREE
U+0034 ....-       DIGIT FOUR
U+0035 .....       DIGIT FIVE
U+0036 -....       DIGIT SIX
U+0037 --...       DIGIT SEVEN
U+0038 ---..       DIGIT EIGHT
U+0039 ----.       DIGIT NINE
U+003A ---...      COLON
U+003B ---..-      SEMICOLON [1]
U+003C ---.-.      LESS-THAN SIGN [1]
U+003D ----..      EQUALS SIGN [1]
U+003E ---.--      GREATER-THAN SIGN [1]
U+003F ..--..      QUESTION MARK
U+0040 -.-.-.      COMMERCIAL AT [1]
U+0041 ..-- .-     LATIN CAPITAL LETTER A [2]
U+0042 ..-- -...   LATIN CAPITAL LETTER B [2]
U+0043 ..-- -.-.   LATIN CAPITAL LETTER C [2]
U+0044 ..-- -..    LATIN CAPITAL LETTER D [2]
U+0045 ..-- .      LATIN CAPITAL LETTER E [2]
U+0046 ..-- ..-.   LATIN CAPITAL LETTER F [2]
U+0047 ..-- --.    LATIN CAPITAL LETTER G [2]
U+0048 ..-- ....   LATIN CAPITAL LETTER H [2]
U+0049 ..-- ..     LATIN CAPITAL LETTER I [2]
U+004A ..-- .---   LATIN CAPITAL LETTER J [2]
U+004B ..-- -.-    LATIN CAPITAL LETTER K [2]
U+004C ..-- .-..   LATIN CAPITAL LETTER L [2]
U+004D ..-- --     LATIN CAPITAL LETTER M [2]
U+004E ..-- -.     LATIN CAPITAL LETTER N [2]
U+004F ..-- ---    LATIN CAPITAL LETTER O [2]
U+0050 ..-- .--.   LATIN CAPITAL LETTER P [2]
U+0051 ..-- --.-   LATIN CAPITAL LETTER Q [2]
U+0052 ..-- .-.    LATIN CAPITAL LETTER R [2]
U+0053 ..-- ...    LATIN CAPITAL LETTER S [2]
U+0054 ..-- -      LATIN CAPITAL LETTER T [2]
U+0055 ..-- ..-    LATIN CAPITAL LETTER U [2]
U+0056 ..-- ...-   LATIN CAPITAL LETTER V [2]
U+0057 ..-- .--    LATIN CAPITAL LETTER W [2]
U+0058 ..-- -..-   LATIN CAPITAL LETTER X [2]
U+0059 ..-- -.--   LATIN CAPITAL LETTER Y [2]
U+005A ..-- --..   LATIN CAPITAL LETTER Z [2]
U+005B ..---.      LEFT SQUARE BRACKET [1]
U+005C .-....      REVERSE SOLIDUS [1]
U+005D ..----      RIGHT SQUARE BRACKET [1]
U+005E .-...-      CIRCUMFLEX ACCENT [1]
U+005F ------      LOW LINE [1]
U+0060 ...---      GRAVE ACCENT [1]
U+0061 .-          LATIN SMALL LETTER A
U+0062 -...        LATIN SMALL LETTER B
U+0063 -.-.        LATIN SMALL LETTER C
U+0064 -..         LATIN SMALL LETTER D
U+0065 .           LATIN SMALL LETTER E
U+0066 ..-.        LATIN SMALL LETTER F
U+0067 --.         LATIN SMALL LETTER G
U+0068 ....        LATIN SMALL LETTER H
U+0069 ..          LATIN SMALL LETTER I
U+006A .---        LATIN SMALL LETTER J
U+006B -.-         LATIN SMALL LETTER K
U+006C .-..        LATIN SMALL LETTER L
U+006D --          LATIN SMALL LETTER M
U+006E -.          LATIN SMALL LETTER N
U+006F ---         LATIN SMALL LETTER O
U+0070 .--.        LATIN SMALL LETTER P
U+0071 --.-        LATIN SMALL LETTER Q
U+0072 .-.         LATIN SMALL LETTER R
U+0073 ...         LATIN SMALL LETTER S
U+0074 -           LATIN SMALL LETTER T
U+0075 ..-         LATIN SMALL LETTER U
U+0076 ...-        LATIN SMALL LETTER V
U+0077 .--         LATIN SMALL LETTER W
U+0078 -..-        LATIN SMALL LETTER X
U+0079 -.--        LATIN SMALL LETTER Y
U+007A --..        LATIN SMALL LETTER Z
U+007B --.-..      LEFT CURLY BRACKET [1]
U+007C --.--.      VERTICAL LINE [1]
U+007D --.-.-      RIGHT CURLY BRACKET [1]
U+007E --.---      TILDE [1]
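
As a rough illustration of how the table might be applied, here is a
minimal Python sketch. It covers only SPACE, digits and letters
(punctuation is left out for brevity), and it assumes that consecutive
Morse sequences are separated by a single space, which the table does
not actually specify. The names encode_char and encode_text are made up
for this example.

```python
# Small letters and digits, copied from the table above.
SMALL = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".",
    "f": "..-.", "g": "--.", "h": "....", "i": "..", "j": ".---",
    "k": "-.-", "l": ".-..", "m": "--", "n": "-.", "o": "---",
    "p": ".--.", "q": "--.-", "r": ".-.", "s": "...", "t": "-",
    "u": "..-", "v": "...-", "w": ".--", "x": "-..-", "y": "-.--",
    "z": "--..",
}
DIGITS = {
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}
CAPITAL_PREFIX = "..--"  # see note [2]


def encode_char(ch):
    """Encode one character (SPACE, digit or ASCII letter) as UTF-Morse."""
    if ch == " ":
        return "/"
    if ch in DIGITS:
        return DIGITS[ch]
    if ch in SMALL:
        return SMALL[ch]
    if "A" <= ch <= "Z":
        # Capital letter: prefix sequence, then the small letter's code.
        return CAPITAL_PREFIX + " " + SMALL[ch.lower()]
    raise ValueError("not covered by this sketch: %r" % ch)


def encode_text(text):
    """Join the per-character codes with a space (an assumption)."""
    return " ".join(encode_char(ch) for ch in text)


print(encode_text("Hi 5"))
```

With these assumptions, "Hi 5" comes out as "..-- .... .. / .....".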


2. All other Unicode characters are encoded with one of seven
multi-Morse schemes:

Code range:        Scheme
-----------------  ------
U+0000..U+0007     1
U+0008..U+001F     2
U+007F..U+01FF     3
U+0200..U+0FFF     4
U+1000..U+7FFF     5
U+8000..U+3FFFF    6
U+40000..U+10FFFF  7
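
The range table above can be read as a plain lookup. A minimal Python
sketch follows; the function name multi_morse_scheme is invented for
this example, and returning None for U+0020..U+007E is just one
possible way of saying "use the table in section 1 instead".

```python
# (first, last, scheme) triples, copied from the range table above.
SCHEME_RANGES = [
    (0x0000, 0x0007, 1),
    (0x0008, 0x001F, 2),
    (0x007F, 0x01FF, 3),
    (0x0200, 0x0FFF, 4),
    (0x1000, 0x7FFF, 5),
    (0x8000, 0x3FFFF, 6),
    (0x40000, 0x10FFFF, 7),
]


def multi_morse_scheme(code_point):
    """Return the scheme number, or None for U+0020..U+007E."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % code_point)
    for first, last, scheme in SCHEME_RANGES:
        if first <= code_point <= last:
            return scheme
    return None  # U+0020..U+007E: handled by the table in section 1


print(multi_morse_scheme(0x20AC))  # EURO SIGN falls in scheme 5
```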

Each scheme ends with a Morse sequence of the form ".-.yyy", possibly
preceded by one or more Morse sequences of the form "-..yyy":

Scheme Bits (x: 0 or 1):      UTF-Morse (y: "." if x is 0, "-" if x is 1):
------ ---------------------  ------------------------------------------------
1      000000000000000000xxx  .-.yyy
2      000000000000000xxxxxx  -..yyy .-.yyy
3      000000000000xxxxxxxxx  -..yyy -..yyy .-.yyy
4      000000000xxxxxxxxxxxx  -..yyy -..yyy -..yyy .-.yyy
5      000000xxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy .-.yyy
6      000xxxxxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy -..yyy .-.yyy
7      xxxxxxxxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy -..yyy -..yyy .-.yyy
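
To make the bit-to-Morse mapping concrete, here is a minimal Python
sketch. It assumes that scheme n carries 3*n bits (one "yyy" per Morse
sequence), that the most significant bits come first, and that
sequences are separated by a single space; none of this is stated
explicitly above. The function name encode_multi_morse is made up for
this example.

```python
# Highest code point of each scheme (schemes 1..7), from the range table.
SCHEME_LIMITS = [0x7, 0x1F, 0x1FF, 0xFFF, 0x7FFF, 0x3FFFF, 0x10FFFF]
BIT_TO_MORSE = {"0": ".", "1": "-"}


def encode_multi_morse(code_point):
    """Encode a code point outside U+0020..U+007E as multi-Morse."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % code_point)
    if 0x20 <= code_point <= 0x7E:
        raise ValueError("U+0020..U+007E use the table in section 1")
    # Scheme n is the first one whose upper limit covers the code point.
    scheme = next(n for n, limit in enumerate(SCHEME_LIMITS, start=1)
                  if code_point <= limit)
    # Zero-padded binary, 3 bits per Morse sequence (assumed MSB first).
    bits = format(code_point, "0%db" % (3 * scheme))
    groups = [bits[i:i + 3] for i in range(0, len(bits), 3)]
    sequences = []
    for index, group in enumerate(groups):
        # The last sequence starts with ".-.", all earlier ones with "-..".
        prefix = ".-." if index == len(groups) - 1 else "-.."
        sequences.append(prefix + "".join(BIT_TO_MORSE[b] for b in group))
    return " ".join(sequences)


print(encode_multi_morse(0x00E9))  # LATIN SMALL LETTER E WITH ACUTE
```

Under these assumptions U+00E9 (binary 011 101 001, scheme 3) becomes
"-...-- -..-.- .-...-". The sketch only encodes; it does not attempt
decoding.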


3. Notes

[1]: Some sequences are unique to UTF-Morse, and are unknown in
     traditional Morse code.

[2]: Capital letters use the same code as the corresponding small
     letter, preceded by the sequence "..--" (which is unique to
     UTF-Morse).

----------------------------------------------------------------------
