On Fri, 9 Dec 2016 20:37:41 -0500, Paul B. Gallagher wrote:
> Ralph Fox wrote:
>
>> Even if the recipient does have Wingdings, the recipient can still see
>> a "J" like the OP did.
>>
>>   *  The Unicode value for "J" is 74 (U+004A).
>>   *  But the Wingdings font's smiley face has a different _Unicode_ value
>>      61514 (U+F04A) in the Wingdings font's cmap.
>>
>> A program using Unicode will not find a glyph in the Wingdings font at
>> _Unicode_ value 74 (U+004A).  So it will substitute another font which
>> has a "J".
>
> Yes and no...
>
> If you launch Character Map, select Wingdings, and go to 0x4A, you see
> the smiley face. Switching to a normal font, you see "J" at the same
> position. (BTW, Wingdings is not a Unicode font AFAICT.)


The smiley face is at 0x4A in the MS-Windows "Symbol" codepage.

If you use a third-party character map program which lets you show 
Wingdings in Unicode, it will show the smiley face at U+F04A.
Two third-party character map programs which come to mind:
  *  BabelMap      -- http://www.babelstone.co.uk/Software/BabelMap.html
  *  SIL ViewGlyph -- http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=ViewGlyph_home
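
To make the relationship between the two numbers concrete, here is a 
rough sketch in Python.  It assumes the usual convention for Windows 
symbol fonts, where the byte codes 0x20-0xFF of the "Symbol" codepage 
appear in the font's cmap at U+F020-U+F0FF.  The constant and function 
names are my own, purely for illustration.

```python
# Sketch: symbol fonts conventionally expose byte codes 0x20-0xFF
# at Private Use Area code points U+F020-U+F0FF.
SYMBOL_PUA_BASE = 0xF000  # assumed offset (symbol-font convention)


def symbol_byte_to_pua(byte_value: int) -> int:
    """Map a "Symbol" codepage byte to the code point used in the cmap."""
    if not 0x20 <= byte_value <= 0xFF:
        raise ValueError("expected a byte in the range 0x20-0xFF")
    return SYMBOL_PUA_BASE + byte_value


j_code = ord("J")                     # 74, i.e. 0x4A
smiley = symbol_byte_to_pua(j_code)   # 61514, i.e. 0xF04A
print(f"0x{j_code:02X} -> U+{smiley:04X}")   # prints "0x4A -> U+F04A"
```

So the "J" position and the smiley face differ only by that 0xF000 
offset, which is why Windows Character Map can show them at the same 
place.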

You will already know that a code point can vary with the character set 
(codepage) selected.  In a normal font like Arial, the Euro character 
"€" is at 0x80 in "Windows Western", 0x88 in "Windows Cyrillic", and 
U+20AC in Unicode.
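
If you have Python to hand, you can check those three values yourself. 
This is only a quick sketch.  I am assuming that Python's "cp1252" and 
"cp1251" codecs correspond to the "Windows Western" and "Windows 
Cyrillic" character sets in Character Map.

```python
euro = "€"

# Unicode code point of the Euro sign
print(hex(ord(euro)))           # prints 0x20ac

# The same character as a single byte in two Windows codepages
print(euro.encode("cp1252"))    # "Windows Western"  -> b'\x80'
print(euro.encode("cp1251"))    # "Windows Cyrillic" -> b'\x88'
```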

When you select Wingdings, Windows Character Map disables the "Character 
set" drop-down and forces use of the Windows "Symbol" codepage.  To 
avoid this, use a third-party character map program.


> Now try this...
>
> If you take and copy the smiley face from Wingdings into MS Word, you
> get a smiley face as expected (the font window in the ribbon shows
> "Wingdings"). Now select it and do CTRL-space to apply your default font
> (Times New Roman, Arial, whatever), and it changes to an empty box, as
> if there were nothing at that code point. Now select the box, copy it,
> and do CTRL-F to begin a search. When you paste it into the search input
> window, you get the smiley face again. So at some level, Word knows it's
> not a "J," and its search routine doesn't treat it as one.


MS-Word knows that the code point is U+F04A, not "J" = U+004A.

If you save that document as a .docx file, unzip it (a .docx file is a 
ZIP archive with a .docx extension), and use a hex editor to look at the 
file "word/document.xml" in the unzipped contents, you will see that 
this smiley face character has been encoded as the 3 bytes 
0xEF,0x81,0x8A.  That is the UTF-8 encoding of U+F04A.
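
If you would rather not use a hex editor, the same check can be 
sketched in Python.  This assumes the document body sits at the usual 
location "word/document.xml" inside the archive and that Word stored 
the character as literal text.  The function name and the file name 
"smiley.docx" are made up for this example.

```python
import zipfile

SMILEY = "\uf04a"                      # the Wingdings smiley's code point
SMILEY_UTF8 = SMILEY.encode("utf-8")   # b'\xef\x81\x8a'


def docx_contains_smiley(docx_file) -> bool:
    """Return True if word/document.xml holds the UTF-8 bytes of U+F04A."""
    # A .docx file is a ZIP archive, so zipfile can open it directly.
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")
    return SMILEY_UTF8 in xml_bytes


print(SMILEY_UTF8.hex(" "))   # prints "ef 81 8a"
```

You would call it as docx_contains_smiley("smiley.docx") on the 
document you saved.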


> Similarly, if you paste this character into Notepad and do Format | Font
> and choose Wingdings, you see the smiley face, but if you switch back to
> a normal font, you get the "I can't find that" box. So Notepad knows, too.
>
> In Character Map, the most complete font I know (Arial Unicode MS) has
> nothing at U+F04A -- it jumps from  and  at U+F001 and U+F002 to  and
>  at U+F700 and U+F701.


U+F04A is in the Unicode Private Use Area U+E000–U+F8FF.
https://en.wikipedia.org/w/index.php?title=Private_Use_Areas&oldid=753773200

The Unicode Consortium has set this range aside for private use and 
does not define any characters in it.  Each font or application is free 
to assign its own meaning to these code points.
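
A quick way to test whether a character falls in that range, sketched 
in Python.  The helper name is my own.  The standard unicodedata module 
reports the general category "Co" (private use) for such code points.

```python
import unicodedata

BMP_PUA_FIRST, BMP_PUA_LAST = 0xE000, 0xF8FF


def in_bmp_private_use_area(char: str) -> bool:
    """Return True if char lies in the range U+E000 to U+F8FF."""
    return BMP_PUA_FIRST <= ord(char) <= BMP_PUA_LAST


for char in ("J", "\uf04a"):
    print(f"U+{ord(char):04X}",
          in_bmp_private_use_area(char),
          unicodedata.category(char))
# prints:
# U+004A False Lu
# U+F04A True Co
```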

"Normal" fonts like Times New Roman or Arial use character code points 
which have been defined by the Unicode Consortium, so a given code point 
means the same character in every such font.  That is why you can change 
fonts without losing the text of the document.

Code points in the Unicode Private Use Area are used by such things as:
  * Dingbat symbol fonts like Wingdings.
  * Fantasy scripts such as Tolkien's Elvish or Star Trek's Klingon,
    for which the Unicode Consortium has not defined character code
    points.


-- 
Kind regards
Ralph
🦊
_______________________________________________
support-seamonkey mailing list
[email protected]
https://lists.mozilla.org/listinfo/support-seamonkey
