Title: [233413] trunk
Revision
233413
Author
[email protected]
Date
2018-07-01 18:36:43 -0700 (Sun, 01 Jul 2018)

Log Message

[Cocoa] LastResort in the font family list causes emoji with joiners to be rendered as multiple .notdef characters
https://bugs.webkit.org/show_bug.cgi?id=187209
<rdar://problem/40920785>

Reviewed by Darin Adler.

Source/WebCore:

Inside our complex text codepath, we perform our own font fallback, which
includes a function that asks "can this font support this grapheme cluster?"
Because of the mechanics of how fonts work, the implementation of this
function is "Does the font's cmap table support every character of the
cluster?" We were using Font::glyphForCharacter() to determine this; however,
this function maps certain control characters to the zero width space
character (with the intention that these control characters shouldn't be
visible in the fast text codepath). That replacement, however, was causing
us to get false negatives, because Apple Color Emoji doesn't support zero
width space. Therefore, Apple Color Emoji was looking like it didn't support
emoji combining sequences.

The best solution to this would be to get Font::glyphForCharacter() to stop
performing these replacements (see https://bugs.webkit.org/show_bug.cgi?id=187166).
However, that is too risky of a change to be making right now. Instead,
a more localized solution is to implement a version of "Does the font's cmap
table support every character of the cluster" that doesn't perform the
substitutions. This patch does exactly that, and uses a bit vector to cache
the results. In order to not have a giant bit vector, we take the old code
path if we know the substitutions won't affect us (and uses ASSERT()s to
validate this) so the bit vector only holds at maximum 3 words of storage.

Test: fast/text/emoji-with-joiner.html

* platform/graphics/Font.cpp:
(WebCore::codePointSupportIndex):
(WebCore::createAndFillGlyphPage):
(WebCore::Font::platformSupportsCodePoint const):
(WebCore::Font::supportsCodePoint const):
(WebCore::Font::canRenderCombiningCharacterSequence const):
* platform/graphics/Font.h:
* platform/graphics/cocoa/FontCocoa.mm:
(WebCore::Font::platformSupportsCodePoint const):

Source/WTF:

* wtf/unicode/CharacterNames.h:

LayoutTests:

* fast/text/emoji-with-joiner-expected.txt: Added.
* fast/text/emoji-with-joiner.html: Added.

Modified Paths

Added Paths

Diff

Modified: trunk/LayoutTests/ChangeLog (233412 => 233413)


--- trunk/LayoutTests/ChangeLog	2018-07-02 00:47:47 UTC (rev 233412)
+++ trunk/LayoutTests/ChangeLog	2018-07-02 01:36:43 UTC (rev 233413)
@@ -1,3 +1,14 @@
+2018-07-01  Myles C. Maxfield  <[email protected]>
+
+        [Cocoa] LastResort in the font family list causes emoji with joiners to be rendered as multiple .notdef characters
+        https://bugs.webkit.org/show_bug.cgi?id=187209
+        <rdar://problem/40920785>
+
+        Reviewed by Darin Adler.
+
+        * fast/text/emoji-with-joiner-expected.txt: Added.
+        * fast/text/emoji-with-joiner.html: Added.
+
 2018-07-01  Wenson Hsieh  <[email protected]>
 
         [macOS] Text replacements that end with symbols are expanded immediately

Added: trunk/LayoutTests/fast/text/emoji-with-joiner-expected.txt (0 => 233413)


--- trunk/LayoutTests/fast/text/emoji-with-joiner-expected.txt	                        (rev 0)
+++ trunk/LayoutTests/fast/text/emoji-with-joiner-expected.txt	2018-07-02 01:36:43 UTC (rev 233413)
@@ -0,0 +1,10 @@
+PASS element.offsetWidth is 48
+PASS element.offsetWidth is 48
+PASS element.offsetWidth is 48
+PASS element.offsetWidth is 48
+PASS element.offsetWidth is 48
+PASS element.offsetWidth is 48
+PASS successfullyParsed is true
+
+TEST COMPLETE
+👱‍♂️ 👱🏻‍♂️ 👱🏼‍♂️ 👱🏽‍♂️ 👱🏾‍♂️ 👱🏿‍♂️
Property changes on: trunk/LayoutTests/fast/text/emoji-with-joiner-expected.txt
___________________________________________________________________

Added: svn:eol-style

+native \ No newline at end of property

Added: svn:keywords

+Author Date Id Rev URL \ No newline at end of property

Added: trunk/LayoutTests/fast/text/emoji-with-joiner.html (0 => 233413)


--- trunk/LayoutTests/fast/text/emoji-with-joiner.html	                        (rev 0)
+++ trunk/LayoutTests/fast/text/emoji-with-joiner.html	2018-07-02 01:36:43 UTC (rev 233413)
@@ -0,0 +1,28 @@
+<!DOCTYPE html>
+<html>
+<head>
+<script src=""
+<style>
+.ib {
+    font: 48px 'Apple Color Emoji', 'LastResort';
+    display: inline-block;
+}
+</style>
+</head>
+<body>
+<div class="ib">&#x1F471;&zwj;&#x2642;&#xFE0F;</div>
+<div class="ib">&#x1F471;&#x1F3FB;&zwj;&#x2642;&#xFE0F;</div>
+<div class="ib">&#x1F471;&#x1F3FC;&zwj;&#x2642;&#xFE0F;</div>
+<div class="ib">&#x1F471;&#x1F3FD;&zwj;&#x2642;&#xFE0F;</div>
+<div class="ib">&#x1F471;&#x1F3FE;&zwj;&#x2642;&#xFE0F;</div>
+<div class="ib">&#x1F471;&#x1F3FF;&zwj;&#x2642;&#xFE0F;</div>
+<script>
+var elements = document.querySelectorAll(".ib");
+for (var element of elements) {
+    shouldBe("element.offsetWidth", "48");
+}
+</script>
+<script src=""
+</body>
+</html>
+

Modified: trunk/Source/WTF/ChangeLog (233412 => 233413)


--- trunk/Source/WTF/ChangeLog	2018-07-02 00:47:47 UTC (rev 233412)
+++ trunk/Source/WTF/ChangeLog	2018-07-02 01:36:43 UTC (rev 233413)
@@ -1,3 +1,13 @@
+2018-07-01  Myles C. Maxfield  <[email protected]>
+
+        [Cocoa] LastResort in the font family list causes emoji with joiners to be rendered as multiple .notdef characters
+        https://bugs.webkit.org/show_bug.cgi?id=187209
+        <rdar://problem/40920785>
+
+        Reviewed by Darin Adler.
+
+        * wtf/unicode/CharacterNames.h:
+
 2018-06-23  Darin Adler  <[email protected]>
 
         [Cocoa] Improve ARC compatibility of more code in _javascript_Core

Modified: trunk/Source/WTF/wtf/unicode/CharacterNames.h (233412 => 233413)


--- trunk/Source/WTF/wtf/unicode/CharacterNames.h	2018-07-02 00:47:47 UTC (rev 233412)
+++ trunk/Source/WTF/wtf/unicode/CharacterNames.h	2018-07-02 01:36:43 UTC (rev 233413)
@@ -87,6 +87,7 @@
 const UChar smallLetterSharpS = 0x00DF;
 const UChar softHyphen = 0x00AD;
 const UChar space = 0x0020;
+const UChar tabCharacter = 0x0009;
 const UChar tibetanMarkDelimiterTshegBstar = 0x0F0C;
 const UChar tibetanMarkIntersyllabicTsheg = 0x0F0B;
 const UChar32 ugariticWordDivider = 0x1039F;
@@ -152,6 +153,7 @@
 using WTF::Unicode::sesameDot;
 using WTF::Unicode::softHyphen;
 using WTF::Unicode::space;
+using WTF::Unicode::tabCharacter;
 using WTF::Unicode::tibetanMarkDelimiterTshegBstar;
 using WTF::Unicode::tibetanMarkIntersyllabicTsheg;
 using WTF::Unicode::ugariticWordDivider;

Modified: trunk/Source/WebCore/ChangeLog (233412 => 233413)


--- trunk/Source/WebCore/ChangeLog	2018-07-02 00:47:47 UTC (rev 233412)
+++ trunk/Source/WebCore/ChangeLog	2018-07-02 01:36:43 UTC (rev 233413)
@@ -1,3 +1,45 @@
+2018-07-01  Myles C. Maxfield  <[email protected]>
+
+        [Cocoa] LastResort in the font family list causes emoji with joiners to be rendered as multiple .notdef characters
+        https://bugs.webkit.org/show_bug.cgi?id=187209
+        <rdar://problem/40920785>
+
+        Reviewed by Darin Adler.
+
+        Inside our complex text codepath, we perform our own font fallback, which
+        includes a function that asks "can this font support this grapheme cluster?"
+        Because of the mechanics of how fonts work, the implementation of this
+        function is "Does the font's cmap table support every character of the
+        cluster?" We were using Font::glyphForCharacter() to determine this; however,
+        this function maps certain control characters to the zero width space
+        character (with the intention that these control characters shouldn't be
+        visible in the fast text codepath). That replacement, however, was causing
+        us to get false negatives, because Apple Color Emoji doesn't support zero
+        width space. Therefore, Apple Color Emoji was looking like it didn't support
+        emoji combining sequences.
+
+        The best solution to this would be to get Font::glyphForCharacter() to stop
+        performing these replacements (see https://bugs.webkit.org/show_bug.cgi?id=187166).
+        However, that is too risky of a change to be making right now. Instead,
+        a more localized solution is to implement a version of "Does the font's cmap
+        table support every character of the cluster" that doesn't perform the
+        substitutions. This patch does exactly that, and uses a bit vector to cache
+        the results. In order to not have a giant bit vector, we take the old code
+        path if we know the substitutions won't affect us (and uses ASSERT()s to 
+        validate this) so the bit vector only holds at maximum 3 words of storage.
+
+        Test: fast/text/emoji-with-joiner.html
+
+        * platform/graphics/Font.cpp:
+        (WebCore::codePointSupportIndex):
+        (WebCore::createAndFillGlyphPage):
+        (WebCore::Font::platformSupportsCodePoint const):
+        (WebCore::Font::supportsCodePoint const):
+        (WebCore::Font::canRenderCombiningCharacterSequence const):
+        * platform/graphics/Font.h:
+        * platform/graphics/cocoa/FontCocoa.mm:
+        (WebCore::Font::platformSupportsCodePoint const):
+
 2018-07-01  Wenson Hsieh  <[email protected]>
 
         [macOS] Text replacements that end with symbols are expanded immediately

Modified: trunk/Source/WebCore/platform/graphics/Font.cpp (233412 => 233413)


--- trunk/Source/WebCore/platform/graphics/Font.cpp	2018-07-02 00:47:47 UTC (rev 233412)
+++ trunk/Source/WebCore/platform/graphics/Font.cpp	2018-07-02 01:36:43 UTC (rev 233413)
@@ -151,6 +151,121 @@
     return hasGlyphs;
 }
 
+static std::optional<size_t> codePointSupportIndex(UChar32 codePoint)
+{
+    // FIXME: Consider reordering these so the most common ones are at the front.
+    // Doing this could cause the BitVector to fit inside inline storage and therefore
+    // be both a performance and a memory progression.
+    if (codePoint < 0x20)
+        return codePoint;
+    if (codePoint >= 0x7F && codePoint < 0xA0)
+        return codePoint - 0x7F + 0x20;
+    std::optional<size_t> result;
+    switch (codePoint) {
+    case softHyphen:
+        result = 0x41;
+        break;
+    case newlineCharacter:
+        result = 0x42;
+        break;
+    case tabCharacter:
+        result = 0x43;
+        break;
+    case noBreakSpace:
+        result = 0x44;
+        break;
+    case narrowNoBreakSpace:
+        result = 0x45;
+        break;
+    case leftToRightMark:
+        result = 0x46;
+        break;
+    case rightToLeftMark:
+        result = 0x47;
+        break;
+    case leftToRightEmbed:
+        result = 0x48;
+        break;
+    case rightToLeftEmbed:
+        result = 0x49;
+        break;
+    case leftToRightOverride:
+        result = 0x4A;
+        break;
+    case rightToLeftOverride:
+        result = 0x4B;
+        break;
+    case leftToRightIsolate:
+        result = 0x4C;
+        break;
+    case rightToLeftIsolate:
+        result = 0x4D;
+        break;
+    case zeroWidthNonJoiner:
+        result = 0x4E;
+        break;
+    case zeroWidthJoiner:
+        result = 0x4F;
+        break;
+    case popDirectionalFormatting:
+        result = 0x50;
+        break;
+    case popDirectionalIsolate:
+        result = 0x51;
+        break;
+    case firstStrongIsolate:
+        result = 0x52;
+        break;
+    case objectReplacementCharacter:
+        result = 0x53;
+        break;
+    case zeroWidthNoBreakSpace:
+        result = 0x54;
+        break;
+    default:
+        result = std::nullopt;
+    }
+
+#ifndef NDEBUG
+    UChar32 codePointOrder[] = {
+        0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
+        0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
+        0x7F,
+        0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, 0x88, 0x89, 0x8A, 0x8B, 0x8C, 0x8D, 0x8E, 0x8F,
+        0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, 0x98, 0x99, 0x9A, 0x9B, 0x9C, 0x9D, 0x9E, 0x9F,
+        softHyphen,
+        newlineCharacter,
+        tabCharacter,
+        noBreakSpace,
+        narrowNoBreakSpace,
+        leftToRightMark,
+        rightToLeftMark,
+        leftToRightEmbed,
+        rightToLeftEmbed,
+        leftToRightOverride,
+        rightToLeftOverride,
+        leftToRightIsolate,
+        rightToLeftIsolate,
+        zeroWidthNonJoiner,
+        zeroWidthJoiner,
+        popDirectionalFormatting,
+        popDirectionalIsolate,
+        firstStrongIsolate,
+        objectReplacementCharacter,
+        zeroWidthNoBreakSpace
+    };
+    bool found = false;
+    for (size_t i = 0; i < WTF_ARRAY_LENGTH(codePointOrder); ++i) {
+        if (codePointOrder[i] == codePoint) {
+            ASSERT(i == result);
+            found = true;
+        }
+    }
+    ASSERT(found == static_cast<bool>(result));
+#endif
+    return result;
+}
+
 static RefPtr<GlyphPage> createAndFillGlyphPage(unsigned pageNumber, const Font& font)
 {
 #if PLATFORM(IOS)
@@ -177,11 +292,14 @@
         auto overwriteCodePoints = [&](unsigned minimum, unsigned maximum, UChar newCodePoint) {
             unsigned begin = std::max(start, minimum);
             unsigned complete = std::min(end, maximum);
-            for (unsigned i = begin; i < complete; ++i)
+            for (unsigned i = begin; i < complete; ++i) {
+                ASSERT(codePointSupportIndex(i));
                 buffer[i - start] = newCodePoint;
+            }
         };
 
         auto overwriteCodePoint = [&](UChar codePoint, UChar newCodePoint) {
+            ASSERT(codePointSupportIndex(codePoint));
             if (codePoint >= start && codePoint < end)
                 buffer[codePoint - start] = newCodePoint;
         };
@@ -511,14 +629,39 @@
         return true;
     }
 }
+
+bool Font::platformSupportsCodePoint(UChar32 character) const
+{
+    return glyphForCharacter(character);
+}
 #endif
 
+bool Font::supportsCodePoint(UChar32 character) const
+{
+    // This is very similar to static_cast<bool>(glyphForCharacter(character))
+    // except that glyphForCharacter() maps certain code points to ZWS (because they
+    // shouldn't be visible). This function doesn't do that mapping, and instead is
+    // as honest as possible about what code points the font supports. This is so
+    // that we can accurately determine which characters are supported by this font
+    // so we know which boundaries to break strings when we send them to the complex
+    // text codepath. The complex text codepath is totally separate from this ZWS
+    // replacement logic (because CoreText handles those characters instead of WebKit).
+    if (auto index = codePointSupportIndex(character)) {
+        m_codePointSupport.ensureSize(2 * (*index + 1));
+        bool hasBeenSet = m_codePointSupport.quickSet(2 * *index);
+        if (!hasBeenSet && platformSupportsCodePoint(character))
+            m_codePointSupport.quickSet(2 * *index + 1);
+        return m_codePointSupport.quickGet(2 * *index + 1);
+    }
+    return glyphForCharacter(character);
+}
+
 bool Font::canRenderCombiningCharacterSequence(const UChar* characters, size_t length) const
 {
     ASSERT(isMainThread());
 
     for (UChar32 codePoint : StringView(characters, length).codePoints()) {
-        if (!glyphForCharacter(codePoint))
+        if (!supportsCodePoint(codePoint))
             return false;
     }
     return true;

Modified: trunk/Source/WebCore/platform/graphics/Font.h (233412 => 233413)


--- trunk/Source/WebCore/platform/graphics/Font.h	2018-07-02 00:47:47 UTC (rev 233412)
+++ trunk/Source/WebCore/platform/graphics/Font.h	2018-07-02 01:36:43 UTC (rev 233413)
@@ -171,6 +171,8 @@
 
     GlyphData glyphDataForCharacter(UChar32) const;
     Glyph glyphForCharacter(UChar32) const;
+    bool supportsCodePoint(UChar32) const;
+    bool platformSupportsCodePoint(UChar32) const;
 
     RefPtr<Font> systemFallbackFontForCharacter(UChar32, const FontDescription&, bool isForPlatformFont) const;
 
@@ -251,6 +253,7 @@
     mutable HashMap<unsigned, RefPtr<GlyphPage>> m_glyphPages;
     mutable std::unique_ptr<GlyphMetricsMap<FloatRect>> m_glyphToBoundsMap;
     mutable GlyphMetricsMap<float> m_glyphToWidthMap;
+    mutable BitVector m_codePointSupport;
 
     mutable RefPtr<OpenTypeMathData> m_mathData;
 #if ENABLE(OPENTYPE_VERTICAL)

Modified: trunk/Source/WebCore/platform/graphics/cocoa/FontCocoa.mm (233412 => 233413)


--- trunk/Source/WebCore/platform/graphics/cocoa/FontCocoa.mm	2018-07-02 00:47:47 UTC (rev 233412)
+++ trunk/Source/WebCore/platform/graphics/cocoa/FontCocoa.mm	2018-07-02 01:36:43 UTC (rev 233413)
@@ -597,4 +597,13 @@
     return advance.width + m_syntheticBoldOffset;
 }
 
+bool Font::platformSupportsCodePoint(UChar32 character) const
+{
+    UniChar codeUnits[2];
+    CGGlyph glyphs[2];
+    CFIndex count = 0;
+    U16_APPEND_UNSAFE(codeUnits, count, character);
+    return CTFontGetGlyphsForCharacters(platformData().ctFont(), codeUnits, glyphs, count);
+}
+
 } // namespace WebCore
_______________________________________________
webkit-changes mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to