Title: [145924] trunk
Revision
145924
Author
[email protected]
Date
2013-03-15 11:26:18 -0700 (Fri, 15 Mar 2013)

Log Message

REGRESSION (r127277): CSS URIs with multi-byte Unicode escape sequences fail to parse
https://bugs.webkit.org/show_bug.cgi?id=112436

Reviewed by Michael Saboff.

Source/WebCore:

r127277 modified the CSS parser to use 8-bit strings when possible.
However, it failed to handle URIs that contain Unicode escape sequences
that expand to multi-byte characters. Not only would the URI fail to
parse, but so would the remainder of the input string.

Fix this by producing a 16-bit result when we detect a URI with a
multi-byte Unicode escape, like we do for identifiers and other strings.

Test: fast/css/url-with-multi-byte-unicode-escape.html

* css/CSSParser.cpp:
(WebCore::checkAndSkipString): Changed to be a free function since it
doesn't access CSSParser's internal state.
(WebCore::CSSParser::findURI): Added a function that consolidates the
logic of finding the start and end pointers of the URI and the quote
character (if encountered), starting at the current character.
(WebCore::CSSParser::parseURIInternal): Removed duplicated logic for
finding the URI bounds (this now lives in findURI()) and logic for
setting the token type and updating the current character (this now
lives in parseURI()).
(WebCore::CSSParser::parseURI): Find the URI and parse it. If a
multi-byte Unicode escape is encountered, rewind to the beginning of
the URI and re-parse with a 16-bit string as the destination.
* css/CSSParser.h:

LayoutTests:

* fast/css/url-with-multi-byte-unicode-escape-expected.txt: Added.
* fast/css/url-with-multi-byte-unicode-escape.html: Added.

Modified Paths

Added Paths

Diff

Modified: trunk/LayoutTests/ChangeLog (145923 => 145924)


--- trunk/LayoutTests/ChangeLog	2013-03-15 18:16:54 UTC (rev 145923)
+++ trunk/LayoutTests/ChangeLog	2013-03-15 18:26:18 UTC (rev 145924)
@@ -1,3 +1,13 @@
+2013-03-15  Andy Estes  <[email protected]>
+
+        REGRESSION (r127277): CSS URIs with multi-byte Unicode escape sequences fail to parse
+        https://bugs.webkit.org/show_bug.cgi?id=112436
+
+        Reviewed by Michael Saboff.
+
+        * fast/css/url-with-multi-byte-unicode-escape-expected.txt: Added.
+        * fast/css/url-with-multi-byte-unicode-escape.html: Added.
+
 2013-03-14  Simon Fraser  <[email protected]>
 
         Collect samples for unresponsive web processes

Added: trunk/LayoutTests/fast/css/url-with-multi-byte-unicode-escape-expected.txt (0 => 145924)


--- trunk/LayoutTests/fast/css/url-with-multi-byte-unicode-escape-expected.txt	                        (rev 0)
+++ trunk/LayoutTests/fast/css/url-with-multi-byte-unicode-escape-expected.txt	2013-03-15 18:26:18 UTC (rev 145924)
@@ -0,0 +1,13 @@
+Test parsing a CSS URI containing a multi-byte Unicode escape sequence.
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS document.styleSheets[0].cssRules.length is 2
+PASS document.styleSheets[0].cssRules[0].style.getPropertyValue("background-image") is "url(data:%C4%80)"
+PASS document.styleSheets[0].cssRules[1].style.getPropertyValue("background-color") is "green"
+PASS window.getComputedStyle(document.getElementById("test")).getPropertyValue("background-color") is "rgb(0, 128, 0)"
+PASS successfullyParsed is true
+
+TEST COMPLETE
+

Added: trunk/LayoutTests/fast/css/url-with-multi-byte-unicode-escape.html (0 => 145924)


--- trunk/LayoutTests/fast/css/url-with-multi-byte-unicode-escape.html	                        (rev 0)
+++ trunk/LayoutTests/fast/css/url-with-multi-byte-unicode-escape.html	2013-03-15 18:26:18 UTC (rev 145924)
@@ -0,0 +1,22 @@
+<!DOCTYPE html>
+<html>
+<head>
+<style>#test { background-image: url("data:\100")} #test { background-color: green !important }</style>
+<script src=""
+<script>
+    if (window.testRunner)
+        testRunner.dumpAsText();
+</script>
+</head>
+<body>
+<div id="test" style="width: 100px; height: 100px; background-color: red"></div>
+<script>
+    description('Test parsing a CSS URI containing a multi-byte Unicode escape sequence.');
+    shouldBe('document.styleSheets[0].cssRules.length', '2');
+    shouldBeEqualToString('document.styleSheets[0].cssRules[0].style.getPropertyValue("background-image")', 'url(data:%C4%80)');
+    shouldBeEqualToString('document.styleSheets[0].cssRules[1].style.getPropertyValue("background-color")', 'green');
+    shouldBeEqualToString('window.getComputedStyle(document.getElementById("test")).getPropertyValue("background-color")', 'rgb(0, 128, 0)');
+</script>
+<script src=""
+</body>
+</html>

Modified: trunk/Source/WebCore/ChangeLog (145923 => 145924)


--- trunk/Source/WebCore/ChangeLog	2013-03-15 18:16:54 UTC (rev 145923)
+++ trunk/Source/WebCore/ChangeLog	2013-03-15 18:26:18 UTC (rev 145924)
@@ -1,3 +1,35 @@
+2013-03-15  Andy Estes  <[email protected]>
+
+        REGRESSION (r127277): CSS URIs with multi-byte Unicode escape sequences fail to parse
+        https://bugs.webkit.org/show_bug.cgi?id=112436
+
+        Reviewed by Michael Saboff.
+
+        r127277 modified the CSS parser to use 8-bit strings when possible.
+        However, it failed to handle URIs that contain Unicode escape sequences
+        that expand to multi-byte characters. Not only would the URI fail to
+        parse, but so would the remainder of the input string.
+
+        Fix this by producing a 16-bit result when we detect a URI with a
+        multi-byte Unicode escape, like we do for identifiers and other strings.
+
+        Test: fast/css/url-with-multi-byte-unicode-escape.html
+
+        * css/CSSParser.cpp:
+        (WebCore::checkAndSkipString): Changed to be a free function since it
+        doesn't access CSSParser's internal state.
+        (WebCore::CSSParser::findURI): Added a function that consolidates the
+        logic of finding the start and end pointers of the URI and the quote
+        character (if encountered), starting at the current character.
+        (WebCore::CSSParser::parseURIInternal): Removed duplicated logic for
+        finding the URI bounds (this now lives in findURI()) and logic for
+        setting the token type and updating the current character (this now
+        lives in parseURI()).
+        (WebCore::CSSParser::parseURI): Find the URI and parse it. If a
+        multi-byte Unicode escape is encountered, rewind to the beginning of
+        the URI and re-parse with a 16-bit string as the destination.
+        * css/CSSParser.h:
+
 2013-03-15  Andreas Kling  <[email protected]>
 
         [JSC] Remove custom WebAudio mark functions that we can generate instead.

Modified: trunk/Source/WebCore/css/CSSParser.cpp (145923 => 145924)


--- trunk/Source/WebCore/css/CSSParser.cpp	2013-03-15 18:16:54 UTC (rev 145923)
+++ trunk/Source/WebCore/css/CSSParser.cpp	2013-03-15 18:26:18 UTC (rev 145924)
@@ -9698,7 +9698,7 @@
 }
 
 template <typename CharacterType>
-inline CharacterType* CSSParser::checkAndSkipString(CharacterType* currentCharacter, int quote)
+static inline CharacterType* checkAndSkipString(CharacterType* currentCharacter, int quote)
 {
     // Returns with 0, if string check is failed. Otherwise
     // it returns with the following character. This is necessary
@@ -9896,61 +9896,53 @@
 }
 
 template <typename CharacterType>
-inline bool CSSParser::parseURIInternal(CharacterType*& start, CharacterType*& result)
+inline bool CSSParser::findURI(CharacterType*& start, CharacterType*& end, UChar& quote)
 {
-    CharacterType* uriStart = skipWhiteSpace(currentCharacter<CharacterType>());
-    CharacterType* savedResult = result;
-
-    if (*uriStart == '"' || *uriStart == '\'') {
-        CharacterType quote = *uriStart;
-        ++uriStart;
-
-        CharacterType* stringEnd = checkAndSkipString(uriStart, quote);
-        if (!stringEnd)
-            return true;
-        stringEnd = skipWhiteSpace(stringEnd);
-        if (*stringEnd != ')')
-            return true;
-
-        start = result = currentCharacter<CharacterType>() = uriStart;
-        if (!parseStringInternal(currentCharacter<CharacterType>(), result, quote))
+    start = skipWhiteSpace(currentCharacter<CharacterType>());
+    
+    if (*start == '"' || *start == '\'') {
+        quote = *start++;
+        end = checkAndSkipString(start, quote);
+        if (!end)
             return false;
-
-        currentCharacter<CharacterType>() = stringEnd + 1;
-        m_token = URI;
     } else {
-        CharacterType* stringEnd = uriStart;
-
-        while (isURILetter(*stringEnd)) {
-            if (*stringEnd != '\\')
-                ++stringEnd;
+        quote = 0;
+        end = start;
+        while (isURILetter(*end)) {
+            if (LIKELY(*end != '\\'))
+                ++end;
             else {
-                stringEnd = checkAndSkipEscape(stringEnd);
-                if (!stringEnd)
-                    return true;
+                end = checkAndSkipEscape(end);
+                if (!end)
+                    return false;
             }
         }
+    }
 
-        stringEnd = skipWhiteSpace(stringEnd);
-        if (*stringEnd != ')')
-            return true;
+    end = skipWhiteSpace(end);
+    if (*end != ')')
+        return false;
+    
+    return true;
+}
 
-        start = result = currentCharacter<CharacterType>() = uriStart;
-        while (isURILetter(*currentCharacter<CharacterType>())) {
-            if (LIKELY(*currentCharacter<CharacterType>() != '\\'))
-                *result++ = *currentCharacter<CharacterType>()++;
-            else {
-                unsigned unicode = parseEscape<CharacterType>(currentCharacter<CharacterType>());
-                if (unicode > 0xff && sizeof(CharacterType) == 1) {
-                    result = savedResult;
-                    return false;
-                }
-                UnicodeToChars(result, unicode);
-            }
+template <typename SrcCharacterType, typename DestCharacterType>
+inline bool CSSParser::parseURIInternal(SrcCharacterType*& src, DestCharacterType*& dest, UChar quote)
+{
+    if (quote) {
+        ASSERT(quote == '"' || quote == '\'');
+        return parseStringInternal(src, dest, quote);
+    }
+    
+    while (isURILetter(*src)) {
+        if (LIKELY(*src != '\\'))
+            *dest++ = *src++;
+        else {
+            unsigned unicode = parseEscape<SrcCharacterType>(src);
+            if (unicode > 0xff && sizeof(SrcCharacterType) == 1)
+                return false;
+            UnicodeToChars(dest, unicode);
         }
-
-        currentCharacter<CharacterType>() = stringEnd + 1;
-        m_token = URI;
     }
 
     return true;
@@ -9959,14 +9951,29 @@
 template <typename CharacterType>
 inline void CSSParser::parseURI(CSSParserString& string)
 {
-    CharacterType* uriStart = string.characters<CharacterType>();
-    CharacterType* result = uriStart + string.length();
-    bool parseResult = parseURIInternal(uriStart, result);
+    CharacterType* uriStart;
+    CharacterType* uriEnd;
+    UChar quote;
+    if (!findURI(uriStart, uriEnd, quote))
+        return;
+    
+    CharacterType* dest = currentCharacter<CharacterType>() = uriStart;
+    if (LIKELY(parseURIInternal(currentCharacter<CharacterType>(), dest, quote)))
+        string.init(uriStart, dest - uriStart);
+    else {
+        // An escape sequence was encountered that can't be stored in 8 bits.
+        // Reset the current character to the start of the URI and re-parse with
+        // a 16-bit destination.
+        ASSERT(is8BitSource());
+        UChar* uriStart16 = currentCharacter16();
+        currentCharacter<CharacterType>() = uriStart;
+        bool result = parseURIInternal(currentCharacter<CharacterType>(), currentCharacter16(), quote);
+        ASSERT_UNUSED(result, result);
+        string.init(uriStart16, currentCharacter16() - uriStart16);
+    }
 
-    // parseIdentifier() parsed and created the string, therefore there shouldn't be any unhandled escapses.
-    ASSERT_UNUSED(parseResult, parseResult);
-
-    string.init(uriStart, result - uriStart);
+    currentCharacter<CharacterType>() = uriEnd + 1;
+    m_token = URI;
 }
 
 template <typename CharacterType>

Modified: trunk/Source/WebCore/css/CSSParser.h (145923 => 145924)


--- trunk/Source/WebCore/css/CSSParser.h	2013-03-15 18:16:54 UTC (rev 145923)
+++ trunk/Source/WebCore/css/CSSParser.h	2013-03-15 18:26:18 UTC (rev 145924)
@@ -458,9 +458,6 @@
     inline bool isIdentifierStart();
 
     template <typename CharacterType>
-    static inline CharacterType* checkAndSkipString(CharacterType*, int);
-
-    template <typename CharacterType>
     unsigned parseEscape(CharacterType*&);
     template <typename DestCharacterType>
     inline void UnicodeToChars(DestCharacterType*&, unsigned);
@@ -477,8 +474,11 @@
     inline void parseString(CharacterType*&, CSSParserString& resultString, UChar);
 
     template <typename CharacterType>
-    inline bool parseURIInternal(CharacterType*&, CharacterType*&);
+    inline bool findURI(CharacterType*& start, CharacterType*& end, UChar& quote);
 
+    template <typename SrcCharacterType, typename DestCharacterType>
+    inline bool parseURIInternal(SrcCharacterType*&, DestCharacterType*&, UChar quote);
+
     template <typename CharacterType>
     inline void parseURI(CSSParserString&);
     template <typename CharacterType>
_______________________________________________
webkit-changes mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to