Title: [134173] trunk/Source
Revision
134173
Author
[email protected]
Date
2012-11-11 06:17:46 -0800 (Sun, 11 Nov 2012)

Log Message

WTFString::utf8() should have a mode of conversion to use replacement character
https://bugs.webkit.org/show_bug.cgi?id=101678

Source/JavaScriptCore:

Reviewed by Alexey Proskuryakov.

Follow the change to String::utf8().

* runtime/JSGlobalObjectFunctions.cpp:
(JSC::encode): Pass String::StrictConversion instead of true to String::utf8().

Source/WebCore:

Reviewed by Alexey Proskuryakov.

Follow the change to String::utf8().

No new tests. No changes in behavior.

* Modules/websockets/WebSocket.cpp:
(WebCore::WebSocket::close): Pass String::StrictConversion instead of true to String::utf8().
* Modules/websockets/WebSocketChannel.cpp:
(WebCore::WebSocketChannel::send): Ditto.
* html/MediaFragmentURIParser.cpp:
(WebCore::MediaFragmentURIParser::parseFragments): Ditto.
* platform/graphics/blackberry/MediaPlayerPrivateBlackBerry.cpp:
(WebCore::MediaPlayerPrivate::notifyChallengeResult): Ditto.
* platform/network/blackberry/rss/RSSFilterStream.cpp:
(WebCore::RSSFilterStream::convertContentToHtml): Ditto.
* platform/network/blackberry/rss/RSSGenerator.cpp:
(WebCore::RSSGenerator::generateHtml): Ditto.

Source/WebKit2:

Reviewed by Alexey Proskuryakov.

Update the symbol for String::utf8().

* win/WebKit2.def:
* win/WebKit2CFLite.def:

Source/WTF:

Reviewed by Alexander Pavlov.

Introduce a conversion mode to String::utf8().
There are three conversion modes: lenient mode, strict mode, and
"replacing unpaired surrogates with the replacement character" (replacement) mode.
Lenient mode encodes unpaired surrogates to UTF-8 as-is. Strict mode fails when there is an
unpaired surrogate and returns CString(). Replacement mode replaces unpaired surrogates with
the replacement character (U+FFFD). Replacement mode implements the algorithm defined at
http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode. WebSocket::send() requires
this algorithm to encode a string to UTF-8.

* wtf/text/WTFString.cpp:
(WTF::String::utf8): Changed to take ConversionMode as the argument.
* wtf/text/WTFString.h:
(String):
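
The three modes described in the Source/WTF log message can be illustrated with a minimal, standalone sketch. This is not the WTF implementation: it uses std::u16string and std::string instead of String and CString, signals strict-mode failure with std::nullopt instead of a null CString(), and the names toUTF8 and appendUTF8 are hypothetical helpers introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-in for String::ConversionMode from the patch.
enum class ConversionMode {
    Lenient,
    Strict,
    StrictReplacingUnpairedSurrogatesWithFFFD,
};

// Appends the UTF-8 encoding of one code point (hypothetical helper).
static void appendUTF8(std::string& out, char32_t c)
{
    if (c < 0x80) {
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
}

// Converts UTF-16 to UTF-8; the modes differ only in how an unpaired
// surrogate is handled.
std::optional<std::string> toUTF8(const std::u16string& in, ConversionMode mode)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t c = in[i];
        bool isHigh = c >= 0xD800 && c <= 0xDBFF;
        bool isLow = c >= 0xDC00 && c <= 0xDFFF;
        bool nextIsLow = i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF;
        if (isHigh && nextIsLow) {
            // Well-formed surrogate pair: combine into one code point.
            c = 0x10000 + ((c - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (isHigh || isLow) {
            if (mode == ConversionMode::Strict)
                return std::nullopt; // Strict: the whole conversion fails.
            if (mode == ConversionMode::StrictReplacingUnpairedSurrogatesWithFFFD)
                c = 0xFFFD; // Replacement: substitute U+FFFD.
            // Lenient: fall through and encode the lone surrogate as-is.
        }
        appendUTF8(out, c);
    }
    return out;
}
```

For the input 'a', U+D800, 'b', this sketch encodes the lone surrogate as the three bytes ED A0 80 in lenient mode, returns no value in strict mode, and emits EF BF BD (U+FFFD) in replacement mode. Well-formed surrogate pairs are encoded identically in all three modes.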

Modified Paths

Diff

Modified: trunk/Source/JavaScriptCore/ChangeLog (134172 => 134173)


--- trunk/Source/JavaScriptCore/ChangeLog	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/JavaScriptCore/ChangeLog	2012-11-11 14:17:46 UTC (rev 134173)
@@ -1,3 +1,15 @@
+2012-11-11  Kenichi Ishibashi  <[email protected]>
+
+        WTFString::utf8() should have a mode of conversion to use replacement character
+        https://bugs.webkit.org/show_bug.cgi?id=101678
+
+        Reviewed by Alexey Proskuryakov.
+
+        Follow the change on String::utf8()
+
+        * runtime/JSGlobalObjectFunctions.cpp:
+        (JSC::encode): Pass String::StrictConversion instead of true to String::utf8().
+
 2012-11-10  Filip Pizlo  <[email protected]>
 
         DFG should optimize out the NaN check on loads from double arrays if the array prototype chain is having a great time

Modified: trunk/Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp (134172 => 134173)


--- trunk/Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp	2012-11-11 14:17:46 UTC (rev 134173)
@@ -52,7 +52,7 @@
 
 static JSValue encode(ExecState* exec, const char* doNotEscape)
 {
-    CString cstr = exec->argument(0).toString(exec)->value(exec).utf8(true);
+    CString cstr = exec->argument(0).toString(exec)->value(exec).utf8(String::StrictConversion);
     if (!cstr.data())
         return throwError(exec, createURIError(exec, ASCIILiteral("String contained an illegal UTF-16 sequence.")));
 

Modified: trunk/Source/WTF/ChangeLog (134172 => 134173)


--- trunk/Source/WTF/ChangeLog	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WTF/ChangeLog	2012-11-11 14:17:46 UTC (rev 134173)
@@ -1,3 +1,24 @@
+2012-11-11  Kenichi Ishibashi  <[email protected]>
+
+        WTFString::utf8() should have a mode of conversion to use replacement character
+        https://bugs.webkit.org/show_bug.cgi?id=101678
+
+        Reviewed by Alexander Pavlov.
+
+        Introduce conversion mode to String::utf8().
+        There are three conversion modes; lenient mode, strict mode, and
+        "replacing unpaired surrogates with the replacement character" (replacement) mode.
+        Lenient mode converts unpaired surrogates. Strict mode fails when there is an unpaired
+        surrogates and returns CString(). Replacement mode replaces unpaired surrogates with
+        the replacement character(U+FFFD). Replacement mode implements the algorithm defined at
+        http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode. WebSocket::send() requires
+        this algorithm to encode a string to utf-8.
+
+        * wtf/text/WTFString.cpp:
+        (WTF::String::utf8): Changed to take ConversionMode as the argument.
+        * wtf/text/WTFString.h:
+        (String):
+
 2012-11-09  Alexei Filippov  <[email protected]>
 
         Web Inspector: Fix heap snapshots counted several times by NMI

Modified: trunk/Source/WTF/wtf/text/WTFString.cpp (134172 => 134173)


--- trunk/Source/WTF/wtf/text/WTFString.cpp	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WTF/wtf/text/WTFString.cpp	2012-11-11 14:17:46 UTC (rev 134173)
@@ -32,6 +32,7 @@
 #include <wtf/StringExtras.h>
 #include <wtf/Vector.h>
 #include <wtf/dtoa.h>
+#include <wtf/unicode/CharacterNames.h>
 #include <wtf/unicode/UTF8.h>
 #include <wtf/unicode/Unicode.h>
 
@@ -753,7 +754,7 @@
     *buffer++ = static_cast<char>((ch & 0x3F) | 0x80);
 }
 
-CString String::utf8(bool strict) const
+CString String::utf8(ConversionMode mode) const
 {
     unsigned length = this->length();
 
@@ -784,26 +785,48 @@
     } else {
         const UChar* characters = this->characters16();
 
-        ConversionResult result = convertUTF16ToUTF8(&characters, characters + length, &buffer, buffer + bufferVector.size(), strict);
-        ASSERT(result != targetExhausted); // (length * 3) should be sufficient for any conversion
+        if (mode == StrictConversionReplacingUnpairedSurrogatesWithFFFD) {
+            const UChar* charactersEnd = characters + length;
+            char* bufferEnd = buffer + bufferVector.size();
+            while (characters < charactersEnd) {
+                // Use strict conversion to detect unpaired surrogates.
+                ConversionResult result = convertUTF16ToUTF8(&characters, charactersEnd, &buffer, bufferEnd, true);
+                ASSERT(result != targetExhausted);
+                // Conversion fails when there is an unpaired surrogate.
+                // Put replacement character (U+FFFD) instead of the unpaired surrogate.
+                if (result != conversionOK) {
+                    ASSERT((0xD800 <= *characters && *characters <= 0xDFFF));
+                    // There should be room left, since one UChar hasn't been converted.
+                    ASSERT((buffer + 3) <= bufferEnd);
+                    putUTF8Triple(buffer, replacementCharacter);
+                    ++characters;
+                }
+            }
+        } else {
+            bool strict = mode == StrictConversion;
+            ConversionResult result = convertUTF16ToUTF8(&characters, characters + length, &buffer, buffer + bufferVector.size(), strict);
+            ASSERT(result != targetExhausted); // (length * 3) should be sufficient for any conversion
 
-        // Only produced from strict conversion.
-        if (result == sourceIllegal)
-            return CString();
+            // Only produced from strict conversion.
+            if (result == sourceIllegal) {
+                ASSERT(strict);
+                return CString();
+            }
 
-        // Check for an unconverted high surrogate.
-        if (result == sourceExhausted) {
-            if (strict)
-                return CString();
-            // This should be one unpaired high surrogate. Treat it the same
-            // was as an unpaired high surrogate would have been handled in
-            // the middle of a string with non-strict conversion - which is
-            // to say, simply encode it to UTF-8.
-            ASSERT((characters + 1) == (this->characters() + length));
-            ASSERT((*characters >= 0xD800) && (*characters <= 0xDBFF));
-            // There should be room left, since one UChar hasn't been converted.
-            ASSERT((buffer + 3) <= (buffer + bufferVector.size()));
-            putUTF8Triple(buffer, *characters);
+            // Check for an unconverted high surrogate.
+            if (result == sourceExhausted) {
+                if (strict)
+                    return CString();
+                // This should be one unpaired high surrogate. Treat it the same
+                // was as an unpaired high surrogate would have been handled in
+                // the middle of a string with non-strict conversion - which is
+                // to say, simply encode it to UTF-8.
+                ASSERT((characters + 1) == (this->characters() + length));
+                ASSERT((*characters >= 0xD800) && (*characters <= 0xDBFF));
+                // There should be room left, since one UChar hasn't been converted.
+                ASSERT((buffer + 3) <= (buffer + bufferVector.size()));
+                putUTF8Triple(buffer, *characters);
+            }
         }
     }
 

Modified: trunk/Source/WTF/wtf/text/WTFString.h (134172 => 134173)


--- trunk/Source/WTF/wtf/text/WTFString.h	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WTF/wtf/text/WTFString.h	2012-11-11 14:17:46 UTC (rev 134173)
@@ -211,8 +211,15 @@
 
     WTF_EXPORT_STRING_API CString ascii() const;
     WTF_EXPORT_STRING_API CString latin1() const;
-    WTF_EXPORT_STRING_API CString utf8(bool strict = false) const;
 
+    typedef enum {
+        LenientConversion,
+        StrictConversion,
+        StrictConversionReplacingUnpairedSurrogatesWithFFFD,
+    } ConversionMode;
+
+    WTF_EXPORT_STRING_API CString utf8(ConversionMode = LenientConversion) const;
+
     UChar operator[](unsigned index) const
     {
         if (!m_impl || index >= m_impl->length())

Modified: trunk/Source/WebCore/ChangeLog (134172 => 134173)


--- trunk/Source/WebCore/ChangeLog	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebCore/ChangeLog	2012-11-11 14:17:46 UTC (rev 134173)
@@ -1,3 +1,27 @@
+2012-11-11  Kenichi Ishibashi  <[email protected]>
+
+        WTFString::utf8() should have a mode of conversion to use replacement character
+        https://bugs.webkit.org/show_bug.cgi?id=101678
+
+        Reviewed by Alexey Proskuryakov.
+
+        Follow the change on String::utf8()
+
+        No new tests. No changes in behavior.
+
+        * Modules/websockets/WebSocket.cpp:
+        (WebCore::WebSocket::close): Pass String::StrictConversion instead of true to String::utf8().
+        * Modules/websockets/WebSocketChannel.cpp:
+        (WebCore::WebSocketChannel::send): Ditto.
+        * html/MediaFragmentURIParser.cpp:
+        (WebCore::MediaFragmentURIParser::parseFragments): Ditto.
+        * platform/graphics/blackberry/MediaPlayerPrivateBlackBerry.cpp:
+        (WebCore::MediaPlayerPrivate::notifyChallengeResult): Ditto.
+        * platform/network/blackberry/rss/RSSFilterStream.cpp:
+        (WebCore::RSSFilterStream::convertContentToHtml): Ditto.
+        * platform/network/blackberry/rss/RSSGenerator.cpp:
+        (WebCore::RSSGenerator::generateHtml): Ditto.
+
 2012-11-10  Simon Fraser  <[email protected]>
 
         Coalesce main thread scroll position updates

Modified: trunk/Source/WebCore/Modules/websockets/WebSocket.cpp (134172 => 134173)


--- trunk/Source/WebCore/Modules/websockets/WebSocket.cpp	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebCore/Modules/websockets/WebSocket.cpp	2012-11-11 14:17:46 UTC (rev 134173)
@@ -341,7 +341,7 @@
             ec = INVALID_ACCESS_ERR;
             return;
         }
-        CString utf8 = reason.utf8(true);
+        CString utf8 = reason.utf8(String::StrictConversion);
         if (utf8.length() > maxReasonSizeInBytes) {
             scriptExecutionContext()->addConsoleMessage(JSMessageSource, LogMessageType, ErrorMessageLevel, "WebSocket close message is too long.");
             ec = SYNTAX_ERR;

Modified: trunk/Source/WebCore/Modules/websockets/WebSocketChannel.cpp (134172 => 134173)


--- trunk/Source/WebCore/Modules/websockets/WebSocketChannel.cpp	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebCore/Modules/websockets/WebSocketChannel.cpp	2012-11-11 14:17:46 UTC (rev 134173)
@@ -137,7 +137,7 @@
 ThreadableWebSocketChannel::SendResult WebSocketChannel::send(const String& message)
 {
     LOG(Network, "WebSocketChannel %p send %s", this, message.utf8().data());
-    CString utf8 = message.utf8(true);
+    CString utf8 = message.utf8(String::StrictConversion);
     if (utf8.isNull() && message.length())
         return InvalidMessage;
     enqueueTextFrame(utf8);

Modified: trunk/Source/WebCore/html/MediaFragmentURIParser.cpp (134172 => 134173)


--- trunk/Source/WebCore/html/MediaFragmentURIParser.cpp	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebCore/html/MediaFragmentURIParser.cpp	2012-11-11 14:17:46 UTC (rev 134173)
@@ -142,11 +142,11 @@
         //     name or value are not valid UTF-8 strings, then remove the name-value pair from the list.
         bool validUTF8 = true;
         if (!name.isEmpty()) {
-            name = name.utf8(true).data();
+            name = name.utf8(String::StrictConversion).data();
             validUTF8 = !name.isEmpty();
         }
         if (validUTF8 && !value.isEmpty()) {
-            value = value.utf8(true).data();
+            value = value.utf8(String::StrictConversion).data();
             validUTF8 = !value.isEmpty();
         }
         

Modified: trunk/Source/WebCore/platform/graphics/blackberry/MediaPlayerPrivateBlackBerry.cpp (134172 => 134173)


--- trunk/Source/WebCore/platform/graphics/blackberry/MediaPlayerPrivateBlackBerry.cpp	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebCore/platform/graphics/blackberry/MediaPlayerPrivateBlackBerry.cpp	2012-11-11 14:17:46 UTC (rev 134173)
@@ -749,9 +749,9 @@
     if (result != AuthenticationChallengeSuccess || !url.isValid())
         return;
 
-    m_platformPlayer->reloadWithCredential(credential.user().utf8(true).data(),
-                                           credential.password().utf8(true).data(),
-                                           static_cast<MMRAuthChallenge::CredentialPersistence>(credential.persistence()));
+    m_platformPlayer->reloadWithCredential(credential.user().utf8(String::StrictConversion).data(),
+                                        credential.password().utf8(String::StrictConversion).data(),
+                                        static_cast<MMRAuthChallenge::CredentialPersistence>(credential.persistence()));
 }
 
 void MediaPlayerPrivate::onAuthenticationAccepted(const MMRAuthChallenge& authChallenge) const

Modified: trunk/Source/WebCore/platform/network/blackberry/rss/RSSFilterStream.cpp (134172 => 134173)


--- trunk/Source/WebCore/platform/network/blackberry/rss/RSSFilterStream.cpp	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebCore/platform/network/blackberry/rss/RSSFilterStream.cpp	2012-11-11 14:17:46 UTC (rev 134173)
@@ -527,7 +527,7 @@
 
     OwnPtr<RSSGenerator> generator = adoptPtr(new RSSGenerator());
     String html = generator->generateHtml(parser->m_root);
-    result = html.utf8(true).data();
+    result = html.utf8(String::StrictConversion).data();
 
     return true;
 }

Modified: trunk/Source/WebCore/platform/network/blackberry/rss/RSSGenerator.cpp (134172 => 134173)


--- trunk/Source/WebCore/platform/network/blackberry/rss/RSSGenerator.cpp	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebCore/platform/network/blackberry/rss/RSSGenerator.cpp	2012-11-11 14:17:46 UTC (rev 134173)
@@ -83,10 +83,10 @@
         builder.append(articleName);
         builder.appendLiteral("\" class=\"article\">\n<a href=\"");
         if (!item->m_link.isEmpty())
-            builder.append(item->m_link.utf8(true).data());
+            builder.append(item->m_link.utf8(String::StrictConversion).data());
         builder.appendLiteral("\"><b>");
         if (!item->m_title.isEmpty())
-            builder.append(item->m_title.utf8(true).data());
+            builder.append(item->m_title.utf8(String::StrictConversion).data());
         else
             builder.append(s_defaultEntryTitle);
         builder.appendLiteral("</b></a>\n<br />");
@@ -94,13 +94,13 @@
         if (!item->m_author.isEmpty()) {
             builder.append(i18n("By"));
             builder.appendLiteral(" <b>");
-            builder.append(item->m_author.utf8(true).data());
+            builder.append(item->m_author.utf8(String::StrictConversion).data());
             builder.appendLiteral("</b> ");
         } else {
             if (!feed->m_author.isEmpty()) {
                 builder.append(i18n("By"));
                 builder.appendLiteral(" <b>");
-                builder.append(feed->m_author.utf8(true).data());
+                builder.append(feed->m_author.utf8(String::StrictConversion).data());
                 builder.appendLiteral("</b> ");
             }
         }
@@ -113,7 +113,7 @@
 
             for (unsigned i = 0; i < item->m_categories.size() ; ++i) {
                 builder.appendLiteral("<b>");
-                builder.append(item->m_categories[i].utf8(true).data());
+                builder.append(item->m_categories[i].utf8(String::StrictConversion).data());
                 builder.appendLiteral("</b>");
 
                 if (i < item->m_categories.size() - 1)
@@ -123,11 +123,11 @@
 
         builder.appendLiteral("<br />");
         if (!item->m_pubDate.isEmpty())
-            builder.append(item->m_pubDate.utf8(true).data());
+            builder.append(item->m_pubDate.utf8(String::StrictConversion).data());
 
         builder.appendLiteral("<br />");
         if (!item->m_description.isEmpty())
-            builder.append(item->m_description.utf8(true).data());
+            builder.append(item->m_description.utf8(String::StrictConversion).data());
         builder.appendLiteral("<br />");
 
         if (item->m_enclosure) {

Modified: trunk/Source/WebKit2/ChangeLog (134172 => 134173)


--- trunk/Source/WebKit2/ChangeLog	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebKit2/ChangeLog	2012-11-11 14:17:46 UTC (rev 134173)
@@ -1,3 +1,15 @@
+2012-11-11  Kenichi Ishibashi  <[email protected]>
+
+        WTFString::utf8() should have a mode of conversion to use replacement character
+        https://bugs.webkit.org/show_bug.cgi?id=101678
+
+        Reviewed by Alexey Proskuryakov.
+
+        Update the symbol for String::utf8().
+
+        * win/WebKit2.def:
+        * win/WebKit2CFLite.def:
+
 2012-11-10  Zeno Albisser  <[email protected]>
 
         [Qt][WK2] Use QLibraryInfo to search for executables.

Modified: trunk/Source/WebKit2/win/WebKit2.def (134172 => 134173)


--- trunk/Source/WebKit2/win/WebKit2.def	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebKit2/win/WebKit2.def	2012-11-11 14:17:46 UTC (rev 134173)
@@ -267,7 +267,7 @@
         ?treeScope@Node@WebCore@@QBEPAVTreeScope@2@XZ
         ?updateLayoutIgnorePendingStylesheets@Document@WebCore@@QAEXXZ
         ?userPreferredLanguages@WebCore@@YA?AV?$Vector@VString@WTF@@$0A@@WTF@@XZ
-        ?utf8@String@WTF@@QBE?AVCString@2@_N@Z
+        ?utf8@String@WTF@@QBE?AVCString@2@W4ConversionMode@12@@Z
         ?view@Document@WebCore@@QBEPAVFrameView@2@XZ
         ??1ContextDestructionObserver@WebCore@@MAE@XZ
         ?contextDestroyed@ContextDestructionObserver@WebCore@@UAEXXZ

Modified: trunk/Source/WebKit2/win/WebKit2CFLite.def (134172 => 134173)


--- trunk/Source/WebKit2/win/WebKit2CFLite.def	2012-11-11 13:12:54 UTC (rev 134172)
+++ trunk/Source/WebKit2/win/WebKit2CFLite.def	2012-11-11 14:17:46 UTC (rev 134173)
@@ -260,7 +260,7 @@
         ?treeScope@Node@WebCore@@QBEPAVTreeScope@2@XZ
         ?updateLayoutIgnorePendingStylesheets@Document@WebCore@@QAEXXZ
         ?userPreferredLanguages@WebCore@@YA?AV?$Vector@VString@WTF@@$0A@@WTF@@XZ
-        ?utf8@String@WTF@@QBE?AVCString@2@_N@Z
+        ?utf8@String@WTF@@QBE?AVCString@2@W4ConversionMode@12@@Z
         ?view@Document@WebCore@@QBEPAVFrameView@2@XZ
         ??1ContextDestructionObserver@WebCore@@MAE@XZ
         ?contextDestroyed@ContextDestructionObserver@WebCore@@UAEXXZ