Title: [236565] trunk
Revision
236565
Author
[email protected]
Date
2018-09-27 13:05:52 -0700 (Thu, 27 Sep 2018)

Log Message

URLParser should use TextEncoding through an abstract class
https://bugs.webkit.org/show_bug.cgi?id=190027

Reviewed by Andy Estes.

Source/WebCore:

URLParser uses TextEncoding for one call to encode, which is only used for encoding the query of URLs in documents with non-UTF encodings.
There are 3 call sites that specify the TextEncoding to use from the Document, and even those call sites use a UTF encoding most of the time.
All other URL parsing is done using a well-optimized path which assumes UTF-8 encoding and uses macros from ICU headers, not a TextEncoding.
Moving the logic in this way breaks URL and URLParser's dependency on TextEncoding, which makes it possible to use in a lower-level project
without also moving TextEncoding, TextCodec, TextCodecICU, ThreadGlobalData, and the rest of WebCore and JavaScriptCore.

There is no observable change in behavior.  There is now one virtual function call in a code path in URLParser that is not performance-sensitive,
and TextEncodings now have a vtable, which uses a few more bytes of memory total for WebKit.
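
As a rough illustration of the dependency inversion described above, here is a minimal standalone sketch. It is not the WebKit code: QueryEncoder, Latin1QueryEncoder and encodeQueryBytes are hypothetical names, std::u16string stands in for StringView, and the UTF-8 branch ignores surrogate pairs for brevity. The low-level function only knows about the one-method interface, and a null pointer selects the built-in UTF-8 path.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for URLTextEncoding: the single operation the
// low-level parser needs from an encoding.
class QueryEncoder {
public:
    virtual std::vector<uint8_t> encodeForURLParsing(const std::u16string&) const = 0;
    virtual ~QueryEncoder() = default;
};

// Hypothetical higher-level encoding that implements the interface.
// Code units above 0xFF are replaced with '?' (an assumption made for
// this sketch only).
class Latin1QueryEncoder final : public QueryEncoder {
public:
    std::vector<uint8_t> encodeForURLParsing(const std::u16string& text) const override
    {
        std::vector<uint8_t> bytes;
        for (char16_t unit : text)
            bytes.push_back(unit <= 0xFF ? static_cast<uint8_t>(unit) : static_cast<uint8_t>('?'));
        return bytes;
    }
};

// Low-level code: a null encoder means "use the built-in UTF-8 path", so
// the common case never touches the interface or makes a virtual call.
std::vector<uint8_t> encodeQueryBytes(const std::u16string& query, const QueryEncoder* nonUTF8Encoder = nullptr)
{
    if (nonUTF8Encoder)
        return nonUTF8Encoder->encodeForURLParsing(query);

    std::vector<uint8_t> bytes;
    for (char16_t unit : query) {
        if (unit < 0x80)
            bytes.push_back(static_cast<uint8_t>(unit));
        else if (unit < 0x800) {
            bytes.push_back(static_cast<uint8_t>(0xC0 | (unit >> 6)));
            bytes.push_back(static_cast<uint8_t>(0x80 | (unit & 0x3F)));
        } else {
            // Simplification: surrogate pairs are not combined here.
            bytes.push_back(static_cast<uint8_t>(0xE0 | (unit >> 12)));
            bytes.push_back(static_cast<uint8_t>(0x80 | ((unit >> 6) & 0x3F)));
            bytes.push_back(static_cast<uint8_t>(0x80 | (unit & 0x3F)));
        }
    }
    return bytes;
}
```

With this shape, only code that already knows about concrete encodings has to link against them; the parser side compiles against the interface alone.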

* css/parser/CSSParserContext.h:
(WebCore::CSSParserContext::completeURL const):
* css/parser/CSSParserIdioms.cpp:
(WebCore::completeURL):
* dom/Document.cpp:
(WebCore::Document::completeURL const):
* html/HTMLBaseElement.cpp:
(WebCore::HTMLBaseElement::href const):
Move the call to encodingForFormSubmission from the URL constructor to the 3 call sites that specify the encoding from the Document.
* loader/FormSubmission.cpp:
(WebCore::FormSubmission::create):
* loader/TextResourceDecoder.cpp:
(WebCore::TextResourceDecoder::encodingForURLParsing):
* loader/TextResourceDecoder.h:
* platform/URL.cpp:
(WebCore::URL::URL):
* platform/URL.h:
(WebCore::URLTextEncoding::~URLTextEncoding):
* platform/URLParser.cpp:
(WebCore::URLParser::encodeNonUTF8Query):
(WebCore::URLParser::copyURLPartsUntil):
(WebCore::URLParser::URLParser):
(WebCore::URLParser::parse):
(WebCore::URLParser::encodeQuery): Deleted.
A pointer replaces the boolean isUTF8Encoding and the TextEncoding& which had a default value of UTF8Encoding.
Now the pointer being null means that we use UTF8, and the pointer being non-null means we use that encoding.
* platform/URLParser.h:
(WebCore::URLParser::URLParser):
* platform/text/TextEncoding.cpp:
(WebCore::UTF7Encoding):
(WebCore::TextEncoding::encodingForFormSubmissionOrURLParsing const):
(WebCore::ASCIIEncoding):
(WebCore::Latin1Encoding):
(WebCore::UTF16BigEndianEncoding):
(WebCore::UTF16LittleEndianEncoding):
(WebCore::UTF8Encoding):
(WebCore::WindowsLatin1Encoding):
(WebCore::TextEncoding::encodingForFormSubmission const): Deleted.
Use NeverDestroyed because TextEncoding now has a virtual destructor.
* platform/text/TextEncoding.h:
Rename encodingForFormSubmission to encodingForFormSubmissionOrURLParsing to make it more clear that we are intentionally using it for both.
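
The call-site convention noted in the entries above (null pointer means UTF-8, non-null means "use that encoding") could be sketched roughly as follows. This is a simplified model under stated assumptions, not the WebKit implementation: Encoding is a hypothetical struct that carries only a name, and the helper functions compare names where the real code compares TextEncoding objects.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified encoding type used only for this sketch.
struct Encoding {
    std::string name;
};

const Encoding& utf8Encoding()
{
    static const Encoding encoding { "UTF-8" };
    return encoding;
}

// Assumption: models the "non-byte-based or UTF-7" check by name.
bool isNonByteBasedOrUTF7(const Encoding& encoding)
{
    return encoding.name == "UTF-7"
        || encoding.name == "UTF-16LE" || encoding.name == "UTF-16BE"
        || encoding.name == "UTF-32LE" || encoding.name == "UTF-32BE";
}

// UTF-{7,16,32} are replaced by UTF-8, matching what is done for form submission.
const Encoding& encodingForFormSubmissionOrURLParsing(const Encoding& encoding)
{
    if (isNonByteBasedOrUTF7(encoding))
        return utf8Encoding();
    return encoding;
}

// Returns nullptr when the parser can stay on its UTF-8 fast path,
// otherwise a pointer to the encoding to use for the query.
const Encoding* encodingForURLParsing(const Encoding& documentEncoding)
{
    const Encoding& normalized = encodingForFormSubmissionOrURLParsing(documentEncoding);
    if (normalized.name == utf8Encoding().name)
        return nullptr;
    return &normalized;
}
```

Doing the normalization at the call site means the parser itself never needs to compare against UTF8Encoding(); it only tests the pointer.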

Tools:

* TestWebKitAPI/Tests/WebCore/URLParser.cpp:
(TestWebKitAPI::checkURL):
(TestWebKitAPI::TEST_F):

Diff

Modified: trunk/Source/WebCore/ChangeLog (236564 => 236565)


--- trunk/Source/WebCore/ChangeLog	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/ChangeLog	2018-09-27 20:05:52 UTC (rev 236565)
@@ -1,3 +1,61 @@
+2018-09-27  Alex Christensen  <[email protected]>
+
+        URLParser should use TextEncoding through an abstract class
+        https://bugs.webkit.org/show_bug.cgi?id=190027
+
+        Reviewed by Andy Estes.
+
+        URLParser uses TextEncoding for one call to encode, which is only used for encoding the query of URLs in documents with non-UTF encodings.
+        There are 3 call sites that specify the TextEncoding to use from the Document, and even those call sites use a UTF encoding most of the time.
+        All other URL parsing is done using a well-optimized path which assumes UTF-8 encoding and uses macros from ICU headers, not a TextEncoding.
+        Moving the logic in this way breaks URL and URLParser's dependency on TextEncoding, which makes it possible to use in a lower-level project
+        without also moving TextEncoding, TextCodec, TextCodecICU, ThreadGlobalData, and the rest of WebCore and JavaScriptCore.
+
+        There is no observable change in behavior.  There is now one virtual function call in a code path in URLParser that is not performance-sensitive,
+        and TextEncodings now have a vtable, which uses a few more bytes of memory total for WebKit.
+
+        * css/parser/CSSParserContext.h:
+        (WebCore::CSSParserContext::completeURL const):
+        * css/parser/CSSParserIdioms.cpp:
+        (WebCore::completeURL):
+        * dom/Document.cpp:
+        (WebCore::Document::completeURL const):
+        * html/HTMLBaseElement.cpp:
+        (WebCore::HTMLBaseElement::href const):
+        Move the call to encodingForFormSubmission from the URL constructor to the 3 call sites that specify the encoding from the Document.
+        * loader/FormSubmission.cpp:
+        (WebCore::FormSubmission::create):
+        * loader/TextResourceDecoder.cpp:
+        (WebCore::TextResourceDecoder::encodingForURLParsing):
+        * loader/TextResourceDecoder.h:
+        * platform/URL.cpp:
+        (WebCore::URL::URL):
+        * platform/URL.h:
+        (WebCore::URLTextEncoding::~URLTextEncoding):
+        * platform/URLParser.cpp:
+        (WebCore::URLParser::encodeNonUTF8Query):
+        (WebCore::URLParser::copyURLPartsUntil):
+        (WebCore::URLParser::URLParser):
+        (WebCore::URLParser::parse):
+        (WebCore::URLParser::encodeQuery): Deleted.
+        A pointer replaces the boolean isUTF8Encoding and the TextEncoding& which had a default value of UTF8Encoding.
+        Now the pointer being null means that we use UTF8, and the pointer being non-null means we use that encoding.
+        * platform/URLParser.h:
+        (WebCore::URLParser::URLParser):
+        * platform/text/TextEncoding.cpp:
+        (WebCore::UTF7Encoding):
+        (WebCore::TextEncoding::encodingForFormSubmissionOrURLParsing const):
+        (WebCore::ASCIIEncoding):
+        (WebCore::Latin1Encoding):
+        (WebCore::UTF16BigEndianEncoding):
+        (WebCore::UTF16LittleEndianEncoding):
+        (WebCore::UTF8Encoding):
+        (WebCore::WindowsLatin1Encoding):
+        (WebCore::TextEncoding::encodingForFormSubmission const): Deleted.
+        Use NeverDestroyed because TextEncoding now has a virtual destructor.
+        * platform/text/TextEncoding.h:
+        Rename encodingForFormSubmission to encodingForFormSubmissionOrURLParsing to make it more clear that we are intentionally using it for both.
+
 2018-09-27  John Wilander  <[email protected]>
 
         Resource Load Statistics: Remove temporary compatibility fix for auto-dismiss popups

Modified: trunk/Source/WebCore/css/parser/CSSParserContext.h (236564 => 236565)


--- trunk/Source/WebCore/css/parser/CSSParserContext.h	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/css/parser/CSSParserContext.h	2018-09-27 20:05:52 UTC (rev 236565)
@@ -69,7 +69,9 @@
             return URL();
         if (charset.isEmpty())
             return URL(baseURL, url);
-        return URL(baseURL, url, TextEncoding(charset));
+        TextEncoding encoding(charset);
+        auto& encodingForURLParsing = encoding.encodingForFormSubmissionOrURLParsing();
+        return URL(baseURL, url, encodingForURLParsing == UTF8Encoding() ? nullptr : &encodingForURLParsing);
     }
 };
 

Modified: trunk/Source/WebCore/css/parser/CSSParserIdioms.cpp (236564 => 236565)


--- trunk/Source/WebCore/css/parser/CSSParserIdioms.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/css/parser/CSSParserIdioms.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -47,11 +47,7 @@
 
 URL completeURL(const CSSParserContext& context, const String& url)
 {
-    if (url.isNull())
-        return URL();
-    if (context.charset.isEmpty())
-        return URL(context.baseURL, url);
-    return URL(context.baseURL, url, context.charset);
+    return context.completeURL(url);
 }
 
 } // namespace WebCore

Modified: trunk/Source/WebCore/dom/Document.cpp (236564 => 236565)


--- trunk/Source/WebCore/dom/Document.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/dom/Document.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -4894,7 +4894,7 @@
     const URL& baseURL = ((baseURLOverride.isEmpty() || baseURLOverride == blankURL()) && parentDocument()) ? parentDocument()->baseURL() : baseURLOverride;
     if (!m_decoder)
         return URL(baseURL, url);
-    return URL(baseURL, url, m_decoder->encoding());
+    return URL(baseURL, url, m_decoder->encodingForURLParsing());
 }
 
 URL Document::completeURL(const String& url) const

Modified: trunk/Source/WebCore/html/HTMLBaseElement.cpp (236564 => 236565)


--- trunk/Source/WebCore/html/HTMLBaseElement.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/html/HTMLBaseElement.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -89,9 +89,8 @@
     if (attributeValue.isNull())
         return document().url();
 
-    URL url = "" ?
-        URL(document().url(), stripLeadingAndTrailingHTMLSpaces(attributeValue)) :
-        URL(document().url(), stripLeadingAndTrailingHTMLSpaces(attributeValue), document().decoder()->encoding());
+    auto* encoding = document().decoder() ? document().decoder()->encodingForURLParsing() : nullptr;
+    URL url(document().url(), stripLeadingAndTrailingHTMLSpaces(attributeValue), encoding);
 
     if (!url.isValid())
         return URL();

Modified: trunk/Source/WebCore/loader/FormSubmission.cpp (236564 => 236565)


--- trunk/Source/WebCore/loader/FormSubmission.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/loader/FormSubmission.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -175,7 +175,7 @@
     }
 
     auto dataEncoding = isMailtoForm ? UTF8Encoding() : encodingFromAcceptCharset(copiedAttributes.acceptCharset(), document);
-    auto domFormData = DOMFormData::create(dataEncoding.encodingForFormSubmission());
+    auto domFormData = DOMFormData::create(dataEncoding.encodingForFormSubmissionOrURLParsing());
     StringPairVector formValues;
 
     bool containsPasswordData = false;

Modified: trunk/Source/WebCore/loader/TextResourceDecoder.cpp (236564 => 236565)


--- trunk/Source/WebCore/loader/TextResourceDecoder.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/loader/TextResourceDecoder.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -659,4 +659,16 @@
     return decoded + flush();
 }
 
+const TextEncoding* TextResourceDecoder::encodingForURLParsing()
+{
+    // For UTF-{7,16,32}, we want to use UTF-8 for the query part as
+    // we do when submitting a form. A form with GET method
+    // has its contents added to a URL as query params and it makes sense
+    // to be consistent.
+    auto& encoding = m_encoding.encodingForFormSubmissionOrURLParsing();
+    if (encoding == UTF8Encoding())
+        return nullptr;
+    return &encoding;
 }
+
+}

Modified: trunk/Source/WebCore/loader/TextResourceDecoder.h (236564 => 236565)


--- trunk/Source/WebCore/loader/TextResourceDecoder.h	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/loader/TextResourceDecoder.h	2018-09-27 20:05:52 UTC (rev 236565)
@@ -48,6 +48,7 @@
 
     void setEncoding(const TextEncoding&, EncodingSource);
     const TextEncoding& encoding() const { return m_encoding; }
+    const TextEncoding* encodingForURLParsing();
 
     bool hasEqualEncodingForCharset(const String& charset) const;
 

Modified: trunk/Source/WebCore/platform/URL.cpp (236564 => 236565)


--- trunk/Source/WebCore/platform/URL.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/URL.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -103,22 +103,12 @@
 #endif
 }
 
-URL::URL(const URL& base, const String& relative)
+URL::URL(const URL& base, const String& relative, const URLTextEncoding* encoding)
 {
-    URLParser parser(relative, base);
+    URLParser parser(relative, base, encoding);
     *this = parser.result();
 }
 
-URL::URL(const URL& base, const String& relative, const TextEncoding& encoding)
-{
-    // For UTF-{7,16,32}, we want to use UTF-8 for the query part as
-    // we do when submitting a form. A form with GET method
-    // has its contents added to a URL as query params and it makes sense
-    // to be consistent.
-    URLParser parser(relative, base, encoding.encodingForFormSubmission());
-    *this = parser.result();
-}
-
 static bool shouldTrimFromURL(UChar c)
 {
     // Browsers ignore leading/trailing whitespace and control

Modified: trunk/Source/WebCore/platform/URL.h (236564 => 236565)


--- trunk/Source/WebCore/platform/URL.h	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/URL.h	2018-09-27 20:05:52 UTC (rev 236565)
@@ -47,7 +47,12 @@
 
 namespace WebCore {
 
-class TextEncoding;
+class URLTextEncoding {
+public:
+    virtual Vector<uint8_t> encodeForURLParsing(StringView) const = 0;
+    virtual ~URLTextEncoding() { };
+};
+
 struct URLHash;
 
 enum ParsedURLStringTag { ParsedURLString };
@@ -65,14 +70,13 @@
     bool isHashTableDeletedValue() const { return string().isHashTableDeletedValue(); }
 
     // Resolves the relative URL with the given base URL. If provided, the
-    // TextEncoding is used to encode non-ASCII characers. The base URL can be
+    // URLTextEncoding is used to encode non-ASCII characers. The base URL can be
     // null or empty, in which case the relative URL will be interpreted as
     // absolute.
     // FIXME: If the base URL is invalid, this always creates an invalid
     // URL. Instead I think it would be better to treat all invalid base URLs
     // the same way we treate null and empty base URLs.
-    WEBCORE_EXPORT URL(const URL& base, const String& relative);
-    URL(const URL& base, const String& relative, const TextEncoding&);
+    WEBCORE_EXPORT URL(const URL& base, const String& relative, const URLTextEncoding* = nullptr);
 
     WEBCORE_EXPORT static URL fakeURLWithRelativePart(const String&);
     WEBCORE_EXPORT static URL fileURLWithFileSystemPath(const String&);
@@ -208,7 +212,6 @@
     friend class URLParser;
     WEBCORE_EXPORT void invalidate();
     static bool protocolIs(const String&, const char*);
-    void init(const URL&, const String&, const TextEncoding&);
     void copyToBuffer(Vector<char, 512>& buffer) const;
     unsigned hostStart() const;
 
@@ -303,6 +306,7 @@
 // encoding (defaulting to UTF-8 otherwise). DANGER: If the URL has "%00"
 // in it, the resulting string will have embedded null characters!
 WEBCORE_EXPORT String decodeURLEscapeSequences(const String&);
+class TextEncoding;
 String decodeURLEscapeSequences(const String&, const TextEncoding&);
 
 // FIXME: This is a wrong concept to expose, different parts of a URL need different escaping per the URL Standard.

Modified: trunk/Source/WebCore/platform/URLParser.cpp (236564 => 236565)


--- trunk/Source/WebCore/platform/URLParser.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/URLParser.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -618,9 +618,9 @@
 }
 
 template<typename CharacterType>
-void URLParser::encodeQuery(const Vector<UChar>& source, const TextEncoding& encoding, CodePointIterator<CharacterType> iterator)
+void URLParser::encodeNonUTF8Query(const Vector<UChar>& source, const URLTextEncoding& encoding, CodePointIterator<CharacterType> iterator)
 {
-    auto encoded = encoding.encode(StringView(source.data(), source.size()), UnencodableHandling::URLEncodedEntities);
+    auto encoded = encoding.encodeForURLParsing(StringView(source.data(), source.size()));
     auto* data = encoded.data();
     size_t length = encoded.size();
     
@@ -880,7 +880,7 @@
 }
 
 template<typename CharacterType>
-void URLParser::copyURLPartsUntil(const URL& base, URLPart part, const CodePointIterator<CharacterType>& iterator, bool& isUTF8Encoding)
+void URLParser::copyURLPartsUntil(const URL& base, URLPart part, const CodePointIterator<CharacterType>& iterator, const URLTextEncoding*& nonUTF8QueryEncoding)
 {
     syntaxViolation(iterator);
 
@@ -919,7 +919,7 @@
     switch (scheme(StringView(m_asciiBuffer.data(), m_url.m_schemeEnd))) {
     case Scheme::WS:
     case Scheme::WSS:
-        isUTF8Encoding = true;
+        nonUTF8QueryEncoding = nullptr;
         m_urlIsSpecial = true;
         return;
     case Scheme::File:
@@ -933,7 +933,7 @@
         return;
     case Scheme::NonSpecial:
         m_urlIsSpecial = false;
-        isUTF8Encoding = true;
+        nonUTF8QueryEncoding = nullptr;
         return;
     }
     ASSERT_NOT_REACHED();
@@ -1152,7 +1152,7 @@
     return iterator.codeUnitsSince(reinterpret_cast<const CharacterType*>(m_inputBegin));
 }
 
-URLParser::URLParser(const String& input, const URL& base, const TextEncoding& encoding)
+URLParser::URLParser(const String& input, const URL& base, const URLTextEncoding* nonUTF8QueryEncoding)
     : m_inputString(input)
 {
     if (input.isNull()) {
@@ -1165,10 +1165,10 @@
 
     if (input.is8Bit()) {
         m_inputBegin = input.characters8();
-        parse(input.characters8(), input.length(), base, encoding);
+        parse(input.characters8(), input.length(), base, nonUTF8QueryEncoding);
     } else {
         m_inputBegin = input.characters16();
-        parse(input.characters16(), input.length(), base, encoding);
+        parse(input.characters16(), input.length(), base, nonUTF8QueryEncoding);
     }
 
     ASSERT(!m_url.m_isValid
@@ -1179,7 +1179,7 @@
 #if !ASSERT_DISABLED
     if (!m_didSeeSyntaxViolation) {
         // Force a syntax violation at the beginning to make sure we get the same result.
-        URLParser parser(makeString(" ", input), base, encoding);
+        URLParser parser(makeString(" ", input), base, nonUTF8QueryEncoding);
         URL parsed = parser.result();
         if (parsed.isValid())
             ASSERT(allValuesEqual(parser.result(), m_url));
@@ -1188,13 +1188,12 @@
 }
 
 template<typename CharacterType>
-void URLParser::parse(const CharacterType* input, const unsigned length, const URL& base, const TextEncoding& encoding)
+void URLParser::parse(const CharacterType* input, const unsigned length, const URL& base, const URLTextEncoding* nonUTF8QueryEncoding)
 {
-    URL_PARSER_LOG("Parsing URL <%s> base <%s> encoding <%s>", String(input, length).utf8().data(), base.string().utf8().data(), encoding.name());
+    URL_PARSER_LOG("Parsing URL <%s> base <%s>", String(input, length).utf8().data(), base.string().utf8().data());
     m_url = { };
     ASSERT(m_asciiBuffer.isEmpty());
-    
-    bool isUTF8Encoding = encoding == UTF8Encoding();
+
     Vector<UChar> queryBuffer;
 
     unsigned endIndex = length;
@@ -1287,7 +1286,7 @@
                     break;
                 case Scheme::WS:
                 case Scheme::WSS:
-                    isUTF8Encoding = true;
+                    nonUTF8QueryEncoding = nullptr;
                     m_urlIsSpecial = true;
                     if (base.protocolIs(urlScheme))
                         state = State::SpecialRelativeOrAuthority;
@@ -1309,7 +1308,7 @@
                     ++c;
                     break;
                 case Scheme::NonSpecial:
-                    isUTF8Encoding = true;
+                    nonUTF8QueryEncoding = nullptr;
                     auto maybeSlash = c;
                     advance(maybeSlash);
                     if (!maybeSlash.atEnd() && *maybeSlash == '/') {
@@ -1353,7 +1352,7 @@
                 return;
             }
             if (base.m_cannotBeABaseURL && *c == '#') {
-                copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+                copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
                 state = State::Fragment;
                 appendToASCIIBuffer('#');
                 ++c;
@@ -1363,7 +1362,7 @@
                 state = State::Relative;
                 break;
             }
-            copyURLPartsUntil(base, URLPart::SchemeEnd, c, isUTF8Encoding);
+            copyURLPartsUntil(base, URLPart::SchemeEnd, c, nonUTF8QueryEncoding);
             appendToASCIIBuffer(':');
             state = State::File;
             break;
@@ -1413,24 +1412,23 @@
                 ++c;
                 break;
             case '?':
-                copyURLPartsUntil(base, URLPart::PathEnd, c, isUTF8Encoding);
+                copyURLPartsUntil(base, URLPart::PathEnd, c, nonUTF8QueryEncoding);
                 appendToASCIIBuffer('?');
                 ++c;
-                if (isUTF8Encoding)
-                    state = State::UTF8Query;
-                else {
+                if (nonUTF8QueryEncoding) {
                     queryBegin = c;
                     state = State::NonUTF8Query;
-                }
+                } else
+                    state = State::UTF8Query;
                 break;
             case '#':
-                copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+                copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
                 appendToASCIIBuffer('#');
                 state = State::Fragment;
                 ++c;
                 break;
             default:
-                copyURLPartsUntil(base, URLPart::PathAfterLastSlash, c, isUTF8Encoding);
+                copyURLPartsUntil(base, URLPart::PathAfterLastSlash, c, nonUTF8QueryEncoding);
                 if (currentPosition(c) && parsedDataView(currentPosition(c) - 1) != '/') {
                     appendToASCIIBuffer('/');
                     m_url.m_pathAfterLastSlash = currentPosition(c);
@@ -1443,7 +1441,7 @@
             LOG_STATE("RelativeSlash");
             if (*c == '/' || *c == '\\') {
                 ++c;
-                copyURLPartsUntil(base, URLPart::SchemeEnd, c, isUTF8Encoding);
+                copyURLPartsUntil(base, URLPart::SchemeEnd, c, nonUTF8QueryEncoding);
                 appendToASCIIBuffer("://", 3);
                 if (m_urlIsSpecial)
                     state = State::SpecialAuthorityIgnoreSlashes;
@@ -1453,7 +1451,7 @@
                     authorityOrHostBegin = c;
                 }
             } else {
-                copyURLPartsUntil(base, URLPart::PortEnd, c, isUTF8Encoding);
+                copyURLPartsUntil(base, URLPart::PortEnd, c, nonUTF8QueryEncoding);
                 appendToASCIIBuffer('/');
                 m_url.m_pathAfterLastSlash = base.m_hostEnd + base.m_portLength + 1;
                 state = State::Path;
@@ -1584,7 +1582,7 @@
             case '?':
                 syntaxViolation(c);
                 if (base.isValid() && base.protocolIs("file")) {
-                    copyURLPartsUntil(base, URLPart::PathEnd, c, isUTF8Encoding);
+                    copyURLPartsUntil(base, URLPart::PathEnd, c, nonUTF8QueryEncoding);
                     appendToASCIIBuffer('?');
                     ++c;
                 } else {
@@ -1598,17 +1596,16 @@
                     m_url.m_pathAfterLastSlash = m_url.m_userStart + 1;
                     m_url.m_pathEnd = m_url.m_pathAfterLastSlash;
                 }
-                if (isUTF8Encoding)
-                    state = State::UTF8Query;
-                else {
+                if (nonUTF8QueryEncoding) {
                     queryBegin = c;
                     state = State::NonUTF8Query;
-                }
+                } else
+                    state = State::UTF8Query;
                 break;
             case '#':
                 syntaxViolation(c);
                 if (base.isValid() && base.protocolIs("file")) {
-                    copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+                    copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
                     appendToASCIIBuffer('#');
                 } else {
                     appendToASCIIBuffer("///#", 4);
@@ -1627,7 +1624,7 @@
             default:
                 syntaxViolation(c);
                 if (base.isValid() && base.protocolIs("file") && shouldCopyFileURL(c))
-                    copyURLPartsUntil(base, URLPart::PathAfterLastSlash, c, isUTF8Encoding);
+                    copyURLPartsUntil(base, URLPart::PathAfterLastSlash, c, nonUTF8QueryEncoding);
                 else {
                     appendToASCIIBuffer("///", 3);
                     m_url.m_userStart = currentPosition(c) - 1;
@@ -1693,12 +1690,11 @@
                             syntaxViolation(c);
                             appendToASCIIBuffer("/?", 2);
                             ++c;
-                            if (isUTF8Encoding)
-                                state = State::UTF8Query;
-                            else {
+                            if (nonUTF8QueryEncoding) {
                                 queryBegin = c;
                                 state = State::NonUTF8Query;
-                            }
+                            } else
+                                state = State::UTF8Query;
                             m_url.m_pathAfterLastSlash = currentPosition(c) - 1;
                             m_url.m_pathEnd = m_url.m_pathAfterLastSlash;
                             break;
@@ -1771,12 +1767,11 @@
                 m_url.m_pathEnd = currentPosition(c);
                 appendToASCIIBuffer('?');
                 ++c;
-                if (isUTF8Encoding)
-                    state = State::UTF8Query;
-                else {
+                if (nonUTF8QueryEncoding) {
                     queryBegin = c;
                     state = State::NonUTF8Query;
-                }
+                } else
+                    state = State::UTF8Query;
                 break;
             }
             if (*c == '#') {
@@ -1794,12 +1789,11 @@
                 m_url.m_pathEnd = currentPosition(c);
                 appendToASCIIBuffer('?');
                 ++c;
-                if (isUTF8Encoding)
-                    state = State::UTF8Query;
-                else {
+                if (nonUTF8QueryEncoding) {
                     queryBegin = c;
                     state = State::NonUTF8Query;
-                }
+                } else
+                    state = State::UTF8Query;
             } else if (*c == '#') {
                 m_url.m_pathEnd = currentPosition(c);
                 m_url.m_queryEnd = m_url.m_pathEnd;
@@ -1821,10 +1815,8 @@
                 state = State::Fragment;
                 break;
             }
-            if (isUTF8Encoding)
-                utf8QueryEncode(c);
-            else
-                appendCodePoint(queryBuffer, *c);
+            ASSERT(!nonUTF8QueryEncoding);
+            utf8QueryEncode(c);
             ++c;
             break;
         case State::NonUTF8Query:
@@ -1832,7 +1824,7 @@
                 LOG_STATE("NonUTF8Query");
                 ASSERT(queryBegin != CodePointIterator<CharacterType>());
                 if (*c == '#') {
-                    encodeQuery(queryBuffer, encoding, CodePointIterator<CharacterType>(queryBegin, c));
+                    encodeNonUTF8Query(queryBuffer, *nonUTF8QueryEncoding, CodePointIterator<CharacterType>(queryBegin, c));
                     m_url.m_queryEnd = currentPosition(c);
                     state = State::Fragment;
                     break;
@@ -1868,7 +1860,7 @@
         RELEASE_ASSERT_NOT_REACHED();
     case State::SpecialRelativeOrAuthority:
         LOG_FINAL_STATE("SpecialRelativeOrAuthority");
-        copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+        copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
         break;
     case State::PathOrAuthority:
         LOG_FINAL_STATE("PathOrAuthority");
@@ -1889,7 +1881,7 @@
         RELEASE_ASSERT_NOT_REACHED();
     case State::RelativeSlash:
         LOG_FINAL_STATE("RelativeSlash");
-        copyURLPartsUntil(base, URLPart::PortEnd, c, isUTF8Encoding);
+        copyURLPartsUntil(base, URLPart::PortEnd, c, nonUTF8QueryEncoding);
         appendToASCIIBuffer('/');
         m_url.m_pathAfterLastSlash = m_url.m_hostEnd + m_url.m_portLength + 1;
         m_url.m_pathEnd = m_url.m_pathAfterLastSlash;
@@ -1952,7 +1944,7 @@
     case State::File:
         LOG_FINAL_STATE("File");
         if (base.isValid() && base.protocolIs("file")) {
-            copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+            copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
             break;
         }
         syntaxViolation(c);
@@ -2047,7 +2039,7 @@
     case State::NonUTF8Query:
         LOG_FINAL_STATE("NonUTF8Query");
         ASSERT(queryBegin != CodePointIterator<CharacterType>());
-        encodeQuery(queryBuffer, encoding, CodePointIterator<CharacterType>(queryBegin, c));
+        encodeNonUTF8Query(queryBuffer, *nonUTF8QueryEncoding, CodePointIterator<CharacterType>(queryBegin, c));
         m_url.m_queryEnd = currentPosition(c);
         break;
     case State::Fragment:

Modified: trunk/Source/WebCore/platform/URLParser.h (236564 => 236565)


--- trunk/Source/WebCore/platform/URLParser.h	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/URLParser.h	2018-09-27 20:05:52 UTC (rev 236565)
@@ -25,7 +25,6 @@
 
 #pragma once
 
-#include "TextEncoding.h"
 #include "URL.h"
 #include <wtf/Expected.h>
 #include <wtf/Forward.h>
@@ -38,7 +37,7 @@
 
 class URLParser {
 public:
-    WEBCORE_EXPORT URLParser(const String&, const URL& = { }, const TextEncoding& = UTF8Encoding());
+    WEBCORE_EXPORT URLParser(const String&, const URL& = { }, const URLTextEncoding* = nullptr);
     URL result() { return m_url; }
 
     WEBCORE_EXPORT static bool allValuesEqual(const URL&, const URL&);
@@ -70,7 +69,7 @@
     static constexpr size_t defaultInlineBufferSize = 2048;
     using LCharBuffer = Vector<LChar, defaultInlineBufferSize>;
 
-    template<typename CharacterType> void parse(const CharacterType*, const unsigned length, const URL&, const TextEncoding&);
+    template<typename CharacterType> void parse(const CharacterType*, const unsigned length, const URL&, const URLTextEncoding*);
     template<typename CharacterType> void parseAuthority(CodePointIterator<CharacterType>);
     template<typename CharacterType> bool parseHostAndPort(CodePointIterator<CharacterType>);
     template<typename CharacterType> bool parsePort(CodePointIterator<CharacterType>&);
@@ -107,7 +106,7 @@
     void appendToASCIIBuffer(UChar32);
     void appendToASCIIBuffer(const char*, size_t);
     void appendToASCIIBuffer(const LChar* characters, size_t size) { appendToASCIIBuffer(reinterpret_cast<const char*>(characters), size); }
-    template<typename CharacterType> void encodeQuery(const Vector<UChar>& source, const TextEncoding&, CodePointIterator<CharacterType>);
+    template<typename CharacterType> void encodeNonUTF8Query(const Vector<UChar>& source, const URLTextEncoding&, CodePointIterator<CharacterType>);
     void copyASCIIStringUntil(const String&, size_t length);
     bool copyBaseWindowsDriveLetter(const URL&);
     StringView parsedDataView(size_t start, size_t length);
@@ -127,7 +126,7 @@
     void serializeIPv6(IPv6Address);
 
     enum class URLPart;
-    template<typename CharacterType> void copyURLPartsUntil(const URL& base, URLPart, const CodePointIterator<CharacterType>&, bool& isUTF8Encoding);
+    template<typename CharacterType> void copyURLPartsUntil(const URL& base, URLPart, const CodePointIterator<CharacterType>&, const URLTextEncoding*&);
     static size_t urlLengthUntilPart(const URL&, URLPart);
     void popPath();
     bool shouldPopPath(unsigned);

Modified: trunk/Source/WebCore/platform/text/TextEncoding.cpp (236564 => 236565)


--- trunk/Source/WebCore/platform/text/TextEncoding.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/text/TextEncoding.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -31,6 +31,7 @@
 #include "TextCodec.h"
 #include "TextEncodingRegistry.h"
 #include <unicode/unorm.h>
+#include <wtf/NeverDestroyed.h>
 #include <wtf/StdLibExtras.h>
 #include <wtf/text/CString.h>
 #include <wtf/text/StringView.h>
@@ -39,7 +40,7 @@
 
 static const TextEncoding& UTF7Encoding()
 {
-    static TextEncoding globalUTF7Encoding("UTF-7");
+    static NeverDestroyed<TextEncoding> globalUTF7Encoding("UTF-7");
     return globalUTF7Encoding;
 }
 
@@ -173,7 +174,7 @@
 // byte-based encoding and can contain 0x00. By extension, the same
 // should be done for UTF-32. In case of UTF-7, it is a byte-based encoding,
 // but it's fraught with problems and we'd rather steer clear of it.
-const TextEncoding& TextEncoding::encodingForFormSubmission() const
+const TextEncoding& TextEncoding::encodingForFormSubmissionOrURLParsing() const
 {
     if (isNonByteBasedEncoding() || isUTF7Encoding())
         return UTF8Encoding();
@@ -182,38 +183,38 @@
 
 const TextEncoding& ASCIIEncoding()
 {
-    static TextEncoding globalASCIIEncoding("ASCII");
+    static NeverDestroyed<TextEncoding> globalASCIIEncoding("ASCII");
     return globalASCIIEncoding;
 }
 
 const TextEncoding& Latin1Encoding()
 {
-    static TextEncoding globalLatin1Encoding("latin1");
+    static NeverDestroyed<TextEncoding> globalLatin1Encoding("latin1");
     return globalLatin1Encoding;
 }
 
 const TextEncoding& UTF16BigEndianEncoding()
 {
-    static TextEncoding globalUTF16BigEndianEncoding("UTF-16BE");
+    static NeverDestroyed<TextEncoding> globalUTF16BigEndianEncoding("UTF-16BE");
     return globalUTF16BigEndianEncoding;
 }
 
 const TextEncoding& UTF16LittleEndianEncoding()
 {
-    static TextEncoding globalUTF16LittleEndianEncoding("UTF-16LE");
+    static NeverDestroyed<TextEncoding> globalUTF16LittleEndianEncoding("UTF-16LE");
     return globalUTF16LittleEndianEncoding;
 }
 
 const TextEncoding& UTF8Encoding()
 {
-    static TextEncoding globalUTF8Encoding("UTF-8");
-    ASSERT(globalUTF8Encoding.isValid());
+    static NeverDestroyed<TextEncoding> globalUTF8Encoding("UTF-8");
+    ASSERT(globalUTF8Encoding.get().isValid());
     return globalUTF8Encoding;
 }
 
 const TextEncoding& WindowsLatin1Encoding()
 {
-    static TextEncoding globalWindowsLatin1Encoding("WinLatin-1");
+    static NeverDestroyed<TextEncoding> globalWindowsLatin1Encoding("WinLatin-1");
     return globalWindowsLatin1Encoding;
 }
 

Modified: trunk/Source/WebCore/platform/text/TextEncoding.h (236564 => 236565)


--- trunk/Source/WebCore/platform/text/TextEncoding.h	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/text/TextEncoding.h	2018-09-27 20:05:52 UTC (rev 236565)
@@ -25,12 +25,13 @@
 
 #pragma once
 
+#include "URL.h"
 #include <pal/text/UnencodableHandling.h>
 #include <wtf/text/WTFString.h>
 
 namespace WebCore {
 
-class TextEncoding {
+class TextEncoding : public URLTextEncoding {
 public:
     TextEncoding() = default;
     WEBCORE_EXPORT TextEncoding(const char* name);
@@ -43,11 +44,12 @@
     bool isJapanese() const;
 
     const TextEncoding& closestByteBasedEquivalent() const;
-    const TextEncoding& encodingForFormSubmission() const;
+    const TextEncoding& encodingForFormSubmissionOrURLParsing() const;
 
     WEBCORE_EXPORT String decode(const char*, size_t length, bool stopOnError, bool& sawError) const;
     String decode(const char*, size_t length) const;
-    Vector<uint8_t> encode(StringView, UnencodableHandling) const;
+    WEBCORE_EXPORT Vector<uint8_t> encode(StringView, UnencodableHandling) const;
+    Vector<uint8_t> encodeForURLParsing(StringView string) const final { return encode(string, UnencodableHandling::URLEncodedEntities); }
 
     UChar backslashAsCurrencySymbol() const;
     bool isByteBasedEncoding() const { return !isNonByteBasedEncoding(); }

Modified: trunk/Tools/ChangeLog (236564 => 236565)


--- trunk/Tools/ChangeLog	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Tools/ChangeLog	2018-09-27 20:05:52 UTC (rev 236565)
@@ -1,3 +1,14 @@
+2018-09-27  Alex Christensen  <[email protected]>
+
+        URLParser should use TextEncoding through an abstract class
+        https://bugs.webkit.org/show_bug.cgi?id=190027
+
+        Reviewed by Andy Estes.
+
+        * TestWebKitAPI/Tests/WebCore/URLParser.cpp:
+        (TestWebKitAPI::checkURL):
+        (TestWebKitAPI::TEST_F):
+
 2018-09-27  Ryan Haddad  <[email protected]>
 
         iOS Simulator bots should pass '--dedicated-simulators' to run-webkit-tests

Modified: trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp (236564 => 236565)


--- trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp	2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp	2018-09-27 20:05:52 UTC (rev 236565)
@@ -25,6 +25,7 @@
 
 #include "config.h"
 #include "WTFStringUtilities.h"
+#include <WebCore/TextEncoding.h>
 #include <WebCore/URLParser.h>
 #include <wtf/MainThread.h>
 #include <wtf/text/StringBuilder.h>
@@ -210,7 +211,7 @@
     checkRelativeURL(urlString, baseString, {"", "", "", "", 0, "", "", "", urlString});
 }
 
-static void checkURL(const String& urlString, const TextEncoding& encoding, const ExpectedParts& parts, TestTabs testTabs = TestTabs::Yes)
+static void checkURL(const String& urlString, const TextEncoding* encoding, const ExpectedParts& parts, TestTabs testTabs = TestTabs::Yes)
 {
     URLParser parser(urlString, { }, encoding);
     auto url = parser.result();
@@ -235,7 +236,7 @@
     }
 }
 
-static void checkURL(const String& urlString, const String& baseURLString, const TextEncoding& encoding, const ExpectedParts& parts, TestTabs testTabs = TestTabs::Yes)
+static void checkURL(const String& urlString, const String& baseURLString, const TextEncoding* encoding, const ExpectedParts& parts, TestTabs testTabs = TestTabs::Yes)
 {
     URLParser baseParser(baseURLString, { }, encoding);
     URLParser parser(urlString, baseParser.result(), encoding);
@@ -1285,37 +1286,37 @@
 
 TEST_F(URLParserTest, QueryEncoding)
 {
-    checkURL(utf16String(u"http://host?ß😍#ß😍"), UTF8Encoding(), {"http", "", "", "host", 0, "/", "%C3%9F%F0%9F%98%8D", "%C3%9F%F0%9F%98%8D", utf16String(u"http://host/?%C3%9F%F0%9F%98%8D#%C3%9F%F0%9F%98%8D")}, testTabsValueForSurrogatePairs);
+    checkURL(utf16String(u"http://host?ß😍#ß😍"), nullptr, {"http", "", "", "host", 0, "/", "%C3%9F%F0%9F%98%8D", "%C3%9F%F0%9F%98%8D", utf16String(u"http://host/?%C3%9F%F0%9F%98%8D#%C3%9F%F0%9F%98%8D")}, testTabsValueForSurrogatePairs);
 
     TextEncoding latin1(String("latin1"));
-    checkURL("http://host/?query with%20spaces", latin1, {"http", "", "", "host", 0, "/", "query%20with%20spaces", "", "http://host/?query%20with%20spaces"});
-    checkURL("http://host/?query", latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
-    checkURL("http://host/?\tquery", latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
-    checkURL("http://host/?q\tuery", latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
-    checkURL("http://host/?query with SpAcEs#fragment", latin1, {"http", "", "", "host", 0, "/", "query%20with%20SpAcEs", "fragment", "http://host/?query%20with%20SpAcEs#fragment"});
-    checkURL("http://host/?que\rry\t\r\n#fragment", latin1, {"http", "", "", "host", 0, "/", "query", "fragment", "http://host/?query#fragment"});
+    checkURL("http://host/?query with%20spaces", &latin1, {"http", "", "", "host", 0, "/", "query%20with%20spaces", "", "http://host/?query%20with%20spaces"});
+    checkURL("http://host/?query", &latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
+    checkURL("http://host/?\tquery", &latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
+    checkURL("http://host/?q\tuery", &latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
+    checkURL("http://host/?query with SpAcEs#fragment", &latin1, {"http", "", "", "host", 0, "/", "query%20with%20SpAcEs", "fragment", "http://host/?query%20with%20SpAcEs#fragment"});
+    checkURL("http://host/?que\rry\t\r\n#fragment", &latin1, {"http", "", "", "host", 0, "/", "query", "fragment", "http://host/?query#fragment"});
 
     TextEncoding unrecognized(String("unrecognized invalid encoding name"));
-    checkURL("http://host/?query", unrecognized, {"http", "", "", "host", 0, "/", "", "", "http://host/?"});
-    checkURL("http://host/?", unrecognized, {"http", "", "", "host", 0, "/", "", "", "http://host/?"});
+    checkURL("http://host/?query", &unrecognized, {"http", "", "", "host", 0, "/", "", "", "http://host/?"});
+    checkURL("http://host/?", &unrecognized, {"http", "", "", "host", 0, "/", "", "", "http://host/?"});
 
     TextEncoding iso88591(String("ISO-8859-1"));
     String withUmlauts = utf16String<4>({0xDC, 0x430, 0x451, '\0'});
-    checkURL(makeString("ws://host/path?", withUmlauts), iso88591, {"ws", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "ws://host/path?%C3%9C%D0%B0%D1%91"});
-    checkURL(makeString("wss://host/path?", withUmlauts), iso88591, {"wss", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "wss://host/path?%C3%9C%D0%B0%D1%91"});
-    checkURL(makeString("asdf://host/path?", withUmlauts), iso88591, {"asdf", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "asdf://host/path?%C3%9C%D0%B0%D1%91"});
-    checkURL(makeString("https://host/path?", withUmlauts), iso88591, {"https", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "", "https://host/path?%DC%26%231072%3B%26%231105%3B"});
-    checkURL(makeString("gopher://host/path?", withUmlauts), iso88591, {"gopher", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "", "gopher://host/path?%DC%26%231072%3B%26%231105%3B"});
-    checkURL(makeString("/path?", withUmlauts, "#fragment"), "ws://example.com/", iso88591, {"ws", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "ws://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
-    checkURL(makeString("/path?", withUmlauts, "#fragment"), "wss://example.com/", iso88591, {"wss", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "wss://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
-    checkURL(makeString("/path?", withUmlauts, "#fragment"), "asdf://example.com/", iso88591, {"asdf", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "asdf://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
-    checkURL(makeString("/path?", withUmlauts, "#fragment"), "https://example.com/", iso88591, {"https", "", "", "example.com", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "https://example.com/path?%DC%26%231072%3B%26%231105%3B#fragment"});
-    checkURL(makeString("/path?", withUmlauts, "#fragment"), "gopher://example.com/", iso88591, {"gopher", "", "", "example.com", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "gopher://example.com/path?%DC%26%231072%3B%26%231105%3B#fragment"});
-    checkURL(makeString("gopher://host/path?", withUmlauts, "#fragment"), "asdf://example.com/?doesntmatter", iso88591, {"gopher", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "gopher://host/path?%DC%26%231072%3B%26%231105%3B#fragment"});
-    checkURL(makeString("asdf://host/path?", withUmlauts, "#fragment"), "http://example.com/?doesntmatter", iso88591, {"asdf", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "asdf://host/path?%C3%9C%D0%B0%D1%91#fragment"});
+    checkURL(makeString("ws://host/path?", withUmlauts), &iso88591, {"ws", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "ws://host/path?%C3%9C%D0%B0%D1%91"});
+    checkURL(makeString("wss://host/path?", withUmlauts), &iso88591, {"wss", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "wss://host/path?%C3%9C%D0%B0%D1%91"});
+    checkURL(makeString("asdf://host/path?", withUmlauts), &iso88591, {"asdf", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "asdf://host/path?%C3%9C%D0%B0%D1%91"});
+    checkURL(makeString("https://host/path?", withUmlauts), &iso88591, {"https", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "", "https://host/path?%DC%26%231072%3B%26%231105%3B"});
+    checkURL(makeString("gopher://host/path?", withUmlauts), &iso88591, {"gopher", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "", "gopher://host/path?%DC%26%231072%3B%26%231105%3B"});
+    checkURL(makeString("/path?", withUmlauts, "#fragment"), "ws://example.com/", &iso88591, {"ws", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "ws://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
+    checkURL(makeString("/path?", withUmlauts, "#fragment"), "wss://example.com/", &iso88591, {"wss", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "wss://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
+    checkURL(makeString("/path?", withUmlauts, "#fragment"), "asdf://example.com/", &iso88591, {"asdf", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "asdf://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
+    checkURL(makeString("/path?", withUmlauts, "#fragment"), "https://example.com/", &iso88591, {"https", "", "", "example.com", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "https://example.com/path?%DC%26%231072%3B%26%231105%3B#fragment"});
+    checkURL(makeString("/path?", withUmlauts, "#fragment"), "gopher://example.com/", &iso88591, {"gopher", "", "", "example.com", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "gopher://example.com/path?%DC%26%231072%3B%26%231105%3B#fragment"});
+    checkURL(makeString("gopher://host/path?", withUmlauts, "#fragment"), "asdf://example.com/?doesntmatter", &iso88591, {"gopher", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "gopher://host/path?%DC%26%231072%3B%26%231105%3B#fragment"});
+    checkURL(makeString("asdf://host/path?", withUmlauts, "#fragment"), "http://example.com/?doesntmatter", &iso88591, {"asdf", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "asdf://host/path?%C3%9C%D0%B0%D1%91#fragment"});
 
-    checkURL("http://host/pa'th?qu'ery#fr'agment", UTF8Encoding(), {"http", "", "", "host", 0, "/pa'th", "qu%27ery", "fr'agment", "http://host/pa'th?qu%27ery#fr'agment"});
-    checkURL("asdf://host/pa'th?qu'ery#fr'agment", UTF8Encoding(), {"asdf", "", "", "host", 0, "/pa'th", "qu'ery", "fr'agment", "asdf://host/pa'th?qu'ery#fr'agment"});
+    checkURL("http://host/pa'th?qu'ery#fr'agment", nullptr, {"http", "", "", "host", 0, "/pa'th", "qu%27ery", "fr'agment", "http://host/pa'th?qu%27ery#fr'agment"});
+    checkURL("asdf://host/pa'th?qu'ery#fr'agment", nullptr, {"asdf", "", "", "host", 0, "/pa'th", "qu'ery", "fr'agment", "asdf://host/pa'th?qu'ery#fr'agment"});
     // FIXME: Add more tests with other encodings and things like non-ascii characters, emoji and unmatched surrogate pairs.
 }
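
For readers following the QueryEncoding expectations above, here is a minimal standalone sketch of the pattern this change introduces: the parser sees only a small abstract interface, and a null pointer selects the UTF-8 path. It is not WebKit code. QueryEncoder, Latin1Encoder, appendUTF8 and percentEncodeQuery are hypothetical names, std::u32string stands in for WTF strings, and percent-encoding every non-alphanumeric byte is a simplifying assumption rather than the real query encode set.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the abstract URLTextEncoding interface.
class QueryEncoder {
public:
    virtual ~QueryEncoder() = default;
    virtual std::vector<uint8_t> encodeForURLParsing(const std::u32string&) const = 0;
};

// Hypothetical Latin-1 encoder. Code points above 0xFF cannot be represented,
// so they are written as decimal numeric entities ("&#1072;"), which is the
// behavior the ISO-8859-1 expectations in the test imply.
class Latin1Encoder final : public QueryEncoder {
public:
    std::vector<uint8_t> encodeForURLParsing(const std::u32string& codePoints) const final
    {
        std::vector<uint8_t> bytes;
        for (char32_t c : codePoints) {
            if (c <= 0xFF) {
                bytes.push_back(static_cast<uint8_t>(c));
                continue;
            }
            std::string entity = "&#" + std::to_string(static_cast<unsigned long>(c)) + ";";
            bytes.insert(bytes.end(), entity.begin(), entity.end());
        }
        return bytes;
    }
};

// UTF-8 path used when no encoder is supplied (assumes valid code points).
static void appendUTF8(std::vector<uint8_t>& out, char32_t c)
{
    if (c < 0x80)
        out.push_back(static_cast<uint8_t>(c));
    else if (c < 0x800) {
        out.push_back(static_cast<uint8_t>(0xC0 | (c >> 6)));
        out.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
    } else if (c < 0x10000) {
        out.push_back(static_cast<uint8_t>(0xE0 | (c >> 12)));
        out.push_back(static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
    } else {
        out.push_back(static_cast<uint8_t>(0xF0 | (c >> 18)));
        out.push_back(static_cast<uint8_t>(0x80 | ((c >> 12) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
    }
}

// A null encoder means "UTF-8", mirroring how the updated tests pass nullptr.
// Simplification: every byte that is not an ASCII letter or digit is escaped.
std::string percentEncodeQuery(const std::u32string& query, const QueryEncoder* encoder)
{
    std::vector<uint8_t> bytes;
    if (encoder)
        bytes = encoder->encodeForURLParsing(query); // the one virtual call
    else {
        for (char32_t c : query)
            appendUTF8(bytes, c);
    }

    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (uint8_t b : bytes) {
        bool alnum = (b >= '0' && b <= '9') || (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z');
        if (alnum)
            out += static_cast<char>(b);
        else {
            out += '%';
            out += hex[b >> 4];
            out += hex[b & 0xF];
        }
    }
    return out;
}
```

Under these assumptions, the code points U+00DC, U+0430, U+0451 produce the same two query strings the test expects: the UTF-8 form when the encoder is null and the entity-escaped form with the Latin-1 encoder. Taking a pointer rather than a reference keeps the common UTF-8 case free of any encoder object.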
 