Log Message
URLParser should use TextEncoding through an abstract class
https://bugs.webkit.org/show_bug.cgi?id=190027

Reviewed by Andy Estes.

Source/WebCore:

URLParser uses TextEncoding for one call to encode, which is only used for encoding the query of URLs in documents with non-UTF encodings.
There are 3 call sites that specify the TextEncoding to use from the Document, and even those call sites use a UTF encoding most of the time.
All other URL parsing is done using a well-optimized path which assumes UTF-8 encoding and uses macros from ICU headers, not a TextEncoding.
Moving the logic in this way breaks URL and URLParser's dependency on TextEncoding, which makes it possible to use in a lower-level project
without also moving TextEncoding, TextCodec, TextCodecICU, ThreadGlobalData, and the rest of WebCore and JavaScriptCore.

There is no observable change in behavior. There is now one virtual function call in a code path in URLParser that is not performance-sensitive,
and TextEncodings now have a vtable, which uses a few more bytes of memory total for WebKit.

* css/parser/CSSParserContext.h:
(WebCore::CSSParserContext::completeURL const):
* css/parser/CSSParserIdioms.cpp:
(WebCore::completeURL):
* dom/Document.cpp:
(WebCore::Document::completeURL const):
* html/HTMLBaseElement.cpp:
(WebCore::HTMLBaseElement::href const):
Move the call to encodingForFormSubmission from the URL constructor to the 3 call sites that specify the encoding from the Document.
* loader/FormSubmission.cpp:
(WebCore::FormSubmission::create):
* loader/TextResourceDecoder.cpp:
(WebCore::TextResourceDecoder::encodingForURLParsing):
* loader/TextResourceDecoder.h:
* platform/URL.cpp:
(WebCore::URL::URL):
* platform/URL.h:
(WebCore::URLTextEncoding::~URLTextEncoding):
* platform/URLParser.cpp:
(WebCore::URLParser::encodeNonUTF8Query):
(WebCore::URLParser::copyURLPartsUntil):
(WebCore::URLParser::URLParser):
(WebCore::URLParser::parse):
(WebCore::URLParser::encodeQuery): Deleted.
A pointer replaces the boolean isUTF8Encoding and the TextEncoding& which had a default value of UTF8Encoding.
Now the pointer being null means that we use UTF8, and the pointer being non-null means we use that encoding.
* platform/URLParser.h:
(WebCore::URLParser::URLParser):
* platform/text/TextEncoding.cpp:
(WebCore::UTF7Encoding):
(WebCore::TextEncoding::encodingForFormSubmissionOrURLParsing const):
(WebCore::ASCIIEncoding):
(WebCore::Latin1Encoding):
(WebCore::UTF16BigEndianEncoding):
(WebCore::UTF16LittleEndianEncoding):
(WebCore::UTF8Encoding):
(WebCore::WindowsLatin1Encoding):
(WebCore::TextEncoding::encodingForFormSubmission const): Deleted.
Use NeverDestroyed because TextEncoding now has a virtual destructor.
* platform/text/TextEncoding.h:
Rename encodingForFormSubmission to encodingForFormSubmissionOrURLParsing to make it more clear that we are intentionally using it for both.

Tools:

* TestWebKitAPI/Tests/WebCore/URLParser.cpp:
(TestWebKitAPI::checkURL):
(TestWebKitAPI::TEST_F):
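The dependency inversion described in the log message can be sketched outside WebKit. The following is a minimal illustration only, not the WebKit code: QueryEncoder stands in for URLTextEncoding, and Latin1Encoder, appendUTF8, percentEncode and encodeQuery are hypothetical names. The Latin-1 mapping and the set of escaped bytes are deliberately simplified assumptions. It shows the convention that a null encoder pointer selects the UTF-8 path and a non-null pointer costs one virtual call.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for URLTextEncoding: the single operation the parser needs.
class QueryEncoder {
public:
    virtual std::vector<uint8_t> encodeForURLParsing(std::u32string_view) const = 0;
    virtual ~QueryEncoder() = default;
};

// Hypothetical higher-level encoding. Assumption: plain ISO-8859-1, and
// unencodable code points become '?' (the real code uses URL-encoded entities).
class Latin1Encoder final : public QueryEncoder {
public:
    std::vector<uint8_t> encodeForURLParsing(std::u32string_view text) const final
    {
        std::vector<uint8_t> bytes;
        for (char32_t c : text)
            bytes.push_back(c <= 0xFF ? static_cast<uint8_t>(c) : static_cast<uint8_t>('?'));
        return bytes;
    }
};

// Append the UTF-8 encoding of one code point.
static void appendUTF8(std::vector<uint8_t>& out, char32_t c)
{
    if (c < 0x80)
        out.push_back(static_cast<uint8_t>(c));
    else if (c < 0x800) {
        out.push_back(static_cast<uint8_t>(0xC0 | (c >> 6)));
        out.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
    } else if (c < 0x10000) {
        out.push_back(static_cast<uint8_t>(0xE0 | (c >> 12)));
        out.push_back(static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
    } else {
        out.push_back(static_cast<uint8_t>(0xF0 | (c >> 18)));
        out.push_back(static_cast<uint8_t>(0x80 | ((c >> 12) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
    }
}

// Percent-encode bytes. Assumption: a simplified escape set, not the URL Standard's.
static std::string percentEncode(const std::vector<uint8_t>& bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (uint8_t b : bytes) {
        bool mustEscape = b <= 0x20 || b >= 0x7F || b == '"' || b == '#' || b == '<' || b == '>';
        if (mustEscape) {
            out += '%';
            out += hex[b >> 4];
            out += hex[b & 0xF];
        } else
            out += static_cast<char>(b);
    }
    return out;
}

// Null encoder means UTF-8; non-null means "use that encoding" via one virtual call.
std::string encodeQuery(std::u32string_view query, const QueryEncoder* nonUTF8QueryEncoding)
{
    std::vector<uint8_t> bytes;
    if (nonUTF8QueryEncoding)
        bytes = nonUTF8QueryEncoding->encodeForURLParsing(query);
    else {
        for (char32_t c : query)
            appendUTF8(bytes, c);
    }
    return percentEncode(bytes);
}
```

Under these assumptions, the low-level encodeQuery only knows about the abstract QueryEncoder, so the concrete encoding class can live in a higher-level library, which is the layering the change is after.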
Modified Paths
- trunk/Source/WebCore/ChangeLog
- trunk/Source/WebCore/css/parser/CSSParserContext.h
- trunk/Source/WebCore/css/parser/CSSParserIdioms.cpp
- trunk/Source/WebCore/dom/Document.cpp
- trunk/Source/WebCore/html/HTMLBaseElement.cpp
- trunk/Source/WebCore/loader/FormSubmission.cpp
- trunk/Source/WebCore/loader/TextResourceDecoder.cpp
- trunk/Source/WebCore/loader/TextResourceDecoder.h
- trunk/Source/WebCore/platform/URL.cpp
- trunk/Source/WebCore/platform/URL.h
- trunk/Source/WebCore/platform/URLParser.cpp
- trunk/Source/WebCore/platform/URLParser.h
- trunk/Source/WebCore/platform/text/TextEncoding.cpp
- trunk/Source/WebCore/platform/text/TextEncoding.h
- trunk/Tools/ChangeLog
- trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp
Diff
Modified: trunk/Source/WebCore/ChangeLog (236564 => 236565)
--- trunk/Source/WebCore/ChangeLog 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/ChangeLog 2018-09-27 20:05:52 UTC (rev 236565)
@@ -1,3 +1,61 @@
+2018-09-27 Alex Christensen <[email protected]>
+
+ URLParser should use TextEncoding through an abstract class
+ https://bugs.webkit.org/show_bug.cgi?id=190027
+
+ Reviewed by Andy Estes.
+
+ URLParser uses TextEncoding for one call to encode, which is only used for encoding the query of URLs in documents with non-UTF encodings.
+ There are 3 call sites that specify the TextEncoding to use from the Document, and even those call sites use a UTF encoding most of the time.
+ All other URL parsing is done using a well-optimized path which assumes UTF-8 encoding and uses macros from ICU headers, not a TextEncoding.
+ Moving the logic in this way breaks URL and URLParser's dependency on TextEncoding, which makes it possible to use in a lower-level project
+ without also moving TextEncoding, TextCodec, TextCodecICU, ThreadGlobalData, and the rest of WebCore and JavaScriptCore.
+
+ There is no observable change in behavior. There is now one virtual function call in a code path in URLParser that is not performance-sensitive,
+ and TextEncodings now have a vtable, which uses a few more bytes of memory total for WebKit.
+
+ * css/parser/CSSParserContext.h:
+ (WebCore::CSSParserContext::completeURL const):
+ * css/parser/CSSParserIdioms.cpp:
+ (WebCore::completeURL):
+ * dom/Document.cpp:
+ (WebCore::Document::completeURL const):
+ * html/HTMLBaseElement.cpp:
+ (WebCore::HTMLBaseElement::href const):
+ Move the call to encodingForFormSubmission from the URL constructor to the 3 call sites that specify the encoding from the Document.
+ * loader/FormSubmission.cpp:
+ (WebCore::FormSubmission::create):
+ * loader/TextResourceDecoder.cpp:
+ (WebCore::TextResourceDecoder::encodingForURLParsing):
+ * loader/TextResourceDecoder.h:
+ * platform/URL.cpp:
+ (WebCore::URL::URL):
+ * platform/URL.h:
+ (WebCore::URLTextEncoding::~URLTextEncoding):
+ * platform/URLParser.cpp:
+ (WebCore::URLParser::encodeNonUTF8Query):
+ (WebCore::URLParser::copyURLPartsUntil):
+ (WebCore::URLParser::URLParser):
+ (WebCore::URLParser::parse):
+ (WebCore::URLParser::encodeQuery): Deleted.
+ A pointer replaces the boolean isUTF8Encoding and the TextEncoding& which had a default value of UTF8Encoding.
+ Now the pointer being null means that we use UTF8, and the pointer being non-null means we use that encoding.
+ * platform/URLParser.h:
+ (WebCore::URLParser::URLParser):
+ * platform/text/TextEncoding.cpp:
+ (WebCore::UTF7Encoding):
+ (WebCore::TextEncoding::encodingForFormSubmissionOrURLParsing const):
+ (WebCore::ASCIIEncoding):
+ (WebCore::Latin1Encoding):
+ (WebCore::UTF16BigEndianEncoding):
+ (WebCore::UTF16LittleEndianEncoding):
+ (WebCore::UTF8Encoding):
+ (WebCore::WindowsLatin1Encoding):
+ (WebCore::TextEncoding::encodingForFormSubmission const): Deleted.
+ Use NeverDestroyed because TextEncoding now has a virtual destructor.
+ * platform/text/TextEncoding.h:
+ Rename encodingForFormSubmission to encodingForFormSubmissionOrURLParsing to make it more clear that we are intentionally using it for both.
+
2018-09-27 John Wilander <[email protected]>
Resource Load Statistics: Remove temporary compatibility fix for auto-dismiss popups
Modified: trunk/Source/WebCore/css/parser/CSSParserContext.h (236564 => 236565)
--- trunk/Source/WebCore/css/parser/CSSParserContext.h 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/css/parser/CSSParserContext.h 2018-09-27 20:05:52 UTC (rev 236565)
@@ -69,7 +69,9 @@
return URL();
if (charset.isEmpty())
return URL(baseURL, url);
- return URL(baseURL, url, TextEncoding(charset));
+ TextEncoding encoding(charset);
+ auto& encodingForURLParsing = encoding.encodingForFormSubmissionOrURLParsing();
+ return URL(baseURL, url, encodingForURLParsing == UTF8Encoding() ? nullptr : &encodingForURLParsing);
}
};
Modified: trunk/Source/WebCore/css/parser/CSSParserIdioms.cpp (236564 => 236565)
--- trunk/Source/WebCore/css/parser/CSSParserIdioms.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/css/parser/CSSParserIdioms.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -47,11 +47,7 @@
URL completeURL(const CSSParserContext& context, const String& url)
{
- if (url.isNull())
- return URL();
- if (context.charset.isEmpty())
- return URL(context.baseURL, url);
- return URL(context.baseURL, url, context.charset);
+ return context.completeURL(url);
}
} // namespace WebCore
Modified: trunk/Source/WebCore/dom/Document.cpp (236564 => 236565)
--- trunk/Source/WebCore/dom/Document.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/dom/Document.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -4894,7 +4894,7 @@
const URL& baseURL = ((baseURLOverride.isEmpty() || baseURLOverride == blankURL()) && parentDocument()) ? parentDocument()->baseURL() : baseURLOverride;
if (!m_decoder)
return URL(baseURL, url);
- return URL(baseURL, url, m_decoder->encoding());
+ return URL(baseURL, url, m_decoder->encodingForURLParsing());
}
URL Document::completeURL(const String& url) const
Modified: trunk/Source/WebCore/html/HTMLBaseElement.cpp (236564 => 236565)
--- trunk/Source/WebCore/html/HTMLBaseElement.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/html/HTMLBaseElement.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -89,9 +89,8 @@
if (attributeValue.isNull())
return document().url();
- URL url = !document().decoder() ?
- URL(document().url(), stripLeadingAndTrailingHTMLSpaces(attributeValue)) :
- URL(document().url(), stripLeadingAndTrailingHTMLSpaces(attributeValue), document().decoder()->encoding());
+ auto* encoding = document().decoder() ? document().decoder()->encodingForURLParsing() : nullptr;
+ URL url(document().url(), stripLeadingAndTrailingHTMLSpaces(attributeValue), encoding);
if (!url.isValid())
return URL();
Modified: trunk/Source/WebCore/loader/FormSubmission.cpp (236564 => 236565)
--- trunk/Source/WebCore/loader/FormSubmission.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/loader/FormSubmission.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -175,7 +175,7 @@
}
auto dataEncoding = isMailtoForm ? UTF8Encoding() : encodingFromAcceptCharset(copiedAttributes.acceptCharset(), document);
- auto domFormData = DOMFormData::create(dataEncoding.encodingForFormSubmission());
+ auto domFormData = DOMFormData::create(dataEncoding.encodingForFormSubmissionOrURLParsing());
StringPairVector formValues;
bool containsPasswordData = false;
Modified: trunk/Source/WebCore/loader/TextResourceDecoder.cpp (236564 => 236565)
--- trunk/Source/WebCore/loader/TextResourceDecoder.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/loader/TextResourceDecoder.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -659,4 +659,16 @@
return decoded + flush();
}
+const TextEncoding* TextResourceDecoder::encodingForURLParsing()
+{
+ // For UTF-{7,16,32}, we want to use UTF-8 for the query part as
+ // we do when submitting a form. A form with GET method
+ // has its contents added to a URL as query params and it makes sense
+ // to be consistent.
+ auto& encoding = m_encoding.encodingForFormSubmissionOrURLParsing();
+ if (encoding == UTF8Encoding())
+ return nullptr;
+ return &encoding;
}
+
+}
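The normalization that encodingForURLParsing performs above can be illustrated in isolation. This is a hedged sketch, not the WebKit implementation: the Encoding struct, utf8Encoding, shouldFallBackToUTF8 and the name-prefix checks are all hypothetical simplifications. It shows how a document encoding is first mapped to the encoding used for form submission or URL parsing, and then collapsed to a null pointer whenever the result is UTF-8.

```cpp
#include <string>

// Hypothetical, heavily simplified stand-in for an encoding object.
struct Encoding {
    std::string name;
    bool operator==(const Encoding& other) const { return name == other.name; }
};

const Encoding& utf8Encoding()
{
    static const Encoding encoding { "UTF-8" };
    return encoding;
}

// Assumption: non-byte-based encodings and UTF-7 are recognized by name here.
static bool shouldFallBackToUTF8(const Encoding& encoding)
{
    return encoding.name.rfind("UTF-16", 0) == 0
        || encoding.name.rfind("UTF-32", 0) == 0
        || encoding.name == "UTF-7";
}

// UTF-{7,16,32} documents use UTF-8 for queries, as form submission does.
const Encoding& encodingForFormSubmissionOrURLParsing(const Encoding& encoding)
{
    return shouldFallBackToUTF8(encoding) ? utf8Encoding() : encoding;
}

// Returns nullptr when the parser's default UTF-8 path applies.
const Encoding* encodingForURLParsing(const Encoding& documentEncoding)
{
    const Encoding& encoding = encodingForFormSubmissionOrURLParsing(documentEncoding);
    if (encoding == utf8Encoding())
        return nullptr;
    return &encoding;
}
```

With this shape, callers pass the returned pointer straight through, and only documents in a byte-based non-UTF-8 encoding reach the slower non-UTF-8 query path.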
Modified: trunk/Source/WebCore/loader/TextResourceDecoder.h (236564 => 236565)
--- trunk/Source/WebCore/loader/TextResourceDecoder.h 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/loader/TextResourceDecoder.h 2018-09-27 20:05:52 UTC (rev 236565)
@@ -48,6 +48,7 @@
void setEncoding(const TextEncoding&, EncodingSource);
const TextEncoding& encoding() const { return m_encoding; }
+ const TextEncoding* encodingForURLParsing();
bool hasEqualEncodingForCharset(const String& charset) const;
Modified: trunk/Source/WebCore/platform/URL.cpp (236564 => 236565)
--- trunk/Source/WebCore/platform/URL.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/URL.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -103,22 +103,12 @@
#endif
}
-URL::URL(const URL& base, const String& relative)
+URL::URL(const URL& base, const String& relative, const URLTextEncoding* encoding)
{
- URLParser parser(relative, base);
+ URLParser parser(relative, base, encoding);
*this = parser.result();
}
-URL::URL(const URL& base, const String& relative, const TextEncoding& encoding)
-{
- // For UTF-{7,16,32}, we want to use UTF-8 for the query part as
- // we do when submitting a form. A form with GET method
- // has its contents added to a URL as query params and it makes sense
- // to be consistent.
- URLParser parser(relative, base, encoding.encodingForFormSubmission());
- *this = parser.result();
-}
-
static bool shouldTrimFromURL(UChar c)
{
// Browsers ignore leading/trailing whitespace and control
Modified: trunk/Source/WebCore/platform/URL.h (236564 => 236565)
--- trunk/Source/WebCore/platform/URL.h 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/URL.h 2018-09-27 20:05:52 UTC (rev 236565)
@@ -47,7 +47,12 @@
namespace WebCore {
-class TextEncoding;
+class URLTextEncoding {
+public:
+ virtual Vector<uint8_t> encodeForURLParsing(StringView) const = 0;
+ virtual ~URLTextEncoding() { };
+};
+
struct URLHash;
enum ParsedURLStringTag { ParsedURLString };
@@ -65,14 +70,13 @@
bool isHashTableDeletedValue() const { return string().isHashTableDeletedValue(); }
// Resolves the relative URL with the given base URL. If provided, the
- // TextEncoding is used to encode non-ASCII characers. The base URL can be
+ // URLTextEncoding is used to encode non-ASCII characers. The base URL can be
// null or empty, in which case the relative URL will be interpreted as
// absolute.
// FIXME: If the base URL is invalid, this always creates an invalid
// URL. Instead I think it would be better to treat all invalid base URLs
// the same way we treate null and empty base URLs.
- WEBCORE_EXPORT URL(const URL& base, const String& relative);
- URL(const URL& base, const String& relative, const TextEncoding&);
+ WEBCORE_EXPORT URL(const URL& base, const String& relative, const URLTextEncoding* = nullptr);
WEBCORE_EXPORT static URL fakeURLWithRelativePart(const String&);
WEBCORE_EXPORT static URL fileURLWithFileSystemPath(const String&);
@@ -208,7 +212,6 @@
friend class URLParser;
WEBCORE_EXPORT void invalidate();
static bool protocolIs(const String&, const char*);
- void init(const URL&, const String&, const TextEncoding&);
void copyToBuffer(Vector<char, 512>& buffer) const;
unsigned hostStart() const;
@@ -303,6 +306,7 @@
// encoding (defaulting to UTF-8 otherwise). DANGER: If the URL has "%00"
// in it, the resulting string will have embedded null characters!
WEBCORE_EXPORT String decodeURLEscapeSequences(const String&);
+class TextEncoding;
String decodeURLEscapeSequences(const String&, const TextEncoding&);
// FIXME: This is a wrong concept to expose, different parts of a URL need different escaping per the URL Standard.
Modified: trunk/Source/WebCore/platform/URLParser.cpp (236564 => 236565)
--- trunk/Source/WebCore/platform/URLParser.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/URLParser.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -618,9 +618,9 @@
}
template<typename CharacterType>
-void URLParser::encodeQuery(const Vector<UChar>& source, const TextEncoding& encoding, CodePointIterator<CharacterType> iterator)
+void URLParser::encodeNonUTF8Query(const Vector<UChar>& source, const URLTextEncoding& encoding, CodePointIterator<CharacterType> iterator)
{
- auto encoded = encoding.encode(StringView(source.data(), source.size()), UnencodableHandling::URLEncodedEntities);
+ auto encoded = encoding.encodeForURLParsing(StringView(source.data(), source.size()));
auto* data = encoded.data();
size_t length = encoded.size();
@@ -880,7 +880,7 @@
}
template<typename CharacterType>
-void URLParser::copyURLPartsUntil(const URL& base, URLPart part, const CodePointIterator<CharacterType>& iterator, bool& isUTF8Encoding)
+void URLParser::copyURLPartsUntil(const URL& base, URLPart part, const CodePointIterator<CharacterType>& iterator, const URLTextEncoding*& nonUTF8QueryEncoding)
{
syntaxViolation(iterator);
@@ -919,7 +919,7 @@
switch (scheme(StringView(m_asciiBuffer.data(), m_url.m_schemeEnd))) {
case Scheme::WS:
case Scheme::WSS:
- isUTF8Encoding = true;
+ nonUTF8QueryEncoding = nullptr;
m_urlIsSpecial = true;
return;
case Scheme::File:
@@ -933,7 +933,7 @@
return;
case Scheme::NonSpecial:
m_urlIsSpecial = false;
- isUTF8Encoding = true;
+ nonUTF8QueryEncoding = nullptr;
return;
}
ASSERT_NOT_REACHED();
@@ -1152,7 +1152,7 @@
return iterator.codeUnitsSince(reinterpret_cast<const CharacterType*>(m_inputBegin));
}
-URLParser::URLParser(const String& input, const URL& base, const TextEncoding& encoding)
+URLParser::URLParser(const String& input, const URL& base, const URLTextEncoding* nonUTF8QueryEncoding)
: m_inputString(input)
{
if (input.isNull()) {
@@ -1165,10 +1165,10 @@
if (input.is8Bit()) {
m_inputBegin = input.characters8();
- parse(input.characters8(), input.length(), base, encoding);
+ parse(input.characters8(), input.length(), base, nonUTF8QueryEncoding);
} else {
m_inputBegin = input.characters16();
- parse(input.characters16(), input.length(), base, encoding);
+ parse(input.characters16(), input.length(), base, nonUTF8QueryEncoding);
}
ASSERT(!m_url.m_isValid
@@ -1179,7 +1179,7 @@
#if !ASSERT_DISABLED
if (!m_didSeeSyntaxViolation) {
// Force a syntax violation at the beginning to make sure we get the same result.
- URLParser parser(makeString(" ", input), base, encoding);
+ URLParser parser(makeString(" ", input), base, nonUTF8QueryEncoding);
URL parsed = parser.result();
if (parsed.isValid())
ASSERT(allValuesEqual(parser.result(), m_url));
@@ -1188,13 +1188,12 @@
}
template<typename CharacterType>
-void URLParser::parse(const CharacterType* input, const unsigned length, const URL& base, const TextEncoding& encoding)
+void URLParser::parse(const CharacterType* input, const unsigned length, const URL& base, const URLTextEncoding* nonUTF8QueryEncoding)
{
- URL_PARSER_LOG("Parsing URL <%s> base <%s> encoding <%s>", String(input, length).utf8().data(), base.string().utf8().data(), encoding.name());
+ URL_PARSER_LOG("Parsing URL <%s> base <%s>", String(input, length).utf8().data(), base.string().utf8().data());
m_url = { };
ASSERT(m_asciiBuffer.isEmpty());
-
- bool isUTF8Encoding = encoding == UTF8Encoding();
+
Vector<UChar> queryBuffer;
unsigned endIndex = length;
@@ -1287,7 +1286,7 @@
break;
case Scheme::WS:
case Scheme::WSS:
- isUTF8Encoding = true;
+ nonUTF8QueryEncoding = nullptr;
m_urlIsSpecial = true;
if (base.protocolIs(urlScheme))
state = State::SpecialRelativeOrAuthority;
@@ -1309,7 +1308,7 @@
++c;
break;
case Scheme::NonSpecial:
- isUTF8Encoding = true;
+ nonUTF8QueryEncoding = nullptr;
auto maybeSlash = c;
advance(maybeSlash);
if (!maybeSlash.atEnd() && *maybeSlash == '/') {
@@ -1353,7 +1352,7 @@
return;
}
if (base.m_cannotBeABaseURL && *c == '#') {
- copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
state = State::Fragment;
appendToASCIIBuffer('#');
++c;
@@ -1363,7 +1362,7 @@
state = State::Relative;
break;
}
- copyURLPartsUntil(base, URLPart::SchemeEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::SchemeEnd, c, nonUTF8QueryEncoding);
appendToASCIIBuffer(':');
state = State::File;
break;
@@ -1413,24 +1412,23 @@
++c;
break;
case '?':
- copyURLPartsUntil(base, URLPart::PathEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::PathEnd, c, nonUTF8QueryEncoding);
appendToASCIIBuffer('?');
++c;
- if (isUTF8Encoding)
- state = State::UTF8Query;
- else {
+ if (nonUTF8QueryEncoding) {
queryBegin = c;
state = State::NonUTF8Query;
- }
+ } else
+ state = State::UTF8Query;
break;
case '#':
- copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
appendToASCIIBuffer('#');
state = State::Fragment;
++c;
break;
default:
- copyURLPartsUntil(base, URLPart::PathAfterLastSlash, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::PathAfterLastSlash, c, nonUTF8QueryEncoding);
if (currentPosition(c) && parsedDataView(currentPosition(c) - 1) != '/') {
appendToASCIIBuffer('/');
m_url.m_pathAfterLastSlash = currentPosition(c);
@@ -1443,7 +1441,7 @@
LOG_STATE("RelativeSlash");
if (*c == '/' || *c == '\\') {
++c;
- copyURLPartsUntil(base, URLPart::SchemeEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::SchemeEnd, c, nonUTF8QueryEncoding);
appendToASCIIBuffer("://", 3);
if (m_urlIsSpecial)
state = State::SpecialAuthorityIgnoreSlashes;
@@ -1453,7 +1451,7 @@
authorityOrHostBegin = c;
}
} else {
- copyURLPartsUntil(base, URLPart::PortEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::PortEnd, c, nonUTF8QueryEncoding);
appendToASCIIBuffer('/');
m_url.m_pathAfterLastSlash = base.m_hostEnd + base.m_portLength + 1;
state = State::Path;
@@ -1584,7 +1582,7 @@
case '?':
syntaxViolation(c);
if (base.isValid() && base.protocolIs("file")) {
- copyURLPartsUntil(base, URLPart::PathEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::PathEnd, c, nonUTF8QueryEncoding);
appendToASCIIBuffer('?');
++c;
} else {
@@ -1598,17 +1596,16 @@
m_url.m_pathAfterLastSlash = m_url.m_userStart + 1;
m_url.m_pathEnd = m_url.m_pathAfterLastSlash;
}
- if (isUTF8Encoding)
- state = State::UTF8Query;
- else {
+ if (nonUTF8QueryEncoding) {
queryBegin = c;
state = State::NonUTF8Query;
- }
+ } else
+ state = State::UTF8Query;
break;
case '#':
syntaxViolation(c);
if (base.isValid() && base.protocolIs("file")) {
- copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
appendToASCIIBuffer('#');
} else {
appendToASCIIBuffer("///#", 4);
@@ -1627,7 +1624,7 @@
default:
syntaxViolation(c);
if (base.isValid() && base.protocolIs("file") && shouldCopyFileURL(c))
- copyURLPartsUntil(base, URLPart::PathAfterLastSlash, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::PathAfterLastSlash, c, nonUTF8QueryEncoding);
else {
appendToASCIIBuffer("///", 3);
m_url.m_userStart = currentPosition(c) - 1;
@@ -1693,12 +1690,11 @@
syntaxViolation(c);
appendToASCIIBuffer("/?", 2);
++c;
- if (isUTF8Encoding)
- state = State::UTF8Query;
- else {
+ if (nonUTF8QueryEncoding) {
queryBegin = c;
state = State::NonUTF8Query;
- }
+ } else
+ state = State::UTF8Query;
m_url.m_pathAfterLastSlash = currentPosition(c) - 1;
m_url.m_pathEnd = m_url.m_pathAfterLastSlash;
break;
@@ -1771,12 +1767,11 @@
m_url.m_pathEnd = currentPosition(c);
appendToASCIIBuffer('?');
++c;
- if (isUTF8Encoding)
- state = State::UTF8Query;
- else {
+ if (nonUTF8QueryEncoding) {
queryBegin = c;
state = State::NonUTF8Query;
- }
+ } else
+ state = State::UTF8Query;
break;
}
if (*c == '#') {
@@ -1794,12 +1789,11 @@
m_url.m_pathEnd = currentPosition(c);
appendToASCIIBuffer('?');
++c;
- if (isUTF8Encoding)
- state = State::UTF8Query;
- else {
+ if (nonUTF8QueryEncoding) {
queryBegin = c;
state = State::NonUTF8Query;
- }
+ } else
+ state = State::UTF8Query;
} else if (*c == '#') {
m_url.m_pathEnd = currentPosition(c);
m_url.m_queryEnd = m_url.m_pathEnd;
@@ -1821,10 +1815,8 @@
state = State::Fragment;
break;
}
- if (isUTF8Encoding)
- utf8QueryEncode(c);
- else
- appendCodePoint(queryBuffer, *c);
+ ASSERT(!nonUTF8QueryEncoding);
+ utf8QueryEncode(c);
++c;
break;
case State::NonUTF8Query:
@@ -1832,7 +1824,7 @@
LOG_STATE("NonUTF8Query");
ASSERT(queryBegin != CodePointIterator<CharacterType>());
if (*c == '#') {
- encodeQuery(queryBuffer, encoding, CodePointIterator<CharacterType>(queryBegin, c));
+ encodeNonUTF8Query(queryBuffer, *nonUTF8QueryEncoding, CodePointIterator<CharacterType>(queryBegin, c));
m_url.m_queryEnd = currentPosition(c);
state = State::Fragment;
break;
@@ -1868,7 +1860,7 @@
RELEASE_ASSERT_NOT_REACHED();
case State::SpecialRelativeOrAuthority:
LOG_FINAL_STATE("SpecialRelativeOrAuthority");
- copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
break;
case State::PathOrAuthority:
LOG_FINAL_STATE("PathOrAuthority");
@@ -1889,7 +1881,7 @@
RELEASE_ASSERT_NOT_REACHED();
case State::RelativeSlash:
LOG_FINAL_STATE("RelativeSlash");
- copyURLPartsUntil(base, URLPart::PortEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::PortEnd, c, nonUTF8QueryEncoding);
appendToASCIIBuffer('/');
m_url.m_pathAfterLastSlash = m_url.m_hostEnd + m_url.m_portLength + 1;
m_url.m_pathEnd = m_url.m_pathAfterLastSlash;
@@ -1952,7 +1944,7 @@
case State::File:
LOG_FINAL_STATE("File");
if (base.isValid() && base.protocolIs("file")) {
- copyURLPartsUntil(base, URLPart::QueryEnd, c, isUTF8Encoding);
+ copyURLPartsUntil(base, URLPart::QueryEnd, c, nonUTF8QueryEncoding);
break;
}
syntaxViolation(c);
@@ -2047,7 +2039,7 @@
case State::NonUTF8Query:
LOG_FINAL_STATE("NonUTF8Query");
ASSERT(queryBegin != CodePointIterator<CharacterType>());
- encodeQuery(queryBuffer, encoding, CodePointIterator<CharacterType>(queryBegin, c));
+ encodeNonUTF8Query(queryBuffer, *nonUTF8QueryEncoding, CodePointIterator<CharacterType>(queryBegin, c));
m_url.m_queryEnd = currentPosition(c);
break;
case State::Fragment:
Modified: trunk/Source/WebCore/platform/URLParser.h (236564 => 236565)
--- trunk/Source/WebCore/platform/URLParser.h 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/URLParser.h 2018-09-27 20:05:52 UTC (rev 236565)
@@ -25,7 +25,6 @@
#pragma once
-#include "TextEncoding.h"
#include "URL.h"
#include <wtf/Expected.h>
#include <wtf/Forward.h>
@@ -38,7 +37,7 @@
class URLParser {
public:
- WEBCORE_EXPORT URLParser(const String&, const URL& = { }, const TextEncoding& = UTF8Encoding());
+ WEBCORE_EXPORT URLParser(const String&, const URL& = { }, const URLTextEncoding* = nullptr);
URL result() { return m_url; }
WEBCORE_EXPORT static bool allValuesEqual(const URL&, const URL&);
@@ -70,7 +69,7 @@
static constexpr size_t defaultInlineBufferSize = 2048;
using LCharBuffer = Vector<LChar, defaultInlineBufferSize>;
- template<typename CharacterType> void parse(const CharacterType*, const unsigned length, const URL&, const TextEncoding&);
+ template<typename CharacterType> void parse(const CharacterType*, const unsigned length, const URL&, const URLTextEncoding*);
template<typename CharacterType> void parseAuthority(CodePointIterator<CharacterType>);
template<typename CharacterType> bool parseHostAndPort(CodePointIterator<CharacterType>);
template<typename CharacterType> bool parsePort(CodePointIterator<CharacterType>&);
@@ -107,7 +106,7 @@
void appendToASCIIBuffer(UChar32);
void appendToASCIIBuffer(const char*, size_t);
void appendToASCIIBuffer(const LChar* characters, size_t size) { appendToASCIIBuffer(reinterpret_cast<const char*>(characters), size); }
- template<typename CharacterType> void encodeQuery(const Vector<UChar>& source, const TextEncoding&, CodePointIterator<CharacterType>);
+ template<typename CharacterType> void encodeNonUTF8Query(const Vector<UChar>& source, const URLTextEncoding&, CodePointIterator<CharacterType>);
void copyASCIIStringUntil(const String&, size_t length);
bool copyBaseWindowsDriveLetter(const URL&);
StringView parsedDataView(size_t start, size_t length);
@@ -127,7 +126,7 @@
void serializeIPv6(IPv6Address);
enum class URLPart;
- template<typename CharacterType> void copyURLPartsUntil(const URL& base, URLPart, const CodePointIterator<CharacterType>&, bool& isUTF8Encoding);
+ template<typename CharacterType> void copyURLPartsUntil(const URL& base, URLPart, const CodePointIterator<CharacterType>&, const URLTextEncoding*&);
static size_t urlLengthUntilPart(const URL&, URLPart);
void popPath();
bool shouldPopPath(unsigned);
Modified: trunk/Source/WebCore/platform/text/TextEncoding.cpp (236564 => 236565)
--- trunk/Source/WebCore/platform/text/TextEncoding.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/text/TextEncoding.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -31,6 +31,7 @@
#include "TextCodec.h"
#include "TextEncodingRegistry.h"
#include <unicode/unorm.h>
+#include <wtf/NeverDestroyed.h>
#include <wtf/StdLibExtras.h>
#include <wtf/text/CString.h>
#include <wtf/text/StringView.h>
@@ -39,7 +40,7 @@
static const TextEncoding& UTF7Encoding()
{
- static TextEncoding globalUTF7Encoding("UTF-7");
+ static NeverDestroyed<TextEncoding> globalUTF7Encoding("UTF-7");
return globalUTF7Encoding;
}
@@ -173,7 +174,7 @@
// byte-based encoding and can contain 0x00. By extension, the same
// should be done for UTF-32. In case of UTF-7, it is a byte-based encoding,
// but it's fraught with problems and we'd rather steer clear of it.
-const TextEncoding& TextEncoding::encodingForFormSubmission() const
+const TextEncoding& TextEncoding::encodingForFormSubmissionOrURLParsing() const
{
if (isNonByteBasedEncoding() || isUTF7Encoding())
return UTF8Encoding();
@@ -182,38 +183,38 @@
const TextEncoding& ASCIIEncoding()
{
- static TextEncoding globalASCIIEncoding("ASCII");
+ static NeverDestroyed<TextEncoding> globalASCIIEncoding("ASCII");
return globalASCIIEncoding;
}
const TextEncoding& Latin1Encoding()
{
- static TextEncoding globalLatin1Encoding("latin1");
+ static NeverDestroyed<TextEncoding> globalLatin1Encoding("latin1");
return globalLatin1Encoding;
}
const TextEncoding& UTF16BigEndianEncoding()
{
- static TextEncoding globalUTF16BigEndianEncoding("UTF-16BE");
+ static NeverDestroyed<TextEncoding> globalUTF16BigEndianEncoding("UTF-16BE");
return globalUTF16BigEndianEncoding;
}
const TextEncoding& UTF16LittleEndianEncoding()
{
- static TextEncoding globalUTF16LittleEndianEncoding("UTF-16LE");
+ static NeverDestroyed<TextEncoding> globalUTF16LittleEndianEncoding("UTF-16LE");
return globalUTF16LittleEndianEncoding;
}
const TextEncoding& UTF8Encoding()
{
- static TextEncoding globalUTF8Encoding("UTF-8");
- ASSERT(globalUTF8Encoding.isValid());
+ static NeverDestroyed<TextEncoding> globalUTF8Encoding("UTF-8");
+ ASSERT(globalUTF8Encoding.get().isValid());
return globalUTF8Encoding;
}
const TextEncoding& WindowsLatin1Encoding()
{
- static TextEncoding globalWindowsLatin1Encoding("WinLatin-1");
+ static NeverDestroyed<TextEncoding> globalWindowsLatin1Encoding("WinLatin-1");
return globalWindowsLatin1Encoding;
}
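The switch to NeverDestroyed above follows from TextEncoding gaining a virtual destructor: a plain function-local static would now need its destructor run at exit. The idea can be sketched as follows. This is an assumption-laden illustration, not WTF::NeverDestroyed itself: NeverDestroyedSketch, EncodingBase, NamedEncoding and utf8Name are hypothetical names. The wrapper constructs the object in inline storage and never invokes its destructor.

```cpp
#include <new>
#include <utility>

// Hypothetical simplified wrapper: constructs T in place and never destroys it.
template<typename T>
class NeverDestroyedSketch {
public:
    template<typename... Args>
    explicit NeverDestroyedSketch(Args&&... args)
        : m_object(new (m_storage) T(std::forward<Args>(args)...))
    {
    }

    NeverDestroyedSketch(const NeverDestroyedSketch&) = delete;
    NeverDestroyedSketch& operator=(const NeverDestroyedSketch&) = delete;

    // The implicit destructor only releases raw bytes; ~T() is intentionally never called.
    T& get() { return *m_object; }
    operator T&() { return get(); }

private:
    alignas(T) unsigned char m_storage[sizeof(T)];
    T* m_object;
};

// Hypothetical class hierarchy with a virtual destructor, as TextEncoding now has.
struct EncodingBase {
    virtual ~EncodingBase() = default;
};

struct NamedEncoding : EncodingBase {
    explicit NamedEncoding(const char* encodingName)
        : name(encodingName)
    {
    }
    ~NamedEncoding() override { ++destroyedCount; }

    const char* name;
    static int destroyedCount; // Counts destructor runs, for demonstration only.
};

int NamedEncoding::destroyedCount = 0;

// Same shape as the accessors in the diff: a function-local static returned by reference.
const NamedEncoding& utf8Name()
{
    static NeverDestroyedSketch<NamedEncoding> instance("UTF-8");
    return instance;
}
```

The trade-off in this sketch is that the wrapped object is leaked on purpose; that is acceptable for process-lifetime singletons and avoids ordering problems between static destructors.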
Modified: trunk/Source/WebCore/platform/text/TextEncoding.h (236564 => 236565)
--- trunk/Source/WebCore/platform/text/TextEncoding.h 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Source/WebCore/platform/text/TextEncoding.h 2018-09-27 20:05:52 UTC (rev 236565)
@@ -25,12 +25,13 @@
#pragma once
+#include "URL.h"
#include <pal/text/UnencodableHandling.h>
#include <wtf/text/WTFString.h>
namespace WebCore {
-class TextEncoding {
+class TextEncoding : public URLTextEncoding {
public:
TextEncoding() = default;
WEBCORE_EXPORT TextEncoding(const char* name);
@@ -43,11 +44,12 @@
bool isJapanese() const;
const TextEncoding& closestByteBasedEquivalent() const;
- const TextEncoding& encodingForFormSubmission() const;
+ const TextEncoding& encodingForFormSubmissionOrURLParsing() const;
WEBCORE_EXPORT String decode(const char*, size_t length, bool stopOnError, bool& sawError) const;
String decode(const char*, size_t length) const;
- Vector<uint8_t> encode(StringView, UnencodableHandling) const;
+ WEBCORE_EXPORT Vector<uint8_t> encode(StringView, UnencodableHandling) const;
+ Vector<uint8_t> encodeForURLParsing(StringView string) const final { return encode(string, UnencodableHandling::URLEncodedEntities); }
UChar backslashAsCurrencySymbol() const;
bool isByteBasedEncoding() const { return !isNonByteBasedEncoding(); }
Modified: trunk/Tools/ChangeLog (236564 => 236565)
--- trunk/Tools/ChangeLog 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Tools/ChangeLog 2018-09-27 20:05:52 UTC (rev 236565)
@@ -1,3 +1,14 @@
+2018-09-27 Alex Christensen <[email protected]>
+
+ URLParser should use TextEncoding through an abstract class
+ https://bugs.webkit.org/show_bug.cgi?id=190027
+
+ Reviewed by Andy Estes.
+
+ * TestWebKitAPI/Tests/WebCore/URLParser.cpp:
+ (TestWebKitAPI::checkURL):
+ (TestWebKitAPI::TEST_F):
+
2018-09-27 Ryan Haddad <[email protected]>
iOS Simulator bots should pass '--dedicated-simulators' to run-webkit-tests
Modified: trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp (236564 => 236565)
--- trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp 2018-09-27 19:44:40 UTC (rev 236564)
+++ trunk/Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp 2018-09-27 20:05:52 UTC (rev 236565)
@@ -25,6 +25,7 @@
#include "config.h"
#include "WTFStringUtilities.h"
+#include <WebCore/TextEncoding.h>
#include <WebCore/URLParser.h>
#include <wtf/MainThread.h>
#include <wtf/text/StringBuilder.h>
@@ -210,7 +211,7 @@
checkRelativeURL(urlString, baseString, {"", "", "", "", 0, "", "", "", urlString});
}
-static void checkURL(const String& urlString, const TextEncoding& encoding, const ExpectedParts& parts, TestTabs testTabs = TestTabs::Yes)
+static void checkURL(const String& urlString, const TextEncoding* encoding, const ExpectedParts& parts, TestTabs testTabs = TestTabs::Yes)
{
URLParser parser(urlString, { }, encoding);
auto url = parser.result();
@@ -235,7 +236,7 @@
}
}
-static void checkURL(const String& urlString, const String& baseURLString, const TextEncoding& encoding, const ExpectedParts& parts, TestTabs testTabs = TestTabs::Yes)
+static void checkURL(const String& urlString, const String& baseURLString, const TextEncoding* encoding, const ExpectedParts& parts, TestTabs testTabs = TestTabs::Yes)
{
URLParser baseParser(baseURLString, { }, encoding);
URLParser parser(urlString, baseParser.result(), encoding);
@@ -1285,37 +1286,37 @@
TEST_F(URLParserTest, QueryEncoding)
{
- checkURL(utf16String(u"http://host?ß😍#ß😍"), UTF8Encoding(), {"http", "", "", "host", 0, "/", "%C3%9F%F0%9F%98%8D", "%C3%9F%F0%9F%98%8D", utf16String(u"http://host/?%C3%9F%F0%9F%98%8D#%C3%9F%F0%9F%98%8D")}, testTabsValueForSurrogatePairs);
+ checkURL(utf16String(u"http://host?ß😍#ß😍"), nullptr, {"http", "", "", "host", 0, "/", "%C3%9F%F0%9F%98%8D", "%C3%9F%F0%9F%98%8D", utf16String(u"http://host/?%C3%9F%F0%9F%98%8D#%C3%9F%F0%9F%98%8D")}, testTabsValueForSurrogatePairs);
TextEncoding latin1(String("latin1"));
- checkURL("http://host/?query with%20spaces", latin1, {"http", "", "", "host", 0, "/", "query%20with%20spaces", "", "http://host/?query%20with%20spaces"});
- checkURL("http://host/?query", latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
- checkURL("http://host/?\tquery", latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
- checkURL("http://host/?q\tuery", latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
- checkURL("http://host/?query with SpAcEs#fragment", latin1, {"http", "", "", "host", 0, "/", "query%20with%20SpAcEs", "fragment", "http://host/?query%20with%20SpAcEs#fragment"});
- checkURL("http://host/?que\rry\t\r\n#fragment", latin1, {"http", "", "", "host", 0, "/", "query", "fragment", "http://host/?query#fragment"});
+ checkURL("http://host/?query with%20spaces", &latin1, {"http", "", "", "host", 0, "/", "query%20with%20spaces", "", "http://host/?query%20with%20spaces"});
+ checkURL("http://host/?query", &latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
+ checkURL("http://host/?\tquery", &latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
+ checkURL("http://host/?q\tuery", &latin1, {"http", "", "", "host", 0, "/", "query", "", "http://host/?query"});
+ checkURL("http://host/?query with SpAcEs#fragment", &latin1, {"http", "", "", "host", 0, "/", "query%20with%20SpAcEs", "fragment", "http://host/?query%20with%20SpAcEs#fragment"});
+ checkURL("http://host/?que\rry\t\r\n#fragment", &latin1, {"http", "", "", "host", 0, "/", "query", "fragment", "http://host/?query#fragment"});
TextEncoding unrecognized(String("unrecognized invalid encoding name"));
- checkURL("http://host/?query", unrecognized, {"http", "", "", "host", 0, "/", "", "", "http://host/?"});
- checkURL("http://host/?", unrecognized, {"http", "", "", "host", 0, "/", "", "", "http://host/?"});
+ checkURL("http://host/?query", &unrecognized, {"http", "", "", "host", 0, "/", "", "", "http://host/?"});
+ checkURL("http://host/?", &unrecognized, {"http", "", "", "host", 0, "/", "", "", "http://host/?"});
TextEncoding iso88591(String("ISO-8859-1"));
String withUmlauts = utf16String<4>({0xDC, 0x430, 0x451, '\0'});
- checkURL(makeString("ws://host/path?", withUmlauts), iso88591, {"ws", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "ws://host/path?%C3%9C%D0%B0%D1%91"});
- checkURL(makeString("wss://host/path?", withUmlauts), iso88591, {"wss", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "wss://host/path?%C3%9C%D0%B0%D1%91"});
- checkURL(makeString("asdf://host/path?", withUmlauts), iso88591, {"asdf", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "asdf://host/path?%C3%9C%D0%B0%D1%91"});
- checkURL(makeString("https://host/path?", withUmlauts), iso88591, {"https", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "", "https://host/path?%DC%26%231072%3B%26%231105%3B"});
- checkURL(makeString("gopher://host/path?", withUmlauts), iso88591, {"gopher", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "", "gopher://host/path?%DC%26%231072%3B%26%231105%3B"});
- checkURL(makeString("/path?", withUmlauts, "#fragment"), "ws://example.com/", iso88591, {"ws", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "ws://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
- checkURL(makeString("/path?", withUmlauts, "#fragment"), "wss://example.com/", iso88591, {"wss", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "wss://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
- checkURL(makeString("/path?", withUmlauts, "#fragment"), "asdf://example.com/", iso88591, {"asdf", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "asdf://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
- checkURL(makeString("/path?", withUmlauts, "#fragment"), "https://example.com/", iso88591, {"https", "", "", "example.com", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "https://example.com/path?%DC%26%231072%3B%26%231105%3B#fragment"});
- checkURL(makeString("/path?", withUmlauts, "#fragment"), "gopher://example.com/", iso88591, {"gopher", "", "", "example.com", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "gopher://example.com/path?%DC%26%231072%3B%26%231105%3B#fragment"});
- checkURL(makeString("gopher://host/path?", withUmlauts, "#fragment"), "asdf://example.com/?doesntmatter", iso88591, {"gopher", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "gopher://host/path?%DC%26%231072%3B%26%231105%3B#fragment"});
- checkURL(makeString("asdf://host/path?", withUmlauts, "#fragment"), "http://example.com/?doesntmatter", iso88591, {"asdf", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "asdf://host/path?%C3%9C%D0%B0%D1%91#fragment"});
+ checkURL(makeString("ws://host/path?", withUmlauts), &iso88591, {"ws", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "ws://host/path?%C3%9C%D0%B0%D1%91"});
+ checkURL(makeString("wss://host/path?", withUmlauts), &iso88591, {"wss", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "wss://host/path?%C3%9C%D0%B0%D1%91"});
+ checkURL(makeString("asdf://host/path?", withUmlauts), &iso88591, {"asdf", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "", "asdf://host/path?%C3%9C%D0%B0%D1%91"});
+ checkURL(makeString("https://host/path?", withUmlauts), &iso88591, {"https", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "", "https://host/path?%DC%26%231072%3B%26%231105%3B"});
+ checkURL(makeString("gopher://host/path?", withUmlauts), &iso88591, {"gopher", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "", "gopher://host/path?%DC%26%231072%3B%26%231105%3B"});
+ checkURL(makeString("/path?", withUmlauts, "#fragment"), "ws://example.com/", &iso88591, {"ws", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "ws://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
+ checkURL(makeString("/path?", withUmlauts, "#fragment"), "wss://example.com/", &iso88591, {"wss", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "wss://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
+ checkURL(makeString("/path?", withUmlauts, "#fragment"), "asdf://example.com/", &iso88591, {"asdf", "", "", "example.com", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "asdf://example.com/path?%C3%9C%D0%B0%D1%91#fragment"});
+ checkURL(makeString("/path?", withUmlauts, "#fragment"), "https://example.com/", &iso88591, {"https", "", "", "example.com", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "https://example.com/path?%DC%26%231072%3B%26%231105%3B#fragment"});
+ checkURL(makeString("/path?", withUmlauts, "#fragment"), "gopher://example.com/", &iso88591, {"gopher", "", "", "example.com", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "gopher://example.com/path?%DC%26%231072%3B%26%231105%3B#fragment"});
+ checkURL(makeString("gopher://host/path?", withUmlauts, "#fragment"), "asdf://example.com/?doesntmatter", &iso88591, {"gopher", "", "", "host", 0, "/path", "%DC%26%231072%3B%26%231105%3B", "fragment", "gopher://host/path?%DC%26%231072%3B%26%231105%3B#fragment"});
+ checkURL(makeString("asdf://host/path?", withUmlauts, "#fragment"), "http://example.com/?doesntmatter", &iso88591, {"asdf", "", "", "host", 0, "/path", "%C3%9C%D0%B0%D1%91", "fragment", "asdf://host/path?%C3%9C%D0%B0%D1%91#fragment"});
- checkURL("http://host/pa'th?qu'ery#fr'agment", UTF8Encoding(), {"http", "", "", "host", 0, "/pa'th", "qu%27ery", "fr'agment", "http://host/pa'th?qu%27ery#fr'agment"});
- checkURL("asdf://host/pa'th?qu'ery#fr'agment", UTF8Encoding(), {"asdf", "", "", "host", 0, "/pa'th", "qu'ery", "fr'agment", "asdf://host/pa'th?qu'ery#fr'agment"});
+ checkURL("http://host/pa'th?qu'ery#fr'agment", nullptr, {"http", "", "", "host", 0, "/pa'th", "qu%27ery", "fr'agment", "http://host/pa'th?qu%27ery#fr'agment"});
+ checkURL("asdf://host/pa'th?qu'ery#fr'agment", nullptr, {"asdf", "", "", "host", 0, "/pa'th", "qu'ery", "fr'agment", "asdf://host/pa'th?qu'ery#fr'agment"});
// FIXME: Add more tests with other encodings and things like non-ascii characters, emoji and unmatched surrogate pairs.
}
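
The QueryEncoding expectations above can be reproduced with a small stand-alone sketch. It is not WebKit code: QueryEncoder, Latin1Encoder, encodeUTF8, percentEncode and encodeQuery are hypothetical names, the Latin-1 mapping is simplified to "code point <= 0xFF is one byte", and every byte other than an ASCII letter or digit is percent-encoded. It only illustrates the pattern the patch introduces, where a null encoder pointer selects the UTF-8 path and a non-null pointer goes through one virtual call.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the abstract interface the parser depends on.
class QueryEncoder {
public:
    virtual ~QueryEncoder() = default;
    virtual std::vector<uint8_t> encodeForURLParsing(const std::u32string&) const = 0;
};

// Hypothetical, simplified Latin-1 encoder: code points up to 0xFF map to one
// byte; anything else becomes a decimal entity such as "&#1072;".
class Latin1Encoder final : public QueryEncoder {
public:
    std::vector<uint8_t> encodeForURLParsing(const std::u32string& text) const final
    {
        std::vector<uint8_t> bytes;
        for (char32_t c : text) {
            if (c <= 0xFF) {
                bytes.push_back(static_cast<uint8_t>(c));
                continue;
            }
            std::string entity = "&#" + std::to_string(static_cast<uint32_t>(c)) + ";";
            bytes.insert(bytes.end(), entity.begin(), entity.end());
        }
        return bytes;
    }
};

// Plain UTF-8 encoding of code points (no surrogate or range validation).
static std::vector<uint8_t> encodeUTF8(const std::u32string& text)
{
    std::vector<uint8_t> bytes;
    for (char32_t c : text) {
        if (c < 0x80)
            bytes.push_back(static_cast<uint8_t>(c));
        else if (c < 0x800) {
            bytes.push_back(static_cast<uint8_t>(0xC0 | (c >> 6)));
            bytes.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
        } else if (c < 0x10000) {
            bytes.push_back(static_cast<uint8_t>(0xE0 | (c >> 12)));
            bytes.push_back(static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F)));
            bytes.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
        } else {
            bytes.push_back(static_cast<uint8_t>(0xF0 | (c >> 18)));
            bytes.push_back(static_cast<uint8_t>(0x80 | ((c >> 12) & 0x3F)));
            bytes.push_back(static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F)));
            bytes.push_back(static_cast<uint8_t>(0x80 | (c & 0x3F)));
        }
    }
    return bytes;
}

// Simplification: percent-encode every byte that is not an ASCII letter or digit.
static std::string percentEncode(const std::vector<uint8_t>& bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string result;
    for (uint8_t b : bytes) {
        bool isAlnum = (b >= '0' && b <= '9') || (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z');
        if (isAlnum) {
            result += static_cast<char>(b);
            continue;
        }
        result += '%';
        result += hex[b >> 4];
        result += hex[b & 0xF];
    }
    return result;
}

// A null encoder means "use UTF-8", mirroring the nullptr arguments in the tests.
static std::string encodeQuery(const std::u32string& query, const QueryEncoder* encoder)
{
    return percentEncode(encoder ? encoder->encodeForURLParsing(query) : encodeUTF8(query));
}
```

With the query U"\u00DC\u0430\u0451" (the withUmlauts string from the test), encodeQuery(query, nullptr) gives "%C3%9C%D0%B0%D1%91", and passing a Latin1Encoder gives "%DC%26%231072%3B%26%231105%3B". These are the two query strings the tests expect for the UTF-8 and ISO-8859-1 cases respectively.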
