Title: [202599] trunk
Revision
202599
Author
[email protected]
Date
2016-06-28 18:04:05 -0700 (Tue, 28 Jun 2016)

Log Message

Implement "replacement" codec
https://bugs.webkit.org/show_bug.cgi?id=159180
<rdar://problem/26015178>

Reviewed by Brent Fulgham.

LayoutTests/imported/w3c:

* web-platform-tests/dom/nodes/Document-characterSet-normalization-expected.txt:

Source/WebCore:

Test: fast/encoding/charset-replacement.html

Add support for "replacement" codec according to the spec:
https://encoding.spec.whatwg.org/#replacement
According to the spec, encoding labels {"csiso2022kr", "hz-gb-2312", "iso-2022-cn",
"iso-2022-cn-ext", "iso-2022-kr"} are used to conduct certain attacks that abuse
a mismatch between encodings supported on the server and the client. Therefore,
they are grouped under the "replacement" codec, which does the following things
to prevent those attacks.
1) Decode: terminates with a single U+FFFD.
2) Encode: treated as UTF-8.
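
For illustration only (this is not the code added by this change), the two behaviors listed above can be sketched in standalone C++. The names ReplacementDecoderSketch and encodeAsUTF8 are hypothetical, and the sketch assumes standard library strings rather than WebCore's String and TextCodec types.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the decoder; not WebKit's TextCodecReplacement.
class ReplacementDecoderSketch {
public:
    // The input bytes are ignored. Every call reports an error; the first
    // call yields a single U+FFFD and later calls yield an empty string.
    std::u16string decode(const char*, std::size_t, bool& sawError)
    {
        sawError = true;
        if (m_sentEOF)
            return std::u16string();
        m_sentEOF = true;
        return std::u16string(1, u'\uFFFD');
    }

private:
    bool m_sentEOF { false };
};

// Hypothetical helper showing "encode is treated as UTF-8" for one code point.
// Assumes a valid scalar value; surrogates are not rejected in this sketch.
std::string encodeAsUTF8(char32_t c)
{
    std::string out;
    if (c < 0x80) {
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
    return out;
}
```

With this sketch, encodeAsUTF8(0x00A0) yields the bytes C2 A0, which corresponds to the '%C2%A0' expectation in fast/encoding/char-encoding.html.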

Furthermore, the "replacement" codec is a specification convenience to group those
vulnerable encoding labels. Therefore, it should not be usable directly.
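
A minimal sketch of that rule, assuming a simplified lookup that knows only the five labels named above: the labels resolve to "replacement", while the name "replacement" itself is rejected in any letter case. The function canonicalNameForLabelSketch is hypothetical and is not part of WebCore's TextEncodingRegistry; an empty result stands for "no usable encoding".

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical lookup: returns "replacement" for the five legacy labels and
// an empty string for anything else, including the name "replacement" itself.
std::string canonicalNameForLabelSketch(std::string label)
{
    // Labels are matched ASCII case-insensitively.
    std::transform(label.begin(), label.end(), label.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // The group name is not a valid label on its own.
    if (label == "replacement")
        return std::string();

    static const char* const replacementLabels[] = {
        "csiso2022kr", "hz-gb-2312", "iso-2022-cn", "iso-2022-cn-ext", "iso-2022-kr"
    };
    for (const char* candidate : replacementLabels) {
        if (label == candidate)
            return "replacement";
    }

    // Other encodings are out of scope for this sketch.
    return std::string();
}
```

Under these assumptions, a document declaring charset=rEpLaCeMeNt gets no usable encoding from the lookup and falls back to the default, which is what fast/encoding/charset-replacement.html checks.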

This change is based on the following Blink changes:
https://codereview.chromium.org/265973003, and
https://codereview.chromium.org/261013007.

* CMakeLists.txt:
* WebCore.xcodeproj/project.pbxproj:
* platform/text/TextAllInOne.cpp:
* platform/text/TextCodecReplacement.cpp: Added.
(WebCore::TextCodecReplacement::create):
(WebCore::TextCodecReplacement::TextCodecReplacement):
(WebCore::TextCodecReplacement::registerEncodingNames):
(WebCore::TextCodecReplacement::registerCodecs):
(WebCore::TextCodecReplacement::decode):
* platform/text/TextCodecReplacement.h: Added.
* platform/text/TextEncoding.cpp:
(WebCore::TextEncoding::TextEncoding):
* platform/text/TextEncodingRegistry.cpp:
(WebCore::isReplacementEncoding):
(WebCore::extendTextCodecMaps):
* platform/text/TextEncodingRegistry.h:

LayoutTests:

* fast/encoding/char-decoding-expected.txt:
* fast/encoding/char-decoding.html:
* fast/encoding/char-encoding-expected.txt:
* fast/encoding/char-encoding.html:
* fast/encoding/charset-replacement-expected.txt: Added.
* fast/encoding/charset-replacement.html: Added.

Diff

Modified: trunk/LayoutTests/ChangeLog (202598 => 202599)


--- trunk/LayoutTests/ChangeLog	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/LayoutTests/ChangeLog	2016-06-29 01:04:05 UTC (rev 202599)
@@ -1,3 +1,18 @@
+2016-06-28  Jiewen Tan  <[email protected]>
+
+        Implement "replacement" codec
+        https://bugs.webkit.org/show_bug.cgi?id=159180
+        <rdar://problem/26015178>
+
+        Reviewed by Brent Fulgham.
+
+        * fast/encoding/char-decoding-expected.txt:
+        * fast/encoding/char-decoding.html:
+        * fast/encoding/char-encoding-expected.txt:
+        * fast/encoding/char-encoding.html:
+        * fast/encoding/charset-replacement-expected.txt: Added.
+        * fast/encoding/charset-replacement.html: Added.
+
 2016-06-28  Michael Saboff  <[email protected]>
 
         REGRESSION (r200946): Improper backtracking from last alternative in sticky patterns

Modified: trunk/LayoutTests/fast/encoding/char-decoding-expected.txt (202598 => 202599)


--- trunk/LayoutTests/fast/encoding/char-decoding-expected.txt	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/LayoutTests/fast/encoding/char-decoding-expected.txt	2016-06-29 01:04:05 UTC (rev 202599)
@@ -190,6 +190,11 @@
 PASS decode('csUnicode', '%69%D8%D6%DE') is 'U+D869/U+DED6'
 PASS decode('UTF-16BE', '%D8%69%DE%D6') is 'U+D869/U+DED6'
 PASS decode('unicodeFFFE', '%D8%69%DE%D6') is 'U+D869/U+DED6'
+PASS decode('csiso2022kr', '%41%42%43%61%62%63%31%32%33%A0') is 'U+FFFD'
+PASS decode('hz-gb-2312', '%41%42%43%61%62%63%31%32%33%A0') is 'U+FFFD'
+PASS decode('iso-2022-cn', '%41%42%43%61%62%63%31%32%33%A0') is 'U+FFFD'
+PASS decode('iso-2022-cn-ext', '%41%42%43%61%62%63%31%32%33%A0') is 'U+FFFD'
+PASS decode('iso-2022-kr', '%41%42%43%61%62%63%31%32%33%A0') is 'U+FFFD'
 PASS successfullyParsed is true
 
 TEST COMPLETE

Modified: trunk/LayoutTests/fast/encoding/char-decoding.html (202598 => 202599)


--- trunk/LayoutTests/fast/encoding/char-decoding.html	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/LayoutTests/fast/encoding/char-decoding.html	2016-06-29 01:04:05 UTC (rev 202599)
@@ -105,6 +105,13 @@
 testDecode('UTF-16BE', '%D8%69%DE%D6', 'U+D869/U+DED6');
 testDecode('unicodeFFFE', '%D8%69%DE%D6', 'U+D869/U+DED6');
 
+// Replacement encodings should decode as replacement (U+FFFD) then EOF
+testDecode("csiso2022kr", "%41%42%43%61%62%63%31%32%33%A0", "U+FFFD");
+testDecode("hz-gb-2312", "%41%42%43%61%62%63%31%32%33%A0", "U+FFFD");
+testDecode("iso-2022-cn", "%41%42%43%61%62%63%31%32%33%A0", "U+FFFD");
+testDecode("iso-2022-cn-ext", "%41%42%43%61%62%63%31%32%33%A0", "U+FFFD");
+testDecode("iso-2022-kr", "%41%42%43%61%62%63%31%32%33%A0", "U+FFFD");
+
 </script>
 <script src=""
 </body>

Modified: trunk/LayoutTests/fast/encoding/char-encoding-expected.txt (202598 => 202599)


--- trunk/LayoutTests/fast/encoding/char-encoding-expected.txt	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/LayoutTests/fast/encoding/char-encoding-expected.txt	2016-06-29 01:04:05 UTC (rev 202599)
@@ -16,6 +16,11 @@
 PASS encode('GBK', 'U+1E3F') is '%A8%BC'
 PASS encode('GBK', 'U+22EF') is '%A1%AD'
 PASS encode('GBK', 'U+301C') is '%A1%AB'
+PASS encode('csiso2022kr', 'U+00A0') is '%C2%A0'
+PASS encode('hz-gb-2312', 'U+00A0') is '%C2%A0'
+PASS encode('iso-2022-cn', 'U+00A0') is '%C2%A0'
+PASS encode('iso-2022-cn-ext', 'U+00A0') is '%C2%A0'
+PASS encode('iso-2022-kr', 'U+00A0') is '%C2%A0'
 PASS successfullyParsed is true
 
 TEST COMPLETE

Modified: trunk/LayoutTests/fast/encoding/char-encoding.html (202598 => 202599)


--- trunk/LayoutTests/fast/encoding/char-encoding.html	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/LayoutTests/fast/encoding/char-encoding.html	2016-06-29 01:04:05 UTC (rev 202599)
@@ -33,6 +33,12 @@
 testEncode('GBK', 'U+1E3F', '%A8%BC');
 testEncode('GBK', 'U+22EF', '%A1%AD');
 testEncode('GBK', 'U+301C', '%A1%AB');
+// Replacement encodings - should encode as UTF-8
+testEncode("csiso2022kr", "U+00A0", "%C2%A0");
+testEncode("hz-gb-2312", "U+00A0", "%C2%A0");
+testEncode("iso-2022-cn", "U+00A0", "%C2%A0");
+testEncode("iso-2022-cn-ext", "U+00A0", "%C2%A0");
+testEncode("iso-2022-kr", "U+00A0", "%C2%A0");
 
 // Turning on this test causes a download to occur. FIXME: A bug?
 // testEncode('UTF-8', 'U+221A', '%E2%88%9A');

Added: trunk/LayoutTests/fast/encoding/charset-replacement-expected.txt (0 => 202599)


--- trunk/LayoutTests/fast/encoding/charset-replacement-expected.txt	                        (rev 0)
+++ trunk/LayoutTests/fast/encoding/charset-replacement-expected.txt	2016-06-29 01:04:05 UTC (rev 202599)
@@ -0,0 +1,4 @@
+ALERT: ISO-8859-1
+Test PASSED if the encoding of this document is the default encoding.
+Test FAILED if you see a U+FFFD character in a dumped render tree.
+

Added: trunk/LayoutTests/fast/encoding/charset-replacement.html (0 => 202599)


--- trunk/LayoutTests/fast/encoding/charset-replacement.html	                        (rev 0)
+++ trunk/LayoutTests/fast/encoding/charset-replacement.html	2016-06-29 01:04:05 UTC (rev 202599)
@@ -0,0 +1,15 @@
+<!DOCTYPE html>
+<html>
+<head>
+    <meta charset=rEpLaCeMeNt>
+    <script>
+    if (window.testRunner)
+        testRunner.dumpAsText();
+    alert(document.characterSet);
+    </script>
+</head>
+<body>
+    Test PASSED if the encoding of this document is the default encoding.<br>
+    Test FAILED if you see a U+FFFD character in a dumped render tree.<br>
+</body>
+</html>
\ No newline at end of file

Modified: trunk/LayoutTests/imported/w3c/ChangeLog (202598 => 202599)


--- trunk/LayoutTests/imported/w3c/ChangeLog	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/LayoutTests/imported/w3c/ChangeLog	2016-06-29 01:04:05 UTC (rev 202599)
@@ -1,3 +1,13 @@
+2016-06-28  Jiewen Tan  <[email protected]>
+
+        Implement "replacement" codec
+        https://bugs.webkit.org/show_bug.cgi?id=159180
+        <rdar://problem/26015178>
+
+        Reviewed by Brent Fulgham.
+
+        * web-platform-tests/dom/nodes/Document-characterSet-normalization-expected.txt:
+
 2016-06-27  Youenn Fablet  <[email protected]>
 
         Remove didFailAccessControlCheck ThreadableLoaderClient callback

Modified: trunk/LayoutTests/imported/w3c/web-platform-tests/dom/nodes/Document-characterSet-normalization-expected.txt (202598 => 202599)


--- trunk/LayoutTests/imported/w3c/web-platform-tests/dom/nodes/Document-characterSet-normalization-expected.txt	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/LayoutTests/imported/w3c/web-platform-tests/dom/nodes/Document-characterSet-normalization-expected.txt	2016-06-29 01:04:05 UTC (rev 202599)
@@ -638,19 +638,19 @@
 PASS Name "EUC-KR" has label "windows-949" (characterSet) 
 PASS Name "EUC-KR" has label "windows-949" (inputEncoding) 
 PASS Name "EUC-KR" has label "windows-949" (charset) 
-FAIL Name "replacement" has label "csiso2022kr" (characterSet) assert_equals: expected "replacement" but got "ISO-2022-KR"
-FAIL Name "replacement" has label "csiso2022kr" (inputEncoding) assert_equals: expected "replacement" but got "ISO-2022-KR"
-FAIL Name "replacement" has label "csiso2022kr" (charset) assert_equals: expected "replacement" but got "ISO-2022-KR"
-FAIL Name "replacement" has label "hz-gb-2312" (characterSet) assert_equals: expected "replacement" but got "HZ-GB-2312"
-FAIL Name "replacement" has label "hz-gb-2312" (inputEncoding) assert_equals: expected "replacement" but got "HZ-GB-2312"
-FAIL Name "replacement" has label "hz-gb-2312" (charset) assert_equals: expected "replacement" but got "HZ-GB-2312"
-FAIL Name "replacement" has label "iso-2022-cn" (characterSet) assert_equals: expected "replacement" but got "ISO-2022-CN"
-FAIL Name "replacement" has label "iso-2022-cn" (inputEncoding) assert_equals: expected "replacement" but got "ISO-2022-CN"
-FAIL Name "replacement" has label "iso-2022-cn" (charset) assert_equals: expected "replacement" but got "ISO-2022-CN"
-FAIL Name "replacement" has label "iso-2022-cn-ext" (characterSet) assert_equals: expected "replacement" but got "ISO-2022-CN-EXT"
-FAIL Name "replacement" has label "iso-2022-cn-ext" (inputEncoding) assert_equals: expected "replacement" but got "ISO-2022-CN-EXT"
-FAIL Name "replacement" has label "iso-2022-cn-ext" (charset) assert_equals: expected "replacement" but got "ISO-2022-CN-EXT"
-FAIL Name "replacement" has label "iso-2022-kr" (characterSet) assert_equals: expected "replacement" but got "ISO-2022-KR"
-FAIL Name "replacement" has label "iso-2022-kr" (inputEncoding) assert_equals: expected "replacement" but got "ISO-2022-KR"
-FAIL Name "replacement" has label "iso-2022-kr" (charset) assert_equals: expected "replacement" but got "ISO-2022-KR"
+PASS Name "replacement" has label "csiso2022kr" (characterSet) 
+PASS Name "replacement" has label "csiso2022kr" (inputEncoding) 
+PASS Name "replacement" has label "csiso2022kr" (charset) 
+PASS Name "replacement" has label "hz-gb-2312" (characterSet) 
+PASS Name "replacement" has label "hz-gb-2312" (inputEncoding) 
+PASS Name "replacement" has label "hz-gb-2312" (charset) 
+PASS Name "replacement" has label "iso-2022-cn" (characterSet) 
+PASS Name "replacement" has label "iso-2022-cn" (inputEncoding) 
+PASS Name "replacement" has label "iso-2022-cn" (charset) 
+PASS Name "replacement" has label "iso-2022-cn-ext" (characterSet) 
+PASS Name "replacement" has label "iso-2022-cn-ext" (inputEncoding) 
+PASS Name "replacement" has label "iso-2022-cn-ext" (charset) 
+PASS Name "replacement" has label "iso-2022-kr" (characterSet) 
+PASS Name "replacement" has label "iso-2022-kr" (inputEncoding) 
+PASS Name "replacement" has label "iso-2022-kr" (charset) 
 

Modified: trunk/Source/WebCore/CMakeLists.txt (202598 => 202599)


--- trunk/Source/WebCore/CMakeLists.txt	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/Source/WebCore/CMakeLists.txt	2016-06-29 01:04:05 UTC (rev 202599)
@@ -2378,6 +2378,7 @@
     platform/text/TextCodec.cpp
     platform/text/TextCodecICU.cpp
     platform/text/TextCodecLatin1.cpp
+    platform/text/TextCodecReplacement.cpp
     platform/text/TextCodecUTF16.cpp
     platform/text/TextCodecUTF8.cpp
     platform/text/TextCodecUserDefined.cpp

Modified: trunk/Source/WebCore/ChangeLog (202598 => 202599)


--- trunk/Source/WebCore/ChangeLog	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/Source/WebCore/ChangeLog	2016-06-29 01:04:05 UTC (rev 202599)
@@ -1,3 +1,47 @@
+2016-06-28  Jiewen Tan  <[email protected]>
+
+        Implement "replacement" codec
+        https://bugs.webkit.org/show_bug.cgi?id=159180
+        <rdar://problem/26015178>
+
+        Reviewed by Brent Fulgham.
+
+        Test: fast/encoding/charset-replacement.html
+
+        Add support for "replacement" codec according to the spec:
+        https://encoding.spec.whatwg.org/#replacement
+        According to the spec, encoding labels {"csiso2022kr", "hz-gb-2312", "iso-2022-cn",
+        "iso-2022-cn-ext", "iso-2022-kr"} are used to conduct certain attacks that abuse
+        a mismatch between encodings supported on the server and the client. Therefore,
+        they are grouped under the "replacement" codec, which does the following things
+        to prevent those attacks.
+        1) Decode: terminates with a single U+FFFD.
+        2) Encode: treated as UTF-8.
+
+        Furthermore, the "replacement" codec is a specification convenience to group those
+        vulnerable encoding labels. Therefore, it should not be usable directly.
+
+        This change is based on the following Blink changes:
+        https://codereview.chromium.org/265973003, and
+        https://codereview.chromium.org/261013007.
+
+        * CMakeLists.txt:
+        * WebCore.xcodeproj/project.pbxproj:
+        * platform/text/TextAllInOne.cpp:
+        * platform/text/TextCodecReplacement.cpp: Added.
+        (WebCore::TextCodecReplacement::create):
+        (WebCore::TextCodecReplacement::TextCodecReplacement):
+        (WebCore::TextCodecReplacement::registerEncodingNames):
+        (WebCore::TextCodecReplacement::registerCodecs):
+        (WebCore::TextCodecReplacement::decode):
+        * platform/text/TextCodecReplacement.h: Added.
+        * platform/text/TextEncoding.cpp:
+        (WebCore::TextEncoding::TextEncoding):
+        * platform/text/TextEncodingRegistry.cpp:
+        (WebCore::isReplacementEncoding):
+        (WebCore::extendTextCodecMaps):
+        * platform/text/TextEncodingRegistry.h:
+
 2016-06-28  Dean Jackson  <[email protected]>
 
         Remove incorrect comments in HTMLCanvasElement

Modified: trunk/Source/WebCore/WebCore.xcodeproj/project.pbxproj (202598 => 202599)


--- trunk/Source/WebCore/WebCore.xcodeproj/project.pbxproj	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/Source/WebCore/WebCore.xcodeproj/project.pbxproj	2016-06-29 01:04:05 UTC (rev 202599)
@@ -2331,6 +2331,8 @@
 		555B87ED1CAAF0AB00349425 /* ImageDecoderCG.h in Headers */ = {isa = PBXBuildFile; fileRef = 555B87EB1CAAF0AB00349425 /* ImageDecoderCG.h */; };
 		572A7F211C6E5719009C6149 /* SimulatedClick.h in Headers */ = {isa = PBXBuildFile; fileRef = 572A7F201C6E5719009C6149 /* SimulatedClick.h */; };
 		572A7F231C6E5A66009C6149 /* SimulatedClick.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 572A7F221C6E5A66009C6149 /* SimulatedClick.cpp */; };
+		57EF5E601D20C83900171E60 /* TextCodecReplacement.h in Headers */ = {isa = PBXBuildFile; fileRef = 57EF5E5F1D20C83900171E60 /* TextCodecReplacement.h */; };
+		57EF5E621D20D28700171E60 /* TextCodecReplacement.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 57EF5E611D20D28700171E60 /* TextCodecReplacement.cpp */; };
 		580371611A66F00A00BAF519 /* ClipRect.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 5803715F1A66F00A00BAF519 /* ClipRect.cpp */; };
 		580371621A66F00A00BAF519 /* ClipRect.h in Headers */ = {isa = PBXBuildFile; fileRef = 580371601A66F00A00BAF519 /* ClipRect.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		580371641A66F1D300BAF519 /* LayerFragment.h in Headers */ = {isa = PBXBuildFile; fileRef = 580371631A66F1D300BAF519 /* LayerFragment.h */; settings = {ATTRIBUTES = (Private, ); }; };
@@ -9997,6 +9999,8 @@
 		55D408F71A7C631800C78450 /* SVGImageClients.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SVGImageClients.h; sourceTree = "<group>"; };
 		572A7F201C6E5719009C6149 /* SimulatedClick.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SimulatedClick.h; sourceTree = "<group>"; };
 		572A7F221C6E5A66009C6149 /* SimulatedClick.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = SimulatedClick.cpp; sourceTree = "<group>"; };
+		57EF5E5F1D20C83900171E60 /* TextCodecReplacement.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = TextCodecReplacement.h; sourceTree = "<group>"; };
+		57EF5E611D20D28700171E60 /* TextCodecReplacement.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = TextCodecReplacement.cpp; sourceTree = "<group>"; };
 		5803715F1A66F00A00BAF519 /* ClipRect.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ClipRect.cpp; sourceTree = "<group>"; };
 		580371601A66F00A00BAF519 /* ClipRect.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ClipRect.h; sourceTree = "<group>"; };
 		580371631A66F1D300BAF519 /* LayerFragment.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = LayerFragment.h; sourceTree = "<group>"; };
@@ -22349,6 +22353,8 @@
 				B2C3DA0C0D006C1D00EF6F26 /* TextCodecICU.h */,
 				B2C3DA0D0D006C1D00EF6F26 /* TextCodecLatin1.cpp */,
 				B2C3DA0E0D006C1D00EF6F26 /* TextCodecLatin1.h */,
+				57EF5E611D20D28700171E60 /* TextCodecReplacement.cpp */,
+				57EF5E5F1D20C83900171E60 /* TextCodecReplacement.h */,
 				B2C3DA0F0D006C1D00EF6F26 /* TextCodecUserDefined.cpp */,
 				B2C3DA100D006C1D00EF6F26 /* TextCodecUserDefined.h */,
 				B2C3DA110D006C1D00EF6F26 /* TextCodecUTF16.cpp */,
@@ -27911,6 +27917,7 @@
 				BC3BE9950E9C1C7C00835588 /* RenderScrollbarPart.h in Headers */,
 				BC3BE9990E9C1E5D00835588 /* RenderScrollbarTheme.h in Headers */,
 				458FE40A1589DF0B005609E6 /* RenderSearchField.h in Headers */,
+				57EF5E601D20C83900171E60 /* TextCodecReplacement.h in Headers */,
 				0F11A54F0F39233100C37884 /* RenderSelectionInfo.h in Headers */,
 				AB247A6D0AFD6383003FA5FD /* RenderSlider.h in Headers */,
 				31955A88160D199200858025 /* RenderSnapshottedPlugIn.h in Headers */,
@@ -30189,6 +30196,7 @@
 				F5C041DA0FFCA7CE00839D4A /* HTMLDataListElement.cpp in Sources */,
 				D359D789129CA2710006E5D2 /* HTMLDetailsElement.cpp in Sources */,
 				A8EA79F90A1916DF00A8EF5F /* HTMLDirectoryElement.cpp in Sources */,
+				57EF5E621D20D28700171E60 /* TextCodecReplacement.cpp in Sources */,
 				A8EA7CB10A192B9C00A8EF5F /* HTMLDivElement.cpp in Sources */,
 				A8EA79F50A1916DF00A8EF5F /* HTMLDListElement.cpp in Sources */,
 				93F19A9108245E59001E9ABC /* HTMLDocument.cpp in Sources */,

Modified: trunk/Source/WebCore/platform/text/TextAllInOne.cpp (202598 => 202599)


--- trunk/Source/WebCore/platform/text/TextAllInOne.cpp	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/Source/WebCore/platform/text/TextAllInOne.cpp	2016-06-29 01:04:05 UTC (rev 202599)
@@ -30,6 +30,7 @@
 #include "TextCodec.cpp"
 #include "TextCodecICU.cpp"
 #include "TextCodecLatin1.cpp"
+#include "TextCodecReplacement.cpp"
 #include "TextCodecUTF16.cpp"
 #include "TextCodecUTF8.cpp"
 #include "TextCodecUserDefined.cpp"

Added: trunk/Source/WebCore/platform/text/TextCodecReplacement.cpp (0 => 202599)


--- trunk/Source/WebCore/platform/text/TextCodecReplacement.cpp	                        (rev 0)
+++ trunk/Source/WebCore/platform/text/TextCodecReplacement.cpp	2016-06-29 01:04:05 UTC (rev 202599)
@@ -0,0 +1,71 @@
+/*
+ * Copyright (C) 2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "config.h"
+#include "TextCodecReplacement.h"
+
+namespace WebCore {
+
+static const UChar ReplacementCharacter[1] = { 0xFFFD };
+
+std::unique_ptr<TextCodec> TextCodecReplacement::create(const TextEncoding&, const void*)
+{
+    return std::make_unique<TextCodecReplacement>();
+}
+
+TextCodecReplacement::TextCodecReplacement()
+{
+}
+
+void TextCodecReplacement::registerEncodingNames(EncodingNameRegistrar registrar)
+{
+    // The 'replacement' itself is not a valid label. It is the name of
+    // a group of legacy encoding labels. Hence, it cannot be used directly.
+    registrar("replacement", "replacement");
+
+    // The labels
+    registrar("csiso2022kr", "replacement");
+    registrar("hz-gb-2312", "replacement");
+    registrar("iso-2022-cn", "replacement");
+    registrar("iso-2022-cn-ext", "replacement");
+    registrar("iso-2022-kr", "replacement");
+}
+
+void TextCodecReplacement::registerCodecs(TextCodecRegistrar registrar)
+{
+    registrar("replacement", create, 0);
+}
+
+String TextCodecReplacement::decode(const char*, size_t, bool, bool, bool& sawError)
+{
+    sawError = true;
+    if (m_sentEOF)
+        return emptyString();
+
+    m_sentEOF = true;
+    return ReplacementCharacter;
+}
+
+} // namespace WebCore

Added: trunk/Source/WebCore/platform/text/TextCodecReplacement.h (0 => 202599)


--- trunk/Source/WebCore/platform/text/TextCodecReplacement.h	                        (rev 0)
+++ trunk/Source/WebCore/platform/text/TextCodecReplacement.h	2016-06-29 01:04:05 UTC (rev 202599)
@@ -0,0 +1,51 @@
+/*
+ * Copyright (C) 2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef TextCodecReplacement_h
+#define TextCodecReplacement_h
+
+#include "TextCodecUTF8.h"
+
+namespace WebCore {
+
+class TextCodecReplacement : public TextCodecUTF8 {
+public:
+    static std::unique_ptr<TextCodec> create(const TextEncoding&, const void*);
+
+    TextCodecReplacement();
+
+    static void registerEncodingNames(EncodingNameRegistrar);
+    static void registerCodecs(TextCodecRegistrar);
+
+private:
+    String decode(const char*, size_t length, bool flush, bool stopOnError, bool& sawError) override;
+
+    bool m_sentEOF { false };
+
+};
+
+} // namespace WebCore
+
+#endif /* TextCodecReplacement_h */

Modified: trunk/Source/WebCore/platform/text/TextEncoding.cpp (202598 => 202599)


--- trunk/Source/WebCore/platform/text/TextEncoding.cpp	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/Source/WebCore/platform/text/TextEncoding.cpp	2016-06-29 01:04:05 UTC (rev 202599)
@@ -47,6 +47,9 @@
     : m_name(atomicCanonicalTextEncodingName(name))
     , m_backslashAsCurrencySymbol(backslashAsCurrencySymbol())
 {
+    // Aliases are valid, but not "replacement" itself.
+    if (m_name && isReplacementEncoding(name))
+        m_name = nullptr;
 }
 
 TextEncoding::TextEncoding(const String& name)
@@ -53,6 +56,9 @@
     : m_name(atomicCanonicalTextEncodingName(name))
     , m_backslashAsCurrencySymbol(backslashAsCurrencySymbol())
 {
+    // Aliases are valid, but not "replacement" itself.
+    if (m_name && isReplacementEncoding(name))
+        m_name = nullptr;
 }
 
 String TextEncoding::decode(const char* data, size_t length, bool stopOnError, bool& sawError) const

Modified: trunk/Source/WebCore/platform/text/TextEncodingRegistry.cpp (202598 => 202599)


--- trunk/Source/WebCore/platform/text/TextEncodingRegistry.cpp	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/Source/WebCore/platform/text/TextEncodingRegistry.cpp	2016-06-29 01:04:05 UTC (rev 202599)
@@ -29,6 +29,7 @@
 
 #include "TextCodecICU.h"
 #include "TextCodecLatin1.h"
+#include "TextCodecReplacement.h"
 #include "TextCodecUserDefined.h"
 #include "TextCodecUTF16.h"
 #include "TextCodecUTF8.h"
@@ -267,6 +268,22 @@
     return canonicalEncodingName && japaneseEncodings && japaneseEncodings->contains(canonicalEncodingName);
 }
 
+bool isReplacementEncoding(const char* alias)
+{
+    if (!alias)
+        return false;
+
+    if (strlen(alias) != 11)
+        return false;
+
+    return !strcasecmp(alias, "replacement");
+}
+
+bool isReplacementEncoding(const String& alias)
+{
+    return equalLettersIgnoringASCIICase(alias, "replacement");
+}
+
 bool shouldShowBackslashAsCurrencySymbolIn(const char* canonicalEncodingName)
 {
     return canonicalEncodingName && nonBackslashEncodings && nonBackslashEncodings->contains(canonicalEncodingName);
@@ -274,6 +291,9 @@
 
 static void extendTextCodecMaps()
 {
+    TextCodecReplacement::registerEncodingNames(addToTextEncodingNameMap);
+    TextCodecReplacement::registerCodecs(addToTextCodecMap);
+
     TextCodecICU::registerEncodingNames(addToTextEncodingNameMap);
     TextCodecICU::registerCodecs(addToTextCodecMap);
 

Modified: trunk/Source/WebCore/platform/text/TextEncodingRegistry.h (202598 => 202599)


--- trunk/Source/WebCore/platform/text/TextEncodingRegistry.h	2016-06-29 00:47:35 UTC (rev 202598)
+++ trunk/Source/WebCore/platform/text/TextEncodingRegistry.h	2016-06-29 01:04:05 UTC (rev 202599)
@@ -46,6 +46,8 @@
     bool noExtendedTextEncodingNameUsed();
     bool isJapaneseEncoding(const char* canonicalEncodingName);
     bool shouldShowBackslashAsCurrencySymbolIn(const char* canonicalEncodingName);
+    bool isReplacementEncoding(const char* alias);
+    bool isReplacementEncoding(const String& alias);
 
     WEBCORE_EXPORT String defaultTextEncodingNameForSystemLanguage();
 