Title: [253648] trunk
Revision
253648
Author
ysuz...@apple.com
Date
2019-12-17 14:12:00 -0800 (Tue, 17 Dec 2019)

Log Message

[JSC] 8Bit JSRopeString can contain 16Bit string in its rope
https://bugs.webkit.org/show_bug.cgi?id=205323

Reviewed by Mark Lam.

JSTests:

* stress/8bit-resolve-can-encounter-16bit-string.js: Added.
(foo):

Source/_javascript_Core:

When resolving JSRopeString, it is possible that 8Bit JSRopeString becomes 16Bit resolved JSString.
This happens when we attempt to resolve it to produce AtomicStringImpl, and 16Bit version of the
resolved content is already in AtomicStringTable. This means that 16Bit flag never changes after resolving
JSString, but that of JSRopeString is some sort of hint, which can be changed.

This means that 8Bit JSRopeString can include 16Bit JSString, since some of children can be changed from
8Bit JSRopeString to resolved 16Bit JSString. Even in that case, we can still ensure that resolved string
can be represented as 8Bit. Let's see the example.

    A => B + C, 8Bit Rope
    B => D + E, 8Bit Rope
    C => 8Bit String

And when we convert B to 16Bit String since content of `D + E` is already registered as 16Bit String in AtomicStringTable.

    A => B + C, 8Bit Rope
    B => 16Bit String
    C => 8Bit String

When resolving A, creating 8Bit string buffer is totally OK since we know that whole A string can be represented in 8Bit.
When copying the content of B into 8Bit buffer, we should ignore upper 8Bit since they must be zero.

In this patch, we completely share the implementation of resolveRopeInternalNoSubstring and resolveRopeSlowCase in 8Bit and
16Bit case: we take result buffer CharacterType, but the underlying code must check `is8Bit()` for each fiber.

* runtime/JSCJSValue.cpp:
(JSC::JSValue::dumpInContextAssumingStructure const):
* runtime/JSString.cpp:
(JSC::JSRopeString::resolveRopeInternal8 const):
(JSC::JSRopeString::resolveRopeInternal16 const):
(JSC::JSRopeString::resolveRopeInternalNoSubstring const):
(JSC::JSRopeString::resolveRopeWithFunction const):
(JSC::JSRopeString::resolveRopeSlowCase const):
(JSC::JSRopeString::resolveRopeInternal8NoSubstring const): Deleted.
(JSC::JSRopeString::resolveRopeInternal16NoSubstring const): Deleted.
(JSC::JSRopeString::resolveRopeSlowCase8 const): Deleted.
* runtime/JSString.h:

Source/WTF:

* wtf/text/StringImpl.h:
(WTF::StringImpl::copyCharacters):

Modified Paths

Added Paths

Diff

Modified: trunk/JSTests/ChangeLog (253647 => 253648)


--- trunk/JSTests/ChangeLog	2019-12-17 21:47:56 UTC (rev 253647)
+++ trunk/JSTests/ChangeLog	2019-12-17 22:12:00 UTC (rev 253648)
@@ -1,3 +1,13 @@
+2019-12-17  Yusuke Suzuki  <ysuz...@apple.com>
+
+        [JSC] 8Bit JSRopeString can contain 16Bit string in its rope
+        https://bugs.webkit.org/show_bug.cgi?id=205323
+
+        Reviewed by Mark Lam.
+
+        * stress/8bit-resolve-can-encounter-16bit-string.js: Added.
+        (foo):
+
 2019-12-13  Yusuke Suzuki  <ysuz...@apple.com>
 
         [JSC] Remove JSFixedArray, and use JSImmutableButterfly instead

Added: trunk/JSTests/stress/8bit-resolve-can-encounter-16bit-string.js (0 => 253648)


--- trunk/JSTests/stress/8bit-resolve-can-encounter-16bit-string.js	                        (rev 0)
+++ trunk/JSTests/stress/8bit-resolve-can-encounter-16bit-string.js	2019-12-17 22:12:00 UTC (rev 253648)
@@ -0,0 +1,11 @@
+function foo() {
+    const s0 = ''.padStart(2049, ()=>{});
+    const s1 = s0.padStart(2050);
+    (0)[s0];
+    s1[0];
+}
+
+const s2 = [10, foo].toLocaleString();
+const s3 = eval(s2);
+s3();
+foo();

Modified: trunk/Source/_javascript_Core/ChangeLog (253647 => 253648)


--- trunk/Source/_javascript_Core/ChangeLog	2019-12-17 21:47:56 UTC (rev 253647)
+++ trunk/Source/_javascript_Core/ChangeLog	2019-12-17 22:12:00 UTC (rev 253648)
@@ -1,3 +1,48 @@
+2019-12-17  Yusuke Suzuki  <ysuz...@apple.com>
+
+        [JSC] 8Bit JSRopeString can contain 16Bit string in its rope
+        https://bugs.webkit.org/show_bug.cgi?id=205323
+
+        Reviewed by Mark Lam.
+
+        When resolving JSRopeString, it is possible that 8Bit JSRopeString becomes 16Bit resolved JSString.
+        This happens when we attempt to resolve it to produce AtomicStringImpl, and 16Bit version of the
+        resolved content is already in AtomicStringTable. This means that 16Bit flag never changes after resolving
+        JSString, but that of JSRopeString is some sort of hint, which can be changed.
+
+        This means that 8Bit JSRopeString can include 16Bit JSString, since some of children can be changed from
+        8Bit JSRopeString to resolved 16Bit JSString. Even in that case, we can still ensure that resolved string
+        can be represented as 8Bit. Let's see the example.
+
+            A => B + C, 8Bit Rope
+            B => D + E, 8Bit Rope
+            C => 8Bit String
+
+        And when we convert B to 16Bit String since content of `D + E` is already registered as 16Bit String in AtomicStringTable.
+
+            A => B + C, 8Bit Rope
+            B => 16Bit String
+            C => 8Bit String
+
+        When resolving A, creating 8Bit string buffer is totally OK since we know that whole A string can be represented in 8Bit.
+        When copying the content of B into 8Bit buffer, we should ignore upper 8Bit since they must be zero.
+
+        In this patch, we completely share the implementation of resolveRopeInternalNoSubstring and resolveRopeSlowCase in 8Bit and
+        16Bit case: we take result buffer CharacterType, but the underlying code must check `is8Bit()` for each fiber.
+
+        * runtime/JSCJSValue.cpp:
+        (JSC::JSValue::dumpInContextAssumingStructure const):
+        * runtime/JSString.cpp:
+        (JSC::JSRopeString::resolveRopeInternal8 const):
+        (JSC::JSRopeString::resolveRopeInternal16 const):
+        (JSC::JSRopeString::resolveRopeInternalNoSubstring const):
+        (JSC::JSRopeString::resolveRopeWithFunction const):
+        (JSC::JSRopeString::resolveRopeSlowCase const):
+        (JSC::JSRopeString::resolveRopeInternal8NoSubstring const): Deleted.
+        (JSC::JSRopeString::resolveRopeInternal16NoSubstring const): Deleted.
+        (JSC::JSRopeString::resolveRopeSlowCase8 const): Deleted.
+        * runtime/JSString.h:
+
 2019-12-17  Carlos Garcia Campos  <cgar...@igalia.com>
 
         [GLIB] jsc_context_evaluate_in_object should take the API lock before calling setGlobalScopeExtension

Modified: trunk/Source/_javascript_Core/runtime/JSCJSValue.cpp (253647 => 253648)


--- trunk/Source/_javascript_Core/runtime/JSCJSValue.cpp	2019-12-17 21:47:56 UTC (rev 253647)
+++ trunk/Source/_javascript_Core/runtime/JSCJSValue.cpp	2019-12-17 22:12:00 UTC (rev 253648)
@@ -274,6 +274,11 @@
                     out.print(" (symbol)");
             } else
                 out.print(" (unresolved)");
+            if (string->is8Bit())
+                out.print(",8Bit:(1)");
+            else
+                out.print(",8Bit:(0)");
+            out.print(",length:(", string->length(), ")");
             out.print(": ", impl);
         } else if (structure->classInfo()->isSubClassOf(RegExp::info()))
             out.print("RegExp: ", *jsCast<RegExp*>(asCell()));

Modified: trunk/Source/_javascript_Core/runtime/JSString.cpp (253647 => 253648)


--- trunk/Source/_javascript_Core/runtime/JSString.cpp	2019-12-17 21:47:56 UTC (rev 253647)
+++ trunk/Source/_javascript_Core/runtime/JSString.cpp	2019-12-17 22:12:00 UTC (rev 253648)
@@ -153,28 +153,9 @@
         return;
     }
     
-    resolveRopeInternal8NoSubstring(buffer);
+    resolveRopeInternalNoSubstring(buffer);
 }
 
-void JSRopeString::resolveRopeInternal8NoSubstring(LChar* buffer) const
-{
-    for (size_t i = 0; i < s_maxInternalRopeLength && fiber(i); ++i) {
-        if (fiber(i)->isRope()) {
-            resolveRopeSlowCase8(buffer);
-            return;
-        }
-    }
-
-    LChar* position = buffer;
-    for (size_t i = 0; i < s_maxInternalRopeLength && fiber(i); ++i) {
-        const StringImpl& fiberString = *fiber(i)->valueInternal().impl();
-        unsigned length = fiberString.length();
-        StringImpl::copyCharacters(position, fiberString.characters8(), length);
-        position += length;
-    }
-    ASSERT((buffer + length()) == position);
-}
-
 void JSRopeString::resolveRopeInternal16(UChar* buffer) const
 {
     if (isSubstring()) {
@@ -183,10 +164,11 @@
         return;
     }
     
-    resolveRopeInternal16NoSubstring(buffer);
+    resolveRopeInternalNoSubstring(buffer);
 }
 
-void JSRopeString::resolveRopeInternal16NoSubstring(UChar* buffer) const
+template<typename CharacterType>
+void JSRopeString::resolveRopeInternalNoSubstring(CharacterType* buffer) const
 {
     for (size_t i = 0; i < s_maxInternalRopeLength && fiber(i); ++i) {
         if (fiber(i)->isRope()) {
@@ -195,7 +177,7 @@
         }
     }
 
-    UChar* position = buffer;
+    CharacterType* position = buffer;
     for (size_t i = 0; i < s_maxInternalRopeLength && fiber(i); ++i) {
         const StringImpl& fiberString = *fiber(i)->valueInternal().impl();
         unsigned length = fiberString.length();
@@ -306,7 +288,7 @@
         }
         vm.heap.reportExtraMemoryAllocated(newImpl->cost());
 
-        resolveRopeInternal8NoSubstring(buffer);
+        resolveRopeInternalNoSubstring(buffer);
         convertToNonRope(function(newImpl.releaseNonNull()));
         return valueInternal();
     }
@@ -319,7 +301,7 @@
     }
     vm.heap.reportExtraMemoryAllocated(newImpl->cost());
     
-    resolveRopeInternal16NoSubstring(buffer);
+    resolveRopeInternalNoSubstring(buffer);
     convertToNonRope(function(newImpl.releaseNonNull()));
     return valueInternal();
 }
@@ -341,45 +323,10 @@
 // Vector before performing any concatenation, but by working backwards we likely
 // only fill the queue with the number of substrings at any given level in a
 // rope-of-ropes.)
-void JSRopeString::resolveRopeSlowCase8(LChar* buffer) const
+template<typename CharacterType>
+void JSRopeString::resolveRopeSlowCase(CharacterType* buffer) const
 {
-    LChar* position = buffer + length(); // We will be working backwards over the rope.
-    Vector<JSString*, 32, UnsafeVectorOverflow> workQueue; // Putting strings into a Vector is only OK because there are no GC points in this method.
-    
-    for (size_t i = 0; i < s_maxInternalRopeLength && fiber(i); ++i)
-        workQueue.append(fiber(i));
-
-    while (!workQueue.isEmpty()) {
-        JSString* currentFiber = workQueue.last();
-        workQueue.removeLast();
-
-        const LChar* characters;
-        
-        if (currentFiber->isRope()) {
-            JSRopeString* currentFiberAsRope = static_cast<JSRopeString*>(currentFiber);
-            if (!currentFiberAsRope->isSubstring()) {
-                for (size_t i = 0; i < s_maxInternalRopeLength && currentFiberAsRope->fiber(i); ++i)
-                    workQueue.append(currentFiberAsRope->fiber(i));
-                continue;
-            }
-            ASSERT(!currentFiberAsRope->substringBase()->isRope());
-            characters =
-                currentFiberAsRope->substringBase()->valueInternal().characters8() +
-                currentFiberAsRope->substringOffset();
-        } else
-            characters = currentFiber->valueInternal().characters8();
-        
-        unsigned length = currentFiber->length();
-        position -= length;
-        StringImpl::copyCharacters(position, characters, length);
-    }
-
-    ASSERT(buffer == position);
-}
-
-void JSRopeString::resolveRopeSlowCase(UChar* buffer) const
-{
-    UChar* position = buffer + length(); // We will be working backwards over the rope.
+    CharacterType* position = buffer + length(); // We will be working backwards over the rope.
     Vector<JSString*, 32, UnsafeVectorOverflow> workQueue; // These strings are kept alive by the parent rope, so using a Vector is OK.
 
     for (size_t i = 0; i < s_maxInternalRopeLength && fiber(i); ++i)

Modified: trunk/Source/_javascript_Core/runtime/JSString.h (253647 => 253648)


--- trunk/Source/_javascript_Core/runtime/JSString.h	2019-12-17 21:47:56 UTC (rev 253647)
+++ trunk/Source/_javascript_Core/runtime/JSString.h	2019-12-17 22:12:00 UTC (rev 253648)
@@ -591,13 +591,11 @@
     template<typename Function> const String& resolveRopeWithFunction(JSGlobalObject* nullOrGlobalObjectForOOM, Function&&) const;
     JS_EXPORT_PRIVATE AtomString resolveRopeToAtomString(JSGlobalObject*) const;
     JS_EXPORT_PRIVATE RefPtr<AtomStringImpl> resolveRopeToExistingAtomString(JSGlobalObject*) const;
-    void resolveRopeSlowCase8(LChar*) const;
-    void resolveRopeSlowCase(UChar*) const;
+    template<typename CharacterType> NEVER_INLINE void resolveRopeSlowCase(CharacterType*) const;
+    template<typename CharacterType> void resolveRopeInternalNoSubstring(CharacterType*) const;
     void outOfMemory(JSGlobalObject* nullOrGlobalObjectForOOM) const;
     void resolveRopeInternal8(LChar*) const;
-    void resolveRopeInternal8NoSubstring(LChar*) const;
     void resolveRopeInternal16(UChar*) const;
-    void resolveRopeInternal16NoSubstring(UChar*) const;
     StringView unsafeView(JSGlobalObject*) const;
     StringViewWithUnderlyingString viewWithUnderlyingString(JSGlobalObject*) const;
 

Modified: trunk/Source/WTF/ChangeLog (253647 => 253648)


--- trunk/Source/WTF/ChangeLog	2019-12-17 21:47:56 UTC (rev 253647)
+++ trunk/Source/WTF/ChangeLog	2019-12-17 22:12:00 UTC (rev 253648)
@@ -1,3 +1,13 @@
+2019-12-17  Yusuke Suzuki  <ysuz...@apple.com>
+
+        [JSC] 8Bit JSRopeString can contain 16Bit string in its rope
+        https://bugs.webkit.org/show_bug.cgi?id=205323
+
+        Reviewed by Mark Lam.
+
+        * wtf/text/StringImpl.h:
+        (WTF::StringImpl::copyCharacters):
+
 2019-12-17  Tim Horton  <timothy_hor...@apple.com>
 
         macCatalyst: Cursor should update on mouse movement and style change

Modified: trunk/Source/WTF/wtf/text/StringImpl.h (253647 => 253648)


--- trunk/Source/WTF/wtf/text/StringImpl.h	2019-12-17 21:47:56 UTC (rev 253647)
+++ trunk/Source/WTF/wtf/text/StringImpl.h	2019-12-17 22:12:00 UTC (rev 253648)
@@ -370,8 +370,7 @@
     ALWAYS_INLINE static StringImpl* empty() { return reinterpret_cast<StringImpl*>(&s_emptyAtomString); }
 
     // FIXME: Does this really belong in StringImpl?
-    template<typename CharacterType> static void copyCharacters(CharacterType* destination, const CharacterType* source, unsigned numCharacters);
-    static void copyCharacters(UChar* destination, const LChar* source, unsigned numCharacters);
+    template<typename SourceCharacterType, typename DestinationCharacterType> static void copyCharacters(DestinationCharacterType* destination, const SourceCharacterType* source, unsigned numCharacters);
 
     // Some string features, like reference counting and the atomicity flag, are not
     // thread-safe. We achieve thread safety by isolation, giving each thread
@@ -1074,21 +1073,25 @@
     m_refCount = tempRefCount;
 }
 
-template<typename CharacterType> inline void StringImpl::copyCharacters(CharacterType* destination, const CharacterType* source, unsigned numCharacters)
+template<typename SourceCharacterType, typename DestinationCharacterType>
+inline void StringImpl::copyCharacters(DestinationCharacterType* destination, const SourceCharacterType* source, unsigned numCharacters)
 {
-    if (numCharacters == 1) {
-        *destination = *source;
-        return;
+    static_assert(std::is_same<SourceCharacterType, LChar>::value || std::is_same<SourceCharacterType, UChar>::value);
+    static_assert(std::is_same<DestinationCharacterType, LChar>::value || std::is_same<DestinationCharacterType, UChar>::value);
+    if constexpr (std::is_same<SourceCharacterType, DestinationCharacterType>::value) {
+        if (numCharacters == 1) {
+            *destination = *source;
+            return;
+        }
+        memcpy(destination, source, numCharacters * sizeof(DestinationCharacterType));
+    } else {
+        // FIXME: We should ensure that UChar -> LChar copying happens when UChar only contains Latin-1.
+        // https://bugs.webkit.org/show_bug.cgi?id=205355
+        for (unsigned i = 0; i < numCharacters; ++i)
+            destination[i] = source[i];
     }
-    memcpy(destination, source, numCharacters * sizeof(CharacterType));
 }
 
-ALWAYS_INLINE void StringImpl::copyCharacters(UChar* destination, const LChar* source, unsigned numCharacters)
-{
-    for (unsigned i = 0; i < numCharacters; ++i)
-        destination[i] = source[i];
-}
-
 inline UChar StringImpl::at(unsigned i) const
 {
     ASSERT_WITH_SECURITY_IMPLICATION(i < m_length);
_______________________________________________
webkit-changes mailing list
webkit-changes@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-changes

Reply via email to