garydgregory commented on code in PR #1719:
URL: https://github.com/apache/commons-lang/pull/1719#discussion_r3444044104


##########
src/test/java/org/apache/commons/lang3/StringUtilsTest.java:
##########
@@ -3054,6 +3054,18 @@ void testTruncate_StringIntInt() {
         assertEquals("", StringUtils.truncate("abcdefghijklmno", Integer.MAX_VALUE, Integer.MAX_VALUE));
     }
 
+    @Test
+    void testTruncate_StringIntInt_surrogatePair() {
+        // U+1F600 GRINNING FACE is a single supplementary code point stored as a surrogate pair
+        final String grin = "😀";
+        // a cut that would land between the two halves keeps the result well formed instead of emitting a lone surrogate
+        assertEquals("a", StringUtils.truncate("a" + grin + "b", 0, 2));
+        assertEquals(grin, StringUtils.truncate("a" + grin + "b", 1, 2));
+        assertEquals("ab", StringUtils.truncate("ab" + grin, 0, 3));
+        // an offset that lands inside a pair skips the orphaned low surrogate
+        assertEquals("a", StringUtils.truncate(grin + "ab", 1, 2));

Review Comment:
   You're missing assertions for an input string that contains _only_ 
`grin`, with permutations of the other parameters (`offset`, `maxWidth`).
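
One way to cover those permutations is sketched below, under stated assumptions. The `Cut` interface, `violations`, and `naiveCut` are hypothetical names, not part of the PR or of Commons Lang. `naiveCut` is a plain `substring` cut standing in for the method under test so the block runs without the library. In the real test the lambda would presumably be `(o, w) -> StringUtils.truncate(grin, o, w)`.

```java
import java.util.ArrayList;
import java.util.List;

final class GrinOnlyPermutations {

    /** Hypothetical stand-in for the method under test. */
    interface Cut {
        String apply(int offset, int maxWidth);
    }

    /** True if s has no unpaired surrogate; codePoints() passes lone surrogates through unchanged. */
    static boolean isWellFormed(final String s) {
        return s.codePoints().noneMatch(cp -> cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE);
    }

    /** Naive char-index cut, used here only to show what the checks catch. */
    static String naiveCut(final String s, final int offset, final int maxWidth) {
        final int start = Math.min(offset, s.length());
        final int end = (int) Math.min((long) start + maxWidth, s.length());
        return s.substring(start, end);
    }

    /** Collects every (offset,maxWidth) pair whose result is too long or contains a lone surrogate. */
    static List<String> violations(final Cut cut, final int maxOffset, final int maxMaxWidth) {
        final List<String> out = new ArrayList<>();
        for (int offset = 0; offset <= maxOffset; offset++) {
            for (int width = 0; width <= maxMaxWidth; width++) {
                final String result = cut.apply(offset, width);
                if (result.length() > width || !isWellFormed(result)) {
                    out.add("(" + offset + "," + width + ")");
                }
            }
        }
        return out;
    }

    public static void main(final String[] args) {
        final String grin = "\uD83D\uDE00"; // U+1F600 as a surrogate pair
        // The naive cut splits the pair for these (offset,maxWidth) combinations.
        System.out.println(violations((o, w) -> naiveCut(grin, o, w), 3, 3)); // [(0,1), (1,1), (1,2), (1,3)]
    }
}
```

A surrogate-aware implementation would be expected to return an empty list for the same ranges. Exact expected strings per combination would still need explicit `assertEquals` calls.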
   



##########
src/test/java/org/apache/commons/lang3/StringUtilsAbbreviateTest.java:
##########
@@ -170,6 +171,31 @@ void testAbbreviate_StringStringIntInt() {
         assertEquals("....fg", StringUtils.abbreviate("abcdefg", "....", 5, 6));
     }
 
+    @Test
+    void testAbbreviateSurrogatePair() {
+        // U+1F600 GRINNING FACE is a single supplementary code point stored as a surrogate pair
+        final String grin = "😀";
+        // the head cut backs off the pair so the marker is never preceded by a lone high surrogate
+        assertEquals("...", StringUtils.abbreviate(grin + "abcdef", 4));
+        assertEquals(grin + "...", StringUtils.abbreviate(grin + "abcdef", 5));
+        // a trailing supplementary code point is kept whole rather than sliced into a lone low surrogate
+        assertEquals("..." + grin, StringUtils.abbreviate("abcdef" + grin, 6, 5));
+        // results stay within maxWidth and never contain an unpaired surrogate
+        for (int width = 4; width <= 8; width++) {
+            final String result = StringUtils.abbreviate("a" + grin + "b" + grin + "cd", width);
+            assertTrue(result.length() <= width, () -> "result longer than maxWidth: " + result);
+            for (int i = 0; i < result.length(); i++) {
+                final char ch = result.charAt(i);
+                if (Character.isHighSurrogate(ch)) {
+                    assertTrue(i + 1 < result.length() && Character.isLowSurrogate(result.charAt(i + 1)), "lone high surrogate in: " + result);
+                    i++; // skip the paired low surrogate
+                } else {
+                    assertFalse(Character.isLowSurrogate(ch), "lone low surrogate in: " + result);
+                }
+            }
+        }
+    }
+

Review Comment:
   You're missing assertions for an input string that contains _only_ 
`grin`, with permutations of the other parameters.
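
The inline surrogate loop in the test above could be pulled into a small helper so that grin-only inputs reuse it. This is a minimal sketch, assuming a hypothetical `hasUnpairedSurrogate` helper that is not in the PR or in Commons Lang. It uses only `java.lang.Character` and shows what a naive cut of a grin-only string produces.

```java
final class SurrogateCheck {

    /**
     * Hypothetical helper: true if s contains a high surrogate not followed by a
     * low surrogate, or a low surrogate not preceded by a high surrogate.
     */
    static boolean hasUnpairedSurrogate(final CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            final char ch = s.charAt(i);
            if (Character.isHighSurrogate(ch)) {
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return true; // lone high surrogate
                }
                i++; // skip the paired low surrogate
            } else if (Character.isLowSurrogate(ch)) {
                return true; // lone low surrogate
            }
        }
        return false;
    }

    public static void main(final String[] args) {
        final String grin = "\uD83D\uDE00"; // U+1F600 as a surrogate pair
        System.out.println(grin.length());                              // 2
        System.out.println(hasUnpairedSurrogate(grin));                 // false
        System.out.println(hasUnpairedSurrogate(grin.substring(0, 1))); // true
        System.out.println(hasUnpairedSurrogate(grin.substring(1)));    // true
    }
}
```

In the test, `assertFalse(hasUnpairedSurrogate(result), ...)` could then replace the nested loop for each grin-only permutation.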
   


