Re: [PR] fix(result_set): preserve unicode characters in stringified nested va… [superset]

via GitHub Wed, 13 May 2026 17:15:29 -0700


shojiiii commented on code in PR #39712:
URL: https://github.com/apache/superset/pull/39712#discussion_r3238211920



##########
tests/unit_tests/result_set_test.py:
##########
@@ -146,6 +147,49 @@ def test_stringify_with_null_timestamps():
     assert np.array_equal(result_set, expected)
 
 
+@pytest.mark.parametrize(
+    ("nested_value", "expected"),
+    [
+        pytest.param(
+            ["ASCII", "plain text"],
+            '["ASCII", "plain text"]',
+            id="ascii",
+        ),
+        pytest.param(
+            ["日本語", "ひらがな"],
+            '["日本語", "ひらがな"]',
+            id="japanese",
+        ),
+        pytest.param(
+            ["móre", "áccent"],
+            '["móre", "áccent"]',
+            id="accented-latin",
+        ),
+        pytest.param(
+            ["emoji", "😁"],
+            '["emoji", "😁"]',
+            id="emoji",
+        ),
+    ],
+)
+def test_stringify_nested_values_preserves_unicode(
+    nested_value: list[str], expected: str
+) -> None:
+    """
+    Nested values should be stringified without escaping Unicode characters.
+    """
+
+    data = [(nested_value,)]
+    description: DbapiDescription = [
+        ("tags", "ARRAY<STRING>", None, None, None, None, False)
+    ]
+
+    result_set = SupersetResultSet(data, description, BaseEngineSpec)
+    df = result_set.to_pandas_df()
+
+    assert df["tags"].iloc[0] == expected

Review Comment:
   Updated the test to avoid asserting exact JSON formatting in 
[ff5ae81](https://github.com/apache/superset/pull/39712/commits/ff5ae81619e5408bb2a7a5d156693c1f02a9c4db).
     
   It now checks semantic equality via `json.loads()`, verifies that the 
serialized value does not contain `\u` escapes, and confirms that the Unicode 
characters are present in the raw string.
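
   For reference, the three checks can be sketched roughly as follows. This is 
an illustration only, not the code from the commit: the helper name 
`assert_unicode_preserved` is hypothetical, and `json.dumps(..., ensure_ascii=False)` 
stands in for the value read from `df["tags"].iloc[0]`.

```python
import json


def assert_unicode_preserved(serialized: str, nested_value: list[str]) -> None:
    """Hypothetical helper mirroring the three checks described above."""
    # 1. Semantic equality: the string parses back to the original value,
    #    regardless of separators or whitespace chosen by the serializer.
    assert json.loads(serialized) == nested_value
    # 2. The raw serialized text contains no \uXXXX escape sequences.
    assert "\\u" not in serialized
    # 3. Each original string appears verbatim in the raw text. This holds
    #    for inputs without quotes or backslashes, like the test parameters.
    for item in nested_value:
        assert item in serialized


# Assumption: ensure_ascii=False stands in for the stringification under test.
value = ["日本語", "ひらがな"]
assert_unicode_preserved(json.dumps(value, ensure_ascii=False), value)
```

   Checking the parsed value instead of an exact string keeps the test 
independent of separator and whitespace choices in the serializer.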



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

