rusackas opened a new pull request, #41533: URL: https://github.com/apache/superset/pull/41533
### SUMMARY

Non-ASCII text inside **array / struct / JSON column values** (e.g. the CJK and Cyrillic strings produced by `array_agg`) was displayed as `\uXXXX` escape sequences in SQL Lab, Explore, and on dashboards. This is the "unicode gibberish" reported in #19388 and #22904. Plain string columns were never affected; only nested values that get JSON-serialized for the result grid.

The fix is deliberately narrow:

- `superset.utils.json.dumps` gains an opt-in `ensure_ascii: bool = True` parameter. The default is unchanged, so metadata serialization keeps escaping non-ASCII for narrow charset columns (notably MySQL `utf8`/`utf8mb3`). In particular, the `position_json` emoji-escaping from #39737 stays intact.
- Only the result-set `stringify` path (`superset/result_set.py`) opts into `ensure_ascii=False`. That is the single, DRY chokepoint through which array/struct values flow to SQL Lab, Explore, and dashboards. It affects the query result payload only, never anything persisted to the metadata database.

This avoids the breaking change and MySQL charset migration that a global `ensure_ascii=False` would have required, and it does not regress the #39737 emoji-truncation guard.

This **supersedes #33720** (same goal, narrower implementation), whose author appears to be inactive. Credit to @Quatters, retained as co-author on the commit.

### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before (`array_agg` of Cyrillic text in SQL Lab):

```
["\u041b\u043e\u043d\u0433\u0441\u043b\u0438\u0432\u044b", "\u0421\u0432\u0438\u0442\u0448\u043e\u0442\u044b"]
```

After:

```
["Лонгсливы", "Свитшоты"]
```

### TESTING INSTRUCTIONS

Automated (unit):

```bash
pytest tests/unit_tests/result_set_test.py tests/unit_tests/utils/json_tests.py
```

`test_stringify_values_preserves_non_ascii_characters` reproduces both linked issues and fails without the fix. The #39737 emoji tests (`tests/integration_tests/dashboards/test_update_emoji.py`) remain green since the metadata path is untouched.

Manual:

1. Point Superset at a Postgres analytics DB containing non-ASCII text.
2. In SQL Lab, run a query that wraps the text in `array_agg(...)` (or select an array/JSON column).
3. Confirm the result grid shows the characters verbatim, not `\uXXXX`.
4. Repeat via Explore / a dashboard table chart.

### ADDITIONAL INFORMATION

- [x] Has associated issue: Closes #19388, Closes #22904
- [ ] Required feature flags:
- [ ] Changes UI
- [ ] Includes DB Migration (follow approval process in [SIP-59](https://github.com/apache/superset/issues/13351))
  - [ ] Migration is atomic, supports rollback & is backwards-compatible
  - [ ] Confirm DB migration upgrade and downgrade tested
  - [ ] Runtime estimates and downtime expectations provided
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API

🤖 Generated with [Claude Code](https://claude.com/claude-code)
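
For reviewers who want to see the opt-in behavior from the summary in isolation, here is a minimal sketch using only the standard-library `json` module. The `dumps` and `stringify` functions below are hypothetical stand-ins written for illustration; they are not the code in `superset.utils.json` or `superset/result_set.py`, and the real helpers may take additional arguments.

```python
import json
from typing import Any


def dumps(obj: Any, ensure_ascii: bool = True) -> str:
    """Hypothetical stand-in for a JSON helper with an opt-in flag.

    The default (True) keeps escaping non-ASCII characters, so callers
    that never pass the flag see no change in output.
    """
    return json.dumps(obj, ensure_ascii=ensure_ascii)


def stringify(value: Any) -> Any:
    """Hypothetical result-set helper.

    Nested values (lists, dicts) are serialized with non-ASCII text kept
    verbatim; scalars pass through untouched.
    """
    if isinstance(value, (list, dict)):
        return dumps(value, ensure_ascii=False)
    return value


values = ["Лонгсливы", "Свитшоты"]

escaped = dumps(values)        # default path: \uXXXX escapes, pure ASCII
verbatim = stringify(values)   # result-set path: characters kept as-is

print(escaped)
print(verbatim)
```

Both strings decode to the same list, so the difference is purely in how the text is displayed. That is why the change can be limited to the result payload without touching anything stored in the metadata database.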
