Fhocutt has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/199814

Change subject: Make non-Latin characters display in json reports - WIP
......................................................................

Make non-Latin characters display in json reports - WIP

WIP. Does not yet display the *correct* non-Latin characters.

Add new utility functions to wikimetrics/utils.py that enable the display of
accented and other non-ASCII characters in usernames in the JSON reports by
passing ensure_ascii=False to json.dumps (the default is True).
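
As a rough illustration of what ensure_ascii changes, here is a minimal
standalone sketch. It uses only the standard json module; the "name" key
and the sample dict are made up for the example and are not part of
wikimetrics.

```python
# -*- coding: utf-8 -*-
import json

# Hypothetical sample data; only json.dumps itself comes from the patch.
name = u"Skalníková"

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps({"name": name})

# With ensure_ascii=False the characters are emitted as-is.
readable = json.dumps({"name": name}, ensure_ascii=False)

print(escaped)
print(readable)

# Both forms decode back to the same data; only the serialized text differs.
assert json.loads(escaped) == json.loads(readable)
```

Either output is valid JSON, so the flag affects how readable the raw
response is rather than what a JSON parser gets back.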

Modify report_result_json in reports.py to call the new unicode_json_response
function, and modify add_user_names_to_json to decode the UTF-8 bytestrings
(the user names and the WikiUserKey strings) into unicode objects. (This part
probably still has problems.)

This patch allows non-ASCII characters to be printed, but names such as
"Skalníková" and "Castelán" are still displayed with garbled characters,
so encoding problems remain.
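
One common way this kind of garbling arises is UTF-8 bytes being decoded
with a single-byte codec such as Latin-1 somewhere along the way. Whether
that is the actual cause here has not been confirmed; the sketch below only
shows the general effect and how a correct decode differs.

```python
# -*- coding: utf-8 -*-

name = u"Castelán"

# The bytes as they would be stored if the name is UTF-8 encoded.
utf8_bytes = name.encode("utf-8")

# Decoding with the wrong codec turns each multi-byte character
# into two unrelated Latin-1 characters.
garbled = utf8_bytes.decode("latin-1")

# Decoding with the matching codec restores the original text.
round_trip = utf8_bytes.decode("utf-8")

print(repr(garbled))
print(repr(round_trip))
```

If the stored bytes were already mis-decoded once before reaching this
code, a single .decode('utf-8') would not be enough to recover the name.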

Bug: T93023
Change-Id: Ide98e20eb54523353153ccd212df511a9298bd16
---
M wikimetrics/controllers/reports.py
M wikimetrics/utils.py
2 files changed, 15 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/analytics/wikimetrics refs/changes/14/199814/1

diff --git a/wikimetrics/controllers/reports.py b/wikimetrics/controllers/reports.py
index 92e0cfe..989a2b3 100644
--- a/wikimetrics/controllers/reports.py
+++ b/wikimetrics/controllers/reports.py
@@ -12,7 +12,7 @@
     Report, RunReport, ReportStore, WikiUserStore, WikiUserKey,
 )
 from wikimetrics.utils import (
-    json_response, json_error, json_redirect, thirty_days_ago, stringify
+    json_response, json_error, json_redirect, thirty_days_ago, stringify, unicode_json_response
 )
 from wikimetrics.enums import Aggregation, TimeseriesChoices
 from wikimetrics.exceptions import UnauthorizedReportAccessError
@@ -383,7 +383,7 @@
             user_names = get_usernames_for_task_result(result[result_key])
             json_result_with_names = add_user_names_to_json(json_result,
                                                             user_names)
-            return json_response(json_result_with_names)
+            return unicode_json_response(json_result_with_names)
         else:
             return json_response(json_result)
     else:
@@ -400,8 +400,8 @@
     """
     new_individual_ids = {}
     for individual in json_result['result'][Aggregation.IND]:
-        user_name = user_names[WikiUserKey.fromstr(individual)]
-        new_id_string = '{}|{}'.format(user_name, individual)
+        user_name = user_names[WikiUserKey.fromstr(individual)].decode('utf-8')
+        new_id_string = u'{}|{}'.format(user_name, individual.decode('utf-8'))
         new_individual_ids[individual] = new_id_string
 
     json_with_names = deepcopy(json_result)
diff --git a/wikimetrics/utils.py b/wikimetrics/utils.py
index 7d4d658..a6171a0 100644
--- a/wikimetrics/utils.py
+++ b/wikimetrics/utils.py
@@ -43,6 +43,17 @@
     return date_object.strftime(PRETTY_TIMESTAMP)
 
 
+def unicode_json_string(obj):
+    return json.dumps(obj, cls=BetterEncoder, indent=4, ensure_ascii=False)
+
+def unicode_stringify(*args, **kwargs):
+    return unicode_json_string(dict(*args, **kwargs))
+
+def unicode_json_response(*args, **kwargs):
+    data = unicode_stringify(*args, **kwargs)
+    return Response(data, mimetype='application/json')
+
+
 def json_string(obj):
     return json.dumps(obj, cls=BetterEncoder, indent=4)
 

-- 
To view, visit https://gerrit.wikimedia.org/r/199814
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ide98e20eb54523353153ccd212df511a9298bd16
Gerrit-PatchSet: 1
Gerrit-Project: analytics/wikimetrics
Gerrit-Branch: master
Gerrit-Owner: Fhocutt <[email protected]>
