https://bugzilla.wikimedia.org/show_bug.cgi?id=58784
Web browser: ---
Bug ID: 58784
Summary: jsub and utf8
Product: Wikimedia Labs
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Unprioritized
Component: tools
Assignee: m...@uberbox.org
Reporter: sigm...@gmail.com
CC: benap...@gmail.com, t...@tim-landscheidt.de
Classification: Unclassified
Mobile Platform: ---

A Python 3 script containing this code was executed with jsub:

    import sys
    print(sys.stdout.encoding)

The resulting .out file contained "ANSI_X3.4-1968", which is another name for plain ASCII. Normally, people set the encoding to UTF-8. When a script assumes that the encoding is UTF-8 but it isn't, it fails as soon as it prints a non-ASCII character.

Another Python 3 script containing this code was executed with jsub:

    print("Talk:Gülen movement")

The resulting .err file contained this:

    Traceback (most recent call last):
      File "...", line 5, in <module>
        print("Talk:G\xfclen movement")
    UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' in position 6: ordinal not in range(128)

jsub is written in Perl, which is perfectly capable of using UTF-8 as its output encoding. Unicode matters to all of us, so I propose that jsub be changed to use it. I am not an expert with Perl, but I would try adding these two lines to the top of the file, right under "use warnings;":

    use utf8;
    use open qw/:std :utf8/;

On a slightly related note, scripts running as regular CGI also use the "ANSI_X3.4-1968" encoding. This may be out of scope for this bug, though.
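
Until the encoding is fixed on the jsub side, a script can protect itself. The following is only a sketch of a possible script-side workaround, not a change to jsub: it reproduces the failure on a simulated ASCII stream and then rewraps the same stream as UTF-8. The helper name `force_utf8` and the in-memory `raw` buffer are made up for illustration, and the helper assumes the stream is an `io.TextIOWrapper`, as `sys.stdout` normally is in CPython.

```python
import io


def force_utf8(stream):
    """Return a text stream that writes UTF-8 to the same underlying buffer.

    Hypothetical helper: assumes `stream` is an io.TextIOWrapper.
    detach() leaves the old wrapper unusable, so keep only the new one.
    """
    stream.flush()
    return io.TextIOWrapper(stream.detach(), encoding="utf-8",
                            line_buffering=True)


# Simulate the job's stdout: a text stream whose encoding is ASCII,
# matching the "ANSI_X3.4-1968" value reported above.
raw = io.BytesIO()
job_stdout = io.TextIOWrapper(raw, encoding="ascii")

try:
    print("Talk:Gülen movement", file=job_stdout)
    job_stdout.flush()
    ascii_failed = False
except UnicodeEncodeError:
    # Same error as in the .err file: 'ü' cannot be encoded as ASCII.
    ascii_failed = True

# Rewrap the same buffer as UTF-8 and try again.
job_stdout = force_utf8(job_stdout)
print("Talk:Gülen movement", file=job_stdout)
job_stdout.flush()
written = raw.getvalue()
```

In a real script the equivalent would be `sys.stdout = force_utf8(sys.stdout)` near the top, before anything is printed. Setting the environment variable `PYTHONIOENCODING=utf-8` has a similar effect without touching the code, assuming the variable actually reaches the job's environment.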