[Wikidata-bugs] [Maniphest] [Commented On] T161263: Wikidata does not accept characters ending in \x85 (Cyrillic х, Armenian Յ, Arabic م etc.) in labels/aliases/descriptions

2017-03-24 Thread Drbug
Drbug added a comment.
I'm now even more convinced that the problem is in the code that replaces 0x85 (NEL) with 0x0D 0x0A (CR+LF), because 0xD1 0x0D (or 0xD1 0x0A), 0xD3 0x0D (or 0xD3 0x0A), etc. are indeed malformed UTF-8 sequences.

TASK DETAIL
https://phabricator.wikimedia.org/T161263

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: thiemowmde, Drbug
Cc: thiemowmde, Lydia_Pintscher, aude, gerritbot, Ladsgroup, Lea_Lacroix_WMDE, Drbug, MaxBioHazard, Mahir256, TerraCodes, Jay8g, Aklapper, Base, NickK, Adik2382, Jrbranaa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, Lewizho99, Maathavan, D3r1ck01, Izno, Wikidata-bugs, Mbch331
___
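
The hypothesis in the comment above can be reproduced with a minimal sketch. The function `naive_normalize_newlines` is hypothetical: it stands in for whatever code performs the replacement and is not taken from Wikibase or MediaWiki. It only illustrates why a byte-level replacement of 0x85 would turn a valid label into malformed UTF-8.

```python
def naive_normalize_newlines(raw: bytes) -> bytes:
    # Hypothetical buggy step: treats every 0x85 byte as NEL and
    # replaces it with CR+LF, without looking at the preceding lead byte.
    return raw.replace(b"\x85", b"\r\n")


def is_valid_utf8(raw: bytes) -> bool:
    # A byte string is valid UTF-8 if it decodes without error.
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Cyrillic small letter ha (U+0445) is encoded in UTF-8 as D1 85.
label = "\u0445".encode("utf-8")
corrupted = naive_normalize_newlines(label)

print(label.hex(" "))      # d1 85
print(corrupted.hex(" "))  # d1 0d 0a
print(is_valid_utf8(label), is_valid_utf8(corrupted))  # True False
```

The lead byte 0xD1 announces a two-byte sequence, so it must be followed by a continuation byte in the range 0x80 to 0xBF. After the replacement it is followed by 0x0D, which is why the result no longer decodes.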
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T161263: Wikidata does not accept lowercase Cyrillic х and several characters of other alphabets in labels/aliases/descriptions

2017-03-24 Thread Drbug
Drbug added a comment.
Could this be related to the fact that the Unicode NEL (Next Line) character is U+0085?
In UTF-8 it should be encoded as 0xC2 0x85, but some code that checks for newlines might mistakenly check for the single byte 0x85 instead.

TASK DETAIL
https://phabricator.wikimedia.org/T161263

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Drbug
Cc: Drbug, MaxBioHazard, Mahir256, TerraCodes, Jay8g, Aklapper, Base, NickK, Jrbranaa, QZanden, D3r1ck01, Izno, Wikidata-bugs, aude, Mbch331
___
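
To make the suspected mistake concrete, here is a small sketch that compares a check on raw bytes with a check on decoded text. Both function names are hypothetical and only illustrate the difference; they do not come from the Wikidata code base.

```python
NEL = "\u0085"  # Unicode NEL (Next Line)


def has_nel_bytewise(raw: bytes) -> bool:
    # Hypothetical mistaken check: looks for the single byte 0x85,
    # which is also the trailing byte of many two-byte UTF-8 sequences.
    return b"\x85" in raw


def has_nel_decoded(raw: bytes) -> bool:
    # Checks for the actual code point U+0085 after decoding.
    return NEL in raw.decode("utf-8")


samples = [
    "\u0445",            # Cyrillic small letter ha, UTF-8: D1 85
    "\u0545",            # Armenian capital letter yi, UTF-8: D5 85
    "\u0645",            # Arabic letter meem, UTF-8: D9 85
    "a" + NEL + "b",     # a real NEL, UTF-8: 61 C2 85 62
]

for text in samples:
    raw = text.encode("utf-8")
    print(raw.hex(" "), has_nel_bytewise(raw), has_nel_decoded(raw))
```

The byte-level check reports a newline for all four samples, while the decoded check reports one only for the last sample, the string that really contains U+0085.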