[Wikidata-bugs] [Maniphest] [Commented On] T161263: Wikidata does not accept characters ending in \x85 (Cyrillic х, Armenian Յ, Arabic م etc.) in labels/aliases/descriptions
Drbug added a comment.

I'm now even more convinced that the problem lies in the code that replaces the byte 0x85 (NEL) with 0x0D 0x0A (CR+LF), because 0xD1 0x0D (or 0xD1 0x0A), 0xD3 0x0D (or 0xD3 0x0A), etc. are indeed malformed UTF-8 sequences.

TASK DETAIL
https://phabricator.wikimedia.org/T161263

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: thiemowmde, Drbug
Cc: thiemowmde, Lydia_Pintscher, aude, gerritbot, Ladsgroup, Lea_Lacroix_WMDE, Drbug, MaxBioHazard, Mahir256, TerraCodes, Jay8g, Aklapper, Base, NickK, Adik2382, Jrbranaa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, Lewizho99, Maathavan, D3r1ck01, Izno, Wikidata-bugs, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
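The hypothesis in the comment above can be reproduced with a minimal Python sketch. The function names below are hypothetical and this is not the actual Wikidata or MediaWiki code; it only assumes that some step replaces the raw byte 0x85 with CR+LF without first decoding the UTF-8 text.

```python
# Hypothetical illustration only; not the actual Wikidata/MediaWiki code.

def naive_nel_to_crlf(raw: bytes) -> bytes:
    """Buggy variant: treats every 0x85 byte as NEL, ignoring UTF-8 structure."""
    return raw.replace(b"\x85", b"\r\n")


def safe_nel_to_crlf(raw: bytes) -> bytes:
    """Decode first, so that only the code point U+0085 is replaced."""
    return raw.decode("utf-8").replace("\u0085", "\r\n").encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Cyrillic small letter ha, U+0445, is encoded in UTF-8 as D1 85.
cyrillic_ha = "\u0445".encode("utf-8")

# The naive replacement turns D1 85 into D1 0D 0A: a lead byte followed
# by a non-continuation byte, which is malformed UTF-8.
damaged = naive_nel_to_crlf(cyrillic_ha)
```

In this sketch, `damaged` no longer decodes as UTF-8, while `safe_nel_to_crlf` leaves the letter untouched and still converts a real NEL into CR+LF.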
[Wikidata-bugs] [Maniphest] [Commented On] T161263: Wikidata does not accept lowercase Cyrillic х and several characters of other alphabets in labels/aliases/descriptions
Drbug added a comment.

Could this be related to the fact that the Unicode NEL (Next Line) character is U+0085? In UTF-8 it should therefore appear as 0xC2 0x85, but some code that checks for newlines might mistakenly check for the single byte 0x85 instead.

TASK DETAIL
https://phabricator.wikimedia.org/T161263

To: Drbug
Cc: Drbug, MaxBioHazard, Mahir256, TerraCodes, Jay8g, Aklapper, Base, NickK, Jrbranaa, QZanden, D3r1ck01, Izno, Wikidata-bugs, aude, Mbch331
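As a rough check of the encoding argument in this comment, the following Python sketch prints the UTF-8 bytes of NEL and of the three letters named in the task title, and contrasts a byte-level test with a test on decoded text. The helper names are hypothetical, and the byte-level check is only an assumption about what the faulty code might be doing.

```python
# Hypothetical helpers for illustration; not taken from any Wikidata code.

NEL = "\u0085"  # Unicode Next Line


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of a string as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


def has_nel_bytewise(raw: bytes) -> bool:
    """Suspected buggy check: looks for the bare byte 0x85 anywhere."""
    return 0x85 in raw


def has_nel_decoded(raw: bytes) -> bool:
    """Check on decoded text: looks for the code point U+0085."""
    return NEL in raw.decode("utf-8")


# Cyrillic ha (U+0445), Armenian Yi (U+0545), Arabic meem (U+0645)
samples = ["\u0445", "\u0545", "\u0645"]

for ch in [NEL] + samples:
    raw = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}", utf8_hex(ch),
          has_nel_bytewise(raw), has_nel_decoded(raw))
```

Each of the three letters ends in the continuation byte 0x85, so the byte-level check reports a false positive for them, while the check on decoded text matches only the real NEL.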