[Wikitech-l] Re: Avoid invisible characters in page titles

2023-01-17 Thread Brian Wolff
Even in English, you still have emoji that use ZWJ characters, e.g. ️‍ has "invisible" characters. There are all sorts of control characters in Unicode that usually do nothing but sometimes do something. There may be some invisible characters that make sense to strip or normalize. Indeed we
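
A minimal sketch of the ZWJ point above, assuming Python's standard `unicodedata` module. The helper name `invisible_format_chars` and the sample family emoji are illustrative and not taken from the thread; the check used here (general category `Cf`, "format") is only one possible definition of "invisible".

```python
import unicodedata


def invisible_format_chars(text):
    """Return (index, code point, name) for each Unicode format (Cf) character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# Illustrative sample: man + ZWJ + woman + ZWJ + girl, shown as one "family" emoji.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"

print(invisible_format_chars(family))
# [(1, 'U+200D', 'ZERO WIDTH JOINER'), (3, 'U+200D', 'ZERO WIDTH JOINER')]
```

Removing the two joiners here would split the single glyph into three separate emoji, which is why blanket stripping is not safe.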

[Wikitech-l] Re: Avoid invisible characters in page titles

2023-01-17 Thread Amir Sarabadani
Disallowing invisible characters or cleaning them is a bad idea. Invisible characters are heavily used in many languages, including Persian (where they are part of the official manual of style of the language taught in schools). It is downright wrong to check and fix those in many wikis in those
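
To illustrate the Persian case, here is a small sketch, assuming the invisible character in question is the zero-width non-joiner (U+200C), which standard Persian orthography uses inside words. The sample word and the `strip_zwnj` helper are hypothetical and only show what a naive cleanup would do.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER

# Illustrative Persian word ("mi-khaham", "I want"), written with a ZWNJ
# after the prefix so that the two parts are not joined when rendered.
word = "\u0645\u06CC\u200C\u062E\u0648\u0627\u0647\u0645"


def strip_zwnj(title):
    """Naive cleanup (hypothetical): drop every ZWNJ from a title."""
    return title.replace(ZWNJ, "")


stripped = strip_zwnj(word)

print(len(word), len(stripped))  # 8 7
print(word == stripped)          # False: the "cleaned" title is a different string
```

The stripped form is a different title and renders with the letters joined, which does not match the standard spelling.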

[Wikitech-l] Re: Avoid invisible characters in page titles

2023-01-17 Thread Martin Domdey
Thank you, it seems that there's nobody working on it anymore, right? Kind regards, Martin ... On Tue, 17 Jan 2023 at 12:28, Andre Klapper <aklap...@wikimedia.org> wrote: > Hi, > > On Tue, 2023-01-17 at 12:03 +0100, Martin Domdey wrote: > > isn't it better to avoid invisible

[Wikitech-l] Re: Avoid invisible characters in page titles

2023-01-17 Thread Andre Klapper
Hi, On Tue, 2023-01-17 at 12:03 +0100, Martin Domdey wrote: > isn't it better to avoid invisible characters in page titles > while creating the pages? > > Please look here, there have been problems with invisible characters > working with it when parsing or page linking those page titles with >