Ranjithsiji added a comment.
In Malayalam Unicode there is a class of characters called chillu <https://unicode.org/L2/L2005/05214-chillu.pdf>. The chillu in Malayalam are ൺ, ൻ, ർ, ൽ, ൾ.

**Now the problem:** according to the Unicode encoding, these **chillu can be created in two ways**.

The Malayalam Unicode table <https://jrgraphix.net/r/Unicode/0D00-0D7F> (enwiki <https://en.wikipedia.org/wiki/Malayalam_(Unicode_block)>) lists the following code points as chillu:
- ൺ -> 0d7a
- ൻ -> 0d7b
- ർ -> 0d7c
- ൽ -> 0d7d
- ൾ -> 0d7e
- ൿ -> 0d7f

**These are called atomic chillu**.

But there is another way to create a chillu, for example ന+്+ZWJ -> 0d28+0d4d+200D. That is [na ന]+[virāma ്]+[ZWJ]. The last component is the Zero Width Joiner <https://en.wikipedia.org/wiki/Zero-width_joiner>, aka ZWJ (Unicode website <https://unicode.org/L2/L2005/05307-zwj-zwnj.pdf>).

So a single character can be represented in two different ways, that is, by two different code point sequences in Malayalam. We are trying to avoid that, because it causes big problems with search and hyperlinks.

To resolve this problem, a fix was introduced in MediaWiki called $wgFixMalayalamUnicode <https://www.mediawiki.org/wiki/Manual:$wgFixMalayalamUnicode>. If it is set to true, all chillu created with ZWJ are replaced by the single-character chillu (atomic chillu) when a page is saved in MediaWiki. **This is happening in Malayalam Wikipedia and not happening in Wikidata.** So there is a mix of ZWJ chillu and atomic chillu in Wikidata.

**What may be the solution**
- Enable $wgFixMalayalamUnicode <https://www.mediawiki.org/wiki/Manual:$wgFixMalayalamUnicode> in the MediaWiki used by Wikidata. (But $wgFixMalayalamUnicode is deprecated. The release notes of 1.35 <https://www.mediawiki.org/wiki/Release_notes/1.35> say it is true by default.)
- Do a simple edit on all pages in Wikidata that use ZWJ chillu.

So how do we resolve this problem?
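To make the two encodings concrete, here is a minimal Python sketch of the kind of replacement described above: it rewrites consonant + virāma + ZWJ sequences to the corresponding atomic chillu. This is only an illustration, not MediaWiki's actual implementation. The names `ZWJ_TO_ATOMIC`, `to_atomic_chillu` and `has_zwj_chillu` are made up for this example, and the base consonants other than ന (0d28) are my assumption from the Unicode Malayalam block, not something stated in this task. ൿ (0d7f) is left out on purpose because I have not verified how it is handled.

```python
import re

VIRAMA = "\u0d4d"  # Malayalam sign virama (chandrakkala)
ZWJ = "\u200d"     # zero width joiner

# Hypothetical mapping for illustration: ZWJ chillu sequence -> atomic chillu.
# Only the 0d28 (na) row comes from the comment above; the other base
# consonants are assumed from the Unicode Malayalam block.
ZWJ_TO_ATOMIC = {
    "\u0d23" + VIRAMA + ZWJ: "\u0d7a",  # nna -> chillu nn
    "\u0d28" + VIRAMA + ZWJ: "\u0d7b",  # na  -> chillu n
    "\u0d30" + VIRAMA + ZWJ: "\u0d7c",  # ra  -> chillu rr
    "\u0d32" + VIRAMA + ZWJ: "\u0d7d",  # la  -> chillu l
    "\u0d33" + VIRAMA + ZWJ: "\u0d7e",  # lla -> chillu ll
}

# One alternation over all ZWJ sequences, so the text is scanned only once.
_ZWJ_CHILLU = re.compile("|".join(re.escape(seq) for seq in ZWJ_TO_ATOMIC))


def has_zwj_chillu(text: str) -> bool:
    """Return True if the text contains at least one ZWJ chillu sequence."""
    return _ZWJ_CHILLU.search(text) is not None


def to_atomic_chillu(text: str) -> str:
    """Replace every ZWJ chillu sequence with its atomic chillu."""
    return _ZWJ_CHILLU.sub(lambda m: ZWJ_TO_ATOMIC[m.group(0)], text)


if __name__ == "__main__":
    zwj_form = "\u0d28\u0d4d\u200d"  # na + virama + ZWJ
    atomic = to_atomic_chillu(zwj_form)
    # Show the code points before and after the replacement.
    print([f"{ord(c):04x}" for c in zwj_form])
    print([f"{ord(c):04x}" for c in atomic])
```

A check like `has_zwj_chillu` could be used to find the Wikidata labels that still contain the ZWJ form before deciding how to fix them. Text that already uses atomic chillu passes through `to_atomic_chillu` unchanged, so running it twice is harmless.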
TASK DETAIL https://phabricator.wikimedia.org/T266955
_______________________________________________ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs