jayvdb added a comment.

In https://phabricator.wikimedia.org/T102461#1465867, @XZise wrote:

> Even if the current MediaWiki implementation is returning normalized strings, 
> don't we still need to normalize any other title not returned by the API?


There will certainly be problems if a Link has an unnormalised unicode title.
The main problems will exist because the API does unicode normalisation of all 
input parameters and output results, but does not provide information about 
the normalisation that occurred: T29849: API: add normalized info also for unicode 
normalization of titles <https://phabricator.wikimedia.org/T29849>.

To work around that, Pywikibot would need to ask the API about each Link to 
determine whether the Link has the correct normalised title, i.e. whether its 
title matches the API's title for the same page.  I see that for Malayalam and 
Arabic, Language::normalize performs other conversions, which are probably not 
included in `unicodedata.normalize`.  We already have another case of 
server-side title normalisation which broke our Link object, as it doesn't and 
couldn't detect the correct title. I can't find the task now and will need to 
search more.
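
As a rough illustration of the local half of that check, here is a minimal 
sketch. It assumes the server normalises titles to NFC; `nfc` and 
`title_matches` are hypothetical helper names, not existing Pywikibot 
functions, and `api_title` stands for whatever title the API returned for the 
page. It does not cover the language-specific conversions mentioned above.

```python
import unicodedata


def nfc(title):
    """Return the title in Unicode NFC form (assumed to match the server)."""
    return unicodedata.normalize('NFC', title)


def title_matches(link_title, api_title):
    """Hypothetical helper: does a Link title agree with the API's title?

    Only plain unicode normalisation is applied locally, so a False result
    may also mean the server performed extra, language-specific conversions.
    """
    return nfc(link_title) == api_title


# 'e' followed by a combining acute accent versus the precomposed character.
decomposed = 'Cafe\u0301'
composed = 'Caf\u00e9'

print(decomposed == composed)               # the raw strings differ
print(title_matches(decomposed, composed))  # they agree after NFC
```

A mismatch from `title_matches` would be the signal to fall back to asking the 
API for the canonical title.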


TASK DETAIL
  https://phabricator.wikimedia.org/T102461

To: jayvdb
Cc: XZise, Ricordisamoa, gerritbot, Aklapper, jayvdb, pywikibot-bugs-list, 
Anshoe, Malyacko, P.Copp
