matej_suchanek created this task.
matej_suchanek added projects: Pywikibot, Pywikibot-Wikidata, Regression, 
Performance.
Restricted Application added subscribers: pywikibot-bugs-list, Aklapper.

TASK DESCRIPTION
  Run this code:
  
    >>> import pywikibot
    >>> repo = pywikibot.Site('wikidata', 'wikidata')
    >>> item = pywikibot.ItemPage(repo, 'Q16503')
    >>> data = item.get()
  
  The last line takes many seconds, whereas the respective API 
<https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&ids=Q16503&redirects=yes&props=info%7Csitelinks%7Caliases%7Clabels%7Cdescriptions%7Cclaims%7Cdatatype>
 call alone takes only a moment. The reason is that during this operation all 
sitelinks are initialized and (some of them) parsed in 
`SiteLink._parse_namespace`, which a) creates a new site object via 
`APISite.fromDBName` (not a cached one, as `pywikibot.Site` would return) and 
b) makes an API call for each site to get the namespace information (this can 
be very slow for many sites). Note that the combination of both caused my bot 
to crash with a `MemoryError`, with the traceback pointing to these methods.
  
  All of this is quite unexpected for bot operators who don't care about 
sitelinks (or who do, but not about which namespace they link to). Some lazy 
initialization should be introduced, probably in all of `APISite.fromDBName`, 
`SiteLink` and `ItemPage`.
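  
  A minimal sketch of what such lazy initialization could look like, written 
without Pywikibot so that it runs offline. All names here (`FakeSite`, 
`site_from_dbname`, `LazySiteLink`, `API_CALLS`) are hypothetical stand-ins 
rather than Pywikibot's actual classes, and the namespace table is made up. 
The idea is only to show two things: site objects are cached per database 
name, and the namespace lookup is deferred until somebody asks for it.

```python
from functools import lru_cache

API_CALLS = []  # records simulated API requests, for illustration only


class FakeSite:
    """Hypothetical stand-in for a site object; not pywikibot's APISite."""

    def __init__(self, dbname):
        self.dbname = dbname
        self._namespaces = None  # loaded on first access only

    @property
    def namespaces(self):
        if self._namespaces is None:
            # Simulates the slow per-site namespace request.
            API_CALLS.append(self.dbname)
            self._namespaces = {0: '', 1: 'Talk', 14: 'Category'}
        return self._namespaces


@lru_cache(maxsize=None)
def site_from_dbname(dbname):
    """Return one shared FakeSite per dbname instead of a new one each time."""
    return FakeSite(dbname)


class LazySiteLink:
    """Hypothetical sitelink that resolves its namespace only on demand."""

    def __init__(self, title, dbname):
        self.raw_title = title
        self.dbname = dbname
        self._namespace = None  # not parsed yet

    @property
    def namespace(self):
        if self._namespace is None:
            self._namespace = self._parse_namespace()
        return self._namespace

    def _parse_namespace(self):
        site = site_from_dbname(self.dbname)  # cached, no new object
        prefix, sep, _ = self.raw_title.partition(':')
        if sep:
            # Only titles with a prefix need the namespace table at all.
            for ns_id, name in site.namespaces.items():
                if name and name.lower() == prefix.strip().lower():
                    return ns_id
        return 0
```

  With this shape, constructing any number of `LazySiteLink` objects triggers 
no simulated request; the first access to `.namespace` on a prefixed title 
loads the namespace table once per site, and later links on the same site 
reuse it.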

TASK DETAIL
  https://phabricator.wikimedia.org/T226157

To: matej_suchanek
Cc: Lokal_Profil, Aklapper, matej_suchanek, pywikibot-bugs-list, Viztor, 
DannyS712, Wenyi, Darkminds3113, Jayprakash12345, Tbscho, MayS, Vali.matei, 
Mdupont, JJMC89, Dvorapa, Altostratus, Avicennasis, Volker_E, Wong128hk, 
mys_721tx, GWicke, Dinoguy1000, jayvdb, Ricordisamoa, Masti, Alchimista, Rxy, 
Jay8g