[Pywikipedia-bugs] [Maniphest] [Changed CC] T76391: Only load item in harvest_template.py when needed

jayvdb Mon, 01 Dec 2014 20:18:07 -0800

jayvdb added a subscriber: jayvdb.
jayvdb added a comment.

item.get() must be called before item.claims can be accessed (on the next line).


We could:
1) replace item.claims with a different API call that only gets the list of 
properties used on the item;
2) extend option 1 into a generic approach to lazy-loading item data; or
3) move the "has claims for all properties" check further down in the process.
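
As a rough illustration of what option 2 might look like, here is a minimal, 
self-contained sketch. It does not use the pywikibot API: LazyItem, fake_fetch 
and has_all_properties are hypothetical names invented for this example, and 
the fetch function only pretends to make a request so that the number of loads 
is visible.

```python
class LazyItem:
    """Hypothetical stand-in for an item whose data loads on first use."""

    def __init__(self, item_id, fetch):
        self.id = item_id
        self._fetch = fetch      # callable that performs the expensive load
        self._content = None     # nothing has been loaded yet

    def get(self):
        """Load the item data once and cache it for later calls."""
        if self._content is None:
            self._content = self._fetch(self.id)
        return self._content

    @property
    def claims(self):
        # Accessing claims triggers the load, so callers need no explicit get().
        return self.get()["claims"]


fetch_calls = []


def fake_fetch(item_id):
    """Pretend API call; records each request instead of using the network."""
    fetch_calls.append(item_id)
    return {"claims": {"P31": ["Q5"]}}


def has_all_properties(item, wanted):
    """Return True if the item already has a claim for every wanted property."""
    return all(prop in item.claims for prop in wanted)


item = LazyItem("Q42", fake_fetch)
print(len(fetch_calls))                        # 0: constructing costs nothing
print(has_all_properties(item, ["P31"]))       # True, and triggers the load
print(has_all_properties(item, ["P569"]))      # False, served from the cache
print(len(fetch_calls))                        # 1: the data was loaded once
```

With this shape the caller never has to remember to call get() first; whether 
that is worth the extra indirection in the real item class is a separate 
question.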

The problem with option 3 is that immediately after this check, 
harvest_template does a page.get(), and page.get() is probably more expensive 
than item.get(), at least on English Wikipedia, where article text size exceeds 
typical Wikidata item JSON size. This may not be true for smaller wikis where 
the average article text size is smaller (but I would expect it to be true for 
most of the top 10 Wikipedias).

TASK DETAIL
  https://phabricator.wikimedia.org/T76391

To: Pywikibugs, jayvdb
Cc: pywikipedia-bugs, Multichill, jayvdb



_______________________________________________
Pywikipedia-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-bugs
