Carlb added a comment.
OK, so what happened?
- The script retrieved [[pt:*]] and found a bunch of interwikis on that page:
[[en:*]] [[`~:*]] [[de:Asterix]] [[fr:Astérix]] [[it:Asterix (fumetto)]]
[[ja:*]] [[nl:Asterix]] [[pl:Asterix]]
- As [[pt:*]] already has a Wikibase item, it tried to follow each of the
interwikis on the page to see if they could be merged to the existing item
- self.try_to_merge(item) calls self.get_items() to retrieve the Wikibase
Q-item number for every one of those other pages. Presumably, if it comes back
with more than one Q-item number, that's a conflicting link (as appeared in the
"Weird Al" Yankovic page example a few lines earlier) so the script will skip
those. That seems to be the only reason it's retrieving all those items.
- get_items() finds no repository at all on fr:uncyc (which is true, because
it's an externally-hosted project). It should just treat that as there being no
Q-item linked from the French page, but it doesn't do that... it fails to
handle the error and exits.
So now what? If scripts/interwikidata.py lines 156-169 look like this:
    def get_items(self):
        """Return all items of pages linked through the interwiki."""
        wd_data = set()
        for iw_page in self.iwlangs.values():
            if not iw_page.exists():
                warning('Interwiki {} does not exist, skipping...'
                        .format(iw_page.title(as_link=True)))
                continue
            try:
                wd_data.add(pywikibot.ItemPage.fromPage(iw_page))
            except pywikibot.NoPage:
                output('Interwiki {} does not have an item'
                       .format(iw_page.title(as_link=True)))
        return wd_data
then there's a handler for NoPage but not one for an externally-hosted
project having no direct access to the repo.
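To illustrate why the missing handler makes the whole run exit, here is a minimal sketch that does not use pywikibot at all. NoPage and WikiBaseError are local stand-in classes, fake_from_page() is a hypothetical replacement for ItemPage.fromPage(), and the page labels and the rule "fr: has no repository" are made up for the demonstration:

```python
class NoPage(Exception):
    """Local stand-in for pywikibot.NoPage (not the real class)."""


class WikiBaseError(Exception):
    """Local stand-in for pywikibot.WikiBaseError (not the real class)."""


def fake_from_page(title):
    """Hypothetical replacement for ItemPage.fromPage()."""
    if title.startswith('fr:'):
        # Assumption for this sketch: the fr: wiki has no repository.
        raise WikiBaseError('{} has no data repository'.format(title))
    if title.startswith('ja:'):
        # Assumption for this sketch: the ja: page has no item.
        raise NoPage(title)
    return 'Q-item for ' + title


def get_items_original(titles):
    """Same shape as the routine above: only NoPage is handled."""
    items = set()
    for title in titles:
        try:
            items.add(fake_from_page(title))
        except NoPage:
            print('{} does not have an item'.format(title))
    return items


def run_original(titles):
    """Show that WikiBaseError escapes the loop and aborts the lookup."""
    try:
        return get_items_original(titles)
    except WikiBaseError as err:
        return 'aborted: {}'.format(err)
```

In this sketch the ja: page is skipped with a message, but a single fr: page aborts the whole loop, so the pages listed after it are never looked at.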
Change that routine to this and the script will run:
    def get_items(self):
        """Return all items of pages linked through the interwiki."""
        wd_data = set()
        # Debugging output; print() with parentheses works on Python 2 and 3.
        print('get_items: ', self.iwlangs, ' : ', self.iwlangs.values())
        for iw_page in self.iwlangs.values():
            if not iw_page.exists():
                warning('Interwiki {} does not exist, skipping...'
                        .format(iw_page.title(as_link=True)))
                continue
            try:
                print('- wd_data ', wd_data)
                print('- adding ', pywikibot.ItemPage.fromPage(iw_page))
                wd_data.add(pywikibot.ItemPage.fromPage(iw_page))
            except pywikibot.NoPage:
                output('Interwiki {} does not have an item'
                       .format(iw_page.title(as_link=True)))
            except pywikibot.WikiBaseError:
                # Raised when the linked wiki has no Wikibase repository.
                output('Site {} has no Wikibase repository'
                       .format(iw_page.title(as_link=True)))
        print('wd_data: ', wd_data)
        return wd_data
With this change, a WikiBaseError (which will occur if a wiki has no repo
access) is treated the same way as the page being missing or containing no
Wikibase link.
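The intended behaviour can be checked without a live wiki. The following is only a sketch under stated assumptions: FakePage, fake_from_page() and the two exception classes are local stand-ins rather than pywikibot's own, and 'Q1' is a made-up item id. It also does the lookup once per page and records why each page was skipped, instead of printing:

```python
class NoPage(Exception):
    """Local stand-in for pywikibot.NoPage (not the real class)."""


class WikiBaseError(Exception):
    """Local stand-in for pywikibot.WikiBaseError (not the real class)."""


class FakePage:
    """Hypothetical page object with just the bits this sketch needs."""

    def __init__(self, title, exists=True, has_repo=True, item=None):
        self._title = title
        self._exists = exists
        self.has_repo = has_repo
        self.item = item

    def exists(self):
        return self._exists

    def title(self):
        return self._title


def fake_from_page(page):
    """Hypothetical replacement for ItemPage.fromPage()."""
    if not page.has_repo:
        raise WikiBaseError(page.title())
    if page.item is None:
        raise NoPage(page.title())
    return page.item


def get_items_fixed(pages):
    """Collect items; a missing repository is skipped like a missing item."""
    items = set()
    skipped = []
    for page in pages:
        if not page.exists():
            skipped.append((page.title(), 'missing page'))
            continue
        try:
            item = fake_from_page(page)  # single lookup per page
        except NoPage:
            skipped.append((page.title(), 'no item'))
        except WikiBaseError:
            skipped.append((page.title(), 'no repository'))
        else:
            items.add(item)
    return items, skipped
```

Fed a list that mixes a page with an item, a page on a repo-less wiki, a page without an item and a nonexistent page, the loop runs to the end and returns the one item it found, which is the behaviour the patched get_items() is aiming for.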
TASK DETAIL
https://phabricator.wikimedia.org/T221556