Platonides wrote:
> Nicolas Dumazet wrote:
>> Honestly, the pywikipedia team has changed a bit these last months,
>> and the API edit will soon be available: I've been telling myself for
>> days that interwiki.py will need a rewrite sooner or later. But this
>> is not that easy.
>>
>> I understand your concept of an "interwiki class", but finding such a
>> class is not that obvious.
>>
>> If you have a general pseudo-algorithm able to outline a
>> specific class of articles on the same subject, please share it. But I
>> think that the current behavior -- starting from a specific page,
>> building the interwiki links graph, and indexing the cycles -- if not
>> optimal, cannot be avoided that easily.
>
> No, it can't be avoided, but Purodha is right that using the
> toolserver dbs would be faster. Now, I don't know how interwiki.py is
> structured, but I think it calls for different pluggable modules for
> whatever is doing get_interwikis_from_page(). So you could have one
> acting as it does now, another obtaining the data via the API, and yet
> another one directly querying the langlinks table at the toolserver.
>
> Directly querying the langlinks table not only saves time querying the
> wiki, but allows for querying interwikis for only those wikis you're
> writing to.
> This also opens up the possibility of completely changing the source wiki
> concept, and instead querying each wiki db for links to a target wiki.

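To make the pluggable-module idea quoted above a bit more concrete, here is a rough Python sketch, not how interwiki.py is actually structured. The class names (InterwikiSource, WikitextSource, LanglinksSource), the fetch_text callback and the only_langs parameter are all made up for illustration. The SQL assumes the standard MediaWiki page and langlinks columns, and uses sqlite3-style "?" placeholders; a MySQL driver on the toolserver would want "%s" instead. An API-based backend would implement the same single method.

```python
import re
import sqlite3
from typing import Callable, Dict, Iterable, Optional, Protocol


class InterwikiSource(Protocol):
    """Hypothetical plug-in interface: anything that maps a title to {lang: title}."""

    def get_interwikis_from_page(self, title: str) -> Dict[str, str]: ...


class WikitextSource:
    """Backend that parses [[xx:Title]] links out of the page text.

    fetch_text is a caller-supplied function (assumption) returning wikitext.
    The regex is a simplification: it accepts any 2-3 letter lowercase prefix.
    """

    LINK = re.compile(r"\[\[([a-z]{2,3}(?:-[a-z]+)*):([^\]|]+)\]\]")

    def __init__(self, fetch_text: Callable[[str], str]) -> None:
        self.fetch_text = fetch_text

    def get_interwikis_from_page(self, title: str) -> Dict[str, str]:
        text = self.fetch_text(title)
        return {lang: target.strip() for lang, target in self.LINK.findall(text)}


class LanglinksSource:
    """Backend that reads the langlinks table over a DB-API connection.

    only_langs restricts the query to the wikis the bot will write to.
    """

    def __init__(self, conn, only_langs: Optional[Iterable[str]] = None) -> None:
        self.conn = conn
        self.only_langs = list(only_langs) if only_langs else None

    def get_interwikis_from_page(self, title: str) -> Dict[str, str]:
        sql = (
            "SELECT ll_lang, ll_title FROM langlinks "
            "JOIN page ON ll_from = page_id "
            "WHERE page_namespace = 0 AND page_title = ?"
        )
        # MediaWiki stores titles with underscores instead of spaces.
        params = [title.replace(" ", "_")]
        if self.only_langs:
            marks = ", ".join("?" for _ in self.only_langs)
            sql += f" AND ll_lang IN ({marks})"
            params.extend(self.only_langs)
        cur = self.conn.cursor()
        cur.execute(sql, params)
        return dict(cur.fetchall())
```

The calling code would only ever see get_interwikis_from_page(), so the bot could switch between page text, the API and the toolserver replica without touching the graph-building logic.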
Coincidentally, yesterday I released a MediaWiki extension which, if accepted on Wikimedia projects, may make interwiki bots much less busy. See
http://meta.wikimedia.org/wiki/A_newer_look_at_the_interwiki_link

_______________________________________________
Toolserver-l mailing list
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
