A third approach: list the pages as links on 10 or 20 wiki pages, with 10,000 or 5,000 links each, and monitor the related changes (Special:RecentChangesLinked) for those list pages.
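
A minimal sketch of the chunking step, assuming the titles are already available as a Python list. The function name `build_list_pages`, the `prefix` parameter and the "User:Example/NYC watch" page name are made up for illustration; nothing here talks to the wiki.

```python
from typing import Dict, List


def build_list_pages(titles: List[str], per_page: int, prefix: str) -> Dict[str, str]:
    """Split titles into chunks and return {list page title: wikitext body}.

    `prefix` is a hypothetical base page name; the chunks become
    numbered subpages: prefix/1, prefix/2, ...
    """
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    pages: Dict[str, str] = {}
    for start in range(0, len(titles), per_page):
        chunk = titles[start:start + per_page]
        number = start // per_page + 1
        # The leading colon keeps category and file titles as plain links,
        # so the list page is not itself categorised and embeds no images.
        body = "\n".join(f"* [[:{title}]]" for title in chunk)
        pages[f"{prefix}/{number}"] = body
    return pages


if __name__ == "__main__":
    # Placeholder titles standing in for the pre-computed set of pages.
    titles = [f"Example page {i}" for i in range(1, 100_001)]
    pages = build_list_pages(titles, 10_000, "User:Example/NYC watch")
    print(len(pages), "list pages")
```

Each body could then be saved with Pywikibot by creating `pywikibot.Page(site, title)`, assigning the wikitext to `page.text` and calling `page.save(summary=...)`; after that, the related changes of each list page show the edits to the pages it links to.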

Roy Smith <r...@panix.com> wrote (on Wed, Jan 15, 2025, 1:26):

> I've got a list of O(100,000) pages. I want to get a feed of all the
> edits to those pages in real time. What's the most efficient way? The two
> strategies I can see are Site.recentchanges() and filter them on the client
> side, or put all the pages on my watchlist and call Site.watchlist_revs()
> periodically. Is there something more clever than those?
> Back-of-the-envelope, using numbers from
> https://stats.wikimedia.org/#/en.wikipedia.org, 6 million edits per month
> is 200k per day or 2-3 per second. So the recent_changes method doesn't
> seem too bad, but who knows...
>
> The goal here is to build a real-time activity ticker to display at
> Wikipedia:Meetup/NYC/Wikipedia Day 2025
> <https://en.wikipedia.org/wiki/Wikipedia:Meetup/NYC/Wikipedia_Day_2025>.
> Maybe with some snazzy graphical front end. The set of pages is everything
> that's (recursively) a member of Category:New York City
> <https://en.wikipedia.org/wiki/Category:New_York_City> although I expect
> I'll probably apply some sort of filter like "has coordinates within the
> geographic limits of NYC". I'll prep the set off-line, so at the point
> where I'm running this, I'll have a pre-computed fixed set of pages.
> _______________________________________________
> pywikibot mailing list -- pywikibot@lists.wikimedia.org
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/pywikibot@lists.wikimedia.org/message/J6L6I5XOD3FBYUQS3LFHTVNYJANKRSFQ/
> To unsubscribe send an email to pywikibot-le...@lists.wikimedia.org

--
Bináris