| jayvdb added a comment. |
I agree with @valhallasw that this should be page based resumption.
The simplest solution (MVP) would, on pause, keep running the existing generator but write the remaining page titles to a file instead of processing the pages.
Resumption would then feed that stored title list back in using -file:xxx.
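The MVP approach above could look something like this sketch; the helper name and the `[[Title]]`-per-line format are assumptions (one wikilink per line is what -file: accepts):

```python
# Hypothetical sketch: on pause, drain the remaining generator output
# to a title file that a later -file:xxx run can consume.
def dump_remaining_titles(generator, path):
    """Write the titles of all pages the generator has not yet yielded."""
    with open(path, 'w', encoding='utf-8') as f:
        for page in generator:
            # -file: accepts one [[Title]] per line.
            f.write('[[%s]]\n' % page.title())
```

The key point is that the generator keeps running to exhaustion, so any server-side continuation state is consumed now rather than relied upon later.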
A better implementation would, on pause, store the next page title to be processed, and all of the generator arguments.
Then the resumption would create a new generator which resumes where the last generator stopped.
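A minimal sketch of the second approach, persisting the resume state as JSON; the file layout and key names here are assumptions, not an existing pywikibot format:

```python
import json

# Hypothetical sketch: persist the next unprocessed title together with
# the original generator arguments, so a new run can rebuild a generator
# that starts where the previous one stopped (for generator factories
# that support a 'start from' title argument).
def save_resume_state(path, next_title, gen_args):
    """Store the resume point and the original generator arguments."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump({'next_title': next_title, 'gen_args': gen_args}, f)

def load_resume_state(path):
    """Load the stored resume state for rebuilding the generator."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)
```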
We need both approaches: the former is better if the input generator was -file:xxx, and the second approach is not possible for some API generators that do not support a 'start from' title argument.
And there are some generators that are not pause/resumable at all, such as -random, in which case the user should simply re-run their original command to continue.
Resumption at the HTTP/network level depends on the MediaWiki API maintaining/respecting old continuation data. That is not unreasonable in many cases, as the API continuation data is often page titles, etc.
However, even if we can pause/resume the HTTP API process, the user may have hit pause at the first record of a 5000-record result set, so that implementation will still need to inject the cached remainder of that result set into the API layer before switching back to fetching over HTTP.
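That cached-data injection could be sketched roughly as follows; `fetch` stands in for whatever issues the continuation request and is an assumption, not an existing pywikibot API:

```python
# Hypothetical sketch: replay the unprocessed remainder of the last
# cached result set first, then continue fetching with the stored
# MediaWiki 'continue' parameters.
def resume_api_results(cached_records, continue_params, fetch):
    """Yield leftover cached records, then records fetched live.

    `fetch` is an assumed callable that takes continuation parameters
    and yields further records from the API.
    """
    for record in cached_records:
        yield record
    if continue_params is not None:
        for record in fetch(continue_params):
            yield record
```

From the caller's point of view this behaves like one uninterrupted generator, which is what the processing loop needs.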
Cc: valhallasw, Aklapper, jayvdb, Zppix, DrTrigon, pywikibot-bugs-list, Pahadiahimanshu, Manrajsinghgrover, Mdupont, Jay8g
_______________________________________________ pywikibot-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/pywikibot-bugs
