Hi Bináris,
I did not write any of the threaded stuff in wikipedia.py, but I
have used it a couple of times. I think what you should do is provide
a callable _object_ rather than a plain callback function. You can then
iterate over the list of callback objects afterwards and check each one
for errors. Here is a sample program I wrote to illustrate the concept:
import wikipedia as pywikibot
from time import sleep

pages = [
    'User:HRoestBot/CallbackTest1',
    'User:HRoestBot/CallbackTest2',
]

class CallbackObject(object):
    def __init__(self):
        self.done = False

    def __call__(self, page, error):
        # put_async calls this from the worker thread once the save
        # has finished (error is None on success).
        self.page = page
        self.error = error
        self.done = True

callbacks = []
for mypage in pages:
    print mypage
    callb = CallbackObject()
    page = pywikibot.Page(pywikibot.getSite(), mypage)
    callbacks.append(callb)
    page.put_async('some text', callback=callb)

# Wait until all pages are saved on Wikipedia
while not all(c.done for c in callbacks):
    print "Still Waiting"
    sleep(5)

# Now we can look at the errors
for obj in callbacks:
    print obj.page, obj.error
    if obj.error is not None:
        pass  # do something to handle the error
The output of such a program might then look like this:
$ python test.py
unicode test: triggers problem #3081100
HRoestBot/CallbackTest1
HRoestBot/CallbackTest2
Sleeping for 4.0 seconds, 2012-02-24 09:32:57
Still Waiting
Still Waiting
Updating page [[HRoestBot/CallbackTest1]] via API
Still Waiting
Sleeping for 19.3 seconds, 2012-02-24 09:33:18
Still Waiting
Updating page [[HRoestBot/CallbackTest2]] via API
Still Waiting
[[de:HRoestBot/CallbackTest1]] An edit conflict has occured.
[[de:HRoestBot/CallbackTest2]] An edit conflict has occured.
hr@hr:~/projects/private/pywikipedia_gitsvn$
At least that is how I do it; I hope that helps you understand. You can
also use pywikibot.page_put_queue.qsize() and
pywikibot.page_put_queue.empty() to check whether the queue is empty,
but this can still lead to problems, because a page is first
fetched from the queue and *then* page.put is called on it. So until
page.put() finishes, the queue will be empty even though the bot is
still putting the page. See the function async_put(); it seems to
me much safer to rely on the callback objects to be sure that all the
put calls are done.
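The window I mean can be reproduced with a small self-contained toy
(plain threading and a queue, not pywikibot itself; the names
page_put_queue and async_put_worker merely mirror the framework's, and
it is written in Python 3 syntax for clarity):

```python
import threading
import time
from queue import Queue  # Python 3 name; the 2012 framework used Queue.Queue

page_put_queue = Queue()
saves_done = []

def async_put_worker():
    # Mimics the worker loop: fetch a request from the queue,
    # *then* perform the slow save.
    while True:
        item = page_put_queue.get()
        if item is None:            # end-of-queue marker
            return
        time.sleep(0.5)             # simulate the slow page.put() call
        saves_done.append(item)

worker = threading.Thread(target=async_put_worker)
worker.start()

page_put_queue.put('CallbackTest1')
time.sleep(0.1)                     # give the worker time to dequeue the item

# The race: the queue is already empty, but the save is still in progress.
empty_during_put = page_put_queue.empty()
in_flight = list(saves_done)
print(empty_during_put, in_flight)  # True [] -- empty queue, nothing saved yet

page_put_queue.put(None)            # tell the worker to exit
worker.join()
print(saves_done)                   # ['CallbackTest1']
```

So a bot that exits as soon as page_put_queue.empty() returns True may
still lose the page that is being saved at that moment.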
You can also look at the _flush() method in wikipedia.py to see how it
determines whether all pages have been put and it is safe to exit.
Hannes
On 23 February 2012 21:30, Bináris <[email protected]> wrote:
> I made a big effort to understand this stuff with put_async and threading,
> but here is a point I can't get over.
>
> I read a lot and understood that the way to wait for a thread is join(), and
> in wikipedia.py, join must be a method of _putthread, which is the Thread
> object. Now, wherever I write the line
> _putthread.join()
> (I tried put_async, async_put and even replace.py, which I know is not a good
> solution), it freezes my command window as if the thread never terminated.
> _putthread.join(timeout) waits for the given time, but that is not
> appropriate except for testing.
> Does any script really use this callback at all? Which one?
> Line 8054 in wikipedia.py says "an explicit end-of-Queue marker is needed",
> and this is supposed to be a call for async_put with None as value of page.
> But I don't see this dummy call to async_put anywhere. Might that be a bug,
> or do I just misunderstand?
> (Btw, Python 2.4 should be forgotten.)
>
> Please, I need your help.
>
> --
> Bináris
>
> _______________________________________________
> Pywikipedia-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>