Hi Binaris

I did not write any of the threaded stuff in wikipedia.py, but I have
used it a couple of times. I think what you should do is provide a
callable _object_ rather than a plain callback function. You can then
iterate through the list of callback objects and check each one for
errors. Here is a sample program I wrote to illustrate the concept:

import wikipedia as pywikibot
from time import sleep

pages = [
    'User:HRoestBot/CallbackTest1',
    'User:HRoestBot/CallbackTest2',
]

class CallbackObject(object):
    def __init__(self):
        self.done = False

    def __call__(self, page, error):
        self.page = page
        self.error = error
        self.done = True

Callbacks = []
for mypage in pages:
    print(mypage)
    callb = CallbackObject()
    page = pywikibot.Page(pywikibot.getSite(), mypage)
    Callbacks.append(callb)
    page.put_async('some text', callback=callb)

# Waiting until all pages are saved on Wikipedia
while True:
    if all(c.done for c in Callbacks):
        break
    print("Still Waiting")
    sleep(5)

# Now we can look at the errors
for obj in Callbacks:
    print(obj.page, obj.error)
    if obj.error is not None:
        pass  # do something here to handle the error


The output of such a program might then look like this:

$ python test.py
unicode test: triggers problem #3081100
HRoestBot/CallbackTest1
HRoestBot/CallbackTest2
Sleeping for 4.0 seconds, 2012-02-24 09:32:57
Still Waiting
Still Waiting
Updating page [[HRoestBot/CallbackTest1]] via API
Still Waiting
Sleeping for 19.3 seconds, 2012-02-24 09:33:18
Still Waiting
Updating page [[HRoestBot/CallbackTest2]] via API
Still Waiting
[[de:HRoestBot/CallbackTest1]] An edit conflict has occured.
[[de:HRoestBot/CallbackTest2]] An edit conflict has occured.
hr@hr:~/projects/private/pywikipedia_gitsvn$


At least that is how I do it; I hope that helps you understand. You
can also use pywikibot.page_put_queue.qsize() and
pywikibot.page_put_queue.empty() to check whether the queue is empty
or not, but this might still lead to problems, because the page is
fetched from the queue and *then* page.put() is called on it. So until
page.put() finishes, the queue will be empty even though the bot is
still putting the page. See the function async_put(); it seems to me
much safer to rely on the callback objects to be sure that all the
put calls are done.
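To make that race concrete, here is a minimal, self-contained sketch
(using a plain queue.Queue and a hypothetical slow_put() in place of
pywikibot's page_put_queue and page.put(); it is an illustration, not
code from wikipedia.py) showing how the queue can report empty while
the worker thread is still busy saving the page it just dequeued:

```python
import threading
import time
from queue import Queue  # Queue.Queue on Python 2

put_queue = Queue()
unsaved = []  # pages queued but not yet saved

def slow_put(item):
    # Stand-in for page.put(): note the item is already *out* of
    # the queue by the time this runs.
    time.sleep(0.5)
    unsaved.remove(item)

def worker():
    while True:
        item = put_queue.get()   # item leaves the queue immediately...
        if item is None:         # explicit end-of-queue marker
            break
        slow_put(item)           # ...but the real work happens afterwards

unsaved.append('Page1')
put_queue.put('Page1')
t = threading.Thread(target=worker)
t.start()

time.sleep(0.1)                  # give the worker time to dequeue
print(put_queue.empty())         # True: the queue already looks done...
print(unsaved)                   # ...but 'Page1' has not been saved yet

put_queue.put(None)              # tell the worker to stop
t.join()
```

So checking page_put_queue.empty() alone can declare victory half a
second (or one whole page save) too early.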

You can also look at _flush() in wikipedia.py to see how it
determines whether all pages have been put and it is safe to exit.
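As a variation on the polling loop in the sample above, each callback
object could carry a threading.Event so the main thread blocks until
every put has finished instead of sleeping in a loop. This is just a
sketch of the idea, not tested against wikipedia.py itself:

```python
import threading

class CallbackObject(object):
    def __init__(self):
        self.page = None
        self.error = None
        self.event = threading.Event()

    def __call__(self, page, error):
        self.page = page
        self.error = error
        self.event.set()  # wake up anyone waiting on this callback

# After queueing all the put_async() calls:
# for obj in Callbacks:
#     obj.event.wait()  # blocks until this page's put has finished
```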

Hannes

On 23 February 2012 21:30, Bináris <[email protected]> wrote:
> I made a big effort to understand this stuff with put_async and threading,
> but here is a point I can't get over.
>
> I read a lot and understood that the way to wait for a thread is join(),
> and in wikipedia.py, join must be a method of _putthread, which is the
> Thread object. Now, wherever I write the line
> _putthread.join()
> (I tried put_async, async_put and even replace.py, which I know is not a
> good solution) it freezes my command window as if the thread never
> terminated. _putthread.join(time) waits for the given time, but this is
> not appropriate, just for testing.
> Does any script really use this callback at all? Which one?
> Line 8054 in wikipedia.py says "an explicit end-of-Queue marker is
> needed", and this is supposed to be a call to async_put with None as the
> value of page. But I don't see this dummy call to async_put anywhere.
> Might that be a bug, or do I just misunderstand?
> (Btw, Python 2.4 should be forgotten.)
>
> Please, I need your help.
>
> --
> Bináris
>
> _______________________________________________
> Pywikipedia-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>
