So first, thanks a lot for your time and mails!
> it does re-loading attempts AND it applies correct unicode encoding
> to the html page contents. Both is not done by urlopen as far as I
> know...(?)
>
> Right. Then abstract that stuff out of the Site urlopen into a
> separate module, and use that.
>
> The issue is not the bot logging out, the issue is you're using a
> function for something it was never supposed to do. When you do
> that, you shouldn't be surprised things break.

This cannot be the way to go. You are right that I am "abusing" the function: the documentation says "Low-level routine to get a URL from the wiki." and I am using it for arbitrary (non-wiki) URLs. But the documentation also says nothing about any login attempt against the current (or any other) wiki at all...

The pywikipedia team did a really good job writing this function, and it seems strange to me to copy the whole function as-is, dropping just one or two lines of code to achieve what I need (I would then also have to maintain that copy in parallel). As far as I can see at the moment, the problem is the call to 'self._getUserDataOld' at the end. I am not an expert in this, but I tried to investigate it as thoroughly as possible. That is also the reason why I asked the following (maybe stupid) questions:

> So I do not understand how the initial login (by cookies) is done,
> and at what place in the code? Then I do not understand why the
> later (re)login is done in a different way? And last I do not
> understand why 'LoginManager' asks for a password but does not
> need it, if there are cookies present? (this requested user input
> seems to break my bot then...)

I was able to answer the first question: 'site._load' is responsible for the very first login AND is also able to re-login for me. 'getUrl' is NOT able to re-login, EVEN when accessing a page from dewiki... AND THIS SHOULD WORK as far as I can see (so we have a bug here). The other two questions I was not able to answer myself...
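For context, the two behaviours I rely on 'getUrl' for (re-loading attempts and charset-correct decoding of the page contents) can be sketched independently of pywikipedia. This is a hypothetical helper, not the actual pywikipedia code; the 'fetch' callable stands in for whatever does the raw HTTP request:

```python
import time

def fetch_with_retries(fetch, url, retries=3, delay=0.0):
    """Call fetch(url) -> (bytes, charset), retrying on IOError,
    and decode the body with the reported charset (falling back
    to utf-8).  Hypothetical sketch, not pywikipedia code."""
    last_err = None
    for _ in range(retries):
        try:
            body, charset = fetch(url)
            return body.decode(charset or 'utf-8')
        except IOError as err:
            last_err = err
            time.sleep(delay)  # back off before the next re-loading attempt
    raise last_err
```

In real use, 'fetch' could wrap urllib and read the charset out of the Content-Type response header; the point is just that neither retries nor decoding require any wiki login machinery.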
At the moment it looks to me like adding a keyword argument called 'noLogin' to 'getUrl' (similar to the one in 'getSite'), preventing 'getUrl' from calling '_getUserDataOld' at the end, should solve my problem. And this would not contradict 'getUrl' as it is documented.

Greetings and have a nice day
DrTrigon
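P.S. A minimal, self-contained sketch of the proposed 'noLogin' guard. All names besides 'getUrl', 'noLogin' and '_getUserDataOld' are invented for illustration, and the class is a toy stand-in, not the real pywikipedia Site class:

```python
class Site(object):
    """Toy stand-in for the pywikipedia Site class (hypothetical)."""

    def __init__(self):
        self.login_checks = 0  # counts calls to the login-state check

    def _getUserDataOld(self, text):
        # In pywikipedia this inspects the fetched page for login state
        # and may trigger a (re)login; here it only counts invocations.
        self.login_checks += 1

    def _fetch(self, path):
        # Stand-in for the real HTTP machinery (retries, decoding, ...).
        return '<html>%s</html>' % path

    def getUrl(self, path, noLogin=False):
        text = self._fetch(path)
        if not noLogin:  # the proposed guard: skip the login check
            self._getUserDataOld(text)  # for arbitrary non-wiki URLs
        return text
```

With noLogin=True the fetch behaves exactly as before, minus the final call to '_getUserDataOld', which is all my use case needs.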
