
First of all, thanks a lot for your time and your mails!

> it does re-loading attempts AND it applies correct unicode encoding
> to the html page contents. Both is not done by urlopen as far as I
> know...(?)
> 
> 
> Right. Then abstract that stuff out of the Site urlopen into a
> seperate module, and use that.
> 
> The issue is not the bot logging out, the issue is you're using a 
> function for something it was never supposed to do. When you do
> that, you shouldn't be surprised things break.

This cannot be the way to go. You are right that I am "abusing" the
function, since its documentation says

"Low-level routine to get a URL from the wiki."

and I am using it for arbitrary (non-wiki) URLs. But nothing there
mentions any login attempt against the current (or any other) wiki
either... The pywikipedia team did a really good job writing this
function, and it seems strange to copy the whole thing as it is just to
drop 1 or 2 lines of code to achieve what I need (I would then also
have to maintain that copy in parallel). As far as I can see at the
moment, the problem is the call to 'self._getUserDataOld' at the end.
I am not an expert in this, but I tried to investigate it as well as I
could. That is also the reason why I asked the following (maybe naive)
questions:

> So I do not understand how the initial login (by cookies) is done
> and at what place in the code? Then I do not understand why the
> later (re)login is done in a different way? And last, I do not
> understand why 'LoginManager' asks for a password but does not
> need it if there are cookies present? (this requested user input
> seems to break my bot then...)

I was able to answer the first question: 'site._load' is responsible
for the very first login AND is also able to re-login for me. 'getUrl'
is NOT able to re-login EVEN when accessing a page from dewiki... AND
THIS SHOULD WORK as far as I can see (so we have a bug here). The other
two questions I was not able to answer myself...

At the moment it looks to me like adding a keyword argument to 'getUrl'
called 'noLogin' (similar to 'getSite') that prevents 'getUrl' from
calling '_getUserDataOld' at the end should solve my problem. And this
would not contradict 'getUrl' as it is documented.
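To make the proposal concrete, here is a rough sketch of what I have
in mind. This is NOT the actual pywikipedia source: '_fetch' and the
body of '_getUserDataOld' are stand-ins for the real HTTP and login
code, and only the 'noLogin' guard around the existing
'_getUserDataOld' call is the actual suggestion:

```python
class Site:
    """Minimal stand-in for the pywikipedia Site class (sketch only)."""

    def __init__(self):
        self.login_attempts = 0

    def _fetch(self, path):
        # Stand-in for the real retrying/decoding HTTP fetch in getUrl.
        return "<html>contents of %s</html>" % path

    def _getUserDataOld(self, data):
        # Stand-in for the routine that inspects the returned page for
        # login state and may trigger a (re-)login, which is what
        # breaks fetches of arbitrary non-wiki URLs.
        self.login_attempts += 1

    def getUrl(self, path, noLogin=False):
        """Low-level routine to get a URL from the wiki."""
        data = self._fetch(path)
        if not noLogin:
            # Existing behaviour for wiki pages is unchanged; callers
            # fetching arbitrary external URLs pass noLogin=True to
            # skip the login check entirely.
            self._getUserDataOld(data)
        return data
```

So wiki-page callers see no difference, while a bot fetching an
external URL would call site.getUrl(url, noLogin=True).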

Greetings and have a nice day
DrTrigon

_______________________________________________
Pywikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
