On 14 February 2012 00:46, Bináris <[email protected]> wrote:

> I went back to this conversation with Russell, and tried to use it in
> another way. I have console encoding problems with this command with Cyrillic
> letters:
> replace.py -catr:Венгрия . @ -lang:ru -excepttext:"[[hu:"
> -save:magyarok.txt -always
> One way is to urlencode the Russian category. Another way is to insert it
> into a script. (DOS batch files won't work, I already tried.)
> So what I did:
> import replace
> replace.main(u'-catr:Венгрия', '.', '@', '-lang:ru',
> '-excepttext:"[[hu:"', '-save:magyarok.txt')
> This results in an error message:
>   File "C:\Pywikipedia\replace.py", line 582, in main
>     for arg in pywikibot.handleArgs(*args):
>   File "C:\Pywikipedia\wikipedia.py", line 7795, in handleArgs
>     arg = _decodeArg(arg)
>   File "C:\Pywikipedia\wikipedia.py", line 7767, in _decodeArg
>     return unicode(arg, config.console_encoding)
> TypeError: decoding Unicode is not supported
> If I omit u from before -catr, no error is thrown, but the name is
> erroneously decoded.
> Now comes the trick! I went to line 7795 of current wikipedia.py (r9894) as
> shown above, and commented it out. Now my script runs perfectly! I love it!


What happens is the following. At line 7767, arg is u'-catr:Венгрия', which
is already a unicode object. The line then tries to *decode* a unicode
string, which makes no sense: only a byte string (str) can be decoded.

The sensible solution would be to add a check, for instance something like

return arg if isinstance(arg, unicode) else unicode(arg,
config.console_encoding)

(which will not work on Python 2.4, though, since conditional expressions
were only added in Python 2.5, so a normal if/else might be preferable).
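
For illustration, the plain if/else variant could look roughly like the
sketch below. This is not the actual wikipedia.py code: decode_arg is a
hypothetical stand-in for _decodeArg, the console_encoding parameter stands
in for config.console_encoding, and the text_type alias is only there so the
sketch also runs on Python 3, where the unicode builtin does not exist.

```python
# -*- coding: utf-8 -*-
# Sketch only: decode_arg is a hypothetical stand-in for _decodeArg.

try:
    text_type = unicode  # Python 2: the unicode builtin exists
except NameError:
    text_type = str      # Python 3: str is already a text type


def decode_arg(arg, console_encoding='utf-8'):
    """Return arg as text, decoding it only if it is a byte string."""
    if isinstance(arg, text_type):
        # Already text: decoding again would raise a TypeError.
        return arg
    else:
        # Byte string: decode it using the console encoding.
        return arg.decode(console_encoding)


if __name__ == '__main__':
    as_text = u'-catr:Венгрия'
    as_bytes = as_text.encode('utf-8')
    # Both forms should end up as the same text value.
    print(decode_arg(as_text) == decode_arg(as_bytes, 'utf-8'))
```

With a check like this, a script can pass either u'...' literals or plain
byte strings to main() without hitting the TypeError.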

Merlijn
_______________________________________________
Pywikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
