Hi!

First, I am not an expert here; second, my thoughts:

- sounds good! :)
- the two cases (new, append) are not really needed
   if you always append and delete the file yourself
   in the file browser whenever you want a fresh list
   (but this is a philosophical issue)
- for your save/append code, have a look at [1] and maybe
   also [2]. The code in [1] is quite similar to your proposal,
   is already in use, and works. As you can see in [2], I also
   need a '.decode('latin-1')'; this may differ for you, since
   unicode is quite mysterious... ;))
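
To make the "append only" idea concrete, here is a minimal sketch. It is not
the code from [1] or [2]: save_titles is a hypothetical helper, the
'# [[Title]]' line format is my assumption about what you want in the file,
and the latin-1 fallback is only what I needed on my side. It uses io.open
instead of codecs.open so that newline='\r\n' can write Windows line endings
directly, which should make the file readable in notepad.exe without a
separate conversion step.

```python
# -*- coding: utf-8 -*-
import io


def save_titles(titles, filename, append=True):
    """Write one '# [[Title]]' line per title to a UTF-8 text file.

    Hypothetical helper, not part of pywikipedia. `titles` is assumed to be
    an iterable of unicode strings (byte strings are decoded as latin-1,
    which is an assumption that may not hold for your data).
    """
    mode = 'a' if append else 'w'
    # newline='\r\n' translates every '\n' written into a Windows line ending.
    with io.open(filename, mode, encoding='utf-8', newline='\r\n') as f:
        for title in titles:
            if isinstance(title, bytes):
                title = title.decode('latin-1')
            f.write(u'# [[%s]]\n' % title)
            # Flush after each title so a crash loses as little as possible.
            f.flush()
```

With append=True (the default) repeated runs keep extending the same file;
append=False would correspond to your "overwrite" case.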

Hope this helps a bit!
Greetings

[1] https://fisheye.toolserver.org/browse/drtrigon/pywikipedia/dtbext/dtbext_basic.py?r=HEAD#l286
[2] https://fisheye.toolserver.org/browse/drtrigon/pywikipedia/sum_disc.py?r=HEAD#l533


On 22.10.2010 11:16, Bináris wrote:
> Hi!
>
> My old problem is that replace.py can't write the list of pages to work on
> into a file on my disk. For years I have used a modified version that makes
> no changes but, in automated mode, writes the titles of the affected pages
> to a subpage on Wikipedia; I can then make the replacements from that page
> much more quickly than directly from a dump or the live Wikipedia.
> This is slow and generates plenty of dummy edits.
>
> In other words, replace.py has a tool to get the titles from a file
> (-file) or from a wikipage (-links), but has no tool to generate this file.
>
> Now I am ready to rewrite it. This way we can start it and the bot will
> find all the possible articles to work on and save the titles without
> editing Wikipedia (and without the artificial delay); meanwhile we can have
> lunch, run a marathon, or sleep. Then we make the replacements from
> this file with -file.
>
> My idea is that replace.py should have two new parameters:
> -save writes the results into a new file instead of editing articles. It
> overwrites an existing file without warning.
> -saveappend writes into a new file or appends to the existing one.
> OR:
> -save writes and appends (primary mode)
> -savenew writes and overwrites
>
> The help is here:
> http://docs.python.org/howto/unicode.html#reading-and-writing-unicode-data
> So we have to import codecs.
> My script is:
> articles = codecs.open('cikkek.txt', 'a', encoding='utf-8')
> ...
> # needs rewrite to the new syntax
> tutuzuzu = u'# %s\n' % page.aslink()
> # needs further testing whether unicode() is really needed
> articles.write(unicode(tutuzuzu))
> articles.flush()
>
> It works fine, except that '\n' is a Unix-style newline that has to be
> converted by lfcr.py to make the file readable with notepad.exe.
> The filename is currently a constant; this should be developed so that it
> is taken from the command line.
>
> Your opinions before I begin?
> --
> Bináris
>
>
>
> _______________________________________________
> Pywikipedia-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l

