Searching a text dump sounds more reasonable to me. If you insist on changing replace.py, make sure you remove all occurrences of both put and put_async.
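For what it's worth, the dump route can be sketched in a few lines of Python. This is only a sketch under assumptions: it expects an uncompressed MediaWiki XML export, the function name find_pages_containing and the file name in the note below are made up for illustration, and the match is a plain case-sensitive substring test. It uses only the standard library, not pywikipedia, so nothing is ever written to the wiki.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the XML namespace prefix: '{ns}title' -> 'title'."""
    return tag.rsplit("}", 1)[-1]


def find_pages_containing(dump_source, needle):
    """Yield the title of every page whose wikitext contains `needle`.

    `dump_source` is a file name or a binary file object holding a
    MediaWiki XML export (assumed format: <page><title/><revision><text/>).
    """
    title = None
    matched = False
    # iterparse streams the file, so a large dump is not loaded at once.
    for _event, elem in ET.iterparse(dump_source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            # Empty pages have no text node, hence the `or ""`.
            if needle in (elem.text or ""):
                matched = True
        elif name == "page":
            if matched:
                yield title
            title, matched = None, False
            elem.clear()  # release the finished page's subtree
```

Looping over find_pages_containing('dump.xml', 'redirect[[:Category') and printing each title would give the list of matches; dump.xml is a placeholder for wherever the export was saved.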
Best regards,
Merlijn 'valhallasw' van Deen

On 12 April 2010 09:54, Chris Watkins <[email protected]> wrote:
> So I haven't found a way to make a list of matches without replacing. I
> suspect there's a very simple way, or it would take very simple changes to
> replace.py.
>
> I tried editing replace.py myself, to make it do everything except replace
> the files. Then I could hack the log files to get the list I want. But I had
> no success - I'm not coder, so it was guesswork.
>
> I copied replace.py to a new file intended to do everything except put
> files, and called it *replacenoput.py* (i.e. "replace," but no "put")
>
> My first attempt was to remove this section (commented it out first, but
> then removed to be sure):
>
>     if self.acceptall and new_text != original_text:
>         try:
>             page.put(new_text, self.editSummary)
>         except wikipedia.EditConflict:
>             wikipedia.output(u'Skipping %s because of edit conflict'
>                              % (page.title(),))
>         except wikipedia.SpamfilterError, e:
>             wikipedia.output(
>                 u'Cannot change %s because of blacklist entry %s'
>                 % (page.title(), e.url))
>         except wikipedia.PageNotSaved, error:
>             wikipedia.output(u'Error putting page: %s'
>                              % (error.args,))
>         except wikipedia.LockedPage:
>             wikipedia.output(u'Skipping %s (locked page)'
>                              % (page.title(),))
>
> Fail - it made the changes all the same.
>
> Then I figured out that wikipedia.py was being used to put the files. So I
> copied that to a new file *wikipedianoput.py* and changed every wikipedia
> reference in *replacenoput.py* to wikipedianoput.
>
> Then I scanned through wikipedianoput.py looking for what I need to
> block... but I couldn't tell.
>
> Can anyone help? Or even better, is there a more elegant way?
>
> Thanks
> Chris
>
> On Fri, Apr 2, 2010 at 00:12, Daniel Mietchen <[email protected]> wrote:
>
>> Hi Chris,
>>
>> On Thu, Apr 1, 2010 at 2:26 PM, Chris Watkins <[email protected]> wrote:
>> > Thanks Daniel... I'm confused though.
>> >
>> > On Thu, Apr 1, 2010 at 20:25, Daniel Mietchen <[email protected]> wrote:
>> >>
>> >> Perhaps
>> >> http://meta.wikimedia.org/wiki/Pywikipediabot/copyright.py
>> >> will do the trick,
>> >
>> > I can't see how to use it for matching a specific string.
>> Nor do I - sorry. What I had in mind was to apply it to a page that
>> contains your search string, and to restrict the search for "copyright
>> violations" to your site.
>> But this may indeed be a dead end.
>>
>> >> or simply
>> >> http://meta.wikimedia.org/wiki/Pywikipediabot/replace.py
>> >> in -debug mode?
>> >
>> > Where can I find information on -debug mode? I see there is -verbose mode
>> > which "may be helpful when debugging", but I don't see how that helps.
>> I thought that most PWB scripts had it, but apparently replace.py does not.
>>
>> but if the
>> def __init__(self, reader, force, append, summary, minor, autosummary, debug):
>> line contains "debug" (as in the example above, taken from
>> http://svn.wikimedia.org/viewvc/pywikipedia/trunk/pywikipedia/pagefromfile.py?view=markup
>> ),
>> then -debug is an option with which the script can be run such that it
>> performs all its actions except editing the pages.
>>
>> I am not very experienced with Python or PWB either, but since nobody
>> had replied so far, I wrote out my ideas as they came to mind.
>> Sorry for the confusion,
>>
>> Daniel
>>
>> > I may be missing something obvious &-)
>> Me too.
>>
>> > Chris
>> >
>> >>
>> >> Daniel
>> >>
>> >> On Thu, Apr 1, 2010 at 6:05 AM, Chris Watkins <[email protected]> wrote:
>> >> > I want to generate a list of matches for a search, but not do anything
>> >> > to the page.
>> >> >
>> >> > E.g. I want to list all pages that contain "redirect[[:Category", but I
>> >> > don't want to modify the pages.
>> >> >
>> >> > I guess that it's possible to modify redirect.py (I don't speak python,
>> >> > but it shouldn't be hard) and run it with -log. But maybe there's a
>> >> > simpler way?
>> >> >
>> >> > Thanks in advance.
>> >> >
>> >> > --
>> >> > Chris Watkins
>> >> >
>> >> > Appropedia.org - Sharing knowledge to build rich, sustainable lives.
>> >> >
>> >> > blogs.appropedia.org
>> >> > community.livejournal.com/appropedia
>> >> > identi.ca/appropedia
>> >> > twitter.com/appropedia
>> >> >
>> >> > _______________________________________________
>> >> > Pywikipedia-l mailing list
>> >> > [email protected]
>> >> > https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>> >>
>> >> --
>> >> http://www.google.com/profiles/daniel.mietchen
_______________________________________________
Pywikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
