I just worked it out, mostly. Instead of:

  -exceptinsidetag:header

I used:

  -exceptinside:'=[^\n\r]*=[ \t]*'

And it worked! There might be a small risk of false positives, so I tried various tweaks, e.g.:

  -exceptinside:'^=[^\n\r]*=[ \t]*$'
  -exceptinside:'[\n\r]=[^\n\r]*=[ \t]*[\n\r]'
  -exceptinside:'[\n\r]=[^\n\r]*='

But none of them worked. Any suggestions?

On Thu, Dec 22, 2011 at 18:21, Chris Watkins <[email protected]> wrote:

> I have been using "-exceptinsidetag:header" with replace.py. This was
> added by Daniel Herding in response to a request by me:
>
> On Mon, Jun 30, 2008 at 23:11, Daniel Herding <[email protected]> wrote:
>
>> This will exclude wikilinks and URLs. There are some more things that can
>> be excluded, see the source code of the method replaceExcept() in
>> wikipedia.py (look at the exceptionRegexes dictionary). I have just added
>> a regular expression for section headers for you, so if you're running
>> the SVN version, you can use this parameter:
>>
>> -exceptinsidetag:header
>
> I seem to recall this working in a nightly version a couple of years ago,
> but it's not working now - I'm not sure when it stopped. Is it possible to
> put it back in?
>
> Thanks!
>
> --
> Chris Watkins
>
> Appropedia.org - Sharing knowledge to build rich, sustainable lives.

--
Chris Watkins

Appropedia.org - Sharing knowledge to build rich, sustainable lives.

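A possible explanation for the '^...$' variant, sketched with Python's standard re module rather than with replace.py itself. The assumption here (not checked against the replace.py source) is that the -exceptinside pattern is compiled without re.MULTILINE, in which case '^' and '$' only match at the very start and end of the page text. The sample text and the helper name find_all are made up for illustration.

```python
import re

# Hypothetical page text: one line with stray '=' signs, one real header.
SAMPLE = "Some = inline = text\n== History ==\nMore text\n"


def find_all(pattern, text=SAMPLE, flags=0):
    """Return every substring of text matched by pattern."""
    return [m.group() for m in re.finditer(pattern, text, flags)]


# The pattern that worked: it also catches the stray '=' pair (false positive).
loose = find_all(r'=[^\n\r]*=[ \t]*')

# Anchored pattern without MULTILINE: '^' only matches at the start of the
# whole string, so the header on the second line is never found.
anchored = find_all(r'^=[^\n\r]*=[ \t]*$')

# Same pattern with MULTILINE: '^' and '$' match at every line boundary.
anchored_m = find_all(r'^=[^\n\r]*=[ \t]*$', flags=re.MULTILINE)

# The inline flag (?m) does the same and can be written inside the pattern.
anchored_inline = find_all(r'(?m)^=[^\n\r]*=[ \t]*$')

print(loose)            # ['= inline = ', '== History ==']
print(anchored)         # []
print(anchored_m)       # ['== History ==']
print(anchored_inline)  # ['== History ==']
```

If that assumption holds, passing the pattern with the inline flag, i.e. -exceptinside:'(?m)^=[^\n\r]*=[ \t]*$', might be worth a try. This does not explain why the two '[\n\r]' variants failed.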
_______________________________________________
Pywikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
