I just worked it out, mostly. Instead of:
-exceptinsidetag:header

I used:
-exceptinside:'=[^\n\r]*=[ \t]*'

And it worked!
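To see what that pattern actually covers, here is a minimal sketch using Python's re module directly. The sample wikitext and the helper name protected_spans are made up for illustration. It only shows what the regex matches on its own, not how replace.py applies it.

```python
import re

# Pattern from the message: '=', anything up to the last '=' on the
# same line, then optional trailing spaces or tabs.
HEADER_LIKE = re.compile(r'=[^\n\r]*=[ \t]*')

# Hypothetical sample wikitext, made up for illustration.
SAMPLE = "Intro text.\n== Solar cooking ==\nBody with x = 1 and y = 2 here.\n"


def protected_spans(text):
    """Return every substring the pattern would treat as 'inside'."""
    return [m.group(0) for m in HEADER_LIKE.finditer(text)]


if __name__ == "__main__":
    for span in protected_spans(SAMPLE):
        print(repr(span))
```

The header line is matched as intended, but so is the stretch between the two '=' signs in the body line, since nothing ties the pattern to the start or end of a line.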

This pattern could give false positives (any line containing two '=' signs
would match), so I tried anchoring it to line boundaries, e.g.:
-exceptinside:'^=[^\n\r]*=[ \t]*$'
-exceptinside:'[\n\r]=[^\n\r]*=[ \t]*[\n\r]'
-exceptinside:'[\n\r]=[^\n\r]*='

But none of them worked. Any suggestions?
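One possible explanation for the first tweak, sketched with plain Python rather than replace.py itself: in Python's re module, '^' and '$' only match at the very start and end of the whole text unless re.MULTILINE is set. This assumes replace.py compiles the pattern without that flag by default, which I have not checked in the source. The sample text and the helper name find_headers are made up. It does not explain why the '[\n\r]' variants failed.

```python
import re

# First tweak from the message, anchored with '^' and '$'.
ANCHORED = r'^=[^\n\r]*=[ \t]*$'

# Hypothetical sample wikitext, made up for illustration.
SAMPLE = "Intro text.\n== Solar cooking ==\nBody with x = 1 and y = 2 here.\n"


def find_headers(text, flags=0):
    """Return the substrings matched by the anchored pattern."""
    return [m.group(0) for m in re.finditer(ANCHORED, text, flags)]


if __name__ == "__main__":
    # Default flags: '^' matches only at the start of the whole page text.
    print(find_headers(SAMPLE))
    # re.MULTILINE: '^' and '$' match at the start and end of every line.
    print(find_headers(SAMPLE, re.MULTILINE))
```

If your copy of replace.py has a -multiline option in its help text, it may be worth trying it together with the anchored pattern. Whether that flag is also applied to -exceptinside patterns is an assumption to check against the source.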

On Thu, Dec 22, 2011 at 18:21, Chris Watkins
<[email protected]> wrote:

> I have been using " -exceptinsidetag:header" with replace.py. This was
> added by Daniel Herding in response to a request by me:
>
> On Mon, Jun 30, 2008 at 23:11, Daniel Herding <[email protected]> wrote:
>
>>
>>
>> This will exclude wikilinks and URLs. There are some more things that
>> can be excluded, see the source code of the method replaceExcept() in
>> wikipedia.py (look at the exceptionRegexes dictionary). I have just
>> added a regular expression for section headers for you, so if you're
>> running the SVN version, you can use this parameter:
>>
>> -exceptinsidetag:header
>>
>
>
> I seem to recall this working in a nightly version a couple of years ago,
> but it's not working now - I'm not sure when it stopped. Is it possible to
> put it back in?
>
> Thanks!
>
>
> --
> Chris Watkins
>
> Appropedia.org - Sharing knowledge to build rich, sustainable lives.
>
>


-- 
Chris Watkins

Appropedia.org - Sharing knowledge to build rich, sustainable lives.
_______________________________________________
Pywikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
