Thought I'd point out a couple of useful things I've come across when doing
regex work (in Python, but also in other languages):

1: The re.VERBOSE flag.  It lets you spread a regular expression over a
multiline string and add comments to it.  Whitespace in the pattern is
ignored, so literal whitespace has to be escaped or written as \s.  This
makes it a lot easier to understand what you were thinking when you come
back to your code two months later to change it.
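
To illustrate, here is a minimal sketch of a commented pattern compiled with
re.VERBOSE.  The DATE pattern and the sample string are made up for this
example and are not taken from any bot code:

```python
import re

# Hypothetical pattern: a date written like "28 June 2011".
DATE = re.compile(r"""
    (?P<day>\d{1,2})      # day of month, one or two digits
    \s+                   # whitespace has to be spelled out as \s (or escaped)
    (?P<month>[A-Za-z]+)  # month name
    \s+
    (?P<year>\d{4})       # four-digit year
""", re.VERBOSE)

match = DATE.search("Sent on 28 June 2011 at 7:23 AM")
print(match.group("day"), match.group("month"), match.group("year"))
```

The indentation and line breaks inside the pattern are ignored by the regex
engine, so they can be used freely for layout.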

2: Using functions instead of strings as the replacement in sub().  The
function is called with each match object and returns the replacement
string.  If your replacement needs a fair amount of conditional logic, it
is usually easier to write that logic in a function than to attempt it all
with a regex.
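
A rough sketch of the idea, assuming a made-up task (expanding two-digit
years) and an arbitrary cutoff rule; expand_year and TWO_DIGIT_YEAR are
names invented for this example:

```python
import re

# Hypothetical pattern: dates like 6/28/11 with a two-digit year.
TWO_DIGIT_YEAR = re.compile(r"(?P<prefix>\d{1,2}/\d{1,2})/(?P<year>\d{2})\b")

def expand_year(match):
    """Replacement callback: gets the match object, returns the new text."""
    year = int(match.group("year"))
    # Assumed rule for illustration only: 00-29 means 20xx, 30-99 means 19xx.
    century = 2000 if year < 30 else 1900
    return "{}/{}".format(match.group("prefix"), century + year)

text = "Created 6/28/11, previous revision 12/31/99."
result = TWO_DIGIT_YEAR.sub(expand_year, text)
print(result)  # Created 6/28/2011, previous revision 12/31/1999.
```

The conditional on the year would be awkward to express in the pattern or in
a replacement string, but it is a single line in the function.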

My $.02.


Cheers,
Morten

On Tue, Jun 28, 2011 at 7:23 AM, Bináris <[email protected]> wrote:

> OK, then I'll make separate lines. The only issue is that any
> enhancement/correction will be more complicated this way (that is another
> reason to put as many features in one line as possible).
>
>
> 2011/6/28 Marcin Cieslak <[email protected]>
>
>>
>> Given the speed of fetching/storing pages I don't think that speed of the
>> regular expression makes any difference. Running two compiled RE's
>> one after the other in sequence on the page text should be very fast.
>>
>>
>>
> --
> Bináris
>
> _______________________________________________
> Pywikipedia-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
>
>