On 10Jul2014 08:37, fl <rxjw...@gmail.com> wrote:
This example is from the link:
https://wiki.python.org/moin/RegularExpression
I have thought about it quite a while without a clue yet.
I notice that it uses double quotes ("), in contrast to the single quotes (')
that I have seen more often until now.
With raw strings (r', r") this doesn't matter. I tend to use r' myself.
You want raw strings with regular expressions because otherwise their heavy use
of sloshes (backslashes, "\") overlaps with Python's own use of sloshes in
string literals, making everything harder.
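
As a quick illustration of the raw string point (a minimal sketch; the variable
names are mine, not from the wiki page):

```python
# In a raw string each backslash is kept as-is; in a normal string
# literal you would have to double every one of them.
raw = r"\(\("
plain = "\\(\\("

same = raw == plain          # both spell the same 4 characters: \ ( \ (
length = len(raw)

# Python's own escapes are where the overlap bites:
newline_len = len("\n")      # a single newline character
raw_newline_len = len(r"\n") # a backslash followed by the letter n

print(same, length, newline_len, raw_newline_len)  # True 4 1 2
```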
It looks very complicated to me. Could you simplify it to a simple example?
import re
split_up = re.split(r"(\(\([^)]+\)\))",
                    "This is a ((test)) of the ((emergency broadcasting station.))")
...which produces:
["This is a ", "((test))", " of the ", "((emergency broadcasting station.))", ""]
(The trailing empty string is there because the last match sits at the very end
of the input.)
Rip off the python punctuation and get the regexp itself:
(\(\([^)]+\)\))
then start from the inside out:
[^)] Any character except a closing bracket.
+ One or more of the preceding.
Therefore:
[^)]+ One or more characters which are not closing brackets.
Also phrased: at least one character which is not a closing bracket.
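
You can try that piece on its own. A small sketch, with a made-up test string:

```python
import re

# [^)]+ grabs each maximal run of characters that are not ")".
runs = re.findall(r"[^)]+", "abc)def))ghi")
print(runs)  # ['abc', 'def', 'ghi']
```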
Outside this are \( and \): these are literal opening and closing bracket
characters. So:
\(\([^)]+\)\)
Two opening brackets, then at least one character which is not a
closing bracket, then two closing brackets.
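
To check that reading, here is a minimal sketch using re.findall on the same
sentence (the second test string is my own, added to show a non-match):

```python
import re

text = "This is a ((test)) of the ((emergency broadcasting station.))"

# No grouping brackets here, so findall returns the whole of each match.
found = re.findall(r"\(\([^)]+\)\)", text)
print(found)  # ['((test))', '((emergency broadcasting station.))']

# Single brackets are not enough: the pattern wants two on each side.
single = re.findall(r"\(\([^)]+\)\)", "a (test) b")
print(single)  # []
```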
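The outermost ( and ) are regexp grouping brackets, not text. They do not change
what the regexp matches; they mark out the regexp between them for later
reference or for use with a repeating modifier like ?, * or +. With re.split
they do make a difference, though: when the pattern contains a capturing group,
the text matched by the group is included in the result list, which is why
"((test))" shows up in the output instead of being dropped.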
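
To see what the outer brackets do in re.split specifically, here is a small
sketch running the same split with and without them (the variable names are
made up for illustration):

```python
import re

text = "This is a ((test)) of the ((emergency broadcasting station.))"

# With the outer grouping brackets: the separators are kept in the result.
with_group = re.split(r"(\(\([^)]+\)\))", text)
print(with_group)
# ['This is a ', '((test))', ' of the ', '((emergency broadcasting station.))', '']

# Without them: the separators are dropped.
without_group = re.split(r"\(\([^)]+\)\)", text)
print(without_group)
# ['This is a ', ' of the ', '']
```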
Given the above inside-to-out explanation, does that explain the re.split
result for you?
Cheers,
Cameron Simpson <c...@zip.com.au>
I thought the DoD was a bunch of licensed squids. The last thing you
need is a bunch of unregulated, amateur squids running loose.
- David Wood <davew...@teleport.com>