Hi,

I'm trying to create a regular expression for matching some particular XML strings. I want to extract the contents of a particular XML tag, but only if it follows one tag and does not follow another. To complicate matters, there can be any number of other tags in between.
So basically, my regular expression should have three parts:

- the first match
- any arbitrary text that does not contain the string '<Xds'
- the second match

I have a problem figuring out how to do the second part: an arbitrary bit of text that should _not_ contain the substring '<Xds' ('<Xds' being the start of any tag that should not occur between my first and second match). Because the overall match has variable length, I cannot do this with a negative look-behind assertion, and a negative look-ahead assertion doesn't seem to work either.

The regular expression that I have now is:

    r'(?s)<Xds\w*Policy>.*?<ref>(?P<pol_ref>\d+)</ref>'

(hopefully without typos). Here '<Xds\w*Policy>' is my first match, and '<ref>(?P<pol_ref>\d+)</ref>' is my second match. In this expression, I want to replace the generic '.*?', which matches everything, with something that matches any string that does not include the substring '<Xds'.

I know that I could capture the text matched by '.*?' and manually check whether it contains the string '<Xds', but that would be very hard to fit into the rest of the code, for a number of reasons.

Does anyone have an idea how to do this within one regular expression?

Regards,
--Tim
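
P.S. To illustrate the problem, here is a minimal sketch of what the current expression does. The sample XML is made up for this mail (only '<ref>' and the '<Xds...Policy>' prefix come from my real data), and find_refs is just a hypothetical helper name.

```python
import re

# Current pattern from above: '.*?' may run across any tags in between.
PATTERN = re.compile(r'(?s)<Xds\w*Policy>.*?<ref>(?P<pol_ref>\d+)</ref>')

# Hypothetical sample data, invented for illustration only.
# WANTED has no '<Xds' between the policy tag and <ref>.
WANTED = '<XdsFooPolicy><name>a</name><ref>111</ref></XdsFooPolicy>'
# UNWANTED has an '<XdsOther>' tag between the policy tag and <ref>.
UNWANTED = ('<XdsBarPolicy><name>b</name>'
            '<XdsOther><ref>222</ref></XdsOther></XdsBarPolicy>')


def find_refs(text):
    """Return every pol_ref captured by the current pattern."""
    return [m.group('pol_ref') for m in PATTERN.finditer(text)]


print(find_refs(WANTED))    # ['111'], which is what I want
print(find_refs(UNWANTED))  # ['222'], but I would want no match here
```

The second call is the case I am trying to exclude: '.*?' happily skips over '<XdsOther>' on its way to '<ref>'.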