could ildg wrote:
> In re, the punctuation "^" can exclude a single character, but I want
> to exclude a whole word now. For example, I have a string "hi, how are
> you. hello", and I want to extract all the part before the word "hello".
> I can't use ".*[^hello]" because "^" only excludes a single char: "h",
> "e", "l" or "o". Will somebody tell me how to do it? Thanks.

import re

def demonstrate(regex, text):
    # Search text for regex, then report the whole match and group 1.
    pattern = re.compile(regex)
    match = pattern.search(text)

    print("  %s" % text)
    if match:
        print("    Matched  '%s'" % match.group(0))
        print("    Captured '%s'" % match.group(1))
    else:
        print("    Did not match")

# Option 1: Match it all, but capture only the part before "hello."  The
# (.*?) matches as few characters as possible, so that this pattern would
# end before the first hello in "hello hello".

pattern = r"(.*?)hello"
print("Option 1: %s" % pattern)
demonstrate( pattern, "hi, how are you. hello" )
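
To see why the non-greedy (.*?) matters in Option 1, here is a small sketch that runs the lazy and the greedy form against the same text. The sample string and the variable names are made up for illustration; they are not from the original question.

```python
import re

# Hypothetical sample text containing the word twice.
text = "say hello, then hello again"

# Lazy: (.*?) grabs as little as possible, so it stops at the first "hello".
lazy = re.search(r"(.*?)hello", text).group(1)

# Greedy: (.*) grabs as much as possible, so it runs on to the last "hello".
greedy = re.search(r"(.*)hello", text).group(1)

print("lazy:   %r" % lazy)    # 'say '
print("greedy: %r" % greedy)  # 'say hello, then '
```

If the word can only occur once, the two forms capture the same thing and the choice does not matter.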

# Option 2: Don't even match the "hello", but make sure it's there.
# The first of these calls will match, but the second will not.  The
# (?=...) construct is a positive lookahead assertion: it checks that
# "hello" follows without consuming it.

pattern = r"(.*)(?=hello)"
print("\nOption 2: %s" % pattern)
demonstrate( pattern, "hi, how are you. hello" )
demonstrate( pattern, "hi, how are you. " )
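
A third possibility, closer to the "exclude a whole word" wording of the question, is a negative lookahead, (?!...). The sketch below is only one way to write it, and the names before_word and text_before_hello are invented for this example. Unlike the two options above, it also returns something when "hello" never appears, which may or may not be what you want.

```python
import re

# (?!hello) succeeds only at positions where "hello" does NOT start.
# Putting it in front of "." lets the group consume one character at a
# time and stop just before the word (or at the end of the line).
before_word = re.compile(r"((?:(?!hello).)*)")

def text_before_hello(text):
    # Hypothetical helper: the pattern can match the empty string, so
    # match() always succeeds and group(1) is always a string.
    return before_word.match(text).group(1)

first = text_before_hello("hi, how are you. hello")
second = text_before_hello("hi, how are you. ")

print("with hello:    %r" % first)   # 'hi, how are you. '
print("without hello: %r" % second)  # 'hi, how are you. '
```

Note that "." does not match a newline unless re.DOTALL is passed, so as written this stops at the end of the first line.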
-- 
http://mail.python.org/mailman/listinfo/python-list
