Duncan Booth wrote:
> John Machin wrote:
>
>> So here's the mean lean no-flab version -- you don't even need the
>> parentheses (sorry, Thomas).
>>
>> >>> rx1=re.compile(r"""\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,""")
>> >>> rx1.findall("1234,2222-8888,4567,")
>> ['1234,', '2222-8888,', '4567,']
>
> No flab? What about all that repetition of \d? A less flabby version:
>
> >>> rx1=re.compile(r"""\b\d{4}(?:-\d{4})?,""")
> >>> rx1.findall("1234,2222-8888,4567,")
> ['1234,', '2222-8888,', '4567,']

OK, good idea to factor out the prefix and follow it with an optional
-1234. However, optimising re engines do common-prefix factoring, *and*
they rewrite stuff like x{4} as xxxx.

Cheers,
John
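
For anyone who wants to check that the two patterns in this thread behave the same, here is a minimal sketch that runs both against the sample string. The names rx_alternation, rx_factored and sample are made up for illustration and are not from the original posts. It only compares the results; it says nothing about how a given engine optimises either pattern internally.

```python
import re

# Pattern with the prefix spelled out in both branches of the alternation.
rx_alternation = re.compile(r"\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,")

# Pattern with the common prefix factored out and an optional "-dddd" part.
rx_factored = re.compile(r"\b\d{4}(?:-\d{4})?,")

# Sample input taken from the thread.
sample = "1234,2222-8888,4567,"

matches_alternation = rx_alternation.findall(sample)
matches_factored = rx_factored.findall(sample)

print(matches_alternation)  # ['1234,', '2222-8888,', '4567,']
print(matches_factored)     # ['1234,', '2222-8888,', '4567,']
```

Both lists match the output shown in the quoted sessions, so on this input the choice between the two spellings is one of readability rather than behaviour.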