On 2015-11-05 23:05, Steven D'Aprano wrote:
> Oh the shame, I knew that. Somehow I tangled myself in a knot,
> thinking that it had to be 1 *followed by* zero or more characters.
> But of course it's not a glob, it's a regex.

But that's a good reminder of the fnmatch/glob modules too. Sometimes
all you need is to express a simple glob, in which case using a regexp
can cloud the clarity.

The overarching principle is to go for clarity & simplicity, rather
than favoring any one of built-ins/glob/regex/parser modules all the
time. Want to test for presence in a string? Just use the builtin
"a in b" test. At the beginning/end? Use .startswith()/.endswith()
for clarity. Need to check if a string is purely
digits/alpha/alphanumerics/etc? Use the str
.is{alnum,alpha,decimal,digit,identifier,lower,numeric,printable,space,title,upper}()
methods. For simple wild-carding, use the fnmatch module to do simple
globbing. For more complex pattern matching, you've got regexps.
Finally, for occasions when you're searching for repeated/nested
structures, using an add-on module like pyparsing will give you
clearer code.

Oh, and with regexps, people should be less afraid of verbose
multi-line strings with commenting:

    import re

    r = re.compile(r"""
        ^                       # start of the string
        (?P<year>\d{4})         # capture 4 digits
        -                       # a literal dash
        (?P<month>\d{1,2})      # capture 1-2 digits
        -                       # another literal dash
        (?P<day>\d{1,2})        # capture 1-2 digits
        _                       # a literal underscore
        (?P<accountnum>         # capture the account-number
            [A-Z]{1,3}          # 1-3 letters
            \d+                 # followed by 1+ digits
        )
        \.txt                   # the extension of the file (ignored)
        $                       # the end of the string
        """, re.VERBOSE)

They are a LOT easier to come back to if you haven't touched the code
for a year.

-tkc
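
For the archives, here is a minimal sketch of that "simplest tool that
works" ladder, ending with the commented regexp put to use. The sample
filename and the parse_name() helper are made up for illustration; they
are assumptions, not something from the earlier messages in the thread.

```python
import fnmatch
import re

# Hypothetical filename, chosen only so that it fits the pattern below.
name = "2015-11-05_ABC123.txt"

# Presence, prefix and suffix tests need no pattern language at all.
assert "_" in name
assert name.startswith("2015-")
assert name.endswith(".txt")

# Character-class checks are plain string methods.
assert name[:4].isdigit()
assert not name.isalnum()       # it contains "-", "_" and "."

# Simple wild-carding: fnmatchcase() is the case-sensitive glob test.
assert fnmatch.fnmatchcase(name, "2015-*_*.txt")

# Glob vs. regex: as a glob, "1*" is a "1" followed by anything;
# as a regex, "1*" is zero or more "1" characters.
assert fnmatch.fnmatchcase("123", "1*")
assert not fnmatch.fnmatchcase("23", "1*")
assert re.match(r"1*", "23").group() == ""   # matches the empty string

FILENAME = re.compile(r"""
    ^                       # start of the string
    (?P<year>\d{4})         # capture 4 digits
    -                       # a literal dash
    (?P<month>\d{1,2})      # capture 1-2 digits
    -                       # another literal dash
    (?P<day>\d{1,2})        # capture 1-2 digits
    _                       # a literal underscore
    (?P<accountnum>         # capture the account-number
        [A-Z]{1,3}          # 1-3 letters
        \d+                 # followed by 1+ digits
    )
    \.txt                   # the extension of the file (ignored)
    $                       # the end of the string
    """, re.VERBOSE)


def parse_name(filename):
    """Return the named groups as a dict, or None if nothing matches."""
    m = FILENAME.match(filename)
    return m.groupdict() if m else None


print(parse_name(name))
```

The named groups come back as strings via groupdict(), so the caller
can convert year/month/day with int() where numbers are wanted.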