[EMAIL PROTECTED] wrote:
> I am scanning text that has identifiers with a constant prefix string
> followed by alphanumerics and underscores. I can't figure out, using
> pyparsing, how to match for this. The example expression below seems to
> be looking for whitespace between the 'atod' and the rest of the
> identifier.
>
> identifier_atod = 'atod' + pp.Word('_' + pp.alphanums)
>
> How can I get pyparsing to match 'atodkj45k' and 'atod_asdfaw', but not
> 'atgdkasdjfhlksj' and 'atod asdf4er', where the first four characters
> must be 'atod', and not followed by whitespace?

Here is one way using pyparsing.Combine:

>>> from pyparsing import *
>>> tests = [ 'atodkj45k', 'atod_asdfaw', 'atgdkasdjfhlksj', 'atod asdf4er']
>>> ident = Combine(Literal('atod') + Word('_' + alphanums))
>>> for t in tests:
...     try:
...         print ident.parseString(t)
...     except:
...         print 'No match', t
...
['atodkj45k']
['atod_asdfaw']
No match atgdkasdjfhlksj
No match atod asdf4er
>>>

Kent
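
For anyone reading this on Python 3: the session above uses Python 2 print statements and a bare except. Below is a minimal sketch of the same Combine approach written for Python 3. Combine works here because it requires its sub-expressions to sit next to each other in the input, with no whitespace skipped between them, and it joins the pieces into a single token. The helper name match_ident and the use of parseAll=True are my own additions and not part of the original answer. parseAll=True makes the whole string have to match, which the original session did not require.

```python
from pyparsing import Combine, Literal, ParseException, Word, alphanums

# Combine requires 'atod' and the following Word to be adjacent in the
# input (no skipped whitespace) and returns them as one joined token.
ident = Combine(Literal("atod") + Word("_" + alphanums))


def match_ident(text):
    """Hypothetical helper: return the matched identifier, or None."""
    try:
        # parseAll=True is an added assumption: the entire string must match.
        return ident.parseString(text, parseAll=True)[0]
    except ParseException:
        return None


tests = ["atodkj45k", "atod_asdfaw", "atgdkasdjfhlksj", "atod asdf4er"]
results = {t: match_ident(t) for t in tests}

for t, r in results.items():
    print(t, "->", r if r is not None else "No match")
```

Catching ParseException instead of using a bare except keeps unrelated errors (a typo in the grammar, for instance) from being reported as "No match".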