Mary Morris wrote:

> I'm trying to compile a list of decorators from the source code at my
> office.
> I did this by doing a
>
> candidate_line.find("@")
>
> because all of our decorators start with the @ symbol. The problem I'm
> having is that the email addresses that are included in the comments are
> getting included in the list that is getting returned.
> I was thinking I could do a candidate_line.find(".com") to set the email
> addresses apart, but how do I tell the computer to not include the lines
> it finds with ".com" in them in the list?

You can use the tokenize module to do the heavy lifting, see

http://docs.python.org/library/tokenize.html

Here's an example:

$ cat find_decos.py
import tokenize
from collections import defaultdict

class Token:
    def __init__(self, token):
        self.string = token[1]       # text of the token
        self.lineno = token[2][0]    # row of the (row, column) start position

def find_decos(instream, filename, decos):
    tokens = (Token(token) for token in
              tokenize.generate_tokens(instream.readline))
    for token in tokens:
        if token.string == "@":
            lineno = token.lineno
            qname = [next(tokens).string]    # name right after the "@"
            for token in tokens:             # collect the dotted parts, if any
                if token.string == ".":
                    qname.append(next(tokens).string)
                else:
                    break
            decos[".".join(qname)].append((lineno, filename))

def main():
    import sys
    files = sys.argv[1:]
    if not files:
        # read filenames from stdin
        files = (line.strip() for line in sys.stdin)

    decorators = defaultdict(list)
    for filename in files:
        with open(filename) as instream:
            find_decos(instream, filename, decorators)
    for name in sorted(decorators):
        print(name)
        for location in decorators[name]:
            print("%8d %s" % location)

if __name__ == "__main__":
    main()

if False:
    def f():
        """
        @not_a_decorator
        """
        return g
    # @not_a_decorator

    @alpha
    def first(x):
        return "u...@example.com"

    @beta
    @gamma . one
    def second():
        pass

    @delta.two.three.four(*args)
    @epsilon(42)
    def third():
        pass

The if False:...
suite is of course not a necessary part of the script, it's just a trick to
cram in a few decorators for the script to find when you run it over itself:

$ python find_decos.py find_decos.py
alpha
      50 find_decos.py
beta
      54 find_decos.py
delta.two.three.four
      59 find_decos.py
epsilon
      60 find_decos.py
gamma.one
      55 find_decos.py

Alternatively you can feed filenames via stdin:

$ find /usr/lib/python2.6 -name \*.py | python find_decos.py | tail
     429 /usr/lib/python2.6/dist-packages/usbcreator/frontends/kde/frontend.py
     434 /usr/lib/python2.6/dist-packages/usbcreator/frontends/kde/frontend.py
     446 /usr/lib/python2.6/dist-packages/usbcreator/frontends/kde/frontend.py
threaded
     166 /usr/lib/python2.6/dist-packages/softwareproperties/kde/DialogMirror.py
withResolverLog
     572 /usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeCache.py
     858 /usr/lib/python2.6/dist-packages/DistUpgrade/DistUpgradeCache.py
wraps
      81 /usr/lib/python2.6/contextlib.py
$

Peter
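
A note on why the tokenizer sidesteps the email-address problem: a "@" inside a comment or a string literal arrives as part of a COMMENT or STRING token, while the "@" of a decorator is an operator token of its own. The following is only a minimal sketch of that distinction, not part of the script above. It assumes Python 3, and SAMPLE, at_tokens and decorator_lines are names made up for illustration.

```python
import io
import tokenize

# Hypothetical sample: "@" appears in a comment, a decorator, and a string.
SAMPLE = '''\
# contact: alice@example.com
@decorated
def f():
    return "bob@example.com"
'''


def at_tokens(source):
    """Return (line number, token type name) for each token containing "@"."""
    readline = io.StringIO(source).readline
    return [
        (tok[2][0], tokenize.tok_name[tok[0]])
        for tok in tokenize.generate_tokens(readline)
        if "@" in tok[1]
    ]


def decorator_lines(source):
    """Return line numbers where "@" is an operator token of its own."""
    readline = io.StringIO(source).readline
    return [
        tok[2][0]
        for tok in tokenize.generate_tokens(readline)
        if tok[0] == tokenize.OP and tok[1] == "@"
    ]


for lineno, kind in at_tokens(SAMPLE):
    print(lineno, kind)
print(decorator_lines(SAMPLE))
```

Only line 2 of the sample survives the operator check; the comment and the string are skipped without any ".com" test. One caveat if you run this on newer code: since Python 3.5 "@" is also the matrix multiplication operator, so an expression like a @ b would be reported as well. A stricter version might additionally require the "@" to be the first token on its line.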