On Sat, 16 May 2009 21:46:02 -0400, David <da...@abbottdavid.com> wrote:
> I am doing an exercise in Wesley Chun's book. Find files in the standard
> library modules that have doc strings. Then find the ones that don't,
> "the shame list". I came up with this to find the ones with;
> #!/usr/bin/python
> import os
> import glob
> import fileinput
> import re
>
> pypath = "/usr/lib/python2.6/"
> fnames = glob.glob(os.path.join(pypath, '*.py'))
>
> def read_doc():
>     pattern = re.compile('"""*\w')
>     for line in fileinput.input(fnames):
>         if pattern.match(line):
>             print 'Doc String Found: ', fileinput.filename(), line
>
> read_doc()

It seems to me that your approach is moderately wrong ;-) Your pattern matches any line that begins with triple quotes, anywhere in the file, so function and class docstrings are reported too.

> There must have been an easier way :)

Not sure. As I see it the problem is slightly more complicated. A module doc is any triple-quoted string placed before any code (strictly, any string literal that is the first statement counts, but triple quotes are the usual convention). But it must be closed, too.
You'll have to skip blank and comment lines, then check whether the rest starts with a docstring. It could be done with a single complicated pattern, but you could also go for it step by step.

Say I have a file 'dummysource.py' with the following text:
==============
# !/usr/bin/env python
# coding: utf8

# comment
# ''' """

''' foo module
doc
'''
def foofunc():
    ''' foofuncdoc '''
    pass
==============

Then, the following doc-testing code
==============
import re
doc = re.compile(r'(""".+?""")|(\'\'\'.+?\'\'\')', re.DOTALL)

def checkDoc(sourceFileName):
    sourceFile = open(sourceFileName, 'r')
    # move until first 'code' line (or end of file)
    while True:
        line = sourceFile.readline()
        strip_line = line.strip()
        print("|%s|" % strip_line)
        # readline() returns '' at end of file: stop there instead of looping forever
        if line == '' or ((strip_line != '') and (not strip_line.startswith('#'))):
            break
    # check doc (keep last line read!)
    source = line + sourceFile.read()
    sourceFile.close()
    result = doc.match(source)
    if result is not None:
        print("*** %s *******" % sourceFileName)
        print(result.group())
        return True
    else:
        return False

# checkDoc expects a file name, not an open file object
print(checkDoc("dummysource.py"))
==============

will output:
==============
|# !/usr/bin/env python|
|# coding: utf8|
||
|# comment|
|# ''' """|
||
|''' foo module|
*** dummysource.py *******
''' foo module
doc
'''
True
==============

It's just for illustration; you can probably make things simpler or find a better way.

> Now I have a problem, I can not figure out how to compare the fnames
> with the result fileinput.filename() and get a list of any that don't
> have doc strings.

You can use a func like the above one to filter out (or in) files that answer yes/no to the test. I would start with a list of all files, and just populate 2 new lists for "shame" and "fame" files ;-) according to the result of the test.
You could use list comprehension syntax, too:
    fameFileNames = [fileName for fileName in fileNames if checkDoc(fileName)]
But if you do this for shame files too, then every file gets tested twice.

> thanks

Denis
------
la vita e estrany
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
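
As a follow-up to the two-list suggestion above, here is a minimal sketch of testing each file only once and sorting it into a "fame" or a "shame" list. The names has_module_doc and split_fame_shame are made up for this illustration, not taken from the original post. The test reuses the same regex idea as the reply, so it has the same limits (for instance, a docstring written with single quotes or with an r prefix will be missed).

```python
import re

# Same idea as the pattern in the reply: a closed triple-quoted string,
# in either quote style, possibly spanning several lines.
DOC = re.compile(r'(""".+?""")|(\'\'\'.+?\'\'\')', re.DOTALL)


def has_module_doc(source):
    """Return True if the text, after leading blank and comment lines,
    starts with a closed triple-quoted string (hypothetical helper)."""
    lines = source.splitlines(True)  # keep line endings
    i = 0
    # skip blank lines and comment lines
    while i < len(lines):
        stripped = lines[i].strip()
        if stripped and not stripped.startswith('#'):
            break
        i += 1
    return DOC.match(''.join(lines[i:])) is not None


def split_fame_shame(file_names):
    """Test every file exactly once; return (fame, shame) lists of names."""
    fame, shame = [], []
    for name in file_names:
        with open(name) as source_file:
            documented = has_module_doc(source_file.read())
        # append to one list or the other, depending on the test result
        (fame if documented else shame).append(name)
    return fame, shame
```

With the fnames list from the original script, this could be called as fame, shame = split_fame_shame(fnames). A plain loop is used rather than two list comprehensions so that no file is read and tested twice.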