Re: [Tutor] Find files without doc strings

David Sun, 17 May 2009 08:38:22 -0700

spir wrote:

On Sat, 16 May 2009 21:46:02 -0400,
David <da...@abbottdavid.com> wrote:

I am doing an exercise in Wesley Chun's book: find the files in the standard library modules that have doc strings, then find the ones that don't, "the shame list". I came up with this to find the ones that have them:
#!/usr/bin/python
import os
import glob
import fileinput
import re

pypath = "/usr/lib/python2.6/"
fnames = glob.glob(os.path.join(pypath, '*.py'))

def read_doc():
     pattern = re.compile('"""*\w')
     for line in fileinput.input(fnames):
         if pattern.match(line):
             print 'Doc String Found: ', fileinput.filename(), line

read_doc()


It seems to me that your approach is moderately wrong ;-)
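
To illustrate why the pattern in the script above is fragile, here is a small sketch that runs the same regex against a few sample lines. The sample lines are made up for illustration, and the snippet is written for Python 3 (an assumption; the thread itself uses Python 2.6). Because `"""*` means two quotes followed by zero or more further quotes, and `\w` demands a word character immediately afterwards, the pattern misses docstrings that start with a space or use `'''`.

```python
import re

# Same regex as in the original post, written as a raw string.
pattern = re.compile(r'"""*\w')

# Hypothetical sample lines, chosen to show hits and misses.
samples = [
    '"""Module doc."""',    # text right after the opening quotes
    '""" Module doc. """',  # a space after the opening quotes
    "'''Module doc.'''",    # triple single quotes
    '""x',                  # only two double quotes
]

# match() only looks at the start of each line.
results = [(line, pattern.match(line) is not None) for line in samples]
for line, matched in results:
    print("%-22s %s" % (line, matched))
```

Only the first and last samples match, so a docstring written as `''' foo '''` or `""" foo """` would be overlooked.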

There must have been an easier way :)


Not sure. As I see it the problem is slightly more complicated. A module doc is 
any triple-quoted string placed before any code. But it must be closed, too.
You'll have to skip blank and comment lines, then check whether the rest 
matches a docstring. It could be done with a single complicated pattern, but 
you could also go for it step by step.
Say I have a file 'dummysource.py' with the following text:
==============
# !/usr/bin/env python
# coding: utf8

# comment
# ''' """

''' foo module
        doc
        '''
def foofunc():
        ''' foofuncdoc '''
        pass
==============

Then, the following doc-testing code
==============
import re
doc = re.compile(r'(""".+?""")|(\'\'\'.+?\'\'\')', re.DOTALL)

def checkDoc(sourceFileName):
    sourceFile = file(sourceFileName, 'r')
    # move until first 'code' line
    while True:
        line = sourceFile.readline()
        if not line:
            # end of file: only blank and comment lines were found, so no doc
            break
        strip_line = line.strip()
        print "|%s|" % strip_line
        if (strip_line != '') and (not strip_line.startswith('#')):
            break
    # check doc (keep last line read!)
    source = line + sourceFile.read()
    result = doc.match(source)
    if result is not None:
        print "*** %s *******" % sourceFileName
        print result.group()
        return True
    else:
        return False

# checkDoc expects a file name, not an open file object
print checkDoc("dummysource.py")
==============

will output:

==============
|# !/usr/bin/env python|
|# coding: utf8|
||
|# comment|
|# ''' """|
||
|''' foo module|
*** dummysource.py *******
''' foo module
        doc
        '''
True
==============

It's just for illustration; you can probably make things simpler or find a 
better way.
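
One possibly simpler route, which is not what the thread itself uses, is to let Python's own parser find the docstring through the standard `ast` module. The sketch below is written for Python 3 (an assumption), and the helper name `has_module_doc` and both sample sources are hypothetical. Note that `ast.parse` only accepts source that is valid for the interpreter running it, so Python 2 library files would have to be checked with a Python 2 interpreter.

```python
import ast

def has_module_doc(source):
    """Return True if the source text has a module-level docstring."""
    tree = ast.parse(source)
    # get_docstring returns None when the first statement is not a string.
    return ast.get_docstring(tree) is not None

# Hypothetical sample: comments, then a module docstring, then a function.
with_doc = (
    "# comment\n"
    "# ''' \"\"\"\n"
    "\n"
    "''' foo module\n"
    "    doc\n"
    "    '''\n"
    "def foofunc():\n"
    "    ''' foofuncdoc '''\n"
    "    pass\n"
)

# Hypothetical sample: only the function has a docstring.
without_doc = (
    "# comment\n"
    "import os\n"
    "\n"
    "def foofunc():\n"
    "    ''' foofuncdoc '''\n"
    "    pass\n"
)

print(has_module_doc(with_doc))
print(has_module_doc(without_doc))
```

Comment lines, including ones that contain quote characters, are handled by the parser, so no manual skipping is needed.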

Now I have a problem: I cannot figure out how to compare fnames with the result of fileinput.filename() and get a list of any that don't have doc strings.


You can use a func like the above one to filter out (or in) files that answer 
yes/no to the test.
I would start with a list of all files, and just populate 2 new lists for "shame" and 
"fame" files ;-) according to the result of the test.

You could use list comprehension syntax, too:
    fameFileNames = [fileName for fileName in fileNames if checkDoc(fileName)]
But if you do this for shame files too, then every file gets tested twice.
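
A minimal sketch of the two-list idea, testing each file only once. It is written for Python 3 (an assumption), and `partition` and `fake_check` are hypothetical names; `fake_check` merely stands in for a real test such as checkDoc so that the example runs without touching the file system.

```python
def partition(file_names, has_doc):
    """Split file_names into (fame, shame), calling has_doc once per name."""
    fame, shame = [], []
    for name in file_names:
        if has_doc(name):
            fame.append(name)
        else:
            shame.append(name)
    return fame, shame

# Stand-in predicate that records how often it is called.
calls = []
def fake_check(name):
    calls.append(name)
    return name.startswith("good")

fame, shame = partition(["good_a.py", "bad_b.py", "good_c.py"], fake_check)
print(fame)
print(shame)
print(len(calls))
```

Passing the test as a parameter keeps the split independent of how a docstring is actually detected.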

thanks


Denis
------
la vita e estrany
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Thanks Denis,
This seems to work OK;
#!/usr/bin/python
import os
import glob
import fileinput
import re

pypath = "/usr/lib/python2.6/"
fnames = glob.glob(os.path.join(pypath, '*.py'))
fnames.sort()
goodFiles = []

def shame_list():
    pattern = re.compile(r'(^""")|(^\'\'\')', re.DOTALL)
    for line in fileinput.input(fnames):
        if pattern.match(line):
            found = fileinput.filename()
            if found not in goodFiles:
                goodFiles.append(found)
    # build and print the shame list once, after every file has been read
    shame = [item for item in fnames if item not in goodFiles]
    print 'Shame List: \n', shame
shame_list()

<returns>

Shame List:

['/usr/lib/python2.6/__phello__.foo.py', '/usr/lib/python2.6/collections.py', '/usr/lib/python2.6/md5.py', '/usr/lib/python2.6/pydoc_topics.py', '/usr/lib/python2.6/sha.py', '/usr/lib/python2.6/struct.py', '/usr/lib/python2.6/this.py']



--
Powered by Gentoo GNU/Linux
http://linuxcrazy.com