spir wrote:
On Sat, 16 May 2009 21:46:02 -0400,
David <da...@abbottdavid.com> wrote:
I am doing an exercise in Wesley Chun's book. Find files in the standard
library modules that have doc strings. Then find the ones that don't,
"the shame list". I came up with this to find the ones that have them:
#!/usr/bin/python
import os
import glob
import fileinput
import re
pypath = "/usr/lib/python2.6/"
fnames = glob.glob(os.path.join(pypath, '*.py'))
def read_doc():
    pattern = re.compile('"""*\w')
    for line in fileinput.input(fnames):
        if pattern.match(line):
            print 'Doc String Found: ', fileinput.filename(), line
read_doc()
It seems to me that your approach is moderately wrong ;-)
There must have been an easier way :)
Not sure. As I see it the problem is slightly more complicated. A module doc is
any triple-quoted string placed before any code. But it must be closed, too.
You'll have to skip blank and comment lines, then check whether the rest
matches a docstring. It could be done with a single complicated pattern, but
you could also go for it step by step.
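As a rough illustration of the single-pattern route mentioned above, here is a minimal sketch. It is written for Python 3 (the code elsewhere in this thread is Python 2), and the names SKIP, DOCSTRING, MODULE_DOC and has_module_doc are made up for this example. It only recognizes triple-quoted strings without prefixes such as r or u, so treat it as a sketch rather than a complete check.

```python
import re

# Zero or more lines that are blank or contain only a '#' comment.
SKIP = r'(?:[ \t]*(?:#[^\n]*)?\n)*'
# A closed triple-quoted string (either quote style).
DOCSTRING = r'[ \t]*(?:"""(.*?)"""|\'\'\'(.*?)\'\'\')'
# Anchor at the very start of the text; DOTALL lets '.' span several lines.
MODULE_DOC = re.compile(r'\A' + SKIP + DOCSTRING, re.DOTALL)


def has_module_doc(source):
    """Return True if the text starts (after blanks/comments) with a closed
    triple-quoted string."""
    return MODULE_DOC.match(source) is not None


sample = (
    "# !/usr/bin/env python\n"
    "# coding: utf8\n"
    "\n"
    "# ''' \"\"\"\n"
    "''' foo module\n"
    "doc\n"
    "'''\n"
    "def foofunc():\n"
    "    pass\n"
)
print(has_module_doc(sample))                                   # True
print(has_module_doc("import os\n''' not a module doc '''\n"))  # False
```

The pattern is harder to read than a loop, which is one reason to prefer the step-by-step version.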
Say I have a file 'dummysource.py' with the following text:
==============
# !/usr/bin/env python
# coding: utf8

# comment
# ''' """

''' foo module
doc
'''
def foofunc():
    ''' foofuncdoc '''
    pass
==============
Then, the following doc-testing code
==============
import re
doc = re.compile(r'(""".+?""")|(\'\'\'.+?\'\'\')', re.DOTALL)
def checkDoc(sourceFileName):
    sourceFile = open(sourceFileName, 'r')
    # move until first 'code' line
    while True:
        line = sourceFile.readline()
        if line == '':
            # end of file: only blank and comment lines were found
            break
        strip_line = line.strip()
        print "|%s|" % strip_line
        if (strip_line != '') and (not strip_line.startswith('#')):
            break
    # check doc (keep last line read!)
    source = line + sourceFile.read()
    sourceFile.close()
    result = doc.match(source)
    if result is not None:
        print "*** %s *******" % sourceFileName
        print result.group()
        return True
    else:
        return False
# pass the file name, not an open file object
print checkDoc("dummysource.py")
==============
will output:
==============
|# !/usr/bin/env python|
|# coding: utf8|
||
|# comment|
|# ''' """|
||
|''' foo module|
*** dummysource.py *******
''' foo module
doc
'''
True
==============
It's just for illustration; you can probably make things simpler or find a
better way.
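One possibly simpler route, not used anywhere in this thread, is to let Python's own parser decide: the standard ast module provides ast.get_docstring(). The sketch below is written for Python 3, and the name has_module_docstring and the choice to treat unparsable source as "no docstring" are assumptions made for this example. Note that the parser has to match the Python version the files are written for, so a Python 3 interpreter will reject Python 2 sources that use print statements.

```python
import ast


def has_module_docstring(source_text):
    """Return True if the source parses and has a non-empty module docstring."""
    try:
        tree = ast.parse(source_text)
    except SyntaxError:
        # Assumed policy for this sketch: unparsable source counts as undocumented.
        return False
    # get_docstring returns None when the module has no docstring.
    return bool(ast.get_docstring(tree))


dummy = (
    "# comment\n"
    "\n"
    "''' foo module\n"
    "doc\n"
    "'''\n"
    "def foofunc():\n"
    "    ''' foofuncdoc '''\n"
    "    pass\n"
)
print(has_module_docstring(dummy))          # True
print(has_module_docstring("import os\n"))  # False
```

Unlike a regex, this also accepts docstrings written with single or double quotes on one line, since any string literal that is the first statement of a module counts.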
Now I have a problem: I cannot figure out how to compare the fnames
with the result of fileinput.filename() and get a list of any that don't
have doc strings.
You can use a func like the above one to filter out (or in) files that answer
yes/no to the test.
I would start with a list of all files, and just populate 2 new lists for "shame" and
"fame" files ;-) according to the result of the test.
You could use list comprehension syntax, too:
fameFileNames = [fileName for fileName in fileNames if checkDoc(fileName)]
But if you do this for shame files too, then every file gets tested twice.
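To test each file only once, both lists can be filled in a single loop. Here is a minimal sketch in Python 3; the helper name partition and the stand-in test are made up for illustration, and in practice the test would be a function like checkDoc.

```python
def partition(file_names, test):
    """Split file_names into (passed, failed), calling test once per name."""
    passed, failed = [], []
    for name in file_names:
        # Append to whichever list matches the test result.
        (passed if test(name) else failed).append(name)
    return passed, failed


# Stand-in test for illustration only: pretend these files have doc strings.
documented = {"os.py", "re.py"}
fame, shame = partition(["os.py", "this.py", "re.py"],
                        lambda name: name in documented)
print(fame)   # ['os.py', 're.py']
print(shame)  # ['this.py']
```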
thanks
Denis
------
la vita e estrany
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
Thanks Denis,
This seems to work OK;
#!/usr/bin/python
import os
import glob
import fileinput
import re
pypath = "/usr/lib/python2.6/"
fnames = glob.glob(os.path.join(pypath, '*.py'))
fnames.sort()
def shame_list():
    # any line starting with triple quotes counts as a doc string here,
    # wherever it appears in the file
    pattern = re.compile(r'"""|\'\'\'')
    goodFiles = set()
    for line in fileinput.input(fnames):
        if pattern.match(line):
            goodFiles.add(fileinput.filename())
    # build a new list; removing from fnames while looping over it skips items
    shame = [item for item in fnames if item not in goodFiles]
    print 'Shame List: \n', shame
shame_list()
<returns>
Shame List:
['/usr/lib/python2.6/__phello__.foo.py',
'/usr/lib/python2.6/collections.py', '/usr/lib/python2.6/md5.py',
'/usr/lib/python2.6/pydoc_topics.py', '/usr/lib/python2.6/sha.py',
'/usr/lib/python2.6/struct.py', '/usr/lib/python2.6/this.py']
--
Powered by Gentoo GNU/Linux
http://linuxcrazy.com