Mike Kupfer wrote:
> I've been working on some changes to the ON Developers Reference
> sources, so that it's easy to generate HTML that matches the existing
> structure at the ON community site. The idea is that you'd follow these
> steps when making changes:
>
> (edit the XML source)
> $ make -e <DOCBOOK_path> chunks
> $ hg stat on-chunks
> (upload HTML files that Mercurial reports as changed)
> (edit http://www.opensolaris.org/os/community/on/devref_toc/ if needed)
>
> The "make chunks" step uses a custom table-of-contents file to control
> how the HTML is chunked. Then it runs the HTML through a script to fix
> up links and remove unwanted tags.
>
> I still have some work to do, like fixing the section links in the XML
> source to match the links that are currently in use. But I thought I'd
> post a preliminary webrev in case anyone wants to look at what I have
> now. I'm hoping to have a more complete webrev ready in the next week
> or so.
>
> http://cr.opensolaris.org/~kupfer/devref-chunks/
>
> I'm fairly new to Python, so recommendations on how to improve the
> fixup.py script are particularly welcome.
Makefile:
- I'm curious why ALLFILES and FILES are identical, and how to decide
which one to use? Particularly in the rawchunks rule, which uses
ALLFILES as a dependency, but FILES in the action.
- In the chunks rule, do you need to wait between rawchunks and
$(ONCHUNKS), or to make a dependency for $(ONCHUNKS) on rawchunks?
- It would seem more appropriate to have the dir/file rules depend on
dir, and have a rule to create it, rather than embedding the mkdirs in
the other actions?
fixup.py:
- In this line:
40 targre = re.compile(r'^(\w+)(.html|/)(#\w+)?$')
...does the "." in ".html" need to be quoted, lest it match any character?
- This:
124 def get_class(attrlist):
125 '''Return the div class or an empty string.'''
126 for (attr, val) in attrlist:
127 if attr == "class":
128 return val
129 return ""
...could be more Python-esquely written as
124 def get_class(attrlist):
125 '''Return the div class or an empty string.'''
126 val = [ v for k, v in attrlist if k == "class" ] + [ "" ]
127 return val[0]
(but it's purely a style comment, not correctness).
- Similarly, this:
131 divclass = get_class(attributes)
132 if divclass == "navfooter" or divclass == "navheader" or \
133 divclass == "toc":
...could be
131 if get_class(attributes) in [ "navfooter", "navheader", "toc" ]:
- What does the assertion on line 134 gain you? Are you asserting that
xsltproc is spitting out well-formed html?
- In start_a(), if you simply build a different list as you go, then you
don't need to do the indexing dance.
- You could simply fold the href fixup into unknown_starttag():
166 def unknown_starttag(self, tag, attributes):
167 '''Print the tag and attributes unless it is in a block that
168 is being suppressed.'''
169 if self.swallow:
170 return
171 attrstr = ""
172 if attributes:
173 attrstr = " " + " ".join('%s="%s"' % (a, fix_href(v)) \
174 for a, v in attributes)
175 self.my_output.append("<%s%s>" % (tag, attrstr))
(and then fix_href() would need to properly handle other attributes, too.)
--Mark