Mike Kupfer wrote:
> I've been working on some changes to the ON Developers Reference
> sources, so that it's easy to generate HTML that matches the existing
> structure at the ON community site.  The idea is that you'd follow these
> steps when making changes:
> 
>   (edit the XML source)
>   $ make -e <DOCBOOK_path> chunks
>   $ hg stat on-chunks
>   (upload HTML files that Mercurial reports as changed)
>   (edit http://www.opensolaris.org/os/community/on/devref_toc/ if needed)
> 
> The "make chunks" step uses a custom table-of-contents file to control
> how the HTML is chunked.  Then it runs the HTML through a script to fix
> up links and remove unwanted tags.
> 
> I still have some work to do, like fixing the section links in the XML
> source to match the links that are currently in use.  But I thought I'd
> post a preliminary webrev in case anyone wants to look at what I have
> now.  I'm hoping to have a more complete webrev ready in the next week
> or so.
> 
>   http://cr.opensolaris.org/~kupfer/devref-chunks/
> 
> I'm fairly new to Python, so recommendations on how to improve the
> fixup.py script are particularly welcome.

Makefile:
- I'm curious why ALLFILES and FILES are identical, and how to decide 
which one to use?  Particularly in the rawchunks rule, which uses 
ALLFILES as a dependency, but FILES in the action.
- In the chunks rule, do you need to wait between rawchunks and 
$(ONCHUNKS), or to make a dependency for $(ONCHUNKS) on rawchunks?
- It would seem more appropriate to have the dir/file rules depend on 
dir, and have a rule to create it, rather than embedding the mkdirs in 
the other actions?

fixup.py:
- In this line:
        40     targre = re.compile(r'^(\w+)(.html|/)(#\w+)?$')
...does the "." in ".html" need to be quoted, lest it match any character?

- This:
  124       def get_class(attrlist):
  125           '''Return the div class or an empty string.'''
  126           for (attr, val) in attrlist:
  127               if attr == "class":
  128                   return val
  129           return ""
...could be more Python-esquely written as
  124       def get_class(attrlist):
  125           '''Return the div class or an empty string.'''
  126           val = [ v for k, v in attrlist if k == "class" ] + [ "" ]
  127          return val[0]
(but it's purely a style comment, not correctness).

- Similarly, this:
  131   divclass = get_class(attributes)
  132   if divclass == "navfooter" or divclass == "navheader" or \
  133          divclass == "toc":
...could be
  131   if get_class(attributes) in [ "navfooter", "navheader", "toc" ]:

- What does the assertion on line 134 gain you?  Are you asserting that 
xsltproc is spitting out well-formed html?

- In start_a(), if you simply build a different list as you go, then you 
don't need to do the indexing dance.

- You could simply fold the href fixup into unknown_starttag():
  166 def unknown_starttag(self, tag, attributes):
  167     '''Print the tag and attributes unless it is in a block that
  168     is being suppressed.'''
  169     if self.swallow:
  170         return
  171     attrstr = ""
  172     if attributes:
  173         attrstr = " " + " ".join('%s="%s"' % (a, fix_href(v)) \
  174                            for a, v in attributes)
  175     self.my_output.append("<%s%s>" % (tag, attrstr))
(and then fix_href() would need to properly handle other attributes, too.)

--Mark



Reply via email to