[docs-discuss] [on-discuss] ON DevRef HTML chunking

Mike Kupfer Wed, 26 Nov 2008 12:26:23 -0800

Thanks for the comments!

>>>>> "Mark" == Mark J Nelson <Mark.J.Nelson at Sun.COM> writes:


Mark> Makefile:
Mark> - I'm curious why ALLFILES and FILES are identical, and how to
Mark>   decide which one to use?  

The Makefile originally came with ALLFILES and FILES; I don't know why
it was set up that way, either.  Of course, that excuse doesn't work for
the new chunking rules ;-).   I'll rationalize the file list variables
(everywhere in the makefile).

Mark> - In the chunks rule, do you need to wait between rawchunks and
Mark> $(ONCHUNKS), or to make a dependency for $(ONCHUNKS) on rawchunks?

I don't think so.  Don't forget, this is serial make.

Mark> - It would seem more appropriate to have the dir/file rules depend
Mark>   on dir, and have a rule to create it, rather than embedding the
Mark>   mkdirs in the other actions?

The problem with that approach is that any time the directory mtime
changes (e.g., something gets added to the directory), make(1) will want
to rebuild the contents of the directory.  I'd rather have a setup such
that if you invoke make(1) twice, the second one does nothing.

In the case of rawchunks, this is less compelling an argument because
the rules always fire.  But I'm thinking about adding a "touch
rawchunks" to the rules, so that xsltproc is only run when it's actually
needed.

But I'll see if there's a cleaner place to put the mkdirs.

Mark> fixup.py:
>       40     targre = re.compile(r'^(\w+)(.html|/)(#\w+)?$')
Mark> ...does the "." in ".html" need to be quoted, lest it match any
Mark> character?

Will fix.

>  126           for (attr, val) in attrlist:
>  127               if attr == "class":
>  128                   return val
>  129           return ""
Mark> ...could be more Python-esquely written as
>  126           val = [ v for k, v in attrlist if k == "class" ] + [ "" ]
>  127         return val[0]
Mark> (but it's purely a style comment, not correctness).

Ugh.  I think I'll leave it as-is for the sake of clarity.

But for my own edification: is the interpreter able to recognize the
second form and handle it more efficiently than the first form?

>  131   divclass = get_class(attributes)
>  132   if divclass == "navfooter" or divclass == "navheader" or \
>  133          divclass == "toc":
Mark> ...could be
>  131   if get_class(attributes) in [ "navfooter", "navheader", "toc" ]:

Will change.

Mark> - What does the assertion on line 134 gain you?  Are you asserting
Mark> that xsltproc is spitting out well-formed html?

No, it's more to verify that xsltproc is producing HTML that allows for
a simple state machine that alternates between "swallow" and "don't
swallow".  I'll add some comments.

Mark> - In start_a(), if you simply build a different list as you go,
Mark>   then you don't need to do the indexing dance.

I'm not sure I understand.  Are you suggesting building up a second
attributes list, and then passing it to unknown_starttag() instead of
the original attributes list?  I'll see if that's clearer.

Mark> - You could simply fold the href fixup into unknown_starttag():

I think I prefer the simplicity of the current design:
unknown_starttag() just prints out what it's given (and doesn't try to
be clever about it).

cheers,
mike

[docs-discuss] [on-discuss] ON DevRef HTML chunking

Reply via email to