Re: [l2h] Generating XHTML

Ross Moore Thu, 04 Nov 2004 15:19:04 -0800

Hi Fred,

On 05/11/2004, at 8:24 AM, Fred L. Drake, Jr. wrote:

> I know this has come up before, but I haven't seen it discussed recently, so
> perhaps the answers have changed. ;-)
>
> Is there any way to get LaTeX2HTML to generate XHTML instead of classic
> SGML-based HTML? I'd really like to move the Python documentation into the
> "new world" as much as possible.

No, I've not done any work on this yet.

But HTMLtidy can be used for such a conversion, run as a
post-processor after the LaTeX2HTML job.

The intro page at: http://www.w3.org/People/Raggett/tidy/
describes the boolean option:

output-xhtml: bool
If set to yes, Tidy will generate the pretty printed output writing it
as extensible HTML. The default is no. This option causes Tidy to set
the doctype and default namespace as appropriate to XHTML. If a doctype
or namespace is given they will be checked for consistency with the
content of the document. In the case of an inconsistency, the corrected
values will appear in the output. For XHTML, entities can be written as
named or numeric entities according to the value of the
"numeric-entities" property. The tags and attributes will be output in
the case used in the input document, regardless of other options.

> If there's not a way to do this with LaTeX2HTML, pointers to some other
> LaTeX-to-XML tool would be appreciated. (Especially if it doesn't involve
> TeXML!)

You can configure LaTeX2HTML to run Tidy automatically on every page,
after all other processing has been completed.

There are three places in LaTeX2HTML where you could install such an extra
post-processing step, by defining your own Perl subroutine:

&post_post_process

&document_post_post_process

These two are Perl subroutines that are called (if defined) to act on the
contents of $_ before it is written to the .html files.
(&document_post_post_process acts a little later than &post_post_process,
*after* the <ADDRESS> tags have been added.)
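
To make the shape of such a hook concrete, here is a minimal sketch of a
&post_post_process that could go in a .latex2html-init file. It only rewrites
<BR> and <HR> into the self-closing XHTML form, which is nowhere near a full
XHTML conversion. The choice of tags and the regular expression are
illustrative assumptions, not anything taken from the LaTeX2HTML sources.

```perl
# Illustrative sketch only: an in-place edit of $_, which is how the
# post-processing hooks described above receive the page contents.
sub post_post_process {
    # Match <BR>, <HR ...>, <br/> and similar, keeping any attributes in $2.
    # \L...\E lower-cases the tag name; the attributes are left untouched.
    s{<(BR|HR)\b([^>]*?)\s*/?>}{<\L$1\E$2 />}gi;
}

1;  # an init file is loaded as Perl code and should end with a true value
```

Anything more ambitious than this kind of local substitution is probably
better left to Tidy, run over the finished files.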

&html_validate

This subroutine is called subject to the values of certain variables.
Currently it is defined to act on the completed HTML pages; viz.

  sub html_validate {
      my($extn) = $EXTN;
      if (!($EXTN =~ /^\.html?$/i)) {
          $extn =~ s/^[^\.]*(\.html?)$/$1/;
      }
      print "\n *** Validating ***\n";
      system("$HTML_VALIDATOR *$extn");
  }

Indeed this makes a system call to a program that acts on all files
having the right extension that happen to live in the $DESTDIR
directory where the web-pages are being built.

You can easily write an alternative subroutine to act instead,
placing it in a .latex2html-init file.
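
As a rough, untested sketch of such an alternative, the following replaces the
validator call with a loop that runs Tidy over each page. It assumes a `tidy`
executable on the PATH that accepts the -q, -asxhtml and -m command-line
options, and that the current directory is the one holding the finished pages,
as the default routine's use of *$extn suggests.

```perl
# Sketch for a .latex2html-init file; assumptions are noted inline.
sub html_validate {
    my $extn = $EXTN;
    # Reduce $EXTN to a bare .htm/.html suffix, as the default routine does.
    if ($extn !~ /^\.html?$/i) {
        $extn =~ s/^[^\.]*(\.html?)$/$1/;
    }
    print "\n *** Converting to XHTML with tidy ***\n";
    foreach my $file (glob("*$extn")) {
        # Assumed tidy options: -q quiet, -asxhtml write XHTML,
        # -m modify the input file in place.
        my $rc = system('tidy', '-q', '-asxhtml', '-m', $file);
        if ($rc == -1) {
            print "could not run tidy: $!\n";
            last;
        }
        # Tidy's exit status is 1 for warnings and 2 for errors.
        print "tidy reported errors in $file\n" if ($rc >> 8) >= 2;
    }
}

1;  # an init file is loaded as Perl code and should end with a true value
```

The validation step still has to be enabled in whatever way your installation
normally does it; only the body of the subroutine changes.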

> Thanks!

Hope this helps,

Ross

> -Fred
>
> --
> Fred L. Drake, Jr. <fdrake at acm.org>

_______________________________________________
latex2html mailing list
[EMAIL PROTECTED]
http://tug.org/mailman/listinfo/latex2html
