[htdig] htdig parsing

Fri, 06 Oct 2000 05:24:28 -0700

We are extensively converting our "legacy" HTML to XHTML,
using a combination of htdig, JTidy and a locally written
tool, JChemTidy.

One element we have focused on a great deal is <object>. We
replace all instances of <embed> with <object>, because

a) <embed> is not well formed (i.e. it should be <embed />).
b) it cannot be validated. This is because the attributes of
<embed> are not defined by a DTD, but are instead implied
by whatever attributes the plugin that <embed> resolves to
supports. Thus two users with different plugins may well
be running implicitly different DTDs for the same document.
This is not good.

<object> solves both these problems.
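
To make the replacement concrete, here is a minimal sketch in
Python of the kind of rewrite described above. It is not the
JChemTidy code. The attribute mapping (src becomes data; type,
width, height and title are kept; everything else becomes a
<param> child) is an assumption, and the names EMBED_RE, ATTR_RE,
DIRECT and embed_to_object are made up for this illustration.

```python
import re
from html import escape, unescape

# Matches an <embed ...> or <embed ... /> tag, plus an optional </embed>.
EMBED_RE = re.compile(r"<embed\b([^>]*?)/?>(?:\s*</embed>)?", re.IGNORECASE)

# Matches name="value", name='value' or name=value inside the tag.
ATTR_RE = re.compile(r"""([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""")

# Assumption: these <embed> attributes map directly onto <object>.
DIRECT = {"src": "data", "type": "type", "width": "width",
          "height": "height", "title": "title"}


def _replace(match):
    """Build an <object> element from one matched <embed> tag."""
    direct, params = [], []
    for name, dq, sq, bare in ATTR_RE.findall(match.group(1)):
        # Normalise the value so it is escaped exactly once.
        value = escape(unescape(dq or sq or bare), quote=True)
        name = name.lower()
        if name in DIRECT:
            direct.append(f'{DIRECT[name]}="{value}"')
        else:
            # Plugin-specific attributes become explicit <param> children,
            # so they no longer depend on an implicit, per-plugin DTD.
            params.append(f'<param name="{name}" value="{value}" />')
    head = " ".join(["<object"] + direct)
    return f"{head}>{''.join(params)}</object>"


def embed_to_object(html_text):
    """Return html_text with every <embed> rewritten as <object>."""
    return EMBED_RE.sub(_replace, html_text)
```

A regular expression is used only to keep the sketch short; a
real converter would work on the parsed tree that a tool such as
JTidy produces.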

Our only problem is that htdig 3.2 does not parse <object>.

A long time ago, we hacked htdig 3.1 to parse <embed> and
<object>, but these mods do not appear to have been incorporated
into htdig 3.2.

If someone could rescue them, we would be very grateful.
On a related point, if htdig could also be persuaded to index the
title attribute of elements such as <object>, it would be a great
help. As part of the XHTML conversion process, we build a title
if none exists, and it would be nice to have htdig pick it up!

Thanks.  
-- 

Henry Rzepa. +44 (0)20 7594 5774 (Office) +44 (0)20 7594 5804 (Fax)
Dept. Chemistry, Imperial College, London, SW7  2AY, UK. 
http://www.ch.ic.ac.uk/rzepa/

