You could try JTidy; it has an XHTML formatter.

Markus Wiederkehr wrote:

Sorry if this question is a bit off topic, but I thought people on
this mailing list might know...

I'm working on a Tapestry application that has to display arbitrary
HTML files inline (like GMail does when you receive an HTML e-mail,
for example). So basically I want to develop a Tapestry component that
can render an HTML file.

The obvious requirement is that I don't want my page to be corrupted
by the HTML file. So I would have to remove /bad/ elements like
SCRIPT, /bad/ attributes like ID or NAME, etc. But in addition the
result also has to be compliant with XHTML 1.0 Transitional, no matter
how sloppy the original HTML file is.

I tried to run the HTML through NekoHTML to create a DOM. Then I tried
to remove bad elements and attributes from that DOM, but still the
result might not be XHTML compliant due to missing or misplaced
elements. So that approach looks like a lot of work...
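
The DOM clean-up step described above could be sketched roughly as follows. This is only an illustration under several assumptions: the class and method names (DomSanitizer, sanitize, parse) are made up, the blacklists contain just the SCRIPT element and the ID and NAME attributes mentioned in the message, and the JDK's built-in XML parser stands in for NekoHTML so that the example runs without extra libraries. It does not attempt to fix missing or misplaced elements, which is the harder part.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Tapestry, NekoHTML or JTidy.
final class DomSanitizer {
    // Assumed blacklists; a real application would need longer lists.
    private static final Set<String> BAD_ELEMENTS =
            new HashSet<>(Arrays.asList("script"));
    private static final Set<String> BAD_ATTRIBUTES =
            new HashSet<>(Arrays.asList("id", "name"));

    // Recursively removes blacklisted elements and attributes below 'node'.
    static void sanitize(Node node) {
        // Copy the children first: NodeList is live and changes on removal.
        NodeList children = node.getChildNodes();
        List<Node> snapshot = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            snapshot.add(children.item(i));
        }
        for (Node child : snapshot) {
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element element = (Element) child;
            String tag = element.getTagName().toLowerCase(Locale.ROOT);
            if (BAD_ELEMENTS.contains(tag)) {
                node.removeChild(element); // drops the whole subtree
                continue;
            }
            stripAttributes(element);
            sanitize(element);
        }
    }

    // Removes blacklisted attributes, ignoring the case of their names.
    private static void stripAttributes(Element element) {
        NamedNodeMap attributes = element.getAttributes();
        List<String> toRemove = new ArrayList<>();
        for (int i = 0; i < attributes.getLength(); i++) {
            String name = attributes.item(i).getNodeName();
            if (BAD_ATTRIBUTES.contains(name.toLowerCase(Locale.ROOT))) {
                toRemove.add(name);
            }
        }
        for (String name : toRemove) {
            element.removeAttribute(name);
        }
    }

    // Stand-in for the NekoHTML parsing step; accepts well-formed markup only.
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<div id=\"x\"><script>alert(1)</script><p>Hello</p></div>");
        sanitize(doc);
        System.out.println("script elements left: "
                + doc.getElementsByTagName("script").getLength());
    }
}
```

A blacklist like this is easy to get wrong (event-handler attributes such as onclick are not covered here, for instance), so a whitelist of allowed elements and attributes may be the safer design.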

Has anyone done something like this, or does anyone have a better idea of
how to accomplish this?

Thanks,

Markus

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


