I think you can declare character entities and other character codes in a DTD (not sure though,
but it ought to work), as they are not XML nodes.
An unclosed <IMG ...>, though, will never be well-formed XML. The appropriate XHTML form is <img ... /> (lowercase and self-closed), and the browsers I use all accept such an image tag.
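
To make the difference concrete, here is a minimal sketch in Python (not Cocoon code; the helper name is_well_formed and the sample strings are made up for illustration). It only checks whether an XML parser accepts each form of the tag.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML style: the image tag is never closed, so the parser sees </p>
# while it is still waiting for the matching close of IMG.
html_style = '<p><IMG SRC="logo.png"></p>'

# XHTML style: lowercase name and a self-closing slash.
xhtml_style = '<p><img src="logo.png" /></p>'

print(is_well_formed(html_style))
print(is_well_formed(xhtml_style))
```

The first fragment is rejected and the second is accepted, which is why the HTML has to be cleaned up before it can enter an XML pipeline.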
You really do need to use the HTMLGenerator if you want to process HTML, because the input must
be XML and HTML is just not strict enough to be XML.
If you do not want your documents to be served that way, you can use a serializer that
serializes to HTML (not XHTML). I don't know, though, whether a serializer is available that removes
things like </img> and </input>. A transformer is naturally not good enough, as by
definition it transforms XML into other XML.
Leon
Upayavira wrote:
[EMAIL PROTECTED] wrote:
Dear Upayavira,
this is not a question of generator but more a question of serializer I think
Well, what you want to do is use the HTMLGenerator to read in the HTML, and then serialize it as XML or as XHTML. You can't serialize HTML directly, as you can't use it within a pipeline because it isn't valid XML. Therefore you convert it to valid XML with the HTMLGenerator, then serialize it.
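
The idea of "read loose HTML, emit well-formed XML" can be sketched outside Cocoon as well. The Python below is only an illustration under stated assumptions: it is not how the HTMLGenerator works internally, the names XhtmlWriter, html_to_xhtml and VOID are invented for this example, and it only repairs unclosed void elements such as IMG and INPUT (it does not fix other unbalanced tags the way a real tidy step would).

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that HTML allows without a closing tag (illustrative subset).
VOID = {"img", "input", "br", "hr", "meta", "link"}


class XhtmlWriter(HTMLParser):
    """Re-emit parsed HTML as XHTML-style markup (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _open(self, tag, attrs, self_close):
        parts = [tag]
        for name, value in attrs:
            # A bare attribute such as "disabled" becomes disabled="disabled".
            parts.append(name + "=" + quoteattr(value if value is not None else name))
        self.out.append("<" + " ".join(parts) + (" />" if self_close else ">"))

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        self._open(tag, attrs, tag in VOID)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, True)

    def handle_endtag(self, tag):
        # Void elements were self-closed above, so drop any stray </img>.
        if tag not in VOID:
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape for XML.
        self.out.append(escape(data))


def html_to_xhtml(html):
    writer = XhtmlWriter()
    writer.feed(html)
    writer.close()
    return "".join(writer.out)


print(html_to_xhtml('<p>Hi &amp; bye<IMG SRC="a.png" ALT=logo></p>'))
```

In Cocoon itself you would not write this by hand: the generator does the cleanup and the serializer at the end of the pipeline decides whether the result goes out as XML or XHTML.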
Hope that makes sense.
Upayavira
*Upayavira <[EMAIL PROTECTED]>*
22/04/2004 10:42
Please respond to users
To: [EMAIL PROTECTED]
cc:
Subject: Re: Transofrmer from Invalid HTML to XHTML or XML
[EMAIL PROTECTED] wrote:
> Dear Members,
>
> I have some HTML with special characters as <IMG > .... that
> I want to save as valid XML or XHTML.
Look at the HTMLGenerator in the HTML block.
Upayavira
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
