On Apr 21, 2005, at 10:51, Xoan wrote:

Anyway, I don't see how to decouple the plain text from the formatting
info (bold, paragraphs, links, ...) in the HTML documents stored
as elements of the XML document. I need this separation in order to
permit full-text searching of those HTML documents and to allow
retrieving a document and showing it again in a browser (without loss
of formatting).

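If you're doing this for searching, there's always the lazy solution: make a copy of the document, strip all markup from the copy, and store it separately so it can be searched. The original stays untouched, so it can still be shown in a browser with its formatting intact.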
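
As a rough illustration of the "strip all markup" step, here is a minimal sketch in Python using only the standard library's html.parser. The names strip_markup, TextExtractor and the INLINE tag set are made up for this example and are not part of Cocoon or Chaperon; it assumes you only need a flat, whitespace-normalized string to feed to a search index.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content only (hypothetical helper for this sketch)."""

    # Tags whose content should never be indexed.
    SKIP = {"script", "style"}
    # Inline tags: no word break is inserted around these (assumed, partial list).
    INLINE = {"a", "b", "i", "em", "strong", "span", "u", "font", "tt", "code"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        if tag not in self.INLINE:
            self.parts.append(" ")  # block-level tag: force a word boundary

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        if tag not in self.INLINE:
            self.parts.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_markup(html):
    """Return the plain text of an HTML fragment, whitespace-normalized."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    print(strip_markup("<p>Hello <b>bold</b> world</p><p>Next</p>"))
```

The stripped copy would go into whatever you search against, stored alongside the original HTML, which is what you serve back to the browser.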


If you're trying to keep the markup and convert it to XML, I would suggest using Chaperon -- already in Cocoon as a block -- to parse the (possibly very loose) HTML and generate clean XML just the way you want it.

HTH,

Andre.

