On 8/6/05, Bertrand Delacrétaz <[EMAIL PROTECTED]> wrote:
...
> Cocoon (http://cocoon.apache.org) will allow you to build pipelines to
> parse the HTML (using JTidy or the NekoHTML parser), process it via
> XSLT transforms to clean it up and feed it to java objects for storage,
> or go directly to SQL statements via its SQLTransformer which executes
> SQL statements embedded in XML documents.

I like the idea of using JTidy since that's what I'm most familiar
with.  I'll take a look at Cocoon.  The immediate goal is to get the
HTML inserted into a database by any means.

> An alternative, especially if it's a one-off job, would be to build
> your own pipeline using NekoHTML, Xalan, and commons Digester or
> another XML-to-beans mapper to build your java objects, using ant to
> combine these tools.
> 
> -Bertrand

Where does Xalan fit into this?  Xalan is an XSLT processor, but what
does that really mean?  Is Xalan the "engine" which does the actual
transform from HTML to XML, based on what the XSLT stylesheet specifies?

I'm trying to find a "transforms 101" manual or example where some
XSLT is used to transform HTML to XML.  I imagine that this isn't so
unusual.

If "Xalan-Java is an XSLT processor for transforming XML documents
into HTML, text, or other XML document types." then don't I want the
inverse of Xalan, HTML to XML?  Is that Xerces?
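
For reference, here is a minimal sketch of what I understand the XSLT
step to be.  It assumes the HTML has already been turned into
well-formed XML (the job JTidy or NekoHTML would do, not shown here).
It uses the standard JAXP API (javax.xml.transform), of which Xalan is
one implementation.  The class name, the stylesheet, and the sample
input are all made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltSketch {
    // Hypothetical stylesheet: copies the first cell of every table row
    // into a <record> element under a single <records> root.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<records><xsl:for-each select='//tr'>"
      + "<record><xsl:value-of select='td[1]'/></record>"
      + "</xsl:for-each></records>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the result.
    // The input must already be well-formed XML, not raw HTML.
    static String transform(String xml, String xslt)
            throws TransformerException {
        // Picks up whichever XSLT processor is configured (Xalan or the
        // JDK's built-in one).
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample standing in for tidied HTML.
        String xhtml = "<html><body><table>"
                     + "<tr><td>alpha</td></tr>"
                     + "<tr><td>beta</td></tr>"
                     + "</table></body></html>";
        System.out.println(transform(xhtml, XSLT));
    }
}
```

If this is right, the XSLT processor only ever sees XML on the way in,
so the HTML-to-XML cleanup would be a separate step in front of it.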


Thanks,

Thufir
