On 6 Aug 05, at 09:34, THUFIR HAWAT wrote:
On my hard drive are a multitude of HTML files which I'd like to enter into a database using something such as Hibernate. What would be the tool to "extract" the data from the HTML into XML in order to insert the data?
Cocoon (http://cocoon.apache.org) lets you build pipelines that parse the HTML (using JTidy or the NekoHTML parser), clean it up with XSLT transforms, and feed the result to Java objects for storage. It can also go directly to the database via its SQLTransformer, which executes SQL statements embedded in XML documents.
An alternative, especially if it's a one-off job, would be to build your own pipeline: NekoHTML to parse the HTML, Xalan to run the XSLT, and Commons Digester or another XML-to-beans mapper to build your Java objects, with Ant combining these tools.
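
To make the do-it-yourself route a bit more concrete, here is a minimal sketch of the XSLT and XML-to-beans steps using only the JAXP classes that ship with the JDK (Xalan implements the same javax.xml.transform API). It assumes the HTML has already been turned into well-formed XML, which is the job NekoHTML or JTidy would do, and a plain DOM loop stands in for Digester. The Person bean, the stylesheet, and the table layout are all made up for illustration; your own files will need a different stylesheet.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class HtmlToBeans {

    // Hypothetical bean; a real one would mirror your database schema.
    public static class Person {
        public final String name;
        public final String city;

        Person(String name, String city) {
            this.name = name;
            this.city = city;
        }
    }

    // Hypothetical stylesheet: every table row that has <td> cells becomes
    // a <person> element; header rows (only <th>) are skipped.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'>"
      + "<people><xsl:for-each select='//tr[td]'>"
      + "<person name='{normalize-space(td[1])}'"
      + " city='{normalize-space(td[2])}'/>"
      + "</xsl:for-each></people>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Assumes the input is already well-formed XML without a namespace,
    // i.e. the output of an HTML-cleaning parser, not raw tag soup.
    public static List<Person> extract(String wellFormedHtml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));

        // Transform into a DOM tree so the result can be walked directly.
        DOMResult result = new DOMResult();
        transformer.transform(
            new StreamSource(new StringReader(wellFormedHtml)), result);
        Document doc = (Document) result.getNode();

        // Stand-in for the XML-to-beans mapper: one bean per <person>.
        NodeList nodes = doc.getElementsByTagName("person");
        List<Person> people = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            people.add(new Person(e.getAttribute("name"), e.getAttribute("city")));
        }
        return people;
    }

    public static void main(String[] args) throws Exception {
        String html = "<html><body><table>"
            + "<tr><th>Name</th><th>City</th></tr>"
            + "<tr><td>Alice</td><td> Paris </td></tr>"
            + "<tr><td>Bob</td><td>Oslo</td></tr>"
            + "</table></body></html>";
        for (Person p : extract(html)) {
            System.out.println(p.name + " lives in " + p.city);
        }
    }
}
```

From there the beans can be handed to whatever persistence layer you use. Keeping the site-specific knowledge in the stylesheet means the Java side stays the same when the HTML layout differs from one batch of files to the next.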
-Bertrand