OK, I will answer my own question, because it's quite simple:
tidy.setTidyMark(false);
That's all ;)
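
For a file that has already been tidied elsewhere and still carries the generator tag, one possible fallback is to strip the tag from the text before handing it to iText. This is only a rough sketch under that assumption: the class name GeneratorMetaStripper and the method strip are made up for illustration, it uses nothing but java.util.regex, and a regular expression is not a full HTML parser, so it is only meant for the simple tag that tidy writes.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of JTidy or iText.
public class GeneratorMetaStripper {

    // Matches a single <meta ... name="generator" ...> tag (with or without
    // a trailing slash) plus any whitespace that directly follows it.
    private static final Pattern GENERATOR_META = Pattern.compile(
            "<meta\\b[^>]*\\bname\\s*=\\s*[\"']generator[\"'][^>]*>\\s*",
            Pattern.CASE_INSENSITIVE);

    // Returns the markup with every generator meta tag removed;
    // all other meta tags are left untouched.
    public static String strip(String html) {
        return GENERATOR_META.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String tidied = "<head> <meta name=\"generator\" "
                + "content=\"HTML Tidy, see www.w3.org\" /> </head>";
        System.out.println(strip(tidied));
    }
}
```

Where the tidy configuration is under your control, setTidyMark(false) remains the simpler route, because the tag is then never written in the first place.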
carsten nichte wrote:
I have converted my HTML file to XHTML with tidy. It works great, but tidy inserts a meta tag in the header:
<head> <meta name="generator" content="HTML Tidy, see www.w3.org" /> </head>
Now iText has problems handling this (see the exception below). When I remove the meta tag, everything works fine. Does anybody have an idea how to: 1.) remove the meta tag (or the whole head) during parsing?
org.w3c.tidy.Tidy tidy = new org.w3c.tidy.Tidy();
tidy.setXHTML(true);
tidy.setMakeClean(true);
tidy.setDropEmptyParas(true);
tidy.setXmlOut(false);
tidy.setErrout(new PrintWriter(new java.io.FileWriter("d:/output-err.html"), true));
in = new BufferedInputStream(new java.io.FileInputStream("D:/test.html"));
out = new java.io.FileOutputStream("d:/test_tidy.html");
tidy.parse(in, out);
in.close();
out.close();
Document document = new Document();
document.open();
PdfWriter pdfWriter = PdfWriter.getInstance(document, new FileOutputStream("d:/RS7.00_tidy.pdf"));
com.lowagie.text.html.HtmlParser.parse(document, "d:/test_tidy.html");
document.close();
ExceptionConverter: com.lowagie.text.DocumentException: The document is open; you can only add Elements with content.
at com.lowagie.text.Document.add(Document.java:254)
at com.lowagie.text.xml.SAXiTextHandler.handleStartingTags(SAXiTextHandler.java:417)
at com.lowagie.text.html.SAXmyHtmlHandler.startElement(SAXmyHtmlHandler.java:146)
at org.apache.xerces.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:434)
at org.apache.xerces.impl.XMLNamespaceBinder.startElement(XMLNamespaceBinder.java:571)
at org.apache.xerces.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:796)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:752)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1454)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:333)
at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:529)
at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:585)
at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:147)
at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1148)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:223)
at com.lowagie.text.html.HtmlParser.go(HtmlParser.java:109)
at com.lowagie.text.html.HtmlParser.parse(HtmlParser.java:125)
at limbus.apps.seminareditor.export.PdfExport.<init>(PdfExport.java:97)
at limbus.apps.seminareditor.export.PdfExport.main(PdfExport.java:231)
_______________________________________________
iText-questions mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/itext-questions
