OK, I will answer this myself, because it's quite simple:

tidy.setTidyMark(false);

That's all ;)
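If the tidy mark cannot be switched off for some reason, the generator meta tag could also be stripped from the tidied output before it is handed to iText. The following is only a minimal sketch using the standard library; the class name GeneratorMetaStripper and the method stripGeneratorMeta are made up for illustration, and the pattern assumes the tag looks like the one quoted below (name attribute first, double quotes).

```java
import java.util.regex.Pattern;

public class GeneratorMetaStripper {

    // Assumption: the tag has the form <meta name="generator" ... />,
    // with the name attribute first, as in the Tidy output quoted below.
    private static final Pattern GENERATOR_META = Pattern.compile(
            "<meta\\s+name=\"generator\"[^>]*>\\s*",
            Pattern.CASE_INSENSITIVE);

    // Returns the markup with every generator meta tag (and the
    // whitespace directly after it) removed.
    static String stripGeneratorMeta(String xhtml) {
        return GENERATOR_META.matcher(xhtml).replaceAll("");
    }

    public static void main(String[] args) {
        String tidied = "<head>\n"
                + "<meta name=\"generator\" content=\"HTML Tidy, see www.w3.org\" />\n"
                + "</head>";
        System.out.println(stripGeneratorMeta(tidied));
    }
}
```

A regular expression is enough here only because the tag is generated by Tidy in a fixed form; for arbitrary markup, setTidyMark(false) remains the cleaner route.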

carsten nichte wrote:
I have converted my HTML file to XHTML with Tidy.
It works great, but Tidy inserts a meta tag in the header:

<head>
<meta name="generator" content="HTML Tidy, see www.w3.org" />
</head>


Now iText has problems handling this (see the exception below). When I remove the meta tag, everything works fine. Does anybody have an idea how to remove the meta tag (or the whole head) during parsing?




org.w3c.tidy.Tidy tidy = new org.w3c.tidy.Tidy();
tidy.setXHTML(true);
tidy.setMakeClean(true);
tidy.setDropEmptyParas(true);
tidy.setXmlOut(false);

tidy.setErrout(new PrintWriter(new java.io.FileWriter("d:/output-err.html"), true));
in = new BufferedInputStream(new java.io.FileInputStream("D:/test.html"));
out = new java.io.FileOutputStream("d:/test_tidy.html");
tidy.parse(in, out);
in.close();
out.close();


Document document = new Document();
document.open();
PdfWriter pdfWriter = PdfWriter.getInstance(document,
        new FileOutputStream("d:/RS7.00_tidy.pdf"));

com.lowagie.text.html.HtmlParser.parse(document, "d:/test_tidy.html");
document.close( );


ExceptionConverter: com.lowagie.text.DocumentException: The document is open; you can only add Elements with content.
    at com.lowagie.text.Document.add(Document.java:254)
    at com.lowagie.text.xml.SAXiTextHandler.handleStartingTags(SAXiTextHandler.java:417)
    at com.lowagie.text.html.SAXmyHtmlHandler.startElement(SAXmyHtmlHandler.java:146)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:434)
    at org.apache.xerces.impl.XMLNamespaceBinder.startElement(XMLNamespaceBinder.java:571)
    at org.apache.xerces.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:796)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:752)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1454)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:333)
    at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:529)
    at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:585)
    at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:147)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1148)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:223)
    at com.lowagie.text.html.HtmlParser.go(HtmlParser.java:109)
    at com.lowagie.text.html.HtmlParser.parse(HtmlParser.java:125)
    at limbus.apps.seminareditor.export.PdfExport.<init>(PdfExport.java:97)
    at limbus.apps.seminareditor.export.PdfExport.main(PdfExport.java:231)




_______________________________________________
iText-questions mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/itext-questions
