[iText-questions] Re: Exception while parsing html with meta-tag

carsten nichte Wed, 19 May 2004 23:49:50 -0700

OK, I will answer my own question, because it's quite simple:

tidy.setTidyMark(false);

That's all ;)
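
If switching the mark off in Tidy is not possible (for example, when the XHTML comes from somewhere else), the generator tag could also be stripped from the file before it is handed to iText. This is only a rough sketch using java.util.regex from the JDK; the class name StripGeneratorMeta and the method strip are made up for illustration and are not part of Tidy or iText.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tidy or iText.
class StripGeneratorMeta {
    // Matches a <meta ...> tag whose name attribute is "generator",
    // self-closed (XHTML) or not, plus any whitespace that follows it.
    private static final Pattern GENERATOR_META = Pattern.compile(
            "<meta\\b[^>]*\\bname\\s*=\\s*[\"']generator[\"'][^>]*>\\s*",
            Pattern.CASE_INSENSITIVE);

    // Returns the markup with every generator meta tag removed.
    static String strip(String xhtml) {
        return GENERATOR_META.matcher(xhtml).replaceAll("");
    }

    public static void main(String[] args) {
        String head = "<head>\n"
                + "<meta name=\"generator\" content=\"HTML Tidy, see www.w3.org\" />\n"
                + "</head>";
        System.out.println(strip(head));
    }
}
```

A regex is good enough here because the tag Tidy writes has a fixed, simple shape; for arbitrary markup a real parser would be the safer choice.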

carsten nichte wrote:

I have converted my HTML file to XHTML with Tidy.
It works great, but Tidy inserts a meta tag in the header:
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org" />
</head>
Now iText has problems handling this (see the exception below).
When I remove the meta tag, everything works fine.
Does anybody have an idea how to:
1.) remove the meta tag (or the head) during parsing?
org.w3c.tidy.Tidy tidy = new org.w3c.tidy.Tidy();
tidy.setXHTML(true);
tidy.setMakeClean(true);
tidy.setDropEmptyParas(true);
tidy.setXmlOut(false);
tidy.setErrout(new PrintWriter(new java.io.FileWriter("d:/output-err.html"), true));
in = new BufferedInputStream(new java.io.FileInputStream("D:/test.html"));
out = new java.io.FileOutputStream("d:/test_tidy.html");
tidy.parse(in, out);
in.close();
out.close();
Document document = new Document();
document.open();
PdfWriter pdfWriter =
PdfWriter.getInstance( document,
new FileOutputStream( "d:/RS7.00_tidy.pdf" ) );
com.lowagie.text.html.HtmlParser.parse(document, "d:/test_tidy.html");
document.close( );
ExceptionConverter: com.lowagie.text.DocumentException: The document is open; you can only add Elements with content.
at com.lowagie.text.Document.add(Document.java:254)
at com.lowagie.text.xml.SAXiTextHandler.handleStartingTags(SAXiTextHandler.java:417)
at com.lowagie.text.html.SAXmyHtmlHandler.startElement(SAXmyHtmlHandler.java:146)
at org.apache.xerces.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:434)
at org.apache.xerces.impl.XMLNamespaceBinder.startElement(XMLNamespaceBinder.java:571)
at org.apache.xerces.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:796)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:752)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1454)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:333)
at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:529)
at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:585)
at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:147)
at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1148)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:223)
at com.lowagie.text.html.HtmlParser.go(HtmlParser.java:109)
at com.lowagie.text.html.HtmlParser.parse(HtmlParser.java:125)
at limbus.apps.seminareditor.export.PdfExport.<init>(PdfExport.java:97)
at limbus.apps.seminareditor.export.PdfExport.main(PdfExport.java:231)