-----Original Message-----
From: Shankar Narayanan.K,ASDC Chennai [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 25, 2002 09:50
To: [EMAIL PROTECTED]
Subject: RE: [iText-questions] Parsing HTML to PDF

Use HTMLDoc. It's free and very good.
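For example, assuming the htmldoc command-line tool is installed and on the PATH, a Java program could call it roughly as sketched below. The class and method names (HtmlDocRunner, buildCommand, convert) are made up for illustration; --webpage and -f are HTMLDOC's options for converting an unstructured page and for naming the output file. This is only a sketch, not a tested integration.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class HtmlDocRunner {
    /** Builds the argument list for: htmldoc --webpage -f <output> <input>. */
    static List<String> buildCommand(String inputHtml, String outputPdf) {
        return Arrays.asList("htmldoc", "--webpage", "-f", outputPdf, inputHtml);
    }

    /** Runs htmldoc and returns its exit code. Assumes htmldoc is on the PATH. */
    static int convert(String inputHtml, String outputPdf)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(inputHtml, outputPdf))
                .inheritIO()   // show htmldoc's own messages on the console
                .start();
        return process.waitFor();
    }
}
```

Calling HtmlDocRunner.convert("oficio.html", "oficio.pdf") would then write the PDF next to the input file (both file names are placeholders).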
-----Original Message-----
From: Glauco Cesar de Castro [SMTP:[EMAIL PROTECTED]]
Sent: Tuesday, June 25, 2002 6:08 PM
To: [EMAIL PROTECTED]
Subject: [iText-questions] Parsing HTML to PDF

Has anyone succeeded in parsing an HTML file to a PDF file? I have tried to use HTML Tidy (JTidy), but it doesn't work. It produces the right (X)HTML, but iText returns an error:
ExceptionConverter: org.xml.sax.SAXException: String index out of range: -30
at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:878)
at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:900)
at com.lowagie.text.html.HtmlParser.go(Unknown Source)
at com.lowagie.text.html.HtmlParser.parse(Unknown Source)
at Start.main(Start.java:90)
String index out of range: -30
This error occurs with the following HTML code and seems to come from the second line, because if I delete this line (<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">), JBuilder returns another error:
ExceptionConverter: org.xml.sax.SAXParseException: The entity "Ccedil" was referenced, but not declared.
at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:969)
at org.apache.xerces.readers.DefaultEntityHandler.startReadingFromEntity(DefaultEntityHandler.java:596)
at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1315)
at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:380)
at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:861)
at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:900)
at com.lowagie.text.html.HtmlParser.go(Unknown Source)
at com.lowagie.text.html.HtmlParser.parse(Unknown Source)
at Start.main(Start.java:90)
The entity "Ccedil" was referenced, but not declared.
And if I replace all & entities, another error occurs:
ExceptionConverter: org.xml.sax.SAXParseException: The element type "p" must be terminated by the matching end-tag "</p>".
at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:969)
at org.apache.xerces.framework.XMLDocumentScanner.reportFatalXMLError(XMLDocumentScanner.java:634)
at org.apache.xerces.framework.XMLDocumentScanner.abortMarkup(XMLDocumentScanner.java:683)
at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1187)
at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:380)
at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:861)
at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:900)
at com.lowagie.text.html.HtmlParser.go(Unknown Source)
at com.lowagie.text.html.HtmlParser.parse(Unknown Source)
at Start.main(Start.java:90)
The element type "p" must be terminated by the matching end-tag "</p>".
Thanks a lot for any help!
Glauco
The HTML code I am trying to parse:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html
xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<p
style="text-align: center"><strong><img
border="0"
height="61"
src="http://localhost:8080/protocolo/docs_upload/modelos/brasao_alagoas.jpg"
width="49" /></strong></p>
<p
style="text-align: center"><strong>SECRETARIA DA FAZENDA DO
ESTADO DE ALAGOAS<br />
PROJETO DE MODERNIZAÇÃO FAZENDÁRIA -
PROMOFAZ.<br />
UNIDADE DE CONTROLE ESTADUAL - UCE<br />
COMPONENTE ORGANIZAÇÃO E
GESTÃO</strong></p>
<p
style="text-align: center"> </p>
<p
style="text-align: center"><strong> OFÍCIO teste2
35/2002
Maceió, 25 de Junho de 2002</strong></p>
<p
style="text-align: center"> </p>
<p
style="text-align: left">Senhor Coordenador,</p>
<p
style="text-align: left"> </p>
<p
style="text-align: left"> Servimo-nos do
presente para submeter a superior consideração de
V.Sa. a solicitação de diárias que nos
encaminha o coordenador Regional da 3° CRAF através
do ofício çlkj lkj lk de klj lkj lk
próximo passado, em anexo.<br />
Sendo só para o momento,
aproveitamos para reiterar nossos protestos da<br />
Mais alta consideração e
apreço.
.</p>
<p
style="text-align: center"><strong>Administrador do
Protocolo<br />
</strong>LÍDER DO COMPONENTE O&G<br />
</p>
<p
style="text-align: left"> <br />
Ao<br />
ILMO. SR.<br />
MARCOS ANTONIO GARCIA<br />
DD.Coordenador Geral da União de
Coordenação Estadual - UCE/AL.<br />
</p>
<p
style="text-align: left">
</p>
<table
border="1"
cellpadding="0"
cellspacing="0"
style="HEIGHT: 81px; WIDTH: 638px">
<tbody>
<tr>
<td>
<p
style="text-align: center"><strong>SEFAZ/
AL</strong></p>
<p
style="text-align: center"><strong>" EXCELÊNCIA
NA GESTÃO FAZENDÁRIA, PROPICIANDO
MELHOR QUALIDADE DE VIDA EM
ALAGOAS"</strong></p>
</td>
</tr>
</tbody>
</table>
<br />
<br />
<br />
</body>
</html>
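
Regarding the 'The entity "Ccedil" was referenced, but not declared' error above: once the DOCTYPE is removed, the XML parser no longer has the XHTML DTD that declares named entities such as &Ccedil;. One possible workaround, sketched here under that assumption, is to rewrite named entities as numeric character references before handing the text to the parser, since numeric references need no DTD. The class name EntityFixer and its small entity table are hypothetical and deliberately incomplete; the five predefined XML entities (&amp;, &lt;, &gt;, &quot;, &apos;) and any unknown names are left untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntityFixer {
    // Illustrative subset of the XHTML Latin-1 entities; extend as needed.
    private static final Map<String, Integer> NAMED = new HashMap<>();
    static {
        NAMED.put("nbsp", 160);
        NAMED.put("deg", 176);
        NAMED.put("Aacute", 193);
        NAMED.put("Atilde", 195);
        NAMED.put("Ccedil", 199);
        NAMED.put("Ecirc", 202);
        NAMED.put("Iacute", 205);
        NAMED.put("aacute", 225);
        NAMED.put("atilde", 227);
        NAMED.put("ccedil", 231);
        NAMED.put("eacute", 233);
        NAMED.put("oacute", 243);
    }

    // Matches named references such as &Ccedil; but not numeric ones like &#199;.
    private static final Pattern ENTITY = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    /** Replaces known named entities with numeric references; leaves the rest as-is. */
    static String toNumericReferences(String xhtml) {
        Matcher m = ENTITY.matcher(xhtml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = NAMED.get(m.group(1));
            String replacement = (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The converted string could then be passed to the parser in place of the original file contents. Whether this also resolves the other two errors is not something the sketch addresses.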