Content pasted from Word ( /me shivers ). Word isn't the best tool for
producing HTML :)
So it's Word that emits the <font> tag. Maybe Word can be configured to
output pt instead of px and in, to produce fixed-width tables, and
not to nest tables, but I doubt it.
I think your best option is to implement a TagProcessor for font, or to add
font to the TagFactory so that it maps to an existing processor.
You can't use the default settings for that. See
http://demo.itextsupport.com/xmlworker/itextdoc/flatsite.html#itextdoc-menu-7
for an example of how to set up the whole thing yourself. (This
documentation is still a work in progress, but it is certainly good enough to
get you started.)
Tags.getHtmlTagProcessorFactory()
gives you a DefaultTagProcessorFactory. There you can map font to
http://api.itextpdf.com/xml/com/itextpdf/tool/xml/html/Span.html
thanks to
http://api.itextpdf.com/xml/com/itextpdf/tool/xml/html/DefaultTagProcessorFactory.html#addProcessor(java.lang.String,%20com.itextpdf.tool.xml.html.TagProcessor)
This way the text in the font tag will end up in the PDF and be handled like a
span tag. Bear in mind that the attributes of the font tag won't be taken
into account; the same goes for some of the attributes of the table
tag. (Check the CSS Support section of the documentation to be sure.)
You could write your own TagProcessor and add your created Chunk or
Paragraph to the ProcessObject.
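
Since mapping font to Span drops the font attributes, one possible workaround is to rewrite the markup before it reaches the XMLWorker. Note that this is a different route from the TagProcessor approach above, not something XMLWorker provides: the class name FontTagRewriter and its methods are made up for this sketch, it uses only java.util.regex, and it assumes that the color and font-family CSS properties on a span are honoured (check the CSS Support section). The size attribute is ignored here, and a regex is only a rough tool for messy Word HTML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText or XMLWorker.
class FontTagRewriter {
    // An opening <font ...> tag; group 1 is the raw attribute text.
    private static final Pattern OPEN_FONT =
            Pattern.compile("<font\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    private static final Pattern CLOSE_FONT =
            Pattern.compile("</font\\s*>", Pattern.CASE_INSENSITIVE);
    // name="value" or name='value' pairs inside the attribute text.
    private static final Pattern ATTR =
            Pattern.compile("(\\w+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Replaces every <font> element with a <span> carrying inline CSS.
    static String rewrite(String html) {
        Matcher m = OPEN_FONT.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String style = toStyle(m.group(1));
            String span = style.isEmpty()
                    ? "<span>"
                    : "<span style=\"" + style + "\">";
            m.appendReplacement(out, Matcher.quoteReplacement(span));
        }
        m.appendTail(out);
        return CLOSE_FONT.matcher(out).replaceAll("</span>");
    }

    // Translates the color and face attributes into CSS declarations.
    static String toStyle(String attrs) {
        StringBuilder css = new StringBuilder();
        Matcher a = ATTR.matcher(attrs);
        while (a.find()) {
            String name = a.group(1).toLowerCase();
            String value = a.group(3) != null ? a.group(3) : a.group(4);
            if (name.equals("color")) {
                css.append("color: ").append(value).append(";");
            } else if (name.equals("face")) {
                css.append("font-family: ").append(value).append(";");
            }
        }
        return css.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite("<p><font color=\"red\" face=\"Arial\">Test</font></p>"));
    }
}
```

The rewritten string can then be wrapped in a java.io.StringReader and handed to the parser in place of the FileReader.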
Normal output from CKEditor should work. In the demo we used TinyMCE,
but we also noticed that HTML pasted from Word usually does not end up the
way you would want.
The initial intention of XMLWorker was not to transform Word
documents (even when exported to HTML). We didn't think of Word HTML as a
good HTML reference :) It's usually not even valid HTML.
We based ourselves on the W3C XHTML spec and a bit of HTML5, and that of
course resulted in a framework that is more restrictive than web browsers.
Perhaps you can fix some of these things by adding your own CSS file
in which you set certain CSS values yourself, provided they are not
overridden by CSS properties added later.
I hope I gave you some helpful directions and ideas.
Kind Regards
Balder
On 15/10/2011 16:04, Mark Ramos wrote:
Thanks again Balder. The one challenge we have is that all of this HTML
comes from CKEditor, which we use to create content in Liferay. One of
the scenarios is pasting content from a Word document directly into
CKEditor. Once the content is rendered as HTML, we have to export it to
PDF. We also tried flying-saucer, which also uses iText 2.0.8, but there
are items that are not rendered properly there either. I appreciate you
giving your time to us.
Thank you very much!
Mark
On Sat, Oct 15, 2011 at 9:51 PM, Balder VC <li...@redlab.be
<mailto:li...@redlab.be>> wrote:
Hi,
A PDF is not a browser. While creating your HTML you should
bear in mind that the end result will be a PDF.
A couple of tips:
It's better to write measurements in points (pt). Then no conversion
is done by the XMLWorker.
It's a good idea to check the supported tags (in the documentation,
or inside com.itextpdf.tool.xml.html.Tags, where the defaults are
listed).
The <font> tag, used in the HTML file, is not supported; that is
why, among other things, the 'Test' text is missing. You can easily
write a TagProcessor that supports the font tag as you see fit.
If I'm correct, it is better to define a width for your tables;
then the XMLWorker does not have to try to fit the text into them.
Nesting tables is possible, but it makes it harder for XMLWorker
to fit tables on the page.
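
For the first tip, lengths can be rewritten to points before parsing. A minimal sketch, assuming the CSS reference ratios (1in = 72pt, and 1px = 0.75pt at 96px per inch); the conversion XMLWorker applies internally may differ. The class name CssUnitsToPoints and its methods are made up for this example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText or XMLWorker.
class CssUnitsToPoints {
    // A number followed directly by px or in, e.g. 400px or 0.5in.
    private static final Pattern LENGTH =
            Pattern.compile("(\\d+(?:\\.\\d+)?)(px|in)\\b");

    // Rewrites every px or in length in the given CSS text to pt.
    static String toPoints(String css) {
        Matcher m = LENGTH.matcher(css);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            double value = Double.parseDouble(m.group(1));
            // Assumed ratios: 96px per inch, 72pt per inch.
            double pt = m.group(2).equals("px") ? value * 0.75 : value * 72.0;
            m.appendReplacement(out, format(pt) + "pt");
        }
        m.appendTail(out);
        return out.toString();
    }

    // Whole numbers print without decimals, everything else with two.
    static String format(double pt) {
        if (pt == Math.rint(pt)) {
            return String.valueOf((long) pt);
        }
        return String.format(Locale.ROOT, "%.2f", pt);
    }

    public static void main(String[] args) {
        System.out.println(toPoints("width: 400px; margin: 0.5in"));
    }
}
```

Under those assumptions, "width: 400px; margin: 0.5in" becomes "width: 300pt; margin: 36pt".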
Regards
Balder
On 14/10/2011 8:33, Mark Ramos wrote:
Hi,
Thanks for the links Balder.
I tried to render the enclosed html file to pdf and I did not get
a good result. Please check the attachments.
I used this code snippet:
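import java.io.FileOutputStream;
import java.io.FileReader;
import com.itextpdf.text.Document;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;

// Create a letter-sized PDF document and open it for writing.
Document document = new Document(PageSize.LETTER);
PdfWriter instance = PdfWriter.getInstance(document,
        new FileOutputStream("/home/mramos/html3.pdf"));
document.open();
// Parse the HTML file with the default XMLWorker settings.
FileReader br = new FileReader("/home/mramos/pdf_cfadmin2.html");
XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
worker.parseXHtml(instance, document, br);
document.close();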
Any help is much appreciated.
Many thanks!
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions