Re: [iText-questions] HTML to PDF with XMLWorker

Mark Ramos Sat, 15 Oct 2011 10:33:16 -0700

Hi Balder,

I most certainly agree with that. It was not a scenario we had expected.
:) Thanks for all the info and guidance you've provided. I'll look into it and
see what I can come up with. I'll keep in touch once I have an update.
Your help is very much appreciated!



Sincerely,

Mark

On Sun, Oct 16, 2011 at 1:03 AM, Balder VC <li...@redlab.be> wrote:

> Pasted content from Word ( /me shivers ). Word isn't the best tool for
> producing HTML :)
> So it's Word that emits the <font> tag. Maybe Word can be configured to
> output pt instead of px and in, to produce fixed-width tables, and not to
> nest tables. But I have my doubts.
>
> I think your best option is to implement a TagProcessor for font, or add it
> to the TagProcessorFactory so that it maps to something existing.
> You can't use the default settings for that. See
> http://demo.itextsupport.com/xmlworker/itextdoc/flatsite.html#itextdoc-menu-7
> for an example of how to set up the whole thing yourself. (This documentation
> is still a work in progress, but certainly good enough to get you started.)
>
> From
>
> Tags.getHtmlTagProcessorFactory()
>
>
> you get a DefaultTagProcessorFactory. There you can map font to Span
> (http://api.itextpdf.com/xml/com/itextpdf/tool/xml/html/Span.html) thanks
> to addProcessor:
> http://api.itextpdf.com/xml/com/itextpdf/tool/xml/html/DefaultTagProcessorFactory.html#addProcessor(java.lang.String,%20com.itextpdf.tool.xml.html.TagProcessor)
> This way the text in the font tag will appear in the PDF and be handled like
> a span tag. Bear in mind that the attributes of the font tag won't be taken
> into account; the same goes for some of the attributes of the table tag.
> (Check the CSS Support section of the documentation to be sure.)
> You could also write your own TagProcessor and add the Chunk or
> Paragraph you create to the ProcessObject.
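
A possible alternative to registering a TagProcessor, sketched here only as an illustration and not as what Balder describes: apply the same font-to-span mapping to the HTML text before it reaches XMLWorker. The class and method names (FontTagRewriter, rewrite) are hypothetical. The sketch assumes double-quoted attributes, carries over only color and face, and assumes the resulting color and font-family CSS properties are handled for span, which should be checked against the CSS Support section of the documentation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites <font> elements to <span> elements in an HTML string.
final class FontTagRewriter {
    private static final Pattern OPEN =
            Pattern.compile("<font\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    private static final Pattern CLOSE =
            Pattern.compile("</font\\s*>", Pattern.CASE_INSENSITIVE);
    // Assumption: attribute values are double-quoted, as in well-formed XHTML.
    private static final Pattern ATTR =
            Pattern.compile("(\\w+)\\s*=\\s*\"([^\"]*)\"");

    public static String rewrite(String html) {
        Matcher open = OPEN.matcher(html);
        StringBuffer out = new StringBuffer();
        while (open.find()) {
            // Translate the font attributes we recognise into inline CSS.
            StringBuilder style = new StringBuilder();
            Matcher attr = ATTR.matcher(open.group(1));
            while (attr.find()) {
                String name = attr.group(1).toLowerCase();
                if (name.equals("color")) {
                    style.append("color:").append(attr.group(2)).append(';');
                } else if (name.equals("face")) {
                    style.append("font-family:").append(attr.group(2)).append(';');
                }
                // Other attributes (for example size) are dropped in this sketch.
            }
            String span = style.length() == 0
                    ? "<span>"
                    : "<span style=\"" + style + "\">";
            open.appendReplacement(out, Matcher.quoteReplacement(span));
        }
        open.appendTail(out);
        // Closing tags carry no attributes, so a plain replacement is enough.
        return CLOSE.matcher(out.toString()).replaceAll("</span>");
    }
}
```

The rewritten string could then be wrapped in a java.io.StringReader and passed to parseXHtml in place of the FileReader. A regex pass like this is fragile on malformed markup, so it is only a stopgap next to a proper TagProcessor.
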
>
> Normal output from CKEditor should work. In the demo we used TinyMCE, and
> we also noticed that HTML pasted from Word usually did not come out the way
> you would want.
> XMLWorker was not originally intended for transforming Word documents
> (even when exported to HTML). We didn't consider Word HTML a good HTML
> reference :) It's usually not even valid HTML.
>
> We based ourselves on the W3C spec for XHTML and a bit of HTML5, and that of
> course resulted in a framework more restrictive than web browsers.
>
> Perhaps you can fix some of these things by adding your own CSS file
> in which you set certain CSS values yourself, provided they are not
> overridden by CSS properties added later.
>
>
> I hope I gave you some helpful directions and ideas
>
>
> Kind Regards
> Balder
>
>
> On 15/10/2011 16:04, Mark Ramos wrote:
>
> Thanks again, Balder. The only challenge we have is that all of this HTML
> comes from a CKEditor instance used to create content in Liferay. One of the
> use cases with CKEditor is pasting content from a Word document directly
> into the editor. Once the content is rendered as HTML, we have to export it
> to PDF. We also tried flying-saucer, which uses iText 2.0.8, but some items
> are not rendered properly there either. I appreciate you giving us your
> time.
>
>  Thank you very much!
>
>  Mark
>
> On Sat, Oct 15, 2011 at 9:51 PM, Balder VC <li...@redlab.be> wrote:
>
>>  Hi,
>>
>> A PDF is not a browser. While creating your HTML you should bear in
>> mind that the end result will be a PDF.
>>
>> A couple of tips:
>> - It's better to write measures in points (pt). Then no conversion is done
>>   by the XMLWorker.
>> - It's a good idea to check the supported tags (in the documentation, or
>>   in com.itextpdf.tool.xml.html.Tags, where the defaults are listed).
>> - The <font> tag used in the HTML file is not supported, which is why the
>>   'Test' text, among other things, is missing. You can easily write a
>>   TagProcessor that supports the font tag as you see fit.
>> - If I'm correct, it is better to define a width for your tables; then the
>>   XMLWorker does not have to try to fit the text in them. Nesting tables is
>>   possible, but it makes it harder for XMLWorker to fit tables on the page.
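
To illustrate the tip about points: a minimal sketch of converting px and in lengths to pt ahead of time. It assumes the CSS definitions of the absolute units (1in = 72pt and 1in = 96px, so 1px = 0.75pt); XMLWorker's own conversion may use different factors. The names CssLengthToPoints and toPoints are hypothetical.

```java
// Hypothetical helper: converts a CSS length such as "16px" or "1in" to points.
final class CssLengthToPoints {
    private static final double POINTS_PER_INCH = 72.0;
    // Assumption: CSS reference pixel, 96px per inch, hence 0.75pt per px.
    private static final double POINTS_PER_PIXEL = 0.75;

    public static double toPoints(String length) {
        String s = length.trim().toLowerCase();
        if (s.endsWith("pt")) {
            return number(s);
        }
        if (s.endsWith("px")) {
            return number(s) * POINTS_PER_PIXEL;
        }
        if (s.endsWith("in")) {
            return number(s) * POINTS_PER_INCH;
        }
        throw new IllegalArgumentException("Unsupported unit in: " + length);
    }

    // Strips the two-character unit suffix and parses the numeric part.
    private static double number(String s) {
        return Double.parseDouble(s.substring(0, s.length() - 2).trim());
    }
}
```

For example, a width of 16px becomes 12pt and a margin of 1in becomes 72pt under these assumptions.
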
>>
>>
>>
>> Regards
>> Balder
>>
>>
>> On 14/10/2011 8:33, Mark Ramos wrote:
>>
>> Hi,
>>
>> Thanks for the links Balder.
>>
>> I tried to render the enclosed html file to pdf and I did not get a good
>> result. Please check the attachments.
>>
>> I used this code snippet:
>>
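>>         import com.itextpdf.text.Document;
>>         import com.itextpdf.text.PageSize;
>>         import com.itextpdf.text.pdf.PdfWriter;
>>         import com.itextpdf.tool.xml.XMLWorkerHelper;
>>         import java.io.FileOutputStream;
>>         import java.io.FileReader;
>>
>>         // Create the target PDF, then parse the XHTML file into it.
>>         Document document = new Document(PageSize.LETTER);
>>         PdfWriter writer = PdfWriter.getInstance(document,
>>                 new FileOutputStream("/home/mramos/html3.pdf"));
>>         document.open();
>>         FileReader reader = new FileReader("/home/mramos/pdf_cfadmin2.html");
>>         XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
>>         worker.parseXHtml(writer, document, reader);
>>         document.close();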
>>
>> Any help is much appreciated.
>>
>>
>> Many thanks!
>>
>>
>> -
>>
>>
> --
> twitter <http://twitter.com/redlabbe>
> redlab-log <http://www.redlab.be/blog/>
>
>
> ------------------------------------------------------------------------------
> All the data continuously generated in your IT infrastructure contains a
> definitive record of customers, application performance, security
> threats, fraudulent activity and more. Splunk takes this data and makes
> sense of it. Business sense. IT sense. Common sense.
> http://p.sf.net/sfu/splunk-d2d-oct
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> iText(R) is a registered trademark of 1T3XT BVBA.
> Many questions posted to this list can (and will) be answered with a
> reference to the iText book: http://www.itextpdf.com/book/
> Please check the keywords list before you ask for examples:
> http://itextpdf.com/themes/keywords.php
>



-- 
Mark A. Ramos
Senior Software Engineer
Novare Technologies Inc.

