On Aug 22, 2011, at 11:52 PM, Jeremias Maerki wrote:

> Some hopefully useful comments inline below... (a little late since I've
> only just joined the mailing list now)

Not late at all. Some of us are just starting with the ODF Toolkit.

Yegor and I have done quite a bit of work in Apache POI with PPT / PPTX
rendering. Our main use is from PS. We comment our PS with layout
information. It is not a general format; it comes from a proprietary
layout system that I have developed and maintained for over 30 years,
originally inspired by references to TeX, Metafont, XICS, and Interpress
(Knuth on one side and Warnock on the other).

At a quick glance there is a good fit with the Intermediate Format.

I'd like to check out the Knuth line breaking algorithm. This is
something critical that MS has never understood. For example, take
centering a title that is just a little too wide for the available
width. A Knuth-based scoring system would split it near the middle,
whereas PPT breaks at the first spot that works.

I'll introduce myself more fully later.

Regards,
Dave

>
> On 16.08.2011 03:14:05 Biao Han wrote:
>>
>> FYI. About an ODF->PDF converter contribution.
>>
>> Regards
>>
>> Biao Han (Devin)
>> SOA Standards Growth, Emerging Technology Institute(ETI), IBM China
>> Software Development Laboratory
>> Tel:(86-10)82450541
>> Email: [email protected]
>> Address: 3/F Ring Building, No.28 Building, Zhong Guan Cun Software Park,
>> No. 8 Dong Bei Wang West Road, ShangDi, Haidian District, Beijing,
>> P.R.C.100193
>> ----- Forwarded by Biao Han/China/IBM on 2011-08-16 09:13 -----
>>
>> From: Biao Han/China/IBM
>> To: Angelo zerr <[email protected]>
>> Cc: [email protected], [email protected]
>> Date: 2011-08-12 17:41
>> Subject: Re: [odfdom-dev] Status of the Simple Java API for ODF and
>> ODFDOM - 08/10/2011
>>
>>
>> Angelo zerr <[email protected]> wrote on 2011-08-12 17:26:24:
>>
>>> From: Angelo zerr <[email protected]>
>>> To: Biao Han/China/IBM@IBMCN
>>> Cc: [email protected], [email protected],
>>> [email protected], [email protected]
>>> Date: 2011-08-12 17:27
>>> Subject: Re: [odfdom-dev] Status of the Simple Java API for ODF and
>>> ODFDOM - 08/10/2011
>>>
>>> Hi Biao,
>>>
>>> Thanks for your intention to contribute. But we found that iText uses
>>> the AGPL license: http://itextpdf.com/terms-of-use/index.php
>>> So it would be difficult to use that in an Apache 2.0 licensed project.
>>>
>>> Yes, I thought so. That's a real shame :-(
>>>
>>>
>>> Do you have a plan to supply a version using PDFBox or FOP? Both of
>>> them will be OK for the Apache 2.0 license.
>>> And as far as I know, PDFBox may be easier. Its API is similar to
>>> iText's.
>>>
>>> XDocReport uses iText because the ODT->PDF conversion works like this:
>>>
>>> 1) load ODT with ODFDOM
>>> 2) visit ODFDOM and generate an iText PDF structure per ODFDOM structure
>>>
>>> The problem with FOP is that you must have XSL-FO to generate PDF. I
>>> have tried to do that (without ODFDOM) with XSL-FO, but performance
>>> is very bad (even with an XSLT cache and xsl:key to cache the
>>> computation of styles...). Perhaps it's possible to use FOP with pure
>>> Java (without XSL-FO) but I have not found samples.
>
> Generating XSL-FO from ODF is certainly something that has some benefit
> on its own. But I wouldn't recommend using XSL-FO when the goal is to
> convert ODF to PDF.
>
> But something else (as you suspected): Apache FOP has its own PDF
> library which is highly optimized for writing PDFs with very little
> memory consumption. It is also rather fast. And you don't need to use
> XSL-FO. In contrast to Apache PDFBox, Apache FOP has a Graphics2D/Java2D
> implementation (PDFGraphics2D and PDFDocumentGraphics2D) that can make
> generating PDFs easier. The downside is that the PDF library itself is
> not separately documented and you'd have to look into FOP's source code
> for hints on how to use it. I can help with pointers if desired. For now,
> I can recommend looking at PDFDocumentGraphics2D for hints on how to
> create a PDF document with FOP's PDF library from scratch:
> http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/svg/PDFDocumentGraphics2D.java?view=markup
>
> And then I'd like to point out another possible direction, one that
> would allow you to generate not only PDF but actually all of FOP's
> supported output formats (PDF, PostScript, AFP, PNG/TIFF, PCL, SVG
> etc.). FOP has this so-called Intermediate Format (IF) which is a
> low-level representation of a set of rendered pages (i.e. after layout!).
> The set of instructions is relatively simple and can be produced via
> Java calls or an XML stream. I've attached a sample file that shows the
> IF format in XML representation. Some information on the format is found
> here:
> http://xmlgraphics.apache.org/fop/1.0/intermediate.html#usage-if
>
> The obvious advantage of using XSL-FO is that you don't have to write
> your own layout engine for line breaking and stuff. But ODF also doesn't
> map 1:1 to XSL-FO (page headers and footers work differently, for
> example). The underlying concepts simply don't match.
>
> With Graphics2D or FOP's IF format, you'll need at least some kind of
> basic layout engine to do line and page breaking, footnote handling etc.
> I don't know how much of that iText took off your hands. And I don't
> know if PDFBox could match iText in the layout department. And FOP's
> layout engine is too FO-oriented to be of much use, except maybe for the
> basic implementation of the Knuth line breaking algorithm, which is
> abstracted to a reasonable level.
>
> Just pointing out possible routes. Obviously, you'll have to decide
> which one fits best.
>
>>> I have never used PDFBox. Do you think this library handles the
>>> same things as iText (tables, table rows, images, widgets...) and with
>>> the same performance? I must study it to see if it's possible to
>>> implement a new converter with PDFBox.
>> Please refer to the cookbook
>> http://pdfbox.apache.org/userguide/cookbook.html
>> Or we can request help from their user mailing list
>> http://pdfbox.apache.org/mail-lists.html#users
>>>
>>> Thanks a lot for your information,
>>>
>>> Regards Angelo
>>>
>>> In any case, thank you for your eagerness to contribute!
>>>
>>>
>>>>
>>>> Regards Angelo
>>>>
>>>
>>>> 2011/8/10 Biao Han <[email protected]>
>>>> (We should send this to the project mailing list, but we don't have
>>>> one yet, so sorry for interrupting those on the incubator general
>>>> mailing list)
>>>>
>>>> ODF Toolkit move to Apache
>>>> 1. An SVN account has been created and is now available for use. We
>>>> will discuss and start the code move after the mailing lists are ready;
>>>> 2. The first board meeting is scheduled for Wed, 17 August 2011, 10
>>>> am Pacific. We have submitted a quarterly board report to here.
>>>> 3. As we are now an Apache incubator project, we will discuss
>>>> and release ODF Toolkit in the new community. The original release
>>>> plan has to be cancelled.
>>>>
>>>> Simple ODF
>>>> 1. Reviewed and pushed a bug fix about TextProperties (#bug 357).
>>>> 2. Reviewed and pushed three unit test coverage enhancement patches
>>>> (#bug 241).
>>>> 3. Simple ODF 0.6.5 has reached 204 downloads. This number
>>>> equals that of Simple ODF 0.4. But version 0.4 took more than 6
>>>> months to get there, while version 0.6.5 took only 40 days.
>>>>
>>>> ODFDOM
>>>> 1. Working on data signature. Two issues caused by OpenOffice
>>>> block the process.
>>>> (1) OpenOffice.org generates a namespace-unaware signature document.
>>>> ODFDOM fails to load it.
>>>> (2) OpenOffice.org creates multiple X509Certificates instead of the
>>>> correct certification chain under ds:KeyInfo.
>>>> see also:
>>>> https://bugs.freedesktop.org/show_bug.cgi?id=39657 (ds namespace in
>>>> LibreOffice)
>>>> http://openoffice.org/bugzilla/show_bug.cgi?id=107864 (ds namespace in
>>>> OOo)
>>>> http://openoffice.org/bugzilla/show_bug.cgi?id=66276 (multiple
>>>> X509Certificate in OOo)
>>>> http://openoffice.org/bugzilla/show_bug.cgi?id=108286
>>>> We have to supply two modes to fix it. One follows the ODF
>>>> specification, the other follows OpenOffice. The question is which
>>>> should be the default?
>>>> 2. A new user: XDocReport uses ODFDOM to load and manipulate ODF
>>>> documents. It's a Java API to merge an XML document created with MS
>>>> Office (docx), OpenOffice (odt) or LibreOffice (odt) with a Java model
>>>> to generate a report and, if needed, convert it to another format
>>>> (PDF, XHTML...).
>>>> Regards
>>>>
>>>> Biao Han (Devin)
>>>> SOA Standards Growth, Emerging Technology Institute(ETI), IBM China
>>>> Software Development Laboratory
>>>> Tel:(86-10)82450541
>>>> Email: [email protected]
>>>> Address: 3/F Ring Building, No.28 Building, Zhong Guan Cun Software
>>>> Park, No. 8 Dong Bei Wang West Road, ShangDi, Haidian District,
>>>> Beijing, P.R.C.100193
>
>
> Jeremias Maerki
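
To make the title-centering example from the reply at the top of this message concrete, here is a minimal sketch comparing the two strategies. It is not FOP's implementation and not the full Knuth algorithm: it assumes widths can be measured in characters, ignores stretch/shrink and hyphenation, and scores a layout by the sum of squared unused width per line. The function names greedy_break and balanced_break are made up for this illustration.

```python
def greedy_break(words, max_width):
    """First-fit breaking: keep adding words until the next one overflows."""
    lines, current, width = [], [], 0
    for word in words:
        needed = len(word) if not current else width + 1 + len(word)
        if current and needed > max_width:
            lines.append(" ".join(current))
            current, needed = [], len(word)
        current.append(word)
        width = needed
    if current:
        lines.append(" ".join(current))
    return lines


def balanced_break(words, max_width):
    """Scored breaking: minimise the sum of squared unused width per line.

    Simplified stand-in for a Knuth-style total-fit score (assumption:
    character counts as widths, no glue, no hyphenation).
    """
    n = len(words)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: lowest score for laying out words[i:]
    nxt = [n] * (n + 1)      # nxt[i]: index of the first word on the next line
    best[n] = 0
    for i in range(n - 1, -1, -1):
        width = -1           # cancels the space counted before the first word
        for j in range(i, n):
            width += len(words[j]) + 1
            if width > max_width:
                break
            score = (max_width - width) ** 2 + best[j + 1]
            if score < best[i]:
                best[i], nxt[i] = score, j + 1
    if best[0] == inf:
        raise ValueError("a word is wider than max_width")
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


if __name__ == "__main__":
    title = "Status of the Simple Java API for ODF Toolkit".split()
    for breaker in (greedy_break, balanced_break):
        print(breaker.__name__)
        for line in breaker(title, 40):
            print(line.center(40))
```

With a width of 40 characters, the greedy version leaves "Toolkit" alone on the second line, while the scored version splits after "Simple", which is much closer to what you want under a centered title.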
