Re: Replace itext with PDFBox for XDocReport project?

Kévin Sailly Fri, 14 Oct 2011 08:32:23 -0700

Nothing at the moment; I was working on the ability to justify text on a
page.
I am trying to apply that style in PDPageContentStream, but it could be
better suited to something like a "PDParagraphContentStream", as a style for
PDFBoxParagraph.
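
A minimal sketch of the arithmetic behind justified text, assuming a
hypothetical width function in place of real font metrics (with PDFBox this
could be the font.getStringWidth(text) * fontSize / 1000f computation used in
the code quoted below). The names JustifySketch and extraSpacePerGap are
illustrative only and are not part of PDFBox or XDocReport.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Sketch: extra space each inter-word gap needs so a line fills the target width. */
final class JustifySketch {

    /**
     * Returns the extra space (same units as the widths) to add to every
     * inter-word gap so that the words span exactly lineWidth.
     * Returns 0 for a single word or when the words already overflow the line.
     */
    static double extraSpacePerGap(List<String> words, double lineWidth,
                                   double spaceWidth, ToDoubleFunction<String> widthOf) {
        int gaps = words.size() - 1;
        if (gaps <= 0) {
            return 0.0;
        }
        // Natural width: all words plus one normal space per gap.
        double natural = spaceWidth * gaps;
        for (String word : words) {
            natural += widthOf.applyAsDouble(word);
        }
        double slack = lineWidth - natural;
        return slack > 0 ? slack / gaps : 0.0;
    }

    public static void main(String[] args) {
        // Assumption for the example: every character is 5 units wide.
        ToDoubleFunction<String> widthOf = s -> s.length() * 5.0;
        double extra = extraSpacePerGap(
                Arrays.asList("AAAA", "BBBB", "CC"), 100.0, 5.0, widthOf);
        System.out.println(extra); // prints 20.0
    }
}
```

The last line of a paragraph is normally left unjustified, so a caller would
skip this computation for it.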


I will try to finish this development this week if I find enough time.



2011/10/14 Angelo zerr <[email protected]>

> That's cool.
>
> OK Kevin, I will tell you when I commit my work.
> Have you developed anything yet? If so, which widgets have you
> handled?
> Table, Paragraph, etc.
>
> In my case I have developed the paragraph case (which seems simple but is
> complex for me, a newbie with PDFBox) because x and y must be computed every
> time.
>
> Regards Angelo
>
> 2011/10/14 Kévin Sailly <[email protected]>
>
> > It sounds good...
> >
> >
> >
> > 2011/10/14 Angelo zerr <[email protected]>
> >
> > > Ok I understand. I think we have the same goal :
> > >
> > > 1) My goal: docx -> PDF.
> > > 2) Your goal : XML -> PDF.
> > >
> > > My idea is to provide a Java DOM-like API for PDFBox (when I say DOM, it's
> > > not the W3C DOM). So you can use it with plain Java code like this:
> > >
> > >
> > >
> > > ------------------------------------------------------
> > > PDFBoxDocument document = new PDFBoxDocument();
> > > PDFBoxParagraph paragraph = document.addParagraph();
> > > paragraph.addRun("AAAA");
> > > paragraph.addRun(" ");
> > > paragraph.addRun("BBB");
> > >
> > >
> > > ------------------------------------------------------
> > >
> > > For my case (docx) I will load the docx with POI to get a Java structure,
> > > and I will loop over those structures to create PDFBoxDocument and
> > > PDFBoxParagraph instances.
> > >
> > > For your case (HTML) you could implement a SAX handler like this:
> > >
> > >
> > >
> > > ------------------------------------------------------
> > > public void startElement(String uri, String localName, String name,
> > > Attributes atts) throws SAXException {
> > > if ("p".equals(localName)) {
> > > PDFBoxParagraph paragraph = document.addParagraph();
> > > }
> > > }
> > >
> > >
> > > ------------------------------------------------------
> > >
> > > docx and HTML (your sample) both handle styles too, so I would like to
> > > provide a style API like this:
> > >
> > >
> > >
> > > ------------------------------------------------------
> > > ParagraphStyle style=new ParagraphStyle();
> > > style.setMargin(10);
> > > PDFBoxParagraph paragraph = document.addParagraph();
> > > paragraph.applyStyle(style);
> > >
> > >
> > > ------------------------------------------------------
> > >
> > > ParagraphStyle could be populated with CSS or declared styles.
> > >
> > > What do you think?
> > >
> > > Regards Angelo
> > >
> > > 2011/10/14 Kévin Sailly <[email protected]>
> > >
> > > > Angelo,
> > > >
> > > > My goal is to get some text like:
> > > > <p>some text</p>
> > > > <p style="margin-left: 30px;">this one positionned!</p>
> > > >
> > > > And then produce the PDF with style applied:
> > > > some text
> > > >      this one positionned!
> > > >
> > > > Regards,
> > > > Kévin
> > > >
> > > >
> > > >
> > > >
> > > > 2011/10/14 Angelo zerr <[email protected]>
> > > >
> > > > > Hi Srinivaas,
> > > > >
> > > > > First, my DOM-like PDFBox API can be used without docx (I must handle
> > > > > odt too), and my idea is to use it for other applications as well.
> > > > > I have already implemented your idea of converting docx to PDF with FOP,
> > > > > but I don't like it:
> > > > >
> > > > > 1) The FOP converter is much slower than the iText converter. The
> > > > > explanation is simple:
> > > > > => FOP process :  docx -> XSLT -> XSL-FO -> PDF
> > > > > => iText process : docx -> (POI to get Java model) -> iText
> > > > >
> > > > > As you can see, the FOP process has more steps than the iText process.
> > > > > The FOP process for docx performs worse than the iText process, even
> > > > > though I have optimized my XSL a lot (the XSL template is cached, I'm
> > > > > using xsl:key, etc.).
> > > > >
> > > > > 2) With the FOP converter you manage the conversion with XSL. With the
> > > > > iText converter you manage the conversion with Java.
> > > > > IMHO, I prefer developing in Java rather than XSL. Debugging XSL is very
> > > > > hard compared to debugging Java code.
> > > > > Moreover, docx uses styles.xml, where style A can extend style B. With
> > > > > XSL I compute the style every time (how would one manage a cache with
> > > > > XSL?), whereas with Java I compute it once.
> > > > >
> > > > > Our iText converter works great, but because of the license problem we
> > > > > cannot give our code to Apache. So I'm investigating PDFBox.
> > > > >
> > > > > Regards Angelo
> > > > >
> > > > > 2011/10/14 Srinivaas_Venkatarayan <
> > > > > [email protected]
> > > > > >
> > > > >
> > > > > > Hi Angelo, if the source is going to be a docx file, can you not use
> > > > > > XSLT and FOP to convert the XML (provided by the docx file) to PDF?
> > > > > >
> > > > > > Srinivaas
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Angelo zerr [mailto:[email protected]]
> > > > > > Sent: Friday, October 14, 2011 5:07 PM
> > > > > > To: [email protected]
> > > > > > Subject: Re: Replace itext with PDFBox for XDocReport project?
> > > > > >
> > > > > > Hi Kevin,
> > > > > >
> > > > > > What do you mean by richtext? In my case I would like to create a PDF
> > > > > > from scratch with a DOM-like PDF API, in the same way as with iText.
> > > > > > I have started to create a project (not committed for the moment).
> > > > > >
> > > > > > Here is my (basic) sample that generates a paragraph with several text
> > > > > > contents (which I have called runs, as in docx):
> > > > > >
> > > > > >
> > > > > >
> > > > > > ------------------------------------------------------
> > > > > > PDFBoxDocument document = new PDFBoxDocument();
> > > > > >
> > > > > > PDFBoxParagraph paragraph = document.addParagraph();
> > > > > > paragraph.addRun("AAAA");
> > > > > > paragraph.addRun(" ");
> > > > > > paragraph.addRun("BBB");
> > > > > >
> > > > > > document.save("test.pdf");
> > > > > >
> > > > > >
> > > > > > ------------------------------------------------------
> > > > > >
> > > > > > This code generates a PDF with:
> > > > > >
> > > > > >
> > > > > >
> > > > > > ------------------------------------------------------
> > > > > > AAAA BBB
> > > > > >
> > > > > >
> > > > > > ------------------------------------------------------
> > > > > >
> > > > > > If you are interested (working together?) I could commit my work to our
> > > > > > XDocReport git.
> > > > > >
> > > > > > Regards Angelo
> > > > > >
> > > > > > 2011/10/14 Kévin Sailly <[email protected]>
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > I am planning to build some code to create text from richtext (text
> > > > > > > from a rich text editor); is that what you are planning to do?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Kévin
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > 2011/10/13 Angelo zerr <[email protected]>
> > > > > > >
> > > > > > > > Hi PDFBox Team,
> > > > > > > >
> > > > > > > > I have started to invest time in PDFBox to try to provide a
> > > > > > > > high-level API to manage paragraph and table widgets, etc.
> > > > > > > > To recall my problem: we are using iText for our ODT->PDF and Docx->PDF
> > > > > > > > converters and we wish to contribute our code to Apache. The problem is
> > > > > > > > the iText license.
> > > > > > > > So I'm searching for a PDF API (PDFBox? FOP?) to manage PDF with a Java
> > > > > > > > model (not with XSL-FO). It seems that FOP provides this feature, but
> > > > > > > > it's very hard to understand how to use it.
> > > > > > > >
> > > > > > > > I have tried to handle a simple case: a paragraph with some text. I
> > > > > > > > would like to generate a PDF with this content:
> > > > > > > >
> > > > > > > > ----------------------------
> > > > > > > > AAAA BBBB
> > > > > > > > ----------------------------
> > > > > > > >
> > > > > > > > But not with one String: with 3 Strings (calling
> > > > > > > > contentStream.drawString(...) 3 times).
> > > > > > > > The solution that I have found is to store the last X of the added
> > > > > > > > String, computed from the string width, the font and the font size.
> > > > > > > > Here is my code:
> > > > > > > >
> > > > > > > > ------------------------------------------------------
> > > > > > > > import java.io.IOException;
> > > > > > > >
> > > > > > > > import org.apache.pdfbox.exceptions.COSVisitorException;
> > > > > > > > import org.apache.pdfbox.pdmodel.PDDocument;
> > > > > > > > import org.apache.pdfbox.pdmodel.PDPage;
> > > > > > > > import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
> > > > > > > > import org.apache.pdfbox.pdmodel.font.PDFont;
> > > > > > > > import org.apache.pdfbox.pdmodel.font.PDType1Font;
> > > > > > > >
> > > > > > > > public class Test3 {
> > > > > > > >
> > > > > > > >    private static float lastX = 0;
> > > > > > > >
> > > > > > > >    public static void main(String[] args) throws IOException,
> > > > > > > >            COSVisitorException {
> > > > > > > >        PDDocument doc = null;
> > > > > > > >        try {
> > > > > > > >            doc = new PDDocument();
> > > > > > > >
> > > > > > > >            PDPage page = new PDPage();
> > > > > > > >            doc.addPage(page);
> > > > > > > >
> > > > > > > >            PDPageContentStream contentStream =
> > > > > > > >                    new PDPageContentStream(doc, page);
> > > > > > > >
> > > > > > > >            PDFont font = PDType1Font.HELVETICA_BOLD;
> > > > > > > >            long fontSize = 5;
> > > > > > > >            addText("AAAA", font, fontSize, page, contentStream);
> > > > > > > >            addText(" ", font, fontSize, page, contentStream);
> > > > > > > >            addText("BBBB", font, fontSize, page, contentStream);
> > > > > > > >
> > > > > > > >            contentStream.close();
> > > > > > > >
> > > > > > > >            doc.save("test.pdf");
> > > > > > > >
> > > > > > > >        } finally {
> > > > > > > >            if (doc != null) {
> > > > > > > >                doc.close();
> > > > > > > >            }
> > > > > > > >        }
> > > > > > > >    }
> > > > > > > >
> > > > > > > >    public static void addText(String text, PDFont font, long fontSize,
> > > > > > > >            PDPage page, PDPageContentStream contentStream)
> > > > > > > >            throws IOException {
> > > > > > > >
> > > > > > > >        // Compute x
> > > > > > > >        float x = lastX;
> > > > > > > >        float nextX = lastX
> > > > > > > >                + font.getStringWidth(text) * fontSize / 1000f;
> > > > > > > >
> > > > > > > >        // Compute Y
> > > > > > > >        float y = page.getMediaBox().getHeight()
> > > > > > > >                - (font.getFontHeight("A".getBytes(), 0, 1) * fontSize
> > > > > > > >                        / 1000f);
> > > > > > > >
> > > > > > > >        contentStream.beginText();
> > > > > > > >        contentStream.setFont(font, fontSize);
> > > > > > > >        contentStream.moveTextPositionByAmount(x, y);
> > > > > > > >        contentStream.drawString(text);
> > > > > > > >        contentStream.endText();
> > > > > > > >
> > > > > > > >        // Recompute lastX
> > > > > > > >        lastX = nextX;
> > > > > > > >    }
> > > > > > > >
> > > > > > > > }
> > > > > > > > ------------------------------------------------------
> > > > > > > >
> > > > > > > > I would like to know whether this is the right approach. If it is, I
> > > > > > > > would also like to know whether it's possible to retrieve the default
> > > > > > > > font of the document, because in my case I set the font explicitly.
> > > > > > > >
> > > > > > > > My code doesn't handle text wrapping, and I would like to know how to
> > > > > > > > manage that.
> > > > > > > >
> > > > > > > > I'm very interested in contributing a high-level API for PDFBox
> > > > > > > > (paragraph, table...) but if I have no support I will give up on my idea
> > > > > > > > (I hope you will understand).
> > > > > > > >
> > > > > > > > Thanks a lot for your help!
> > > > > > > >
> > > > > > > > Regards Angelo
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > 2011/9/5 Angelo zerr <[email protected]>
> > > > > > > >
> > > > > > > > > Hi Jeremias,
> > > > > > > > >
> > > > > > > > > I'm sorry I have not seen your answer
> > > > > > > > >
> > > > > > > > > http://mail-archives.apache.org/mod_mbox/incubator-odf-dev/201108.mbox/%[email protected]%3E
> > > > > > > > > When I studied FOP to manage PDF with Java PDF widgets only, I did not
> > > > > > > > > find any documentation, so I believed that it was not possible, but it
> > > > > > > > > seems that it is possible.
> > > > > > > > > That's very cool. I will study that.
> > > > > > > > >
> > > > > > > > > Many thanks!
> > > > > > > > >
> > > > > > > > > Regards Angelo
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2011/9/5 Jeremias Maerki <[email protected]>
> > > > > > > > >
> > > > > > > > >> Angelo,
> > > > > > > > >> as I explained in [1], you don't have to use XSL-FO when
> > using
> > > > > > Apache
> > > > > > > > >> FOP. It supports alternative means to create PDFs.
> > > > > > > > >>
> > > > > > > > >> [1]
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/incubator-odf-dev/201108.mbox/%[email protected]%3E
> > > > > > > > >>
> > > > > > > > >> But of course, Apache PDFBox would profit a lot from a
> > > > > higher-level
> > > > > > > PDF
> > > > > > > > >> production API. Any contributions are more than welcome.
> > > > > > > > >>
> > > > > > > > >> On 05.09.2011 10:56:20 Angelo zerr wrote:
> > > > > > > > >> > Hi Jukka,
> > > > > > > > >> >
> > > > > > > > >> > Thank a lot for you answer. I have already implemented a
> > > > > docx->PDF
> > > > > > > and
> > > > > > > > >> > odt->PDF converters with FOP but I decided to give up
> for
> > :
> > > > > > > > >> >
> > > > > > > > >> > * performance reason. I have used XSLT cache, use
> xsl:key
> > to
> > > > > > compute
> > > > > > > > the
> > > > > > > > >> > odt/docx styles but the FOP implementation is less
> > > performant
> > > > > than
> > > > > > > > iText
> > > > > > > > >> > implementation, because :
> > > > > > > > >> >   * FOP process : odt -> XSLT -> FO -> FOP
> > > > > > > > >> >   * iText process : odt -> ODFDOM (Java) -> iText
> > > > > > > > >> > * xslt vs Java model : with the iText process, your
> model
> > is
> > > > > Java,
> > > > > > > > >> although
> > > > > > > > >> > with FOP your model is XML. I prefer develop Java
> instead
> > of
> > > > > XSLT.
> > > > > > > > >> >
> > > > > > > > >> > That's why I'm searching Java PDF API like PDFBox to
> > replace
> > > > > iText
> > > > > > > to
> > > > > > > > >> > provides our code to Apache.
> > > > > > > > >> >
> > > > > > > > >> > Regards Angelo
> > > > > > > > >> >
> > > > > > > > >> > 2011/9/5 Jukka Zitting <[email protected]>
> > > > > > > > >> >
> > > > > > > > >> > > Hi Angelo,
> > > > > > > > >> > >
> > > > > > > > >> > > On Mon, Sep 5, 2011 at 10:11 AM, Angelo zerr <
> > > > > > > [email protected]
> > > > > > > > >
> > > > > > > > >> > > wrote:
> > > > > > > > >> > > > I suppose that my post was not well explained as I
> > have
> > > no
> > > > > > > answer.
> > > > > > > > I
> > > > > > > > >> will
> > > > > > > > >> > > be
> > > > > > > > >> > > > very happy to use PDFBox in our XDocReport converter
> > > > (docx->
> > > > > > PDF
> > > > > > > > and
> > > > > > > > >> > > odt->
> > > > > > > > >> > > > PDF) but develop converter is a big work and I can
> not
> > > > > > > investiaget
> > > > > > > > >> time
> > > > > > > > >> > > if I
> > > > > > > > >> > > > have no support.
> > > > > > > > >> > >
> > > > > > > > >> > > There's been some interest in making it easier to use
> > > PDFBox
> > > > > to
> > > > > > > > >> > > generate complex new PDF documents, but so far the
> main
> > > use
> > > > > > cases
> > > > > > > > have
> > > > > > > > >> > > been simpler. You might want to look at Apache FOP
> > > > > > > > >> > > (http://xmlgraphics.apache.org/fop/) for a
> higher-level
> > > PDF
> > > > > > > > >> generation
> > > > > > > > >> > > tool.
> > > > > > > > >> > >
> > > > > > > > >> > > BR,
> > > > > > > > >> > >
> > > > > > > > >> > > Jukka Zitting
> > > > > > > > >> > >
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> Jeremias Maerki
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

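On the text-wrapping question in the original post quoted above: a minimal
greedy word-wrap sketch, assuming a hypothetical width function in place of
font metrics (with PDFBox that could again be
font.getStringWidth(text) * fontSize / 1000f). WrapSketch and wrap are
illustrative names, not an existing PDFBox API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Sketch: greedy word wrapping against a maximum line width. */
final class WrapSketch {

    /**
     * Splits text on whitespace and packs as many words as fit on each line.
     * A single word wider than maxWidth is placed on a line of its own
     * (it overflows; this sketch does not hyphenate or split words).
     */
    static List<String> wrap(String text, double maxWidth,
                             ToDoubleFunction<String> widthOf) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for empty input
            }
            String candidate = current.length() == 0 ? word : current + " " + word;
            if (current.length() > 0 && widthOf.applyAsDouble(candidate) > maxWidth) {
                // The word does not fit: close the current line, start a new one.
                lines.add(current.toString());
                current = new StringBuilder(word);
            } else {
                current = new StringBuilder(candidate);
            }
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumption for the example: every character is 5 units wide.
        List<String> lines = wrap("AAAA BBBB CC DDDDDD", 50.0, s -> s.length() * 5.0);
        System.out.println(lines); // prints [AAAA BBBB, CC DDDDDD]
    }
}
```

Each returned line would then be drawn separately, with y decreased by the
line height for every new line.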