Re: PDFBox capabilities

Gary Wong Tue, 01 Feb 2011 07:11:01 -0800

Hey Jeremias,

Right now it is just plain text. I guess I have some learning to do. For your
point 5 (stream PDF to user), that means writing to a file, right?


I'll google that, as I have never done it. Hopefully there are some good
cheat sheets on how to convert text -> XML -> XSLT -> XSL-FO -> PDF.

Thanks again!

g




Re: PDFBox capabilities
From: Jeremias Maerki <[email protected]>
To: [email protected]
------------------------------
Hi Gary

What do you mean by "just text"? Is that HTML markup, some kind of Wiki
syntax (since you talk about links) or plain text? Anyway, if you
generate HTML you also have to process the text in some way, right? Going
towards XSL-FO is then pretty similar. XSLT could also be used to
convert the HTML to XSL-FO (with the HTML having to be converted to
XHTML by an HTML pretty printer like http://jtidy.sourceforge.net/).

I'm sure Apache PDFBox could do it, but I'm convinced that it would take
more effort and would be harder to maintain. Of course, the XSLT/XSL-FO
approach will also take some initial effort, especially the learning
process.

To summarize, it looks to me like the processing pipeline would be
as follows:

1. retrieve the "text" from the DB
2. if necessary, convert it to some XML-based markup (XHTML or whatever).
3. use XSLT to convert the markup to XSL-FO.
4. use Apache FOP to convert the XSL-FO to PDF.
5. stream the PDF to the user
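
As a rough sketch of steps 2 and 3 above, using only the XSLT support that
ships with the JDK (javax.xml.transform): the class name TextToFo, the
<doc>/<para> intermediate markup, the page dimensions and the stylesheet
itself are all made up for illustration, so treat this as a starting point
under those assumptions rather than a finished solution. Step 4 is not shown
because it needs the Apache FOP jar on the classpath; there, the XSL-FO
produced here would be handed to FOP instead of being written to a string.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TextToFo {

    // Hypothetical stylesheet: maps the invented <doc>/<para> markup to XSL-FO.
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version=\"1.0\"",
        "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"",
        "    xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">",
        "  <xsl:template match=\"/doc\">",
        "    <fo:root>",
        "      <fo:layout-master-set>",
        "        <fo:simple-page-master master-name=\"page\"",
        "            page-height=\"297mm\" page-width=\"210mm\" margin=\"20mm\">",
        "          <fo:region-body/>",
        "        </fo:simple-page-master>",
        "      </fo:layout-master-set>",
        "      <fo:page-sequence master-reference=\"page\">",
        "        <fo:flow flow-name=\"xsl-region-body\">",
        "          <xsl:apply-templates select=\"para\"/>",
        "        </fo:flow>",
        "      </fo:page-sequence>",
        "    </fo:root>",
        "  </xsl:template>",
        "  <xsl:template match=\"para\">",
        "    <fo:block space-after=\"6pt\"><xsl:value-of select=\".\"/></fo:block>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Step 2: wrap plain text in XML, one <para> per blank-line-separated paragraph.
    static String textToXml(String text) {
        StringBuilder sb = new StringBuilder("<doc>");
        for (String para : text.split("\\R\\s*\\R")) {
            if (para.trim().isEmpty()) {
                continue;
            }
            sb.append("<para>").append(escape(para.trim())).append("</para>");
        }
        return sb.append("</doc>").toString();
    }

    // Escape the characters that are special in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Step 3: run the XSLT over the XML to obtain XSL-FO.
    static String xmlToFo(String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for step 1 (the text would normally come from the DB).
        String text = "First paragraph.\n\nSecond paragraph with <angle> & ampersand.";
        System.out.println(xmlToFo(textToXml(text)));
    }
}
```

Keeping the text-to-XML step separate from the stylesheet means the same
intermediate XML can feed both the PDF and the browser output.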

For the browser display, I guess the process could be quite similar (not
sure what you have planned):

1. retrieve the "text" from the DB
2. if necessary, convert it to some XML-based markup (XHTML or whatever).
3. insert it into an (X)HTML-based page template (e.g. with XSLT)
4. stream the HTML to the user
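
Step 3 of the browser variant could look roughly like this, again with only
the JDK's javax.xml.transform. The class name TextToHtml, the <doc>/<para>
input markup and the page template are invented for the example; a real
template would carry your own layout. No XSL-FO is involved here at all.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TextToHtml {

    // Hypothetical page template: the stylesheet itself is the XHTML skeleton,
    // and the <para> elements of the invented <doc> markup are dropped into <body>.
    static final String TEMPLATE = String.join("\n",
        "<xsl:stylesheet version=\"1.0\"",
        "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"",
        "    xmlns=\"http://www.w3.org/1999/xhtml\">",
        "  <xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>",
        "  <xsl:template match=\"/doc\">",
        "    <html>",
        "      <head><title>Document</title></head>",
        "      <body><xsl:apply-templates select=\"para\"/></body>",
        "    </html>",
        "  </xsl:template>",
        "  <xsl:template match=\"para\">",
        "    <p><xsl:value-of select=\".\"/></p>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Apply the template to the XML and return the XHTML page as a string.
    static String xmlToXhtml(String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(TEMPLATE)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><para>One</para><para>Two</para></doc>";
        System.out.println(xmlToXhtml(xml));
    }
}
```

In a web application the StreamResult would wrap the response's output
stream rather than a StringWriter, which is what "stream to the user" amounts
to.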

Note: IMO it's important to keep XSLT and XSL-FO apart even though
together they make up XSL. Each can be used independently of the other,
so the collective term "XSL" is usually not very useful.
