PDF library refactoring

Jeremias Maerki Wed, 19 Mar 2003 12:42:08 -0800

Hi there

A little heads up: As announced, I did some refactoring on the PDF
library. As these things go, it got pretty extensive. I've cleaned up a
lot of stuff, made behaviour more centralized and uniform, improved
logging (PDFDocument makes the logger available to all child objects),
etc. I also hope that my changes are somewhat preparatory for the
"parse PDF" task.


The refactoring now lets me easily implement on-the-fly stream
encoding. That means the stream no longer has to be encoded from a
StreamCache (byte buffer) into another byte buffer. That was previously
necessary to calculate the length of the encoded stream, because this
value is included in the object's dictionary (/Length), which is written
before the stream data. By using an indirect object (4.10 in the PDF 1.3
specs) for the length, I can encode the raw stream on-the-fly to the PDF
file using a chain of FilteredOutputStreams. The length is added later
as a separate PDFNumber object.
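
To illustrate the idea, here is a minimal sketch in plain Java. It is not
the actual FOP code: the class IndirectLengthSketch, its helper
CountingOutputStream and the method writeStreamObject are names made up
for this example, and only Flate encoding is shown. The dictionary
refers to the length as "N 0 R", the data is deflated straight into the
target stream while the bytes are counted, and the length object is
written afterwards.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical sketch, not FOP's API: writes a Flate-encoded PDF stream
// object whose /Length is an indirect reference resolved after encoding.
public class IndirectLengthSketch {

    /** Passes bytes through unchanged and counts them (hypothetical helper). */
    static class CountingOutputStream extends FilterOutputStream {
        private long count;

        CountingOutputStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(int b) throws IOException {
            out.write(b);
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
            count += len;
        }

        long getCount() {
            return count;
        }
    }

    private static void writeAscii(OutputStream out, String s) throws IOException {
        out.write(s.getBytes(StandardCharsets.US_ASCII));
    }

    /**
     * Writes stream object objNum and its length object lengthObjNum to pdf.
     * Returns the number of encoded bytes between "stream" and "endstream".
     */
    static long writeStreamObject(OutputStream pdf, int objNum, int lengthObjNum,
                                  InputStream raw) throws IOException {
        // The dictionary only points at the length object; the value is unknown so far.
        writeAscii(pdf, objNum + " 0 obj\n<< /Filter /FlateDecode /Length "
                + lengthObjNum + " 0 R >>\nstream\n");

        // Filter chain: raw bytes -> deflate -> byte counter -> PDF file.
        CountingOutputStream counter = new CountingOutputStream(pdf);
        Deflater deflater = new Deflater();
        DeflaterOutputStream flate = new DeflaterOutputStream(counter, deflater);
        byte[] buf = new byte[4096];
        int n;
        while ((n = raw.read(buf)) != -1) {
            flate.write(buf, 0, n);
        }
        flate.finish();   // flush remaining compressed data, keep the PDF stream open
        deflater.end();   // release the deflater's native resources
        long encodedLength = counter.getCount();

        writeAscii(pdf, "\nendstream\nendobj\n");

        // Now the length is known: emit it as a separate number object.
        writeAscii(pdf, lengthObjNum + " 0 obj\n" + encodedLength + "\nendobj\n");
        return encodedLength;
    }
}
```

The raw data is only ever held in the 4KB copy buffer, so no second
byte buffer for the encoded form is needed. A real writer would also
have to record the byte offset of both objects for the xref table, which
is left out here.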

I quickly tested the new runtime behaviour: memory went from 4350KB down
to 2277KB (total memory used, as calculated in the LayoutHandler) on a
1.1MB, two-page, image-intensive PDF. Time used went up from an average
of 2.5sec to 3.1sec, but with a BufferedOutputStream it was around 2.4sec.
(Disclaimer: I didn't account for JVM warm-up! It was just a quick test.)

Theoretically, it would now also be possible to pipe a (JPEG image)
stream right through FOP without having to buffer the raw stream in
memory or in a temp file. But doing that is not on my todo list right
now.

I still need to tackle the last few methods. Then it's testing time. And
when I'm done that'll be a rather big commit.

Jeremias Maerki

