On 9/6/06, Leonard Rosenthol <[EMAIL PROTECTED]> wrote:
> At 03:48 AM 9/6/2006, Krzysztof Kowalczyk wrote:
> >Frankly, I was disappointed that it's only ~5%. I was expecting much
> >more. It turns out that the culprit is the current implementation of the
> >flate stream, which is frequently used to compress streams inside PDFs.
> >It decompresses data in very small chunks (e.g. 8 bytes on average per
> >getBuf() call in my test), so we don't save nearly as much as if we
> >were getting, say, 256 bytes at a time. I'm working on improving that
> >as well, but this change lays the necessary foundation.
>
> Given these two things, why not consider reading an ENTIRE PDF stream
> into memory and decompressing it, thus turning what is now a
> FlateStream->FileStream path with getChar() logic into a single MemStream
> with getBuf() logic? Yes, it means keeping the entire stream in memory,
> but assuming a "PC" and not an embedded device, it's pretty safe to
> assume the memory is available. You could make it a document load option,
> and you could dispose of the memory when the stream is closed.
That should also improve speed, although I don't think I'm going to attempt this optimization myself. All my optimization attempts are guided by profiler output, in order to get the biggest bang for my time.

BTW: it looks like all my changes combined give me about a 50% speedup when loading PDFs and measurable, but not dramatic, speedups when rendering (i.e. between 1% and 15%, depending on the page).

-- kjk

_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
