On 9/6/06, Leonard Rosenthol <[EMAIL PROTECTED]> wrote:
> At 03:48 AM 9/6/2006, Krzysztof Kowalczyk wrote:
> >Frankly, I was disappointed that it's only ~5%. I was expecting much
> >more. It turns out that the culprit is the current implementation of the
> >flate stream, which is frequently used to compress streams inside PDFs.
> >It decompresses data in very small chunks (e.g. 8 bytes on average per
> >getBuf() call in my test), so we don't save nearly as much as if we
> >were getting, say, 256 bytes at a time. I'm working on improving that
> >as well, but this change lays the necessary foundation.
>
> Given these two things, why not consider reading an ENTIRE PDF stream
> into memory and decompressing it, thus turning what is now a
> FlateStream->FileStream path with getChar() logic into a single MemStream
> with getBuf() logic? Yes, it means keeping the entire stream in memory,
> but assuming a "PC" and not an embedded device, it's pretty safe to
> assume the memory is available. You could make it a document load option,
> and you could dispose of the memory when the stream is closed.
That should also improve speed, although I don't think I'm going to attempt this optimization myself. All my optimization attempts are guided by profiler output, in order to get the biggest bang for my time.

BTW: it looks like all my changes combined give me about a 50% speedup when loading PDFs and measurable, but not dramatic, speedups when rendering (i.e. between 1% and 15%, depending on the page).

-- kjk

_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
