On 9/2/06, Jeff Muizelaar <[EMAIL PROTECTED]> wrote:
> I think there are more opportunities for improvements like 2), although
> even bigger improvements would come from improving the various
> Stream::getChar() methods (currently Lexer::getChar(),
> EmbedStream::getChar() and FlateStream::getChar() are in the top 5 of
> the most expensive methods during loading). I haven't yet found a way
> to improve that.
One of the ways that I looked at optimizing these was by adding a read()
method to the Stream class that reads multiple bytes instead of a single
one. I have a patch from a long time ago that adds something like this
at:
http://people.freedesktop.org/~jrmuizel/patches/poppler-read-stream.patch
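
To illustrate the idea of a multi-byte read() next to getChar(), here is a minimal sketch. It is not the code from the patch above and not poppler's actual Stream API: the class names SketchStream and MemorySketchStream, and the read(buf, n) signature, are assumptions made up for this example. The base class falls back to one getChar() call per byte, and a subclass that owns its data overrides read() with a single copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a stream base class (not poppler's API).
class SketchStream {
public:
    virtual ~SketchStream() = default;

    // Returns the next byte (0..255), or -1 at end of stream.
    virtual int getChar() = 0;

    // Default bulk read: one virtual getChar() call per byte.
    // Returns the number of bytes actually stored in buf.
    virtual int read(unsigned char *buf, int n) {
        assert(n >= 0);
        int i = 0;
        for (; i < n; ++i) {
            int c = getChar();
            if (c < 0) break;
            buf[i] = static_cast<unsigned char>(c);
        }
        return i;
    }
};

// A stream over an in-memory buffer that can satisfy read() with one memcpy.
class MemorySketchStream : public SketchStream {
public:
    explicit MemorySketchStream(std::string data) : data_(std::move(data)) {}

    int getChar() override {
        return pos_ < data_.size() ? static_cast<unsigned char>(data_[pos_++]) : -1;
    }

    int read(unsigned char *buf, int n) override {
        assert(n >= 0);
        std::size_t avail = data_.size() - pos_;
        std::size_t want = static_cast<std::size_t>(n);
        std::size_t count = want < avail ? want : avail;
        std::memcpy(buf, data_.data() + pos_, count);
        pos_ += count;
        return static_cast<int>(count);
    }

private:
    std::string data_;
    std::size_t pos_ = 0;
};
```

A caller that needs many bytes pays for one virtual call instead of one per byte, while code that still calls getChar() keeps working unchanged.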
The problem I ran into was with things like inline images (EmbedStream).
With these streams there is no way of knowing ahead of time how long the
stream is, so you have to be very careful not to read more than you are
supposed to. This is also the source of the current problem with the
zlib based version of FlateStream.
The solution, it seems, is just to be careful not to read more than 1
byte when the stream does not have a limited length. I have a patch
around that fixes the zlib-based FlateStream to only read as much as it
needs so the read() method should be feasible. This should help drop the
FlateStream::getChar() overhead as long as it isn't reading from an
EmbedStream.
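
The "never more than 1 byte when the length is unknown" rule could look roughly like the sketch below. This is only an illustration under assumptions, not the FlateStream patch mentioned above: SketchSource and SketchSubStream are hypothetical names, and a negative knownLength is an invented convention for "length not known in advance" (the inline image case).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical byte source standing in for the underlying stream.
class SketchSource {
public:
    explicit SketchSource(std::string data) : data_(std::move(data)) {}

    // Returns the next byte (0..255), or -1 at end of data.
    int getChar() {
        return pos_ < data_.size() ? static_cast<unsigned char>(data_[pos_++]) : -1;
    }

    // How far the source has been consumed; used to check for over-reading.
    std::size_t position() const { return pos_; }

private:
    std::string data_;
    std::size_t pos_ = 0;
};

// Hypothetical sub-stream layered on a source.
// knownLength < 0 means the length is not known ahead of time.
class SketchSubStream {
public:
    SketchSubStream(SketchSource &src, long knownLength)
        : src_(src), remaining_(knownLength) {}

    // Reads up to n bytes into buf and returns how many were stored.
    int read(unsigned char *buf, int n) {
        assert(n >= 0);
        long limit;
        if (remaining_ < 0) {
            // Unknown length: consume at most one byte per call so the
            // source is never advanced past what the caller has seen.
            limit = n > 0 ? 1 : 0;
        } else {
            // Known length: safe to read in bulk, but never past the end.
            limit = n < remaining_ ? n : remaining_;
        }
        int i = 0;
        for (; i < limit; ++i) {
            int c = src_.getChar();
            if (c < 0) break;
            buf[i] = static_cast<unsigned char>(c);
        }
        if (remaining_ >= 0) remaining_ -= i;
        return i;
    }

private:
    SketchSource &src_;
    long remaining_;
};
```

With a known length the sub-stream gets the bulk-read benefit. Without one it degrades to byte-at-a-time reads, but whatever follows it in the source is left untouched for the next reader.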
Yeah, I tried that and hit the same problems. Also, embedded or
filtered streams are created starting at the current position in the
underlying stream, and caching breaks this.
I did manage to get speedups by improving lookChar()/getChar() to cache
the latest value (they frequently re-fetch the same char from the
substream), but my attempts at a read() interface only generated
spectacular crashes. I'm still hopeful it's possible, but I need to
spend more time understanding the code.
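
For reference, the lookChar()/getChar() caching is roughly the following shape. This is a simplified sketch, not the actual change: CountingSource and PeekCachingStream are made-up names, and the call counter exists only to show how often the substream is hit.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical substream that counts how often it is asked for a byte.
class CountingSource {
public:
    explicit CountingSource(std::string data) : data_(std::move(data)) {}

    // Returns the next byte (0..255), or -1 at end of data.
    int getChar() {
        ++calls_;
        return pos_ < data_.size() ? static_cast<unsigned char>(data_[pos_++]) : -1;
    }

    int calls() const { return calls_; }

private:
    std::string data_;
    std::size_t pos_ = 0;
    int calls_ = 0;
};

// Hypothetical wrapper that remembers the last peeked value, so a
// lookChar() followed by getChar() touches the substream only once.
class PeekCachingStream {
public:
    explicit PeekCachingStream(CountingSource &src) : src_(src) {}

    // Peek at the next byte without consuming it.
    int lookChar() {
        if (!cached_) {
            value_ = src_.getChar();
            cached_ = true;
        }
        return value_;
    }

    // Consume the next byte, reusing the peeked value if there is one.
    int getChar() {
        if (cached_) {
            cached_ = false;
            return value_;
        }
        return src_.getChar();
    }

private:
    CountingSource &src_;
    bool cached_ = false;
    int value_ = -1;
};
```

The cache holds a single byte that the caller has already asked about, so the over-read problem from the read() discussion only shows up if something else reads from the substream between a lookChar() and the matching getChar().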
-- kjk