----------------------------------------
> Date: Fri, 23 Apr 2010 10:29:43 -0700
> From: forum_...@trumpetinc.com
> To: itext-questions@lists.sourceforge.net
> Subject: Re: [iText-questions] performance follow up
>
>
> This tells us that the focus needs to be on PRTokeniser and RAFOA. For what
> it's worth, these have also shown up as the bottlenecks in profiling I've
> done in the parser package (not too surprising).
>
> I'll discuss each in turn:
>
> RandomAccessFileOrArray - I've done a lot of thinking on this one in the
> past. This is an interesting class that can have massive impact on
> performance, depending on how you use it. For example, if you load your
> source PDF entirely into memory, and pass the bytes into RAFOA, it will
> remove IO bottlenecks.
>

I mentioned IO early on as a neglected area: people tend to pull out some
generic stream and let it hand them a byte at a time. If you know your
access pattern and can make it coherent, you stand to gain a lot. The
cache and the VM can only guess what you will do next; you may know
better :)
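
For example, something along these lines (just a sketch; I'm assuming the
byte[] constructor on PdfReader, and the package name depends on which
iText version you're on):

import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import com.itextpdf.text.pdf.PdfReader; // com.lowagie.text.pdf in older releases

public class InMemoryOpen {
    public static void main(String[] args) throws IOException {
        // One coherent sequential read of the whole file ...
        File f = new File(args[0]);
        byte[] buf = new byte[(int) f.length()];
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        try {
            in.readFully(buf);
        } finally {
            in.close();
        }
        // ... then hand the bytes to the reader; tokenising never hits the disk again.
        PdfReader reader = new PdfReader(buf);
        System.out.println("pages: " + reader.getNumberOfPages());
        reader.close();
    }
}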
 
 
 
> The one problem with the memory mapped strategy (in its current
> implementation) is that really big source PDFs still can't be loaded into
> memory. This could be addressed by using a paging strategy on the mapped

You can have alternate implementations in the meantime if you know the
size a priori. Ideally you would like to be able to operate on a stream
and scrap random access entirely.
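
A paging strategy could look roughly like this (purely hypothetical
sketch, not existing iText code: a fixed-size MappedByteBuffer window
that gets remapped on demand):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical sketch: expose a huge file through a sliding mapped window
    so arbitrarily large PDFs never need one giant mapping. */
final class PagedMappedSource {
    private static final long WINDOW = 64L * 1024 * 1024; // 64 MB per page
    private final FileChannel channel;
    private final long length;
    private MappedByteBuffer window;
    private long windowStart = -1;

    PagedMappedSource(RandomAccessFile file) throws IOException {
        this.channel = file.getChannel();
        this.length = channel.size();
    }

    /** Read one byte at an absolute position, remapping only on a window miss. */
    int read(long pos) throws IOException {
        if (pos < 0 || pos >= length) return -1;
        if (window == null || pos < windowStart || pos >= windowStart + window.capacity()) {
            windowStart = (pos / WINDOW) * WINDOW;
            long size = Math.min(WINDOW, length - windowStart);
            window = channel.map(FileChannel.MapMode.READ_ONLY, windowStart, size);
        }
        return window.get((int) (pos - windowStart)) & 0xff;
    }
}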
 
 
> will use memory mapped IO is the Document.plainRandomAccess static public
> variable (shudder).

As if two people would use this at the same time, ROFL :)
It's bad enough you don't have globals...
 
 
>
>
>
>
> So what about the code paths in PRTokeniser.nextToken()?
>
> We've got a number of tight loops reading individual characters from the
> RAFOA. If the backing source isn't buffered, this would be a problem, but I
> don't know that is really the issue here (it would be worth measuring
> though...)
>
> The StringBuffer could definitely be replaced with a StringBuilder, and it
> could be re-used instead of re-allocating for each call to nextToken()
> (this would probably help quite a bit, as I'll bet the default size of the
> backing buffer has to keep growing during dictionary reads).
>
 
Again, why even do this? Find the delimiters and, if you must make a
string, make it only once you know the start and end; don't keep looping
with append(char), no matter how nice the source code looks. Anything
with "[]" in the signature stands a chance of being faster. Pass as much
a priori information to the library classes as you can: append means
"whoops, I found ANOTHER thing to add", whereas when you already have the
data you can just say "here is the string I need." And unless you actually
need the tokens as Strings, consider just returning indexes into the
buffer; an int[] of offsets may be enough (see the sketch below).
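
To make that concrete, a toy sketch (hypothetical names, nothing to do
with the real PRTokeniser internals):

/** Toy sketch: record token boundaries instead of growing a buffer one char at a time. */
final class IndexTokenizer {

    /** Start/end offsets of a token within the shared buffer. */
    static final class TokenSpan {
        int start;   // inclusive
        int end;     // exclusive
    }

    private final byte[] buf;   // assume the source is already buffered in memory
    private int pos;

    IndexTokenizer(byte[] buf) { this.buf = buf; }

    // PDF whitespace: NUL, tab, LF, FF, CR, space.
    private static boolean isWhitespace(int ch) {
        return ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }

    /** Find the next whitespace-separated run and record where it lives. */
    boolean nextToken(TokenSpan out) {
        while (pos < buf.length && isWhitespace(buf[pos])) pos++;
        if (pos >= buf.length) return false;
        out.start = pos;
        while (pos < buf.length && !isWhitespace(buf[pos])) pos++;
        out.end = pos;
        return true;
    }

    /** Only pay for a String when the caller actually needs one. */
    String asString(TokenSpan span) {
        return new String(buf, span.start, span.end - span.start);
    }
}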
 
 

> Another thing that could make a difference is the ordering of the cases and
> ifs - for example, the default: branch turns around and does a check for (ch ==
> '-' || ch == '+' || ch == '.' || (ch >= '0' && ch <= '9')). Changing this to
> be:
>
> case '-':
> case '+':
> case '.':
> case '0':
> ...
> case '9':
>
 
Actually, optimizing compilers do this sort of thing: given known,
assumed, or measured branch probabilities, you make the common branches
faster (minimize the expected execution time). I haven't checked lately,
but IIRC the compiler tries to turn a dense switch into a jump table.
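
i.e. something along these lines (sketch only, with made-up helper names
for the branches):

/** Sketch only: a simplified tokenizer skeleton showing the reordered switch. */
final class BranchSketch {
    void dispatch(int ch) {
        switch (ch) {
            // The common numeric characters get explicit labels so a dense switch
            // can become a jump table, instead of falling into default: and being
            // re-tested with a chain of ||'s.
            case '-': case '+': case '.':
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7': case '8': case '9':
                readNumber(ch);
                break;
            case '/':
                readName();
                break;
            default:
                readOther(ch);
                break;
        }
    }

    // Hypothetical stand-ins for the real token-reading branches.
    void readNumber(int first) { /* ... */ }
    void readName()            { /* ... */ }
    void readOther(int ch)     { /* ... */ }
}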
 

> May be better.
>
>
> The loops that check for while (ch != -1 && ((ch >= '0' && ch <= '9') || ch
> == '.')) could also probably be optimized by removing the ch != -1 check
> - the other conditions ensure that the loop will escape if ch == -1
>
>
> It might be interesting to break the top-level parsing branches into
> separate functions so the profiler can tell us which of these main branches is
> consuming the bulk of the run time.
>
>
> Those are the obvious low hanging fruit that I see.
>
> Final point: I've seen some comments suggesting inlining of some code.
> Modern VMs are quite good at doing this sort of inlining automatically - a
> test would be advisable before worrying about it too much. Having things
> split out actually makes it easier to use a profiler to determine where the
> bottleneck is.
>
>
> One thing that is quite clear here is that we need to have some sort of
> benchmark that we can use for evaluation - for example, if I had a good
> benchmark test, I would have just tried the ideas above to see how they
> fared.

If you have a set of benchmarks, you can afford to "measure" them
according to the things you think will impact execution time. Then, with
enough samples, you can fit execution time to those measurements for
alternate pieces of code. This is where statistics starts to be helpful
("a PDF with value X for attribute A incurs a time penalty of n seconds
per increment of X"). FWIW, there is also the usual caveat about wanting
identical, independent entities for a statistical sample; real PDFs that
you can actually measure are never identical.
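
A minimal harness in that spirit might just record one row of
measurements per sample PDF so you can regress execution time against
them later (sketch only; again the package name depends on your iText
version):

import java.io.File;
import com.itextpdf.text.pdf.PdfReader; // or com.lowagie.text.pdf.PdfReader

/** Minimal benchmark sketch: one row of measurements per sample PDF. */
public class ParseBench {
    public static void main(String[] args) throws Exception {
        System.out.println("file\tbytes\tpages\tmillis");
        for (int i = 0; i < args.length; i++) {
            File f = new File(args[i]);
            long t0 = System.nanoTime();
            PdfReader reader = new PdfReader(f.getAbsolutePath());
            int pages = reader.getNumberOfPages();
            reader.close();
            long millis = (System.nanoTime() - t0) / 1000000L;
            // The columns are the "measurements" you can later regress time against.
            System.out.println(args[i] + "\t" + f.length() + "\t" + pages + "\t" + millis);
        }
    }
}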
 
 
>> The top are now:
>>
>> PRTokeniser.nextToken            8%  77s   19'268'000 invocations
>> RandomAccessFileOrArray.read     6%  53s  149'047'680 invocations
>> MappedRandomAccessFile.read      3%  26s   61'065'680 invocations
>> PdfReader.removeUnusedCode       1%  15s         6000 invocations
>> PdfEncodings.convertToBytes      1%  15s    5'296'207 invocations
>> PRTokeniser.nextValidToken       1%  12s    9'862'000 invocations
>> PdfReader.readPRObject           1%  10s    5'974'000 invocations
>> ByteBuffer.append(char)          1%  10s   19'379'382 invocations
>> PRTokeniser.backOnePosition      1%  10s   17'574'000 invocations
>> PRTokeniser.isWhitespace         1%   8s   35'622'000 invocations
>>
>> A bit further down there is ByteBuffer.append_i, which often needs to
>> reallocate and do an array copy, hence the expensive ByteBuffer.append(char)
>> above ... I am playing right now with bigger initial sizes, e.g. 512
>> instead of 127 ...
>>
>> Best regards,
>> Giovanni
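
On the ByteBuffer.append_i point: besides a bigger initial size, the
usual tricks are geometric growth and bulk appends. A generic sketch
(not iText's ByteBuffer):

import java.util.Arrays;

/** Generic sketch: geometric growth plus bulk append, so append cost is
    amortised and single-char appends stay rare. */
final class GrowableBuffer {
    private byte[] data;
    private int count;

    GrowableBuffer(int initialCapacity) { data = new byte[initialCapacity]; }

    void ensureCapacity(int needed) {
        if (needed > data.length) {
            // Double rather than grow by a fixed step: O(n) amortised total copy cost.
            int newLen = Math.max(needed, data.length * 2);
            data = Arrays.copyOf(data, newLen);
        }
    }

    void append(byte b) {
        ensureCapacity(count + 1);
        data[count++] = b;
    }

    /** Bulk append: one capacity check and one arraycopy instead of n single appends. */
    void append(byte[] src, int off, int len) {
        ensureCapacity(count + len);
        System.arraycopy(src, off, data, count, len);
        count += len;
    }
}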