Hello Mike,

> Mike Marchywka [mailto:[email protected]] wrote:
> ... snip ...
> that's fine. If you want someone here to figure out why itext
> is slower, the pointing to a hogging method would help. 
> 
I ran the same iText workload as described before, this time with a profiler 
attached. The results are:

==================================================================================================
Method                                                        # Invocations   Time spent (seconds)
==================================================================================================
com.itextpdf.text.pdf.PRTokeniser.nextToken                        19268000   187.262432
com.itextpdf.text.pdf.RandomAccessFileOrArray.read                149340018   140.111802
com.itextpdf.text.pdf.MappedRandomAccessFile.read                  61386018    76.73099
com.itextpdf.text.pdf.PdfReader.removeUnusedNode                       6000    39.623488
com.itextpdf.text.pdf.PdfEncodings.convertToBytes                   5748313    31.844394
com.itextpdf.text.pdf.PRTokeniser.nextValidToken                    9862000    28.819224
com.itextpdf.text.pdf.PdfReader.readPRObject                        5974000    26.932437
com.itextpdf.text.pdf.ByteBuffer.append(char)                      20201312    26.754906
com.itextpdf.text.pdf.PRTokeniser.backOnePosition                  17564000    24.265245
com.itextpdf.text.pdf.PRTokeniser.isWhitespace                     35622000    23.604517
com.itextpdf.text.pdf.PdfName.encodeName                            2552663    23.450355
com.itextpdf.text.pdf.PdfChunk.isAttribute                         10646000    19.019851
com.itextpdf.text.pdf.ByteBuffer.append_i                          29071312    17.793865
com.itextpdf.text.pdf.PRTokeniser.readString                          64000    16.489225
com.itextpdf.text.pdf.PdfArray.toPdf                                 158000    14.463853
com.itextpdf.text.pdf.PdfReader.getStreamBytesRaw                     60000    12.833419
com.itextpdf.text.pdf.PdfReader.readDictionary                       374000    12.716459
com.itextpdf.text.pdf.PdfObject.<init>                              9636672    12.412647
com.itextpdf.text.pdf.RandomAccessFileOrArray.readChar              5827711    12.252164
com.itextpdf.text.pdf.ByteBuffer.<init>                             2810663    12.024495
com.itextpdf.text.pdf.PdfNumber.<init>(java.lang.String)            3388000    11.667655
com.itextpdf.text.pdf.OutputStreamCounter.write(byte[])             8558000    10.957901
com.itextpdf.text.pdf.RandomAccessFileOrArray.pushBack             17556000    10.898216
com.itextpdf.text.pdf.PdfObject.type                               16882000    10.613727
com.itextpdf.text.pdf.BidiLine.createArrayOfPdfChunks                 94000    10.133254
com.itextpdf.text.pdf.RandomAccessFileOrArray.getFilePointer        5615817    10.114153
com.itextpdf.text.pdf.BidiLine.processLine                            96000    10.026792
com.itextpdf.text.DocWriter.getISOBytes                             1600013     9.973701

For my specific use case, it seems that even a small improvement in any of the 
top-ranked, most expensive methods could have a large positive performance 
impact, e.g.

com.itextpdf.text.pdf.PRTokeniser.nextToken()
http://itext.svn.sourceforge.net/viewvc/itext/trunk/src/core/com/itextpdf/text/pdf/PRTokeniser.java?revision=4300&view=markup#l_290
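
To put a rough number on that: dividing total time by invocation count gives 
an average cost per call. The sketch below does this for two rows of the 
table above. The class and method names are hypothetical, and it assumes the 
profiler's "time spent" figures are comparable across methods, which the 
profiler output itself does not confirm.

```java
// Hypothetical helper, not part of iText: average cost per call from
// the profiler's invocation count and total time.
final class PerCallCost {

    /** Average microseconds per invocation. */
    static double microsPerCall(long invocations, double totalSeconds) {
        return totalSeconds / invocations * 1_000_000.0;
    }

    /** Seconds saved overall if every call gets cheaper by the given amount. */
    static double secondsSaved(long invocations, double microsSavedPerCall) {
        return invocations * microsSavedPerCall / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Figures copied from the profiler table above.
        System.out.printf("PRTokeniser.nextToken:      %.2f us/call%n",
                microsPerCall(19268000L, 187.262432));
        System.out.printf("PdfReader.removeUnusedNode: %.2f us/call%n",
                microsPerCall(6000L, 39.623488));
        // Effect of shaving one microsecond off every nextToken call.
        System.out.printf("1 us saved per nextToken call: %.2f s total%n",
                secondsSaved(19268000L, 1.0));
    }
}
```

A method that is cheap per call but invoked millions of times is where a 
small per-call saving multiplies.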

Cheap improvements would be, e.g.:
 
- Use StringBuilder instead of StringBuffer. From 
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuffer.html: "The 
StringBuilder class should generally be used in preference to this one 
(StringBuffer), as it supports all of the same operations but it is faster, as 
it performs no synchronization."

- Since nextToken() is called in a loop from nextValidToken(), it might be 
best to turn the outBuf variable into a final member field and call 
outBuf.reset() instead of creating a new instance each time.
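
The two suggestions above can be sketched together as follows. This is only 
an illustration of the pattern, not a patch against PRTokeniser: the class 
ReusableBufferTokenizer and its whitespace-only tokenizing are made up for 
the example. Note that java.lang.StringBuilder has no reset() method; 
setLength(0) plays that role here and keeps the already allocated capacity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tokenizer; NOT iText's PRTokeniser.
final class ReusableBufferTokenizer {
    private final String input;
    private int pos = 0;

    // Allocated once per tokenizer and cleared before each token,
    // instead of creating a new buffer on every nextToken() call.
    // StringBuilder is unsynchronized, so this is single-thread only.
    private final StringBuilder outBuf = new StringBuilder();

    ReusableBufferTokenizer(String input) {
        this.input = input;
    }

    /** Returns the next whitespace-delimited token, or null at end of input. */
    String nextToken() {
        // Skip leading whitespace.
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        if (pos >= input.length()) {
            return null;
        }
        // Clear the buffer contents but keep its capacity for reuse.
        outBuf.setLength(0);
        while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
            outBuf.append(input.charAt(pos++));
        }
        return outBuf.toString();
    }

    /** Convenience loop, playing the role of the caller that pulls tokens. */
    static List<String> tokenize(String input) {
        ReusableBufferTokenizer tokenizer = new ReusableBufferTokenizer(input);
        List<String> tokens = new ArrayList<>();
        for (String t = tokenizer.nextToken(); t != null; t = tokenizer.nextToken()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("1 0 obj << /Type /Catalog >> endobj"));
    }
}
```

Whether this pays off in iText itself would have to be confirmed by running 
the same workload under the profiler again.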

Best regards,
Giovanni




