[ 
https://issues.apache.org/jira/browse/PDFBOX-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628339#comment-14628339
 ] 

Tim Allison edited comment on PDFBOX-2882 at 7/15/15 4:57 PM:
--------------------------------------------------------------

I tried loading both via InputStream and via file when I did the single-shot 
run (useless from a true benchmarking perspective, I know :( ) and got roughly 
the same numbers.

Running multiple times shows much more variation with no-scratch-file on my 
system.  As in your numbers, there's a drop-off after the first run: Java has 
had its coffee by the second load.

Loading from file, with pdfbox-2.0.0-20150715.011346-1543.jar (which probably 
doesn't include your latest/dev versions??):

||No Scratch||Scratch||
|2034|53510|
|1000|44791|
|1581|44990|
|846|43852|
|826|43559|
|1055|42974|
|625|43865|
|910|43049|
|632|44795|
|767|44112|

|| ||No Scratch||Scratch||
|Mean|1027.6|44949.7|
|Median|878|43988.5|


With PDFBox 1.8.9:
||No Scratch Classic||Scratch Classic||Scratch NonSeq||
|864|1719|3290|
|393|687|1105|
|372|680|981|
|351|632|881|
|290|550|849|
|187|495|778|
|1078|592|764|
|214|474|772|
|306|471|764|
|234|535|908|

|| ||No Scratch Classic||Scratch Classic||Scratch NonSeq||
|Mean|428.9|683.5|1109.2|
|Median|328.5|571|865|

My take from this is that, as of pdfbox-2.0.0-20150715.011346-1543.jar, 
2.0.0's no-scratch runs on this file are roughly equivalent to 1.8.9's NonSeq 
runs with scratch (as required): mean 1027.6 vs. 1109.2, median 878 vs. 865.
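
For anyone who wants to reproduce the Mean and Median rows above, here is a 
minimal sketch of the kind of harness that could be used. It is not the code 
behind the numbers in this comment: the class name LoadTimings and its methods 
are hypothetical, the load itself is passed in as a Runnable rather than 
calling any particular PDFBox API, and reporting in milliseconds is an 
assumption, since the tables do not state a unit.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox.
class LoadTimings {

    // Runs the task `runs` times and records the elapsed wall-clock time of
    // each run. Milliseconds are an assumption; the tables give no unit.
    static long[] timeRuns(Runnable task, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return elapsed;
    }

    // Arithmetic mean of the recorded timings.
    static double mean(long[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    // Median: the middle value of the sorted timings, or the average of the
    // two middle values when the count is even.
    static double median(long[] values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        // The "No Scratch" column from the 2.0.0 snapshot table above.
        long[] noScratch = {2034, 1000, 1581, 846, 826, 1055, 625, 910, 632, 767};
        System.out.println("Mean: " + mean(noScratch)
                + ", Median: " + median(noScratch));
    }
}
```

The first run is kept in the timings here, as it is in the tables, so the mean 
is pulled up by the cold start; the median is the less sensitive of the two 
figures.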




> Improve performance when using scratch file
> -------------------------------------------
>
>                 Key: PDFBOX-2882
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2882
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>    Affects Versions: 2.0.0
>            Reporter: Timo Boehme
>            Assignee: Timo Boehme
>            Priority: Minor
>         Attachments: ScratchFile.java, ScratchFileBuffer.java
>
>
> The current scratch file implementation uses many direct I/O calls, which 
> slow down parsing considerably compared with the in-memory scratch buffer.


