[ 
https://issues.apache.org/jira/browse/PDFBOX-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147342#comment-17147342
 ] 

Andreas Lehmkühler commented on PDFBOX-4883:
--------------------------------------------

[~Faltiska] You are correct, that aspect got lost. But your patch isn't that
strict either. It would set the string even if the value was changed within the
constructor. Now it is set only if the parsing went well and the check doesn't
replace the value. I've included the other changes as well.
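
To illustrate the rule described above, here is a minimal sketch in Java. It
is not the committed PDFBox code: the class name FloatValueSketch, the coerce
helper and the clamping rules are assumptions made for illustration only. The
original text is kept only when parsing succeeded and the check left the
parsed value untouched.

```java
// Hypothetical stand-in for a float wrapper; not the actual COSFloat source.
final class FloatValueSketch {
    private final float value;
    // Original text, or null when it cannot be trusted to match the value.
    private final String originalText;

    FloatValueSketch(String text) {
        float parsed;
        boolean parsedCleanly = true;
        try {
            parsed = Float.parseFloat(text);
        } catch (NumberFormatException e) {
            // Assumption: malformed input falls back to 0.
            parsed = 0f;
            parsedCleanly = false;
        }
        float checked = coerce(parsed);
        value = checked;
        // Keep the string only if parsing went well AND the check
        // did not replace the parsed value.
        boolean unchanged = Float.compare(checked, parsed) == 0;
        originalText = (parsedCleanly && unchanged) ? text : null;
    }

    // Assumed check: replace infinities and NaN with finite values.
    static float coerce(float f) {
        if (Float.isNaN(f)) {
            return 0f;
        }
        if (Float.isInfinite(f)) {
            return f > 0 ? Float.MAX_VALUE : -Float.MAX_VALUE;
        }
        return f;
    }

    float getValue() {
        return value;
    }

    String getOriginalText() {
        return originalText;
    }
}
```

With this sketch, "1.5" keeps its original text, while "1e100" (which
overflows to infinity and is then clamped) and "0.-262" (which fails to
parse) do not.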

> COSFloat is extremely slow
> --------------------------
>
>                 Key: PDFBOX-4883
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4883
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.20, 3.0.0 PDFBox
>            Reporter: Alfred
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>              Labels: display, optimization, parsing, textextraction
>             Fix For: 3.0.0 PDFBox
>
>         Attachments: After.png, Before.png, PDFBOX-4883-b.patch, 
> PDFBOX-4883.patch, extreme-values-out.pdf
>
>
> I am testing text extraction from PDF and profiling the execution.
> I found that the biggest time consumer is the COSFloat class.
>  
> All other improvements I suggested so far are small compared to this.
> But this is also the most complex one.
>  
> I have attached the profiler output for the same text extraction, with and 
> without the COSFloat changes.
> The time to extract the same text was 4 times longer with the original 
> COSFloat, because of its use of BigDecimal.
> I will try to write extra tests for all cases I see in the original COSFloat 
> code, if they are not already tested.
> Then I will submit for review a new COSFloat version.
>  
> I think this affects parsing and displaying PDFs too, not just text 
> extraction.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
