[ 
https://issues.apache.org/jira/browse/PDFBOX-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2425:
------------------------------------
    Summary: Extracted OCR text has extra spaces (was: Extracted text has extra spaces)

> Extracted OCR text has extra spaces
> -----------------------------------
>
>                 Key: PDFBOX-2425
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2425
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.7, 1.8.10, 1.8.11, 2.0.0
>            Reporter: John Hewson
>         Attachments: WooLam93c-Visible-p1.pdf, WooLam93c.pdf
>
>
> This is a very old issue, originally reported as PDFBOX-37. When PDFTextStripper 
> extracts text from the attached file, it inserts extra spaces into the title text:
> {code}
> A Framework  for D i s t r i bu t ed  Au thor i z a t i on*  
> (Extended Abstract) 
> Thoma s  Y .C .  Woo  S imon  S. L am  
> Depa r tmen t  of  Compu t e r  Sc i ences  
> Th e  Un i v e r s i t y  of  T ex a s  a t  Au s t i n  
> Au s t i n ,  T exa s  78712-1188  
> 1 In t r oduc t i on  
> {code}


