[ 
https://issues.apache.org/jira/browse/PDFBOX-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-3370.
-----------------------------------
    Resolution: Not A Problem
      Assignee: Tilman Hausherr

It's not a bug, it's a feature:
Please open the file I uploaded in Adobe Reader and try to select an "l" (or 
any other glyph). You will notice that the selection rectangle is much larger 
than the glyph. PDFBox has a feature that suppresses overlapping identical 
characters, and it is on by default. You can switch it off with 
stripper.setSuppressDuplicateOverlappingText(false).

{code}
     * By default the text stripper will attempt to remove text that overlaps
     * each other. Word paints the same character several times in order to
     * make it look bold. By setting this to false all text will be extracted,
     * which means that certain sections will be duplicated, but better
     * performance will be noticed.
{code}
If you have control over the generation of the file, use a font that has 
correct font metrics (horiAdvance, horiBearingX).
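
To illustrate why wrong font metrics can swallow the second "l", here is a 
minimal sketch in plain Java. It is not PDFBox code: the Glyph class, the 
extract and helloGlyphs methods, and the tolerance of one third of the 
reported glyph width are assumptions made for this example. It only shows the 
general idea that an identical character placed within a width-based 
tolerance of an earlier one is treated as a duplicate.

```java
import java.util.ArrayList;
import java.util.List;

public class DuplicateSuppressionSketch {

    /** Hypothetical positioned glyph: text, origin, and the width the font reports. */
    static final class Glyph {
        final String text;
        final float x;
        final float y;
        final float width;

        Glyph(String text, float x, float y, float width) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
        }
    }

    /**
     * Concatenates glyph texts. When suppressDuplicates is true, a glyph is
     * dropped if an identical glyph was already kept within the tolerance
     * (assumed here to be one third of the reported width).
     */
    static String extract(List<Glyph> glyphs, boolean suppressDuplicates) {
        List<Glyph> kept = new ArrayList<>();
        StringBuilder out = new StringBuilder();
        for (Glyph g : glyphs) {
            boolean duplicate = false;
            if (suppressDuplicates) {
                float tolerance = g.width / 3.0f;
                for (Glyph k : kept) {
                    if (k.text.equals(g.text)
                            && Math.abs(k.x - g.x) < tolerance
                            && Math.abs(k.y - g.y) < tolerance) {
                        duplicate = true;
                        break;
                    }
                }
            }
            if (!duplicate) {
                kept.add(g);
                out.append(g.text);
            }
        }
        return out.toString();
    }

    /** The word "hello"; the two "l" glyphs sit 3 units apart and report lWidth. */
    static List<Glyph> helloGlyphs(float lWidth) {
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("h", 0f, 0f, 6f));
        glyphs.add(new Glyph("e", 6f, 0f, 6f));
        glyphs.add(new Glyph("l", 12f, 0f, lWidth));
        glyphs.add(new Glyph("l", 15f, 0f, lWidth));
        glyphs.add(new Glyph("o", 18f, 0f, 6f));
        return glyphs;
    }

    public static void main(String[] args) {
        // Reported width 12 is far too large: tolerance 4 > spacing 3.
        System.out.println(extract(helloGlyphs(12f), true));   // helo
        // Suppression switched off: every glyph is kept.
        System.out.println(extract(helloGlyphs(12f), false));  // hello
        // Reported width 3 matches the spacing: tolerance 1 < spacing 3.
        System.out.println(extract(helloGlyphs(3f), true));    // hello
    }
}
```

In this sketch both workarounds mentioned above give the full word: turning 
the suppression off, or supplying a width that matches the real glyph 
spacing.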


>  Error reading the double L
> ---------------------------
>
>                 Key: PDFBOX-3370
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3370
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.7, 1.8.8, 1.8.9, 1.8.10, 1.8.11, 1.8.12, 2.0.0, 2.0.1
>         Environment: Netbeans 8.1
> Java 7 and Java 8
>            Reporter: José Jiménez
>            Assignee: Tilman Hausherr
>            Priority: Critical
>              Labels: extraction, ll, text
>         Attachments: PDFBOX-3370-double-l.pdf
>
>
> When reading some PDFs with words containing "LL", the library returns only 
> one of the two L characters. The same test with the iTextPDF library works 
> properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
