Jaume Xaus created PDFBOX-4782:
----------------------------------

             Summary: PDFLayoutTextStripper(), error on getText() method
                 Key: PDFBOX-4782
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4782
             Project: PDFBox
          Issue Type: Bug
          Components: Text extraction
    Affects Versions: 2.0.18
         Environment: Windows 10
Java Version 1.8.0_241
            Reporter: Jaume Xaus
         Attachments: AN20-0149-0602201842.pdf

I have this code:
PDDocument pdDoc = PDDocument.load(file);
PDFTextStripper stripper = new PDFLayoutTextStripper();
String text = stripper.getText(pdDoc);

With some PDF documents, calling the getText() method throws this error:
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
 at java.lang.String.charAt(String.java:658)
 at com.sagedillepasa.gestion.TextLine.isSpaceCharacterAtIndex(PDFLayoutTextStripper.java:269)
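
The trace suggests that an index of -1 reaches String.charAt(). Below is a minimal, self-contained sketch of that failure mode and one possible guard. The class and method names are hypothetical, and this is not the actual PDFLayoutTextStripper code; it only assumes that the failing method looks up a character without checking the index first.

```java
public class SpaceCheckSketch {

    // Hypothetical unchecked lookup: String.charAt throws
    // StringIndexOutOfBoundsException for a negative or too-large index.
    static boolean isSpaceAtUnchecked(String line, int index) {
        return line.charAt(index) == ' ';
    }

    // Hypothetical guarded variant: any out-of-range index is
    // treated as "not a space" instead of throwing.
    static boolean isSpaceAtChecked(String line, int index) {
        return index >= 0 && index < line.length() && line.charAt(index) == ' ';
    }

    public static void main(String[] args) {
        String line = "a b";
        try {
            isSpaceAtUnchecked(line, -1);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e.getMessage());
        }
        System.out.println(isSpaceAtChecked(line, -1));
        System.out.println(isSpaceAtChecked(line, 1));
    }
}
```

Whether a guard like this is the right fix depends on why the index becomes negative for the attached PDF, which I have not determined.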



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
