[jira] [Commented] (PDFBOX-1502) Not Extracting Text from PDF Document

JIRA Sat, 15 Jun 2013 04:21:36 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684147#comment-13684147
 ]


Andreas Lehmkühler commented on PDFBOX-1502:
--------------------------------------------

As Maruan already said, everything works as expected. To extract the text 
you're looking for, you have to use a different approach; there are some 
examples showing how to handle annotations.
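
For illustration only, here is a minimal sketch of reading annotation contents 
and form field values directly, since PDFTextStripper only extracts text from 
the page content streams. It assumes the PDFBox 1.8.x API (getAllPages(), 
PDField.getValue() returning a String); the class and method names 
AnnotationTextExample, annotationContents and fieldValues are hypothetical, and 
it is not confirmed that this is the example referred to above or that it 
recovers the text in the attached file.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;

// Hypothetical helper class; written against the PDFBox 1.8.x API (assumption).
public class AnnotationTextExample {

    // Collects the /Contents string of every annotation on every page.
    static List<String> annotationContents(PDDocument document) throws IOException {
        List<String> result = new ArrayList<String>();
        for (Object p : document.getDocumentCatalog().getAllPages()) {
            PDPage page = (PDPage) p;
            for (Object a : page.getAnnotations()) {
                String contents = ((PDAnnotation) a).getContents();
                if (contents != null) {
                    result.add(contents);
                }
            }
        }
        return result;
    }

    // Lists "name = value" for the top-level AcroForm fields, if a form exists.
    // Nested (child) fields are not traversed in this sketch.
    static List<String> fieldValues(PDDocument document) throws IOException {
        List<String> result = new ArrayList<String>();
        PDAcroForm form = document.getDocumentCatalog().getAcroForm();
        if (form == null) {
            return result;
        }
        for (Object f : form.getFields()) {
            PDField field = (PDField) f;
            result.add(field.getFullyQualifiedName() + " = " + field.getValue());
        }
        return result;
    }
}
```

Both methods take a PDDocument that has already been loaded, for example with 
PDDocument.load(inputStream) as in the report below, and the caller remains 
responsible for closing it.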
                
> Not Extracting Text from PDF Document
> -------------------------------------
>
>                 Key: PDFBOX-1502
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1502
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 0.8.0-incubator, 1.7.1, 1.8.0
>         Environment: Mac OS, JDK 1.7
>            Reporter: deepak
>            Assignee: Andreas Lehmkühler
>         Attachments: PDFBOX1502-RenewalAdvice.txt, 
> Renewal_Advice_Edited_Extracted_Text.txt, Renewal_Advice_Edited.pdf, Renewal 
> Advice .pdf
>
>
> PDDocument document = PDDocument.load(inputStream);
> PDFTextStripper stripper = new PDFTextStripper();
> stripper.getText(document) does not return some of the text content in the 
> attached PDF document. It returns only the form fields, but their values 
> are empty. The bug is reproducible in both the 1.8.0-SNAPSHOT and 1.7.1 
> codebases.
> Please help in resolving the issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
