[ 
https://issues.apache.org/jira/browse/PDFBOX-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Engle updated PDFBOX-694:
-------------------------

    Description: 
1. When I call PDFTextStripper to extract text from the PDF file 
(000001_2005_1_9.pdf), the title appears at the end of the extracted text. The 
result is 'beforetrim.txt'.
2. The bugs are: 
    2.1 The title text is at the end of the extracted text, but it is at the 
beginning of the document (Snapshot.png).
    2.2 There is white space between the numbers, but Adobe Reader shows no 
space there (Snapshot.png).
    2.3 The page footer is at the start of the text (beforetrim.txt).

  was:When I call PDFTextStripper to extract text from the PDF file. I get the 
title at the end of the text document. I


> When extracting text, the title appears in the wrong position.
> --------------------------------------------------------------
>
>                 Key: PDFBOX-694
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-694
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>         Environment: jdk 1.6.0 u 18
> fedora 12
>            Reporter: Engle
>             Fix For: 1.1.0, 1.2.0
>
>         Attachments: 000001_2005_1_9.pdf, beforetrimed.txt, Screenshot-1.png, 
> Screenshot.png
>
>
> 1. When I call PDFTextStripper to extract text from the PDF file 
> (000001_2005_1_9.pdf), the title appears at the end of the extracted text. The 
> result is 'beforetrim.txt'.
> 2. The bugs are: 
>     2.1 The title text is at the end of the extracted text, but it is at the 
> beginning of the document (Snapshot.png).
>     2.2 There is white space between the numbers, but Adobe Reader shows no 
> space there (Snapshot.png).
>     2.3 The page footer is at the start of the text (beforetrim.txt).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
