[
https://issues.apache.org/jira/browse/PDFBOX-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Lehmkühler closed PDFBOX-694.
-------------------------------------
Resolution: Not A Problem
Assignee: Andreas Lehmkühler
The gaps between the individual digits are too big, so they are recognized as
spaces. I guess we can't change that. If you extract the text using Adobe Reader
(select the text, then copy and paste it into a Unicode-capable editor), you'll
get the very same result.
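
To illustrate why widely spaced digits come out with spaces between them, here
is a minimal sketch of a gap-based heuristic of the kind text extractors
commonly use. It is not PDFBox's actual implementation: the class
GapSpaceSketch, the Glyph type, the join method, and the spaceWidth and
tolerance parameters are all hypothetical names chosen for this example, and
the coordinates are made up.

```java
public class GapSpaceSketch {

    /** One positioned glyph: the character, its left edge, and its width. */
    static final class Glyph {
        final char ch;
        final double x;
        final double width;

        Glyph(char ch, double x, double width) {
            this.ch = ch;
            this.x = x;
            this.width = width;
        }
    }

    /**
     * Joins glyphs left to right and emits a space whenever the horizontal gap
     * to the previous glyph exceeds tolerance * spaceWidth (assumed heuristic).
     */
    static String join(Glyph[] glyphs, double spaceWidth, double tolerance) {
        StringBuilder out = new StringBuilder();
        Glyph prev = null;
        for (Glyph g : glyphs) {
            if (prev != null) {
                double gap = g.x - (prev.x + prev.width);
                if (gap > tolerance * spaceWidth) {
                    out.append(' ');
                }
            }
            out.append(g.ch);
            prev = g;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Digits drawn 5 units wide with 4-unit gaps (made-up, widely tracked).
        Glyph[] tracked = {
            new Glyph('2', 0, 5), new Glyph('0', 9, 5),
            new Glyph('0', 18, 5), new Glyph('5', 27, 5)
        };
        System.out.println(join(tracked, 3.0, 0.5)); // prints "2 0 0 5"
        System.out.println(join(tracked, 3.0, 2.0)); // prints "2005"
    }
}
```

With a small threshold the 4-unit gaps count as spaces, which matches what was
reported. A larger threshold would join the digits, but it would also risk
swallowing real word breaks elsewhere in the document.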
Closed as "Not A Problem"
> When extract text, the title showing in incorrectness position.
> ----------------------------------------------------------------
>
> Key: PDFBOX-694
> URL: https://issues.apache.org/jira/browse/PDFBOX-694
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Environment: jdk 1.6.0 u 18
> fedora 12
> Reporter: Engle
> Assignee: Andreas Lehmkühler
> Attachments: 000001_2005_1_9.pdf, beforetrimed.txt, Screenshot-1.png,
> Screenshot.png
>
>
> 1. When I call PDFTextStripper to extract text from the PDF file
> (000001_2005_1_9.pdf), I get the title at the end of the text document. The
> result is 'beforetrimed.txt'.
> 2. The bugs are:
> 2.1 The title text is at the end of the extracted text, but it is at the
> beginning of the document (Snapshot.png).
> 2.2 There is white space between the digits, but Adobe Reader shows no space
> there (Snapshot.png).
> 2.3 The page footer is at the start of the text (beforetrimed.txt).
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira