[jira] Created: (PDFBOX-781) PDFBOX 1.2.1 Text parsing issue

Mahesh (JIRA) Tue, 20 Jul 2010 10:01:46 -0700

PDFBOX 1.2.1 Text parsing issue
-------------------------------

                 Key: PDFBOX-781
                 URL: https://issues.apache.org/jira/browse/PDFBOX-781
             Project: PDFBox
          Issue Type: Bug
          Components: Text extraction
    Affects Versions: 1.2.1
         Environment: Windows XP, Java
            Reporter: Mahesh
            Priority: Trivial



Hi,

Thanks a lot for PDFBOX.
I have been using PDFBox 1.2.1 for text parsing. I have customized my text 
parsing class by extending the PDFTextStripper class.
The issue is: although I am able to get all the required string data (such as x/y 
position, width, height, font name, font size), the text extracted using the 
TextPosition object's getCharacter() contains the full line of text except for the 
last character. This last character appears as the text of the next line.

        Ex: (Line in PDF): "My name is Mahesh"
            (Parsed data): "My name is Mahes"
                           "h"
Please help me in this regard.
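
One way to check whether the stray character really sits on the same baseline as the rest of the line is to regroup the extracted glyphs by their y position. The following is only a minimal sketch of that idea, not PDFBox's own line-detection logic: the Glyph class is a hypothetical stand-in for the text, x and y values read from each TextPosition, and the class name LineRegrouper and the tolerance value are assumptions chosen for illustration. It runs without PDFBox on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class LineRegrouper {

    /** Hypothetical stand-in for the values read from one TextPosition. */
    static final class Glyph {
        final String text;
        final float x;
        final float y;

        Glyph(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Joins glyphs, in extraction order, into lines. A new line is started
     * only when a glyph's y differs from the y of the first glyph on the
     * current line by more than the given tolerance.
     */
    static List<String> regroup(List<Glyph> glyphs, float tolerance) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        float lineY = 0f;
        for (Glyph g : glyphs) {
            // Close the current line if this glyph is too far away vertically.
            if (current.length() > 0 && Math.abs(g.y - lineY) > tolerance) {
                lines.add(current.toString());
                current.setLength(0);
            }
            // Remember the baseline of the first glyph on each line.
            if (current.length() == 0) {
                lineY = g.y;
            }
            current.append(g.text);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up coordinates: every glyph shares the same y value.
        List<Glyph> glyphs = new ArrayList<>();
        String text = "My name is Mahesh";
        for (int i = 0; i < text.length(); i++) {
            glyphs.add(new Glyph(String.valueOf(text.charAt(i)), 10f + 5f * i, 700f));
        }
        System.out.println(regroup(glyphs, 1.0f));
    }
}
```

If the final "h" has the same y value as the preceding characters, a regrouping like this keeps it on the same line, which would suggest the split happens when the line is written out and not in the position data itself. If its y value differs, the split originates in the position data.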


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
