[ https://issues.apache.org/jira/browse/PDFBOX-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Carrier resolved PDFBOX-61.
---------------------------------
Resolution: Fixed
This was resolved with the patch for PDFBOX-349. Note that text extracted from
the file attached to this issue still has some minor spacing issues, but they
are essentially the same issues that Adobe Reader has.
> Spaces in extracted file
> ------------------------
>
> Key: PDFBOX-61
> URL: https://issues.apache.org/jira/browse/PDFBOX-61
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1208824
> Originally submitted by nobody on 2005-05-25 16:40.
> While trying to integrate with Lucene, I was having
> problems. The Lucene people suggested that I check
> the output of the extract utility against one of my test PDFs.
> When I did, I saw spaces placed inside many of the
> words. I was on version 0.7.0, so I downloaded 0.7.1
> and saw the same results.
> One of the test files where I see this issue is attached.
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1208824&file_id=135995
> Tom_3.pdf (application/pdf), 10145 bytes
> Test PDF file.