[ https://issues.apache.org/jira/browse/PDFBOX-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064465#comment-13064465 ]
Funfel edited comment on PDFBOX-1061 at 7/13/11 10:09 AM:
----------------------------------------------------------
I spotted this problem while working with this file:
PageFrom_DURP2011_115_0666_01_p7.pdf
I've attached one page from it that contains bullet punctuation.
was (Author: funfel):
I spotted this problem while working with this file.
I've attached one page that contains bullet punctuation.
> PDFBox can't correctly extract text after bullet punctuation
> ------------------------------------------------------------
>
> Key: PDFBOX-1061
> URL: https://issues.apache.org/jira/browse/PDFBOX-1061
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, Text extraction
> Affects Versions: 1.5.0, 1.6.0
> Environment: jdk.1.6
> Reporter: Funfel
> Fix For: 1.7.0
>
> Attachments: PageFrom_DURP2011_115_0666_01_p7.html,
> PageFrom_DURP2011_115_0666_01_p7.pdf
>
>
> PDFBox does not extract text correctly after bullet punctuation.
> After a bullet, the whole line is extracted with a garbled encoding, but the next
> line is extracted correctly.
> This is probably a font/encoding problem.