[
https://issues.apache.org/jira/browse/PDFBOX-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
John Hewson updated PDFBOX-317:
-------------------------------
Component/s: (was: Text extraction)
> PDFont.getStringWidth() returns incorrect values
> ------------------------------------------------
>
> Key: PDFBOX-317
> URL: https://issues.apache.org/jira/browse/PDFBOX-317
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.6.0, 2.0.0
> Fix For: 2.0.0
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1819754
> Originally submitted by brettpowley on 2007-10-24 23:59.
> For some text in some documents, getStringWidth() returns an incorrect value.
> In some cases it returns zero, which is clearly not correct. In others, it
> returns something that is too short. An example of this follows:
> On the page, this text is part of text that reads "Cash flows from". The
> text in question is delivered to flushText in PDFTextStripper as multiple
> TextPositions, and the ones below are those containing "w" and the next one
> containing "s fr".
> The first one looks like this:
> TextPosition: "w"
> getX=62.824474
> getWidth=6.731968
> getWordSpacing=0.000000
> getWidthOfSpace=2.224000
> getXScale=1.000000
> glyphFactor=999.999939, getXScale=1.000000, getStringWidth=814.000000,
> calculatedFontWidth=0.814000
> averageWidth=0.546769,
> widthUsingSpaces=2.224000
> widthUsingFont=0.546769
> Note that, according to getStringWidth(), the width of this text is 0.814
> (the calculatedFontWidth above), meaning it would end at 62.82 + 0.814 = 63.63.
> According to getWidth(), it ought to end at 62.82 + 6.73 = 69.55.
> When we look at the next chunk of text:
> TextPosition: "s fr"
> getX=69.336563 getWidth=12.518410
> we see that it does in fact start immediately after the previous one -- so
> the width from getStringWidth() for the first one was incorrect.
> The font is a PDType1Font and its name appears to be
> "YOTPKO+HelveticaNeue-Bold*1".
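
The comparison in the report can be reproduced with a minimal sketch. The class
WidthCheck and its method names are hypothetical and not part of PDFBox; the
constants are copied from the TextPosition dump above, and the division by 1000
is an assumption based on the reported glyphFactor of roughly 1000. The sketch
only restates the reporter's arithmetic and does not diagnose the cause.

```java
public class WidthCheck {
    // Values copied from the TextPosition dump for "w" and "s fr".
    static final double X_W = 62.824474;          // getX() of "w"
    static final double WIDTH_W = 6.731968;       // getWidth() of "w"
    static final double STRING_WIDTH_W = 814.0;   // getStringWidth() of "w"
    static final double X_NEXT = 69.336563;       // getX() of "s fr"

    // Assumption: getStringWidth() is in 1/1000 units, as the glyphFactor suggests.
    static double endFromStringWidth() {
        return X_W + STRING_WIDTH_W / 1000.0;
    }

    // End position implied by getWidth().
    static double endFromWidth() {
        return X_W + WIDTH_W;
    }

    // Distance between a predicted end of "w" and the actual start of "s fr".
    static double gapToNext(double predictedEnd) {
        return X_NEXT - predictedEnd;
    }

    public static void main(String[] args) {
        double a = endFromStringWidth();
        double b = endFromWidth();
        System.out.printf("end via getStringWidth: %.3f (gap to next: %.3f)%n", a, gapToNext(a));
        System.out.printf("end via getWidth:       %.3f (gap to next: %.3f)%n", b, gapToNext(b));
    }
}
```

The gap computed from getWidth() is a small fraction of a unit, while the gap
computed from the scaled getStringWidth() value is several units wide, which is
the discrepancy the report describes.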
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)