[ 
https://issues.apache.org/jira/browse/PDFBOX-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299993#comment-14299993
 ] 

John Hewson edited comment on PDFBOX-2649 at 1/31/15 9:28 PM:
--------------------------------------------------------------

PDFont.getStringWidth is one of the few 1.8 methods which hasn't been rewritten 
yet in 2.0 and is known to be broken. There's a "todo" comment to that effect 
in the code. Obviously this is something which we need to address.

{quote}
I can see that there is a confusion in the process between GID and CID values. 
The reason may not be entirely clear to me, but 
PDCIDFontType2Embedder.buildWidths(COSDictionary cidFont) seems to name "cid" 
something that in my opinion is still a glyph id. 
{quote}

You're probably not going to find these sorts of basic mistakes in the newer 
2.0 code; it's been fairly well tested. That method builds an _Identity_ 
CIDToGIDMap, where CID = GID. It's also not related to the glyph widths you get 
from getStringWidth().

{quote}
when it comes to PDCIDFont.getWidth(int), the "widths" map that should 
presumably contain cid->width values in reality contains gid->width.
{quote}

No, the widths array maps CIDs to widths. This is correct.
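
To illustrate why a CID-keyed table and a GID-keyed table coincide under an Identity CIDToGIDMap, here is a rough sketch. CidWidths and its methods are hypothetical names for this illustration only, not PDFBox classes, and the widths are made-up numbers.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: CidWidths is a hypothetical helper, not part of PDFBox.
final class CidWidths
{
    // Re-keys a GID->width table by CID, using cidToGid[cid] as the lookup.
    static Map<Integer, Float> byCid(Map<Integer, Float> widthByGid, int[] cidToGid)
    {
        Map<Integer, Float> widthByCid = new TreeMap<>();
        for (int cid = 0; cid < cidToGid.length; cid++)
        {
            Float width = widthByGid.get(cidToGid[cid]);
            if (width != null)
            {
                widthByCid.put(cid, width);
            }
        }
        return widthByCid;
    }

    // Identity mapping: every CID maps to the GID with the same number.
    static int[] identity(int glyphCount)
    {
        int[] cidToGid = new int[glyphCount];
        for (int i = 0; i < glyphCount; i++)
        {
            cidToGid[i] = i;
        }
        return cidToGid;
    }
}
```

With identity(n) the re-keyed table equals the input table, so calling the key "cid" or "gid" makes no difference there. Only a non-identity mapping would make the two tables differ.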

In summary, the line with the "todo" is the cause of the problem:

{code}
public float getStringWidth(String text) throws IOException
{
    float width = 0;
    int offset = 0, length = text.length();
    while (offset < length)
    {
        int codePoint = text.codePointAt(offset);
        offset += Character.charCount(codePoint);
        // todo: *no* getWidth expects a PDF char code, not a Unicode code point
        width += getWidth(codePoint);
    }
    return width;
}
{code}
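
A minimal sketch of the kind of fix this implies, assuming the font can translate a Unicode code point into its own character code before the width lookup. ToyFont, addGlyph, and both maps are hypothetical and only model the idea; they are not the PDFBox API, and the codes and widths are invented.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: ToyFont is a hypothetical class, not part of PDFBox.
final class ToyFont
{
    private final Map<Integer, Integer> unicodeToCode = new HashMap<>();
    private final Map<Integer, Float> widthByCode = new HashMap<>();

    void addGlyph(int codePoint, int code, float width)
    {
        unicodeToCode.put(codePoint, code);
        widthByCode.put(code, width);
    }

    // Width lookup keyed by character code; unknown codes get width 0.
    float getWidth(int code)
    {
        Float width = widthByCode.get(code);
        return width == null ? 0f : width;
    }

    // Mirrors the loop above: the code point is passed straight to getWidth.
    float brokenStringWidth(String text)
    {
        float width = 0;
        int offset = 0, length = text.length();
        while (offset < length)
        {
            int codePoint = text.codePointAt(offset);
            offset += Character.charCount(codePoint);
            width += getWidth(codePoint);
        }
        return width;
    }

    // Translates each code point to a character code before the lookup.
    float stringWidth(String text)
    {
        float width = 0;
        int offset = 0, length = text.length();
        while (offset < length)
        {
            int codePoint = text.codePointAt(offset);
            offset += Character.charCount(codePoint);
            Integer code = unicodeToCode.get(codePoint);
            if (code == null)
            {
                throw new IllegalArgumentException("No glyph for U+"
                        + Integer.toHexString(codePoint).toUpperCase());
            }
            width += getWidth(code);
        }
        return width;
    }
}
```

In this model the broken variant only returns sensible widths when a character's code happens to equal its Unicode code point, which would explain relative widths that look arbitrary.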


> Character widths incorrect in a loaded font
> -------------------------------------------
>
>                 Key: PDFBOX-2649
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2649
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.0
>            Reporter: Alex Nevidomsky
>
> {code}
> @Test
> public void testRelativeWidth() throws IOException {
>     PDFont font = PDType0Font.load(document,
>         this.getClass().getResourceAsStream("/LiberationSans-Regular.ttf"));
>     float wO = font.getStringWidth("O");
>     float wP = font.getStringWidth("P");
>     float wN = font.getStringWidth("N");
>     Assert.assertTrue("O must be wider than P", wO > wP);
>     Assert.assertTrue("O must be wider than N", wO > wN);
> }
> {code}
> I can see that there is a confusion in the process between GID and CID 
> values. The reason may not be entirely clear to me, but 
> PDCIDFontType2Embedder.buildWidths(COSDictionary cidFont) seems to name "cid" 
> something that in my opinion is still a glyph id. And when it comes to 
> PDCIDFont.getWidth(int), the "widths" map that should presumably contain 
> cid->width values in reality contains gid->width.


