[ 
https://issues.apache.org/jira/browse/PDFBOX-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074889#comment-15074889
 ] 

Tilman Hausherr commented on PDFBOX-3175:
-----------------------------------------

The file from PDFBOX-1001 works fine with DrawPrintTextLocations (i.e. the 
marks are as expected), but your own file does have problems, so I won't close 
this issue now :-) The red marks are too small, and the blue ones are 
completely wrong, probably because of the rotation. I'll need to compare this 
with 1.8 and/or find out why the marks are so small, fix 
DrawPrintTextLocations, and then we'll see... (Sometimes bad red marks come 
from incorrect data in the PDF itself.)

I'll be busy with other stuff today, so be patient. And if you do write your 
own text extraction, compare it with the existing tests if you intend to 
extract text from different types of files.

> PDFTextStreamEngine probably miscalculates text height
> ------------------------------------------------------
>
>                 Key: PDFBOX-3175
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3175
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 2.0.0
>            Reporter: Leo
>
> When parsing a PDF document, TextPosition is created with a constant text 
> height, about two times smaller than the character width, regardless of 
> font size. The following workaround to calculate dyDisplay fixes the issue:
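>         // Glyph space for most fonts is 1/1000 of a text space unit.
>         float verticalScaling = 1/1000f;
>         // Type 3 fonts define their own glyph space through the font matrix.
>         if (font instanceof PDType3Font) {
>             Matrix fontMatrix = font.getFontMatrix();
>             verticalScaling = fontMatrix.getValue(1, 1);
>         }
>         // Bounding-box height scaled by the font size.
>         float dyDisplay = bbox.getHeight() * fontSize * verticalScaling;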
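
The arithmetic in the quoted workaround can be tried out in isolation. The 
following is only a minimal sketch under stated assumptions: it uses no PDFBox 
types, the class and method names (TextHeightSketch, displayHeight) are 
hypothetical, and the Type 3 vertical scale is passed in directly instead of 
being read from a font matrix. It is not the PDFTextStreamEngine code.

```java
/** Stand-alone sketch of the workaround's arithmetic; all names are hypothetical. */
final class TextHeightSketch {

    /** Glyph-space scale for non-Type 3 fonts: 1000 glyph units per text-space unit. */
    static final float DEFAULT_GLYPH_SCALE = 1 / 1000f;

    /**
     * Scales a bounding-box height to the font size.
     *
     * @param bboxHeight         height of the font bounding box, in glyph-space units
     * @param fontSize           font size, in text-space units
     * @param type3VerticalScale element (1,1) of a Type 3 font matrix,
     *                           or null for any other font type (assumption of this sketch)
     */
    static float displayHeight(float bboxHeight, float fontSize, Float type3VerticalScale) {
        float verticalScaling =
                (type3VerticalScale != null) ? type3VerticalScale : DEFAULT_GLYPH_SCALE;
        return bboxHeight * fontSize * verticalScaling;
    }

    public static void main(String[] args) {
        // Same bounding box, two font sizes: the height follows the font size.
        System.out.println(displayHeight(720f, 12f, null));
        System.out.println(displayHeight(720f, 24f, null));
        // Type 3 font with an illustrative vertical scale of 0.01.
        System.out.println(displayHeight(10f, 12f, 0.01f));
    }
}
```

With this formula the computed height grows linearly with the font size, 
which is the behaviour the report says is missing from TextPosition.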



