On 29.06.2017 at 16:20, Esteban R wrote:
Hello. I'm using DrawPrintTextLocations.java (from 
http://svn.apache.org/repos/asf/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/util/DrawPrintTextLocations.java)

to find the boxes for a character in the following pdf (a simplified version of 
a real life pdf):

http://wikisend.com/download/727732/boxissue_type3_font.pdf

but none of the boxes matches the printed character (see near the bottom of the 
page).

This is the output that I get:

http://wikisend.com/download/129910/boxissue_type3_font-marked-1.png


I have noticed that some values in the font are negative, so I don't 
know whether the file (or font) is corrupted or there is an issue in PDFBox (or in 
DrawPrintTextLocations). How can I get a tightly enclosing box for the 
character?


Esteban


The charproc has this in the first line:

1 0 -10 -10 10 10 d1


PDF specification:

====

wx wy llx lly urx ury d1

Set width and bounding box information for the glyph and declare that the glyph description specifies only shape, not colour.

wx denotes the horizontal displacement in the glyph coordinate system; it shall be consistent with the corresponding width in the font’s Widths array. wy shall be 0 (see 9.2.4, "Glyph Positioning and Metrics"). llx and lly denote the coordinates of the lower-left corner, and urx and ury denote the upper-right corner, of the glyph bounding box. The glyph bounding box is the smallest rectangle, oriented with the axes of the glyph coordinate system, that completely encloses all marks placed on the page as a result of executing the glyph’s description. The declared bounding box shall be correct—in other words, sufficiently large to enclose the entire glyph. If any marks fall outside this bounding box, the result is unpredictable.

====

So the bounding box declared here (-10 -10 10 10) is far too large for the glyph. It is even larger than the font's overall bounding box, which can be found at

Root/Pages/Kids/[0]/Resources/XObject/TPL1/Resources/Font/R20/FontBBox

To get a perfect bounding box for the cyan rectangle, the only thing one could do would be to evaluate the content stream in the charproc:

1 0 -10 -10 10 10 d1
0.01 0 0 0.01 0 0 cm
11.7188 0 m
2.05078 58.4961 l
8.44922 58.4961 l
16.1602 8.59375 l
24.5117 58.4961 l
32.3242 58.4961 l
41.3086 8.59375 l
48.3398 58.4961 l
54.6406 58.4961 l
45.0703 0 l
37.9883 0 l
28.5664 48.2422 l
19.4805 0 l
11.7188 0 l
f

This would be quite tricky... I don't know how Adobe and PDF.js do it. Maybe they drop the individual bounding box and just use the width as height.
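
For this particular charproc, which only uses cm, m, l and f, the evaluation can be sketched in plain Java. This is only a minimal sketch under stated assumptions, not what PDFBox does: GlyphBBox and tightBox are hypothetical names, the stream is taken as already decoded text, and curves (c, v, y), rectangles (re), q/Q and stroke widths are not handled. A real implementation would need all of those.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class GlyphBBox {

    /**
     * Returns {llx, lly, urx, ury} of all m/l points in glyph space,
     * or null if the stream contains no such points.
     */
    static double[] tightBox(String charProc) {
        // Current transformation matrix as [a b c d e f], initially identity.
        double[] ctm = {1, 0, 0, 1, 0, 0};
        List<Double> args = new ArrayList<>();
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;

        for (String tok : charProc.trim().split("\\s+")) {
            if (tok.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)")) {
                args.add(Double.parseDouble(tok)); // numeric operand
                continue;
            }
            if (tok.equals("cm") && args.size() == 6) {
                ctm = concat(args, ctm);
            } else if ((tok.equals("m") || tok.equals("l")) && args.size() == 2) {
                double x = args.get(0), y = args.get(1);
                // Map the point through the CTM before comparing.
                double tx = ctm[0] * x + ctm[2] * y + ctm[4];
                double ty = ctm[1] * x + ctm[3] * y + ctm[5];
                minX = Math.min(minX, tx);
                minY = Math.min(minY, ty);
                maxX = Math.max(maxX, tx);
                maxY = Math.max(maxY, ty);
            }
            // Any other operator (d1, f, ...) is ignored here.
            args.clear();
        }
        return minX > maxX ? null : new double[] {minX, minY, maxX, maxY};
    }

    // cm semantics: the new matrix is multiplied in front of the current one.
    private static double[] concat(List<Double> m, double[] c) {
        double a = m.get(0), b = m.get(1), cc = m.get(2);
        double d = m.get(3), e = m.get(4), f = m.get(5);
        return new double[] {
            a * c[0] + b * c[2],          a * c[1] + b * c[3],
            cc * c[0] + d * c[2],         cc * c[1] + d * c[3],
            e * c[0] + f * c[2] + c[4],   e * c[1] + f * c[3] + c[5]
        };
    }

    public static void main(String[] unused) {
        String charProc = String.join("\n",
            "1 0 -10 -10 10 10 d1", "0.01 0 0 0.01 0 0 cm",
            "11.7188 0 m", "2.05078 58.4961 l", "8.44922 58.4961 l",
            "16.1602 8.59375 l", "24.5117 58.4961 l", "32.3242 58.4961 l",
            "41.3086 8.59375 l", "48.3398 58.4961 l", "54.6406 58.4961 l",
            "45.0703 0 l", "37.9883 0 l", "28.5664 48.2422 l",
            "19.4805 0 l", "11.7188 0 l", "f");
        double[] box = tightBox(charProc);
        System.out.println(box[0] + " " + box[1] + " " + box[2] + " " + box[3]);
    }
}
```

For the stream above this gives a box of roughly 0.02 0 0.55 0.58 in glyph space, which is much smaller than the declared -10 -10 10 10. The result would still have to be mapped through the font matrix and the text matrix to get page coordinates.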

Tilman


