On 29.06.2017 at 16:20, Esteban R wrote:
Hello. I'm using DrawPrintTextLocations.java (from
http://svn.apache.org/repos/asf/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/util/DrawPrintTextLocations.java)
to find the boxes for a character in the following pdf (a simplified version of
a real life pdf):
http://wikisend.com/download/727732/boxissue_type3_font.pdf
but none of the boxes matches the printed character (see near the bottom of the
page).
This is the output that I get:
http://wikisend.com/download/129910/boxissue_type3_font-marked-1.png
I have noticed that some values in the font are negative, so I don't
know whether the file (or font) is corrupted or there is an issue in PDFBox (or
in DrawPrintTextLocations). How can I get a tightly enclosing box for the
character?
Esteban
The charproc has this in the first line:
1 0 -10 -10 10 10 d1
PDF specification:
====
wx wy llx lly urx ury d1
Set width and bounding box information for the glyph and declare that
the glyph description specifies only shape, not colour.
wx denotes the horizontal displacement in the glyph coordinate system;
it shall be consistent with the corresponding width in the font’s Widths
array. wy shall be 0 (see 9.2.4, "Glyph Positioning and Metrics").
llx and lly denote the coordinates of the lower-left corner, and urx and
ury denote the upper-right corner, of the glyph bounding box. The glyph
bounding box is the smallest rectangle, oriented with the axes of the
glyph coordinate system, that completely encloses all marks placed on
the page as a result of executing the glyph’s description. The declared
bounding box shall be correct—in other words, sufficiently large to
enclose the entire glyph. If any marks fall outside this bounding box,
the result is unpredictable.
====
So the bounding box declared here (-10 -10 10 10) is much too large. It is
even larger than the font's overall bounding box, which can be found at
Root/Pages/Kids/[0]/Resources/XObject/TPL1/Resources/Font/R20/FontBBox
To get a perfect bounding box for the cyan rectangle, the only thing one
could do would be to evaluate the content stream in the charproc:
1 0 -10 -10 10 10 d1
0.01 0 0 0.01 0 0 cm
11.7188 0 m
2.05078 58.4961 l
8.44922 58.4961 l
16.1602 8.59375 l
24.5117 58.4961 l
32.3242 58.4961 l
41.3086 8.59375 l
48.3398 58.4961 l
54.6406 58.4961 l
45.0703 0 l
37.9883 0 l
28.5664 48.2422 l
19.4805 0 l
11.7188 0 l
f
This would be quite tricky... I don't know how Adobe and PDF.js do it.
Maybe they drop the individual bounding box and just use the width as the
height.
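
To illustrate what "evaluating the content stream" could look like, here is a
minimal sketch in plain Java. It does not use the PDFBox API. The class
CharProcBounds and the method tightBounds are hypothetical names invented for
this example. It assumes the charproc contains only the operators seen above
(d1, cm, m, l, f) and splits tokens on whitespace; curves, rectangles, q/Q and
stroke widths are not handled, so a real implementation would need a proper
content stream parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
class CharProcBounds {

    /**
     * Returns {minX, minY, maxX, maxY} in glyph space for the m/l path
     * points of a charproc, or null if the stream has no such points.
     */
    static double[] tightBounds(String charProc) {
        // Current transformation matrix {a, b, c, d, e, f}, initially identity.
        double[] ctm = {1, 0, 0, 1, 0, 0};
        List<Double> operands = new ArrayList<>();
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        boolean found = false;

        for (String token : charProc.trim().split("\\s+")) {
            if (token.matches("[-+]?(\\d+\\.?\\d*|\\.\\d+)")) {
                operands.add(Double.parseDouble(token));
                continue;
            }
            // Anything that is not a number is treated as an operator.
            if (token.equals("cm") && operands.size() == 6) {
                ctm = concat(operands, ctm);
            } else if ((token.equals("m") || token.equals("l")) && operands.size() == 2) {
                double x = operands.get(0), y = operands.get(1);
                // Map the path point through the current matrix.
                double tx = ctm[0] * x + ctm[2] * y + ctm[4];
                double ty = ctm[1] * x + ctm[3] * y + ctm[5];
                minX = Math.min(minX, tx);
                minY = Math.min(minY, ty);
                maxX = Math.max(maxX, tx);
                maxY = Math.max(maxY, ty);
                found = true;
            }
            // d1, f and any other operator just discard their operands here.
            operands.clear();
        }
        return found ? new double[] {minX, minY, maxX, maxY} : null;
    }

    /** Premultiplies the cm operands onto the current matrix (m x ctm). */
    private static double[] concat(List<Double> m, double[] ctm) {
        double a = m.get(0), b = m.get(1), c = m.get(2);
        double d = m.get(3), e = m.get(4), f = m.get(5);
        return new double[] {
            a * ctm[0] + b * ctm[2],
            a * ctm[1] + b * ctm[3],
            c * ctm[0] + d * ctm[2],
            c * ctm[1] + d * ctm[3],
            e * ctm[0] + f * ctm[2] + ctm[4],
            e * ctm[1] + f * ctm[3] + ctm[5]
        };
    }

    public static void main(String[] args) {
        // The charproc quoted above, joined into one string.
        String charProc = "1 0 -10 -10 10 10 d1 0.01 0 0 0.01 0 0 cm "
                + "11.7188 0 m 2.05078 58.4961 l 8.44922 58.4961 l "
                + "16.1602 8.59375 l 24.5117 58.4961 l 32.3242 58.4961 l "
                + "41.3086 8.59375 l 48.3398 58.4961 l 54.6406 58.4961 l "
                + "45.0703 0 l 37.9883 0 l 28.5664 48.2422 l 19.4805 0 l "
                + "11.7188 0 l f";
        double[] b = tightBounds(charProc);
        System.out.printf("llx=%.4f lly=%.4f urx=%.4f ury=%.4f%n",
                b[0], b[1], b[2], b[3]);
    }
}
```

For the charproc above this gives a box of roughly 0.02 to 0.55 horizontally
and 0 to 0.58 vertically in glyph space, far smaller than the declared
-10 -10 10 10. The result would still have to be mapped through the font's
FontMatrix and the text rendering matrix to get page coordinates.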
Tilman