Hi,
Please retry with the current version, PDFBox 2.0.18 (2.0.19 is due soon).
Then run the DrawPrintTextLocations.java example and check whether the
cyan bounds are correct. If they are not, please open an issue for that.
As for the showText problem: don't reuse font subsets.
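In case it helps while checking the bounds, here is a rough sketch of
the kind of mapping involved. It is not the code of the example, just
plain java.awt.geom. It assumes the accent is drawn as its own glyph, so
that its final place on the page comes from the text matrix and not from
the glyph outline; whether that holds for your file is what the check
should tell you. The class name, the method name and all numbers are
made up for illustration. In PDFBox the inputs would come from
TextPosition.getTextMatrix(), PDFont.getFontMatrix() and the bounds of
the path returned by getPath(name).

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

// Hypothetical helper, not part of PDFBox.
final class GlyphPlacementSketch {

    /**
     * Maps glyph bounds from glyph space (1000 units per em for a
     * typical Type 1 font) to user space.
     */
    static Rectangle2D toUserSpace(Rectangle2D glyphBounds,
                                   AffineTransform fontMatrix,
                                   AffineTransform textMatrix) {
        AffineTransform at = new AffineTransform(textMatrix);
        // concatenate() applies fontMatrix to a point first and
        // textMatrix second: glyph space -> text space -> user space
        at.concatenate(fontMatrix);
        return at.createTransformedShape(glyphBounds).getBounds2D();
    }

    public static void main(String[] args) {
        // Usual Type 1 font matrix: 1/1000 scaling
        AffineTransform fontMatrix =
                AffineTransform.getScaleInstance(0.001, 0.001);

        // Made-up outline bounds of a dieresis glyph, in glyph units.
        // The outline is the same whatever the base character is.
        Rectangle2D dieresis = new Rectangle2D.Double(103, 565, 293, 104);

        // Made-up text matrices for a 10 pt font: the second one is
        // raised by 2.5 units, as it might be over an uppercase letter.
        AffineTransform overLower =
                new AffineTransform(10, 0, 0, 10, 100, 700);
        AffineTransform overUpper =
                new AffineTransform(10, 0, 0, 10, 100, 702.5);

        System.out.println(toUserSpace(dieresis, fontMatrix, overLower));
        System.out.println(toUserSpace(dieresis, fontMatrix, overUpper));
    }
}
```

If the two rectangles differ only in y, the outline is identical and the
shift comes from the text matrix alone.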
Tilman
On 20.02.2020 at 04:00, jd9...@rit.edu wrote:
Hello,
I am currently a researcher at RIT's DPRL, using PDFBox 2.0.7 with the
MHVHUS+CMR10 Type 1 font and PDFTextStripper. I am interested in
finding the matrix (or values) used to translate diacritic elements,
or a similar way to find the positioning of diacritic elements.
In my example, the Type 1 font is an embedded subset within the PDF
document and uses Type1Encoding. When I access the glyph for a
diacritic, e.g. dieresis, through getPath, the path sits above the
lowercase characters. For uppercase characters I can also get the
diacritic, but its path is at the same position as for lowercase
characters rather than above the uppercase character. In addition, the
name is that of the combining diacritic, e.g. dieresiscmb, which isn't
available in getCharStringsDict or getCharSet.
On a side note, combining diacritic names cause problems when using
the PDPageContentStream class to showText the Unicode character: it
throws an IllegalArgumentException saying that the combining diacritic
does not exist in the font, even though the character's TextPosition
and font were obtained using PDFTextStripper. Let me know if I should
open a ticket for this issue.
How are the diacritical accents for Type 1 fonts translated from their
stored location into place?
diacriticdieresis.png
(I have cc'd my advisor)
Thank you,
Jessica Diehl