Hi,
Please retry with the current version, which is PDFBox 2.0.18, soon 2.0.19.
Then use the DrawPrintTextLocations.java example to see if the cyan bounds are correct. If not, please open an issue for that one. Don't reuse font subsets.
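
To illustrate why one glyph outline can end up at different heights, here is a minimal sketch that uses only java.awt.geom and no PDFBox calls. It assumes the accent is drawn as a separate glyph (as TeX output often does), that the outline is in glyph space at 1000 units per em, and that the text matrix in effect for that glyph moves it into place. The class name, the glyph box and the matrix values are all made up for illustration; they are not taken from your file.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

public class GlyphPlacementSketch {

    /**
     * Maps a glyph-space bounding box (assumed 1000 units per em, as is
     * usual for Type 1 fonts) to page space. The text matrix is expected
     * to already include the font size.
     */
    static Rectangle2D placeGlyph(Rectangle2D glyphBox, AffineTransform textMatrix) {
        AffineTransform at = new AffineTransform(textMatrix);
        // Glyph space -> text space: points are scaled first, then the text matrix applies.
        at.scale(1 / 1000.0, 1 / 1000.0);
        return at.createTransformedShape(glyphBox).getBounds2D();
    }

    public static void main(String[] args) {
        // Hypothetical dieresis outline box in glyph space.
        Rectangle2D dieresis = new Rectangle2D.Double(100, 550, 300, 120);

        // Hypothetical text matrices (a, b, c, d, e, f) for a 10pt font:
        // same outline, but the producer raised the accent by 2.5 units
        // when it sits over an uppercase letter.
        AffineTransform overLower = new AffineTransform(10, 0, 0, 10, 100, 700);
        AffineTransform overUpper = new AffineTransform(10, 0, 0, 10, 100, 702.5);

        System.out.println("over lowercase: " + placeGlyph(dieresis, overLower));
        System.out.println("over uppercase: " + placeGlyph(dieresis, overUpper));
    }
}
```

In this sketch the outline is identical in both cases and only the matrix differs, so if your file behaves this way the offset would have to come from the matrix of each text position rather than from the glyph path.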
Tilman

On 20.02.2020 at 04:00, jd9...@rit.edu wrote:
Hello,

I am currently a researcher at RIT's DPRL, using PDFBox 2.0.7 with the MHVHUS+CMR10 Type 1 font and PDFTextStripper. I am interested in finding the matrix (or values) used to translate diacritic elements, or a similar way to find the position of diacritic elements.

In my example, the Type 1 font is an embedded subset within the PDF document, using Type1Encoding. When I access the glyph for a diacritic element, e.g. dieresis, through getPath, the path is positioned above the lowercase characters. For uppercase characters I can also get the diacritic, but the path is in the same position as for lowercase characters rather than above the uppercase character. In addition, the glyph name is that of the combining diacritic, e.g. dieresiscmb, which isn't available in getCharStringsDict or getCharSet.

On a side note, combining diacritical names cause problems when using the PDPageContentStream class to showText the Unicode text: the result is an IllegalArgumentException saying that the combining diacritic does not exist in the font, even when the character's TextPosition and font were parsed using PDFTextStripper. Let me know if I should open a ticket for this issue.

How are the diacritical accents for Type 1 fonts translated from their stored location into place?
diacriticdieresis.png
(I have cc'd my advisor)

Thank you,
Jessica Diehl
