[
https://issues.apache.org/jira/browse/PDFBOX-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375103#comment-17375103
]
Daniel Gredler commented on PDFBOX-5230:
----------------------------------------
Yep, that seems to be the behavior for PDFKit.
I also tried the same with standard Java2D, and the ZWNJ chars are not
displayed there either (code below, output attached as zwnj.png).
{code:java}
import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class ZwnjG2dTest {
    public static void main(String[] args) throws Exception {
        BufferedImage img = new BufferedImage(500, 300, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g2d = img.createGraphics();
        g2d.setRenderingHint(RenderingHints.KEY_FRACTIONALMETRICS,
                RenderingHints.VALUE_FRACTIONALMETRICS_ON);
        g2d.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                RenderingHints.VALUE_ANTIALIAS_ON);

        // White background, black text
        g2d.setColor(Color.WHITE);
        g2d.fillRect(0, 0, img.getWidth(), img.getHeight());
        g2d.setColor(Color.BLACK);

        // In all strings below, U+200C = zero width non-joiner
        Font tahoma = Font.createFont(Font.TRUETYPE_FONT,
                new File("C:/Windows/Fonts/tahoma.ttf")).deriveFont(50f);
        g2d.setFont(tahoma);
        g2d.drawString("t\u200Ce\u200Cs\u200Ct\u200C \u200C1", 50, 50);

        Font arial = Font.createFont(Font.TRUETYPE_FONT,
                new File("C:/Windows/Fonts/ARIALUNI.TTF")).deriveFont(50f);
        g2d.setFont(arial);
        g2d.drawString("t\u200Ce\u200Cs\u200Ct\u200C \u200C2", 50, 100);

        Font noto = Font.createFont(Font.TRUETYPE_FONT,
                new File("noto-sans-regular.ttf")).deriveFont(50f);
        g2d.setFont(noto);
        g2d.drawString("t\u200Ce\u200Cs\u200Ct\u200C \u200C3", 50, 150);

        g2d.dispose();
        ImageIO.write(img, "png", new File("zwnj.png"));
    }
}
{code}
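To check the Java2D result without eyeballing the PNG, something like the following might help. It is only a minimal sketch: the class and method names (`ZwnjGlyphCheck`, `interleaveZwnj`, `zwnjVisualBoundsEmpty`) are made up for illustration and are not part of the test above or of any library. It builds the same kind of test string and asks Java2D for the visual bounds of a lone ZWNJ glyph vector. Note the assumption that `createGlyphVector` maps the character the same way `drawString` does, which I have not verified.

```java
import java.awt.Font;
import java.awt.font.FontRenderContext;
import java.awt.font.GlyphVector;
import java.awt.geom.Rectangle2D;

public class ZwnjGlyphCheck {

    /** U+200C, zero width non-joiner. */
    public static final char ZWNJ = 0x200C;

    /** Inserts a ZWNJ between every pair of adjacent chars of the input. */
    public static String interleaveZwnj(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(ZWNJ);
            }
            sb.append(text.charAt(i));
        }
        return sb.toString();
    }

    /**
     * Reports whether the visual bounds of a glyph vector holding a single
     * ZWNJ are empty for the given font (antialiasing and fractional
     * metrics on, as in the rendering test).
     */
    public static boolean zwnjVisualBoundsEmpty(Font font) {
        FontRenderContext frc = new FontRenderContext(null, true, true);
        GlyphVector gv = font.createGlyphVector(frc, String.valueOf(ZWNJ));
        Rectangle2D bounds = gv.getVisualBounds();
        return bounds.isEmpty();
    }
}
```

For example, `interleaveZwnj("test 1")` yields the string passed to the first `drawString` call, and `zwnjVisualBoundsEmpty(tahoma)` could be called with each of the three fonts loaded in the test to compare them.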
> Zero-width non-joiner characters visible in generated PDF
> ---------------------------------------------------------
>
> Key: PDFBOX-5230
> URL: https://issues.apache.org/jira/browse/PDFBOX-5230
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox, PDModel, Writing
> Affects Versions: 2.0.16
> Reporter: Daniel Gredler
> Priority: Major
> Attachments: Af.pdf, zwnj-pdfkit.pdf, zwnj.pdf, zwnj.png
>
>
> I'd like to use the [zero-width
> non-joiner|https://en.wikipedia.org/wiki/Zero-width_non-joiner] (ZWNJ)
> character to prevent character shaping in some cases when using Arabic and
> Indic scripts. This works correctly using some fonts like Arial Unicode
> (character shaping is prevented and no ZWNJ glyph is visible in the PDF), but
> does not work correctly when using fonts like Tahoma or Google Noto Sans
> Regular, where the ZWNJ character is visible in the PDF. The ZWNJ glyph is
> not visible when using these fonts in other programs, like Microsoft Word.
> I suspect that the `advanceWidth` values in the `hmtx` table should be
> taken into account somehow but are not: the `advanceWidth` for this glyph
> is 0 in both of the fonts that erroneously produce a visible ZWNJ
> character (Tahoma and Google Noto Sans Regular).
> Test case generating the attached PDF file:
> {code:java}
> import java.io.File;
> import java.io.IOException;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDPage;
> import org.apache.pdfbox.pdmodel.PDPageContentStream;
> import org.apache.pdfbox.pdmodel.common.PDRectangle;
> import org.apache.pdfbox.pdmodel.font.PDFont;
> import org.apache.pdfbox.pdmodel.font.PDType0Font;
>
> public class ZwnjTest {
>     public static void main(String[] args) throws IOException {
>         try (PDDocument document = new PDDocument()) {
>             PDPage page = new PDPage(PDRectangle.LETTER);
>             document.addPage(page);
>             // In all strings below, U+200C = zero width non-joiner
>             try (PDPageContentStream stream = new PDPageContentStream(document, page)) {
>                 // Tahoma: ZWNJ glyph is a vertical bar, but advanceWidth in
>                 // hmtx table is 0 -> shown in PDF anyway (unexpected)
>                 PDFont tahoma = PDType0Font.load(document,
>                         new File("C:/Windows/Fonts/tahoma.ttf"));
>                 stream.beginText();
>                 stream.setFont(tahoma, 20);
>                 stream.newLineAtOffset(50, 650);
>                 stream.showText("t\u200Ce\u200Cs\u200Ct\u200C \u200C1");
>                 stream.endText();
>
>                 // Arial Unicode: ZWNJ glyph contains no outline -> not shown
>                 // in PDF (as expected)
>                 PDFont arialu = PDType0Font.load(document,
>                         new File("C:/Windows/Fonts/ARIALUNI.TTF"));
>                 stream.beginText();
>                 stream.setFont(arialu, 20);
>                 stream.newLineAtOffset(50, 600);
>                 stream.showText("t\u200Ce\u200Cs\u200Ct\u200C \u200C2");
>                 stream.endText();
>
>                 // Google Noto Sans Regular: ZWNJ glyph is a vertical bar, but
>                 // advanceWidth in hmtx table is 0 -> shown in PDF anyway (unexpected)
>                 PDFont gnotos = PDType0Font.load(document,
>                         new File("noto-sans-regular.ttf"));
>                 stream.beginText();
>                 stream.setFont(gnotos, 20);
>                 stream.newLineAtOffset(50, 550);
>                 stream.showText("t\u200Ce\u200Cs\u200Ct\u200C \u200C3");
>                 stream.endText();
>             }
>             document.save("zwnj.pdf");
>         }
>     }
> }
> {code}