[ https://issues.apache.org/jira/browse/PDFBOX-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212647#comment-17212647 ]
Volker Kunert commented on PDFBOX-4951:
---------------------------------------

1 The font must be loaded twice - yes, we have to load it twice because we use the positioning features of FOP's MultiByteFont. We can't store a reference to the MultiByteFont in PDType0Font, because PDType0Font is stored in a COSDictionary and recreated, losing any extra attributes in the process.

2 The widths of A̋ and Ž̧ are the same as those of A and Z for me; there is no new code involved.

    PDType0Font font = PDType0Font.load(pdDocument, new FileInputStream(fontFile), false);
    System.out.printf("%f %f%n", font.getStringWidth("A"), font.getStringWidth("A̋"));
    System.out.printf("%f %f%n", font.getStringWidth("Z"), font.getStringWidth("Ž̧"));

    639,000000 639,000000
    572,000000 572,000000

3 Which variant of Z plus accent is not OK? They look good to me.

4 The bug in FOP (FOP-2969) means, for example, that the accent is not placed above the current letter but above the following letter instead.

5 Bengali processing and FOP positioning both reorder the glyphs, so they can't work together at the moment. Integration on the basis of the current implementation or based on FOP seems possible, but needs a programmer who knows the Bengali language and script.

6 IMHO the user should be required to enable FOP positioning explicitly, in order not to break other algorithms. Possibly it could be enabled by default for script latn.

7 I am preparing some small corrections to my code.

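As a side note on point 2 and on the example from the issue description (0041 030B): the sketch below uses only the Java standard library to show why such a sequence cannot simply be swapped for a single precomposed character before rendering, so the accent has to be placed by the renderer. The class and method names are hypothetical and are not part of PDFBox or FOP; this is an illustration under those assumptions, not the code from the patch.

```java
import java.text.Normalizer;

public class CombiningSequenceCheck {

    // Returns the code points of s as space-separated hex, e.g. "0041 030B".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", cp));
        });
        return sb.toString();
    }

    // Canonical composition (NFC): combines base + mark where Unicode
    // defines a precomposed character, otherwise leaves the sequence alone.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // A + COMBINING DOUBLE ACUTE ACCENT: no precomposed form exists,
        // so the text stays a two-code-point sequence.
        System.out.println(codePoints(nfc("A\u030B"))); // prints "0041 030B"

        // O + COMBINING DOUBLE ACUTE ACCENT: composes to U+0150 (Ő).
        System.out.println(codePoints(nfc("O\u030B"))); // prints "0150"
    }
}
```

Because the A sequence survives normalization as base letter plus mark, the string width stays that of the base letter when the mark glyph has no advance of its own, which would be consistent with the equal numbers printed above; where the accent ends up is then purely a positioning question.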
> Sequences with combining letters are rendered incorrectly
> ---------------------------------------------------------
>
>                 Key: PDFBOX-4951
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4951
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.21
>            Reporter: Volker Kunert
>            Priority: Major
>         Attachments: DIN_SPEC_91379_Sequences-aa.pdf,
> DIN_SPEC_91379_Sequences-ab.pdf, DIN_SPEC_91379_Sequences-ac.pdf,
> DIN_SPEC_91379_Sequences.txt, DefaultScriptProcessor.java,
> ExamplePdfboxFopPos.java, ExamplePdfboxFopPos.pdf,
> ExamplePdfboxFopPosForm.java, ExamplePdfboxFopPosForm.pdf, TestPdfbox.java,
> TestPdfboxFop2.java, TestPdfboxFop2.pdf, TestPdfboxJava2D.java,
> TestPdfboxJava2D.pdf, patch-2020-10-02.txt, pdfbox.pdf, screenshot-1.png
>
>
> Accented Letters composed of Unicode base letter and combining accent are
> rendered wrong. E.g. with 0041 030B LATIN CAPITAL LETTER A WITH COMBINING
> DOUBLE ACUTE ACCENT the accent appears at the right hand side of the letter
> A, not above the letter A.
> The position is wrong for most of the sequences defined in the following spec:
> DIN SPEC 91379: Characters in Unicode for the electronic processing of names
> and data exchange in Europe; with digital attachment
> [https://www.xoev.de/downloads-2316#StringLatin]
> [https://www.din.de/de/wdc-beuth:din21:301228458]
>
> The correct rendering should look like the output of hb-view 2.6.8, see files
> DIN_SPEC_91379_Sequences*.pdf.
> The output of PDFBox is appended in pdfbox.pdf, which is created by running
> TestPdfbox.java. The sequences are read from file
> DIN_SPEC_91379_Sequences.txt.
>
> Font used for testing: NotoSansMono-Regular.ttf, see
> [https://www.google.com/get/noto/]
> download:
> [https://noto-website-2.storage.googleapis.com/pkgs/NotoSansMono-hinted.zip]
> See also FOP-2969
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org