[jira] [Commented] (PDFBOX-4951) Sequences with combining letters are rendered incorrectly

Maruan Sahyoun (Jira) Tue, 13 Oct 2020 12:23:14 -0700


    [ 
https://issues.apache.org/jira/browse/PDFBOX-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213323#comment-17213323
 ]


Maruan Sahyoun commented on PDFBOX-4951:
----------------------------------------

I'm also in favour of putting it into the example subproject for now. 
The reason is that FOP's language/script support and PDFBox's language/script 
support are currently mutually exclusive, e.g. PDFBox supports Bengali where 
FOP doesn't. AFAIU FOP also does a lot of additional font parsing on top of 
FontBox, so we'd need to integrate the Bengali changes into FOP in order to 
reuse that. Or some of the FOP core functionality moves into FontBox to 
benefit both, or ...

I fear that accepting the patch as is might lock us into a direction which we 
cannot revert (as it might become public API/functionality). Having said 
that, I do see a benefit in FOP and PDFBox sharing some of the functionality 
(maybe only benefitting PDFBox functionality-wise). Solving some text 
positioning issues (I'm avoiding the term layout here) might open PDFBox to a 
wider set of users, since as of today things like form filling can only be 
done for a small set of languages.

 

> Sequences with combining letters are rendered incorrectly
> ---------------------------------------------------------
>
>                 Key: PDFBOX-4951
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4951
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.21
>            Reporter: Volker Kunert
>            Priority: Major
>         Attachments: DIN_SPEC_91379_Sequences-aa.pdf, 
> DIN_SPEC_91379_Sequences-ab.pdf, DIN_SPEC_91379_Sequences-ac.pdf, 
> DIN_SPEC_91379_Sequences.txt, DefaultScriptProcessor.java, 
> ExamplePdfboxFopPos-By-Tilman.pdf, ExamplePdfboxFopPos.java, 
> ExamplePdfboxFopPos.pdf, ExamplePdfboxFopPosForm.java, 
> ExamplePdfboxFopPosForm.pdf, TestPdfbox.java, TestPdfboxFop2.java, 
> TestPdfboxFop2.pdf, TestPdfboxJava2D.java, TestPdfboxJava2D.pdf, 
> patch-2020-10-02.txt, pdfbox.pdf, screenshot-1.png
>
>
> Accented letters composed of a Unicode base letter and a combining accent 
> are rendered incorrectly. E.g. for the sequence 0041 030B (LATIN CAPITAL 
> LETTER A followed by COMBINING DOUBLE ACUTE ACCENT) the accent appears to 
> the right of the letter A, not above it.
> The position is wrong for most of the sequences defined in the following spec:
> DIN SPEC 91379: Characters in Unicode for the electronic processing of names 
> and data exchange in Europe; with digital attachment
>  [https://www.xoev.de/downloads-2316#StringLatin]
>  [https://www.din.de/de/wdc-beuth:din21:301228458]
>  
> The correct rendering should look like the output of hb-view 2.6.8, see files 
> DIN_SPEC_91379_Sequences*.pdf.
> The output of PDFBox is attached as pdfbox.pdf, which is created by running 
> TestPdfbox.java. The sequences are read from the file 
> DIN_SPEC_91379_Sequences.txt.
>  
> Font used for testing: NotoSansMono-Regular.ttf, see 
> [https://www.google.com/get/noto/] 
> download: 
> [https://noto-website-2.storage.googleapis.com/pkgs/NotoSansMono-hinted.zip]
>  See also FOP-2969
>  
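
To illustrate why the sequence quoted above depends on mark positioning rather than on a precomposed glyph, here is a minimal sketch using only the JDK. It is not part of the attached patch or of PDFBox; the class and method names (CombiningSequenceCheck, isCombiningMark, needsMarkPositioning) are hypothetical. The idea is that Unicode has no precomposed character for A with double acute, so NFC normalization leaves 0041 030B as two code points and the renderer has to place the accent itself.

```java
import java.text.Normalizer;

// Hypothetical helper, for illustration only (not PDFBox or FOP API).
final class CombiningSequenceCheck {

    /** True if the code point is a non-spacing mark (Unicode general category Mn). */
    static boolean isCombiningMark(int codePoint) {
        return Character.getType(codePoint) == Character.NON_SPACING_MARK;
    }

    /**
     * True if the text still contains a combining mark after NFC normalization,
     * i.e. no precomposed character covers the sequence and the renderer must
     * position the mark relative to its base glyph.
     */
    static boolean needsMarkPositioning(String text) {
        String nfc = Normalizer.normalize(text, Normalizer.Form.NFC);
        return nfc.codePoints().anyMatch(CombiningSequenceCheck::isCombiningMark);
    }

    public static void main(String[] args) {
        // A + COMBINING DOUBLE ACUTE ACCENT: the example from the issue description.
        System.out.println(needsMarkPositioning("A\u030B"));
        // A + COMBINING ACUTE ACCENT: NFC composes this to the single character U+00C1.
        System.out.println(needsMarkPositioning("A\u0301"));
    }
}
```

A check like this only identifies which input sequences are affected. Placing the accent correctly still needs the font's positioning data, which is the part under discussion in this issue.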



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
