[ https://issues.apache.org/jira/browse/PDFBOX-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236091#comment-17236091 ]
Maruan Sahyoun commented on PDFBOX-5023:
----------------------------------------
Hi Richard,
for text extraction the feature is not really needed, as the layout should
already have been done correctly by the application that created the PDF (glyph
substitution, repositioning ...). I think about two years ago there was some
activity to get code in for extracting Arabic text, and at the time the person
looking into it was happy with the result. So if the text extraction is
correct you should be fine. If it reveals a general issue, please let us know.
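
One rough way to judge whether the extraction is "correct" is to look at what
kind of characters come back. The sketch below is only an illustration, not
PDFBox code: the class and method names are hypothetical, and it assumes the
input is the String returned by the text stripper. It counts Arabic code points
and folds Arabic presentation forms (which may appear, depending on how the PDF
maps glyphs to Unicode) back to base letters using the JDK's Normalizer.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of PDFBox or FontBox.
final class ArabicExtractionCheck {

    // Counts code points in the Arabic block or the two Arabic
    // presentation-form blocks.
    static int countArabic(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            Character.UnicodeBlock block = Character.UnicodeBlock.of(cp);
            if (block == Character.UnicodeBlock.ARABIC
                    || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_A
                    || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_B) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    // NFKC normalization replaces presentation forms (isolated, initial,
    // medial, final shapes) with the corresponding base letters.
    static String foldPresentationForms(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }
}
```

If countArabic returns 0 for a page that visibly contains Arabic text, that
would suggest the text is not being extracted at all, rather than a layout
problem.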
As for getting the feature into fontbox to handle the tables, you're welcome to
help of course. Keep in mind that we have two use cases: extracting text and
putting text into PDFs (mainly for form filling). That would also need some
work on the pdfbox side to make use of the fontbox part.
BR
Maruan
> OpenType Layout tables used in font ArabicTransparent-ARABIC are not
> implemented in PDFBox and will be ignored
> --------------------------------------------------------------------------------------------------------------
>
> Key: PDFBOX-5023
> URL: https://issues.apache.org/jira/browse/PDFBOX-5023
> Project: PDFBox
> Issue Type: Wish
> Components: FontBox
> Affects Versions: 2.0.8
> Reporter: Richard Azar
> Priority: Major
> Labels: fop-teaming
> Attachments: image-2020-11-20-13-34-12-306.png, log PDFbox.txt,
> sc1.PNG
>
>
> I am loading a PDF document that contains both TrueType and TrueType CID
> fonts, and only the text set in TrueType fonts is extracted by
> tStripper.getText.
> The following message appears in the logs (full logs attached):
> OpenType Layout tables used in font ArabicTransparent-ARABIC are not
> implemented in PDFBox and will be ignored.