[ 
https://issues.apache.org/jira/browse/PDFBOX-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236091#comment-17236091
 ] 

Maruan Sahyoun commented on PDFBOX-5023:
----------------------------------------

Hi Richard,

For text extraction the feature is not really needed, as the layout (glyph 
substitution, repositioning ...) should already have been done correctly by the 
application that created the PDF. I think about two years ago there was some 
activity to get code in for extracting Arabic text, and at that time the person 
looking into it was happy with the result. So if the text extraction is 
correct you should be fine. If it reveals a general issue, please let us know.
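
One rough way to check whether the extraction picked up the Arabic text at all 
is to count the Arabic code points in the extracted string. The sketch below 
uses only the JDK. The class name, the helper names and the sample string are 
made up for illustration; in practice the string would be whatever your text 
stripper's getText call returns. It only shows that Arabic characters are 
present, not that their order or shaping is right.

```java
import java.lang.Character.UnicodeBlock;

public class ArabicExtractionCheck {

    /** Counts code points in the Arabic block or the two Arabic presentation-form blocks. */
    static long countArabic(String text) {
        return text.codePoints().filter(ArabicExtractionCheck::isArabic).count();
    }

    /** True if the code point belongs to one of the three Arabic blocks checked here. */
    static boolean isArabic(int codePoint) {
        UnicodeBlock block = UnicodeBlock.of(codePoint);
        return block == UnicodeBlock.ARABIC
                || block == UnicodeBlock.ARABIC_PRESENTATION_FORMS_A
                || block == UnicodeBlock.ARABIC_PRESENTATION_FORMS_B;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for the extracted text of a page.
        String extracted = "Invoice \u0645\u0631\u062D\u0628\u0627 2020";
        System.out.println("Arabic code points: " + countArabic(extracted));
    }
}
```

If the count is zero for a page that visibly contains Arabic, the text from 
that font was not extracted, which would be worth reporting with a sample file.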

As for getting the feature into fontbox to handle the tables, you're of course 
welcome to help. Keep in mind that we have two use cases: extracting text and 
putting text into PDFs (mainly for form filling). That would also need some 
work on the pdfbox side to make use of the fontbox part.

BR
Maruan

 

> OpenType Layout tables used in font ArabicTransparent-ARABIC are not 
> implemented in PDFBox and will be ignored
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-5023
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5023
>             Project: PDFBox
>          Issue Type: Wish
>          Components: FontBox
>    Affects Versions: 2.0.8
>            Reporter: Richard Azar
>            Priority: Major
>              Labels: fop-teaming
>         Attachments: image-2020-11-20-13-34-12-306.png, log PDFbox.txt, 
> sc1.PNG
>
>
> I am loading a PDF document with TrueType and TrueType CID fonts (both within 
> the same document) and only the TrueType font text is extracted using 
> tStripper.getText.
> I am getting the error below in the logs (full logs attached):
> OpenType Layout tables used in font ArabicTransparent-ARABIC are not 
> implemented in PDFBox and will be ignored.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
