[ 
https://issues.apache.org/jira/browse/PDFBOX-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17237525#comment-17237525
 ] 

Michael Klink commented on PDFBOX-5023:
---------------------------------------

[~tilman]

{quote}It isn't implemented, either because it is difficult or because people 
think it is difficult.{quote}

I think it is in particular because it does not really match the current text 
stripper architecture, which mostly works on a glyph-by-glyph basis, has font 
and positioning information per glyph, and maps back and forth between glyph 
and Unicode code point. For an *ActualText* you only have the position and size 
of the full string (the length of the string and the number of drawn glyphs 
need not match), multiple fonts and other parameters might be used in drawing 
it, ...

So it's not difficult per se but simply does not fit in well.
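
To illustrate the mismatch, here is a minimal, self-contained sketch. It does not use any PDFBox classes; `Glyph`, `ActualTextSpan`, the font names and all coordinates are hypothetical and only model the situation described above: a per-glyph record on one side, and a replacement string that belongs to a whole run of glyphs on the other.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ActualTextSketch {

    /** Hypothetical per-glyph record, roughly what a glyph-by-glyph stripper tracks. */
    public static final class Glyph {
        public final String unicode; // text mapped from this one glyph
        public final String font;    // font the glyph is drawn with
        public final float x;        // horizontal start position
        public final float width;    // advance width

        public Glyph(String unicode, String font, float x, float width) {
            this.unicode = unicode;
            this.font = font;
            this.x = x;
            this.width = width;
        }
    }

    /** Hypothetical ActualText span: one replacement string for a whole run of glyphs. */
    public static final class ActualTextSpan {
        public final String actualText;
        public final List<Glyph> glyphs;

        public ActualTextSpan(String actualText, List<Glyph> glyphs) {
            this.actualText = actualText;
            this.glyphs = glyphs;
        }

        /** Extent of the whole run; there is no position per replacement character. */
        public float width() {
            Glyph first = glyphs.get(0);
            Glyph last = glyphs.get(glyphs.size() - 1);
            return last.x + last.width - first.x;
        }

        /** True if the replacement text has exactly as many code points as there are glyphs. */
        public boolean isOneToOne() {
            return actualText.codePointCount(0, actualText.length()) == glyphs.size();
        }

        /** Number of distinct fonts used to draw the run. */
        public int fontCount() {
            Set<String> fonts = new HashSet<>();
            for (Glyph g : glyphs) {
                fonts.add(g.font);
            }
            return fonts.size();
        }
    }

    /** Made-up sample: "office" drawn with four glyphs, one of them an "ffi" ligature. */
    public static ActualTextSpan sample() {
        return new ActualTextSpan("office", Arrays.asList(
                new Glyph("o", "FontA", 0f, 5f),
                new Glyph("\uFB03", "FontB", 5f, 9f), // single ligature glyph
                new Glyph("c", "FontA", 14f, 5f),
                new Glyph("e", "FontA", 19f, 5f)));
    }

    public static void main(String[] args) {
        ActualTextSpan span = sample();
        System.out.println("glyphs drawn:        " + span.glyphs.size());
        System.out.println("replacement length:  " + span.actualText.length());
        System.out.println("one-to-one mapping:  " + span.isOneToOne());
        System.out.println("fonts in the run:    " + span.fontCount());
        System.out.println("width of whole run:  " + span.width());
    }
}
```

In this sample the replacement text has more characters than there are drawn glyphs and the run uses more than one font, so there is no obvious single glyph, font or position to attach each replacement character to.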

> OpenType Layout tables used in font ArabicTransparent-ARABIC are not 
> implemented in PDFBox and will be ignored
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-5023
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5023
>             Project: PDFBox
>          Issue Type: Wish
>          Components: FontBox, Text extraction
>    Affects Versions: 2.0.8
>            Reporter: Richard Azar
>            Priority: Major
>              Labels: fop-teaming
>         Attachments: ExtractText.txt, log PDFbox.txt, pdfsample.pdf, sc1.PNG, 
> sc2.PNG, sc3.PNG
>
>
> I am loading a PDF document with TrueType and TrueType CID fonts (both within 
> the same document), and only the TrueType font texts are extracted using 
> tStripper.getText.
> I get the error below in the logs (full logs attached):
> OpenType Layout tables used in font ArabicTransparent-ARABIC are not 
> implemented in PDFBox and will be ignored.



