[jira] [Commented] (PDFBOX-1919) Font descriptor flags are not implemented

John Hewson (JIRA) Thu, 12 Jun 2014 09:30:17 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029332#comment-14029332
 ]


John Hewson commented on PDFBOX-1919:
-------------------------------------

{quote}
Furthermore the pdf uses span tags which could be the real culprit here. I've 
to admit that I didn't understand the purpose of that tag yet.
{quote}

Andreas, span tags are probably not relevant. They are really meant for adding 
alt text to images for screen readers, but they can be used as a hack to add 
alt text for Unicode symbols that (really) old screen readers don't recognise. 
The website you link to gives an example of "≥" having a span tag with "greater 
than or equal to" as the text. However, we're trying to extract the Unicode 
text, not the assistive description of it for a screen reader: we really do 
want "≥" as the text for that character, not the string "greater than or equal 
to".

However, I'm still not sure what's causing Acrobat to produce two different 
results.

> Font descriptor flags are not implemented
> -----------------------------------------
>
>                 Key: PDFBOX-1919
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1919
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.5, 1.8.6, 2.0.0
>            Reporter: Corentin Regal
>         Attachments: PDFBOX-1919.AdobeReader.txt, PDFBOX-1919.pdf, 
> PDFBOX-1919.txt
>
>
> The font descriptor flags are not set.
> They are described in section 5.7.1, "Font Descriptor Flags", of the 
> "PDF Reference 1.7".
> The methods in PDFontDescriptor are ready but never called:
> setFlags()
> setSerif()
> setAllCap(), which is used in a lot of PDFs
> ...
> I saw some TODOs in the code that relate to this issue. Is it planned to be 
> implemented soon?
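
For reference, here is a minimal sketch of how the flag bits from section 5.7.1 
combine into the single integer that a call such as setFlags() would carry. It 
is standalone Java and does not use PDFBox; the class name 
FontDescriptorFlagsSketch and the helpers withFlag() and hasFlag() are 
hypothetical and exist only for illustration. The bit positions are the ones 
listed in the PDF Reference, where bit 1 is the lowest-order bit.

```java
// Standalone illustration; not PDFBox code. All names here are hypothetical.
final class FontDescriptorFlagsSketch {

    // Bit positions per PDF Reference 1.7, section 5.7.1 (bit 1 = lowest-order bit).
    static final int FIXED_PITCH = 1 << 0;   // bit 1
    static final int SERIF       = 1 << 1;   // bit 2
    static final int SYMBOLIC    = 1 << 2;   // bit 3
    static final int SCRIPT      = 1 << 3;   // bit 4
    static final int NONSYMBOLIC = 1 << 5;   // bit 6
    static final int ITALIC      = 1 << 6;   // bit 7
    static final int ALL_CAP     = 1 << 16;  // bit 17
    static final int SMALL_CAP   = 1 << 17;  // bit 18
    static final int FORCE_BOLD  = 1 << 18;  // bit 19

    private FontDescriptorFlagsSketch() {
    }

    // Returns a copy of 'flags' with the given bit switched on or off.
    static int withFlag(int flags, int bit, boolean on) {
        return on ? (flags | bit) : (flags & ~bit);
    }

    // Tests whether the given bit is set in 'flags'.
    static boolean hasFlag(int flags, int bit) {
        return (flags & bit) != 0;
    }

    public static void main(String[] args) {
        // Build the flags value for a non-symbolic, all-caps serif font.
        int flags = 0;
        flags = withFlag(flags, SERIF, true);
        flags = withFlag(flags, NONSYMBOLIC, true);
        flags = withFlag(flags, ALL_CAP, true);

        System.out.println("flags    = " + flags);
        System.out.println("serif?   = " + hasFlag(flags, SERIF));
        System.out.println("italic?  = " + hasFlag(flags, ITALIC));
        System.out.println("all cap? = " + hasFlag(flags, ALL_CAP));
    }
}
```

Under these assumptions, boolean setters like setSerif() or setAllCap() would 
each reduce to toggling one bit of the same integer, so text extraction can 
only make use of them if that integer is actually read from or written to the 
font descriptor.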



--
This message was sent by Atlassian JIRA
(v6.2#6252)
