[ 
https://issues.apache.org/jira/browse/FOP-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602625#comment-15602625
 ] 

Simone Rondelli commented on FOP-1969:
--------------------------------------

The project is already using org.apache.pdfbox:fontbox thus I thought that 
using org.apache.pdfbox:pdfbox as test dependency would not be a big deal.

And yes I did not consider ANT build, but I can easily add the jar.

PDDocument do not contain a load() method. The current implementation of 
extractTextFromPDF() works as expected.

> Surrogate pairs not treated as single unicode codepoint for display purposes
> ----------------------------------------------------------------------------
>
>                 Key: FOP-1969
>                 URL: https://issues.apache.org/jira/browse/FOP-1969
>             Project: FOP
>          Issue Type: Improvement
>          Components: unqualified
>    Affects Versions: trunk
>         Environment: Operating System: All
> Platform: All
>            Reporter: Glenn Adams
>         Attachments: Urdu.zip, pcltest.zip, single-byte.zip, testing.fo, 
> testing.fo, testing.pdf, testing.pdf, testing.xml, testing.xsl, tiffttc.zip
>
>
> unicode codepoints outside of the BMP (base multilingual plane), i.e., whose 
> scalar value is greater than 0xFFFF (65535), are coded as UTF-16 surrogate 
> pairs in Java strings, which pair should be treated as a single codepoint for 
> the purpose of mapping to a glyph in a font (that supports extra-BMP 
> mappings);
> at present, FOP does not correctly handle this case in simple (non complex 
> script) rendering paths;
> furthermore, though some support has been added to handle this in the complex 
> script rendering path, it has not yet been tested, so is not necessarily 
> working there either;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to