[jira] [Commented] (FOP-1969) Surrogate pairs not treated as single unicode codepoint for display purposes

Simone Rondelli (JIRA) Tue, 26 Jul 2016 12:19:37 -0700

    [ 
https://issues.apache.org/jira/browse/FOP-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394370#comment-15394370
 ]


Simone Rondelli commented on FOP-1969:
--------------------------------------

Hi Glenn,

I have a first proof of concept that renders emoji. The first problem is 
that some emoji are composed of more than one codepoint, such as flags 
({{&#x1f1ee;&#x1f1f9;}}) and families 
({{&#x1f468;&#x200d;&#x1f468;&#x200d;&#x1f466;}}). I found that the 
information on how to merge several codepoints (or glyphs) into a single 
glyph is described in the ligature table of the font.
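
To make the two layers concrete, here is a minimal Java sketch (not FOP code; the class and helper names are made up for illustration). It builds the flag and family sequences above and counts UTF-16 code units versus codepoints. Even when each surrogate pair is read as one codepoint, the sequence still contains several codepoints that only the font's ligature substitution can turn into one glyph.

```java
public class EmojiCodePoints {
    /** Builds a String from a list of Unicode codepoints (hypothetical helper). */
    public static String of(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    /** Counts codepoints; a surrogate pair counts once. */
    public static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Flag: REGIONAL INDICATOR SYMBOL LETTER I + LETTER T
        String flag = of(0x1F1EE, 0x1F1F9);
        // Family: MAN, ZERO WIDTH JOINER, MAN, ZERO WIDTH JOINER, BOY
        String family = of(0x1F468, 0x200D, 0x1F468, 0x200D, 0x1F466);

        // length() counts UTF-16 code units, so each supplementary codepoint counts twice
        System.out.println(flag.length() + " units, " + codePointCount(flag) + " codepoints");
        System.out.println(family.length() + " units, " + codePointCount(family) + " codepoints");
        // prints "4 units, 2 codepoints" and "8 units, 5 codepoints"
    }
}
```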

The ligature table is associated with a script, and in the font that I'm 
using (EmojiOne) it is associated with the {{latn}} script. The problem is in 
{{GlyphMapping.processWordMapping}}, where the script of the text is retrieved 
using {{String script = text.getScript();}}. The value returned is {{zyyy}} 
({{SCRIPT_UNDEFINED}}) for text composed only of emoji, and {{auto}} for mixed 
text (latn/cjk + emoji). Neither value matches {{latn}}, so the ligature 
lookups are not selected.
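
As a cross-check outside FOP, the JDK's own script data can be queried with {{Character.UnicodeScript.of}}. In the sketch below (the class and method names are illustrative only) the emoji codepoints are reported as COMMON, whose ISO 15924 code is Zyyy. That suggests {{zyyy}} is how Unicode itself classifies these characters rather than a detection failure, though it does not by itself settle whether the font or FOP should adapt.

```java
public class EmojiScripts {
    /** Returns the Unicode script the JDK assigns to a codepoint. */
    public static Character.UnicodeScript scriptOf(int codePoint) {
        return Character.UnicodeScript.of(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(scriptOf(0x1F468)); // MAN: COMMON
        System.out.println(scriptOf(0x1F1EE)); // REGIONAL INDICATOR SYMBOL LETTER I: COMMON
        System.out.println(scriptOf('a'));     // LATIN
    }
}
```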

# Is this a bug in the font or in Apache FOP? 
# What would be a good approach to fix it?

I thought that I could modify the logic inside {{GlyphTable.matchLookups}} to 
select {{*}} when the script is {{zyyy}} or {{auto}}, but I have the feeling 
that this could break something. Am I right?
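
The idea can be sketched as a small standalone helper. This is not the actual FOP code: {{ScriptFallback}} and {{effectiveScript}} are hypothetical names, and only the script tags come from the description above.

```java
public class ScriptFallback {
    /** Wildcard tag meaning "match lookups for any script", as proposed above. */
    public static final String WILDCARD = "*";

    /**
     * Hypothetical helper: maps an undetermined script tag to the wildcard
     * and leaves every other script tag unchanged.
     */
    public static String effectiveScript(String script) {
        if ("zyyy".equals(script) || "auto".equals(script)) {
            return WILDCARD;
        }
        return script;
    }

    public static void main(String[] args) {
        System.out.println(effectiveScript("zyyy")); // *
        System.out.println(effectiveScript("auto")); // *
        System.out.println(effectiveScript("latn")); // latn
    }
}
```

One possible risk with this approach is that a wildcard would also select lookups that a font registers for unrelated scripts, which may be the kind of breakage to watch for.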

> Surrogate pairs not treated as single unicode codepoint for display purposes
> ----------------------------------------------------------------------------
>
>                 Key: FOP-1969
>                 URL: https://issues.apache.org/jira/browse/FOP-1969
>             Project: FOP
>          Issue Type: Improvement
>          Components: unqualified
>    Affects Versions: trunk
>         Environment: Operating System: All
> Platform: All
>            Reporter: Glenn Adams
>         Attachments: testing.fo, testing.fo, testing.pdf, testing.pdf, 
> testing.xml, testing.xsl
>
>
> unicode codepoints outside of the BMP (base multilingual plane), i.e., whose 
> scalar value is greater than 0xFFFF (65535), are coded as UTF-16 surrogate 
> pairs in Java strings, which pair should be treated as a single codepoint for 
> the purpose of mapping to a glyph in a font (that supports extra-BMP 
> mappings);
> at present, FOP does not correctly handle this case in simple (non complex 
> script) rendering paths;
> furthermore, though some support has been added to handle this in the complex 
> script rendering path, it has not yet been tested, so is not necessarily 
> working there either;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
