Re: supporting non-BMP character content

Glenn Adams Thu, 30 Jun 2016 14:29:45 -0700

Yes, this is the basic approach one would take. You will need to track down
all of the methods/fields where char is used and either change the type to
int or, for methods, add a new overload with an int signature. Then you will
need to find all call sites of those methods and change them to extract
Unicode code points (integers in the range [0,1114111]) to pass to the
changed/new methods.
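
As a rough illustration of the call-site change, here is a minimal sketch.
GlyphSource and hasCodePoint are hypothetical stand-ins invented for this
example, not FOP's actual Font/Typeface API; only Character.codePointAt and
Character.charCount are real JDK calls.

```java
class CodePointSketch {
    /** Hypothetical stand-in for a font API; NOT FOP's actual interface. */
    interface GlyphSource {
        // Assumed int-based method that accepts a full Unicode code point.
        boolean hasCodePoint(int cp);

        // The existing char-based style can delegate to the int version.
        default boolean hasChar(char c) {
            return hasCodePoint(c);
        }
    }

    /** Counts code points in text for which the font reports no glyph. */
    static int countMissing(CharSequence text, GlyphSource font) {
        int missing = 0;
        for (int i = 0; i < text.length(); ) {
            // Combines a high/low surrogate pair into a single code point.
            int cp = Character.codePointAt(text, i);
            if (!font.hasCodePoint(cp)) {
                missing++;
            }
            // Advance by 1 char for BMP, 2 chars for supplementary code points.
            i += Character.charCount(cp);
        }
        return missing;
    }

    public static void main(String[] args) {
        // A toy font that only covers the BMP.
        GlyphSource bmpOnly = cp -> cp <= 0xFFFF;
        // 'A' followed by U+1D538, which is encoded as a surrogate pair.
        System.out.println(countMissing("A\uD835\uDD38", bmpOnly));
    }
}
```

The point of the loop is that the index advances by Character.charCount(cp)
rather than by 1, so a surrogate pair is passed to the font as one int.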


You will need to do this while not breaking the current tests and you will
need to add new tests to cover non-BMP use cases.

You will probably want to create a fork of the FOP repository on GitHub and
do your work on a branch of that fork.

Good Luck,
Glenn

On Thu, Jun 30, 2016 at 9:40 AM, Simone Rondelli <mone.j...@gmail.com>
wrote:

> Hi FOP Users,
>
> I am working on a project that uses Apache FOP and, as part of that
> project, need to fix FOP-1969 [1], which has to do with supplementary
> character support (surrogate pairs). I have obtained approval to contribute
> these changes back to the community. I want to run my design past the list
> (and especially Glenn Adams) and ask a few questions before proceeding:
>
> 1. Read the CMAP in OpenFont.readCMAP(), implementing the case cmapPID
> == 3 && cmapEID == 10 && cmapFormat == 12. This way I could correctly fill
> the unicodeMappings list.
>
> 2. Fix the class GlyphMapping to support non-BMP code points (there are
> already some TODOs in the class for non-BMP code point support).
>
> 3. The class GlyphMapping uses org.apache.fop.fonts.Font class methods
> like Font.hasChar(char c), Font.getCharWidth(char c), Font.mapChar(char c),
> etc. Since they accept a single char and a surrogate pair is composed of
> two chars, I will need to modify the Font class as well. I think that we
> should add overloaded methods that accept int so that we can pass code
> points. An alternative is to create a different set of methods with the
> Codepoint suffix: Font.hasCodepoint(int cp), Font.getCodePointWidth(int
> cp), Font.mapCodepoint(int cp), etc.
>
> 4. The class Font uses the interface Typeface, which has the same problem:
> methods that accept char. We should either change this interface or one of
> its subclasses like MultiByteFont or CIDFont (which denote fonts with a
> large set of code points).
>
> So far my research has stopped at this point, and before proceeding I would
> like some feedback on whether I'm heading in a good direction and whether
> I'm missing something.
>
> Thanks,
> Simone Rondelli
>
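
Regarding item 1 in the quoted message: a format 12 cmap subtable stores
sorted groups of (startCharCode, endCharCode, startGlyphID), and a code
point inside a group maps to startGlyphID plus its offset from
startCharCode. The sketch below shows only that lookup over groups that
have already been parsed. The class name, the int[][] layout, and the
sample values are assumptions made for illustration; this is not how
OpenFont.readCMAP() is actually structured.

```java
class Cmap12Sketch {
    /**
     * Looks up a glyph index in format 12 style groups.
     * Each row is {startCharCode, endCharCode, startGlyphID}; rows are
     * assumed to be sorted by startCharCode and non-overlapping.
     * Returns 0 (the missing-glyph index) when no group covers the code point.
     */
    static int glyphFor(int[][] groups, int codePoint) {
        int lo = 0;
        int hi = groups.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int[] g = groups[mid];
            if (codePoint < g[0]) {
                hi = mid - 1;          // code point is before this group
            } else if (codePoint > g[1]) {
                lo = mid + 1;          // code point is after this group
            } else {
                // Inside the group: glyph IDs run sequentially from startGlyphID.
                return g[2] + (codePoint - g[0]);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Made-up sample groups: one BMP range and one supplementary range.
        int[][] groups = {
            {0x41, 0x5A, 36},
            {0x1F600, 0x1F64F, 1000},
        };
        System.out.println(glyphFor(groups, 0x1F601));
    }
}
```

Because the group fields are 32 bits wide, the same lookup works for BMP
and supplementary code points without any surrogate handling at this level.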
